Hi,
The ABRT team had an idea: ABRT can catch problems from containers, and ABRT displays problems in Cockpit. So why not show problems that happened in containers directly under the container overview?
Here I have a short user story for you (inspired by an existing one): Robert Paulsson is a developer at a small IT company. He's a senior engineer, but got tossed the sysadmin hat at the company. The company has 3 internal servers that run various services that the R&D unit needs on a day-to-day basis. He is responsible for these servers and the services running on them.
Robert logs into the system via Cockpit and goes to the containers page. (Version 1: He immediately sees that something is wrong with a running container (there is an exclamation mark, see below), so he opens it to learn more.) (Version 2: His colleagues bring to his notice that they are not able to connect to one internal instance of their product. He knows which container this instance is running in, so he opens it to learn more.) He notices the 'Problems' tab, opens it and sees that there is some problem. He clicks on it and is redirected to the ABRT journal log. After reporting it, he finds out that the issue is known. He is able to fix the issue.
How it could work: (navigate to Containers) Open some container with the dropdown to the left of the container name. You should see the container details. Our idea was to add another tab, 'Problems', next to the details. It should contain a list of all problems that occurred in that container. https://ibb.co/ePxerv (something like this) Each item should be clickable and redirect to the logs. Maybe it would be nice to display an exclamation mark next to the state in the default container view (before expanding the dropdown). https://ibb.co/fU7qya (something like this)
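The data side of this idea could be sketched roughly as follows. This is only an illustration, not how ABRT or Cockpit actually store or expose problems: the `Problem` record, its field names, and both helper functions are hypothetical, and the sketch assumes each problem carries the id of the container it occurred in (empty for crashes on the host).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Problem:
    """One problem record; all field names here are hypothetical."""
    problem_id: str
    container_id: str  # assumed empty when the crash happened on the host
    reason: str


def problems_by_container(problems):
    """Group problems by container id, skipping host-only problems."""
    grouped = defaultdict(list)
    for problem in problems:
        if problem.container_id:
            grouped[problem.container_id].append(problem)
    return dict(grouped)


def needs_warning(container_id, grouped):
    """True when the collapsed container row should show an exclamation mark."""
    return bool(grouped.get(container_id))


# Made-up sample data, only to exercise the helpers.
problems = [
    Problem("p1", "c0ffee", "httpd killed by SIGSEGV"),
    Problem("p2", "", "crash on the host"),
    Problem("p3", "c0ffee", "python3 killed by SIGABRT"),
]
grouped = problems_by_container(problems)
```

The 'Problems' tab of a container would then list `grouped.get(container_id, [])`, and the default container view would call `needs_warning` to decide whether to draw the exclamation mark.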
What do you think of this? Does it sound like a good idea? (The mock-ups are only for better understanding. I would definitely like to see a design from you.)
All the best,
Matt Marusak
ABRT team
I like the idea. Andreas, don't we already have designs for containers that have gone "bad" (such as ones that need security updates or have failed scans)?
This seems like related info: a container exited due to crashing, and there is further info to aid investigation.
Cheers,
Stef
On 06.09.2017 17:15, Matej Marusak wrote:
Hi,
The ABRT team had an idea: ABRT can catch problems from containers, and ABRT displays problems in Cockpit. So why not show problems that happened in containers directly under the container overview?
_______________________________________________
cockpit-devel mailing list -- cockpit-devel@lists.fedorahosted.org
To unsubscribe send an email to cockpit-devel-leave@lists.fedorahosted.org
On 12.09.2017 13:17, Stef Walter wrote:
As far as implementation goes, shouldn't this be integrated into kpod, so that such information is accessible in a standard way?
https://github.com/kubernetes-incubator/cri-o
Cheers,
Stef
On 12.09.2017 13:17, Stef Walter wrote:
As far as implementation goes, shouldn't this be integrated into kpod, so that such information is accessible in a standard way?
Is information from kpod visible in Cockpit?
And what about containers that do not come from Kubernetes?
On 2017-09-06 17:15, Matej Marusak wrote:
(Version 2: His colleagues bring to his notice that they are not able to connect to one internal instance of their product. He knows which container this instance is running in, so he opens it to learn more.) He notices the 'Problems' tab, opens it and sees that there is some problem. He clicks on it and is redirected to the ABRT journal log. After reporting it, he finds out that the issue is known. He is able to fix the issue.
I like the version 2 story better, because I seldom find myself leisure-surfing my systems looking for problems.
The designs look good to me! - Andreas
Hello Matej,
Matej Marusak [2017-09-06 15:15 -0000]:
The ABRT team had an idea - ABRT can catch problems from containers and ABRT displays problems in Cockpit.
Does ABRT actually support that already? I'm asking because a colleague and I worked on that feature in Apport a few years ago. It's actually quite tricky to get it right, robust [1], and useful [2], i.e. a crash handler on the host that deals with a container process crash isn't going to be able to collect much information. A simple core dump is of course still useful for manual investigation; it's just rather laborious to get useful details out of it.
The concept of ABRT/Apport collecting "standard" operating system crash reports works rather well with full system containers (LXC or nspawn style) where the guest OS can run ABRT and the data collection, but my gut feeling is that you are rather aiming at docker application containers here?
In the docker case, what do you intend to do with the ABRT reports? They most certainly shouldn't be sent to the Fedora crash database, they belong to the author of the affected docker container - but hub.docker.io doesn't have any kind of crash database or even a bug tracker.
Thanks,
Martin
[1] We first attempted to let the host's apport create the core dump and report, but this leads to a lot of corner cases such as http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2015-1318, http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2015-1324, or http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2015-1325.
[2] In order to collect any information in addition to the core dump, such as symbols, /proc/pid/maps files, OS and package versions, package-specific hooks to provide additional information, etc., assembling the crash report needs the file system and permissions from *inside* the container. Also, you don't even know what kind of operating system/which packages are running in the container.
Does ABRT actually support that already? I'm asking because a colleague and I worked on that feature in Apport a few years ago. It's actually quite tricky to get it right, robust [1], and useful [2], i.e. a crash handler on the host that deals with a container process crash isn't going to be able to collect much information. A simple core dump is of course still useful for manual investigation; it's just rather laborious to get useful details out of it.
It does. Or it depends on your understanding of 'support' :) We do not know which package the crash comes from: firstly because nobody has implemented that, and secondly because it seems that people who create images do not really care about packages anymore; they just copy what they need into the image. We know the container type, container id, image and docker_inspect. Also cgroups, environ, limits, maps, mountinfo, namespaces, pwd... And now that I look into it, we do not know the OS, but we know the image, so it can be looked up.
The concept of ABRT/Apport collecting "standard" operating system crash reports works rather well with full system containers (LXC or nspawn style) where the guest OS can run ABRT and the data collection, but my gut feeling is that you are rather aiming at docker application containers here?
ABRT supports both types of containers. But yes, we aim for docker here.
In the docker case, what do you intend to do with the ABRT reports? They most certainly shouldn't be sent to the Fedora crash database, they belong to the author of the affected docker container - but hub.docker.io doesn't have any kind of crash database or even a bug tracker.
This is a great question. Right now we are finishing a new feature and have prepared a FAF image. The idea is that if you run a huge number of containers (such as with OpenShift), you deploy FAF as well and set ABRT to report into your own FAF. This also supports reporting crashes without packages.
Any other comments? So what is the resolution of these comments? Should I continue with this? Or would you rather see this implemented through some other tool such as kpod?