276 be/3 root 0.00 B/s 0.00 B/s 0.00 % 99.85 % [jbd2/sda3-8]
When I booted, drive was clean!!
All my current running processes (i.e. ones I started), are quiescent.
With glitches like these, it is a LOT more difficult to honestly tell students why they should switch to Fedora, "because it is better????".
On 05/09/2018 10:58 AM, JD wrote:
276 be/3 root 0.00 B/s 0.00 B/s 0.00 % 99.85 % [jbd2/sda3-8]
When I booted, drive was clean!!
All my current running processes (i.e. ones I started), are quiescent.
With glitches like these, it is a LOT more difficult to honestly tell students why they should switch to Fedora, "because it is better????".
Whoa, whoa! jbd2 is the ext4 journalling daemon. It wouldn't be sucking up that kind of CPU continuously unless something is flogging the filesystem and that's what you have to look for.
It's not clear which device is getting hammered, so I'd run something like "iostat -p ALL 2" or something to see which device is getting poked, then chase things down from there (such as seeing what has open files on that device via lsof). Find the offending process, kill it and I'll bet that calms down. My first guess is something is doing a boatload of logging and that may not be obvious from a simple "ps" listing.
That being said, keep in mind that Fedora is bleeding edge stuff. You're undoubtedly going to get glitches but I seriously doubt this is a Fedora-specific issue. If you want stable, long lived stuff, use CentOS.
----------------------------------------------------------------------
- Rick Stevens, Systems Engineer, AllDigital    ricks@alldigital.com -
- AIM/Skype: therps2        ICQ: 22643734            Yahoo: origrps2 -
-                                                                    -
-              Huked on foniks reely wurked for me!                  -
----------------------------------------------------------------------
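The "iostat -p ALL 2" suggestion above boils down to sampling the kernel's per-device counters twice and seeing which device moved the most in between. As a rough sketch of that idea (not how iostat itself is implemented), the Python below diffs two snapshots in the /proc/diskstats format. The function names are made up for illustration, and ranking by "milliseconds spent doing I/O" is just one reasonable choice.

```python
import time


def parse_diskstats(text):
    """Map device name -> (sectors_written, ms_doing_io) from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or truncated lines
        # fields[2] = device name, fields[9] = sectors written,
        # fields[12] = milliseconds spent doing I/O
        stats[fields[2]] = (int(fields[9]), int(fields[12]))
    return stats


def busiest_devices(before, after):
    """Return (name, sectors_written_delta, io_ms_delta) tuples, busiest first."""
    old, new = parse_diskstats(before), parse_diskstats(after)
    deltas = []
    for name, (written, io_ms) in new.items():
        if name in old:
            deltas.append((name, written - old[name][0], io_ms - old[name][1]))
    return sorted(deltas, key=lambda d: d[2], reverse=True)


def sample_live(interval=2.0, path="/proc/diskstats"):
    """Take two snapshots `interval` seconds apart (Linux only) and rank devices."""
    with open(path) as f:
        first = f.read()
    time.sleep(interval)
    with open(path) as f:
        second = f.read()
    return busiest_devices(first, second)
```

Calling sample_live() on the affected machine while the stalls are happening should put the hammered disk and partition at the top of the list; from there, lsof narrows it down to a process.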
On 05/09/2018 10:58 AM, JD wrote:
276 be/3 root 0.00 B/s 0.00 B/s 0.00 % 99.85 % [jbd2/sda3-8]
What command produces this output? From just that line, it looks like there is no data transferred at all. I've never understood what 100% of I/O bandwidth means. How is the maximum calculated? Or does it mean that out of all the data transferred, 100% came from this process?
On 05/09/2018 03:23 PM, Samuel Sieb wrote:
On 05/09/2018 10:58 AM, JD wrote:
276 be/3 root 0.00 B/s 0.00 B/s 0.00 % 99.85 % [jbd2/sda3-8]
What command produces this output?
It comes from iotop.
From just that line, it looks like there is no data transferred at all. I've never understood what 100% of I/O bandwidth means. How is the maximum calculated? Or does it mean that out of all the data transferred, 100% came from this process?
That's not what it means. The first percentage is how much of that task's execution time it spent being swapped in and out (0.00%). The second is how much of its execution time it spent waiting on I/O to complete (99.85% in this case).
Here it indicates someone is flogging an ext4 filesystem fairly hard (an indexer walking a directory tree, something logging, etc.). The "sda3-8" in the thread name means the journal for the filesystem on sda3 (the 8 is the journal's inode number, not a count of partitions), but it doesn't say which process is doing the flogging. That's why I suggested an "iostat -p ALL 2" to confirm the active device, followed by an "lsof" to see what processes have that device open to hunt the culprit down.
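To make the column meanings concrete, here is a small Python sketch that picks apart the iotop line quoted in this thread. It assumes iotop's default column order (TID, PRIO, USER, DISK READ, DISK WRITE, SWAPIN, IO, COMMAND); the regular expression and function names are illustrative only. The second helper relies on the kernel naming the journal thread jbd2/<device>-<journal inode number>.

```python
import re

# One row of iotop's default output, e.g.
# "276 be/3 root 0.00 B/s 0.00 B/s 0.00 % 99.85 % [jbd2/sda3-8]"
IOTOP_LINE = re.compile(
    r"^\s*(?P<tid>\d+)\s+(?P<prio>\S+)\s+(?P<user>\S+)\s+"
    r"(?P<read>[\d.]+ \S+/s)\s+(?P<write>[\d.]+ \S+/s)\s+"
    r"(?P<swapin>[\d.]+) %\s+(?P<io>[\d.]+) %\s+(?P<command>.+)$"
)


def parse_iotop_line(line):
    """Split one iotop row into named columns."""
    m = IOTOP_LINE.match(line)
    if m is None:
        raise ValueError("not an iotop data row: %r" % line)
    row = m.groupdict()
    row["tid"] = int(row["tid"])
    row["swapin"] = float(row["swapin"])  # % of time spent swapping in
    row["io"] = float(row["io"])          # % of time spent waiting on I/O
    return row


def parse_jbd2_name(command):
    """Return (device, journal_inode) for a jbd2 thread name, else None."""
    m = re.fullmatch(r"\[?jbd2/(?P<device>[^\]]+)-(?P<inode>\d+)\]?", command.strip())
    return (m.group("device"), int(m.group("inode"))) if m else None
```

For the line in question, the read and write rates are both zero, the swap-in figure is 0.00 and the I/O wait figure is 99.85, and the command splits into device sda3 and journal inode 8.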
On 05/09/2018 05:37 PM, Rick Stevens wrote:
On 05/09/2018 03:23 PM, Samuel Sieb wrote:
On 05/09/2018 10:58 AM, JD wrote:
276 be/3 root 0.00 B/s 0.00 B/s 0.00 % 99.85 % [jbd2/sda3-8]
What command produces this output?
It comes from iotop.
From just that line, it looks like there is no data transferred at all. I've never understood what 100% of I/O bandwidth means. How is the maximum calculated? Or does it mean that out of all the data transferred, 100% came from this process?
That's not what it means. The first percentage is how much of that task's execution time it spent being swapped in and out (0.00%). The second is how much of its execution time it spent waiting on I/O to complete (99.85% in this case).
Here it indicates someone is flogging an ext4 filesystem fairly hard (an indexer walking a directory tree, something logging, etc.). The "sda3-8" in the thread name means the journal for the filesystem on sda3 (the 8 is the journal's inode number, not a count of partitions), but it doesn't say which process is doing the flogging. That's why I suggested an "iostat -p ALL 2" to confirm the active device, followed by an "lsof" to see what processes have that device open to hunt the culprit down.
OK, thanks! Will try that next time it gets going again.
On 05/09/2018 04:37 PM, Rick Stevens wrote:
On 05/09/2018 03:23 PM, Samuel Sieb wrote:
From just that line, it looks like there is no data transferred at all. I've never understood what 100% of I/O bandwidth means. How is the maximum calculated? Or does it mean that out of all the data transferred, 100% came from this process?
That's not what it means. The first percentage is how much of that task's execution time it spent being swapped in and out (0.00%). The second is how much of its execution time it spent waiting on I/O to complete (99.85% in this case).
Right, so it's not actually transferring any data, it's just waiting on I/O for some reason.
On Wed, 9 May 2018 16:55:03 -0700 Samuel Sieb wrote:
Right, so it's not actually transferring any data, it's just waiting on I/O for some reason.
If it is a brand-new, just-formatted ext4 filesystem mounted for the first time, the system writes all the initial journal data structures (or something like that) as soon as it is mounted. Once it finishes, it never has to do it again.
On 05/09/2018 06:08 PM, Tom Horsley wrote:
On Wed, 9 May 2018 16:55:03 -0700 Samuel Sieb wrote:
Right, so it's not actually transferring any data, it's just waiting on I/O for some reason.
If it is a brand-new, just-formatted ext4 filesystem mounted for the first time, the system writes all the initial journal data structures (or something like that) as soon as it is mounted. Once it finishes, it never has to do it again.
Hi Tom, there are 4 partitions on /dev/sda. sda1 and sda2 are for ms windoze (one of them contains recovery tools). sda3 is the Fedora boot partition, and sda4 is the Fedora swap. The installation has been working fine for well over a year, so this is not a recent installation. But the I/O load is indeed rather recent. It makes watching YouTube a very annoying experience of stop-and-go video. But after the jbd2 task finishes, I can watch YouTube without the stop-and-go annoyance.
On 05/09/2018 04:55 PM, Samuel Sieb wrote:
On 05/09/2018 04:37 PM, Rick Stevens wrote:
On 05/09/2018 03:23 PM, Samuel Sieb wrote:
From just that line, it looks like there is no data transferred at all. I've never understood what 100% of I/O bandwidth means. How is the maximum calculated? Or does it mean that out of all the data transferred, 100% came from this process?
That's not what it means. The first percentage is how much of that task's execution time it spent being swapped in and out (0.00%). The second is how much of its execution time it spent waiting on I/O to complete (99.85% in this case).
Right, so it's not actually transferring any data, it's just waiting on I/O for some reason.
Correct, and it's waiting because something is writing to the disk and blocking its ability to update the journal. Not completely blocking it as it spent 0.15% of its time doing something, uhm, "useful" during that sample. It's not even clear that there's an issue since I'd expect stuff like that to occur sporadically during times of heavy I/O. It's also common if you've just formatted an ext4 filesystem as it does, essentially, a "quick" format first, and slowly initializes the rest of the partition over some period of time. The idea is to make the filesystem usable quickly and get the housekeeping done in its spare time.
Assuming this isn't a fresh "mke2fs -t ext4" on the partition, I'd still try to find what's beating on that partition before messing around with anything drastic like changing the journal commit timing. The jbd2 thing in iotop (if it happens a LOT) indicates there's something weird going on--but it is NOT the cause of the problem. Don't shoot the messenger!
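For the "find what's beating on that partition" step, lsof is the usual tool. As a rough illustration of what it looks at (a sketch of the idea, not lsof's actual implementation), the Python below walks /proc/<pid>/fd on Linux and reports processes holding files open on a given device. The function name is made up, and without root it only sees processes owned by the current user.

```python
import os


def pids_with_open_files_on(device_id, proc_root="/proc"):
    """Return {pid: [paths]} for processes with files open on device `device_id` (an st_dev value)."""
    found = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are processes
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission (needs root)
        for fd in fds:
            link = os.path.join(fd_dir, fd)
            try:
                # stat() follows the fd link to the open file itself
                if os.stat(link).st_dev == device_id:
                    found.setdefault(int(entry), []).append(os.readlink(link))
            except OSError:
                continue  # descriptor closed while we were looking
    return found
```

Something like pids_with_open_files_on(os.stat("/").st_dev), run as root while the stall is happening, lists candidates on the root filesystem. It shows who has files open, not who is writing the most, so it narrows the search rather than naming the culprit outright.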
On 05/09/2018 06:24 PM, Rick Stevens wrote:
On 05/09/2018 04:55 PM, Samuel Sieb wrote:
On 05/09/2018 04:37 PM, Rick Stevens wrote:
On 05/09/2018 03:23 PM, Samuel Sieb wrote:
From just that line, it looks like there is no data transferred at all. I've never understood what 100% of I/O bandwidth means. How is the maximum calculated? Or does it mean that out of all the data transferred, 100% came from this process?
That's not what it means. The first percentage is how much of that task's execution time it spent being swapped in and out (0.00%). The second is how much of its execution time it spent waiting on I/O to complete (99.85% in this case).
Right, so it's not actually transferring any data, it's just waiting on I/O for some reason.
Correct, and it's waiting because something is writing to the disk and blocking its ability to update the journal. Not completely blocking it as it spent 0.15% of its time doing something, uhm, "useful" during that sample. It's not even clear that there's an issue since I'd expect stuff like that to occur sporadically during times of heavy I/O. It's also common if you've just formatted an ext4 filesystem as it does, essentially, a "quick" format first, and slowly initializes the rest of the partition over some period of time. The idea is to make the filesystem usable quickly and get the housekeeping done in its spare time.
Assuming this isn't a fresh "mke2fs -t ext4" on the partition, I'd still try to find what's beating on that partition before messing around with anything drastic like changing the journal commit timing. The jbd2 thing in iotop (if it happens a LOT) indicates there's something weird going on--but it is NOT the cause of the problem. Don't shoot the messenger!
Yes. As I replied earlier, next time it happens, I will post the output of iostat and lsof.