On Tue, Mar 23, 2021 at 7:11 PM Chris Murphy <lists@colorremedies.com> wrote:
On Tue, Mar 23, 2021 at 8:39 AM Richard Shaw <hobbes1069@gmail.com> wrote:
>
> I'm getting significant iowait while writing to a 100GB file.

High iowait means the system is under load and not CPU bound but IO
bound. It sounds like the drive is writing as fast as it can.   What's
the workload?

I was syncing a 100GB blockchain, which means the file was frequently being appended to, so COW was really killing my I/O (iowait > 50%). I had hoped that marking it nodatacow would be a 100% fix; instead, iowait would stay quite low but regularly jump to 25%-50%, occasionally locking up the GUI briefly. It was worst while the blockchain was syncing and I was rm'ing the old COW copy, and the stalls continued even after rm returned. I assume there were quite a few background tasks still doing updates.
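For reference, the nodatacow recreation I did looks roughly like this (a sketch with illustrative paths; the +C flag only applies to data written after it is set, which is why the file has to be rewritten):

```shell
# Sketch of recreating a file as nodatacow on btrfs; paths are illustrative.
# chattr +C must be set on an empty file, or inherited from the directory,
# BEFORE data is written -- setting it on an existing file's data has no effect.
mkdir ~/nocow-dir
chattr +C ~/nocow-dir                 # new files created here inherit No_COW
cat data.mdb > ~/nocow-dir/data.mdb   # rewrite so the data lands in nodatacow extents
lsattr ~/nocow-dir/data.mdb           # the 'C' flag should be listed
```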


Reproduce the GUI stalls and capture all of the
following:

sudo iostat -x -d -m 5

This is part of sysstat package (you can disable the service and timer
units it installs). Probably best to copy/paste into a plain text
file and put it up in a file share service, most anything else is
going to wrap it, making it hard to read. A minute of capture while
the workload is proceeding is enough. Also capture a few

grep -R . /proc/pressure
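For example, a one-minute capture of both side by side could look like this (a sketch; the filenames are arbitrary and it assumes sysstat is installed and the kernel has PSI, i.e. /proc/pressure exists, kernel >= 4.20):

```shell
# Sketch: capture one minute of disk stats and pressure-stall info together.
iostat -x -d -m 5 12 > iostat.log &           # 12 samples, 5 s apart
for i in $(seq 1 12); do
    date >> pressure.log
    grep -R . /proc/pressure >> pressure.log  # cpu/io/memory stall percentages
    sleep 5
done
wait
```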

Unfortunately it's now fully synced, so I can't easily reproduce the workload. I could move the file to another directory and start over, but then the file would start at 0 bytes, and it takes two to three days for things to sync.


And each of these (workload doesn't need to be running)

lsblk -o NAME,FSTYPE,SIZE,FSUSE%,MOUNTPOINT,UUID,MIN-IO,SCHED,DISC-GRAN,MODEL
uname -r
mount | grep btrfs

 $ lsblk -o NAME,FSTYPE,SIZE,FSUSE%,MOUNTPOINT,UUID,MIN-IO,SCHED,DISC-GRAN,MODEL
NAME        FSTYPE   SIZE FSUSE% MOUNTPOINT UUID                                 MIN-IO SCHED DISC-GRAN MODEL
sda                  2.7T                                                          4096 bfq          0B ST3000DM008-2DM166
└─sda1      btrfs    2.7T    27% /home      e80829f3-3dd3-486d-a553-dcf54b384c80   4096 bfq          0B
sr0                 1024M                                                           512 bfq          0B HL-DT-ST_BD-RE_WH14NS40
zram0                  4G        [SWAP]                                            4096              4K
nvme0n1            465.8G                                                           512 none       512B Samsung SSD 970 EVO Plus 500GB
├─nvme0n1p1 vfat     600M     3% /boot/efi  98D5-E8CE                               512 none       512B
├─nvme0n1p2 ext4       1G    28% /boot      48295095-3e89-4d32-905f-bbffcd2051ff    512 none       512B
└─nvme0n1p3 btrfs  464.2G     4% /var       eca99700-77f4-44ea-b8d5-26673abc4d65    512 none       512B

 
>I have already made it nocow by copying it to another directory, marking the directory nocow (+C) and using cat <oldfile> > <newfile> to re-create it from scratch.
>
> I was under the impression that this should fix the problem.

It depends on the workload for this file. Was the 100G file fallocated
or created as a sparse file? File format?

I assume for a blockchain it starts small and just grows by appends.
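The difference between the two creation modes is easy to see (a sketch with illustrative filenames, not from the thread):

```shell
# Sketch: a fallocated vs a sparse file of the same apparent size.
fallocate -l 100M prealloc.img   # blocks reserved up front
truncate  -s 100M sparse.img     # just a hole; no blocks allocated
du -h --apparent-size prealloc.img sparse.img   # both report 100M
du -h prealloc.img sparse.img                   # the sparse file reports ~0
```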

 
>
> On a tangent, it took about 30 minutes to delete the old file... My system is a Ryzen 5 3600 w/ 16GB of memory but it is a spinning disk. I use an NVMe for the system and the spinning disk for /home.

filefrag 100G.file
What's the path to the file?

$ filefrag /home/richard/.bitmonero/lmdb/data.mdb
/home/richard/.bitmonero/lmdb/data.mdb: 1424 extents found 
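For context, 1424 extents across a ~100G file works out to a fairly large average extent size, so fragmentation alone may not explain the stalls; filefrag -v shows the per-extent breakdown. A rough back-of-the-envelope check:

```shell
# Average extent size for a ~100G file with 1424 extents (integer MB)
echo "$(( 100 * 1024 / 1424 ))MB"   # roughly 71MB per extent
# Per-extent detail for the actual file:
sudo filefrag -v /home/richard/.bitmonero/lmdb/data.mdb | head
```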

However, I let a rebalance run overnight. 

Thanks,
Richard