On Tue, Mar 23, 2021 at 7:11 PM Chris Murphy <lists@colorremedies.com> wrote:
On Tue, Mar 23, 2021 at 8:39 AM Richard Shaw <hobbes1069@gmail.com> wrote:
>
> I'm getting significant iowait while writing to a 100GB file.

High iowait means the system is under load and not CPU bound but IO
bound. It sounds like the drive is writing as fast as it can.   What's
the workload?

I was syncing a 100GB blockchain, which means the file was frequently being appended to, so COW was really killing my I/O (iowait > 50%). I had hoped that marking it nodatacow would be a 100% fix; instead, iowait would stay quite low but regularly jump to 25%-50%, occasionally locking up the GUI briefly. It was worst while the blockchain was syncing and I was rm'ing the old COW copy, and the stalls continued even after rm returned. I assume there were quite a few background tasks still doing updates.
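For reference, the nodatacow recreation I did looks roughly like this (a sketch with illustrative paths; the +C flag only applies to data written after it is set, which is why the file has to be rewritten):

```shell
# Sketch of recreating a file as nodatacow on btrfs; paths are illustrative.
# chattr +C must be set on an empty file, or inherited from the directory,
# BEFORE data is written -- setting it on an existing file's data has no effect.
mkdir ~/nocow-dir
chattr +C ~/nocow-dir                 # new files created here inherit No_COW
cat data.mdb > ~/nocow-dir/data.mdb   # rewrite so the data lands in nodatacow extents
lsattr ~/nocow-dir/data.mdb           # the 'C' flag should be listed
```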


Reproduce the GUI stalls and capture all of the
following:

sudo iostat -x -d -m 5

This is part of sysstat package (you can disable the service and timer
units it installs). Probably best to copy/paste into a plain text
file and put it up in a file share service, most anything else is
going to wrap it, making it hard to read. A minute of capture while
the workload is proceeding is enough. Also capture a few

grep -R . /proc/pressure
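For example, a one-minute capture of both side by side could look like this (a sketch; the filenames are arbitrary and it assumes sysstat is installed and the kernel has PSI, i.e. /proc/pressure exists, kernel >= 4.20):

```shell
# Sketch: capture one minute of disk stats and pressure-stall info together.
iostat -x -d -m 5 12 > iostat.log &           # 12 samples, 5 s apart
for i in $(seq 1 12); do
    date >> pressure.log
    grep -R . /proc/pressure >> pressure.log  # cpu/io/memory stall percentages
    sleep 5
done
wait
```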

Unfortunately it's now fully synced, so I can't easily reproduce the workload. I could move the file to another directory and start over, but then the file would start at 0 bytes, and it takes two to three days for things to sync.


And each of these (workload doesn't need to be running)

lsblk -o NAME,FSTYPE,SIZE,FSUSE%,MOUNTPOINT,UUID,MIN-IO,SCHED,DISC-GRAN,MODEL
uname -r
mount | grep btrfs

 $ lsblk -o NAME,FSTYPE,SIZE,FSUSE%,MOUNTPOINT,UUID,MIN-IO,SCHED,DISC-GRAN,MODEL
NAME        FSTYPE   SIZE FSUSE% MOUNTPOINT UUID                                 MIN-IO SCHED DISC-GRAN MODEL
sda                  2.7T                                                          4096 bfq          0B ST3000DM008-2DM166
└─sda1      btrfs    2.7T    27% /home      e80829f3-3dd3-486d-a553-dcf54b384c80   4096 bfq          0B
sr0                 1024M                                                           512 bfq          0B HL-DT-ST_BD-RE_WH14NS40
zram0                  4G        [SWAP]                                            4096              4K
nvme0n1            465.8G                                                           512 none       512B Samsung SSD 970 EVO Plus 500GB
├─nvme0n1p1 vfat     600M     3% /boot/efi  98D5-E8CE                               512 none       512B
├─nvme0n1p2 ext4       1G    28% /boot      48295095-3e89-4d32-905f-bbffcd2051ff    512 none       512B
└─nvme0n1p3 btrfs  464.2G     4% /var       eca99700-77f4-44ea-b8d5-26673abc4d65    512 none       512B

 
>I have already made it nocow by copying it to another directory, marking the directory nocow (+C) and using cat <oldfile> > <newfile> to re-create it from scratch.
>
> I was under the impression that this should fix the problem.

It depends on the workload for this file. Was the 100G file fallocated
or created as a sparse file? File format?

I assume for a blockchain it starts small and just grows by appends.
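The difference between the two creation modes is easy to see (a sketch with illustrative filenames, not from the thread):

```shell
# Sketch: a fallocated vs a sparse file of the same apparent size.
fallocate -l 100M prealloc.img   # blocks reserved up front
truncate  -s 100M sparse.img     # just a hole; no blocks allocated
du -h --apparent-size prealloc.img sparse.img   # both report 100M
du -h prealloc.img sparse.img                   # the sparse file reports ~0
```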

 
>
> On a tangent, it took about 30 minutes to delete the old file... My system is a Ryzen 5 3600 w/ 16GB of memory but it is a spinning disk. I use an NVMe for the system and the spinning disk for /home.

filefrag 100G.file
What's the path to the file?

$ filefrag /home/richard/.bitmonero/lmdb/data.mdb
/home/richard/.bitmonero/lmdb/data.mdb: 1424 extents found 
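For context, 1424 extents across a ~100G file works out to a fairly large average extent size, so fragmentation alone may not explain the stalls; filefrag -v shows the per-extent breakdown. A rough back-of-the-envelope check:

```shell
# Average extent size for a ~100G file with 1424 extents (integer MB)
echo "$(( 100 * 1024 / 1424 ))MB"   # roughly 71MB per extent
# Per-extent detail for the actual file:
sudo filefrag -v /home/richard/.bitmonero/lmdb/data.mdb | head
```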

However, I let a rebalance run overnight. 

Thanks,
Richard