That network looks fine to me.
I would try v3. I have had bad luck many times with v4 on a variety of kernels. If the code is recovering from something bug-related, 45 seconds might be about right for it to decide that something that was working no longer is.
I am not sure any amount of debugging would help (without having really verbose kernel debugging).
What kernel are you currently running? Trying a newer one might be worth it, though I don't see any NFS changes or fixes listed in the 5.14.* or 5.13.* kernel changelogs in the rpm (rpm -q --changelog), and only a few are listed at kernel.org for those kernels.
On Tue, Oct 5, 2021 at 11:04 AM Terry Barnaby terry1@beam.ltd.uk wrote:
sar -n EDEV reports all 0's all around then. There are some rxdrop/s of 0.02 occasionally on eno1 through the day (about 20 of these with minute based sampling). Today ifconfig lists 39 dropped RX packets out of 2357593. Not sure why there are some dropped packets. "ethtool -S eno1" doesn't seem to list any particular issues.
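For scale, the ifconfig figures above work out to a very small fraction. A quick Python check, using only the two counts quoted above (the variable names are just for illustration):

```python
# Counts taken from the ifconfig output quoted above.
rx_dropped = 39
rx_packets = 2357593

# Fraction of received packets that were dropped.
fraction = rx_dropped / rx_packets

# Roughly one dropped packet per this many received.
one_in = rx_packets / rx_dropped

print(f"drop fraction: {fraction:.2e}")
print(f"about 1 in {one_in:.0f} packets")
```

That is on the order of one drop per tens of thousands of packets. On its own that would not obviously explain a 45 second stall, though it does not rule out a burst of loss at the wrong moment.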
sar -n DEV does not appear to show anything at 10:51:30:
          IFACE   rxpck/s   txpck/s    rxkB/s    txkB/s   rxcmp/s   txcmp/s  rxmcst/s   %ifutil
10:44:04  eno1      18.29     19.54      5.81      5.25      0.00      0.00      0.00      0.00
10:45:04  lo         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:45:04  eno1      20.45     22.52      5.96      5.79      0.00      0.00      0.00      0.00
10:46:04  lo         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:46:04  eno1      22.50     24.26      7.52      7.88      0.00      0.00      0.00      0.01
10:47:04  lo         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:47:04  eno1      21.53     22.75      7.27      5.71      0.00      0.00      0.00      0.01
10:48:04  lo         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:48:04  eno1     222.03    284.24    173.49    367.55      0.00      0.00      0.00      0.30
10:49:04  lo         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:49:04  eno1      11.83     12.28      2.74      3.98      0.00      0.00      0.00      0.00
10:50:04  lo         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:50:04  eno1      15.72     14.13      4.33      3.80      0.00      0.00      0.00      0.00
10:51:04  lo         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:51:04  eno1      11.00     10.53      3.48      2.63      0.00      0.00      0.00      0.00
10:52:04  lo         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:52:04  eno1      13.48     13.45      4.21      4.56      0.00      0.00      0.00      0.00
10:53:04  lo         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:53:04  eno1      21.76     23.98      6.99     10.26      0.00      0.00      0.00      0.01
10:54:04  lo         0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
Also, NFSv4 uses TCP by default, I think, and TCP retries would be much quicker than 45 seconds. I do feel there is an issue in the NFS code somewhere, but I am biased about the speed of NFS directory access these days!
On 04/10/2021 17:06, Roger Heflin wrote:
Since it is recovering from it, maybe it is losing packets inside the network. What do "sar -n DEV" and "sar -n EDEV" look like during that time, on both the client seeing the pause and the server?
EDEV is typically all zeros unless something is lost. If something is being lost and it matches the times of the hangs, that could be it.
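Checking a day of EDEV samples by eye is tedious, so here is a rough Python sketch that lists only the rows with a non-zero counter, which can then be compared against the hang times. It assumes the usual text layout of "sar -n EDEV" (a header row containing IFACE followed by the counter names, then one row per interface per sample). The function name and the sample text are made up for illustration, not real output from either machine.

```python
def nonzero_edev_rows(text):
    """Return (timestamp, iface, {counter: value}) for rows with any non-zero counter."""
    hits = []
    iface_idx = None   # position of the IFACE column, learned from the header
    names = []         # counter names following IFACE in the header
    for line in text.splitlines():
        tokens = line.split()
        if not tokens or tokens[0].startswith("Average"):
            continue   # skip blank lines and the summary rows
        if "IFACE" in tokens:
            iface_idx = tokens.index("IFACE")
            names = tokens[iface_idx + 1:]
            continue
        if iface_idx is None or len(tokens) != iface_idx + 1 + len(names):
            continue   # banner line or anything not shaped like a data row
        try:
            values = [float(t) for t in tokens[iface_idx + 1:]]
        except ValueError:
            continue
        nonzero = {n: v for n, v in zip(names, values) if v != 0.0}
        if nonzero:
            stamp = " ".join(tokens[:iface_idx])
            hits.append((stamp, tokens[iface_idx], nonzero))
    return hits


# Illustrative sample only (hypothetical values).
SAMPLE = """\
10:44:04        IFACE   rxerr/s   txerr/s    coll/s  rxdrop/s  txdrop/s  txcarr/s  rxfram/s  rxfifo/s  txfifo/s
10:45:04           lo      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
10:45:04         eno1      0.00      0.00      0.00      0.02      0.00      0.00      0.00      0.00      0.00
10:46:04         eno1      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00      0.00
"""

for stamp, iface, counters in nonzero_edev_rows(SAMPLE):
    print(stamp, iface, counters)
```

To use it on real data, save the output of "sar -n EDEV" to a file on both the client and the server and pass the file contents to the function.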