Re: ask the detail about 'read' and 'write' in sanlock

Friday, 7 June 2013

On Fri, Jun 07, 2013 at 10:27:43PM +0800, tsiren tsi wrote:
...
 In the diskio.c, if the scsi command was used, the read or write timeout
 error would be more circumstantial. When the read/write timeout occurs, the
 scsi command could distinguish the actual reason, io busy, not ready,
 hardware error and so on. If the reason was io busy, we can enlarge the
 timeout-time for robustness.

 What do you think about this?

The only way I know of using scsi commands from userland is with sg.
sg is not very practical for i/o, and would be a big code change.
sanlock is used on multipath lvm LVs, which makes sg difficult.
Also, sanlock can be used on both devices and NFS files, and it is
nice to use the same code for both.

A more reasonable suggestion would be to keep the existing i/o paths
and use /dev/sg to get extra scsi information.  However, if you think
about how sanlock works, this extra information would not really help.
This is because it is not the host with i/o problems that needs extra
information; it is the *other* hosts that are monitoring it.  All
the other hosts would need to enlarge the timeout (and they would
also need this new timeout to be consistent among everyone).

My suggestion is to monitor the io delays (and their causes) in your
environment.  If you find io delays that are caused by system/storage
load and that trigger timeouts (or come close to timing out), then
enlarge the io timeout you pass to sanlock_add_lockspace_timeout().
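
As a rough illustration of the monitoring suggested above, here is a
minimal sketch that times a single read and flags delays that get close
to the io timeout.  This is not sanlock code: the helper names
(elapsed_ms, timed_read_ms, near_timeout) and the "half of the timeout"
warning threshold are assumptions made up for the example, and you
should substitute the io timeout you actually configure.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Elapsed milliseconds between two CLOCK_MONOTONIC timestamps. */
long elapsed_ms(const struct timespec *start, const struct timespec *end)
{
    long long ns = (long long)(end->tv_sec - start->tv_sec) * 1000000000LL
                 + (end->tv_nsec - start->tv_nsec);
    return (long)(ns / 1000000LL);
}

/*
 * Time one read of len bytes at offset 0 of path.
 * extra_flags is OR'ed into the open flags, e.g. O_DIRECT to bypass the
 * page cache (len should then be a multiple of the sector size).
 * Returns the delay in milliseconds, or -1 on any error.
 */
long timed_read_ms(const char *path, size_t len, int extra_flags)
{
    struct timespec t0, t1;
    void *buf = NULL;
    ssize_t rv;
    int fd;

    /* 4096-byte alignment so the buffer is also usable with O_DIRECT. */
    if (posix_memalign(&buf, 4096, len))
        return -1;

    fd = open(path, O_RDONLY | extra_flags);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    clock_gettime(CLOCK_MONOTONIC, &t0);
    rv = pread(fd, buf, len, 0);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    close(fd);
    free(buf);

    if (rv < 0)
        return -1;
    return elapsed_ms(&t0, &t1);
}

/*
 * Hypothetical policy: report a delay as "close to timing out" once it
 * reaches half of the io timeout (given in seconds).
 */
int near_timeout(long delay_ms, long io_timeout_sec)
{
    return delay_ms * 2 >= io_timeout_sec * 1000;
}
```

Calling timed_read_ms() periodically on the lockspace device or file and
logging the results whenever near_timeout() returns 1 would show how
often load pushes reads toward the limit, which is the evidence you want
before choosing a larger io timeout for every host.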

Dave
