Hello,
for a new platform I am building in Go, I am trying to use sanlock.
One of the rough edges is that it does not have Go bindings, and I am not a fan of using cgo for this, so I am using 'sanlock client command' with a bunch of chained commands to do what I need under a resource lease.
One of the issues I see with sanlock client command is that it relies on the sanlock daemon detecting the exit of the process to clean up the reservation instead of an explicit sanlock_release.
In my view this opens the door to race conditions when multiple commands are run in series: we may get EAGAIN for no other reason than the daemon not having fully processed the previous command's exit.
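As an illustration of the workaround a caller would otherwise need, here is a minimal C sketch of a bounded retry on -EAGAIN. The names acquire_with_retry, acquire_fn and busy_then_ok are hypothetical and not part of sanlock; the callback stands in for whatever actually attempts the acquire.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Hypothetical callback type: returns 0 on success or a negative errno. */
typedef int (*acquire_fn)(void *ctx);

/*
 * Call try_acquire up to max_tries times, sleeping delay_ms between
 * attempts, for as long as it keeps returning -EAGAIN. Any other result
 * (success or a different error) is returned immediately.
 */
static int acquire_with_retry(acquire_fn try_acquire, void *ctx,
                              int max_tries, long delay_ms)
{
    struct timespec delay = { delay_ms / 1000, (delay_ms % 1000) * 1000000L };
    int rv = -EAGAIN;
    int attempt;

    for (attempt = 0; attempt < max_tries; attempt++) {
        rv = try_acquire(ctx);
        if (rv != -EAGAIN)
            return rv;
        if (attempt + 1 < max_tries)
            nanosleep(&delay, NULL);
    }
    return rv;
}

/* Stand-in for a busy lease: fails with -EAGAIN *ctx times, then succeeds. */
static int busy_then_ok(void *ctx)
{
    int *remaining = ctx;

    if (*remaining > 0) {
        (*remaining)--;
        return -EAGAIN;
    }
    return 0;
}
```

A retry loop like this only hides the window; it cannot tell a lease that is genuinely held elsewhere from one that is merely waiting for cleanup, which is the motivation for an explicit release.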
Another issue I see is that often I need to run commands in a series, all under lock, where the order is important, and any failure should interrupt the command sequence.
This can be done with
sanlock client command -r ... -c /bin/sh -c 'command1 && command2'...
but I would prefer to keep shell scripting out of the picture.
So I was thinking of a new
sanlock client spawn -r ... -c arg0 arg1 arg2 ... -c ...
with repeatable -c options, so that multiple commands can be spawned in sequence, and any failure stops the sequence.
The commands would be run in subprocesses, with an explicit call to sanlock_release() after all commands have executed, so that on return from sanlock client spawn we know that the lease is released.
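The "command1 && command2" behavior without a shell can be sketched in C with fork, exec and waitpid. This is only an illustration of the idea, with no lease handling; run_one and run_sequence are hypothetical names, and execvp is used here (an assumption for convenience) so that commands are found via PATH.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run one NULL-terminated argv in a child process. Returns the child's
 * exit status, 1 if it was terminated by a signal, 127 if exec failed,
 * or -1 if fork or waitpid failed.
 */
static int run_one(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127); /* only reached when exec fails */
    }
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : 1;
}

/*
 * Run the commands in order and stop at the first non-zero result.
 * *ran is set to the number of commands that were started.
 */
static int run_sequence(char *const *cmds[], int count, int *ran)
{
    int i, rc = 0;

    *ran = 0;
    for (i = 0; i < count; i++) {
        (*ran)++;
        rc = run_one(cmds[i]);
        if (rc != 0)
            break;
    }
    return rc;
}
```

Because the parent outlives every child, it is the natural place to call an explicit release once the loop ends, whatever the outcome.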
What do you think? Is this something of interest for upstream sanlock?
Thanks,
Claudio
On Sat, Mar 07, 2026 at 03:43:24PM +0100, Claudio Fontana via sanlock-devel wrote:
[...]
Hi, so you want to acquire a resource lease, run a series of commands while that lease remains held, and then release the lease when they are done. The difficulty is deciding which process remains running throughout the entire sequence, since that is the process the sanlock daemon considers the lease holder.
IIUC, you're suggesting that a new "sanlock client spawn" would register itself as the supervising process (like "sanlock client command"), acquire the lease, then fork/exec each command specified by a series of -c. When all commands are done (or any fails), then the original "sanlock client spawn" process would release the lease. That sounds like a reasonable extension. A couple solvable issues come to mind related to recovery.
The sanlock daemon assumes by default that a connected process is the one both holding the lease and writing to the shared storage. If that process exits, the resource lease held by the dead process can be safely released automatically. In your use case (and often in other cases), it's not just the connected process writing to the shared storage. If the connected process exits, it doesn't mean that the resource lease can be safely dropped, because some other process may still be writing. lvmlockd has this same issue. The way to deal with this is for "sanlock client spawn" to:
1. Use SANLK_RES_PERSISTENT for the resource lease, so that the lease is not dropped if the supervising process exits without calling release explicitly. (e.g. if you kill the sanlock client spawn process, but one of the -c commands it forked is still writing to the shared storage.) The held lease will remain held as an orphan (clearing that orphan lease would require some other procedure specific to what you're doing.) If you want to try this, using -P 1 with sanlock client command will use PERSISTENT leases.
2. Call sanlock_restrict(SANLK_RESTRICT_SIGKILL | SANLK_RESTRICT_SIGTERM) after sanlock_register() from "sanlock client spawn". If the sanlock daemon cannot renew its lockspace lease while "sanlock client spawn" is running, the sanlock daemon will not simply use SIGTERM or SIGKILL to terminate the supervising spawn process to do recovery.
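The reasoning behind the first point can be written down as a toy decision rule in C. This is a model of the behavior described above, not sanlock code; the enum and both function names are hypothetical.

```c
#include <stdbool.h>

/* Possible outcomes for a held lease once its holder's connection ends. */
enum lease_fate {
    LEASE_RELEASED, /* dropped; another host may now acquire it */
    LEASE_ORPHANED, /* still held, but no process owns it */
};

/*
 * Toy rule: an explicit release always drops the lease. Without one,
 * a persistent lease survives as an orphan, while a non-persistent
 * lease is dropped automatically when the holder goes away.
 */
static enum lease_fate fate_when_holder_exits(bool released_explicitly,
                                              bool persistent)
{
    if (released_explicitly)
        return LEASE_RELEASED;
    return persistent ? LEASE_ORPHANED : LEASE_RELEASED;
}

/* A dropped lease is only safe if no child is still writing to storage. */
static bool outcome_is_safe(enum lease_fate fate, bool child_still_writing)
{
    return !(fate == LEASE_RELEASED && child_still_writing);
}
```

The unsafe combination is a killed supervisor, a non-persistent lease, and a child that is still writing; the persistent flag turns that case into an orphan that has to be cleared by hand.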
Dave
Hi,
On 3/9/26 20:45, David Teigland via sanlock-devel wrote:
[...]
IIUC, you're suggesting that a new "sanlock client spawn" would register itself as the supervising process (like "sanlock client command"), acquire the lease, then fork/exec each command specified by a series of -c. When all commands are done (or any fails), then the original "sanlock client spawn" process would release the lease. That sounds like a reasonable extension. A couple solvable issues come to mind related to recovery.
Yes, that is what I had in mind.
[...]
I have an initial implementation that I would like to share early, so you can see what I am doing and comment on it.
I still have to test it. Thanks!
Claudio
----
commit 6f701e502aadb0e8385f28aeba1b2c281f1ecfab
Author: Claudio Fontana <cfontana@suse.de>
Date:   Wed Mar 11 11:50:41 2026 +0100
sanlock: add sanlock client spawn action
This new action overcomes some limitations of the action
sanlock client command
by allowing multiple commands to run in succession under the same lease, as long as each completes successfully (exit status 0). All commands are spawned as child processes using fork() and exec().
After all commands have executed successfully, or at the first failure, the lease is released explicitly with sanlock_release() before exiting,
removing the time window in which the resource remains busy while the daemon has yet to process the POLLHUP.
sanlock_internal.h: add new ACT_SPAWN action, and its options
main.c: implement command line parsing and the ACT_SPAWN action
sanlock.8: document the new action and its options
Signed-off-by: Claudio Fontana <cfontana@suse.de>
diff --git a/src/main.c b/src/main.c
index 8317509..db0afe4 100644
--- a/src/main.c
+++ b/src/main.c
@@ -2319,6 +2319,7 @@ static void print_usage(void)
 	printf("sanlock client inq_lockspace -s LOCKSPACE\n");
 	printf("sanlock client rem_lockspace -s LOCKSPACE\n");
 	printf("sanlock client command -r RESOURCE [-h 0|1] -c <path> <args>\n");
+	printf("sanlock client spawn -r RESOURCE [-h 0|1] [-O 0|1] [-P 0|1] -c COUNT CMD [ARG...] [-c COUNT CMD [ARG...]]\n");
 	printf("sanlock client acquire -r RESOURCE -p|-C <id>\n");
 	printf("sanlock client convert -r RESOURCE -p|-C <id>\n");
 	printf("sanlock client release -r RESOURCE -p|-C <id>\n");
@@ -2369,6 +2370,52 @@ static void print_usage(void)
 	printf("\n");
 }
 
+/*
+ * read_spawn_command - parse a single "-c COUNT CMD [ARG...]" option group
+ *
+ * Called with I pointing at the COUNT argument (the option argument
+ * that follows -c). Parses COUNT and the subsequent CMD [ARG...] words,
+ * appends a new entry to com.spawn_args, and returns the updated index
+ * pointing at the next option. The caller can then check argv[i] to decide
+ * whether another -c group follows.
+ *
+ * Returns the updated index (>= 0) on success, or a negative error code.
+ */
+static int read_spawn_command(int i, int argc, char *argv[])
+{
+	int count, k;
+
+	if (i >= argc) {
+		log_tool("spawn -c requires COUNT and CMD");
+		return -EINVAL;
+	}
+	count = strtol(argv[i], NULL, 0);
+	if (i + count >= argc) {
+		log_tool("spawn -c COUNT past the end of arguments");
+		return -EINVAL;
+	}
+
+	com.spawn_args = realloc(com.spawn_args,
+				 (com.spawn_count + 1) * sizeof(struct spawn_cmd));
+	if (!com.spawn_args)
+		return -ENOMEM;
+
+	com.spawn_args[com.spawn_count].argc = count;
+	com.spawn_args[com.spawn_count].argv = malloc((count + 1) * sizeof(char *));
+	if (!com.spawn_args[com.spawn_count].argv)
+		return -ENOMEM;
+
+	for (k = 0; k < count; k++) {
+		i += 1;
+		com.spawn_args[com.spawn_count].argv[k] = strdup(argv[i]);
+		if (!com.spawn_args[com.spawn_count].argv[k])
+			return -ENOMEM;
+	}
+	com.spawn_args[com.spawn_count].argv[count] = NULL;
+	com.spawn_count += 1;
+	return i + 1;
+}
+
 static int read_command_line(int argc, char *argv[])
 {
 	char optchar;
@@ -2447,6 +2494,8 @@ static int read_command_line(int argc, char *argv[])
 			com.action = ACT_REM_LOCKSPACE;
 		else if (!strcmp(act, "command"))
 			com.action = ACT_COMMAND;
+		else if (!strcmp(act, "spawn"))
+			com.action = ACT_SPAWN;
 		else if (!strcmp(act, "acquire"))
 			com.action = ACT_ACQUIRE;
 		else if (!strcmp(act, "convert"))
@@ -2622,7 +2671,8 @@ static int read_command_line(int argc, char *argv[])
 			com.wait = atoi(optionarg);
 			break;
 		case 'h':
-			if (com.action == ACT_GETS || com.action == ACT_CLIENT_READ || com.action == ACT_COMMAND)
+			if (com.action == ACT_GETS || com.action == ACT_CLIENT_READ ||
+			    com.action == ACT_COMMAND || com.action == ACT_SPAWN)
 				com.get_hosts = atoi(optionarg);
 			else
 				com.high_priority = atoi(optionarg);
@@ -2720,7 +2770,20 @@ static int read_command_line(int argc, char *argv[])
 			break;
 
 		case 'c':
-			begin_command = 1;
+			if (com.action == ACT_SPAWN) {
+				/*
+				 * repeated -c COUNT CMD [ARG...] where COUNT includes the
+				 * CMD itself. This lets the parser consume arguments with no
+				 * ambiguity regardless of their content.
+				 */
+				do {
+					i = read_spawn_command(i, argc, argv);
+					if (i < 0)
+						return i;
+				} while (i < argc && !strcmp(argv[i], "-c") && ++i);
+			} else {
+				begin_command = 1;
+			}
 			break;
 
 		case 'A':
@@ -3533,6 +3596,116 @@ static void do_client_version(void)
 	       (proto & 0x0000FFFF));
 }
 
+/*
+ * log_owner - for ACT_COMMAND and ACT_SPAWN, log the owner when failing to acquire
+ *
+ * Only logs when option "-h 1" is used
+ */
+static void log_owner(struct sanlk_host *owner, char *owner_name)
+{
+	if (com.get_hosts && (owner->host_id || owner_name)) {
+		log_tool("owner: host_id %llu generation %llu timestamp %llu state %s name %s",
+			 (unsigned long long)owner->host_id,
+			 (unsigned long long)owner->generation,
+			 (unsigned long long)owner->timestamp,
+			 host_state_str(owner->flags),
+			 owner_name ?: "none");
+	}
+}
+
+/*
+ * do_spawn - implementation of "sanlock client spawn"
+ *
+ * Acquires the resource lease, runs a sequence of commands specified via
+ * repeated "-c COUNT CMD [ARG...]" options, then releases the lease explicitly
+ * and synchronously before returning.
+ */
+static int do_spawn(void)
+{
+	struct sanlk_host owner = { 0 };
+	char *owner_name = NULL;
+	uint32_t flags = 0;
+	pid_t pid;
+	int status, exit_code;
+	int i, fd, rv;
+
+	if (!com.spawn_count) {
+		log_tool("spawn requires at least one -c command");
+		return -EINVAL;
+	}
+
+	log_tool("register");
+	fd = sanlock_register();
+	log_tool("register done %d", fd);
+	if (fd < 0)
+		return fd;
+	/*
+	 * Do not allow the daemon to use SIGTERM or SIGKILL on this process
+	 * for recovery. If the lockspace lease cannot be renewed, the daemon
+	 * must fence the node via the watchdog instead. This prevents the
+	 * unsafe scenario of a killed supervisor with a still-running writer.
+	 */
+	sanlock_restrict(fd, SANLK_RESTRICT_SIGKILL | SANLK_RESTRICT_SIGTERM);
+	flags |= com.orphan ? SANLK_ACQUIRE_ORPHAN : 0;
+
+	log_tool("acquire fd %d", fd);
+	if (com.get_hosts)
+		rv = sanlock_acquire2(fd, -1, flags, com.res_args[0], NULL, &owner, &owner_name);
+	else
+		rv = sanlock_acquire(fd, -1, flags, com.res_count, com.res_args, NULL);
+	log_tool("acquire done %d", rv);
+
+	if (rv < 0) {
+		log_owner(&owner, owner_name);
+		if (owner_name)
+			free(owner_name);
+		close(fd);
+		return rv;
+	}
+	/*
+	 * Run each command sequentially under the held lease, stop on failure.
+	 */
+	exit_code = 0;
+	for (i = 0; i < com.spawn_count; i++) {
+		log_tool("spawn: %s", com.spawn_args[i].argv[0]);
+		pid = fork();
+		if (pid < 0) {
+			log_tool("spawn fork failed: %s", strerror(errno));
+			exit_code = -errno;
+			break;
+		} else if (pid == 0) {
+			/* child: close the lease fd so it is not inherited */
+			close(fd);
+			execv(com.spawn_args[i].argv[0], com.spawn_args[i].argv);
+			perror("spawn execv failed");
+			exit(1);
+		}
+		/* parent: wait for child to complete before continuing */
+		if (waitpid(pid, &status, 0) < 0) {
+			log_tool("spawn waitpid failed: %s", strerror(errno));
+			exit_code = -errno;
+			break;
+		}
+		exit_code = WIFEXITED(status) ? WEXITSTATUS(status) : 1;
+		log_tool("spawn done %d", exit_code);
+		if (exit_code != 0)
+			break;
+	}
+	/*
+	 * Explicit synchronous release. When sanlock_release() returns the
+	 * daemon has already processed the release, unlike ACT_COMMAND which
+	 * relies on POLLHUP as an asynchronous method to release the lease.
+	 */
+	log_tool("release");
+	rv = sanlock_release(fd, -1, SANLK_REL_ALL, 0, NULL);
+	log_tool("release done %d", rv);
+	close(fd);
+	if (rv < 0) {
+		return rv;
+	}
+	return exit_code;
+}
+
 static int do_client(void)
 {
 	struct sanlk_host_event he;
@@ -3550,7 +3723,7 @@ static int do_client(void)
 	int i, fd;
 	int rv = 0;
 
-	if (com.action == ACT_COMMAND || com.action == ACT_ACQUIRE) {
+	if (com.action == ACT_COMMAND || com.action == ACT_SPAWN || com.action == ACT_ACQUIRE) {
 		for (i = 0; i < com.res_count; i++) {
 			res = com.res_args[i];
 
@@ -3592,6 +3765,10 @@ static int do_client(void)
 		log_tool("shutdown done %d", rv);
 		break;
 
+	case ACT_SPAWN:
+		rv = do_spawn();
+		break;
+
 	case ACT_COMMAND:
 		log_tool("register");
 		fd = sanlock_register();
@@ -3611,14 +3788,9 @@ static int do_client(void)
 		log_tool("acquire done %d", rv);
 
 		if (rv < 0) {
-			if (com.get_hosts && (owner.host_id || owner_name)) {
-				log_tool("owner: host_id %llu generation %llu timestamp %llu state %s name %s",
-					 (unsigned long long)owner.host_id, (unsigned long long)owner.generation,
-					 (unsigned long long)owner.timestamp, host_state_str(owner.flags),
-					 owner_name ?: "none");
-				if (owner_name)
-					free(owner_name);
-			}
+			log_owner(&owner, owner_name);
+			if (owner_name)
+				free(owner_name);
 			goto out;
 		}
 
diff --git a/src/sanlock.8 b/src/sanlock.8
index e591060..a9562e5 100644
--- a/src/sanlock.8
+++ b/src/sanlock.8
@@ -765,6 +765,64 @@ Register with the sanlock daemon, acquire the specified resource lease,
 and exec the command at path with args. When the command exits, the
 sanlock daemon will release the lease. -c must be the final option.
 
+.BR "sanlock client spawn -r" " RESOURCE " \
+\fB-c\fP " " \fICOUNT\fP " " \fICMD\fP " [" \fIARG\fP "...] " \
+\fR[\fP\fB-c\fP " " \fICOUNT\fP " " \fICMD\fP " [" \fIARG\fP "...]\fR]\fP..."
+
+Register with the sanlock daemon, acquire the specified resource lease,
+fork and exec each command specified by
+.B -c
+sequentially as separate processes, checking the exit status of each
+process, and only moving to the next process on success.
+After all processes are successfully executed, or at the first failure,
+the lease is released explicitly before exiting.
+
+.BR "sanlock client spawn" differs substantially from
+.BR "sanlock client command", which execs a single command, and relies
+on the daemon detecting process exit via POLLHUP to release the lease
+asynchronously.
+Instead, this action uses fork and waitpid so that the process survives
+to call
+.B sanlock_release()
+synchronously. This means that on exit, the sanlock daemon has already
+processed the release, and a valid subsequent request to acquire the
+same resource will not fail with EAGAIN.
+
+sanlock_restrict(SANLK_RESTRICT_SIGKILL | SANLK_RESTRICT_SIGTERM) is called
+after sanlock_register() so that if the lockspace lease cannot be renewed,
+the daemon fences the node via the watchdog rather than sending signals to
+this process. A killed supervisor with a still-running writer child would
+be unsafe.
+
+For each command to spawn, a separate
+.B -c
+option is used, followed by the count of the arguments and the arguments
+themselves (mirroring the argc, argv convention).
+This allows
+.I COUNT
+arguments to be consumed unambiguously regardless of their content,
+including arguments that would otherwise look like options.
+Commands are run in order and execution stops on the first failure,
+mirroring "command1 && command2" shell behavior.
+
+.B -P 1
+PERSISTENT flag is recommended, so that if this process is killed while a
+child is still running, the lease becomes an orphan. This protects against
+the daemon releasing the lease while a child is still writing to storage.
+
+To release an orphan lease left by a crashed spawn invocation:
+sanlock client release -r LOCKSPACE:RESOURCE:PATH:OFFSET -O 1
+
+Use
+.B -h 1
+to log the current owner information (host_id, generation, timestamp,
+and owner name) when the acquire fails because another host holds the
+lease.
+
+Use
+.B -O 1
+to acquire an existing orphan lease
+
 .BR "sanlock client acquire -r" " RESOURCE " \
 \fB-p\fP " " \fIpid\fP
 .br
diff --git a/src/sanlock_internal.h b/src/sanlock_internal.h
index b58144b..ab74382 100644
--- a/src/sanlock_internal.h
+++ b/src/sanlock_internal.h
@@ -366,6 +366,11 @@ EXTERN struct client *client;
 #define DEFAULT_MAX_SECTORS_KB_ALIGN 0 /* set it to align size */
 #define DEFAULT_MAX_SECTORS_KB_NUM 1024 /* set it to num KB for all lockspaces */
 
+struct spawn_cmd {
+	int argc;
+	char **argv; /* argv[argc] is NULL */
+};
+
 struct command_line {
 	int type; /* COM_ */
 	int action; /* ACT_ */
@@ -437,6 +442,8 @@ struct command_line {
 	struct sanlk_rindex rindex; /* -x RINDEX */
 	struct sanlk_lockspace lockspace; /* -s LOCKSPACE */
 	struct sanlk_resource *res_args[SANLK_MAX_RESOURCES]; /* -r RESOURCE */
+	struct spawn_cmd *spawn_args; /* -c COUNT CMD [ARG...], for ACT_SPAWN */
+	int spawn_count;
 };
 
 EXTERN struct command_line com;
@@ -498,6 +505,7 @@ enum {
 	ACT_CLIENT_INIT_HOST,
 	ACT_READ,
 	ACT_SET_HOST,
+	ACT_SPAWN,
 };
 
 EXTERN int external_shutdown;
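The "-c COUNT CMD [ARG...]" convention used by the patch can also be sketched standalone. The version below is a variation under stated assumptions, not the posted code: parse_count_group is a hypothetical name, it returns a pointer into argv rather than copying the words, and it rejects a COUNT that is non-numeric or smaller than 1, checks that the posted read_spawn_command() does not make.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/*
 * Parse one "COUNT CMD [ARG...]" group where argv[i] is COUNT.
 * On success, *cmd_out points at CMD inside argv, *count_out holds COUNT,
 * and the index of the first word after the group is returned.
 * Returns -EINVAL if COUNT is missing, malformed, < 1, or runs past argc.
 */
static int parse_count_group(int i, int argc, char *argv[],
                             char ***cmd_out, int *count_out)
{
    char *end;
    long count;

    if (i >= argc)
        return -EINVAL;

    errno = 0;
    count = strtol(argv[i], &end, 10);
    if (errno || end == argv[i] || *end != '\0')
        return -EINVAL;

    /* COUNT words must actually be present after the COUNT itself. */
    if (count < 1 || count > argc - i - 1)
        return -EINVAL;

    *cmd_out = &argv[i + 1];
    *count_out = (int)count;
    return i + 1 + (int)count;
}
```

Since COUNT fixes how many words belong to the command, an argument that is literally "-c" is consumed as data instead of being mistaken for the start of the next group.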
On 3/11/26 13:26, Claudio Fontana via sanlock-devel wrote:
[...]
I ran a first test, which seems promising, but I'll test more.
One thing where I see a potential issue (though this is more about how I am using sanlock in my application than about this patch)
is that I am currently taking the lease in order to de-provision the resource itself, so I am removing the resource file.
So when I do:
+	/*
+	 * Explicit synchronous release. When sanlock_release() returns the
+	 * daemon has already processed the release, unlike ACT_COMMAND which
+	 * relies on POLLHUP as an asynchronous method to release the lease.
+	 */
+	log_tool("release");
+	rv = sanlock_release(fd, -1, SANLK_REL_ALL, 0, NULL);
+	log_tool("release done %d", rv);
+	close(fd);
+	if (rv < 0) {
+		return rv;
+	}
+	return exit_code;
in the de-provisioning case, I then get error -2 (no such file or directory) from sanlock_release().
This is more or less expected, since we are indeed removing the lease,
but I wonder if you have any comments on this usage pattern,
where I take the lease in order to remove the resource, and the last step, in my mind, is to remove the resource file. Is this a known usage pattern? It makes sense to me, but does it to you?
Thank you,
Claudio
On Wed, Mar 11, 2026 at 06:16:12PM +0100, Claudio Fontana wrote:
[...]
It seems ok, you're right that it's a pattern we haven't seen before. We probably want to formalize that usage with a sanlock_release() flag like SANLK_REL_NO_DISK telling sanlock to skip the on-disk portion of release, and avoid triggering an error for normal usage.
Dave
On 3/11/26 19:16, David Teigland wrote:
[...]
I have a tentative implementation of that, which is fairly straightforward except for one caveat.
For the semantics of the spawn action, this means a new option (for example -d 1) is needed, and it can only take effect if all spawned commands succeed.
So the release code in do_spawn() would become:
	log_tool("release");
	flags = SANLK_REL_ALL;
	if (exit_code == 0) {
		flags |= com.no_disk ? SANLK_REL_NO_DISK : 0;
	}
	rv = sanlock_release(fd, -1, flags, 0, NULL);
	log_tool("release done %d", rv);
	close(fd);
----
If that's acceptable, I can submit the two patches right away.
The alternative is to do without SANLK_REL_NO_DISK. In that case I could try to parse the "release done %d" tool output from the application, but this is at least theoretically unreliable, since that output will be mixed with the output and error channels of the spawned children.
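To make the fragility of that alternative concrete, here is a minimal C sketch of such a parser. It assumes the captured output contains the text "release done N" verbatim; find_release_status is a hypothetical helper, not part of sanlock.

```c
#include <stdio.h>
#include <string.h>

/*
 * Scan captured tool output for "release done N" and store N from the
 * last occurrence in *rv_out. Returns 0 if a value was found, -1 if not.
 */
static int find_release_status(const char *output, int *rv_out)
{
    static const char marker[] = "release done ";
    const char *p = output;
    int found = -1;
    int value;

    while ((p = strstr(p, marker)) != NULL) {
        p += sizeof(marker) - 1;
        if (sscanf(p, "%d", &value) == 1) {
            *rv_out = value;
            found = 0;
        }
    }
    return found;
}
```

Nothing distinguishes a line written by the tool from identical text written by a spawned child, so any child that prints the marker after the real line would silently change the parsed result.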
Any thoughts? Thanks,
Claudio
On Thu, Mar 12, 2026 at 04:12:33PM +0100, Claudio Fontana wrote:
[...]
If that's acceptable I can submit right away the two patches.
Yes, that looks like the right way to do it.
Dave
sanlock-devel@lists.fedorahosted.org