Re: [PATCH 3/3] sanlock: Setup priority before dropping privileges
by Nir Soffer
On Mon, May 4, 2020 at 11:48 PM Benjamin Marzinski <bmarzins(a)redhat.com> wrote:
>
> On Sun, May 03, 2020 at 09:32:30PM +0300, Nir Soffer wrote:
> > On Sun, May 3, 2020 at 7:46 PM Nir Soffer <nsoffer(a)redhat.com> wrote:
> > >
> > > On Sun, May 3, 2020 at 2:35 AM Nir Soffer <nirsof(a)gmail.com> wrote:
> > > >
> > > > sched_setscheduler() requires root, but we called it after dropping
> > > > privileges, so it always failed:
> > > >
> > > > 2020-02-13 12:34:19 1480 [8866]: sanlock daemon started 3.8.0 host a08359de-225c-4c21-a7d6-3623bb3bd6fb.host4
> > > > 2020-02-13 12:34:19 1480 [8866]: set scheduler RR|RESET_ON_FORK priority 99 failed: Operation not permitted
> > > >
> > > > Move setup_priority up before we drop privileges.
> > > >
> > > > With this change sanlock now runs with the RR scheduler and the
> > > > expected priority:
> > > >
> > > > $ ps -o cmd,cls,rtprio -p 2275
> > > > CMD CLS RTPRIO
> > > > /usr/sbin/sanlock daemon RR 99
> > >
> > > There is one issue: this works on Fedora but not on RHEL. On RHEL the
> > > call always fails, and on Fedora setting the priority works even without
> > > this change, so this change does not solve the problem.
> > >
> > > Looks like the way to use real time scheduler in RHEL 8 is using systemd:
> > > https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_fo...
> >
> > It does not work. I tried:
> >
> > diff --git a/init.d/sanlock.service.native b/init.d/sanlock.service.native
> > index f14eccd..53d61e9 100644
> > --- a/init.d/sanlock.service.native
> > +++ b/init.d/sanlock.service.native
> > @@ -2,6 +2,8 @@
> > Description=Shared Storage Lease Manager
> > After=syslog.target
> > Wants=wdmd.service
> > +CPUSchedulingPolicy=rr
> > +CPUSchedulingPriority=99
> >
> > [Service]
> > Type=forking
> > diff --git a/init.d/wdmd.service.native b/init.d/wdmd.service.native
> > index ab0828e..ed6a53d 100644
> > --- a/init.d/wdmd.service.native
> > +++ b/init.d/wdmd.service.native
> > @@ -1,6 +1,8 @@
> > [Unit]
> > Description=Watchdog Multiplexing Daemon
> > After=syslog.target
> > +CPUSchedulingPolicy=rr
> > +CPUSchedulingPriority=99
> >
> > [Service]
> > Type=forking
> >
> > But systemd fails to change the CPU scheduling:
> >
> > # systemctl start wdmd
> > Job for wdmd.service failed because the control process exited with error code.
> > See "systemctl status wdmd.service" and "journalctl -xe" for details.
> > [root@host4 ~]# systemctl status wdmd
> > ● wdmd.service - Watchdog Multiplexing Daemon
> > Loaded: loaded (/usr/lib/systemd/system/wdmd.service; disabled; vendor preset: disabled)
> > Active: failed (Result: exit-code) since Sun 2020-05-03 21:19:13 IDT; 7s ago
> > Process: 2379 ExecStartPre=/lib/systemd/systemd-wdmd watchdog-check (code=exited, status=214/SETSCHEDULER)
> > Main PID: 2338 (code=exited, status=0/SUCCESS)
> >
> > May 03 21:19:13 host4 systemd[1]: Starting Watchdog Multiplexing Daemon...
> > May 03 21:19:13 host4 systemd[1]: wdmd.service: Control process exited, code=exited status=214
> > May 03 21:19:13 host4 systemd[1]: wdmd.service: Failed with result 'exit-code'.
> > May 03 21:19:13 host4 systemd[1]: Failed to start Watchdog Multiplexing Daemon.
> >
> > Searching for this issue, it seems that this feature is simply not
> > supported by systemd.
> >
> > I think we need to file a RHEL systemd bug and get help from the systemd
> > folks about this.
> >
> > Looking at a Fedora 30 system, we see:
> >
> > # ps -eo cmd,cls,rtprio | grep RR | grep -v grep
> > /sbin/multipathd -d -s RR 99
> > /usr/sbin/sanlock daemon RR 99
> > /usr/sbin/wdmd RR 99
> >
> > # ps -eo cmd,cls,rtprio | grep FF | grep -v grep
> > [migration/0] FF 99
> > [migration/1] FF 99
> > [watchdogd] FF 99
> > [irq/24-aerdrv] FF 50
> > [irq/24-pciehp] FF 50
> > [irq/25-aerdrv] FF 50
> > [irq/25-pciehp] FF 50
> > [irq/26-aerdrv] FF 50
> > [irq/26-pciehp] FF 50
> > [irq/27-aerdrv] FF 50
> > [irq/27-pciehp] FF 50
> > [irq/28-aerdrv] FF 50
> > [irq/28-pciehp] FF 50
> > [irq/29-aerdrv] FF 50
> > [irq/29-pciehp] FF 50
> > [irq/30-aerdrv] FF 50
> > [irq/30-pciehp] FF 50
> >
> >
> > Looking at other processes on RHEL 8.2, we see:
> >
> > # ps -eo cmd,cls,rtprio | grep RR | grep -v grep
> > (nothing)
> >
> > # ps -eo cmd,cls,rtprio | grep FF | grep -v grep
> > [migration/0] FF 99
> > [watchdog/0] FF 99
> > [watchdog/1] FF 99
> > [migration/1] FF 99
> > [watchdogd] FF 99
> > [irq/24-aerdrv] FF 50
> > [irq/24-pciehp] FF 50
> > [irq/25-aerdrv] FF 50
> > [irq/25-pciehp] FF 50
> > [irq/26-aerdrv] FF 50
> > [irq/26-pciehp] FF 50
> > [irq/27-aerdrv] FF 50
> > [irq/27-pciehp] FF 50
> > [irq/28-aerdrv] FF 50
> > [irq/28-pciehp] FF 50
> > [irq/29-aerdrv] FF 50
> > [irq/29-pciehp] FF 50
> > [irq/30-aerdrv] FF 50
> > [irq/30-pciehp] FF 50
> >
> > multipathd runs as:
> >
> > # ps -eo cmd,cls,rtprio | grep multipathd | grep -v grep
> > /sbin/multipathd -d -s TS -
> >
> > So it looks like Ben has the same issue.
> >
> > So FF (SCHED_FIFO) works.
> >
> > Should we use FF (SCHED_FIFO) with max priority instead?
> >
> > We can try SCHED_RR and fall back to SCHED_FIFO if it fails.
> >
> > David, Ben, what do you think?
>
> I'm o.k. with doing that as a workaround. But since SCHED_FIFO loses the
> time limiting of SCHED_RR, perhaps I'll bump the SCHED_FIFO PRIO down a
> little, so that it doesn't interfere with things like watchdog, if it's
> running long.
Unfortunately, this will also not work for a systemd service. See my reply to
David for more details.
>
> -Ben
>
> > > I'll send another patch.
> > >
> > > > Not running with a real-time scheduler may be the reason we see random
> > > > failures to write the lockspace in oVirt system tests:
> > > > https://bugzilla.redhat.com/1247135
> > > >
> > > > Signed-off-by: Nir Soffer <nsoffer(a)redhat.com>
> > > > ---
> > > > src/main.c | 4 ++--
> > > > 1 file changed, 2 insertions(+), 2 deletions(-)
> > > >
> > > > diff --git a/src/main.c b/src/main.c
> > > > index 8c6eef8..ebc0b11 100644
> > > > --- a/src/main.c
> > > > +++ b/src/main.c
> > > > @@ -1750,14 +1750,14 @@ static int do_daemon(void)
> > > >
> > > > setup_host_name();
> > > >
> > > > + setup_priority();
> > > > +
> > > > setup_uid_gid();
> > > >
> > > > uname(&nodename);
> > > >
> > > > log_warn("sanlock daemon started %s host %s (%s)", VERSION, our_host_name_global, nodename.nodename);
> > > >
> > > > - setup_priority();
> > > > -
> > > > rv = thread_pool_create(DEFAULT_MIN_WORKER_THREADS, com.max_worker_threads);
> > > > if (rv < 0)
> > > > goto out;
> > > > --
> > > > 2.25.4
> > > >
>
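The RR-then-FIFO fallback proposed earlier in this message could look roughly like the sketch below. It is not sanlock's setup_priority(): the names set_rt_priority, set_sched_fn, fifo_backoff and fake_rr_denied are hypothetical and exist only for illustration. The setter is passed in as a function pointer so the fallback path can be exercised without root; the fake mimics a host that rejects SCHED_RR. As noted above, this fallback may still fail for a systemd service on RHEL, so treat it as an illustration of the idea rather than a fix.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

#ifndef SCHED_RESET_ON_FORK
#define SCHED_RESET_ON_FORK 0x40000000 /* Linux value; hidden without _GNU_SOURCE */
#endif

/* Same signature as sched_setscheduler(2), so a fake can be injected. */
typedef int (*set_sched_fn)(pid_t pid, int policy, const struct sched_param *param);

/*
 * Hypothetical helper: try SCHED_RR at max priority, then fall back to
 * SCHED_FIFO at (max - fifo_backoff). Returns the policy applied, or -1.
 * Must run before privileges are dropped.
 */
static int set_rt_priority(set_sched_fn set_sched, int fifo_backoff)
{
    struct sched_param param;
    int prio, min;

    assert(set_sched != NULL);
    memset(&param, 0, sizeof(param));

    prio = sched_get_priority_max(SCHED_RR);
    if (prio < 0)
        return -1;
    param.sched_priority = prio;
    if (set_sched(0, SCHED_RR | SCHED_RESET_ON_FORK, &param) == 0)
        return SCHED_RR;
    fprintf(stderr, "SCHED_RR priority %d failed: %s\n", prio, strerror(errno));

    prio = sched_get_priority_max(SCHED_FIFO);
    min = sched_get_priority_min(SCHED_FIFO);
    if (prio < 0 || min < 0)
        return -1;
    prio -= fifo_backoff;       /* leave room for e.g. watchdog threads */
    if (prio < min)
        prio = min;
    param.sched_priority = prio;
    if (set_sched(0, SCHED_FIFO | SCHED_RESET_ON_FORK, &param) == 0)
        return SCHED_FIFO;
    fprintf(stderr, "SCHED_FIFO priority %d failed: %s\n", prio, strerror(errno));
    return -1;
}

/* Test double: rejects SCHED_RR with EPERM, accepts any other policy. */
static int last_priority = -1;

static int fake_rr_denied(pid_t pid, int policy, const struct sched_param *param)
{
    (void)pid;
    if ((policy & ~SCHED_RESET_ON_FORK) == SCHED_RR) {
        errno = EPERM;
        return -1;
    }
    last_priority = param->sched_priority;
    return 0;
}
```

In a daemon this would be called as set_rt_priority(sched_setscheduler, 1) before the uid/gid switch. The back-off of 1 is an arbitrary placeholder for the "bump the priority down a little" suggestion, not a tested value.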
3 years, 11 months
[PATCH 0/3] sanlock: Fix setup_priority
by Nir Soffer
It looks like using the real-time scheduler never worked, since we set up the
scheduler after dropping privileges.
Also fix the Makefile so we can build and test sanlock on Fedora 31.
Nir Soffer (3):
Makefile: Use PY_VERSION=3
Makefile: %systemd_postun requires arguments now
sanlock: Setup priority before dropping privileges
sanlock.spec.in | 6 +++---
src/main.c | 4 ++--
2 files changed, 5 insertions(+), 5 deletions(-)
--
2.25.4
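When testing this series, the scheduling class can also be checked from inside a process instead of with ps. The sketch below is an assumption-laden illustration and not part of sanlock: cls_name and describe_sched are hypothetical helpers that print the same CLS and RTPRIO columns as "ps -o cls,rtprio" for the three policies that appear in this thread.

```c
#include <assert.h>
#include <sched.h>
#include <stdio.h>
#include <sys/types.h>

#ifndef SCHED_RESET_ON_FORK
#define SCHED_RESET_ON_FORK 0x40000000 /* Linux value; hidden without _GNU_SOURCE */
#endif

/* Map a scheduling policy to the class code shown by "ps -o cls". */
static const char *cls_name(int policy)
{
    /* sched_getscheduler() may report the reset-on-fork flag; mask it off. */
    switch (policy & ~SCHED_RESET_ON_FORK) {
    case SCHED_OTHER: return "TS";
    case SCHED_FIFO:  return "FF";
    case SCHED_RR:    return "RR";
    default:          return "?";
    }
}

/*
 * Write "CLS RTPRIO" for a pid (0 means the calling process) into buf.
 * Non-real-time policies get "-" as the priority, as ps prints it.
 * Returns 0 on success, -1 if the kernel query failed.
 */
static int describe_sched(pid_t pid, char *buf, size_t len)
{
    struct sched_param param;
    int policy;

    assert(buf != NULL && len > 0);

    policy = sched_getscheduler(pid);
    if (policy < 0 || sched_getparam(pid, &param) < 0)
        return -1;

    if (param.sched_priority > 0)
        snprintf(buf, len, "%s %d", cls_name(policy), param.sched_priority);
    else
        snprintf(buf, len, "%s -", cls_name(policy));
    return 0;
}
```

Calling describe_sched(0, buf, sizeof(buf)) right after the priority setup and logging the result would show whether the daemon actually ended up in the real-time class.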
3 years, 11 months
[PATCH] tox: Remove python 2 tests, add python 3.7, 3.8 tests
by Nir Soffer
Python 2 has been dead for a while, and there is no point in running its
tests now. Python 3.7 and 3.8 are available and we want to test them on
Travis.
Signed-off-by: Nir Soffer <nsoffer(a)redhat.com>
---
.travis.yml | 4 ++--
tox.ini | 5 ++---
2 files changed, 4 insertions(+), 5 deletions(-)
diff --git a/.travis.yml b/.travis.yml
index 89fb52a..bea805d 100644
--- a/.travis.yml
+++ b/.travis.yml
@@ -3,10 +3,10 @@ dist: xenial
language: python
python:
- - "2.7"
- "3.6"
- "3.7"
- - "3.8-dev"
+ - "3.8"
+ - "3.9-dev"
addons:
apt:
diff --git a/tox.ini b/tox.ini
index 4a28561..af8a6cc 100644
--- a/tox.ini
+++ b/tox.ini
@@ -4,7 +4,7 @@
# and then run "tox" from this directory.
[tox]
-envlist = py27,py36,flake8
+envlist = py{36,37,38},flake8
skipsdist = True
skip_missing_interpreters = True
@@ -18,8 +18,7 @@ whitelist_externals = make
deps =
pytest==4.0
commands =
- py27: make PY_VERSION=2.7 BUILDARGS="--build-lib={envsitepackagesdir}"
- py36: make PY_VERSION=3.6 BUILDARGS="--build-lib={envsitepackagesdir}"
+ py{36,37,38}: make BUILDARGS="--build-lib={envsitepackagesdir}"
pytest {posargs}
[testenv:flake8]
--
2.25.4
3 years, 11 months
[sanlock] branch master updated: release 3.8.1
by pagure@pagure.io
This is an automated email from the git hooks/post-receive script.
teigland pushed a commit to branch master
in repository sanlock.
The following commit(s) were added to refs/heads/master by this push:
new 07ab65a release 3.8.1
07ab65a is described below
commit 07ab65afb10c8f8c008880a73b7b7aaedbde0e15
Author: David Teigland <teigland(a)redhat.com>
AuthorDate: Fri May 1 10:15:14 2020 -0500
release 3.8.1
---
VERSION | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/VERSION b/VERSION
index 1981190..f280719 100644
--- a/VERSION
+++ b/VERSION
@@ -1 +1 @@
-3.8.0
+3.8.1
3 years, 11 months