Hi all,
As the time when Tiers will be split to 1/2/3 draws near, and with it tighter time limits on Tier 1 tasks, I thought that using "sleep n" for process synchronisation is a little bit too hackish.
The first routine I made is able to wait for some process to start listening on a socket (either network or UNIX). With that I'm able to execute tasks need to start many network servers in background much quicker on fast systems (majority of x86_64 machines) while the tasks still execute properly on s390x.
I'll be working on making a more universal solution, including solution to bug 970143.
Hubert Kario (2): new routines for socket based synchronisation add synchronisation to system library
src/Makefile | 1 + src/beakerlib.sh | 1 + src/synchronisation.sh | 182 +++++++++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 184 insertions(+) create mode 100644 src/synchronisation.sh
Add new function, rlWaitForSocket, that can wait for a network service to start. To be used instead of `sleep' when testing network-centric services.
Signed-off-by: Hubert Kario hkario@redhat.com --- is the fatal error handling OK?
src/synchronisation.sh | 182 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 182 insertions(+) create mode 100644 src/synchronisation.sh
diff --git a/src/synchronisation.sh b/src/synchronisation.sh new file mode 100644 index 0000000..66078e3 --- /dev/null +++ b/src/synchronisation.sh @@ -0,0 +1,182 @@ +# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +# +# Name: synchronisation.sh - part of the BeakerLib project +# Description: Process synchronisation routines +# +# Author: Hubert Kario hkario@redhat.com +# +# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +# +# Copyright (c) 2013 Red Hat, Inc. All rights reserved. +# +# This copyrighted material is made available to anyone wishing +# to use, modify, copy, or redistribute it subject to the terms +# and conditions of the GNU General Public License version 2. +# +# This program is distributed in the hope that it will be +# useful, but WITHOUT ANY WARRANTY; without even the implied +# warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR +# PURPOSE. See the GNU General Public License for more details. +# +# You should have received a copy of the GNU General Public +# License along with this program; if not, write to the Free +# Software Foundation, Inc., 51 Franklin Street, Fifth Floor, +# Boston, MA 02110-1301, USA. +# +# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + +getopt -T || ret=$? +if [ ${ret:-0} -ne 4 ]; then + echo "ERROR: Non enhanced getopt version detected" 1>&2 + exit 1 +fi + +: <<'=cut' +=pod + +=head1 NAME + +BeakerLib - synchronisation - Process synchronisation routines + +=head1 DESCRIPTION + +This is a library of helpers for process synchronisation +of applications. + +=head1 FUNCTIONS + +=cut + +# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +# name of routine +# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +: <<'=cut' +=pod + +=head2 Process Synchronisation + +=head3 rlWaitForSocket + +Pauses script execution until socket starts listening. + + rlWaitForSocket {port|path} [-p PID] [-t time] + +=over + +=item port|path + +Network port to wait for opening or a path to UNIX socket. +Regular expressions are also supported. + +=item -t time + +Timeout in seconds (optional, default=120). If the socket +isn't opened between the time elapses the command FAILs. + +=item -p PID + +PID of the process to check before running command. If the process +exits before the socket is opened, the command FAILs. + +=back + +=cut + +rlWaitForSocket(){ + + local timeout=120 + local proc_pid=1 + local socket="" + local child_pid=0 + + # that is the GNU extended getopt syntax! + TEMP=$(getopt -o t:p: -n 'rlWaitForSocket' -- "$@") + if [[ $? != 0 ]] ; then echo "Terminating..." >&2 ; exit 1 ; fi + + eval set -- "$TEMP" + + while true ; do + case "$1" in + -t) timeout="$2"; shift 2 + ;; + -p) proc_pid="$2"; shift 2 + ;; + --) shift 1 + break + ;; + *) echo "Internal error!" >&2; exit 1 + ;; + esac + done + socket="$1" + # the case statement is a portable way to check if variable contains only + # digits (regexps are not available in old, RHEL3-era, bash) + case "$timeout" in + ''|*[!0-9]*) echo "rlWaitForSocket: Invalid timeout provided" 1>&2; exit 1;; + esac + case "$proc_pid" in + ''|*[!0-9]*) echo "rlWaitForSocket: Invalid PID provided" 1>&2; exit 1;; + esac + case "$socket" in + ''|*[!0-9]*) + #socket_type="network" + grep_opt=":$socket[[:space:]]" + ;; + "") echo "rlWaitForSocket: Empty socket specified" 1>&2 + exit 1 + ;; + *) + #socket_type="unix" + grep_opt="$socket" + ;; + esac + rlLog "Waiting max ${timeout}s for socket `$socket' to start listening" + + ( while true ; do + netstat -nl | grep -E "$grep_opt" >/dev/null + if [[ $? -eq 0 ]]; then + exit 0; + else + if [[ ! -e "/proc/$proc_pid" ]]; then + exit 1; + fi + sleep 1 + fi + done ) & + netstat_pid=$! + + ( sleep $timeout && kill -HUP $netstat_pid ) 2>/dev/null & + watcher=$! + + wait $netstat_pid + ret=$? + if [[ $ret -eq 0 ]]; then + kill -s SIGKILL $watcher 2>/dev/null + rlLog "Socket opened!" + else + if [[ $ret -eq 1 ]]; then + kill -s SIGKILL $watcher 2>/dev/null + rlLogWarning "PID terminated!" + else + rlLogWarning "Timeout elapsed" + fi + fi +} + +# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +# AUTHORS +# ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +: <<'=cut' +=pod + +=head1 AUTHORS + +=over + +=item * + +Hubert Kario hkario@redhat.com + +=back + +=cut
On Tue, 2013-11-19 at 15:49 +0100, Hubert Kario wrote:
Add new function, rlWaitForSocket, that can wait for a network service to start. To be used instead of `sleep' when testing network-centric services.
The idea is great, much appreciated. I'm looking forward to the rest. The decision to put this into a new file is good and the naming is well descriptive. Thanks.
I would like to have a testcase for it in the testsuite if possible, though.
Have you checked the compatibility of the shell code with bash code back to RHEL4? I currently accept RHEL5+ code into upstream, but I would like to have some knowledge about where this runs. The comments indicate you have, but just to be sure...
More specific comments inline.
is the fatal error handling OK?
No: by directly exiting, you are e.g. preventing any cleanups from running (and logs from uploading when run on beaker), unfortunately. My preference would be to just return a RC and let the user handle the failure e.g. with rlRun. Use journal logs for messages. It's done this way everywhere BL.
[...]
+getopt -T || ret=$? +if [ ${ret:-0} -ne 4 ]; then
- echo "ERROR: Non enhanced getopt version detected" 1>&2
- exit 1
+fi
+: <<'=cut' +=pod
[...] Documentation is good.
+rlWaitForSocket(){
- local timeout=120
- local proc_pid=1
- local socket=""
- local child_pid=0
- # that is the GNU extended getopt syntax!
- TEMP=$(getopt -o t:p: -n 'rlWaitForSocket' -- "$@")
local, please
- if [[ $? != 0 ]] ; then echo "Terminating..." >&2 ; exit 1 ; fi
1) prefix messages so the user knows what is emitting them 2) Be more specific with failure messages: what happened? why premature exit? 3) Do not exit the script for the user (see the general comment above)
- eval set -- "$TEMP"
- while true ; do
case "$1" in-t) timeout="$2"; shift 2;;-p) proc_pid="$2"; shift 2;;--) shift 1break;;*) echo "Internal error!" >&2; exit 1
See 1),2),3) above
;;esac- done
- socket="$1"
- # the case statement is a portable way to check if variable contains only
- # digits (regexps are not available in old, RHEL3-era, bash)
- case "$timeout" in
''|*[!0-9]*) echo "rlWaitForSocket: Invalid timeout provided" 1>&2; exit 1;;
See 2),3) above
- esac
- case "$proc_pid" in
''|*[!0-9]*) echo "rlWaitForSocket: Invalid PID provided" 1>&2; exit 1;;
See 2),3) above
- esac
- case "$socket" in
''|*[!0-9]*)#socket_type="network"grep_opt="\:$socket[[:space:]]"
local, please
;;"") echo "rlWaitForSocket: Empty socket specified" 1>&2
See 2),3) above
exit 1;;*)#socket_type="unix"grep_opt="$socket";;- esac
- rlLog "Waiting max ${timeout}s for socket `$socket' to start listening"
Prefix message
- ( while true ; do
netstat -nl | grep -E "$grep_opt" >/dev/nullif [[ $? -eq 0 ]]; thenexit 0;elseif [[ ! -e "/proc/$proc_pid" ]]; thenexit 1;fisleep 1fi- done ) &
- netstat_pid=$!
local
- ( sleep $timeout && kill -HUP $netstat_pid ) 2>/dev/null &
- watcher=$!
local
- wait $netstat_pid
- ret=$?
local
- if [[ $ret -eq 0 ]]; then
kill -s SIGKILL $watcher 2>/dev/nullrlLog "Socket opened!"
Prefix
- else
if [[ $ret -eq 1 ]]; thenkill -s SIGKILL $watcher 2>/dev/nullrlLogWarning "PID terminated!"
Prefix
elserlLogWarning "Timeout elapsed"
Prefix
fi- fi
+}
[...]
PM
----- Original Message -----
From: "Petr Muller" pmuller@redhat.com To: beakerlib-devel@lists.fedorahosted.org Sent: Tuesday, 19 November, 2013 5:52:55 PM Subject: Re: [PATCH 1/2] new routines for socket based synchronisation
On Tue, 2013-11-19 at 15:49 +0100, Hubert Kario wrote:
Add new function, rlWaitForSocket, that can wait for a network service to start. To be used instead of `sleep' when testing network-centric services.
The idea is great, much appreciated. I'm looking forward to the rest. The decision to put this into a new file is good and the naming is well descriptive. Thanks.
I would like to have a testcase for it in the testsuite if possible, though.
sure, first wanted to know if the implementation is ok
Have you checked the compatibility of the shell code with bash code back to RHEL4? I currently accept RHEL5+ code into upstream, but I would like to have some knowledge about where this runs. The comments indicate you have, but just to be sure...
I did write it with limitations of tools in RHEL 3 in mind, haven't actually tested the whole script on it though, just code snippets and tool options.
Will do that before final version.
More specific comments inline.
is the fatal error handling OK?
No: by directly exiting, you are e.g. preventing any cleanups from running (and logs from uploading when run on beaker), unfortunately. My preference would be to just return a RC and let the user handle the failure e.g. with rlRun. Use journal logs for messages. It's done this way everywhere BL.
I'm assuming that by "RC" you mean "return code". I'll change it to do rlLogFatal and "return big-n" on unrecoverable errors then.
[...]
+getopt -T || ret=$? +if [ ${ret:-0} -ne 4 ]; then
- echo "ERROR: Non enhanced getopt version detected" 1>&2
- exit 1
+fi
+: <<'=cut' +=pod
[...] Documentation is good.
+rlWaitForSocket(){
- local timeout=120
- local proc_pid=1
- local socket=""
- local child_pid=0
- # that is the GNU extended getopt syntax!
- TEMP=$(getopt -o t:p: -n 'rlWaitForSocket' -- "$@")
local, please
- if [[ $? != 0 ]] ; then echo "Terminating..." >&2 ; exit 1 ; fi
- prefix messages so the user knows what is emitting them
ack
- Be more specific with failure messages: what happened? why premature
exit?
The detailed error messages are printed by getopt itself. If getopt returns non zero it means that the script is trying to use unsupported option so the user script is faulty.
- Do not exit the script for the user (see the general comment above)
(see general answer above) ;)
[...]
will fix other mentioned problems
On 11/19/2013 06:46 PM, Hubert Kario wrote:
is the fatal error handling OK?
No: by directly exiting, you are e.g. preventing any cleanups from running (and logs from uploading when run on beaker), unfortunately. My preference would be to just return a RC and let the user handle the failure e.g. with rlRun. Use journal logs for messages. It's done this way everywhere BL.
I'm assuming that by "RC" you mean "return code". I'll change it to do rlLogFatal and "return big-n" on unrecoverable errors then.
IMHO rlLogError would be better - it's used by other BeakerLib code (rlLogFatal being never used). Also returning 0 for success and 1 for fail (or 2/3/4/.. for special purposes) is widely accepted.
JJ
On Wed, 2013-11-20 at 13:37 +0100, Jiri Jaburek wrote:
On 11/19/2013 06:46 PM, Hubert Kario wrote:
is the fatal error handling OK?
No: by directly exiting, you are e.g. preventing any cleanups from running (and logs from uploading when run on beaker), unfortunately. My preference would be to just return a RC and let the user handle the failure e.g. with rlRun. Use journal logs for messages. It's done this way everywhere BL.
I'm assuming that by "RC" you mean "return code". I'll change it to do rlLogFatal and "return big-n" on unrecoverable errors then.
IMHO rlLogError would be better - it's used by other BeakerLib code (rlLogFatal being never used). Also returning 0 for success and 1 for fail (or 2/3/4/.. for special purposes) is widely accepted.
+1
PM
Make beakerlib load synchronisation related routines by default and install them.
Signed-off-by: Hubert Kario hkario@redhat.com --- are those the only two files in which new module name needs to be added?
src/Makefile | 1 + src/beakerlib.sh | 1 + 2 files changed, 2 insertions(+)
diff --git a/src/Makefile b/src/Makefile index c5263f7..4379bb0 100644 --- a/src/Makefile +++ b/src/Makefile @@ -33,6 +33,7 @@ MODULES=journal.sh\ performance.sh\ analyze.sh\ libraries.sh\ + synchronisation.sh\ virtualX.sh
FILES=$(MODULES) beakerlib.sh diff --git a/src/beakerlib.sh b/src/beakerlib.sh index 84cfcf0..fe8acbd 100644 --- a/src/beakerlib.sh +++ b/src/beakerlib.sh @@ -287,6 +287,7 @@ export BEAKERLIB=${BEAKERLIB:-"/usr/share/beakerlib"} . $BEAKERLIB/analyze.sh . $BEAKERLIB/performance.sh . $BEAKERLIB/virtualX.sh +. $BEAKERLIB/synchronisation.sh if [ -d $BEAKERLIB/plugins/ ] ; then for source in $BEAKERLIB/plugins/*.sh ; do . $source
beakerlib-devel@lists.fedorahosted.org