cluster: RHEL6 - cman: prevent libcman from causing SIGPIPE when corosync is down
by Christine Caulfield
Gitweb: http://git.fedorahosted.org/git/?p=cluster.git;a=commitdiff;h=7f67d0a871e...
Commit: 7f67d0a871e5bbafcd98ceb381deb2539f4b0e93
Parent: 2616b150da0fb5e6bb24f9d3483b52ea3a89ef82
Author: Christine Caulfield <ccaulfie(a)redhat.com>
AuthorDate: Thu Dec 20 08:49:56 2012 +0000
Committer: Christine Caulfield <ccaulfie(a)redhat.com>
CommitterDate: Thu Dec 20 08:49:56 2012 +0000
cman: prevent libcman from causing SIGPIPE when corosync is down
If corosync goes down or is shut down, cman will return 0 from cman_dispatch and
close the socket. However, if a cman write operation is issued before this
happens, then SIGPIPE can result from the writev() call to an open, but
disconnected, FD.
This patch changes writev() to sendmsg() so it can pass MSG_NOSIGNAL to the
system call and prevent SIGPIPEs from occurring.
Resolves rhbz#887787
Acked-By: Fabio M. Di Nitto <fdinitto(a)redhat.com>
Signed-off-by: Christine Caulfield <ccaulfie(a)redhat.com>
---
cman/lib/libcman.c | 11 ++++++++++-
1 files changed, 10 insertions(+), 1 deletions(-)
diff --git a/cman/lib/libcman.c b/cman/lib/libcman.c
index a89c731..a99f5a0 100644
--- a/cman/lib/libcman.c
+++ b/cman/lib/libcman.c
@@ -204,10 +204,19 @@ static int loopy_writev(int fd, struct iovec *iovptr, size_t iovlen)
{
size_t byte_cnt=0;
int len;
+ struct msghdr msg;
+
+ msg.msg_name = NULL;
+ msg.msg_namelen = 0;
+ msg.msg_control = NULL;
+ msg.msg_controllen = 0;
while (iovlen > 0)
{
- len = writev(fd, iovptr, iovlen);
+ msg.msg_iov = iovptr;
+ msg.msg_iovlen = iovlen;
+
+ len = sendmsg(fd, &msg, MSG_NOSIGNAL);
if (len <= 0)
return len;
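The behavioural difference is easy to demonstrate from user space. Below is a minimal sketch (written in Python for brevity, assuming a Linux host where `socket.MSG_NOSIGNAL` is available; the helper name is illustrative and not part of libcman) of the same idea: with MSG_NOSIGNAL, writing to a disconnected socket yields an ordinary EPIPE error the caller can handle, instead of a fatal SIGPIPE.

```python
import socket

def send_no_sigpipe(sock, data):
    """Send on a possibly-disconnected socket without risking SIGPIPE.

    MSG_NOSIGNAL makes the kernel return EPIPE (raised here as
    BrokenPipeError) instead of delivering SIGPIPE, which is the same
    effect the patch gets by switching writev() to sendmsg().
    """
    try:
        return sock.send(data, socket.MSG_NOSIGNAL)
    except BrokenPipeError:
        return -1  # caller sees an error return; the process stays alive

# A connected pair of UNIX stream sockets stands in for the cman/corosync
# connection; closing one end models corosync going away.
a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
b.close()
result = send_no_sigpipe(a, b"hello")
a.close()
```

In the patch the same flag is passed to sendmsg(), which is why the struct msghdr scaffolding is needed: writev() takes no flags argument.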
gfs2-utils: master - gfs2_trace: Added a script called gfs2_trace for kernel tracing debugging.
by shane bradley
Gitweb: http://git.fedorahosted.org/git/?p=gfs2-utils.git;a=commitdiff;h=41991d08...
Commit: 41991d08c42ff779d26143598d0835e88ab24f65
Parent: b3d765d9e7ec7c53bc7b88c7eef35f5a249ebcae
Author: Shane Bradley <sbradley(a)redhat.com>
AuthorDate: Wed Dec 19 09:38:48 2012 -0500
Committer: Shane Bradley <sbradley(a)redhat.com>
CommitterDate: Wed Dec 19 09:38:48 2012 -0500
gfs2_trace: Added a script called gfs2_trace for kernel tracing debugging.
The script gfs2_trace is a tool for debugging with kernel trace events. The
script can enable or disable all GFS2 kernel trace events or a selected subset.
The script can capture the trace events and write them to a specified file,
which will be tarred up after the file is written.
Signed-off-by: Shane Bradley <sbradley(a)redhat.com>
---
gfs2/man/Makefile.am | 3 +-
gfs2/man/gfs2_trace.8 | 45 +++
gfs2/scripts/Makefile.am | 2 +-
gfs2/scripts/gfs2_trace | 790 ++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 838 insertions(+), 2 deletions(-)
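Under the hood the script drives the kernel's debugfs tracing interface: each directory under /sys/kernel/debug/tracing/events/gfs2 has an `enable` file that accepts "1" or "0", and captured output is read from trace_pipe. A minimal sketch of that enable/disable mechanism follows (the helper names are illustrative and not part of the script; the events directory is parameterized so the sketch is not tied to the real debugfs path):

```python
import os

def set_trace_event(events_dir, event_name, enabled):
    """Write "1" or "0" to a trace event's 'enable' file.

    events_dir is normally /sys/kernel/debug/tracing/events/gfs2; it is a
    parameter here so the sketch can run against any tree with this layout.
    """
    path = os.path.join(events_dir, event_name, "enable")
    with open(path, "w") as f:
        f.write("1" if enabled else "0")

def get_trace_event_state(events_dir, event_name):
    """Read back the current enable state ("1" or "0") of a trace event."""
    path = os.path.join(events_dir, event_name, "enable")
    with open(path) as f:
        return f.read().strip()
```

gfs2_trace wraps exactly this read/write pattern with option parsing, logging, and a PID file to prevent concurrent runs.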
diff --git a/gfs2/man/Makefile.am b/gfs2/man/Makefile.am
index 8655a76..82eab9a 100644
--- a/gfs2/man/Makefile.am
+++ b/gfs2/man/Makefile.am
@@ -8,4 +8,5 @@ dist_man_MANS = fsck.gfs2.8 \
gfs2_jadd.8 \
mkfs.gfs2.8 \
tunegfs2.8 \
- gfs2_lockcapture.8
+ gfs2_lockcapture.8 \
+ gfs2_trace.8
diff --git a/gfs2/man/gfs2_trace.8 b/gfs2/man/gfs2_trace.8
new file mode 100644
index 0000000..dd98072
--- /dev/null
+++ b/gfs2/man/gfs2_trace.8
@@ -0,0 +1,45 @@
+.TH gfs2_trace 8
+
+.SH NAME
+gfs2_trace \- enable trace events, disable trace events, and capture data from GFS2 trace events.
+
+.SH SYNOPSIS
.B gfs2_trace \fR[-dqEN] [-e \fItrace event name]\fR [-n \fItrace event name]\fR [-c \fIoutput filename]\fR
+.PP
+
+.SH DESCRIPTION
+\fIgfs2_trace\fR can enable and disable all trace events or selected trace events. \fIgfs2_trace\fR can
+capture the output of the trace events and write the output to a file. When capturing trace events, the script will exit
+when control-c is pressed. The trace events will then be written to the selected file.
+.PP
+
+.SH OPTIONS
+.TP
+\fB-h, --help\fP
+Prints out a short usage message and exits.
+.TP
+\fB-d, --debug\fP
+enables debug logging.
+.TP
+\fB-q, --quiet\fP
+disables logging to console.
+.TP
+\fB-l, --list\fP
+lists the enabled state and filters for the GFS2 trace events
+.TP
+\fB-E, --enable_all_trace_events\fP
+enables all trace_events for GFS2
+.TP
+\fB-e \fI<trace event name>, \fB--enable_trace_event\fR=\fI<trace event name>\fP
+selected trace_events that will be enabled for GFS2
+.TP
+\fB-N, --disable_all_trace_events\fP
+disables all trace_events for GFS2
+.TP
+\fB-n \fI<trace event name>, \fB--disable_trace_event\fR=\fI<trace event name>\fP
+selected trace_events that will be disabled for GFS2
+.TP
+\fB-c \fI<output filename>, \fB--capture\fR=\fI<output filename>\fP
+enables capturing of trace events and will save the data to a file
+.
+.SH SEE ALSO
diff --git a/gfs2/scripts/Makefile.am b/gfs2/scripts/Makefile.am
index b88580e..2c51222 100644
--- a/gfs2/scripts/Makefile.am
+++ b/gfs2/scripts/Makefile.am
@@ -9,4 +9,4 @@ sbindir := $(shell rpl=0; test '$(exec_prefix):$(sbindir)' = /usr:/usr/sbin \
test $$rpl = 1 && echo /sbin || echo '$(exec_prefix)/sbin')
-dist_sbin_SCRIPTS = gfs2_lockcapture
+dist_sbin_SCRIPTS = gfs2_lockcapture gfs2_trace
diff --git a/gfs2/scripts/gfs2_trace b/gfs2/scripts/gfs2_trace
new file mode 100644
index 0000000..38b3d18
--- /dev/null
+++ b/gfs2/scripts/gfs2_trace
@@ -0,0 +1,790 @@
+#!/usr/bin/env python
+"""
+This script will enable or disable trace events for GFS2. The script can capture
+trace events and write the trace events captured to a file.
+
+When capturing events, hit "control-c" to exit; the captured events will then be
+written to a file. The file is created by reading this pipe:
+/sys/kernel/debug/tracing/trace_pipe
+
+The debug directory is required to be mounted; the script will mount it if it
+is not already mounted. The trace events are located in this directory:
+/sys/kernel/debug/tracing/events/gfs2.
+
+The fields that are valid for "filters" are described in each trace event's
+"format" file. For example:
+/sys/kernel/debug/tracing/events/gfs2/*/format
+
+@author : Shane Bradley
+@contact : sbradley(a)redhat.com
+@version : 0.9
+@copyright : GPLv2
+"""
+import sys
+import os
+import os.path
+import logging
+import platform
+import fileinput
+import tarfile
+import subprocess
+from optparse import OptionParser, Option
+
+# #####################################################################
+# Global Vars:
+# #####################################################################
+"""
+@cvar VERSION_NUMBER: The version number of this script.
+@type VERSION_NUMBER: String
+@cvar MAIN_LOGGER_NAME: The name of the logger.
+@type MAIN_LOGGER_NAME: String
+@cvar PATH_TO_DEBUG_DIR: The path to the debug directory for the linux kernel.
+@type PATH_TO_DEBUG_DIR: String
+@cvar PATH_TO_PID_FILENAME: The path to the pid file that will be used to make
+sure only 1 instance of this script is running at any time.
+@type PATH_TO_PID_FILENAME: String
+@cvar PATH_TO_GFS2_TRACE_EVENTS_DIR: The path to the directory that contains
+all the GFS2 trace events.
+@type PATH_TO_GFS2_TRACE_EVENTS_DIR: String
+@cvar PATH_TO_TRACE_PIPE: The path to the tracing pipe.
+@type PATH_TO_TRACE_PIPE: String
+"""
+VERSION_NUMBER = "0.9-1"
+MAIN_LOGGER_NAME = "gfs2trace"
+PATH_TO_DEBUG_DIR="/sys/kernel/debug"
+PATH_TO_PID_FILENAME = "/var/run/%s.pid" %(os.path.basename(sys.argv[0]))
+PATH_TO_GFS2_TRACE_EVENTS_DIR="%s/tracing/events/gfs2" %(PATH_TO_DEBUG_DIR)
+PATH_TO_TRACE_PIPE="%s/tracing/trace_pipe" %(PATH_TO_DEBUG_DIR)
+
+class FileUtils:
+ """
+ A class that provides static functions for files such as reading and
+ writing.
+ """
+ def getDataFromFile(pathToSrcFile) :
+ """
+ This function will return the data in an array, where each line in the
+ file is a separate item in the array. This should really only be used on
+ relatively small files.
+
+ None is returned if no file is found.
+
+ @return: Returns an array of Strings, where each newline in file is an item
+ in the array.
+ @rtype: Array
+
+ @param pathToSrcFile: The path to the file which will be read.
+ @type pathToSrcFile: String
+ """
+ if (len(pathToSrcFile) > 0) :
+ try:
+ fin = open(pathToSrcFile, "r")
+ data = fin.readlines()
+ fin.close()
+ return data
+ except (IOError, os.error):
+ message = "An error occurred reading the file: %s." %(pathToSrcFile)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return None
+ getDataFromFile = staticmethod(getDataFromFile)
+
+ def writeToFile(pathToFilename, data, appendToFile=True, createFile=False):
+ """
+ This function will write a string to a file.
+
+ @return: Returns True if the string was successfully written to the file,
+ otherwise False is returned.
+ @rtype: Boolean
+
+ @param pathToFilename: The path to the file that will have a string written
+ to it.
+ @type pathToFilename: String
+ @param data: The string that will be written to the file.
+ @type data: String
+ @param appendToFile: If True then the data will be appended to the file, if
+ False then the data will overwrite the contents of the file.
+ @type appendToFile: Boolean
+ @param createFile: If True then the file will be created if it does not
+ exist; if False then the file will not be created if it does not exist,
+ resulting in no data being written to the file.
+ @type createFile: Boolean
+ """
+ [parentDir, filename] = os.path.split(pathToFilename)
+ if (os.path.isfile(pathToFilename) or (os.path.isdir(parentDir) and createFile)):
+ try:
+ filemode = "w"
+ if (appendToFile):
+ filemode = "a"
+ fout = open(pathToFilename, filemode)
+ fout.write(data + "\n")
+ fout.close()
+ return True
+ except UnicodeEncodeError, e:
+ message = "There was a unicode encode error writing to the file: %s." %(pathToFilename)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ except IOError:
+ message = "There was an error writing to the file: %s." %(pathToFilename)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ return False
+ writeToFile = staticmethod(writeToFile)
+
+class TraceEvent:
+ """
+ A class that represents a trace event.
+ """
+ def __init__(self, pathToTraceEvent):
+ """
+ @param pathToTraceEvent: The path to the trace event directory.
+ @type pathToTraceEvent: String
+ """
+ self.__pathToTraceEvent = pathToTraceEvent
+
+ def __str__(self):
+ """
+ Returns a string representation of a TraceEvent.
+
+ @return: Returns a string representation of a TraceEvent.
+ @rtype: String
+ """
+ return "%s: %s" %(self.getName(), self.getEnable())
+
+ def __getEventPathItem(self, pathToFilename):
+ """
+ Returns the data contained in the file. If the file does not exist then
+ an empty array is returned.
+
+ @return: Returns the data contained in the file. If the file does not
+ exist then an empty array is returned.
+ @rtype: Array
+
+ @param pathToFilename: The path to the filename of the file in the trace
+ event's directory.
+ @type pathToFilename: String
+ """
+ output = FileUtils.getDataFromFile(pathToFilename)
+ if (output == None):
+ return []
+ return output
+
+ def __setEventPathItem(self, pathToFilename, contentsOfEventItem, appendToFile=True):
+ """
+ This function will write data to a file in the trace event's directory.
+
+ @return: Returns True if the data was successfully written.
+ @rtype: Boolean
+ """
+ return FileUtils.writeToFile(pathToFilename, contentsOfEventItem, appendToFile)
+
+ def getPathToTraceEvent(self):
+ """
+ Returns the path to the trace event directory.
+
+ @return: Returns the path to the trace event directory.
+ @rtype: String
+ """
+ return self.__pathToTraceEvent
+
+ def getName(self):
+ """
+ Returns the shortname of the trace event.
+
+ @return: Returns the shortname of the trace event.
+ @rtype: String
+ """
+ return os.path.basename(self.getPathToTraceEvent())
+
+ def getEnable(self):
+ """
+ Returns the contents of the file "enable" in the trace event directory.
+
+ @return: Returns the contents of the file "enable" in the trace event
+ directory.
+ @rtype: String
+ """
+ fileContents = self.__getEventPathItem(os.path.join(self.getPathToTraceEvent(), "enable"))
+ if (len(fileContents) > 0):
+ return fileContents[0].rstrip()
+ else:
+ return ""
+
+ def getFilter(self):
+ """
+ Returns the contents of the file "filter" in the trace event
+ directory. Each line in the file is appended to an array that will be
+ returned.
+
+ @return: Returns the contents of the file "filter" in the trace event
+ directory.
+ @rtype: Array
+ """
+ return self.__getEventPathItem(os.path.join(self.getPathToTraceEvent(), "filter"))
+
+ def getFormat(self):
+ """
+ Returns the contents of the file "format" in the trace event
+ directory. Each line in the file is appended to an array that will be
+ returned.
+
+ @return: Returns the contents of the file "format" in the trace event
+ directory.
+ @rtype: Array
+ """
+ return self.__getEventPathItem(os.path.join(self.getPathToTraceEvent(), "format"))
+
+ def getID(self):
+ """
+ Returns the contents of the file "id" in the trace event directory.
+
+ @return: Returns the contents of the file "id" in the trace event
+ directory.
+ @rtype: String
+ """
+ fileContents = self.__getEventPathItem(os.path.join(self.getPathToTraceEvent(), "id"))
+ if (len(fileContents) > 0):
+ return fileContents[0].rstrip()
+ else:
+ return ""
+
+ def setEventEnable(self, eventEnableString):
+ """
+ Set the trace event to either enabled(1) or disabled(0) by writing to
+ the trace event's "enable" file.
+
+ @param eventEnableString: The value of the string should be 1 for
+ enabled or 0 for disabled.
+ @param eventEnableString: String
+ """
+ if ((eventEnableString == "0") or (eventEnableString == "1")):
+ self.__setEventPathItem(os.path.join(self.getPathToTraceEvent(), "enable"), eventEnableString, appendToFile=False)
+ else:
+ message = "The trace event \"enable\" file only accepts the values of 0 or 1. The value %s will not be written." %(eventEnableString)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+
+class TraceEvents:
+ """
+ A class that is a container for multiple trace events that are located in a
+ directory.
+ """
+ def __init__(self, pathToTraceEvents):
+ """
+ @param pathToTraceEvents: The path to the directory that contains trace
+ events.
+ @type pathToTraceEvents: String
+ """
+ self.__pathToTraceEvents = pathToTraceEvents
+ self.__traceEventsMap = self.__generateTraceEvents()
+
+ def __generateTraceEvents(self):
+ """
+ Generates a map of all the trace events.
+
+ @return: Returns a map of all the TraceEvent found.
+ @rtype: Dict
+ """
+ traceEventsMap = {}
+ if (not os.path.exists(self.__pathToTraceEvents)):
+ message = "The path does not exist: %s" %(self.__pathToTraceEvents)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return traceEventsMap
+ elif (os.path.isdir(self.__pathToTraceEvents)):
+ dirlist = []
+ try:
+ dirlist = os.listdir(self.__pathToTraceEvents)
+ except OSError:
+ message = "There was an error listing the contents of the directory: %s" %(self.__pathToTraceEvents)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ for item in dirlist:
+ pathToItem = os.path.join(self.__pathToTraceEvents, item)
+ if (os.path.isdir(pathToItem)):
+ traceEvent = TraceEvent(pathToItem)
+ traceEventsMap[traceEvent.getName()] = traceEvent
+ return traceEventsMap
+
+ def getPathToTraceEvents(self):
+ """
+ Returns the path to the directory that contains all the trace events.
+
+ @return: Return the path to the directory that contains all the trace
+ events.
+ @rtype: String
+ """
+ return self.__pathToTraceEvents
+
+ def getTraceEventNames(self):
+ """
+ Returns a list of all the trace event names found.
+
+ @return: Returns a list of all the trace event names found.
+ @rtype: Array
+ """
+ return self.__traceEventsMap.keys()
+
+ def getTraceEvent(self, traceEventName):
+ """
+ Returns a TraceEvent that matches the traceEventName. If no match is
+ found then None is returned.
+
+ @return: Returns a TraceEvent that matches the traceEventName. If no
+ match is found then None is returned.
+ @rtype: TraceEvent
+ """
+ if (self.__traceEventsMap.has_key(traceEventName)):
+ return self.__traceEventsMap.get(traceEventName)
+ return None
+
+# #####################################################################
+# Helper functions.
+# #####################################################################
+def runCommand(command, listOfCommandOptions, standardOut=subprocess.PIPE, standardError=subprocess.PIPE):
+ """
+ This function will execute a command. It will return True if the return code
+ was zero, otherwise False is returned.
+
+ @return: Returns True if the return code was zero, otherwise False is
+ returned.
+ @rtype: Boolean
+
+ @param command: The command that will be executed.
+ @type command: String
+ @param listOfCommandOptions: The list of options for the command that will
+ be executed.
+ @type listOfCommandOptions: Array
+ @param standardOut: The pipe that will be used to write standard output. By
+ default the pipe that is used is subprocess.PIPE.
+ @type standardOut: Pipe
+ @param standardError: The pipe that will be used to write standard error. By
+ default the pipe that is used is subprocess.PIPE.
+ @type standardError: Pipe
+ """
+ stdout = ""
+ stderr = ""
+ try:
+ commandList = [command]
+ commandList += listOfCommandOptions
+ task = subprocess.Popen(commandList, stdout=standardOut, stderr=standardError)
+ task.wait()
+ (stdout, stderr) = task.communicate()
+ return (task.returncode == 0)
+ except OSError:
+ commandOptionString = ""
+ for option in listOfCommandOptions:
+ commandOptionString += "%s " %(option)
+ message = "An error occurred running the command: $ %s %s\n" %(command, commandOptionString)
+ if (len(stdout) > 0):
+ message += stdout
+ message += "\n"
+ if (len(stderr) > 0):
+ message += stderr
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+
+def mountFilesystem(filesystemType, pathToDevice, pathToMountPoint):
+ """
+ This function will attempt to mount a filesystem. If the filesystem is
+ already mounted or the filesystem was successfully mounted then True is
+ returned, otherwise False is returned.
+
+ @return: If the filesystem is already mounted or the filesystem was
+ successfully mounted then True is returned, otherwise False is returned.
+ @rtype: Boolean
+
+ @param filesystemType: The type of filesystem that will be mounted.
+ @type filesystemType: String
+ @param pathToDevice: The path to the device that will be mounted.
+ @type pathToDevice: String
+ @param pathToMountPoint: The path to the directory that will be used as the
+ mount point for the device.
+ @type pathToMountPoint: String
+ """
+ if (os.path.ismount(PATH_TO_DEBUG_DIR)):
+ return True
+ listOfCommandOptions = ["-t", filesystemType, pathToDevice, pathToMountPoint]
+ if (not runCommand("mount", listOfCommandOptions)):
+ message = "There was an error mounting the filesystem type %s for the device %s to the mount point %s." %(filesystemType, pathToDevice, pathToMountPoint)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return os.path.ismount(PATH_TO_DEBUG_DIR)
+
+def exitScript(removePidFile=True, errorCode=0):
+ """
+ This function will cause the script to exit or quit. It will return an error
+ code and will remove the pid file that was created.
+
+ @param removePidFile: If True(default) then the pid file will be remove
+ before the script exits.
+ @type removePidFile: Boolean
+ @param errorCode: The exit code that will be returned. The default value is 0.
+ @type errorCode: Int
+ """
+ if (removePidFile):
+ message = "Removing the pid file: %s" %(PATH_TO_PID_FILENAME)
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
+ if (os.path.exists(PATH_TO_PID_FILENAME)):
+ try:
+ os.remove(PATH_TO_PID_FILENAME)
+ except IOError:
+ message = "There was an error removing the file: %s." %(PATH_TO_PID_FILENAME)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ message = "The script will exit."
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ sys.exit(errorCode)
+
+def getMountedGFS2Filesystems():
+ """
+ This function returns a list of all the mounted GFS2 filesystems.
+
+ @return: Returns a list of all the mounted GFS2 filesystems.
+ @rtype: Array
+ """
+ fsType = "gfs2"
+ listOfMountedFilesystems = []
+ dataOutput = FileUtils.getDataFromFile("/proc/mounts")
+ if (not dataOutput == None):
+ for line in dataOutput:
+ splitLine = line.split()
+ if (len(splitLine) > 0):
+ if (splitLine[2] == fsType):
+ listOfMountedFilesystems.append(line)
+ return listOfMountedFilesystems
+# ##############################################################################
+# Get user selected options
+# ##############################################################################
+def __getOptions(version) :
+ """
+ This function creates the OptionParser and returns a tuple of the
+ selected commandline options and commandline args.
+
+ cmdLineOpts contains the options the user selected; cmdLineArgs contains
+ the values passed that are not associated with an option.
+
+ @return: A tuple of the selected commandline options and commandline args.
+ @rtype: Tuple
+
+ @param version: The version of this script.
+ @type version: String
+ """
+ cmdParser = OptionParserExtended(version)
+ cmdParser.add_option("-d", "--debug",
+ action="store_true",
+ dest="enableDebugLogging",
+ help="enables debug logging",
+ default=False)
+ cmdParser.add_option("-q", "--quiet",
+ action="store_true",
+ dest="disableLoggingToConsole",
+ help="disables logging to console",
+ default=False)
+ cmdParser.add_option("-l", "--list",
+ action="store_true",
+ dest="listTraceEvents",
+ help="lists the enabled state and filters for the GFS2 trace events",
+ default=False)
+ cmdParser.add_option("-E", "--enable_all_trace_events",
+ action="store_true",
+ dest="enableAllTraceEvents",
+ help="enables all trace_events for GFS2",
+ default=False)
+ cmdParser.add_option("-e", "--enable_trace_event",
+ action="extend",
+ dest="enableTraceEventsList",
+ help="selected trace_events that will be enabled for GFS2",
+ type="string",
+ metavar="<trace event name>",
+ default=[])
+ cmdParser.add_option("-N", "--disable_all_trace_events",
+ action="store_true",
+ dest="disableAllTraceEvents",
+ help="disables all trace_events for GFS2",
+ default=False)
+ cmdParser.add_option("-n", "--disable_trace_event",
+ action="extend",
+ dest="disableTraceEventsList",
+ help="selected trace_events that will be disabled for GFS2",
+ type="string",
+ metavar="<trace event name>",
+ default=[])
+ cmdParser.add_option("-c", "--capture",
+ action="store",
+ dest="pathToOutputFilename",
+ help="enables capturing of trace events and will save the data to a file",
+ type="string",
+ metavar="<output filename>",
+ default="")
+
+ (cmdLineOpts, cmdLineArgs) = cmdParser.parse_args()
+ return (cmdLineOpts, cmdLineArgs)
+
+# ##############################################################################
+# OptParse classes for commandline options
+# ##############################################################################
+class OptionParserExtended(OptionParser):
+ """
+ This is the class that gets the command line options the end user
+ selects.
+ """
+ def __init__(self, version) :
+ """
+ @param version: The version of this script.
+ @type version: String
+ """
+ self.__commandName = os.path.basename(sys.argv[0])
+ versionMessage = "%s %s\n" %(self.__commandName, version)
+
+ commandDescription ="%s can enable trace events, disable trace events, and capture data from GFS2 trace events.\n"%(self.__commandName)
+ OptionParser.__init__(self, option_class=ExtendOption,
+ version=versionMessage,
+ description=commandDescription)
+
+ def print_help(self):
+ """
+ Print examples at the bottom of the help message.
+ """
+ exampleMessage = "\nExamples:\n"
+ exampleMessage += "To list the enable status and filter for each trace event.\n"
+ exampleMessage += "# %s -l\n\n" %(self.__commandName)
+ exampleMessage += "To disable all trace events.\n"
+ exampleMessage += "# %s -N\n\n" %(self.__commandName)
+ exampleMessage += "To enable all trace events.\n"
+ exampleMessage += "# %s -E\n\n" %(self.__commandName)
+ exampleMessage += "To disable all trace events and then enable a couple trace events.\n"
+ exampleMessage += "# %s -N -e gfs2_demote_rq,gfs2_glock_state_change,gfs2_promote\n\n" %(self.__commandName)
+ exampleMessage += "To capture all the trace events and write to the file /tmp/gfs2_trace.log.\n"
+ exampleMessage += "# %s -c /tmp/gfs2_trace.log\n\n" %(self.__commandName)
+ exampleMessage += "To disable all trace events and then enable a couple trace events and capture the output to a file.\n"
+ exampleMessage += "# %s -N -e gfs2_demote_rq,gfs2_glock_state_change,gfs2_promote -c /tmp/gfs2_trace.log\n" %(self.__commandName)
+ self.print_version()
+ OptionParser.print_help(self)
+ print exampleMessage
+
+class ExtendOption (Option):
+ """
+ Allows a comma-delimited list of entries to be specified for arrays
+ and dictionaries.
+ """
+ ACTIONS = Option.ACTIONS + ("extend",)
+ STORE_ACTIONS = Option.STORE_ACTIONS + ("extend",)
+ TYPED_ACTIONS = Option.TYPED_ACTIONS + ("extend",)
+
+ def take_action(self, action, dest, opt, value, values, parser):
+ """
+ This function is a wrapper to take certain options passed on command
+ prompt and wrap them into an Array.
+
+ @param action: The type of action that will be taken. For example:
+ "store_true", "store_false", "extend".
+ @type action: String
+ @param dest: The name of the variable that will be used to store the
+ option.
+ @type dest: String/Boolean/Array
+ @param opt: The option string that triggered the action.
+ @type opt: String
+ @param value: The value of opt(option) if it takes a
+ value, if not then None.
+ @type value:
+ @param values: All the opt(options) in a dictionary.
+ @type values: Dictionary
+ @param parser: The option parser that was originally called.
+ @type parser: OptionParser
+ """
+ if (action == "extend") :
+ valueList=[]
+ try:
+ for v in value.split(","):
+ # Need to add code for dealing with paths if there is option for paths.
+ valueList.append(v)
+ except:
+ pass
+ else:
+ values.ensure_value(dest, []).extend(valueList)
+ else:
+ Option.take_action(self, action, dest, opt, value, values, parser)
+
+# ###############################################################################
+# Main Function
+# ###############################################################################
+if __name__ == "__main__":
+ try:
+ # #######################################################################
+ # Get the options from the commandline.
+ # #######################################################################
+ (cmdLineOpts, cmdLineArgs) = __getOptions(VERSION_NUMBER)
+
+ # #######################################################################
+ # Setup the logger
+ # #######################################################################
+ # Create the logger
+ logLevel = logging.INFO
+ logger = logging.getLogger(MAIN_LOGGER_NAME)
+ logger.setLevel(logLevel)
+ # Create a new status function and level.
+ logging.STATUS = logging.INFO + 2
+ logging.addLevelName(logging.STATUS, "STATUS")
+ # Create a function for the STATUS_LEVEL since not defined by python. This
+ # means you can call it like the other predefined message
+ # functions. Example: logging.getLogger("loggerName").status(message)
+ setattr(logger, "status", lambda *args: logger.log(logging.STATUS, *args))
+ streamHandler = logging.StreamHandler()
+ streamHandler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
+ logger.addHandler(streamHandler)
+
+ # Set options on logger for debugging or no logging.
+ if (cmdLineOpts.disableLoggingToConsole):
+ logging.disable(logging.CRITICAL)
+ elif (cmdLineOpts.enableDebugLogging) :
+ logging.getLogger(MAIN_LOGGER_NAME).setLevel(logging.DEBUG)
+ message = "Debugging has been enabled."
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
+
+ # #######################################################################
+ # Check to see if pid file exists and error if it does.
+ # #######################################################################
+ if (os.path.exists(PATH_TO_PID_FILENAME)):
+ message = "The PID file %s already exists and this script cannot run until it is removed." %(PATH_TO_PID_FILENAME)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ message = "Verify that no other instances of this script are running. If there are, stop them first and then remove the file."
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ exitScript(removePidFile=False, errorCode=1)
+ else:
+ message = "Creating the pid file: %s" %(PATH_TO_PID_FILENAME)
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
+ # Create the pid file so we don't have more than 1 instance of this
+ # script running.
+ FileUtils.writeToFile(PATH_TO_PID_FILENAME, str(os.getpid()), createFile=True)
+
+ # #######################################################################
+ # Check to see if there any GFS2 filesystems mounted, if not then exit.
+ # #######################################################################
+ if (not len(getMountedGFS2Filesystems()) > 0):
+ message = "There are no GFS2 filesystems mounted."
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ exitScript(errorCode=1)
+
+ # #######################################################################
+ # Check to see if the debug directory is mounted. If not then
+ # log an error.
+ # #######################################################################
+ if(mountFilesystem("debugfs", "none", PATH_TO_DEBUG_DIR)):
+ message = "The debug filesystem %s is mounted." %(PATH_TO_DEBUG_DIR)
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
+ else:
+ message = "There was a problem mounting the debug filesystem: %s" %(PATH_TO_DEBUG_DIR)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ message = "The debug filesystem is required to be mounted for this script to run."
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ exitScript(errorCode=1)
+
+ # #######################################################################
+ # List of the enable state and filters for each trace event
+ # #######################################################################
+ traceEvents = TraceEvents(PATH_TO_GFS2_TRACE_EVENTS_DIR)
+ listOfTraceEventNames = traceEvents.getTraceEventNames()
+
+ if (cmdLineOpts.listTraceEvents):
+ listOfTraceEventNames.sort()
+ maxTraceEventNameSize = len(max(listOfTraceEventNames, key=len))
+ traceEventsString = ""
+ for traceEventName in listOfTraceEventNames:
+ traceEvent = traceEvents.getTraceEvent(traceEventName)
+ if (not traceEvent == None):
+ traceEventEnableStatus = "UNKNOWN"
+ if (traceEvent.getEnable() == "0"):
+ traceEventEnableStatus = "DISABLED"
+ elif (traceEvent.getEnable() == "1"):
+ traceEventEnableStatus = "ENABLED"
+ whitespaces = ""
+ for i in range(0, (maxTraceEventNameSize - len(traceEventName))):
+ whitespaces += " "
+ traceEventsString += "%s %s%s\n" %(traceEventName, whitespaces, traceEventEnableStatus)
+ # Disable logging to console so that only the listing below is printed to the console.
+ logging.disable(logging.CRITICAL)
+ if (len(traceEventsString) > 0):
+ print "trace event name trace event status"
+ print "---------------- ------------------"
+ print traceEventsString.rstrip()
+ exitScript()
+ # #######################################################################
+ # Enable or Disable Trace Events
+ # #######################################################################
+ if (cmdLineOpts.disableAllTraceEvents):
+ message = "Disabling all trace events."
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ for traceEventName in listOfTraceEventNames:
+ traceEvent = traceEvents.getTraceEvent(traceEventName)
+ if (not traceEvent == None):
+ traceEvent.setEventEnable("0")
+ elif (cmdLineOpts.enableAllTraceEvents):
+ message = "Enabling all trace events."
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ for traceEventName in listOfTraceEventNames:
+ traceEvent = traceEvents.getTraceEvent(traceEventName)
+ if (not traceEvent == None):
+ traceEvent.setEventEnable("1")
+
+ if (len(cmdLineOpts.disableTraceEventsList) > 0):
+ for traceEventName in cmdLineOpts.disableTraceEventsList:
+ traceEvent = traceEvents.getTraceEvent(traceEventName)
+ if (not traceEvent == None):
+ message = "Disabling the selected trace event: %s" %(traceEvent.getName())
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ traceEvent.setEventEnable("0")
+ if (len(cmdLineOpts.enableTraceEventsList) > 0):
+ for traceEventName in cmdLineOpts.enableTraceEventsList:
+ traceEvent = traceEvents.getTraceEvent(traceEventName)
+ if (not traceEvent == None):
+ message = "Enabling the selected trace event: %s" %(traceEvent.getName())
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ traceEvent.setEventEnable("1")
+
+ # #######################################################################
+ # Capture the data generated by the trace events.
+ # #######################################################################
+ if (len(cmdLineOpts.pathToOutputFilename) > 0):
+ # Read from tracing pipe and write the output to a file.
+ message = "The capturing of the trace events that were enabled to a file will be started by reading the trace pipe: %s." %(PATH_TO_TRACE_PIPE)
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ message = "Leave this script running until you have captured all the data, then hit control-c to exit."
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ try:
+ fout = open(cmdLineOpts.pathToOutputFilename, "w")
+ for line in fileinput.input(PATH_TO_TRACE_PIPE):
+ fout.write(line)
+ fout.close()
+ message = "The data was written to this file: %s" %(cmdLineOpts.pathToOutputFilename)
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ except KeyboardInterrupt:
+ message = "A control-c was detected and the capturing of trace events data will stop."
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ fout.close()
+ message = "The data was written to this file: %s" %(cmdLineOpts.pathToOutputFilename)
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ except UnicodeEncodeError, e:
+ message = "There was a unicode encode error writing to the file: %s." %(cmdLineOpts.pathToOutputFilename)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ except IOError:
+ message = "There was an error writing to the file: %s." %(cmdLineOpts.pathToOutputFilename)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ message = "The capturing of the trace event data has completed."
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ # Compress the file so that it will have a smaller file size.
+ pathToTarFilename = "%s.tar.bz2" %(os.path.splitext(cmdLineOpts.pathToOutputFilename)[0])
+ message = "Creating a compressed archived file: %s" %(pathToTarFilename)
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ try:
+ tar = tarfile.open(pathToTarFilename, "w:bz2")
+ tar.add(cmdLineOpts.pathToOutputFilename, arcname=os.path.basename(cmdLineOpts.pathToOutputFilename))
+ tar.close()
+ message = "The compressed archived file was created: %s" %(pathToTarFilename)
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ except tarfile.TarError:
+ message = "There was an error creating the tarfile: %s." %(pathToTarFilename)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ except KeyboardInterrupt:
+ print ""
+ message = "This script will exit since control-c was executed by end user."
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ exitScript()
+ # #######################################################################
+ # Exit the application with zero exit code since we cleanly exited.
+ # #######################################################################
+ exitScript()
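The enable and disable logic in the script above ultimately comes down to writing "0" or "1" into the per-event `enable` file that ftrace exposes under debugfs (e.g. /sys/kernel/debug/tracing/events/gfs2/<event>/enable). Here is a minimal sketch of that mechanism; the helper name and the `tracing_root` parameter are chosen for illustration and are not taken from the script:

```python
import os

def set_trace_event(tracing_root, subsystem, event_name, enabled):
    """Enable or disable one ftrace event by writing "1" or "0" to its
    per-event enable file, laid out as
    <tracing_root>/events/<subsystem>/<event_name>/enable."""
    path = os.path.join(tracing_root, "events", subsystem, event_name, "enable")
    with open(path, "w") as f:
        f.write("1" if enabled else "0")
    # Read the value back so callers can confirm the new state.
    with open(path) as f:
        return f.read().strip()
```

On a live system this requires root and a mounted debugfs; the script's TraceEvent.setEventEnable() presumably performs the same write under the hood.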
gfs2-utils: master - gfs2-utils: Add a doc on contributing
by Andrew Price
Gitweb: http://git.fedorahosted.org/git/?p=gfs2-utils.git;a=commitdiff;h=b3d765d9...
Commit: b3d765d9e7ec7c53bc7b88c7eef35f5a249ebcae
Parent: 699d35a6c313c4b2c6090a94939188e373a48168
Author: Andrew Price <anprice(a)redhat.com>
AuthorDate: Tue Dec 18 13:25:05 2012 +0000
Committer: Andrew Price <anprice(a)redhat.com>
CommitterDate: Tue Dec 18 14:03:14 2012 +0000
gfs2-utils: Add a doc on contributing
Add README.contributing to cover some common questions from
contributors.
Signed-off-by: Andrew Price <anprice(a)redhat.com>
---
doc/Makefile.am | 1 +
doc/README.contributing | 65 +++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 66 insertions(+), 0 deletions(-)
diff --git a/doc/Makefile.am b/doc/Makefile.am
index 05ec222..6a70f82 100644
--- a/doc/Makefile.am
+++ b/doc/Makefile.am
@@ -6,4 +6,5 @@ dist_doc_DATA = gfs2.txt \
COPYING.applications \
COPYING.libraries \
COPYRIGHT \
+ README.contributing \
README.licence
diff --git a/doc/README.contributing b/doc/README.contributing
new file mode 100644
index 0000000..d669271
--- /dev/null
+++ b/doc/README.contributing
@@ -0,0 +1,65 @@
+Contributing to gfs2-utils
+--------------------------
+
+Here are some brief guidelines to follow when contributing to gfs2-utils.
+
+Translations
+------------
+
+We use the Transifex translation service:
+
+ https://transifex.com/projects/p/gfs2-utils/
+
+See the documentation there for submitting translations.
+
+Patches
+-------
+
+We don't dictate any particular coding style but please try to use a style
+consistent with the existing code. If in doubt, the Linux kernel coding style
+document is a good guideline:
+
+ http://www.kernel.org/doc/Documentation/CodingStyle
+
+We use git for managing our source code and we assume here that you're familiar
+with git. Patches should apply cleanly to the latest master branch of
+gfs2-utils.git
+
+ http://git.fedorahosted.org/cgit/gfs2-utils.git
+
+For ease of review and maintenance each of your patches should address a single
+issue and if there are multiple issues please consider spreading your work over
+several patches. Ideally none of the individual patches should break the build.
+
+We value good commit logs, which should be of the form:
+
+ component: short patch summary
+
+ Longer description wrapped at approx. 72 columns explaining the problem the
+ patch addresses and how the patch addresses it.
+
+ Signed-off-by: Your Name <youremail(a)example.com>
+
+The "component" should be the name of the tool or the part of the code which
+the patch touches. As we share a mailing list with several projects it should
+make clear that it's a gfs2-utils patch. Some examples:
+
+Bad short logs:
+
+ Fix a bug
+ Add a test
+
+Good short logs:
+
+ fsck.gfs2: Fix a null pointer dereference in foo
+ gfs2-utils: Add a test for lgfs2_do_stuff
+
+Be sure to reference any relevant bug reports in your long description, e.g.
+
+ Ref: rhbz#012345
+ Fixes: rhbz#98765
+
+Please send patches to <cluster-devel(a)redhat.com>. We recommend using
+`git format-patch' to generate patch emails from your commits and `git
+send-email' for sending them to the list. See the git documentation for
+details.
cluster: STABLE32 - qdiskd: change log level for an error message
by Fabio M. Di Nitto
Gitweb: http://git.fedorahosted.org/git/?p=cluster.git;a=commitdiff;h=544a70cbcf9...
Commit: 544a70cbcf94410d277fd7123a1943c8d2bef790
Parent: c40959bc96cde293989296358ba06d09d6ee6d69
Author: Fabio M. Di Nitto <fdinitto(a)redhat.com>
AuthorDate: Tue Dec 18 14:50:43 2012 +0100
Committer: Fabio M. Di Nitto <fdinitto(a)redhat.com>
CommitterDate: Tue Dec 18 14:50:43 2012 +0100
qdiskd: change log level for an error message
Resolves: rhbz#888318
Signed-off-by: Fabio M. Di Nitto <fdinitto(a)redhat.com>
---
cman/qdisk/main.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/cman/qdisk/main.c b/cman/qdisk/main.c
index 8eb9a3a..68e72c7 100644
--- a/cman/qdisk/main.c
+++ b/cman/qdisk/main.c
@@ -1742,7 +1742,7 @@ get_static_config_data(qd_ctx *ctx, int ccsfd)
ctx->qc_token_timeout = atoi(val);
free(val);
if (ctx->qc_token_timeout < 10000) {
- logt_print(LOG_DEBUG, "Token timeout %d is too fast "
+ logt_print(LOG_ERR, "Token timeout %d is too fast "
"to use with qdiskd!\n",
ctx->qc_token_timeout);
return -1;
gfs2-utils: master - Fix clang --analyze warning.
by Andrew Price
Gitweb: http://git.fedorahosted.org/git/?p=gfs2-utils.git;a=commitdiff;h=699d35a6...
Commit: 699d35a6c313c4b2c6090a94939188e373a48168
Parent: f98cba4c9517fa593171542abf7d76859bd6d700
Author: Sitsofe Wheeler <sitsofe(a)yahoo.com>
AuthorDate: Mon Dec 17 14:43:13 2012 +0000
Committer: Andrew Price <anprice(a)redhat.com>
CommitterDate: Tue Dec 18 10:43:46 2012 +0000
Fix clang --analyze warning.
- Return before a possible NULL pointer dereference.
---
gfs2/libgfs2/fs_bits.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)
diff --git a/gfs2/libgfs2/fs_bits.c b/gfs2/libgfs2/fs_bits.c
index 94a612b..e4b5505 100644
--- a/gfs2/libgfs2/fs_bits.c
+++ b/gfs2/libgfs2/fs_bits.c
@@ -149,6 +149,8 @@ int gfs2_set_bitmap(struct gfs2_sbd *sdp, uint64_t blkno, int state)
break;
}
+ if (bits == NULL)
+ return -1;
byte = (unsigned char *)(rgd->bh[buf]->b_data + bits->bi_offset) +
(rgrp_block/GFS2_NBBY - bits->bi_start);
bit = (rgrp_block % GFS2_NBBY) * GFS2_BIT_SIZE;
gfs2-utils: master - gfs2-utils: Fix build warnings in Fedora 18
by Andrew Price
Gitweb: http://git.fedorahosted.org/git/?p=gfs2-utils.git;a=commitdiff;h=f98cba4c...
Commit: f98cba4c9517fa593171542abf7d76859bd6d700
Parent: be4e8573abaa77390337c41de27f0f9eb5b99eea
Author: Callum Massey <kais58(a)sucs.org>
AuthorDate: Fri Dec 14 22:30:36 2012 +0000
Committer: Andrew Price <anprice(a)redhat.com>
CommitterDate: Mon Dec 17 11:00:38 2012 +0000
gfs2-utils: Fix build warnings in Fedora 18
Dropped the * from gzFile *gzin_fd because gzFile is no longer defined
as a void pointer.
Rearranged includes in libgfs2/lang.c so parser.h gets passed struct
lgfs2_lang_state correctly.
Signed-off-by: Callum Massey <kais58(a)sucs.org>
---
gfs2/edit/savemeta.c | 2 +-
gfs2/libgfs2/lang.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/gfs2/edit/savemeta.c b/gfs2/edit/savemeta.c
index f35c35d..cfe18eb 100644
--- a/gfs2/edit/savemeta.c
+++ b/gfs2/edit/savemeta.c
@@ -831,7 +831,7 @@ void savemeta(char *out_fn, int saveoption, int gziplevel)
exit(0);
}
-static int restore_data(int fd, gzFile *gzin_fd, int printblocksonly,
+static int restore_data(int fd, gzFile gzin_fd, int printblocksonly,
int find_highblk)
{
size_t rs;
diff --git a/gfs2/libgfs2/lang.c b/gfs2/libgfs2/lang.c
index ad9382f..40a4355 100644
--- a/gfs2/libgfs2/lang.c
+++ b/gfs2/libgfs2/lang.c
@@ -7,8 +7,8 @@
#include <limits.h>
#include <ctype.h>
-#include "parser.h"
#include "lang.h"
+#include "parser.h"
const char* ast_type_string[] = {
[AST_NONE] = "NONE",
cluster: STABLE32 - If corosync goes down/is shut down cman will return 0 from cman_dispatch and close the socket. However, if a cman write operation is issued before this happens then SIGPIPE can result from the writev() call to an open, but disconnected, FD.
by Christine Caulfield
Gitweb: http://git.fedorahosted.org/git/?p=cluster.git;a=commitdiff;h=c40959bc96c...
Commit: c40959bc96cde293989296358ba06d09d6ee6d69
Parent: bd81c6f65aac30aaeb6d3eb129eab6ee3512d40c
Author: Christine Caulfield <ccaulfie(a)redhat.com>
AuthorDate: Mon Dec 17 10:17:37 2012 +0000
Committer: Christine Caulfield <ccaulfie(a)redhat.com>
CommitterDate: Mon Dec 17 10:17:37 2012 +0000
If corosync goes down/is shut down cman will return 0 from cman_dispatch and close the socket. However, if a cman write operation is issued before this happens then SIGPIPE can result from the writev() call to an open, but disconnected, FD.
This patch changes writev() to sendmsg() so it can pass MSG_NOSIGNAL to the
system call and prevent SIGPIPEs from occurring.
Acked-By: Fabio M. Di Nitto <fdinitto(a)redhat.com>
Signed-off-by: Christine Caulfield <ccaulfie(a)redhat.com>
---
cman/lib/libcman.c | 11 ++++++++++-
1 files changed, 10 insertions(+), 1 deletions(-)
diff --git a/cman/lib/libcman.c b/cman/lib/libcman.c
index 6ed8ecb..367129c 100644
--- a/cman/lib/libcman.c
+++ b/cman/lib/libcman.c
@@ -204,10 +204,19 @@ static int loopy_writev(int fd, struct iovec *iovptr, size_t iovlen)
{
size_t byte_cnt=0;
int len;
+ struct msghdr msg;
+
+ msg.msg_name = NULL;
+ msg.msg_namelen = 0;
+ msg.msg_control = NULL;
+ msg.msg_controllen = 0;
while (iovlen > 0)
{
- len = writev(fd, iovptr, iovlen);
+ msg.msg_iov = iovptr;
+ msg.msg_iovlen = iovlen;
+
+ len = sendmsg(fd, &msg, MSG_NOSIGNAL);
if (len <= 0)
return len;
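The same SIGPIPE-suppression idea can be sketched from Python, which exposes sendmsg() and the Linux-specific MSG_NOSIGNAL flag directly. Note this is a demonstration of the concept, not the cluster library code, and CPython already ignores SIGPIPE at startup, so the visible effect here is a clean BrokenPipeError rather than process death; in C the flag is what prevents the signal from being delivered:

```python
import socket

def send_nosignal(sock, data):
    # sendmsg() with MSG_NOSIGNAL: a disconnected peer yields EPIPE
    # (surfaced as BrokenPipeError) instead of a SIGPIPE signal.
    return sock.sendmsg([data], [], socket.MSG_NOSIGNAL)

a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
b.close()  # the peer goes away, like corosync being shut down

try:
    for _ in range(2):  # a first send may succeed on some transports
        send_nosignal(a, b"hello")
    outcome = "sent"
except BrokenPipeError:
    outcome = "EPIPE"
a.close()
```

socket.MSG_NOSIGNAL is Linux-only, which matches the platform the cluster code targets.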
gfs2-utils: master - gfs2-utils: Rename lockcapture directory to scripts
by Andrew Price
Gitweb: http://git.fedorahosted.org/git/?p=gfs2-utils.git;a=commitdiff;h=be4e8573...
Commit: be4e8573abaa77390337c41de27f0f9eb5b99eea
Parent: 882b2853f1d9545f86e942e6ad5cf0160413530c
Author: Andrew Price <anprice(a)redhat.com>
AuthorDate: Fri Dec 14 14:53:12 2012 +0000
Committer: Andrew Price <anprice(a)redhat.com>
CommitterDate: Fri Dec 14 14:53:12 2012 +0000
gfs2-utils: Rename lockcapture directory to scripts
As more scripts are likely to be added it makes sense to give this
directory a more generic name.
Signed-off-by: Andrew Price <anprice(a)redhat.com>
---
configure.ac | 2 +-
gfs2/Makefile.am | 2 +-
gfs2/lockcapture/Makefile.am | 12 -
gfs2/lockcapture/gfs2_lockcapture | 1231 -------------------------------------
gfs2/scripts/Makefile.am | 12 +
gfs2/scripts/gfs2_lockcapture | 1231 +++++++++++++++++++++++++++++++++++++
6 files changed, 1245 insertions(+), 1245 deletions(-)
diff --git a/configure.ac b/configure.ac
index 80670a4..3ba3b9b 100644
--- a/configure.ac
+++ b/configure.ac
@@ -267,7 +267,7 @@ AC_CONFIG_FILES([Makefile
gfs2/mkfs/Makefile
gfs2/tune/Makefile
gfs2/man/Makefile
- gfs2/lockcapture/Makefile
+ gfs2/scripts/Makefile
doc/Makefile
tests/Makefile
po/Makefile.in
diff --git a/gfs2/Makefile.am b/gfs2/Makefile.am
index 42a284f..645119f 100644
--- a/gfs2/Makefile.am
+++ b/gfs2/Makefile.am
@@ -1,4 +1,4 @@
MAINTAINERCLEANFILES = Makefile.in
SUBDIRS = libgfs2 convert edit fsck mkfs man \
- tune include lockcapture #init.d
+ tune include scripts #init.d
diff --git a/gfs2/lockcapture/Makefile.am b/gfs2/lockcapture/Makefile.am
deleted file mode 100644
index b88580e..0000000
--- a/gfs2/lockcapture/Makefile.am
+++ /dev/null
@@ -1,12 +0,0 @@
-MAINTAINERCLEANFILES = Makefile.in
-
-# When an exec_prefix setting would have us install into /usr/sbin,
-# use /sbin instead.
-# Accept an existing sbindir value of /usr/sbin (probably for older automake),
-# or an empty value, for automake-1.11 and newer.
-sbindir := $(shell rpl=0; test '$(exec_prefix):$(sbindir)' = /usr:/usr/sbin \
- || test '$(exec_prefix):$(sbindir)' = /usr: && rpl=1; \
- test $$rpl = 1 && echo /sbin || echo '$(exec_prefix)/sbin')
-
-
-dist_sbin_SCRIPTS = gfs2_lockcapture
diff --git a/gfs2/lockcapture/gfs2_lockcapture b/gfs2/lockcapture/gfs2_lockcapture
deleted file mode 100644
index 1a64188..0000000
--- a/gfs2/lockcapture/gfs2_lockcapture
+++ /dev/null
@@ -1,1231 +0,0 @@
-#!/usr/bin/env python
-"""
-The script gfs2_lockcapture will capture locking information from GFS2 file
-systems and DLM.
-
-@author : Shane Bradley
-@contact : sbradley(a)redhat.com
-@version : 0.9
-@copyright : GPLv2
-"""
-import sys
-import os
-import os.path
-import logging
-from optparse import OptionParser, Option
-import time
-import platform
-import shutil
-import subprocess
-import tarfile
-
-# #####################################################################
-# Global vars:
-# #####################################################################
-"""
-@cvar VERSION_NUMBER: The version number of this script.
-@type VERSION_NUMBER: String
-@cvar MAIN_LOGGER_NAME: The name of the logger.
-@type MAIN_LOGGER_NAME: String
-@cvar PATH_TO_DEBUG_DIR: The path to the debug directory for the linux kernel.
-@type PATH_TO_DEBUG_DIR: String
-@cvar PATH_TO_PID_FILENAME: The path to the pid file that will be used to make
-sure only 1 instance of this script is running at any time.
-@type PATH_TO_PID_FILENAME: String
-"""
-VERSION_NUMBER = "0.9-2"
-MAIN_LOGGER_NAME = "%s" %(os.path.basename(sys.argv[0]))
-PATH_TO_DEBUG_DIR="/sys/kernel/debug"
-PATH_TO_PID_FILENAME = "/var/run/%s.pid" %(os.path.basename(sys.argv[0]))
-
-# #####################################################################
-# Class to define what a clusternode is.
-# #####################################################################
-class ClusterNode:
- """
- This class represents a cluster node that is a current member in a cluster.
- """
- def __init__(self, clusternodeName, clusterName, mapOfMountedFilesystemLabels):
- """
- @param clusternodeName: The name of the cluster node.
- @type clusternodeName: String
- @param clusterName: The name of the cluster that this cluster node is a
- member of.
- @type clusterName: String
- @param mapOfMountedFilesystemLabels: A map of filesystem labels(key) for
- a mounted filesystem. The value is the line for the matching mounted
- filesystem from the mount -l command.
- @type mapOfMountedFilesystemLabels: Dict
- """
- self.__clusternodeName = clusternodeName
- self.__clusterName = clusterName
- self.__mapOfMountedFilesystemLabels = mapOfMountedFilesystemLabels
-
- def __str__(self):
- """
- This function will return a string representation of the object.
-
- @return: Returns a string representation of the object.
- @rtype: String
- """
- rString = ""
- rString += "%s:%s" %(self.getClusterName(), self.getClusterNodeName())
- fsLabels = self.__mapOfMountedFilesystemLabels.keys()
- fsLabels.sort()
- for fsLabel in fsLabels:
- rString += "\n\t%s --> %s" %(fsLabel, self.__mapOfMountedFilesystemLabels.get(fsLabel))
- return rString.rstrip()
-
- def getClusterNodeName(self):
- """
- Returns the name of the cluster node.
-
- @return: Returns the name of the cluster node.
- @rtype: String
- """
- return self.__clusternodeName
-
- def getClusterName(self):
- """
- Returns the name of cluster that this cluster node is a member of.
-
- @return: Returns the name of cluster that this cluster node is a member
- of.
- @rtype: String
- """
- return self.__clusterName
-
- def getMountedGFS2FilesystemNames(self, includeClusterName=True):
- """
- Returns the names of all the mounted GFS2 filesystems. By default
- includeClusterName is True which will include the name of the cluster
- and the GFS2 filesystem name(ex. f18cluster:mygfs2vol1) in the list of
- mounted GFS2 filesystems. If includeClusterName is False it will only
- return a list of all the mounted GFS2 filesystem names(ex. mygfs2vol1).
-
- @return: Returns a list of all the mounted GFS2 filesystem names.
- @rtype: Array
-
- @param includeClusterName: By default this option is True and will
- include the name of the cluster and the GFS2 filesystem name. If False
- then only the GFS2 filesystem name will be included.
- @param includeClusterName: Boolean
- """
- # If true will prepend the cluster name to gfs2 fs name
- if (includeClusterName):
- return self.__mapOfMountedFilesystemLabels.keys()
- else:
- listOfGFS2MountedFilesystemLabels = []
- for fsLabel in self.__mapOfMountedFilesystemLabels.keys():
- fsLabelSplit = fsLabel.split(":", 1)
- if (len(fsLabelSplit) == 2):
- listOfGFS2MountedFilesystemLabels.append(fsLabelSplit[1])
- return listOfGFS2MountedFilesystemLabels
-
-# #####################################################################
-# Helper functions.
-# #####################################################################
-def runCommand(command, listOfCommandOptions, standardOut=subprocess.PIPE, standardError=subprocess.PIPE):
- """
- This function will execute a command. It will return True if the return code
- was zero, otherwise False is returned.
-
- @return: Returns True if the return code was zero, otherwise False is
- returned.
- @rtype: Boolean
-
- @param command: The command that will be executed.
- @type command: String
- @param listOfCommandOptions: The list of options for the command that will
- be executed.
- @type listOfCommandOptions: Array
- @param standardOut: The pipe that will be used to write standard output. By
- default the pipe that is used is subprocess.PIPE.
- @type standardOut: Pipe
- @param standardError: The pipe that will be used to write standard error. By
- default the pipe that is used is subprocess.PIPE.
- @type standardError: Pipe
- """
- stdout = ""
- stderr = ""
- try:
- commandList = [command]
- commandList += listOfCommandOptions
- task = subprocess.Popen(commandList, stdout=standardOut, stderr=standardError)
- task.wait()
- (stdout, stderr) = task.communicate()
- return (task.returncode == 0)
- except OSError:
- commandOptionString = ""
- for option in listOfCommandOptions:
- commandOptionString += "%s " %(option)
- message = "An error occurred running the command: $ %s %s\n" %(command, commandOptionString)
- if (len(stdout) > 0):
- message += stdout
- message += "\n"
- if (len(stderr) > 0):
- message += stderr
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return False
-
-def runCommandOutput(command, listOfCommandOptions, standardOut=subprocess.PIPE, standardError=subprocess.PIPE):
- """
- This function will execute a command. Returns the output that was written to standard output. None is
- returned if there was an error.
-
- @return: Returns the output that was written to standard output. None is
- returned if there was an error.
- @rtype: String
-
- @param command: The command that will be executed.
- @type command: String
- @param listOfCommandOptions: The list of options for the command that will
- be executed.
- @type listOfCommandOptions: Array
- @param standardOut: The pipe that will be used to write standard output. By
- default the pipe that is used is subprocess.PIPE.
- @type standardOut: Pipe
- @param standardError: The pipe that will be used to write standard error. By
- default the pipe that is used is subprocess.PIPE.
- @type standardError: Pipe
- """
- stdout = ""
- stderr = ""
- try:
- commandList = [command]
- commandList += listOfCommandOptions
- task = subprocess.Popen(commandList, stdout=standardOut, stderr=standardError)
- task.wait()
- (stdout, stderr) = task.communicate()
- except OSError:
- commandOptionString = ""
- for option in listOfCommandOptions:
- commandOptionString += "%s " %(option)
- message = "An error occurred running the command: $ %s %s\n" %(command, commandOptionString)
- if (len(stdout) > 0):
- message += stdout
- message += "\n"
- if (len(stderr) > 0):
- message += stderr
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return None
- return stdout.strip().rstrip()
-
-def writeToFile(pathToFilename, data, appendToFile=True, createFile=False):
- """
- This function will write a string to a file.
-
- @return: Returns True if the string was successfully written to the file,
- otherwise False is returned.
- @rtype: Boolean
-
- @param pathToFilename: The path to the file that will have a string written
- to it.
- @type pathToFilename: String
- @param data: The string that will be written to the file.
- @type data: String
- @param appendToFile: If True then the data will be appened to the file, if
- False then the data will overwrite the contents of the file.
- @type appendToFile: Boolean
- @param createFile: If True then the file will be created if it does not
- exists, if False then file will not be created if it does not exist
- resulting in no data being written to the file.
- @type createFile: Boolean
- """
- [parentDir, filename] = os.path.split(pathToFilename)
- if (os.path.isfile(pathToFilename) or (os.path.isdir(parentDir) and createFile)):
- try:
- filemode = "w"
- if (appendToFile):
- filemode = "a"
- fout = open(pathToFilename, filemode)
- fout.write(data + "\n")
- fout.close()
- return True
- except UnicodeEncodeError, e:
- message = "There was a unicode encode error writing to the file: %s." %(pathToFilename)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return False
- except IOError:
- message = "There was an error writing to the file: %s." %(pathToFilename)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return False
- return False
-
-def mkdirs(pathToDSTDir):
- """
- This function will attempt to create a directory with the path of the value of pathToDSTDir.
-
- @return: Returns True if the directory was created or already exists.
- @rtype: Boolean
-
- @param pathToDSTDir: The path to the directory that will be created.
- @type pathToDSTDir: String
- """
- if (os.path.isdir(pathToDSTDir)):
- return True
- elif ((not os.access(pathToDSTDir, os.F_OK)) and (len(pathToDSTDir) > 0)):
- try:
- os.makedirs(pathToDSTDir)
- except (OSError, os.error):
- message = "Could not create the directory: %s." %(pathToDSTDir)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return False
- except (IOError, os.error):
- message = "Could not create the directory with the path: %s." %(pathToDSTDir)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return False
- return os.path.isdir(pathToDSTDir)
-
-def removePIDFile():
- """
- This function will remove the pid file.
-
- @return: Returns True if the file was successfully removed or does not exist,
- otherwise False is returned.
- @rtype: Boolean
- """
- message = "Removing the pid file: %s" %(PATH_TO_PID_FILENAME)
- logging.getLogger(MAIN_LOGGER_NAME).debug(message)
- if (os.path.exists(PATH_TO_PID_FILENAME)):
- try:
- os.remove(PATH_TO_PID_FILENAME)
- except IOError:
- message = "There was an error removing the file: %s." %(PATH_TO_PID_FILENAME)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return (not os.path.exists(PATH_TO_PID_FILENAME))
-
-def archiveData(pathToSrcDir):
- """
- This function will return the path to the tar.bz2 file that was created. If
- the tar.bz2 file failed to be created then an empty string will be returned
- which would indicate an error occurred.
-
- @return: This function will return the path to the tar.bz2 file that was
- created. If the tar.bz2 file failed to be created then an empty string will
- be returned which would indicate an error occurred.
- @rtype: String
-
- @param pathToSrcDir: The path to the directory that will be archived into a
- .tar.bz2 file.
- @type pathToSrcDir: String
- """
- if (os.path.exists(pathToSrcDir)):
- pathToTarFilename = "%s-%s.tar.bz2" %(pathToSrcDir, platform.node())
- if (os.path.exists(pathToTarFilename)):
- message = "A compressed archived file already exists and will be removed: %s" %(pathToTarFilename)
- logging.getLogger(MAIN_LOGGER_NAME).status(message)
- try:
- os.remove(pathToTarFilename)
- except IOError:
- message = "There was an error removing the file: %s." %(pathToTarFilename)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return ""
- message = "Creating a compressed archived file: %s" %(pathToTarFilename)
- logging.getLogger(MAIN_LOGGER_NAME).status(message)
- try:
- tar = tarfile.open(pathToTarFilename, "w:bz2")
- tar.add(pathToSrcDir, arcname=os.path.basename(pathToSrcDir))
- tar.close()
- except tarfile.TarError:
- message = "There was an error creating the tarfile: %s." %(pathToTarFilename)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return ""
- if (os.path.exists(pathToTarFilename)):
- return pathToTarFilename
- return ""
-
-def getDataFromFile(pathToSrcFile) :
- """
- This function will return the data in an array. Where each newline in file
- is a separate item in the array. This should really just be used on
- relatively small files.
-
- None is returned if no file is found.
-
- @return: Returns an array of Strings, where each newline in file is an item
- in the array.
- @rtype: Array
-
- @param pathToSrcFile: The path to the file which will be read.
- @type pathToSrcFile: String
- """
- if (len(pathToSrcFile) > 0) :
- try:
- fin = open(pathToSrcFile, "r")
- data = fin.readlines()
- fin.close()
- return data
- except (IOError, os.error):
- message = "An error occurred reading the file: %s." %(pathToSrcFile)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return None
-
-def copyFile(pathToSrcFile, pathToDstFile):
- """
- This function will copy a src file to dst file.
-
- @return: Returns True if the file was copied successfully.
- @rtype: Boolean
-
- @param pathToSrcFile: The path to the source file that will be copied.
- @type pathToSrcFile: String
- @param pathToDstFile: The path to the destination of the file.
- @type pathToDstFile: String
- """
- if(not os.path.exists(pathToSrcFile)):
- message = "The file does not exist with the path: %s." %(pathToSrcFile)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return False
- elif (not os.path.isfile(pathToSrcFile)):
- message = "The path to the source file is not a regular file: %s." %(pathToSrcFile)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return False
- elif (pathToSrcFile == pathToDstFile):
- message = "The path to the source file and path to destination file cannot be the same: %s." %(pathToDstFile)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return False
- else:
- # Create the directory structure if it does not exist.
- (head, tail) = os.path.split(pathToDstFile)
- if (not mkdirs(head)) :
- # The path to the directory was not created so file
- # could not be copied.
- return False
- # Copy the file to the dst path.
- try:
- shutil.copy(pathToSrcFile, pathToDstFile)
- except shutil.Error:
- message = "Cannot copy the file %s to %s." %(pathToSrcFile, pathToDstFile)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return False
- except OSError:
- message = "Cannot copy the file %s to %s." %(pathToSrcFile, pathToDstFile)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return False
- except IOError:
- message = "Cannot copy the file %s to %s." %(pathToSrcFile, pathToDstFile)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return False
- return (os.path.exists(pathToDstFile))
-
-def copyDirectory(pathToSrcDir, pathToDstDir):
- """
- This function will copy a src dir to dst dir.
-
- @return: Returns True if the dir was copied successfully.
- @rtype: Boolean
-
- @param pathToSrcDir: The path to the source dir that will be copied.
- @type pathToSrcDir: String
- @param pathToDstDir: The path to the destination of the dir.
- @type pathToDstDir: String
- """
- if(not os.path.exists(pathToSrcDir)):
- message = "The directory does not exist with the path: %s." %(pathToSrcDir)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return False
- elif (not os.path.isdir(pathToSrcDir)):
- message = "The path to the source directory is not a directory: %s." %(pathToSrcDir)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return False
- elif (pathToSrcDir == pathToDstDir):
- message = "The path to the source directory and path to destination directory cannot be the same: %s." %(pathToDstDir)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return False
- else:
- if (not mkdirs(pathToDstDir)) :
- # The path to the directory was not created so file
- # could not be copied.
- return False
- # Copy the file to the dst path.
- dst = os.path.join(pathToDstDir, os.path.basename(pathToSrcDir))
- try:
- shutil.copytree(pathToSrcDir, dst)
- except shutil.Error:
- message = "Cannot copy the directory %s to %s." %(pathToSrcDir, dst)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return False
- except OSError:
- message = "Cannot copy the directory %s to %s." %(pathToSrcDir, dst)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return False
- except IOError:
- message = "Cannot copy the directory %s to %s." %(pathToSrcDir, dst)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return False
- return (os.path.exists(dst))
-
-def backupOutputDirectory(pathToOutputDir):
- """
- This function will return True if the pathToOutputDir does not exist or the
- directory was successfully renamed. If pathToOutputDir exists and was not
- successfully renamed then False is returned.
-
- @return: Returns True if the pathToOutputDir does not exist or the directory
- was successfully renamed. If pathToOutputDir exists and was not successfully
- renamed then False is returned.
- @rtype: Boolean
-
- @param pathToOutputDir: The path to the directory that will be backed up.
- @type pathToOutputDir: String
- """
- if (os.path.exists(pathToOutputDir)):
- message = "The path already exists and could contain previous lockdump data: %s" %(pathToOutputDir)
- logging.getLogger(MAIN_LOGGER_NAME).info(message)
- backupIndex = 1
- pathToDST = ""
- keepSearchingForIndex = True
- while (keepSearchingForIndex):
- pathToDST = "%s.bk-%d" %(pathToOutputDir, backupIndex)
- if (os.path.exists(pathToDST)):
- backupIndex += 1
- else:
- keepSearchingForIndex = False
- try:
- message = "The existing output directory will be renamed: %s to %s." %(pathToOutputDir, pathToDST)
- logging.getLogger(MAIN_LOGGER_NAME).status(message)
- shutil.move(pathToOutputDir, pathToDST)
- except shutil.Error:
- message = "There was an error renaming the directory: %s to %s." %(pathToOutputDir, pathToDST)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- except OSError:
- message = "There was an error renaming the directory: %s to %s." %(pathToOutputDir, pathToDST)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- # The path should not exists now, else there was an error backing up an
- # existing output directory.
- return (not os.path.exists(pathToOutputDir))
-
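The indexed `.bk-N` naming scheme used by backupOutputDirectory() above can be illustrated with a minimal standalone sketch; the helper name `next_backup_path` is hypothetical and not part of this script:

```python
import os

def next_backup_path(path):
    # Find the first free "<path>.bk-N" name, mirroring the indexed
    # backup naming scheme used by backupOutputDirectory().
    index = 1
    while os.path.exists("%s.bk-%d" % (path, index)):
        index += 1
    return "%s.bk-%d" % (path, index)
```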
-def exitScript(removePidFile=True, errorCode=0):
- """
- This function will cause the script to exit. It will exit with the given
- error code and optionally remove the pid file that was created.
-
- @param removePidFile: If True(default) then the pid file will be removed
- before the script exits.
- @type removePidFile: Boolean
- @param errorCode: The exit code that will be returned. The default value is 0.
- @type errorCode: Int
- """
- if (removePidFile):
- removePIDFile()
- message = "The script will exit."
- logging.getLogger(MAIN_LOGGER_NAME).info(message)
- sys.exit(errorCode)
-
-# #####################################################################
-# Helper functions for gathering the lockdumps.
-# #####################################################################
-def getClusterNode(listOfGFS2Names):
- """
- This function returns a ClusterNode object if the machine is a member of a
- cluster and has GFS2 filesystems mounted for that cluster. The
- listOfGFS2Names is a list of GFS2 filesystems that need to have their data
- captured. If the list is empty then all the mounted GFS2 filesystems will be
- captured; if the list is not empty then only those GFS2 filesystems in the
- list will have their data captured.
-
- @return: Returns a cluster node object if there was mounted GFS2 filesystems
- found that will have their data captured.
- @rtype: ClusterNode
-
- @param listOfGFS2Names: A list of GFS2 filesystem names that will have their
- data captured. If the list is empty then all the mounted GFS2 filesystems
- will be captured; if the list is not empty then only those GFS2 filesystems
- in the list will have their data captured.
- @type listOfGFS2Names: Array
- """
- # Return a ClusterNode object if the clusternode and cluster name are found
- # in the output, else return None.
- clusterName = ""
- clusternodeName = ""
- if (runCommand("which", ["cman_tool"])):
- stdout = runCommandOutput("cman_tool", ["status"])
- if (not stdout == None):
- stdoutSplit = stdout.split("\n")
- clusterName = ""
- clusternodeName = ""
- for line in stdoutSplit:
- if (line.startswith("Cluster Name:")):
- clusterName = line.split("Cluster Name:")[1].strip().rstrip()
- if (line.startswith("Node name: ")):
- clusternodeName = line.split("Node name:")[1].strip().rstrip()
- elif (runCommand("which", ["corosync-cmapctl"])):
- # Another way to get the local cluster node is: $ crm_node -i; crm_node -l
- # Get the name of the cluster.
- stdout = runCommandOutput("corosync-cmapctl", ["-g", "totem.cluster_name"])
- if (not stdout == None):
- stdoutSplit = stdout.split("=")
- if (len(stdoutSplit) == 2):
- clusterName = stdoutSplit[1].strip().rstrip()
- # Get the id of the local cluster node so we can get the clusternode name
- thisNodeID = ""
- stdout = runCommandOutput("corosync-cmapctl", ["-g", "runtime.votequorum.this_node_id"])
- if (not stdout == None):
- stdoutSplit = stdout.split("=")
- if (len(stdoutSplit) == 2):
- thisNodeID = stdoutSplit[1].strip().rstrip()
- # Now that we have the nodeid we can get the clusternode name.
- if (len(thisNodeID) > 0):
- stdout = runCommandOutput("corosync-quorumtool", ["-l"])
- if (not stdout == None):
- for line in stdout.split("\n"):
- splitLine = line.split()
- if (len(splitLine) == 4):
- if (splitLine[0].strip().rstrip() == thisNodeID):
- clusternodeName = splitLine[3]
- break
- # If a clusternode name and cluster name were found then return a new object
- # since this means this node is part of a cluster.
- if ((len(clusterName) > 0) and (len(clusternodeName) > 0)):
- mapOfMountedFilesystemLabels = getLabelMapForMountedFilesystems(clusterName, getMountedGFS2Filesystems())
- # These will be the GFS2 filesystems that will have their lockdump information gathered.
- if (len(listOfGFS2Names) > 0):
- for label in mapOfMountedFilesystemLabels.keys():
- foundMatch = False
- for gfs2FSName in listOfGFS2Names:
- if ((gfs2FSName == label) or ("%s:%s"%(clusterName, gfs2FSName) == label)):
- foundMatch = True
- break
- if ((not foundMatch) and (mapOfMountedFilesystemLabels.has_key(label))):
- del(mapOfMountedFilesystemLabels[label])
- return ClusterNode(clusternodeName, clusterName, mapOfMountedFilesystemLabels)
- else:
- return None
-
-def getMountedGFS2Filesystems():
- """
- This function returns a list of all the mounted GFS2 filesystems.
-
- @return: Returns a list of all the mounted GFS2 filesystems.
- @rtype: Array
- """
- fsType = "gfs2"
- listOfMountedFilesystems = []
- stdout = runCommandOutput("mount", ["-l"])
- if (not stdout == None):
- stdoutSplit = stdout.split("\n")
- for line in stdoutSplit:
- splitLine = line.split()
- if (len(splitLine) >= 5):
- if (splitLine[4] == fsType):
- listOfMountedFilesystems.append(line)
- return listOfMountedFilesystems
-
-def getLabelMapForMountedFilesystems(clusterName, listOfMountedFilesystems):
- """
- This function will return a dictionary of the mounted GFS2 filesystem that
- contain a label that starts with the cluster name. For example:
- {'f18cluster:mygfs2vol1': '/dev/vdb1 on /mnt/gfs2vol1 type gfs2 (rw,relatime) [f18cluster:mygfs2vol1]'}
-
- @return: Returns a dictionary of the mounted GFS2 filesystems that contain a
- label that starts with the cluster name.
- @rtype: Dict
-
- @param clusterName: The name of the cluster.
- @type clusterName: String
- @param listOfMountedFilesystems: A list of all the mounted GFS2 filesystems.
- @type listOfMountedFilesystems: Array
- """
- mapOfMountedFilesystemLabels = {}
- for mountedFilesystem in listOfMountedFilesystems:
- splitMountedFilesystem = mountedFilesystem.split()
- fsLabel = splitMountedFilesystem[-1].strip().strip("[").rstrip("]")
- if (len(fsLabel) > 0):
- # Verify it starts with name of the cluster.
- if (fsLabel.startswith("%s:" %(clusterName))):
- mapOfMountedFilesystemLabels[fsLabel] = mountedFilesystem
- return mapOfMountedFilesystemLabels
-
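The label filtering done by getLabelMapForMountedFilesystems() above can be sketched as a standalone helper; `filter_labels` is a hypothetical name, and the sample mount line is the one from the docstring:

```python
def filter_labels(cluster_name, mounted_filesystem_lines):
    # Keep only mount lines whose trailing [label] starts with
    # "<cluster_name>:", as getLabelMapForMountedFilesystems() does.
    label_map = {}
    for line in mounted_filesystem_lines:
        tokens = line.split()
        if not tokens:
            continue
        label = tokens[-1].strip("[").rstrip("]")
        if label.startswith("%s:" % cluster_name):
            label_map[label] = line
    return label_map
```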
-def mountFilesystem(filesystemType, pathToDevice, pathToMountPoint):
- """
- This function will attempt to mount a filesystem. If the filesystem is
- already mounted or the filesystem was successfully mounted then True is
- returned, otherwise False is returned.
-
- @return: If the filesystem is already mounted or the filesystem was
- successfully mounted then True is returned, otherwise False is returned.
- @rtype: Boolean
-
- @param filesystemType: The type of filesystem that will be mounted.
- @type filesystemType: String
- @param pathToDevice: The path to the device that will be mounted.
- @type pathToDevice: String
- @param pathToMountPoint: The path to the directory that will be used as the
- mount point for the device.
- @type pathToMountPoint: String
- """
- if (os.path.ismount(pathToMountPoint)):
- return True
- listOfCommandOptions = ["-t", filesystemType, pathToDevice, pathToMountPoint]
- if (not runCommand("mount", listOfCommandOptions)):
- message = "There was an error mounting the filesystem type %s for the device %s to the mount point %s." %(filesystemType, pathToDevice, pathToMountPoint)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- return os.path.ismount(pathToMountPoint)
-
-def gatherGeneralInformation(pathToDSTDir):
- """
- This function will gather general information about the cluster and write
- the results to a file. The following data will be captured: hostname, date,
- uname -a, uptime, contents of /proc/mounts, and ps h -AL -o tid,s,cmd.
-
- @param pathToDSTDir: This is the path to directory where the files will be
- written to.
- @type pathToDSTDir: String
- """
- # Gather some general information and write to system.txt.
- systemString = "HOSTNAME=%s\nTIMESTAMP=%s\n" %(platform.node(), time.strftime("%Y-%m-%d %H:%M:%S"))
- stdout = runCommandOutput("uname", ["-a"])
- if (not stdout == None):
- systemString += "UNAMEA=%s\n" %(stdout.strip().rstrip())
- stdout = runCommandOutput("uptime", [])
- if (not stdout == None):
- systemString += "UPTIME=%s" %(stdout.strip().rstrip())
- writeToFile(os.path.join(pathToDSTDir, "hostinformation.txt"), systemString, createFile=True)
-
- # Copy misc files
- pathToSrcFile = "/proc/mounts"
- copyFile(pathToSrcFile, os.path.join(pathToDSTDir, pathToSrcFile.strip("/")))
- pathToSrcFile = "/proc/slabinfo"
- copyFile(pathToSrcFile, os.path.join(pathToDSTDir, pathToSrcFile.strip("/")))
-
- # Get "ps h -AL -o tid,s,cmd" data.
- command = "ps"
- pathToCommandOutput = os.path.join(pathToDSTDir, "ps_hALo-tid.s.cmd")
- try:
- fout = open(pathToCommandOutput, "w")
- #runCommand(command, ["-eo", "user,pid,%cpu,%mem,vsz,rss,tty,stat,start,time,comm,wchan"], standardOut=fout)
- runCommand(command, ["h", "-AL", "-o", "tid,s,cmd"], standardOut=fout)
- fout.close()
- except IOError:
- message = "There was an error writing the command output for %s to the file %s." %(command, pathToCommandOutput)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
-
-
-def isProcPidStackEnabled(pathToPidData):
- """
- Returns true if the init process has the file "stack" in its pid data
- directory which contains the task functions for that process.
-
- @return: Returns true if the init process has the file "stack" in its pid
- data directory which contains the task functions for that process.
- @rtype: Boolean
-
- @param pathToPidData: The path to the directory where all the pid data
- directories are located.
- @type pathToPidData: String
- """
- return os.path.exists(os.path.join(pathToPidData, "1/stack"))
-
-def gatherPidData(pathToPidData, pathToDSTDir):
- """
- This function will gather all the directories which contain data about all the pids.
-
- @return: Returns a list of paths to the directory that contains the
- information about the pid.
- @rtype: Array
-
- @param pathToPidData: The path to the directory where all the pid data
- directories are located.
- @type pathToPidData: String
- @param pathToDSTDir: The path to the directory where the copied pid data
- will be written.
- @type pathToDSTDir: String
- """
- # Status has: command name, pid, ppid, state, possibly registers
- listOfFilesToCopy = ["cmdline", "stack", "status"]
- listOfPathToPidsData = []
- if (os.path.exists(pathToPidData)):
- for srcFilename in os.listdir(pathToPidData):
- pathToPidDirDST = os.path.join(pathToDSTDir, srcFilename)
- if (srcFilename.isdigit()):
- pathToSrcDir = os.path.join(pathToPidData, srcFilename)
- for filenameToCopy in listOfFilesToCopy:
- copyFile(os.path.join(pathToSrcDir, filenameToCopy), os.path.join(pathToPidDirDST, filenameToCopy))
- if (os.path.exists(pathToPidDirDST)):
- listOfPathToPidsData.append(pathToPidDirDST)
- return listOfPathToPidsData
-
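The pid-directory selection used by gatherPidData() above (keeping only the numeric, per-process entries under /proc) can be shown in isolation; `list_pid_dirs` is a hypothetical helper name:

```python
import os

def list_pid_dirs(path_to_pid_data):
    # Select only the numeric (per-process) entries, the same rule
    # gatherPidData() applies via str.isdigit().
    return sorted(name for name in os.listdir(path_to_pid_data)
                  if name.isdigit())
```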
-def triggerSysRQEvents():
- """
- This function will trigger sysrq events which will write their output to
- /var/log/messages. The only event currently triggered is "t", which will
- dump the state information for all threads.
- """
- command = "echo"
- pathToSysrqTriggerFile = "/proc/sysrq-trigger"
- # m - dump information about memory allocation
- # t - dump thread state information
- # triggers = ["m", "t"]
- triggers = ["t"]
- for trigger in triggers:
- try:
- fout = open(pathToSysrqTriggerFile, "w")
- runCommand(command, [trigger], standardOut=fout)
- fout.close()
- except IOError:
- message = "There was an error writing the command output for %s to the file %s." %(command, pathToSysrqTriggerFile)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
-
-def gatherLogs(pathToDSTDir):
- """
- This function will copy all the cluster logs (/var/log/cluster) and the
- system log (/var/log/messages) to the directory given by pathToDSTDir.
-
- @param pathToDSTDir: This is the path to directory where the files will be
- copied to.
- @type pathToDSTDir: String
- """
- pathToLogFile = "/var/log/messages"
- pathToDSTLogFile = os.path.join(pathToDSTDir, os.path.basename(pathToLogFile))
- copyFile(pathToLogFile, pathToDSTLogFile)
-
- pathToLogDir = "/var/log/cluster"
- if (os.path.exists(pathToLogDir)):
- copyDirectory(pathToLogDir, pathToDSTDir)
-
-def gatherDLMLockDumps(pathToDSTDir, listOfGFS2Filesystems):
- """
- This function copies the debug files for dlm for a GFS2 filesystem in the
- list to a directory. The list of GFS2 filesystems will only include the
- filesystem name for each item in the list. For example: "mygfs2vol1"
-
- @param pathToDSTDir: This is the path to directory where the files will be
- copied to.
- @type pathToDSTDir: String
- @param listOfGFS2Filesystems: This is the list of the GFS2 filesystems that
- will have their debug directory copied.
- @type listOfGFS2Filesystems: Array
- """
- lockDumpType = "dlm"
- pathToSrcDir = os.path.join(PATH_TO_DEBUG_DIR, lockDumpType)
- pathToOutputDir = os.path.join(pathToDSTDir, lockDumpType)
- message = "Copying the files in the %s lockdump data directory %s." %(lockDumpType.upper(), pathToSrcDir)
- logging.getLogger(MAIN_LOGGER_NAME).debug(message)
- for filename in os.listdir(pathToSrcDir):
- for name in listOfGFS2Filesystems:
- if (filename.startswith(name)):
- copyFile(os.path.join(pathToSrcDir, filename),
- os.path.join(os.path.join(pathToOutputDir, name), filename))
-
-def gatherGFS2LockDumps(pathToDSTDir, listOfGFS2Filesystems):
- """
- This function copies the debug directory for a GFS2 filesystems in the list
- to a directory. The list of GFS2 filesystems will include the cluster name
- and filesystem name for each item in the list. For example:
- "f18cluster:mygfs2vol1"
-
- @param pathToDSTDir: This is the path to directory where the files will be
- copied to.
- @type pathToDSTDir: String
- @param listOfGFS2Filesystems: This is the list of the GFS2 filesystems that
- will have their debug directory copied.
- @type listOfGFS2Filesystems: Array
- """
- lockDumpType = "gfs2"
- pathToSrcDir = os.path.join(PATH_TO_DEBUG_DIR, lockDumpType)
- pathToOutputDir = os.path.join(pathToDSTDir, lockDumpType)
- for dirName in os.listdir(pathToSrcDir):
- pathToCurrentDir = os.path.join(pathToSrcDir, dirName)
- if ((os.path.isdir(pathToCurrentDir)) and (dirName in listOfGFS2Filesystems)):
- message = "Copying the lockdump data for the %s filesystem: %s" %(lockDumpType.upper(), dirName)
- logging.getLogger(MAIN_LOGGER_NAME).debug(message)
- copyDirectory(pathToCurrentDir, pathToOutputDir)
-
-# ##############################################################################
-# Get user selected options
-# ##############################################################################
-def __getOptions(version) :
- """
- This function creates the OptionParser and returns a tuple of the selected
- commandline options and commandline args.
-
- The cmdLineOpts are the options the user selected and cmdLineArgs are the
- values passed that are not associated with an option.
-
- @return: A tuple of the selected commandline options and commandline args.
- @rtype: Tuple
-
- @param version: The version of this script.
- @type version: String
- """
- cmdParser = OptionParserExtended(version)
- cmdParser.add_option("-d", "--debug",
- action="store_true",
- dest="enableDebugLogging",
- help="enables debug logging",
- default=False)
- cmdParser.add_option("-q", "--quiet",
- action="store_true",
- dest="disableLoggingToConsole",
- help="disables logging to console",
- default=False)
- cmdParser.add_option("-y", "--no_ask",
- action="store_true",
- dest="disableQuestions",
- help="disables all questions and assumes yes",
- default=False)
- cmdParser.add_option("-i", "--info",
- action="store_true",
- dest="enablePrintInfo",
- help="prints information about the mounted GFS2 file systems",
- default=False)
- cmdParser.add_option("-t", "--archive",
- action="store_true",
- dest="enableArchiveOutputDir",
- help="the output directory will be archived(tar) and compressed(.bz2)",
- default=False)
- cmdParser.add_option("-o", "--path_to_output_dir",
- action="store",
- dest="pathToOutputDir",
- help="the directory where all the collect data will be stored",
- type="string",
- metavar="<output directory>",
- default="")
- cmdParser.add_option("-r", "--num_of_runs",
- action="store",
- dest="numberOfRuns",
- help="number of runs capturing the lockdump data",
- type="int",
- metavar="<number of runs>",
- default=2)
- cmdParser.add_option("-s", "--seconds_sleep",
- action="store",
- dest="secondsToSleep",
- help="number of seconds to sleep between runs of capturing the lockdump data",
- type="int",
- metavar="<seconds to sleep>",
- default=120)
- cmdParser.add_option("-n", "--fs_name",
- action="extend",
- dest="listOfGFS2Names",
- help="name of the GFS2 filesystem(s) that will have their lockdump data captured",
- type="string",
- metavar="<name of GFS2 filesystem>",
- default=[])
- # Get the options and return the result.
- (cmdLineOpts, cmdLineArgs) = cmdParser.parse_args()
- return (cmdLineOpts, cmdLineArgs)
-
-# ##############################################################################
-# OptParse classes for commandline options
-# ##############################################################################
-class OptionParserExtended(OptionParser):
- """
- This is the class that gets the command line options the end user
- selects.
- """
- def __init__(self, version) :
- """
- @param version: The version of this script.
- @type version: String
- """
- self.__commandName = os.path.basename(sys.argv[0])
- versionMessage = "%s %s\n" %(self.__commandName, version)
-
- commandDescription ="%s will capture locking information from GFS2 file systems and DLM.\n"%(self.__commandName)
-
- OptionParser.__init__(self, option_class=ExtendOption,
- version=versionMessage,
- description=commandDescription)
-
- def print_help(self):
- """
- Print examples at the bottom of the help message.
- """
- self.print_version()
- examplesMessage = "\n"
- examplesMessage += "\nPrints information about the available GFS2 filesystems that can have lockdump data captured."
- examplesMessage += "\n$ %s -i\n" %(self.__commandName)
-
- examplesMessage += "\nIt will do 3 runs of gathering the lockdump information in 10 second intervals for only the"
- examplesMessage += "\nGFS2 filesystems with the names myGFS2vol2,myGFS2vol1. Then it will archive and compress"
- examplesMessage += "\nthe data collected. All of the lockdump data will be written to the directory: "
- examplesMessage += "\n/tmp/2012-11-12_095556-gfs2_lockcapture and all the questions will be answered with yes.\n"
- examplesMessage += "\n$ %s -r 3 -s 10 -t -n myGFS2vol2,myGFS2vol1 -o /tmp/2012-11-12_095556-gfs2_lockcapture -y\n" %(self.__commandName)
-
- examplesMessage += "\nIt will do 2 runs of gathering the lockdump information in 25 second intervals for all the"
- examplesMessage += "\nmounted GFS2 filesystems. Then it will archive and compress the data collected. All of the"
- examplesMessage += "\nlockdump data will be written to the directory: /tmp/2012-11-12_095556-gfs2_lockcapture.\n"
- examplesMessage += "\n$ %s -r 2 -s 25 -t -o /tmp/2012-11-12_095556-gfs2_lockcapture\n" %(self.__commandName)
- OptionParser.print_help(self)
- print examplesMessage
-
-class ExtendOption (Option):
- """
- Allow to specify comma delimited list of entries for arrays
- and dictionaries.
- """
- ACTIONS = Option.ACTIONS + ("extend",)
- STORE_ACTIONS = Option.STORE_ACTIONS + ("extend",)
- TYPED_ACTIONS = Option.TYPED_ACTIONS + ("extend",)
-
- def take_action(self, action, dest, opt, value, values, parser):
- """
- This function is a wrapper that takes certain options passed on the
- command prompt and wraps them into an Array.
-
- @param action: The type of action that will be taken. For example:
- "store_true", "store_false", "extend".
- @type action: String
- @param dest: The name of the variable that will be used to store the
- option.
- @type dest: String/Boolean/Array
- @param opt: The option string that triggered the action.
- @type opt: String
- @param value: The value of opt(option) if it takes a
- value, if not then None.
- @type value: String
- @param values: All the opt(options) in a dictionary.
- @type values: Dictionary
- @param parser: The option parser that was originally called.
- @type parser: OptionParser
- """
- if (action == "extend") :
- valueList = []
- try:
- for v in value.split(","):
- # Need to add code for dealing with paths if there is option for paths.
- newValue = v.strip().rstrip()
- if (len(newValue) > 0):
- valueList.append(newValue)
- except:
- pass
- else:
- values.ensure_value(dest, []).extend(valueList)
- else:
- Option.take_action(self, action, dest, opt, value, values, parser)
-
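The comma splitting performed by the "extend" action above can be isolated as a small function (`split_option_list` is a hypothetical name, not part of this script):

```python
def split_option_list(value):
    # Split a comma-delimited option value into a list of non-empty,
    # whitespace-stripped entries, as ExtendOption.take_action does
    # for action="extend".
    items = []
    for v in value.split(","):
        v = v.strip()
        if v:
            items.append(v)
    return items
```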
-# ###############################################################################
-# Main Function
-# ###############################################################################
-if __name__ == "__main__":
- """
- When the script is executed then this code is ran.
- """
- try:
- # #######################################################################
- # Get the options from the commandline.
- # #######################################################################
- (cmdLineOpts, cmdLineArgs) = __getOptions(VERSION_NUMBER)
- # #######################################################################
- # Setup the logger and create config directory
- # #######################################################################
- # Create the logger
- logLevel = logging.INFO
- logger = logging.getLogger(MAIN_LOGGER_NAME)
- logger.setLevel(logLevel)
- # Create a new status function and level.
- logging.STATUS = logging.INFO + 2
- logging.addLevelName(logging.STATUS, "STATUS")
- # Create a function for the STATUS level since it is not defined by python. This
- # means you can call it like the other predefined message
- # functions. Example: logging.getLogger("loggerName").status(message)
- setattr(logger, "status", lambda *args: logger.log(logging.STATUS, *args))
- streamHandler = logging.StreamHandler()
- streamHandler.setLevel(logLevel)
- streamHandler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
- logger.addHandler(streamHandler)
-
- # Please note there will not be a global log file created. If a log file
- # is needed then redirect the output. There will be a log file created
- # for each run in the corresponding directory.
-
- # #######################################################################
- # Set the logging levels.
- # #######################################################################
- if ((cmdLineOpts.enableDebugLogging) and (not cmdLineOpts.disableLoggingToConsole)):
- logging.getLogger(MAIN_LOGGER_NAME).setLevel(logging.DEBUG)
- streamHandler.setLevel(logging.DEBUG)
- message = "Debugging has been enabled."
- logging.getLogger(MAIN_LOGGER_NAME).debug(message)
- if (cmdLineOpts.disableLoggingToConsole):
- logging.disable(logging.CRITICAL)
- # #######################################################################
- # Check to see if pid file exists and error if it does.
- # #######################################################################
- if (os.path.exists(PATH_TO_PID_FILENAME)):
- message = "The PID file %s already exists and this script cannot run until it is removed." %(PATH_TO_PID_FILENAME)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- message = "Verify that there are no other instances of this script running. If there are, those processes need to be stopped first and the file removed."
- logging.getLogger(MAIN_LOGGER_NAME).info(message)
- exitScript(removePidFile=False, errorCode=1)
- else:
- message = "Creating the pid file: %s" %(PATH_TO_PID_FILENAME)
- logging.getLogger(MAIN_LOGGER_NAME).debug(message)
- # Create the pid file so we don't have more than one process of this
- # script running.
- writeToFile(PATH_TO_PID_FILENAME, str(os.getpid()), createFile=True)
- # #######################################################################
- # Verify they want to continue because this script will trigger sysrq events.
- # #######################################################################
- if (not cmdLineOpts.disableQuestions):
- valid = {"yes":True, "y":True, "no":False, "n":False}
- question = "This script will trigger a sysrq -t event or collect the data for each pid directory located in /proc for each run. Are you sure you want to continue?"
- prompt = " [y/n] "
- while True:
- sys.stdout.write(question + prompt)
- choice = raw_input().lower()
- if (choice in valid):
- if (valid.get(choice)):
- # If yes, or y then exit loop and continue.
- break
- else:
- message = "The script will not continue since you chose not to continue."
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- exitScript(removePidFile=True, errorCode=1)
- else:
- sys.stdout.write("Please respond with '(y)es' or '(n)o'.\n")
- # #######################################################################
- # Get the clusternode name and verify that mounted GFS2 filesystems were
- # found.
- # #######################################################################
- clusternode = getClusterNode(cmdLineOpts.listOfGFS2Names)
- if (clusternode == None):
- message = "The cluster or cluster node name could not be found."
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- exitScript(removePidFile=True, errorCode=1)
- elif (not len(clusternode.getMountedGFS2FilesystemNames()) > 0):
- message = "There were no mounted GFS2 filesystems found."
- if (len(cmdLineOpts.listOfGFS2Names) > 0):
- message = "There were no mounted GFS2 filesystems found with the name:"
- for name in cmdLineOpts.listOfGFS2Names:
- message += " %s" %(name)
- message += "."
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- exitScript(removePidFile=True, errorCode=1)
- if (cmdLineOpts.enablePrintInfo):
- logging.disable(logging.CRITICAL)
- print "List of all the mounted GFS2 filesystems that can have their lockdump data captured:"
- print clusternode
- exitScript()
- # #######################################################################
- # Create the output directory to verify it can be created before
- # proceeding, unless it was already created from a previous run whose data
- # needs to be analyzed. More debugging could be added to check whether the
- # path is a file or a directory.
- # #######################################################################
- pathToOutputDir = cmdLineOpts.pathToOutputDir
- if (not len(pathToOutputDir) > 0):
- pathToOutputDir = "%s" %(os.path.join("/tmp", "%s-%s-%s" %(time.strftime("%Y-%m-%d_%H%M%S"), clusternode.getClusterNodeName(), os.path.basename(sys.argv[0]))))
- # #######################################################################
- # Backup any existing directory with same name as current output
- # directory.
- # #######################################################################
- if (backupOutputDirectory(pathToOutputDir)):
- message = "This directory that will be used to capture all the data: %s" %(pathToOutputDir)
- logging.getLogger(MAIN_LOGGER_NAME).info(message)
- if (not mkdirs(pathToOutputDir)):
- exitScript(errorCode=1)
- else:
- # There was an existing directory with same path as current output
- # directory and it failed to back it up.
- message = "Please change the output directory path (-o) or manually rename or remove the existing path: %s" %(pathToOutputDir)
- logging.getLogger(MAIN_LOGGER_NAME).info(message)
- exitScript(errorCode=1)
- # #######################################################################
- # Check to see if the debug directory is mounted. If not then
- # log an error.
- # #######################################################################
- if(mountFilesystem("debugfs", "none", PATH_TO_DEBUG_DIR)):
- message = "The debug filesystem %s is mounted." %(PATH_TO_DEBUG_DIR)
- logging.getLogger(MAIN_LOGGER_NAME).info(message)
- else:
- message = "There was a problem mounting the debug filesystem: %s" %(PATH_TO_DEBUG_DIR)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- message = "The debug filesystem is required to be mounted for this script to run."
- logging.getLogger(MAIN_LOGGER_NAME).info(message)
- exitScript(errorCode=1)
- # #######################################################################
- # Gather data and the lockdumps.
- # #######################################################################
- if (cmdLineOpts.numberOfRuns <= 0):
- message = "The number of runs should be greater than zero."
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- exitScript(errorCode=1)
- for i in range(1,(cmdLineOpts.numberOfRuns + 1)):
- # The current run count starts at 1 and not zero so that it makes
- # sense in the logs.
- # Add the clusternode name under each run dir to make it easier to combine
- # the gfs2_lockcapture data from multiple clusternodes in each run directory.
- pathToOutputRunDir = os.path.join(pathToOutputDir, "run%d/%s" %(i, clusternode.getClusterNodeName()))
- # Create the directory that will be used to capture the data.
- if (not mkdirs(pathToOutputRunDir)):
- exitScript(errorCode=1)
- # Set the handler for writing to log file for this run.
- currentRunFileHandler = None
- pathToLogFile = os.path.join(pathToOutputRunDir, "%s.log" %(MAIN_LOGGER_NAME))
- if (((os.access(pathToLogFile, os.W_OK) and os.access("/tmp", os.R_OK))) or (not os.path.exists(pathToLogFile))):
- currentRunFileHandler = logging.FileHandler(pathToLogFile)
- currentRunFileHandler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s", "%Y-%m-%d %H:%M:%S"))
- logging.getLogger(MAIN_LOGGER_NAME).addHandler(currentRunFileHandler)
- message = "Pass (%d/%d): Gathering all the lockdump data." %(i, cmdLineOpts.numberOfRuns)
- logging.getLogger(MAIN_LOGGER_NAME).status(message)
-
- # Gather various bits of data from the clusternode.
- message = "Pass (%d/%d): Gathering general information about the host." %(i, cmdLineOpts.numberOfRuns)
- logging.getLogger(MAIN_LOGGER_NAME).debug(message)
- gatherGeneralInformation(pathToOutputRunDir)
- # Sleep for 2 seconds so that the TIMESTAMP will be in the past in the
- # logs, which guarantees the sysrq data is captured after it.
- time.sleep(2)
- # Gather the backtraces for all the pids, by grabbing the /proc/<pid>
- # data or by triggering sysrq events to capture the task back traces
- # from the log.
- message = "Pass (%d/%d): Triggering the sysrq events for the host." %(i, cmdLineOpts.numberOfRuns)
- logging.getLogger(MAIN_LOGGER_NAME).debug(message)
- # Gather the data in the /proc/<pid> directory if the file
- # /proc/<pid>/stack exists. If the file exists we will not trigger
- # sysrq events.
- pathToPidData = "/proc"
- if (isProcPidStackEnabled(pathToPidData)):
- gatherPidData(pathToPidData, os.path.join(pathToOutputRunDir, pathToPidData.strip("/")))
- else:
- triggerSysRQEvents()
- # Gather the dlm locks.
- lockDumpType = "dlm"
- message = "Pass (%d/%d): Gathering the %s lock dumps for the host." %(i, cmdLineOpts.numberOfRuns, lockDumpType.upper())
- logging.getLogger(MAIN_LOGGER_NAME).debug(message)
- gatherDLMLockDumps(pathToOutputRunDir, clusternode.getMountedGFS2FilesystemNames(includeClusterName=False))
- # Gather the glock locks from gfs2.
- lockDumpType = "gfs2"
- message = "Pass (%d/%d): Gathering the %s lock dumps for the host." %(i, cmdLineOpts.numberOfRuns, lockDumpType.upper())
- logging.getLogger(MAIN_LOGGER_NAME).debug(message)
- gatherGFS2LockDumps(pathToOutputRunDir, clusternode.getMountedGFS2FilesystemNames())
- # Gather log files
- message = "Pass (%d/%d): Gathering the log files for the host." %(i, cmdLineOpts.numberOfRuns)
- logging.getLogger(MAIN_LOGGER_NAME).debug(message)
- gatherLogs(os.path.join(pathToOutputRunDir, "logs"))
- # Sleep between each run if secondsToSleep is greater than or equal
- # to 0 and current run is not the last run.
- if ((cmdLineOpts.secondsToSleep >= 0) and (i <= (cmdLineOpts.numberOfRuns))):
- message = "The script will sleep for %d seconds between each run of capturing the lockdump data." %(cmdLineOpts.secondsToSleep)
- logging.getLogger(MAIN_LOGGER_NAME).info(message)
- time.sleep(cmdLineOpts.secondsToSleep)
- # Remove the handler:
- logging.getLogger(MAIN_LOGGER_NAME).removeHandler(currentRunFileHandler)
-
- # #######################################################################
- # Archive the directory that contains all the data after all the
- # information has been gathered.
- # #######################################################################
- message = "All the files have been gathered and this directory contains all the captured data: %s" %(pathToOutputDir)
- logging.getLogger(MAIN_LOGGER_NAME).info(message)
- if (cmdLineOpts.enableArchiveOutputDir):
- message = "The lockdump data will now be archived. This could take some time depending on the size of the data collected."
- logging.getLogger(MAIN_LOGGER_NAME).info(message)
- pathToTarFilename = archiveData(pathToOutputDir)
- if (os.path.exists(pathToTarFilename)):
- message = "The compressed archive file was created: %s" %(pathToTarFilename)
- logging.getLogger(MAIN_LOGGER_NAME).info(message)
- else:
- message = "The compressed archive file failed to be created: %s" %(pathToTarFilename)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- # #######################################################################
- except KeyboardInterrupt:
- print ""
- message = "This script will exit since control-c was executed by end user."
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- exitScript(errorCode=1)
- # #######################################################################
- # Exit the application with zero exit code since we cleanly exited.
- # #######################################################################
- exitScript()
diff --git a/gfs2/scripts/Makefile.am b/gfs2/scripts/Makefile.am
new file mode 100644
index 0000000..b88580e
--- /dev/null
+++ b/gfs2/scripts/Makefile.am
@@ -0,0 +1,12 @@
+MAINTAINERCLEANFILES = Makefile.in
+
+# When an exec_prefix setting would have us install into /usr/sbin,
+# use /sbin instead.
+# Accept an existing sbindir value of /usr/sbin (probably for older automake),
+# or an empty value, for automake-1.11 and newer.
+sbindir := $(shell rpl=0; test '$(exec_prefix):$(sbindir)' = /usr:/usr/sbin \
+ || test '$(exec_prefix):$(sbindir)' = /usr: && rpl=1; \
+ test $$rpl = 1 && echo /sbin || echo '$(exec_prefix)/sbin')
+
+
+dist_sbin_SCRIPTS = gfs2_lockcapture
diff --git a/gfs2/scripts/gfs2_lockcapture b/gfs2/scripts/gfs2_lockcapture
new file mode 100644
index 0000000..1a64188
--- /dev/null
+++ b/gfs2/scripts/gfs2_lockcapture
@@ -0,0 +1,1231 @@
+#!/usr/bin/env python
+"""
+The script gfs2_lockcapture will capture locking information from GFS2 file
+systems and DLM.
+
+@author : Shane Bradley
+@contact : sbradley(a)redhat.com
+@version : 0.9
+@copyright : GPLv2
+"""
+import sys
+import os
+import os.path
+import logging
+from optparse import OptionParser, Option
+import time
+import platform
+import shutil
+import subprocess
+import tarfile
+
+# #####################################################################
+# Global vars:
+# #####################################################################
+"""
+@cvar VERSION_NUMBER: The version number of this script.
+@type VERSION_NUMBER: String
+@cvar MAIN_LOGGER_NAME: The name of the logger.
+@type MAIN_LOGGER_NAME: String
+@cvar PATH_TO_DEBUG_DIR: The path to the debug directory for the linux kernel.
+@type PATH_TO_DEBUG_DIR: String
+@cvar PATH_TO_PID_FILENAME: The path to the pid file that will be used to make
+sure only 1 instance of this script is running at any time.
+@type PATH_TO_PID_FILENAME: String
+"""
+VERSION_NUMBER = "0.9-2"
+MAIN_LOGGER_NAME = "%s" %(os.path.basename(sys.argv[0]))
+PATH_TO_DEBUG_DIR="/sys/kernel/debug"
+PATH_TO_PID_FILENAME = "/var/run/%s.pid" %(os.path.basename(sys.argv[0]))
+
+# #####################################################################
+# Class to define what a clusternode is.
+# #####################################################################
+class ClusterNode:
+ """
+ This class represents a cluster node that is a current member of a cluster.
+ """
+ def __init__(self, clusternodeName, clusterName, mapOfMountedFilesystemLabels):
+ """
+ @param clusternodeName: The name of the cluster node.
+ @type clusternodeName: String
+ @param clusterName: The name of the cluster that this cluster node is a
+ member of.
+ @type clusterName: String
+ @param mapOfMountedFilesystemLabels: A map of filesystem labels(key) for
+ a mounted filesystem. The value is the line for the matching mounted
+ filesystem from the mount -l command.
+ @type mapOfMountedFilesystemLabels: Dict
+ """
+ self.__clusternodeName = clusternodeName
+ self.__clusterName = clusterName
+ self.__mapOfMountedFilesystemLabels = mapOfMountedFilesystemLabels
+
+ def __str__(self):
+ """
+ This function will return a string representation of the object.
+
+ @return: Returns a string representation of the object.
+ @rtype: String
+ """
+ rString = ""
+ rString += "%s:%s" %(self.getClusterName(), self.getClusterNodeName())
+ fsLabels = self.__mapOfMountedFilesystemLabels.keys()
+ fsLabels.sort()
+ for fsLabel in fsLabels:
+ rString += "\n\t%s --> %s" %(fsLabel, self.__mapOfMountedFilesystemLabels.get(fsLabel))
+ return rString.rstrip()
+
+ def getClusterNodeName(self):
+ """
+ Returns the name of the cluster node.
+
+ @return: Returns the name of the cluster node.
+ @rtype: String
+ """
+ return self.__clusternodeName
+
+ def getClusterName(self):
+ """
+ Returns the name of cluster that this cluster node is a member of.
+
+ @return: Returns the name of cluster that this cluster node is a member
+ of.
+ @rtype: String
+ """
+ return self.__clusterName
+
+ def getMountedGFS2FilesystemNames(self, includeClusterName=True):
+ """
+ Returns the names of all the mounted GFS2 filesystems. By default
+ includeClusterName is True which will include the name of the cluster
+ and the GFS2 filesystem name(ex. f18cluster:mygfs2vol1) in the list of
+ mounted GFS2 filesystems. If includeClusterName is False it will only
+ return a list of all the mounted GFS2 filesystem names(ex. mygfs2vol1).
+
+ @return: Returns a list of all the mounted GFS2 filesystem names.
+ @rtype: Array
+
+ @param includeClusterName: By default this option is True and will
+ include the name of the cluster and the GFS2 filesystem name. If False
+ then only the GFS2 filesystem name will be included.
+ @param includeClusterName: Boolean
+ """
+ # If true will prepend the cluster name to gfs2 fs name
+ if (includeClusterName):
+ return self.__mapOfMountedFilesystemLabels.keys()
+ else:
+ listOfGFS2MountedFilesystemLabels = []
+ for fsLabel in self.__mapOfMountedFilesystemLabels.keys():
+ fsLabelSplit = fsLabel.split(":", 1)
+ if (len(fsLabelSplit) == 2):
+ listOfGFS2MountedFilesystemLabels.append(fsLabelSplit[1])
+ return listOfGFS2MountedFilesystemLabels
+
+# #####################################################################
+# Helper functions.
+# #####################################################################
+def runCommand(command, listOfCommandOptions, standardOut=subprocess.PIPE, standardError=subprocess.PIPE):
+ """
+ This function will execute a command. It will return True if the return code
+ was zero, otherwise False is returned.
+
+ @return: Returns True if the return code was zero, otherwise False is
+ returned.
+ @rtype: Boolean
+
+ @param command: The command that will be executed.
+ @type command: String
+ @param listOfCommandOptions: The list of options for the command that will
+ be executed.
+ @type listOfCommandOptions: Array
+ @param standardOut: The pipe that will be used to write standard output. By
+ default the pipe that is used is subprocess.PIPE.
+ @type standardOut: Pipe
+ @param standardError: The pipe that will be used to write standard error. By
+ default the pipe that is used is subprocess.PIPE.
+ @type standardError: Pipe
+ """
+ stdout = ""
+ stderr = ""
+ try:
+ commandList = [command]
+ commandList += listOfCommandOptions
+ task = subprocess.Popen(commandList, stdout=standardOut, stderr=standardError)
+ task.wait()
+ (stdout, stderr) = task.communicate()
+ return (task.returncode == 0)
+ except OSError:
+ commandOptionString = ""
+ for option in listOfCommandOptions:
+ commandOptionString += "%s " %(option)
+ message = "An error occurred running the command: $ %s %s\n" %(command, commandOptionString)
+ if (len(stdout) > 0):
+ message += stdout
+ message += "\n"
+ if (len(stderr) > 0):
+ message += stderr
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+
+def runCommandOutput(command, listOfCommandOptions, standardOut=subprocess.PIPE, standardError=subprocess.PIPE):
+ """
+ This function will execute a command. Returns the output that was written to standard output. None is
+ returned if there was an error.
+
+ @return: Returns the output that was written to standard output. None is
+ returned if there was an error.
+ @rtype: String
+
+ @param command: The command that will be executed.
+ @type command: String
+ @param listOfCommandOptions: The list of options for the command that will
+ be executed.
+ @type listOfCommandOptions: Array
+ @param standardOut: The pipe that will be used to write standard output. By
+ default the pipe that is used is subprocess.PIPE.
+ @type standardOut: Pipe
+ @param standardError: The pipe that will be used to write standard error. By
+ default the pipe that is used is subprocess.PIPE.
+ @type standardError: Pipe
+ """
+ stdout = ""
+ stderr = ""
+ try:
+ commandList = [command]
+ commandList += listOfCommandOptions
+ task = subprocess.Popen(commandList, stdout=standardOut, stderr=standardError)
+ task.wait()
+ (stdout, stderr) = task.communicate()
+ except OSError:
+ commandOptionString = ""
+ for option in listOfCommandOptions:
+ commandOptionString += "%s " %(option)
+ message = "An error occurred running the command: $ %s %s\n" %(command, commandOptionString)
+ if (len(stdout) > 0):
+ message += stdout
+ message += "\n"
+ if (len(stderr) > 0):
+ message += stderr
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return None
+ return stdout.strip().rstrip()
+
+def writeToFile(pathToFilename, data, appendToFile=True, createFile=False):
+ """
+ This function will write a string to a file.
+
+ @return: Returns True if the string was successfully written to the file,
+ otherwise False is returned.
+ @rtype: Boolean
+
+ @param pathToFilename: The path to the file that will have a string written
+ to it.
+ @type pathToFilename: String
+ @param data: The string that will be written to the file.
+ @type data: String
+ @param appendToFile: If True then the data will be appended to the file, if
+ False then the data will overwrite the contents of the file.
+ @type appendToFile: Boolean
+ @param createFile: If True then the file will be created if it does not
+ exist; if False then the file will not be created if it does not exist,
+ resulting in no data being written to the file.
+ @type createFile: Boolean
+ """
+ [parentDir, filename] = os.path.split(pathToFilename)
+ if (os.path.isfile(pathToFilename) or (os.path.isdir(parentDir) and createFile)):
+ try:
+ filemode = "w"
+ if (appendToFile):
+ filemode = "a"
+ fout = open(pathToFilename, filemode)
+ fout.write(data + "\n")
+ fout.close()
+ return True
+ except UnicodeEncodeError, e:
+ message = "There was a unicode encode error writing to the file: %s." %(pathToFilename)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ except IOError:
+ message = "There was an error writing to the file: %s." %(pathToFilename)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ return False
+
+def mkdirs(pathToDSTDir):
+ """
+ This function will attempt to create a directory at the path given by pathToDSTDir.
+
+ @return: Returns True if the directory was created or already exists.
+ @rtype: Boolean
+
+ @param pathToDSTDir: The path to the directory that will be created.
+ @type pathToDSTDir: String
+ """
+ if (os.path.isdir(pathToDSTDir)):
+ return True
+ elif ((not os.access(pathToDSTDir, os.F_OK)) and (len(pathToDSTDir) > 0)):
+ try:
+ os.makedirs(pathToDSTDir)
+ except (OSError, os.error):
+ message = "Could not create the directory: %s." %(pathToDSTDir)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ except (IOError, os.error):
+ message = "Could not create the directory with the path: %s." %(pathToDSTDir)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ return os.path.isdir(pathToDSTDir)
+
+def removePIDFile():
+ """
+ This function will remove the pid file.
+
+ @return: Returns True if the file was successfully removed or does not exist,
+ otherwise False is returned.
+ @rtype: Boolean
+ """
+ message = "Removing the pid file: %s" %(PATH_TO_PID_FILENAME)
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
+ if (os.path.exists(PATH_TO_PID_FILENAME)):
+ try:
+ os.remove(PATH_TO_PID_FILENAME)
+ except OSError:
+ message = "There was an error removing the file: %s." %(PATH_TO_PID_FILENAME)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return (not os.path.exists(PATH_TO_PID_FILENAME))
+
+def archiveData(pathToSrcDir):
+ """
+ This function will return the path to the tar.bz2 file that was created. If
+ the tar.bz2 file failed to be created then an empty string will be returned
+ which would indicate an error occurred.
+
+ @return: This function will return the path to the tar.bz2 file that was
+ created. If the tar.bz2 file failed to be created then an empty string will
+ be returned which would indicate an error occurred.
+ @rtype: String
+
+ @param pathToSrcDir: The path to the directory that will be archived into a
+ .tar.bz2 file.
+ @type pathToSrcDir: String
+ """
+ if (os.path.exists(pathToSrcDir)):
+ pathToTarFilename = "%s-%s.tar.bz2" %(pathToSrcDir, platform.node())
+ if (os.path.exists(pathToTarFilename)):
+ message = "A compressed archive file already exists and will be removed: %s" %(pathToTarFilename)
+ logging.getLogger(MAIN_LOGGER_NAME).status(message)
+ try:
+ os.remove(pathToTarFilename)
+ except OSError:
+ message = "There was an error removing the file: %s." %(pathToTarFilename)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return ""
+ message = "Creating a compressed archive file: %s" %(pathToTarFilename)
+ logging.getLogger(MAIN_LOGGER_NAME).status(message)
+ try:
+ tar = tarfile.open(pathToTarFilename, "w:bz2")
+ tar.add(pathToSrcDir, arcname=os.path.basename(pathToSrcDir))
+ tar.close()
+ except tarfile.TarError:
+ message = "There was an error creating the tarfile: %s." %(pathToTarFilename)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return ""
+ if (os.path.exists(pathToTarFilename)):
+ return pathToTarFilename
+ return ""
+
+def getDataFromFile(pathToSrcFile) :
+ """
+ This function will return the data in an array, where each line in the file
+ is a separate item in the array. This should really just be used on
+ relatively small files.
+
+ None is returned if no file is found.
+
+ @return: Returns an array of Strings, where each line in the file is an item
+ in the array.
+ @rtype: Array
+
+ @param pathToSrcFile: The path to the file which will be read.
+ @type pathToSrcFile: String
+ """
+ if (len(pathToSrcFile) > 0) :
+ try:
+ fin = open(pathToSrcFile, "r")
+ data = fin.readlines()
+ fin.close()
+ return data
+ except (IOError, os.error):
+ message = "An error occurred reading the file: %s." %(pathToSrcFile)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return None
+
+def copyFile(pathToSrcFile, pathToDstFile):
+ """
+ This function will copy a src file to dst file.
+
+ @return: Returns True if the file was copied successfully.
+ @rtype: Boolean
+
+ @param pathToSrcFile: The path to the source file that will be copied.
+ @type pathToSrcFile: String
+ @param pathToDstFile: The path to the destination of the file.
+ @type pathToDstFile: String
+ """
+ if(not os.path.exists(pathToSrcFile)):
+ message = "The file does not exist with the path: %s." %(pathToSrcFile)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ elif (not os.path.isfile(pathToSrcFile)):
+ message = "The path to the source file is not a regular file: %s." %(pathToSrcFile)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ elif (pathToSrcFile == pathToDstFile):
+ message = "The path to the source file and path to destination file cannot be the same: %s." %(pathToDstFile)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ else:
+ # Create the directory structure if it does not exist.
+ (head, tail) = os.path.split(pathToDstFile)
+ if (not mkdirs(head)) :
+ # The path to the directory was not created so file
+ # could not be copied.
+ return False
+ # Copy the file to the dst path.
+ try:
+ shutil.copy(pathToSrcFile, pathToDstFile)
+ except shutil.Error:
+ message = "Cannot copy the file %s to %s." %(pathToSrcFile, pathToDstFile)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ except OSError:
+ message = "Cannot copy the file %s to %s." %(pathToSrcFile, pathToDstFile)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ except IOError:
+ message = "Cannot copy the file %s to %s." %(pathToSrcFile, pathToDstFile)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ return (os.path.exists(pathToDstFile))
+
+def copyDirectory(pathToSrcDir, pathToDstDir):
+ """
+ This function will copy a src dir to dst dir.
+
+ @return: Returns True if the dir was copied successfully.
+ @rtype: Boolean
+
+ @param pathToSrcDir: The path to the source dir that will be copied.
+ @type pathToSrcDir: String
+ @param pathToDstDir: The path to the destination of the dir.
+ @type pathToDstDir: String
+ """
+ if(not os.path.exists(pathToSrcDir)):
+ message = "The directory does not exist with the path: %s." %(pathToSrcDir)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ elif (not os.path.isdir(pathToSrcDir)):
+ message = "The path to the source directory is not a directory: %s." %(pathToSrcDir)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ elif (pathToSrcDir == pathToDstDir):
+ message = "The path to the source directory and path to destination directory cannot be the same: %s." %(pathToDstDir)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ else:
+ if (not mkdirs(pathToDstDir)) :
+ # The path to the directory was not created so file
+ # could not be copied.
+ return False
+ # Copy the file to the dst path.
+ dst = os.path.join(pathToDstDir, os.path.basename(pathToSrcDir))
+ try:
+ shutil.copytree(pathToSrcDir, dst)
+ except shutil.Error:
+ message = "Cannot copy the directory %s to %s." %(pathToSrcDir, dst)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ except OSError:
+ message = "Cannot copy the directory %s to %s." %(pathToSrcDir, dst)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ except IOError:
+ message = "Cannot copy the directory %s to %s." %(pathToSrcDir, dst)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ return (os.path.exists(dst))
+
+def backupOutputDirectory(pathToOutputDir):
+ """
+ This function will return True if the pathToOutputDir does not exist or the
+ directory was successfully renamed. If pathToOutputDir exists and was not
+ successfully renamed then False is returned.
+
+ @return: Returns True if the pathToOutputDir does not exist or the directory
+ was successfully renamed. If pathToOutputDir exists and was not successfully
+ renamed then False is returned.
+ @rtype: Boolean
+
+ @param pathToOutputDir: The path to the directory that will be backed up.
+ @type pathToOutputDir: String
+ """
+ if (os.path.exists(pathToOutputDir)):
+ message = "The path already exists and could contain previous lockdump data: %s" %(pathToOutputDir)
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ backupIndex = 1
+ pathToDST = ""
+ keepSearchingForIndex = True
+ while (keepSearchingForIndex):
+ pathToDST = "%s.bk-%d" %(pathToOutputDir, backupIndex)
+ if (os.path.exists(pathToDST)):
+ backupIndex += 1
+ else:
+ keepSearchingForIndex = False
+ try:
+ message = "The existing output directory will be renamed: %s to %s." %(pathToOutputDir, pathToDST)
+ logging.getLogger(MAIN_LOGGER_NAME).status(message)
+ shutil.move(pathToOutputDir, pathToDST)
+ except shutil.Error:
+ message = "There was an error renaming the directory: %s to %s." %(pathToOutputDir, pathToDST)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ except OSError:
+ message = "There was an error renaming the directory: %s to %s." %(pathToOutputDir, pathToDST)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ # The path should not exists now, else there was an error backing up an
+ # existing output directory.
+ return (not os.path.exists(pathToOutputDir))
+
+def exitScript(removePidFile=True, errorCode=0):
+ """
+ This function will cause the script to exit. It will exit with the given
+ error code and will remove the pid file that was created.
+
+ @param removePidFile: If True(default) then the pid file will be remove
+ before the script exits.
+ @type removePidFile: Boolean
+ @param errorCode: The exit code that will be returned. The default value is 0.
+ @type errorCode: Int
+ """
+ if (removePidFile):
+ removePIDFile()
+ message = "The script will exit."
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ sys.exit(errorCode)
+
+# #####################################################################
+# Helper functions for gathering the lockdumps.
+# #####################################################################
+def getClusterNode(listOfGFS2Names):
+ """
+ This function returns a ClusterNode object if the machine is a member of a
+ cluster and has GFS2 filesystems mounted for that cluster. The
+ listOfGFS2Names is a list of GFS2 filesystems that need to have their data
+ captured. If the list is empty then all the mounted GFS2 filesystems will
+ be captured; if the list is not empty then only those GFS2 filesystems in
+ the list will have their data captured.
+
+ @return: Returns a cluster node object if there was mounted GFS2 filesystems
+ found that will have their data captured.
+ @rtype: ClusterNode
+
+ @param listOfGFS2Names: A list of GFS2 filesystem names that will have their
+ data captured. If the list is empty then all the mounted GFS2 filesystems
+ will be captured; if the list is not empty then only those GFS2
+ filesystems in the list will have their data captured.
+ @type listOfGFS2Names: Array
+ """
+ # Return a ClusterNode object if the clusternode and cluster name are found
+ # in the output, else return None.
+ clusterName = ""
+ clusternodeName = ""
+ if (runCommand("which", ["cman_tool"])):
+ stdout = runCommandOutput("cman_tool", ["status"])
+ if (not stdout == None):
+ stdoutSplit = stdout.split("\n")
+ clusterName = ""
+ clusternodeName = ""
+ for line in stdoutSplit:
+ if (line.startswith("Cluster Name:")):
+ clusterName = line.split("Cluster Name:")[1].strip().rstrip()
+ if (line.startswith("Node name: ")):
+ clusternodeName = line.split("Node name:")[1].strip().rstrip()
+ elif (runCommand("which", ["corosync-cmapctl"])):
+ # Another way to get the local cluster node is: $ crm_node -i; crm_node -l
+ # Get the name of the cluster.
+ stdout = runCommandOutput("corosync-cmapctl", ["-g", "totem.cluster_name"])
+ if (not stdout == None):
+ stdoutSplit = stdout.split("=")
+ if (len(stdoutSplit) == 2):
+ clusterName = stdoutSplit[1].strip().rstrip()
+ # Get the id of the local cluster node so we can get the clusternode name
+ thisNodeID = ""
+ stdout = runCommandOutput("corosync-cmapctl", ["-g", "runtime.votequorum.this_node_id"])
+ if (not stdout == None):
+ stdoutSplit = stdout.split("=")
+ if (len(stdoutSplit) == 2):
+ thisNodeID = stdoutSplit[1].strip().rstrip()
+ # Now that we have the nodeid we can get the clusternode name.
+ if (len(thisNodeID) > 0):
+ stdout = runCommandOutput("corosync-quorumtool", ["-l"])
+ if (not stdout == None):
+ for line in stdout.split("\n"):
+ splitLine = line.split()
+ if (len(splitLine) == 4):
+ if (splitLine[0].strip().rstrip() == thisNodeID):
+ clusternodeName = splitLine[3]
+ break
+ # If a clusternode name and cluster name was found then return a new object
+ # since this means this cluster is part of cluster.
+ if ((len(clusterName) > 0) and (len(clusternodeName) > 0)):
+ mapOfMountedFilesystemLabels = getLabelMapForMountedFilesystems(clusterName, getMountedGFS2Filesystems())
+ # These will be the GFS2 filesystems that will have their lockdump information gathered.
+ if (len(listOfGFS2Names) > 0):
+ for label in mapOfMountedFilesystemLabels.keys():
+ foundMatch = False
+ for gfs2FSName in listOfGFS2Names:
+ if ((gfs2FSName == label) or ("%s:%s"%(clusterName, gfs2FSName) == label)):
+ foundMatch = True
+ break
+ if ((not foundMatch) and (mapOfMountedFilesystemLabels.has_key(label))):
+ del(mapOfMountedFilesystemLabels[label])
+ return ClusterNode(clusternodeName, clusterName, mapOfMountedFilesystemLabels)
+ else:
+ return None
+
+def getMountedGFS2Filesystems():
+ """
+ This function returns a list of all the mounted GFS2 filesystems.
+
+ @return: Returns a list of all the mounted GFS2 filesystems.
+ @rtype: Array
+ """
+ fsType = "gfs2"
+ listOfMountedFilesystems = []
+ stdout = runCommandOutput("mount", ["-l"])
+ if (not stdout == None):
+ stdoutSplit = stdout.split("\n")
+ for line in stdoutSplit:
+ splitLine = line.split()
+ if (len(splitLine) >= 5):
+ if (splitLine[4] == fsType):
+ listOfMountedFilesystems.append(line)
+ return listOfMountedFilesystems
+
+def getLabelMapForMountedFilesystems(clusterName, listOfMountedFilesystems):
+ """
+ This function will return a dictionary of the mounted GFS2 filesystems that
+ contain a label that starts with the cluster name. For example:
+ {'f18cluster:mygfs2vol1': '/dev/vdb1 on /mnt/gfs2vol1 type gfs2 (rw,relatime) [f18cluster:mygfs2vol1]'}
+
+ @return: Returns a dictionary of the mounted GFS2 filesystems that contain a
+ label that starts with the cluster name.
+ @rtype: Dict
+
+ @param clusterName: The name of the cluster.
+ @type clusterName: String
+ @param listOfMountedFilesystems: A list of all the mounted GFS2 filesystems.
+ @type listOfMountedFilesystems: Array
+ """
+ mapOfMountedFilesystemLabels = {}
+ for mountedFilesystem in listOfMountedFilesystems:
+ splitMountedFilesystem = mountedFilesystem.split()
+ fsLabel = splitMountedFilesystem[-1].strip().strip("[").rstrip("]")
+ if (len(fsLabel) > 0):
+ # Verify it starts with name of the cluster.
+ if (fsLabel.startswith("%s:" %(clusterName))):
+ mapOfMountedFilesystemLabels[fsLabel] = mountedFilesystem
+ return mapOfMountedFilesystemLabels
+
+def mountFilesystem(filesystemType, pathToDevice, pathToMountPoint):
+ """
+ This function will attempt to mount a filesystem. If the filesystem is
+ already mounted or the filesystem was successfully mounted then True is
+ returned, otherwise False is returned.
+
+ @return: If the filesystem is already mounted or the filesystem was
+ successfully mounted then True is returned, otherwise False is returned.
+ @rtype: Boolean
+
+ @param filesystemType: The type of filesystem that will be mounted.
+ @type filesystemType: String
+ @param pathToDevice: The path to the device that will be mounted.
+ @type pathToDevice: String
+ @param pathToMountPoint: The path to the directory that will be used as the
+ mount point for the device.
+ @type pathToMountPoint: String
+ """
+ if (os.path.ismount(pathToMountPoint)):
+ return True
+ listOfCommandOptions = ["-t", filesystemType, pathToDevice, pathToMountPoint]
+ if (not runCommand("mount", listOfCommandOptions)):
+ message = "There was an error mounting the filesystem type %s for the device %s to the mount point %s." %(filesystemType, pathToDevice, pathToMountPoint)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return os.path.ismount(pathToMountPoint)
+
+def gatherGeneralInformation(pathToDSTDir):
+ """
+ This function will gather general information about the cluster and write
+ the results to a file. The following data will be captured: hostname, date,
+ uname -a, uptime, contents of /proc/mounts, and ps h -AL -o tid,s,cmd.
+
+
+ @param pathToDSTDir: This is the path to directory where the files will be
+ written to.
+ @type pathToDSTDir: String
+ """
+ # Gather some general information and write to system.txt.
+ systemString = "HOSTNAME=%s\nTIMESTAMP=%s\n" %(platform.node(), time.strftime("%Y-%m-%d %H:%M:%S"))
+ stdout = runCommandOutput("uname", ["-a"])
+ if (not stdout == None):
+ systemString += "UNAMEA=%s\n" %(stdout.strip().rstrip())
+ stdout = runCommandOutput("uptime", [])
+ if (not stdout == None):
+ systemString += "UPTIME=%s" %(stdout.strip().rstrip())
+ writeToFile(os.path.join(pathToDSTDir, "hostinformation.txt"), systemString, createFile=True)
+
+ # Copy misc files
+ pathToSrcFile = "/proc/mounts"
+ copyFile(pathToSrcFile, os.path.join(pathToDSTDir, pathToSrcFile.strip("/")))
+ pathToSrcFile = "/proc/slabinfo"
+ copyFile(pathToSrcFile, os.path.join(pathToDSTDir, pathToSrcFile.strip("/")))
+
+ # Get "ps h -AL -o tid,s,cmd" data.
+ command = "ps"
+ pathToCommandOutput = os.path.join(pathToDSTDir, "ps_hALo-tid.s.cmd")
+ try:
+ fout = open(pathToCommandOutput, "w")
+ #runCommand(command, ["-eo", "user,pid,%cpu,%mem,vsz,rss,tty,stat,start,time,comm,wchan"], standardOut=fout)
+ runCommand(command, ["h", "-AL", "-o", "tid,s,cmd"], standardOut=fout)
+ fout.close()
+ except IOError:
+ message = "There was an error writing the command output for %s to the file %s." %(command, pathToCommandOutput)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+
+
+def isProcPidStackEnabled(pathToPidData):
+ """
+ Returns true if the init process has the file "stack" in its pid data
+ directory which contains the task functions for that process.
+
+ @return: Returns true if the init process has the file "stack" in its pid
+ data directory which contains the task functions for that process.
+ @rtype: Boolean
+
+ @param pathToPidData: The path to the directory where all the pid data
+ directories are located.
+ @type pathToPidData: String
+ """
+ return os.path.exists(os.path.join(pathToPidData, "1/stack"))
+
+def gatherPidData(pathToPidData, pathToDSTDir):
+ """
+ This command will gather all the directories which contain data about all the pids.
+
+ @return: Returns a list of paths to the directory that contains the
+ information about the pid.
+ @rtype: Array
+
+ @param pathToPidData: The path to the directory where all the pid data
+ directories are located.
+ @type pathToPidData: String
+ """
+ # Status has: command name, pid, ppid, state, possibly registers
+ listOfFilesToCopy = ["cmdline", "stack", "status"]
+ listOfPathToPidsData = []
+ if (os.path.exists(pathToPidData)):
+ for srcFilename in os.listdir(pathToPidData):
+ pathToPidDirDST = os.path.join(pathToDSTDir, srcFilename)
+ if (srcFilename.isdigit()):
+ pathToSrcDir = os.path.join(pathToPidData, srcFilename)
+ for filenameToCopy in listOfFilesToCopy:
+ copyFile(os.path.join(pathToSrcDir, filenameToCopy), os.path.join(pathToPidDirDST, filenameToCopy))
+ if (os.path.exists(pathToPidDirDST)):
+ listOfPathToPidsData.append(pathToPidDirDST)
+ return listOfPathToPidsData
+
+def triggerSysRQEvents():
+ """
+ This command will trigger sysrq events which will write the output to
+ /var/log/messages. Currently only the "t" event is triggered, which dumps
+ the state information of all threads. (The "m" event, which dumps
+ information about memory allocation, is commented out below.)
+ """
+ command = "echo"
+ pathToSysrqTriggerFile = "/proc/sysrq-trigger"
+ # m - dump information about memory allocation
+ # t - dump thread state information
+ # triggers = ["m", "t"]
+ triggers = ["t"]
+ for trigger in triggers:
+ try:
+ fout = open(pathToSysrqTriggerFile, "w")
+ runCommand(command, [trigger], standardOut=fout)
+ fout.close()
+ except IOError:
+ message = "There was an error writing the command output for %s to the file %s." %(command, pathToSysrqTriggerFile)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+
+def gatherLogs(pathToDSTDir):
+ """
+ This function will copy all the cluster logs(/var/log/cluster) and the
+ system log(/var/log/messages) to the directory given by pathToDSTDir.
+
+ @param pathToDSTDir: This is the path to directory where the files will be
+ copied to.
+ @type pathToDSTDir: String
+ """
+ pathToLogFile = "/var/log/messages"
+ pathToDSTLogFile = os.path.join(pathToDSTDir, os.path.basename(pathToLogFile))
+ copyFile(pathToLogFile, pathToDSTLogFile)
+
+ pathToLogDir = "/var/log/cluster"
+ if (os.path.exists(pathToLogDir)):
+ copyDirectory(pathToLogDir, pathToDSTDir)
+
+def gatherDLMLockDumps(pathToDSTDir, listOfGFS2Filesystems):
+ """
+ This function copies the dlm debug files for each GFS2 filesystem in the
+ list to a directory. The list of GFS2 filesystems will only include the
+ filesystem name for each item in the list. For example: "mygfs2vol1"
+
+ @param pathToDSTDir: This is the path to directory where the files will be
+ copied to.
+ @type pathToDSTDir: String
+ @param listOfGFS2Filesystems: This is the list of the GFS2 filesystems that
+ will have their debug directory copied.
+ @type listOfGFS2Filesystems: Array
+ """
+ lockDumpType = "dlm"
+ pathToSrcDir = os.path.join(PATH_TO_DEBUG_DIR, lockDumpType)
+ pathToOutputDir = os.path.join(pathToDSTDir, lockDumpType)
+ message = "Copying the files in the %s lockdump data directory %s." %(lockDumpType.upper(), pathToSrcDir)
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
+ for filename in os.listdir(pathToSrcDir):
+ for name in listOfGFS2Filesystems:
+ if (filename.startswith(name)):
+ copyFile(os.path.join(pathToSrcDir, filename),
+ os.path.join(os.path.join(pathToOutputDir, name), filename))
+
+def gatherGFS2LockDumps(pathToDSTDir, listOfGFS2Filesystems):
+ """
+ This function copies the debug directory for each GFS2 filesystem in the list
+ to a directory. The list of GFS2 filesystems will include the cluster name
+ and filesystem name for each item in the list. For example:
+ "f18cluster:mygfs2vol1"
+
+ @param pathToDSTDir: This is the path to directory where the files will be
+ copied to.
+ @type pathToDSTDir: String
+ @param listOfGFS2Filesystems: This is the list of the GFS2 filesystems that
+ will have their debug directory copied.
+ @type listOfGFS2Filesystems: Array
+ """
+ lockDumpType = "gfs2"
+ pathToSrcDir = os.path.join(PATH_TO_DEBUG_DIR, lockDumpType)
+ pathToOutputDir = os.path.join(pathToDSTDir, lockDumpType)
+ for dirName in os.listdir(pathToSrcDir):
+ pathToCurrentDir = os.path.join(pathToSrcDir, dirName)
+ if ((os.path.isdir(pathToCurrentDir)) and (dirName in listOfGFS2Filesystems)):
+ message = "Copying the lockdump data for the %s filesystem: %s" %(lockDumpType.upper(), dirName)
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
+ copyDirectory(pathToCurrentDir, pathToOutputDir)
+
+# ##############################################################################
+# Get user selected options
+# ##############################################################################
+def __getOptions(version) :
+ """
+ This function creates the OptionParser and returns a tuple of the
+ selected commandline options and commandline args.
+
+ cmdLineOpts holds the options the user selected and cmdLineArgs holds
+ any values that were passed but not associated with an option.
+
+ @return: A tuple of the selected commandline options and commandline args.
+ @rtype: Tuple
+
+ @param version: The version of this script.
+ @type version: String
+ """
+ cmdParser = OptionParserExtended(version)
+ cmdParser.add_option("-d", "--debug",
+ action="store_true",
+ dest="enableDebugLogging",
+ help="enables debug logging",
+ default=False)
+ cmdParser.add_option("-q", "--quiet",
+ action="store_true",
+ dest="disableLoggingToConsole",
+ help="disables logging to console",
+ default=False)
+ cmdParser.add_option("-y", "--no_ask",
+ action="store_true",
+ dest="disableQuestions",
+ help="disables all questions and assumes yes",
+ default=False)
+ cmdParser.add_option("-i", "--info",
+ action="store_true",
+ dest="enablePrintInfo",
+ help="prints information about the mounted GFS2 file systems",
+ default=False)
+ cmdParser.add_option("-t", "--archive",
+ action="store_true",
+ dest="enableArchiveOutputDir",
+ help="the output directory will be archived(tar) and compressed(.bz2)",
+ default=False)
+ cmdParser.add_option("-o", "--path_to_output_dir",
+ action="store",
+ dest="pathToOutputDir",
+ help="the directory where all the collect data will be stored",
+ type="string",
+ metavar="<output directory>",
+ default="")
+ cmdParser.add_option("-r", "--num_of_runs",
+ action="store",
+ dest="numberOfRuns",
+ help="number of runs capturing the lockdump data",
+ type="int",
+ metavar="<number of runs>",
+ default=2)
+ cmdParser.add_option("-s", "--seconds_sleep",
+ action="store",
+ dest="secondsToSleep",
+ help="number of seconds to sleep between runs of capturing the lockdump data",
+ type="int",
+ metavar="<seconds to sleep>",
+ default=120)
+ cmdParser.add_option("-n", "--fs_name",
+ action="extend",
+ dest="listOfGFS2Names",
+ help="name of the GFS2 filesystem(s) that will have their lockdump data captured",
+ type="string",
+ metavar="<name of GFS2 filesystem>",
+ default=[])
+ # Get the options and return the result.
+ (cmdLineOpts, cmdLineArgs) = cmdParser.parse_args()
+ return (cmdLineOpts, cmdLineArgs)
+
+# ##############################################################################
+# OptParse classes for commandline options
+# ##############################################################################
+class OptionParserExtended(OptionParser):
+ """
+ This is the class that gets the command line options the end user
+ selects.
+ """
+ def __init__(self, version) :
+ """
+ @param version: The version of this script.
+ @type version: String
+ """
+ self.__commandName = os.path.basename(sys.argv[0])
+ versionMessage = "%s %s\n" %(self.__commandName, version)
+
+ commandDescription = "%s will capture locking information from GFS2 file systems and DLM.\n" %(self.__commandName)
+
+ OptionParser.__init__(self, option_class=ExtendOption,
+ version=versionMessage,
+ description=commandDescription)
+
+ def print_help(self):
+ """
+ Print examples at the bottom of the help message.
+ """
+ self.print_version()
+ examplesMessage = "\nPrints information about the available GFS2 filesystems that can have lockdump data captured."
+ examplesMessage += "\n$ %s -i\n" %(self.__commandName)
+
+ examplesMessage += "\nIt will do 3 runs of gathering the lockdump information in 10 second intervals for only the"
+ examplesMessage += "\nGFS2 filesystems with the names myGFS2vol2,myGFS2vol1. Then it will archive and compress"
+ examplesMessage += "\nthe data collected. All of the lockdump data will be written to the directory: "
+ examplesMessage += "\n/tmp/2012-11-12_095556-gfs2_lockcapture and all the questions will be answered with yes.\n"
+ examplesMessage += "\n$ %s -r 3 -s 10 -t -n myGFS2vol2,myGFS2vol1 -o /tmp/2012-11-12_095556-gfs2_lockcapture -y\n" %(self.__commandName)
+
+ examplesMessage += "\nIt will do 2 runs of gathering the lockdump information in 25 second intervals for all the"
+ examplesMessage += "\nmounted GFS2 filesystems. Then it will archive and compress the data collected. All of the"
+ examplesMessage += "\nlockdump data will be written to the directory: /tmp/2012-11-12_095556-gfs2_lockcapture.\n"
+ examplesMessage += "\n$ %s -r 2 -s 25 -t -o /tmp/2012-11-12_095556-gfs2_lockcapture\n" %(self.__commandName)
+ OptionParser.print_help(self)
+ print examplesMessage
+
+class ExtendOption (Option):
+ """
+ Allows a comma-delimited list of entries to be specified for arrays
+ and dictionaries.
+ """
+ ACTIONS = Option.ACTIONS + ("extend",)
+ STORE_ACTIONS = Option.STORE_ACTIONS + ("extend",)
+ TYPED_ACTIONS = Option.TYPED_ACTIONS + ("extend",)
+
+ def take_action(self, action, dest, opt, value, values, parser):
+ """
+ This function is a wrapper to take certain options passed on command
+ prompt and wrap them into an Array.
+
+ @param action: The type of action that will be taken. For example:
+ "store_true", "store_false", "extend".
+ @type action: String
+ @param dest: The name of the variable that will be used to store the
+ option.
+ @type dest: String/Boolean/Array
+ @param opt: The option string that triggered the action.
+ @type opt: String
+ @param value: The value of opt(option) if it takes a
+ value, if not then None.
+ @type value: String
+ @param values: All the opt(options) in a dictionary.
+ @type values: Dictionary
+ @param parser: The option parser that was originally called.
+ @type parser: OptionParser
+ """
+ if (action == "extend") :
+ valueList = []
+ try:
+ for v in value.split(","):
+ # Need to add code for dealing with paths if there is option for paths.
+ newValue = v.strip()
+ if (len(newValue) > 0):
+ valueList.append(newValue)
+ except:
+ pass
+ else:
+ values.ensure_value(dest, []).extend(valueList)
+ else:
+ Option.take_action(self, action, dest, opt, value, values, parser)
+
+# ###############################################################################
+# Main Function
+# ###############################################################################
+if __name__ == "__main__":
+ """
+ When the script is executed then this code is run.
+ """
+ try:
+ # #######################################################################
+ # Get the options from the commandline.
+ # #######################################################################
+ (cmdLineOpts, cmdLineArgs) = __getOptions(VERSION_NUMBER)
+ # #######################################################################
+ # Setup the logger and create config directory
+ # #######################################################################
+ # Create the logger
+ logLevel = logging.INFO
+ logger = logging.getLogger(MAIN_LOGGER_NAME)
+ logger.setLevel(logLevel)
+ # Create a new status function and level.
+ logging.STATUS = logging.INFO + 2
+ logging.addLevelName(logging.STATUS, "STATUS")
+ # Create a function for the STATUS level since it is not defined by python. This
+ # means you can call it like the other predefined message
+ # functions. Example: logging.getLogger("loggerName").status(message)
+ setattr(logger, "status", lambda *args: logger.log(logging.STATUS, *args))
+ streamHandler = logging.StreamHandler()
+ streamHandler.setLevel(logLevel)
+ streamHandler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
+ logger.addHandler(streamHandler)
+
+ # Please note there will not be a global log file created. If a log file
+ # is needed then redirect the output. There will be a log file created
+ # for each run in the corresponding directory.
+
+ # #######################################################################
+ # Set the logging levels.
+ # #######################################################################
+ if ((cmdLineOpts.enableDebugLogging) and (not cmdLineOpts.disableLoggingToConsole)):
+ logging.getLogger(MAIN_LOGGER_NAME).setLevel(logging.DEBUG)
+ streamHandler.setLevel(logging.DEBUG)
+ message = "Debugging has been enabled."
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
+ if (cmdLineOpts.disableLoggingToConsole):
+ logging.disable(logging.CRITICAL)
+ # #######################################################################
+ # Check to see if pid file exists and error if it does.
+ # #######################################################################
+ if (os.path.exists(PATH_TO_PID_FILENAME)):
+ message = "The PID file %s already exists and this script cannot run until it is removed." %(PATH_TO_PID_FILENAME)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ message = "Verify that there are no other instances of this script running. Any running processes need to be stopped first and the file removed."
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ exitScript(removePidFile=False, errorCode=1)
+ else:
+ message = "Creating the pid file: %s" %(PATH_TO_PID_FILENAME)
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
+ # Create the pid file so we do not have more than 1 instance of this
+ # script running.
+ writeToFile(PATH_TO_PID_FILENAME, str(os.getpid()), createFile=True)
+ # #######################################################################
+ # Verify they want to continue because this script will trigger sysrq events.
+ # #######################################################################
+ if (not cmdLineOpts.disableQuestions):
+ valid = {"yes":True, "y":True, "no":False, "n":False}
+ question = "This script will trigger a sysrq -t event or collect the data for each pid directory located in /proc for each run. Are you sure you want to continue?"
+ prompt = " [y/n] "
+ while True:
+ sys.stdout.write(question + prompt)
+ choice = raw_input().lower()
+ if (choice in valid):
+ if (valid.get(choice)):
+ # If yes, or y then exit loop and continue.
+ break
+ else:
+ message = "The script will exit since you chose not to continue."
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ exitScript(removePidFile=True, errorCode=1)
+ else:
+ sys.stdout.write("Please respond with '(y)es' or '(n)o'.\n")
+ # #######################################################################
+ # Get the clusternode name and verify that mounted GFS2 filesystems were
+ # found.
+ # #######################################################################
+ clusternode = getClusterNode(cmdLineOpts.listOfGFS2Names)
+ if (clusternode == None):
+ message = "The cluster or cluster node name could not be found."
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ exitScript(removePidFile=True, errorCode=1)
+ elif (not len(clusternode.getMountedGFS2FilesystemNames()) > 0):
+ message = "There were no mounted GFS2 filesystems found."
+ if (len(cmdLineOpts.listOfGFS2Names) > 0):
+ message = "There were no mounted GFS2 filesystems found with the name:"
+ for name in cmdLineOpts.listOfGFS2Names:
+ message += " %s" %(name)
+ message += "."
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ exitScript(removePidFile=True, errorCode=1)
+ if (cmdLineOpts.enablePrintInfo):
+ logging.disable(logging.CRITICAL)
+ print "List of all the mounted GFS2 filesystems that can have their lockdump data captured:"
+ print clusternode
+ exitScript()
+ # #######################################################################
+ # Create the output directory to verify it can be created before
+ # proceeding, unless it was already created by a previous run whose
+ # data still needs to be analyzed.
+ # #######################################################################
+ pathToOutputDir = cmdLineOpts.pathToOutputDir
+ if (not len(pathToOutputDir) > 0):
+ pathToOutputDir = "%s" %(os.path.join("/tmp", "%s-%s-%s" %(time.strftime("%Y-%m-%d_%H%M%S"), clusternode.getClusterNodeName(), os.path.basename(sys.argv[0]))))
+ # #######################################################################
+ # Backup any existing directory with same name as current output
+ # directory.
+ # #######################################################################
+ if (backupOutputDirectory(pathToOutputDir)):
+ message = "This directory will be used to capture all the data: %s" %(pathToOutputDir)
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ if (not mkdirs(pathToOutputDir)):
+ exitScript(errorCode=1)
+ else:
+ # There was an existing directory with same path as current output
+ # directory and it failed to back it up.
+ message = "Please change the output directory path (-o) or manually rename or remove the existing path: %s" %(pathToOutputDir)
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ exitScript(errorCode=1)
+ # #######################################################################
+ # Check to see if the debug directory is mounted. If not then
+ # log an error.
+ # #######################################################################
+ if(mountFilesystem("debugfs", "none", PATH_TO_DEBUG_DIR)):
+ message = "The debug filesystem %s is mounted." %(PATH_TO_DEBUG_DIR)
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ else:
+ message = "There was a problem mounting the debug filesystem: %s" %(PATH_TO_DEBUG_DIR)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ message = "The debug filesystem is required to be mounted for this script to run."
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ exitScript(errorCode=1)
+ # #######################################################################
+ # Gather data and the lockdumps.
+ # #######################################################################
+ if (cmdLineOpts.numberOfRuns <= 0):
+ message = "The number of runs should be greater than zero."
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ exitScript(errorCode=1)
+ for i in range(1,(cmdLineOpts.numberOfRuns + 1)):
+ # The current run count starts at 1 instead of 0 so that it makes
+ # sense in the logs.
+ # Add the clusternode name under each run directory to make it easier
+ # to combine the gfs2_lockcapture data from multiple clusternodes.
+ pathToOutputRunDir = os.path.join(pathToOutputDir, "run%d/%s" %(i, clusternode.getClusterNodeName()))
+ # Create the directory that will be used to capture the data.
+ if (not mkdirs(pathToOutputRunDir)):
+ exitScript(errorCode=1)
+ # Set the handler for writing to log file for this run.
+ currentRunFileHandler = None
+ pathToLogFile = os.path.join(pathToOutputRunDir, "%s.log" %(MAIN_LOGGER_NAME))
+ if (((os.access(pathToLogFile, os.W_OK) and os.access("/tmp", os.R_OK))) or (not os.path.exists(pathToLogFile))):
+ currentRunFileHandler = logging.FileHandler(pathToLogFile)
+ currentRunFileHandler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s", "%Y-%m-%d %H:%M:%S"))
+ logging.getLogger(MAIN_LOGGER_NAME).addHandler(currentRunFileHandler)
+ message = "Pass (%d/%d): Gathering all the lockdump data." %(i, cmdLineOpts.numberOfRuns)
+ logging.getLogger(MAIN_LOGGER_NAME).status(message)
+
+ # Gather various bits of data from the clusternode.
+ message = "Pass (%d/%d): Gathering general information about the host." %(i, cmdLineOpts.numberOfRuns)
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
+ gatherGeneralInformation(pathToOutputRunDir)
+ # Sleep for 2 seconds so that the timestamp in the logs will be in
+ # the past, which guarantees the sysrq data can be captured.
+ time.sleep(2)
+ # Gather the backtraces for all the pids, either by copying the
+ # /proc/<pid> data or by triggering sysrq events to capture the task
+ # back traces from the log.
+ message = "Pass (%d/%d): Triggering the sysrq events for the host." %(i, cmdLineOpts.numberOfRuns)
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
+ # Gather the data in the /proc/<pid> directory if the file
+ # </proc/<pid>/stack exists. If file exists we will not trigger
+ # sysrq events.
+ pathToPidData = "/proc"
+ if (isProcPidStackEnabled(pathToPidData)):
+ gatherPidData(pathToPidData, os.path.join(pathToOutputRunDir, pathToPidData.strip("/")))
+ else:
+ triggerSysRQEvents()
+ # Gather the dlm locks.
+ lockDumpType = "dlm"
+ message = "Pass (%d/%d): Gathering the %s lock dumps for the host." %(i, cmdLineOpts.numberOfRuns, lockDumpType.upper())
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
+ gatherDLMLockDumps(pathToOutputRunDir, clusternode.getMountedGFS2FilesystemNames(includeClusterName=False))
+ # Gather the glock locks from gfs2.
+ lockDumpType = "gfs2"
+ message = "Pass (%d/%d): Gathering the %s lock dumps for the host." %(i, cmdLineOpts.numberOfRuns, lockDumpType.upper())
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
+ gatherGFS2LockDumps(pathToOutputRunDir, clusternode.getMountedGFS2FilesystemNames())
+ # Gather log files
+ message = "Pass (%d/%d): Gathering the log files for the host." %(i, cmdLineOpts.numberOfRuns)
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
+ gatherLogs(os.path.join(pathToOutputRunDir, "logs"))
+ # Sleep between each run if secondsToSleep is greater than or equal
+ # to 0 and the current run is not the last run.
+ if ((cmdLineOpts.secondsToSleep >= 0) and (i < cmdLineOpts.numberOfRuns)):
+ message = "The script will sleep for %d seconds between each run of capturing the lockdump data." %(cmdLineOpts.secondsToSleep)
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ time.sleep(cmdLineOpts.secondsToSleep)
+ # Remove the handler:
+ logging.getLogger(MAIN_LOGGER_NAME).removeHandler(currentRunFileHandler)
+
+ # #######################################################################
+ # Archive the directory that contains all the data and archive it after
+ # all the information has been gathered.
+ # #######################################################################
+ message = "All the files have been gathered and this directory contains all the captured data: %s" %(pathToOutputDir)
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ if (cmdLineOpts.enableArchiveOutputDir):
+ message = "The lockdump data will now be archived. This could take some time depending on the size of the data collected."
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ pathToTarFilename = archiveData(pathToOutputDir)
+ if (os.path.exists(pathToTarFilename)):
+ message = "The compressed archive file was created: %s" %(pathToTarFilename)
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ else:
+ message = "The compressed archive file failed to be created: %s" %(pathToTarFilename)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ # #######################################################################
+ except KeyboardInterrupt:
+ print ""
+ message = "This script will exit since control-c was pressed by the user."
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ exitScript(errorCode=1)
+ # #######################################################################
+ # Exit the application with zero exit code since we cleanly exited.
+ # #######################################################################
+ exitScript()
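The ExtendOption class above is the one reusable piece of the option handling: it teaches optparse an "extend" action that splits a comma-delimited value into list entries, as used by `-n myGFS2vol2,myGFS2vol1`. A minimal standalone sketch of the same pattern (the `-n/--fs_name` option mirrors the script; the rest is illustrative, and the loop uses the per-entry variable so each entry is stripped individually):

```python
from optparse import Option, OptionParser

class ExtendOption(Option):
    # Adds an "extend" action that splits a comma-delimited value into
    # individual entries and appends them to the destination list.
    ACTIONS = Option.ACTIONS + ("extend",)
    STORE_ACTIONS = Option.STORE_ACTIONS + ("extend",)
    TYPED_ACTIONS = Option.TYPED_ACTIONS + ("extend",)

    def take_action(self, action, dest, opt, value, values, parser):
        if action == "extend":
            # Strip whitespace and drop empty entries such as in "a,,b".
            entries = [v.strip() for v in value.split(",") if v.strip()]
            values.ensure_value(dest, []).extend(entries)
        else:
            Option.take_action(self, action, dest, opt, value, values, parser)

parser = OptionParser(option_class=ExtendOption)
parser.add_option("-n", "--fs_name", action="extend", dest="names",
                  type="string", default=[])
(opts, args) = parser.parse_args(["-n", "myGFS2vol2, myGFS2vol1"])
print(opts.names)  # -> ['myGFS2vol2', 'myGFS2vol1']
```

One caveat worth noting: optparse copies the defaults dict shallowly, so the `default=[]` list object is shared across repeated `parse_args` calls on the same parser; long-lived code should supply a fresh default per parse.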
11 years, 4 months
gfs2-utils: master - gfs2-utils tests: Add a script to exercise the utils
by Andrew Price
Gitweb: http://git.fedorahosted.org/git/?p=gfs2-utils.git;a=commitdiff;h=882b2853...
Commit: 882b2853f1d9545f86e942e6ad5cf0160413530c
Parent: b3ca8fbbf8f1ea9120988254ccc94b2197b0d493
Author: Andrew Price <anprice(a)redhat.com>
AuthorDate: Fri Dec 14 13:11:22 2012 +0000
Committer: Andrew Price <anprice(a)redhat.com>
CommitterDate: Fri Dec 14 14:48:55 2012 +0000
gfs2-utils tests: Add a script to exercise the utils
Add a test script to make it easy to run gfs2 utils with various options
and check their exit codes in sequence. The script is plugged into the
test suite and is run with 'make check'.
Signed-off-by: Andrew Price <anprice(a)redhat.com>
---
tests/Makefile.am | 4 ++-
tests/tool_tests.sh | 62 +++++++++++++++++++++++++++++++++++++++++++++++++++
2 files changed, 65 insertions(+), 1 deletions(-)
diff --git a/tests/Makefile.am b/tests/Makefile.am
index 71c1e08..d8aa8f2 100644
--- a/tests/Makefile.am
+++ b/tests/Makefile.am
@@ -1,4 +1,6 @@
-TESTS = check_libgfs2
+TESTS_ENVIRONMENT = TOPBUILDDIR=$(top_builddir)
+TESTS = check_libgfs2 tool_tests.sh
+EXTRA_DIST = tool_tests.sh
check_PROGRAMS = check_libgfs2
check_libgfs2_SOURCES = check_meta.c \
$(top_srcdir)/gfs2/libgfs2/libgfs2.h
diff --git a/tests/tool_tests.sh b/tests/tool_tests.sh
new file mode 100755
index 0000000..791b071
--- /dev/null
+++ b/tests/tool_tests.sh
@@ -0,0 +1,62 @@
+#!/bin/sh
+
+# This script runs gfs2 utils with various options, checking exit codes against
+# expected values. If any test fails to exit with an expected code, the exit code
+# of the whole script will be non-zero but the tests will continue to be run. The
+# sparse file which is used as the target of the tests can be configured by
+# setting the environment variables TEST_TARGET (the filename) and TEST_TARGET_SZ
+# (its apparent size in gigabytes). Defaults to "test_sparse" and 10GB.
+
+MKFS="${TOPBUILDDIR}/gfs2/mkfs/mkfs.gfs2 -qO"
+FSCK="${TOPBUILDDIR}/gfs2/fsck/fsck.gfs2 -qn"
+
+# Name of the sparse file we'll use for testing
+TEST_TARGET=${TEST_TARGET:-test_sparse}
+# Size, in GB, of the sparse file we'll create to run the tests
+TEST_TARGET_SZ=${TEST_TARGET_SZ:-10}
+[ $TEST_TARGET_SZ -gt 0 ] || { echo "Target size (in GB) must be greater than 0" >&2; exit 1; }
+# Overall success (so we can keep going if one test fails)
+TEST_RET=0
+
+fn_test()
+{
+ local expected="$1"
+ local cmd="$2"
+ echo -n "Running '$cmd' - (Exp: $expected Got: "
+ $cmd > /dev/null 2>&1
+ local ret=$?
+ echo -n "$ret) "
+ if [ "$ret" != "$expected" ];
+ then
+ echo "FAIL"
+ TEST_RET=1
+ TEST_GRP_RET=1
+ else
+ echo "PASS"
+ fi
+}
+
+fn_rm_target()
+{
+ fn_test 0 "rm -f $TEST_TARGET"
+}
+
+fn_recreate_target()
+{
+ fn_rm_target
+ fn_test 0 "dd if=/dev/null of=$TEST_TARGET bs=1 count=0 seek=${TEST_TARGET_SZ}G"
+}
+
+
+# Tests start here
+fn_recreate_target
+fn_test 0 "$MKFS -p lock_nolock $TEST_TARGET"
+fn_test 0 "$MKFS -p lock_dlm -t foo:bar $TEST_TARGET"
+fn_test 255 "$MKFS -p badprotocol $TEST_TARGET"
+fn_test 0 "$FSCK $TEST_TARGET"
+
+# Tests end here
+
+# Clean up
+fn_test 0 "rm -f $TEST_TARGET"
+exit $TEST_RET
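The core of tool_tests.sh is the fn_test harness: run a command, compare its exit status against an expected value, and record any mismatch without aborting the run. A stripped-down sketch of that pattern (the commands here are placeholders, not the real mkfs/fsck invocations; POSIX redirection is used since the script runs under /bin/sh):

```shell
# Minimal exit-code test harness in the style of tool_tests.sh.
TEST_RET=0

fn_test()
{
    expected="$1"
    cmd="$2"
    $cmd > /dev/null 2>&1    # "&>" is a bashism; use POSIX redirection
    ret=$?
    if [ "$ret" != "$expected" ]; then
        echo "FAIL: '$cmd' expected $expected, got $ret"
        TEST_RET=1
    else
        echo "PASS: '$cmd' (exit $ret)"
    fi
}

fn_test 0 "true"     # passes: true exits 0
fn_test 1 "false"    # passes: false exits 1
fn_test 0 "false"    # fails: expected 0, got 1
echo "overall: $TEST_RET"
```

In the real script the final `exit $TEST_RET` propagates the aggregate result so that `make check` reports failure when any individual test failed.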
11 years, 4 months
gfs2-utils: master - gfs2-lockcapture: Modified some of the data gathered
by shane bradley
Gitweb: http://git.fedorahosted.org/git/?p=gfs2-utils.git;a=commitdiff;h=b3ca8fbb...
Commit: b3ca8fbbf8f1ea9120988254ccc94b2197b0d493
Parent: f9369035530d112ffcdb81868dadd42321651680
Author: Shane Bradley <sbradley(a)redhat.com>
AuthorDate: Fri Dec 14 09:25:30 2012 -0500
Committer: Shane Bradley <sbradley(a)redhat.com>
CommitterDate: Fri Dec 14 09:30:17 2012 -0500
gfs2-lockcapture: Modified some of the data gathered
Changed some var names in host data collected, added /proc/<pid>/ to files
collected, and added man page.
Signed-off-by: Shane Bradley <sbradley(a)redhat.com>
---
gfs2/lockcapture/gfs2_lockcapture | 465 ++++++++++++++++++++++++-------------
gfs2/man/Makefile.am | 3 +-
gfs2/man/gfs2_lockcapture.8 | 53 +++++
3 files changed, 364 insertions(+), 157 deletions(-)
diff --git a/gfs2/lockcapture/gfs2_lockcapture b/gfs2/lockcapture/gfs2_lockcapture
index a930a2f..1a64188 100644
--- a/gfs2/lockcapture/gfs2_lockcapture
+++ b/gfs2/lockcapture/gfs2_lockcapture
@@ -1,9 +1,7 @@
#!/usr/bin/env python
"""
-This script will gather GFS2 glocks and dlm lock dump information for a cluster
-node. The script can get all the mounted GFS2 filesystem data or set of selected
-GFS2 filesystems. The script will also gather some general information about the
-system.
+The script gfs2_lockcapture will capture locking information from GFS2 file
+systems and DLM.
@author : Shane Bradley
@contact : sbradley(a)redhat.com
@@ -35,7 +33,7 @@ import tarfile
sure only 1 instance of this script is running at any time.
@type PATH_TO_PID_FILENAME: String
"""
-VERSION_NUMBER = "0.9-1"
+VERSION_NUMBER = "0.9-2"
MAIN_LOGGER_NAME = "%s" %(os.path.basename(sys.argv[0]))
PATH_TO_DEBUG_DIR="/sys/kernel/debug"
PATH_TO_PID_FILENAME = "/var/run/%s.pid" %(os.path.basename(sys.argv[0]))
@@ -313,7 +311,7 @@ def archiveData(pathToSrcDir):
@type pathToSrcDir: String
"""
if (os.path.exists(pathToSrcDir)):
- pathToTarFilename = "%s.tar.bz2" %(pathToSrcDir)
+ pathToTarFilename = "%s-%s.tar.bz2" %(pathToSrcDir, platform.node())
if (os.path.exists(pathToTarFilename)):
message = "A compressed archive file already exists and will be removed: %s" %(pathToTarFilename)
logging.getLogger(MAIN_LOGGER_NAME).status(message)
@@ -337,6 +335,127 @@ def archiveData(pathToSrcDir):
return pathToTarFilename
return ""
+def getDataFromFile(pathToSrcFile) :
+ """
+ This function will return the data in an array, where each line in the
+ file is a separate item in the array. This should really only be used
+ on relatively small files.
+
+ None is returned if no file is found.
+
+ @return: Returns an array of Strings, where each line in the file is an
+ item in the array.
+ @rtype: Array
+
+ @param pathToSrcFile: The path to the file which will be read.
+ @type pathToSrcFile: String
+ """
+ if (len(pathToSrcFile) > 0) :
+ try:
+ fin = open(pathToSrcFile, "r")
+ data = fin.readlines()
+ fin.close()
+ return data
+ except (IOError, os.error):
+ message = "An error occurred reading the file: %s." %(pathToSrcFile)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return None
+
+def copyFile(pathToSrcFile, pathToDstFile):
+ """
+ This function will copy a src file to dst file.
+
+ @return: Returns True if the file was copied successfully.
+ @rtype: Boolean
+
+ @param pathToSrcFile: The path to the source file that will be copied.
+ @type pathToSrcFile: String
+ @param pathToDstFile: The path to the destination of the file.
+ @type pathToDstFile: String
+ """
+ if(not os.path.exists(pathToSrcFile)):
+ message = "The file does not exist with the path: %s." %(pathToSrcFile)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ elif (not os.path.isfile(pathToSrcFile)):
+ message = "The path to the source file is not a regular file: %s." %(pathToSrcFile)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ elif (pathToSrcFile == pathToDstFile):
+ message = "The path to the source file and path to destination file cannot be the same: %s." %(pathToDstFile)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ else:
+ # Create the directory structure if it does not exist.
+ (head, tail) = os.path.split(pathToDstFile)
+ if (not mkdirs(head)) :
+ # The path to the directory was not created so file
+ # could not be copied.
+ return False
+ # Copy the file to the dst path.
+ try:
+ shutil.copy(pathToSrcFile, pathToDstFile)
+ except (shutil.Error, OSError, IOError):
+ message = "Cannot copy the file %s to %s." %(pathToSrcFile, pathToDstFile)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ return (os.path.exists(pathToDstFile))
+
+def copyDirectory(pathToSrcDir, pathToDstDir):
+ """
+ This function will copy a src dir to dst dir.
+
+ @return: Returns True if the dir was copied successfully.
+ @rtype: Boolean
+
+ @param pathToSrcDir: The path to the source dir that will be copied.
+ @type pathToSrcDir: String
+ @param pathToDstDir: The path to the destination of the dir.
+ @type pathToDstDir: String
+ """
+ if(not os.path.exists(pathToSrcDir)):
+ message = "The directory does not exist with the path: %s." %(pathToSrcDir)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ elif (not os.path.isdir(pathToSrcDir)):
+ message = "The path to the source directory is not a directory: %s." %(pathToSrcDir)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ elif (pathToSrcDir == pathToDstDir):
+ message = "The path to the source directory and path to destination directory cannot be the same: %s." %(pathToDstDir)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ else:
+ if (not mkdirs(pathToDstDir)) :
+ # The path to the directory was not created so file
+ # could not be copied.
+ return False
+ # Copy the file to the dst path.
+ dst = os.path.join(pathToDstDir, os.path.basename(pathToSrcDir))
+ try:
+ shutil.copytree(pathToSrcDir, dst)
+ except shutil.Error:
+ message = "Cannot copy the directory %s to %s." %(pathToSrcDir, dst)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ except OSError:
+ message = "Cannot copy the directory %s to %s." %(pathToSrcDir, dst)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ except IOError:
+ message = "Cannot copy the directory %s to %s." %(pathToSrcDir, dst)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ return False
+ return (os.path.exists(dst))
+
def backupOutputDirectory(pathToOutputDir):
"""
This function will return True if the pathToOutputDir does not exist or the
@@ -464,8 +583,8 @@ def getClusterNode(listOfGFS2Names):
if (len(listOfGFS2Names) > 0):
for label in mapOfMountedFilesystemLabels.keys():
foundMatch = False
- for name in listOfGFS2Names:
- if ((name == label) or ("%s:%s"%(clusterName, name) == label)):
+ for gfs2FSName in listOfGFS2Names:
+ if ((gfs2FSName == label) or ("%s:%s"%(clusterName, gfs2FSName) == label)):
foundMatch = True
break
if ((not foundMatch) and (mapOfMountedFilesystemLabels.has_key(label))):
@@ -518,33 +637,6 @@ def getLabelMapForMountedFilesystems(clusterName, listOfMountedFilesystems):
mapOfMountedFilesystemLabels[fsLabel] = mountedFilesystem
return mapOfMountedFilesystemLabels
-def verifyDebugFilesystemMounted(enableMounting=True):
- """
- This function verifies that the debug filesystem is mounted. If the debug
- filesystem is mounted then True is returned, otherwise False is returned.
-
- @return: If the debug filesystem is mounted then True is returned, otherwise
- False is returned.
- @rtype: Boolean
-
- @param enableMounting: If True then the debug filesystem will be mounted if
- it is currently not mounted.
- @type enableMounting: Boolean
- """
- if (os.path.ismount(PATH_TO_DEBUG_DIR)):
- message = "The debug filesystem %s is mounted." %(PATH_TO_DEBUG_DIR)
- logging.getLogger(MAIN_LOGGER_NAME).info(message)
- return True
- else:
- message = "The debug filesystem %s is not mounted." %(PATH_TO_DEBUG_DIR)
- logging.getLogger(MAIN_LOGGER_NAME).warning(message)
- if (cmdLineOpts.enableMountDebugFS):
- if(mountFilesystem("debugfs", "none", PATH_TO_DEBUG_DIR)):
- message = "The debug filesystem was mounted: %s." %(PATH_TO_DEBUG_DIR)
- logging.getLogger(MAIN_LOGGER_NAME).info(message)
- return True
- return False
-
def mountFilesystem(filesystemType, pathToDevice, pathToMountPoint):
"""
This function will attempt to mount a filesystem. If the filesystem is
@@ -583,29 +675,24 @@ def gatherGeneralInformation(pathToDSTDir):
@type pathToDSTDir: String
"""
# Gather some general information and write to system.txt.
- systemString = "HOSTNAME: %s\nDATE: %s\n" %(platform.node(), time.strftime("%Y-%m-%d_%H:%M:%S"))
- stdout = runCommandOutput("uname", ["-a"])
+ systemString = "HOSTNAME=%s\nTIMESTAMP=%s\n" %(platform.node(), time.strftime("%Y-%m-%d %H:%M:%S"))
+ stdout = runCommandOutput("uname", ["-a"]).strip().rstrip()
if (not stdout == None):
- systemString += "UNAME-A: %s\n" %(stdout)
- stdout = runCommandOutput("uptime", [])
+ systemString += "UNAMEA=%s\n" %(stdout)
+ stdout = runCommandOutput("uptime", []).strip().rstrip()
if (not stdout == None):
- systemString += "UPTIME: %s\n" %(stdout)
- writeToFile(os.path.join(pathToDSTDir, "system.txt"), systemString, createFile=True)
+ systemString += "UPTIME=%s" %(stdout)
+ writeToFile(os.path.join(pathToDSTDir, "hostinformation.txt"), systemString, createFile=True)
- # Get "mount -l" filesystem data.
- command = "cat"
- pathToCommandOutput = os.path.join(pathToDSTDir, "cat-proc_mounts.txt")
- try:
- fout = open(pathToCommandOutput, "w")
- runCommand(command, ["/proc/mounts"], standardOut=fout)
- fout.close()
- except IOError:
- message = "There was an error the command output for %s to the file %s." %(command, pathToCommandOutput)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ # Copy misc files
+ pathToSrcFile = "/proc/mounts"
+ copyFile(pathToSrcFile, os.path.join(pathToDSTDir, pathToSrcFile.strip("/")))
+ pathToSrcFile = "/proc/slabinfo"
+ copyFile(pathToSrcFile, os.path.join(pathToDSTDir, pathToSrcFile.strip("/")))
# Get "ps -eo user,pid,%cpu,%mem,vsz,rss,tty,stat,start,time,comm,wchan" data.
command = "ps"
- pathToCommandOutput = os.path.join(pathToDSTDir, "ps.txt")
+ pathToCommandOutput = os.path.join(pathToDSTDir, "ps_hALo-tid.s.cmd")
try:
fout = open(pathToCommandOutput, "w")
#runCommand(command, ["-eo", "user,pid,%cpu,%mem,vsz,rss,tty,stat,start,time,comm,wchan"], standardOut=fout)
@@ -615,6 +702,48 @@ def gatherGeneralInformation(pathToDSTDir):
message = "There was an error the command output for %s to the file %s." %(command, pathToCommandOutput)
logging.getLogger(MAIN_LOGGER_NAME).error(message)
+
+def isProcPidStackEnabled(pathToPidData):
+ """
+ Returns true if the init process has the file "stack" in its pid data
+ directory which contains the task functions for that process.
+
+ @return: Returns true if the init process has the file "stack" in its pid
+ data directory which contains the task functions for that process.
+ @rtype: Boolean
+
+ @param pathToPidData: The path to the directory where all the pid data
+ directories are located.
+ @type pathToPidData: String
+ """
+ return os.path.exists(os.path.join(pathToPidData, "1/stack"))
+
+def gatherPidData(pathToPidData, pathToDSTDir):
+ """
+ This function will gather selected files from each pid directory under pathToPidData.
+
+ @return: Returns a list of paths to the directories that contain the
+ copied information about each pid.
+ @rtype: Array
+
+ @param pathToPidData: The path to the directory where all the pid data
+ directories are located.
+ @type pathToPidData: String
+ @param pathToDSTDir: The path to the directory where the pid data will be
+ copied to.
+ @type pathToDSTDir: String
+ """
+ # Status has: command name, pid, ppid, state, possibly registers
+ listOfFilesToCopy = ["cmdline", "stack", "status"]
+ listOfPathToPidsData = []
+ if (os.path.exists(pathToPidData)):
+ for srcFilename in os.listdir(pathToPidData):
+ pathToPidDirDST = os.path.join(pathToDSTDir, srcFilename)
+ if (srcFilename.isdigit()):
+ pathToSrcDir = os.path.join(pathToPidData, srcFilename)
+ for filenameToCopy in listOfFilesToCopy:
+ copyFile(os.path.join(pathToSrcDir, filenameToCopy), os.path.join(pathToPidDirDST, filenameToCopy))
+ if (os.path.exists(pathToPidDirDST)):
+ listOfPathToPidsData.append(pathToPidDirDST)
+ return listOfPathToPidsData
+
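gatherPidData above walks a /proc-style tree, keeps only the numeric (pid) entries, and copies a fixed set of files per pid. The same selection logic, written against an arbitrary source tree so it can be exercised without /proc (names are illustrative):

```python
import os
import shutil


def collect_pid_files(src_root, dst_root, wanted=("cmdline", "stack", "status")):
    """Copy the wanted files out of each numeric (pid) directory under
    src_root into a matching directory under dst_root; return the list
    of destination directories that received at least one file."""
    collected = []
    if not os.path.isdir(src_root):
        return collected
    for name in os.listdir(src_root):
        if not name.isdigit():
            continue  # skip non-pid entries such as "sys" or "meminfo"
        src_dir = os.path.join(src_root, name)
        dst_dir = os.path.join(dst_root, name)
        for filename in wanted:
            src_file = os.path.join(src_dir, filename)
            if not os.path.isfile(src_file):
                continue
            if not os.path.isdir(dst_dir):
                os.makedirs(dst_dir)
            shutil.copy(src_file, os.path.join(dst_dir, filename))
        # Only report pids for which something was actually copied.
        if os.path.isdir(dst_dir):
            collected.append(dst_dir)
    return collected
```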
def triggerSysRQEvents():
"""
This command will trigger sysrq events which will write the output to
@@ -626,14 +755,15 @@ def triggerSysRQEvents():
pathToSysrqTriggerFile = "/proc/sysrq-trigger"
# m - dump information about memory allocation
# t - dump thread state information
- triggers = ["m", "t"]
+ # triggers = ["m", "t"]
+ triggers = ["t"]
for trigger in triggers:
try:
fout = open(pathToSysrqTriggerFile, "w")
runCommand(command, [trigger], standardOut=fout)
fout.close()
except IOError:
- message = "There was an error the command output for %s to the file %s." %(command, pathToSysrqTriggerFile)
+ message = "There was an error writing the command output for %s to the file %s." %(command, pathToSysrqTriggerFile)
logging.getLogger(MAIN_LOGGER_NAME).error(message)
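Triggering a sysrq event amounts to writing a single command character to /proc/sysrq-trigger; the kernel then emits the dump into the kernel ring buffer, which lands in /var/log/messages (gathered by gatherLogs below). A minimal sketch with the trigger path made injectable, so the write itself can be exercised against an ordinary file (the real path requires root):

```python
def trigger_sysrq(trigger_char, path="/proc/sysrq-trigger"):
    """Write a single sysrq command character (e.g. "t" for task state,
    "m" for memory info) to the trigger file. Returns True on success."""
    try:
        fout = open(path, "w")
        try:
            fout.write(trigger_char)
        finally:
            fout.close()
    except IOError:
        return False
    return True
```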
def gatherLogs(pathToDSTDir):
@@ -645,24 +775,14 @@ def gatherLogs(pathToDSTDir):
copied to.
@type pathToDSTDir: String
"""
- if (mkdirs(pathToDSTDir)):
- # Copy messages logs that contain the sysrq data.
- pathToLogFile = "/var/log/messages"
- pathToDSTLogFile = os.path.join(pathToDSTDir, os.path.basename(pathToLogFile))
- try:
- shutil.copyfile(pathToLogFile, pathToDSTLogFile)
- except shutil.Error:
- message = "There was an error copying the file: %s to %s." %(pathToLogFile, pathToDSTLogFile)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ pathToLogFile = "/var/log/messages"
+ pathToDSTLogFile = os.path.join(pathToDSTDir, os.path.basename(pathToLogFile))
+ copyFile(pathToLogFile, pathToDSTLogFile)
- pathToLogDir = "/var/log/cluster"
+ pathToLogDir = "/var/log/cluster"
+ if (os.path.exists(pathToLogDir)):
pathToDSTLogDir = os.path.join(pathToDSTDir, os.path.basename(pathToLogDir))
- if (os.path.isdir(pathToLogDir)):
- try:
- shutil.copytree(pathToLogDir, pathToDSTLogDir)
- except shutil.Error:
- message = "There was an error copying the directory: %s to %s." %(pathToLogDir, pathToDSTLogDir)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ copyDirectory(pathToLogDir, pathToDSTDir)
def gatherDLMLockDumps(pathToDSTDir, listOfGFS2Filesystems):
"""
@@ -680,23 +800,13 @@ def gatherDLMLockDumps(pathToDSTDir, listOfGFS2Filesystems):
lockDumpType = "dlm"
pathToSrcDir = os.path.join(PATH_TO_DEBUG_DIR, lockDumpType)
pathToOutputDir = os.path.join(pathToDSTDir, lockDumpType)
- message = "Copying the files in the %s lockdump data directory %s for the selected GFS2 filesystem with dlm debug files." %(lockDumpType.upper(), pathToSrcDir)
- logging.getLogger(MAIN_LOGGER_NAME).status(message)
+ message = "Copying the files in the %s lockdump data directory %s." %(lockDumpType.upper(), pathToSrcDir)
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
for filename in os.listdir(pathToSrcDir):
for name in listOfGFS2Filesystems:
if (filename.startswith(name)):
- pathToCurrentFilename = os.path.join(pathToSrcDir, filename)
- pathToDSTDir = os.path.join(pathToOutputDir, name)
- mkdirs(pathToDSTDir)
- pathToDSTFilename = os.path.join(pathToDSTDir, filename)
- try:
- shutil.copy(pathToCurrentFilename, pathToDSTFilename)
- except shutil.Error:
- message = "There was an error copying the file: %s to %s." %(pathToCurrentFilename, pathToDSTFilename)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- except OSError:
- message = "There was an error copying the file: %s to %s." %(pathToCurrentFilename, pathToDSTFilename)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ copyFile(os.path.join(pathToSrcDir, filename),
+ os.path.join(os.path.join(pathToOutputDir, name), filename))
def gatherGFS2LockDumps(pathToDSTDir, listOfGFS2Filesystems):
"""
@@ -718,18 +828,9 @@ def gatherGFS2LockDumps(pathToDSTDir, listOfGFS2Filesystems):
for dirName in os.listdir(pathToSrcDir):
pathToCurrentDir = os.path.join(pathToSrcDir, dirName)
if ((os.path.isdir(pathToCurrentDir)) and (dirName in listOfGFS2Filesystems)):
- mkdirs(pathToOutputDir)
- pathToDSTDir = os.path.join(pathToOutputDir, dirName)
- try:
- message = "Copying the lockdump data for the %s filesystem: %s" %(lockDumpType.upper(), dirName)
- logging.getLogger(MAIN_LOGGER_NAME).status(message)
- shutil.copytree(pathToCurrentDir, pathToDSTDir)
- except shutil.Error:
- message = "There was an error copying the directory: %s to %s." %(pathToCurrentDir, pathToDSTDir)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
- except OSError:
- message = "There was an error copying the directory: %s to %s." %(pathToCurrentDir, pathToDSTDir)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ message = "Copying the lockdump data for the %s filesystem: %s" %(lockDumpType.upper(), dirName)
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
+ copyDirectory(pathToCurrentDir, pathToOutputDir)
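The GFS2 glock dumps live in per-filesystem directories under the debugfs mount (for example /sys/kernel/debug/gfs2/<clustername>:<fsname>), and the loop above copies only those directories whose names match the selected filesystems. The filtering step in isolation (paths are illustrative):

```python
import os


def select_lockdump_dirs(debug_gfs2_dir, wanted_names):
    """Return the paths of the per-filesystem lockdump directories under
    debug_gfs2_dir whose directory name is in wanted_names."""
    selected = []
    if not os.path.isdir(debug_gfs2_dir):
        return selected
    for entry in sorted(os.listdir(debug_gfs2_dir)):
        path = os.path.join(debug_gfs2_dir, entry)
        # Plain files and unselected filesystems are skipped.
        if os.path.isdir(path) and entry in wanted_names:
            selected.append(path)
    return selected
```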
# ##############################################################################
# Get user selected options
@@ -752,52 +853,57 @@ def __getOptions(version) :
cmdParser.add_option("-d", "--debug",
action="store_true",
dest="enableDebugLogging",
- help="Enables debug logging.",
+ help="enables debug logging",
default=False)
cmdParser.add_option("-q", "--quiet",
action="store_true",
dest="disableLoggingToConsole",
- help="Disables logging to console.",
+ help="disables logging to console",
+ default=False)
+ cmdParser.add_option("-y", "--no_ask",
+ action="store_true",
+ dest="disableQuestions",
+ help="disables all questions and assumes yes",
default=False)
cmdParser.add_option("-i", "--info",
action="store_true",
dest="enablePrintInfo",
- help="Prints to console some basic information about the GFS2 filesystems mounted on the cluster node.",
+ help="prints information about the mounted GFS2 file systems",
default=False)
- cmdParser.add_option("-M", "--mount_debug_fs",
+ cmdParser.add_option("-t", "--archive",
action="store_true",
- dest="enableMountDebugFS",
- help="Enables the mounting of the debug filesystem if it is not mounted. Default is disabled.",
+ dest="enableArchiveOutputDir",
+ help="the output directory will be archived (tar) and compressed (.bz2)",
default=False)
cmdParser.add_option("-o", "--path_to_output_dir",
action="store",
dest="pathToOutputDir",
- help="The path to the output directory where all the collect data will be stored. Default is /tmp/<date>-<hostname>-%s" %(os.path.basename(sys.argv[0])),
+ help="the directory where all the collected data will be stored",
type="string",
+ metavar="<output directory>",
default="")
cmdParser.add_option("-r", "--num_of_runs",
action="store",
dest="numberOfRuns",
- help="The number of lockdumps runs to do. Default is 2.",
+ help="number of runs capturing the lockdump data",
type="int",
+ metavar="<number of runs>",
default=2)
cmdParser.add_option("-s", "--seconds_sleep",
action="store",
dest="secondsToSleep",
- help="The number of seconds sleep between runs. Default is 120 seconds.",
+ help="number of seconds to sleep between runs of capturing the lockdump data",
type="int",
+ metavar="<seconds to sleep>",
default=120)
- cmdParser.add_option("-t", "--archive",
- action="store_true",
- dest="enableArchiveOutputDir",
- help="Enables archiving and compressing of the output directory with tar and bzip2. Default is disabled.",
- default=False)
cmdParser.add_option("-n", "--fs_name",
action="extend",
dest="listOfGFS2Names",
- help="List of GFS2 filesystems that will have their lockdump data gathered.",
+ help="name of the GFS2 filesystem(s) that will have their lockdump data captured",
type="string",
- default=[]) # Get the options and return the result.
+ metavar="<name of GFS2 filesystem>",
+ default=[])
+ # Get the options and return the result.
(cmdLineOpts, cmdLineArgs) = cmdParser.parse_args()
return (cmdLineOpts, cmdLineArgs)
@@ -817,7 +923,7 @@ class OptionParserExtended(OptionParser):
self.__commandName = os.path.basename(sys.argv[0])
versionMessage = "%s %s\n" %(self.__commandName, version)
- commandDescription ="%s will capture information about lockdata data for GFS2 and DLM required to analyze a GFS2 filesystem.\n"%(self.__commandName)
+ commandDescription ="%s will capture locking information from GFS2 file systems and DLM.\n"%(self.__commandName)
OptionParser.__init__(self, option_class=ExtendOption,
version=versionMessage,
@@ -831,10 +937,17 @@ class OptionParserExtended(OptionParser):
examplesMessage = "\n"
examplesMessage = "\nPrints information about the available GFS2 filesystems that can have lockdump data captured."
examplesMessage += "\n$ %s -i\n" %(self.__commandName)
- examplesMessage += "\nThis command will mount the debug directory if it is not mounted. It will do 3 runs of\n"
- examplesMessage += "gathering the lockdump information in 10 second intervals for only the GFS2 filesystems\n"
- examplesMessage += "with the names myGFS2vol2,myGFS2vol1. Then it will archive and compress the data collected."
- examplesMessage += "\n$ %s -M -r 3 -s 10 -t -n myGFS2vol2,myGFS2vol1\n" %(self.__commandName)
+
+ examplesMessage += "\nIt will do 3 runs of gathering the lockdump information in 10 second intervals for only the"
+ examplesMessage += "\nGFS2 filesystems with the names myGFS2vol2,myGFS2vol1. Then it will archive and compress"
+ examplesMessage += "\nthe data collected. All of the lockdump data will be written to the directory: "
+ examplesMessage += "\n/tmp/2012-11-12_095556-gfs2_lockcapture and all the questions will be answered with yes.\n"
+ examplesMessage += "\n$ %s -r 3 -s 10 -t -n myGFS2vol2,myGFS2vol1 -o /tmp/2012-11-12_095556-gfs2_lockcapture -y\n" %(self.__commandName)
+
+ examplesMessage += "\nIt will do 2 runs of gathering the lockdump information in 25 second intervals for all the"
+ examplesMessage += "\nmounted GFS2 filesystems. Then it will archive and compress the data collected. All of the"
+ examplesMessage += "\nlockdump data will be written to the directory: /tmp/2012-11-12_095556-gfs2_lockcapture.\n"
+ examplesMessage += "\n$ %s -r 2 -s 25 -t -o /tmp/2012-11-12_095556-gfs2_lockcapture\n" %(self.__commandName)
OptionParser.print_help(self)
print examplesMessage
@@ -869,11 +982,13 @@ class ExtendOption (Option):
@type parser: OptionParser
"""
if (action == "extend") :
- valueList=[]
+ valueList = []
try:
for v in value.split(","):
# Need to add code for dealing with paths if there is option for paths.
- valueList.append(v)
+ newValue = v.strip()
+ if (len(newValue) > 0):
+ valueList.append(newValue)
except:
pass
else:
@@ -912,17 +1027,10 @@ if __name__ == "__main__":
streamHandler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
logger.addHandler(streamHandler)
- # Set the handler for writing to log file.
- pathToLogFile = "/tmp/%s.log" %(MAIN_LOGGER_NAME)
- if (((os.access(pathToLogFile, os.W_OK) and os.access("/tmp", os.R_OK))) or (not os.path.exists(pathToLogFile))):
- fileHandler = logging.FileHandler(pathToLogFile)
- fileHandler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s", "%Y-%m-%d %H:%M:%S"))
- logger.addHandler(fileHandler)
- message = "A log file will be created or appened to: %s" %(pathToLogFile)
- logging.getLogger(MAIN_LOGGER_NAME).info(message)
- else:
- message = "There was permission problem accessing the write attributes for the log file: %s." %(pathToLogFile)
- logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ # Please note there will not be a global log file created. If a log file
+ # is needed then redirect the output. There will be a log file created
+ # for each run in the corresponding directory.
+
# #######################################################################
# Set the logging levels.
# #######################################################################
@@ -949,6 +1057,26 @@ if __name__ == "__main__":
# script running.
writeToFile(PATH_TO_PID_FILENAME, str(os.getpid()), createFile=True)
# #######################################################################
+ # Verify they want to continue because this script will trigger sysrq events.
+ # #######################################################################
+ if (not cmdLineOpts.disableQuestions):
+ valid = {"yes":True, "y":True, "no":False, "n":False}
+ question = "This script will trigger a sysrq -t event or collect the data for each pid directory located in /proc for each run. Are you sure you want to continue?"
+ prompt = " [y/n] "
+ while True:
+ sys.stdout.write(question + prompt)
+ choice = raw_input().lower()
+ if (choice in valid):
+ if (valid.get(choice)):
+ # If yes, or y then exit loop and continue.
+ break
+ else:
+ message = "The script will exit since you chose not to continue."
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ exitScript(removePidFile=True, errorCode=1)
+ else:
+ sys.stdout.write("Please respond with '(y)es' or '(n)o'.\n")
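The yes/no loop above keeps prompting until it sees a recognized answer. Factoring the answer check out of the console I/O makes the decision logic testable; a sketch (not the script's actual structure):

```python
def parse_yes_no(answer):
    """Map a user response to True (yes), False (no), or None (ask again).

    Mirrors the valid-answer table used in the prompt loop above.
    """
    valid = {"yes": True, "y": True, "no": False, "n": False}
    return valid.get(answer.strip().lower(), None)
```

The prompt loop then only has to re-ask while this returns None.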
+ # #######################################################################
# Get the clusternode name and verify that mounted GFS2 filesystems were
# found.
# #######################################################################
@@ -976,8 +1104,6 @@ if __name__ == "__main__":
# proceeding unless it is already created from a previous run data needs
# to be analyzed. Probably could add more debugging on if file or dir.
# #######################################################################
- message = "The gathering of the lockdumps will be performed on the clusternode \"%s\" which is part of the cluster \"%s\"." %(clusternode.getClusterNodeName(), clusternode.getClusterName())
- logging.getLogger(MAIN_LOGGER_NAME).info(message)
pathToOutputDir = cmdLineOpts.pathToOutputDir
if (not len(pathToOutputDir) > 0):
pathToOutputDir = "%s" %(os.path.join("/tmp", "%s-%s-%s" %(time.strftime("%Y-%m-%d_%H%M%S"), clusternode.getClusterNodeName(), os.path.basename(sys.argv[0]))))
@@ -1000,56 +1126,83 @@ if __name__ == "__main__":
# Check to see if the debug directory is mounted. If not then
# log an error.
# #######################################################################
- result = verifyDebugFilesystemMounted(cmdLineOpts.enableMountDebugFS)
- if (not result):
- message = "Please mount the debug filesystem before running this script. For example: $ mount none -t debugfs %s" %(PATH_TO_DEBUG_DIR)
+ if(mountFilesystem("debugfs", "none", PATH_TO_DEBUG_DIR)):
+ message = "The debug filesystem %s is mounted." %(PATH_TO_DEBUG_DIR)
+ logging.getLogger(MAIN_LOGGER_NAME).info(message)
+ else:
+ message = "There was a problem mounting the debug filesystem: %s" %(PATH_TO_DEBUG_DIR)
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ message = "The debug filesystem is required to be mounted for this script to run."
logging.getLogger(MAIN_LOGGER_NAME).info(message)
exitScript(errorCode=1)
-
# #######################################################################
# Gather data and the lockdumps.
# #######################################################################
- message = "The process of gathering all the required files will begin before capturing the lockdumps."
- logging.getLogger(MAIN_LOGGER_NAME).info(message)
- for i in range(0,cmdLineOpts.numberOfRuns):
+ if (cmdLineOpts.numberOfRuns <= 0):
+ message = "The number of runs should be greater than zero."
+ logging.getLogger(MAIN_LOGGER_NAME).error(message)
+ exitScript(errorCode=1)
+ for i in range(1,(cmdLineOpts.numberOfRuns + 1)):
# The current log count that will start at 1 and not zero to make it
# make sense in logs.
- currentLogRunCount = (i + 1)
# Add clusternode name under each run dir to make combining multiple
# clusternodes' gfs2_lockgather data together easier, with all data in each run directory.
pathToOutputRunDir = os.path.join(pathToOutputDir, "run%d/%s" %(i, clusternode.getClusterNodeName()))
+ # Create the directory that will be used to capture the data.
if (not mkdirs(pathToOutputRunDir)):
exitScript(errorCode=1)
- # Gather various bits of data from the clusternode.
- message = "Gathering some general information about the clusternode %s for run %d/%d." %(clusternode.getClusterNodeName(), currentLogRunCount, cmdLineOpts.numberOfRuns)
+ # Set the handler for writing to log file for this run.
+ currentRunFileHandler = None
+ pathToLogFile = os.path.join(pathToOutputRunDir, "%s.log" %(MAIN_LOGGER_NAME))
+ if (((os.access(pathToLogFile, os.W_OK) and os.access("/tmp", os.R_OK))) or (not os.path.exists(pathToLogFile))):
+ currentRunFileHandler = logging.FileHandler(pathToLogFile)
+ currentRunFileHandler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s", "%Y-%m-%d %H:%M:%S"))
+ logging.getLogger(MAIN_LOGGER_NAME).addHandler(currentRunFileHandler)
+ message = "Pass (%d/%d): Gathering all the lockdump data." %(i, cmdLineOpts.numberOfRuns)
logging.getLogger(MAIN_LOGGER_NAME).status(message)
+
+ # Gather various bits of data from the clusternode.
+ message = "Pass (%d/%d): Gathering general information about the host." %(i, cmdLineOpts.numberOfRuns)
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
gatherGeneralInformation(pathToOutputRunDir)
- # Trigger sysrq events to capture memory and thread information
- message = "Triggering the sysrq events for the clusternode %s for run %d/%d." %(clusternode.getClusterNodeName(), currentLogRunCount, cmdLineOpts.numberOfRuns)
- logging.getLogger(MAIN_LOGGER_NAME).status(message)
- triggerSysRQEvents()
+ # Sleep for 2 seconds so that the TIMESTAMP is in the past in the
+ # logs, which guarantees that the sysrq data captured next is newer.
+ time.sleep(2)
+ # Gather the backtraces for all the pids, by grabbing the /proc/<pid>
+ # data or by triggering sysrq events to capture the task back traces
+ # from the log.
+ message = "Pass (%d/%d): Triggering the sysrq events for the host." %(i, cmdLineOpts.numberOfRuns)
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
+ # Gather the data in the /proc/<pid> directory if the file
+ # </proc/<pid>/stack exists. If file exists we will not trigger
+ # sysrq events.
+ pathToPidData = "/proc"
+ if (isProcPidStackEnabled(pathToPidData)):
+ gatherPidData(pathToPidData, os.path.join(pathToOutputRunDir, pathToPidData.strip("/")))
+ else:
+ triggerSysRQEvents()
# Gather the dlm locks.
lockDumpType = "dlm"
- message = "Gathering the %s lock dumps for clusternode %s for run %d/%d." %(lockDumpType.upper(), clusternode.getClusterNodeName(), currentLogRunCount, cmdLineOpts.numberOfRuns)
- logging.getLogger(MAIN_LOGGER_NAME).status(message)
+ message = "Pass (%d/%d): Gathering the %s lock dumps for the host." %(i, cmdLineOpts.numberOfRuns, lockDumpType.upper())
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
gatherDLMLockDumps(pathToOutputRunDir, clusternode.getMountedGFS2FilesystemNames(includeClusterName=False))
# Gather the glock locks from gfs2.
lockDumpType = "gfs2"
- message = "Gathering the %s lock dumps for clusternode %s for run %d/%d." %(lockDumpType.upper(), clusternode.getClusterNodeName(), currentLogRunCount, cmdLineOpts.numberOfRuns)
- logging.getLogger(MAIN_LOGGER_NAME).status(message)
+ message = "Pass (%d/%d): Gathering the %s lock dumps for the host." %(i, cmdLineOpts.numberOfRuns, lockDumpType.upper())
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
gatherGFS2LockDumps(pathToOutputRunDir, clusternode.getMountedGFS2FilesystemNames())
# Gather log files
- message = "Gathering the log files for the clusternode %s for run %d/%d." %(clusternode.getClusterNodeName(), currentLogRunCount, cmdLineOpts.numberOfRuns)
- logging.getLogger(MAIN_LOGGER_NAME).status(message)
+ message = "Pass (%d/%d): Gathering the log files for the host." %(i, cmdLineOpts.numberOfRuns)
+ logging.getLogger(MAIN_LOGGER_NAME).debug(message)
gatherLogs(os.path.join(pathToOutputRunDir, "logs"))
# Sleep between each run if secondsToSleep is greater than or equal
# to 0 and current run is not the last run.
- if ((cmdLineOpts.secondsToSleep >= 0) and (i < (cmdLineOpts.numberOfRuns - 1))):
- message = "The script will sleep for %d seconds between each run of capturing the lockdumps." %(cmdLineOpts.secondsToSleep)
+ if ((cmdLineOpts.secondsToSleep >= 0) and (i < cmdLineOpts.numberOfRuns)):
+ message = "The script will sleep for %d seconds between each run of capturing the lockdump data." %(cmdLineOpts.secondsToSleep)
logging.getLogger(MAIN_LOGGER_NAME).info(message)
- message = "The script is sleeping before beginning the next run."
- logging.getLogger(MAIN_LOGGER_NAME).status(message)
time.sleep(cmdLineOpts.secondsToSleep)
+ # Remove the handler:
+ logging.getLogger(MAIN_LOGGER_NAME).removeHandler(currentRunFileHandler)
+
# #######################################################################
# Archive the directory that contains all the data and archive it after
# all the information has been gathered.
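Each run above attaches a logging FileHandler pointing into that run's output directory and removes it when the run ends, so every run directory carries its own log. The attach/log/detach pattern in isolation (the logger name and file name here are illustrative):

```python
import logging
import os


def log_run(logger_name, run_dir, message):
    """Attach a per-run file handler, emit one message, then detach,
    leaving the logger as it was. Returns the path of the run log."""
    path = os.path.join(run_dir, "run.log")
    logger = logging.getLogger(logger_name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the message out of the root logger
    handler = logging.FileHandler(path)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(levelname)s %(message)s", "%Y-%m-%d %H:%M:%S"))
    logger.addHandler(handler)
    try:
        logger.info(message)
    finally:
        # Always detach, mirroring the removeHandler() call after each run.
        logger.removeHandler(handler)
        handler.close()
    return path
```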
diff --git a/gfs2/man/Makefile.am b/gfs2/man/Makefile.am
index 83d6251..8655a76 100644
--- a/gfs2/man/Makefile.am
+++ b/gfs2/man/Makefile.am
@@ -7,4 +7,5 @@ dist_man_MANS = fsck.gfs2.8 \
gfs2_grow.8 \
gfs2_jadd.8 \
mkfs.gfs2.8 \
- tunegfs2.8
+ tunegfs2.8 \
+ gfs2_lockcapture.8
diff --git a/gfs2/man/gfs2_lockcapture.8 b/gfs2/man/gfs2_lockcapture.8
new file mode 100644
index 0000000..854cd71
--- /dev/null
+++ b/gfs2/man/gfs2_lockcapture.8
@@ -0,0 +1,53 @@
+.TH gfs2_lockcapture 8
+
+.SH NAME
+gfs2_lockcapture \- capture locking information from GFS2 file systems and DLM.
+
+.SH SYNOPSIS
+.B gfs2_lockcapture \fR[-dqyt] [-o \fIoutput directory]\fR [-r \fInumber of runs]\fR [-s \fIseconds to sleep]\fR [-n \fIname of GFS2 filesystem]\fP
+.PP
+.B gfs2_lockcapture \fR[-dqyi]
+
+.SH DESCRIPTION
+\fIgfs2_lockcapture\fR is used to capture all the GFS2 lockdump data and
+corresponding DLM data. The command can be configured to capture the data
+multiple times and to sleep a set number of seconds between each iteration of
+capturing the data. By default all of the mounted GFS2 filesystems will have
+their data collected unless particular GFS2 filesystems are specified.
+.PP
+Please note that sysrq -t and -m events are triggered, or the pid directories in /proc are
+collected, on each iteration of capturing the data.
+
+.SH OPTIONS
+.TP
+\fB-h, --help\fP
+Prints out a short usage message and exits.
+.TP
+\fB-d, --debug\fP
+enables debug logging.
+.TP
+\fB-q, --quiet\fP
+disables logging to console.
+.TP
+\fB-y, --no_ask\fP
+disables all questions and assumes yes.
+.TP
+\fB-i, --info\fP
+prints information about the mounted GFS2 file systems.
+.TP
+\fB-t, --archive\fP
+the output directory will be archived (tar) and compressed (.bz2).
+.TP
+\fB-o \fI<output directory>, \fB--path_to_output_dir\fR=\fI<output directory>\fP
+the directory where all the collected data will be stored.
+.TP
+\fB-r \fI<number of runs>, \fB--num_of_runs\fR=\fI<number of runs>\fP
+number of runs capturing the lockdump data.
+.TP
+\fB-s \fI<seconds to sleep>, \fB--seconds_sleep\fR=\fI<seconds to sleep>\fP
+number of seconds to sleep between runs of capturing the lockdump data.
+.TP
+\fB-n \fI<name of GFS2 filesystem>, \fB--fs_name\fR=\fI<name of GFS2 filesystem>\fP
+name of the GFS2 filesystem(s) that will have their lockdump data captured.
+.
+.SH SEE ALSO