On 08/11/10 - 09:26:43AM, Mohammed Morsi wrote:
This patch contains the rough first pass of dbomatic's
implementation.
There are various things TBD, such as event error handling and making sure
all the correct events are collected with all their metadata, but what
is present should give a rough idea of the parsing strategy.
To get condor to yield the necessary info for dbomatic to parse, add the
following to /var/lib/condor/condor_config.local
EVENT_LOG=$(LOG)/EventLog
EVENT_LOG_USE_XML=True
EVENT_LOG_JOB_AD_INFORMATION_ATTRS=Owner,GlobalJobId,Cmd,JobStartDate,JobCurrentStartDate,JobFinishedHookDone
Also be sure to set CONDOR_HOST appropriately (I merely set it to 'localhost'
in my case).
This change to CONDOR_HOST shouldn't be necessary with the latest
condor_config.local I have up on my webpage.
<snip>
diff --git a/src/dbomatic/dbomatic.rb b/src/dbomatic/dbomatic.rb
new file mode 100644
index 0000000..8b80c81
--- /dev/null
+++ b/src/dbomatic/dbomatic.rb
@@ -0,0 +1,90 @@
+# Copyright (C) 2010 Red Hat, Inc.
+# Written by Mohammed Morsi <mmorsi(a)redhat.com>
+#
+# This program is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; version 2 of the License.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+# MA 02110-1301, USA. A copy of the GNU General Public License is
+# also available at http://www.gnu.org/copyleft/gpl.html.
+
+$: << File.join(File.dirname(__FILE__), "../dutils")
+require 'dutils'
+require 'nokogiri'
+
+# Handle the event log's xml
+class CondorEventLog < Nokogiri::XML::SAX::Document
+ attr_accessor :tag, :event_type, :event_cmd, :event_time
+
+ # Store the name of the event log attribute we're looking at
+ def start_element(element, attributes)
+ @tag = attributes[1] if element == "a"
+ end
+
+ # Store the value of the event log attribute we're looking at
+ def characters(string)
+ unless string.strip == ""
+ if @tag == "MyType"
+ @event_type = string
+ elsif @tag == "Cmd"
+ @event_cmd = string
+ elsif @tag == "EventTime"
+ @event_time = string
+ end
+ end
+ end
+
+ # Create a new entry for events which we have all the necessary data for
+ def end_element(element)
+ if element == "c" && !@event_cmd.nil?
+ inst = Instance.find(:first, :conditions => ['condor_job_id = ?', @event_cmd])
+ #puts "Instance event #{inst.name} #{@event_type} #{@event_time}"
+ InstanceEvent.create! :instance => inst,
+ :event_type => @event_type,
+ :event_time => @event_time
+ @tag = @event_type = @event_cmd = @event_time = nil
We may want to add additional fields in the future, but this is a good start.
+ end
+ end
+end
+parser = Nokogiri::XML::SAX::PushParser.new(CondorEventLog.new)
+
+# XXX bit of a hack, condor event log doesn't seem to have a top level element
+# enclosing everything else in the doc (as standards conforming xml must).
+# Create one for parsing purposes.
+parser << "<events>"
+
+# last time the event log was modified
+event_log_timestamp = nil
+
+# last position we've read in the log
+event_log_position = 0
This is a problem for dbomatic restarts. That is, if you get a few events,
and then restart dbomatic, you'll go back to the beginning of the event log
and re-add those same events to the table. We are going to have to track this
on persistent storage somehow.
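One possible way to track this, as a minimal sketch only: keep the last read offset in a small state file next to the log. The helper names and the state-file idea are assumptions for illustration and are not part of the patch.

```ruby
# Hypothetical helpers; the names and the state-file approach are illustrative.

# Return the byte offset saved by a previous run, or 0 if none is usable.
def load_position(state_path, log_path)
  return 0 unless File.exist?(state_path)
  pos = File.read(state_path).strip.to_i
  # If the log is now shorter than the saved offset, assume it was rotated
  # or truncated and start again from the beginning.
  (File.exist?(log_path) && pos <= File.size(log_path)) ? pos : 0
end

# Save the offset by writing a temp file and renaming it over the old one,
# so a crash mid-write does not leave a half-written state file.
def save_position(state_path, pos)
  tmp = "#{state_path}.tmp"
  File.open(tmp, "w") { |f| f.write(pos.to_s) }
  File.rename(tmp, state_path)
end
```

In the polling loop, event_log_position would start from load_position and be saved after each batch is read. Saving only once an event has been fully handled would avoid resuming in the middle of an element after a restart.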
+
+# set true to terminate dbomatic
+terminate = false
+until terminate
+ log_file = File.open("/var/log/condor/EventLog")
+
+ # Condor seems to open / close the event log for every
+ # entry. Simply poll for new data in the log
+ unless log_file.mtime == event_log_timestamp
+ event_log_timestamp = log_file.mtime
+ log_file.pos = event_log_position
+ while c = log_file.getc
+ parser << c.chr
+ end
This is probably better done by reading line by line (instead of character
by character), no?
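Something along these lines is what I have in mind; treat it as a rough sketch. The helper name is made up, and it assumes condor ends every log line with a newline, so a line without one is taken to be still in progress.

```ruby
# Hypothetical helper: yields each complete new line in the file starting at
# the given byte offset, and returns the offset to resume from next time.
def read_new_lines(path, position)
  File.open(path) do |f|
    f.pos = position
    f.each_line do |line|
      # A line without a trailing newline may still be being written;
      # stop here and pick it up on the next poll.
      break unless line.end_with?("\n")
      yield line
      position = f.pos
    end
  end
  position
end
```

The loop body would then reduce to something like event_log_position = read_new_lines(path, event_log_position) { |line| parser << line }, and the block form of File.open also closes the file on every pass.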
+ event_log_position = log_file.pos
+ end
+
+ sleep 1
The sleep is fine for now, but I would prefer if we used something like
inotify to get notified of changes to the file. That's a future optimization,
though.
+end
+
+parser << "</events>"
+parser.finish
So, this is a really good start. It looks like it's the framework we need
for doing additional work here. The biggest questions that come to mind have
to do more with the types of events that condor is generating, and whether
we can properly calculate our QoS data from it. In that vein, could you
schedule some actual guests on, say, EC2, and then see if the information we
are gathering can answer:
1) How long did it take between us submitting the request and the request
being started (i.e. getting through matchmaking and the deltacloud GAHP)?
2) How long did it take between the request being started and the guest going
to running? (This should be the time the backend cloud took to get the guest
up and running.)
And other metrics you can think of.
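To make the two questions concrete, here is a rough sketch of the calculation, assuming we can pull [event type, event time] pairs for a single job out of the collected events. The helper name is hypothetical, and the event type names and timestamp format in the example are assumptions about what condor logs, which is exactly what the experiment should confirm.

```ruby
require 'time'

# Hypothetical helper: events is an array of [event_type, event_time] pairs
# for one job, with event_time as a string that Time.parse understands.
# Returns the seconds between the first event of each type, or nil if either
# type is missing.
def elapsed_between(events, from_type, to_type)
  from = events.find { |type, _| type == from_type }
  to   = events.find { |type, _| type == to_type }
  return nil if from.nil? || to.nil?
  Time.parse(to[1]) - Time.parse(from[1])
end

# Made-up sample data; the type names are assumed, not verified against condor.
events = [
  ["SubmitEvent",     "2010-08-11T09:26:43"],
  ["GridSubmitEvent", "2010-08-11T09:26:50"],
  ["ExecuteEvent",    "2010-08-11T09:28:13"]
]
submit_to_start  = elapsed_between(events, "SubmitEvent", "GridSubmitEvent")
start_to_running = elapsed_between(events, "GridSubmitEvent", "ExecuteEvent")
```

If the events we collect do not let us fill in both ends of each interval, that tells us which additional events or attributes we need to capture.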
--
Chris Lalancette