Static Analysis: proposed interchange format ("firehose")

Daniel Veillard veillard at redhat.com
Thu Jan 17 05:33:13 UTC 2013


On Wed, Jan 16, 2013 at 03:53:56PM -0500, David Malcolm wrote:
> This is a followup to my proposal in
> http://lists.fedoraproject.org/pipermail/devel/2012-December/175232.html
> 
> I want a common output format for static analysis tools so that we can
> easily slurp the results from different tools into a database and have a
> common system for managing the results (marking false positives, having
> automated de-duplication, etc).
> 
> (I like the name "firehose" for the overall system since it describes
> the issue we'll have of managing the flood of data).
> 
> I came up with an XML format, which I've uploaded code to here:
> https://github.com/fedora-static-analysis/firehose
> 
> Does this look sane?  I think that it should be possible to write

  Okay, taking the question from the XML side, i.e. analysing the
firehose.rng schema driving the format. Points and remarks as I go
through it:

 - is the cwe attribute a number or free form? If it is a number, add
   an explicit rule to check its type.
 - the sut content choice is a bit weird: on one side you have text,
   on the other you have <rpm>. I would still allow a free-form
   description, but in an element at the same level as rpm,
   something like
   <choice>
     <element name="description">
       <text/>
     </element>
     <element name="rpm">
       ...
     </element>
   </choice>
   For the sake of wider usage, I would also make some room for
   Debian, and expand that to be able to express a given file.
   To give an example allowing extra details there, and making some
   if not all of the attributes optional, for example to be able
   to express independence from, say, the arch:
   <sut>
     <file>/usr/bin/xmllint</file>
     <package type="rpm" name="libxml2" version="2.9.0" release="1.fc17"/>
   </sut>
   so: an optional file element, and a package element (to not feel
   tied to rpm) with an extra type attribute to distinguish :-)

 - for notes I would separate them
   <notes>
     <note>...</note>
     <note>...</note>
   </notes>
   since they are likely to be entered manually, and you may want to
   track who entered them as you go.

 - I would use <where> instead of <point> myself, but I understand your
   logic too
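
To make the first three points concrete, here is a rough RELAX NG sketch
of what I have in mind. It is only an illustration, not a patch against
firehose.rng: the "deb" type value, the author attribute on <note>, the
arch attribute, the define names and the start pattern are all my own
assumptions, and I used the XML Schema integer type for cwe on the
guess that it is meant to be a plain number.

```xml
<grammar xmlns="http://relaxng.org/ns/structure/1.0"
         datatypeLibrary="http://www.w3.org/2001/XMLSchema-datatypes">

  <!-- Illustration only: accept either fragment standalone.
       This is not the real firehose document root. -->
  <start>
    <choice>
      <ref name="sut"/>
      <ref name="notes"/>
    </choice>
  </start>

  <!-- cwe as a checked number; to be referenced from whichever
       element carries the attribute (left unreferenced here). -->
  <define name="cwe-attribute">
    <attribute name="cwe">
      <data type="integer"/>
    </attribute>
  </define>

  <!-- sut: optional file, then either a free-form description
       or a package identified by a type attribute. -->
  <define name="sut">
    <element name="sut">
      <optional>
        <element name="file"><text/></element>
      </optional>
      <choice>
        <element name="description"><text/></element>
        <element name="package">
          <attribute name="type">
            <choice>
              <value>rpm</value>
              <value>deb</value> <!-- assumed value for Debian -->
            </choice>
          </attribute>
          <attribute name="name"/>
          <!-- everything below is optional, e.g. to stay
               independent of the arch -->
          <optional><attribute name="version"/></optional>
          <optional><attribute name="release"/></optional>
          <optional><attribute name="arch"/></optional>
        </element>
      </choice>
    </element>
  </define>

  <!-- notes: one note element per entry, so that each one can
       record who entered it (author is a hypothetical name). -->
  <define name="notes">
    <element name="notes">
      <zeroOrMore>
        <element name="note">
          <optional><attribute name="author"/></optional>
          <text/>
        </element>
      </zeroOrMore>
    </element>
  </define>

</grammar>
```

With something along those lines, the <sut> example above (file plus an
rpm package without arch) should validate, and so should a sut holding
only a <description>.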

Long reply, but overall that looks mostly fine from my very narrow POV

Daniel


-- 
Daniel Veillard      | Open Source and Standards, Red Hat
veillard at redhat.com  | libxml Gnome XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | virtualization library  http://libvirt.org/

