Static Analysis: some UI ideas
David Malcolm
dmalcolm at redhat.com
Mon Feb 4 21:37:45 UTC 2013
On Mon, 2013-02-04 at 22:13 +0100, Kamil Dudka wrote:
> On Monday, February 04, 2013 15:04:36 David Malcolm wrote:
> > I've been experimenting with some UI ideas for reporting static analysis
> > results: I've linked to two different UI reports below.
> >
> > My hope is that we'll have a server in the Fedora infrastructure for
> > browsing results, marking things as false positives etc.
> >
> > However, for the purposes of simplicity during experimentation I'm
> > simply building static HTML reports.
> >
> > My #1 requirement when I'm viewing static analysis results is that I
> > want to *see the code* with the report, so I've attempted to simply show
> > the code with warnings shown inline.
>
> Does it mean you need to keep the unpacked source files for all scanned
> packages? Then you will easily run out of disk space after scanning a few
> versions of libreoffice.
I'm using content-addressed storage: the files are named by the SHA-1 sum
of their contents, similar to how git does it, so if the bulk of the files
don't change, they have the same SHA-1 sum and are only stored once. See e.g.:
http://fedorapeople.org/~dmalcolm/static-analysis/2013-01-30/python-ethtool-0.7-4.fc19.src.rpm/static-analysis/sources/
I probably should gzip them as well.
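
A minimal sketch of the content-addressed idea, not the code that produced the directory above. The function name store_source and the flat directory layout are assumptions made for illustration:

```python
import hashlib
import os
import tempfile


def store_source(content: bytes, store_dir: str) -> str:
    """Save content under its SHA-1 hex digest and return that digest."""
    digest = hashlib.sha1(content).hexdigest()
    path = os.path.join(store_dir, digest)
    # Identical content always maps to the same name, so it is written once.
    if not os.path.exists(path):
        with open(path, "wb") as f:
            f.write(content)
    return digest


store = tempfile.mkdtemp()
a = store_source(b"int main(void) { return 0; }\n", store)
b = store_source(b"int main(void) { return 0; }\n", store)  # unchanged file
c = store_source(b"int main(void) { return 1; }\n", store)  # changed file
stored = sorted(os.listdir(store))
```

Here the unchanged file is stored once, so scanning a second version of a package only costs disk space for the files that actually differ.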
Currently it's capturing all C files that have GCC invoked on them, or
are mentioned in a warning (e.g. a .h file with an inline function with
a bug). I could tweak things so it only captures files that are
mentioned in a warning.
> > Note also that when we have a server we can do all kinds of
> > auto-filtering behaviors so that e.g. package maintainers only see
> > warnings from tests that have decent signal:noise ratio (perhaps with
> > other warnings greyed out, or similar).
>
> It would be cool if the auto-filtering techniques were implemented in
> standalone utilities operating on text files so that we have separated
> algorithms from presentation of the results. It is easy to use a filter-like
> utility on a server, but painful to use a server for processing local text
> files.
I guess the issue is: where do you store the knowledge about good vs bad
warnings? My plan was to store it server-side. But we could generate
summaries and have them available client-side. For example, say
cppcheck's "useClosedFile" test has generated 100 issues, of which 5 have
received human attention: 1 has been marked as a true positive, and 4
have been marked as false positives. We could then say ("cppcheck",
"useClosedFile") has a signal:noise ratio of 1:4. We could then
generate a summary of these (tool, testID) ratios for use by clients,
which could then apply a user-configurable signal:noise threshold, so you
can say: "only show me results from tests that achieve 1:2 or better".
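
A rough sketch of how a client might apply such a threshold to a downloaded summary. The summary layout, the passes_threshold helper and the ("exampletool", "exampleTest") entry with its counts are all hypothetical; only the cppcheck 1:4 figure comes from the example above:

```python
from fractions import Fraction

# Hypothetical summary: (tool, testID) -> (true positives, false positives)
triage = {
    ("cppcheck", "useClosedFile"): (1, 4),
    ("exampletool", "exampleTest"): (3, 1),
}


def passes_threshold(true_pos, false_pos, threshold):
    """Return True if true_pos:false_pos is at least as good as threshold."""
    if true_pos + false_pos == 0:
        return False  # nothing triaged yet; this sketch hides such tests
    # Cross-multiply so that zero false positives needs no division.
    return true_pos * threshold.denominator >= false_pos * threshold.numerator


# "Only show me results from tests that achieve 1:2 or better."
threshold = Fraction(1, 2)
visible = [key for key, (tp, fp) in triage.items()
           if passes_threshold(tp, fp, threshold)]
```

With these numbers the 1:4 test is filtered out and the 3:1 test is kept. Hiding tests that nobody has triaged yet is just one possible policy; showing them greyed out would be another.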
> > Results of an srpm build
> > ========================
> > The first experimental report can be seen here:
> > http://fedorapeople.org/~dmalcolm/static-analysis/2013-02-01/policycoreutils-2.1.13-27.2.fc17.src.rpm-001.html
> >
> > It shows warnings from 4 different static analyzers when rebuilding a
> > particular srpm (policycoreutils-2.1.13-27.2.fc17). There's a summary
> > table at the top of the report showing, for each source file in the
> > build which analyzers found reports (those that found any are
> > highlighted in red). Each row has a <a> linking you to a report on each
> > source file. Those source files that have issues have a table showing
> > the issues, with links to them. The issues are shown inline within the
> > syntax-colored source files.
> >
> > Ideally there'd be support for using "n" and "p" to move to
> > next/previous error (with a little javascript), but for now I've been
> > using "back" in the browser to navigate through the tables.
> >
> > An example of an error shown inline:
> > http://fedorapeople.org/~dmalcolm/static-analysis/2013-02-01/policycoreutils-2.1.13-27.2.fc17.src.rpm-001.html#file-868b5c03918269eaabebfedc41eaf32e390357be-line-791
> > shows a true error in seunshare.c found by cppcheck ("foo =
> > realloc(foo, , )" is always a mistake, since if realloc fails you get
> > NULL back, but still have responsibility for freeing the old foo).
>
> The limitation of javascript-based UIs is that they are read-only. Some
> developers prefer to go through the defects using their own environment
> (eclipse, vim, emacs, ...) rather than a web browser so that they can fix
> them immediately. We should support both approaches I guess.
Both approaches. What we could do is provide a tool ("fedpkg
get-errors" ?) that captures the errors in the same output format as
gcc. That way if you run it from, say, emacs, the *compilation* buffer has
everything in the right format, and emacs' goto-next-error stuff works.
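
A sketch of what emitting results in GCC's "file:line:column: kind: message" diagnostic layout could look like. The helper name format_like_gcc and the message text are made up for illustration; no such "fedpkg get-errors" tool exists yet:

```python
def format_like_gcc(filename, line, message, column=None, kind="warning"):
    """Render one issue in the file:line[:column]: kind: message layout."""
    location = f"{filename}:{line}"
    if column is not None:
        location += f":{column}"
    return f"{location}: {kind}: {message}"


# Hypothetical message text for the seunshare.c issue at line 791.
diagnostic = format_like_gcc(
    "seunshare.c", 791, "realloc result assigned to its own argument")
print(diagnostic)
# prints: seunshare.c:791: warning: realloc result assigned to its own argument
```

Editors that already parse compiler output should be able to jump to the reported location without any extra configuration.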
>
> > Comparison report
> > =================
> > The second experimental report can be seen here:
> > http://fedorapeople.org/~dmalcolm/static-analysis/2013-02-04/comparison-of-python-ethtool-builds.html
> >
> > It shows a comparison of the results of two different builds of a
> > package (python-ethtool), again running multiple analyzers
> > (specifically, a comparison between 0.7 and a snapshot of upstream
> > git).
> >
> > It's similar to the first report, but instead of showing one file at a
> > time, it shows a side-by-side diff of old vs new file.
>
> Does it assume that you have 1:1 file mapping between old and new versions of
> the package? What will happen if the source files are renamed, moved, merged,
> split, etc.?
Currently it's matching on 4 things:
* by name of test tool (e.g. "clang-analyzer")
* by filename of C file within the tarball (so e.g.
'/builddir/build/BUILD/python-ethtool-0.7/python-ethtool/etherinfo.c'
becomes 'python-ethtool/etherinfo.c', allowing different versions to be
compared)
* function name (or None)
* text of message
See "make-comparative-report.py:ComparativeIssues" in
https://github.com/fedora-static-analysis/mock-with-analysis/blob/master/reports/make-comparative-report.py
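
The four-part match described above can be sketched as follows. This is an illustration only, not the code in make-comparative-report.py: the Issue tuple, the helper names, the "/BUILD/" marker handling, the "python-ethtool-git" directory name and the message texts are all assumptions:

```python
from collections import namedtuple

Issue = namedtuple("Issue", "tool path function message")


def relative_to_build(path, marker="/BUILD/"):
    """Drop the build root and the versioned top-level directory."""
    _, _, tail = path.partition(marker)
    if not tail:
        return path  # marker not found: leave the path alone
    _, _, rest = tail.partition("/")  # strip e.g. "python-ethtool-0.7/"
    return rest or tail


def match_key(issue):
    """Key on tool, tarball-relative filename, function name and message."""
    return (issue.tool, relative_to_build(issue.path),
            issue.function, issue.message)


old = [Issue("clang-analyzer",
             "/builddir/build/BUILD/python-ethtool-0.7/python-ethtool/etherinfo.c",
             "get_etherinfo", "hypothetical message A")]
new = [Issue("clang-analyzer",
             "/builddir/build/BUILD/python-ethtool-git/python-ethtool/etherinfo.c",
             "get_etherinfo", "hypothetical message A"),
       Issue("cppcheck",
             "/builddir/build/BUILD/python-ethtool-git/python-ethtool/ethtool.c",
             None, "hypothetical message B")]

old_keys = {match_key(i) for i in old}
new_keys = {match_key(i) for i in new}
fixed = old_keys - new_keys       # present in old build only
introduced = new_keys - old_keys  # present in new build only
```

Because line numbers are not part of the key, an issue that merely moves within a file still matches. A renamed or moved file changes the key, though, so its issues would show up as one fixed and one introduced.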
Thanks for the feedback
Dave