Hi,
Debian maintain a list of CPE information for packages on their security tracker: http://svn.debian.org/wsvn/secure-testing/data/CPE/list
The CPE information is not complete and does not contain version information. This makes it relatively static except when packages are added to or removed from the repository. It can be useful to maintain this limited CPE information for searching purposes.
In the past I generated an automatic mapping between packages in Debian and Fedora https://github.com/silviocesare/Equivalent-Packages/blob/master/NearestNeigh....
By combining the Debian CPE list with my package mappings, I can generate a CPE list for Fedora. The list would not cover all of Fedora's packages and I could not guarantee 100% accuracy; however, such a list may be useful.
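A minimal sketch of that combination, assuming both inputs have already been loaded into plain dictionaries. The on-disk formats of the Debian CPE list and of the package mapping are not shown here, so the function name and sample data are hypothetical:

```python
def fedora_cpe_list(debian_cpe, debian_to_fedora):
    """Carry each Debian package's CPE name over to its mapped Fedora package."""
    fedora_cpe = {}
    for debian_pkg, cpe in debian_cpe.items():
        fedora_pkg = debian_to_fedora.get(debian_pkg)
        if fedora_pkg is not None:  # Debian packages without a mapping are skipped
            fedora_cpe[fedora_pkg] = cpe
    return fedora_cpe


# Hypothetical sample data, not taken from either real list.
debian_cpe = {"zlib": "cpe:/a:gnu:zlib", "openssl": "cpe:/a:openssl:openssl"}
debian_to_fedora = {"zlib": "zlib", "apache2": "httpd"}

print(fedora_cpe_list(debian_cpe, debian_to_fedora))
```

Only packages present in both inputs end up in the result, which is why the generated list can only be partial.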
I can create this list if the security team or developers are interested and perhaps it could be put on the Fedora wiki.
Apologies if this has already been answered. I have asked Fedora in several forums if similar information (such as package mappings) would be useful, and the general consensus thus far has been that it is not needed. However, while package mappings might not be useful to Fedora, perhaps a partial CPE list could be.
CC me on responses.
-- Silvio Cesare
Hi Silvio!
On Mon, 31 Jan 2011 19:21:39 +1100 Silvio Cesare wrote:
> Debian maintain a list of CPE information for packages on their security tracker http://svn.debian.org/wsvn/secure-testing/data/CPE/list
We currently do not use CPE names for security tracking in Fedora, so I don't see an obvious benefit in maintaining such a list. Can you explain briefly how you use it for Debian security tracking and what benefits it brings?
> This makes it relatively static except when packages are added or removed from the repository.
It's not that uncommon to see new packages added to Fedora repositories even after the release of some Fedora version.
> In the past I generated an automatic mapping between packages in Debian and Fedora https://github.com/silviocesare/Equivalent-Packages/blob/master/NearestNeigh...
I played a little more with this list and noticed a few problems:
- quite a few Debian packages map to Fedora arptools or binclock. These are probably packages without much source, where other files (license, configure) confuse your tool into matching unrelated packages
- there does not seem to be a good way to list cases where multiple components contain the same sources. In Fedora, mingw32-* packages are a good example, and the list often maps Debian package foo to Fedora package mingw32-foo, while there is a Fedora package foo that should be a similarly good match. Another example is zlib:arm-gp2x-linux-zlib.
Did you review "unexpected matches" to see if the sources are really similar, and how the match is picked when there are multiple "good candidates"?
In answer to your questions about the Equivalent-Packages process:
1) You are right that the tool can get confused when there is little source in the package or if the majority of files include common things like readme/todo/makefile etc. One thing I could do is exclude source files which are excessively common. I do that in other tools where I use package similarity, but tried to keep this package equivalent tool as simple as possible. It is not difficult to implement and it should reduce some of the false positives.
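A rough sketch of that exclusion step, assuming each package is represented as a set of source file names. The tool's real representation and any threshold it would use are not stated here, so `drop_common_files` and `max_fraction` are hypothetical:

```python
from collections import Counter


def drop_common_files(packages, max_fraction=0.5):
    """Remove file names that occur in more than max_fraction of all packages."""
    # Count, for every file name, how many packages contain it.
    counts = Counter(name for files in packages.values() for name in set(files))
    limit = max_fraction * len(packages)
    return {
        pkg: {name for name in files if counts[name] <= limit}
        for pkg, files in packages.items()
    }


# Hypothetical sample data.
packages = {
    "foo": {"README", "Makefile", "foo.c"},
    "bar": {"README", "Makefile", "bar.c"},
    "baz": {"README", "baz.c"},
}
filtered = drop_common_files(packages)
print(filtered)
```

Similarity would then be computed on the filtered sets, so that packages sharing only README or Makefile entries no longer look alike.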
2) When there are multiple possible matches, I simply choose the package with the highest similarity. One thing that I did was to run the tool against one repo only (instead of, say, the Fedora repo against the Debian repo) to find near duplicate packages. So far I have done this only for Debian https://github.com/silviocesare/Equivalent-Packages/blob/master/Clusters/Deb...
The first entry in the list is the base package, the remaining entries are the near duplicate packages and their similarities to the base package. An example from the Debian repo -->
libxml-um-perl libapp-control-perl:0.846154 libcrypt-des-ede3-perl:0.846154 libdata-buffer-perl:0.846154 libdb-file-lock-perl:0.846154 libio-tee-perl:0.846154 liblingua-preferred-perl:0.846154 liblingua-pt-stemmer-perl:0.846154 liblog-tracemessages-perl:0.846154 libpdf-reuse-barcode-perl:0.846154 libsort-fields-perl:0.846154 libtemplate-plugin-calendar-simple-perl:0.846154 libxml-filter-detectws-perl:0.846154 libxml-filter-saxt-perl:0.846154 libxml-handler-printevents-perl:0.846154 libxml-handler-trees-perl:0.846154 libxml-regexp-perl:0.846154
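A small sketch for reading one such cluster line, assuming whitespace-separated fields with the base package first and `name:similarity` entries after it, as in the example above. The function name is hypothetical and the sample line is shortened from that example:

```python
def parse_cluster(line):
    """Split a cluster line into the base package and (name, similarity) pairs."""
    base, *entries = line.split()
    duplicates = []
    for entry in entries:
        # Split on the last colon so the similarity is always the final field.
        name, _, score = entry.rpartition(":")
        duplicates.append((name, float(score)))
    return base, duplicates


line = "libxml-um-perl libapp-control-perl:0.846154 libxml-regexp-perl:0.846154"
base, duplicates = parse_cluster(line)

# "Highest similarity wins"; on a tie, max() keeps the first entry it saw.
best = max(duplicates, key=lambda pair: pair[1])
print(base, best)
```

The tie in this example illustrates the problem raised earlier: when many candidates have identical similarity, picking the highest one is essentially arbitrary.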
One possible method of reducing false positives is to ignore packages which are equivalent to more than one other package. Or perhaps it could require human intervention.
3) I did some trivial testing of unexpected matches. In fact, one thing I looked at was cases where the same package name was in Fedora and Debian but the similarity was so low that it didn't match. Surprisingly, a not insignificant number of packages were like this, and manual verification showed that the ones I looked at were in fact different packages. This demonstrates that if you base equivalence on names only, you will get false positives.
I could add heuristics based on the package name to request human intervention, i.e. if two packages are found similar and their names do not have 50% overlap, then request human verification. I am not sure how useful this would be because, from experience, package names can sometimes be problematic.
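One way such a name check could look, as a sketch only. "Overlap" is not defined above, so this assumes the character-level ratio from Python's `difflib.SequenceMatcher` as a stand-in, and `needs_human_review` is a hypothetical name:

```python
from difflib import SequenceMatcher


def needs_human_review(debian_name, fedora_name, min_overlap=0.5):
    """Flag a source-similar pair whose package names overlap less than min_overlap."""
    # ratio() is 2 * matched characters / total characters, between 0.0 and 1.0.
    overlap = SequenceMatcher(None, debian_name, fedora_name).ratio()
    return overlap < min_overlap


# Pairs that the source comparison is assumed to have already judged similar.
for debian_name, fedora_name in [("zlib", "zlib"),
                                 ("zlib", "arm-gp2x-linux-zlib"),
                                 ("apache2", "httpd")]:
    print(debian_name, fedora_name, needs_human_review(debian_name, fedora_name))
```

The last pair shows the limitation already mentioned: a legitimate match between differently named packages would also be sent for manual verification.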
-- Silvio
On Tue, Feb 1, 2011 at 1:13 AM, Tomas Hoger thoger@redhat.com wrote:
> Hi Silvio!
> On Mon, 31 Jan 2011 19:21:39 +1100 Silvio Cesare wrote:
>> Debian maintain a list of CPE information for packages on their security tracker http://svn.debian.org/wsvn/secure-testing/data/CPE/list
> We currently do not use CPE names for security tracking in Fedora, so I don't see an obvious benefit maintaining such list. Can you explain briefly how you use it for Debian security tracking and what benefits it brings?
>> This makes it relatively static except when packages are added or removed from the repository.
> It's not that uncommon to see new packages added to Fedora repositories even after the release of some Fedora version.
>> In the past I generated an automatic mapping between packages in Debian and Fedora https://github.com/silviocesare/Equivalent-Packages/blob/master/NearestNeigh...
> I played a little more with this list and noticed few problems:
> - quite a few Debian packages map to Fedora arptools or binclock. Probably packages with not much sources, where other file (license, configure) confuse your tool to match unrelated packages
> - there does not seem to be a good way to list cases where multiple components contain the same sources. In Fedora, mingw32-* packages are a good example, and the list often maps Debian package foo to Fedora package mingw32-foo, while there is Fedora package foo that should be similarly good match. Another example is zlib:arm-gp2x-linux-zlib.
> Did you review "unexpected matches" to see if the sources are really similar, and how the match is picked when there are multiple "good candidates"?
> -- Tomas Hoger / Red Hat Security Response Team
From what I can gather, Debian uses the CPE list to check that the NVD CVE lists match up with Debian advisories, i.e. to catch any vulnerabilities/packages that are in CVE but missing from Debian. This is discussed briefly at http://lists.debian.org/debian-devel/2011/02/msg00005.html
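A sketch of how such a cross-check could work, under stated assumptions: NVD entries are given as a mapping from CVE id to CPE 2.2 URI names that may carry a version, the distribution's CPE list is version-less, and the tracker's known CVE ids are available as a set. All names and sample data are hypothetical, and this is not a description of Debian's actual tooling:

```python
def strip_version(cpe_name):
    """Reduce a CPE 2.2 URI such as cpe:/a:vendor:product:1.0 to cpe:/a:vendor:product."""
    return ":".join(cpe_name.split(":")[:4])


def untracked_cves(nvd_entries, cpe_to_package, tracked_cves):
    """List CVEs that affect a known package but are absent from the tracker."""
    missing = []
    for cve_id, cpe_names in nvd_entries.items():
        packages = {
            cpe_to_package[strip_version(name)]
            for name in cpe_names
            if strip_version(name) in cpe_to_package
        }
        if packages and cve_id not in tracked_cves:
            missing.append((cve_id, sorted(packages)))
    return missing


# Hypothetical sample data with placeholder CVE ids.
nvd_entries = {
    "CVE-0000-0001": ["cpe:/a:gnu:zlib:1.2.3"],
    "CVE-0000-0002": ["cpe:/a:gnu:zlib:1.2.5"],
    "CVE-0000-0003": ["cpe:/a:example:other:1.0"],
}
cpe_to_package = {"cpe:/a:gnu:zlib": "zlib"}
tracked_cves = {"CVE-0000-0001"}

print(untracked_cves(nvd_entries, cpe_to_package, tracked_cves))
```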
I have created the CPE list for Fedora 13. There are only about 300 entries due partly to the fact that Debian's CPE list only contains about 1100 entries/packages.
https://github.com/silviocesare/Equivalent-Packages/blob/master/CPE/Fedora13...
Looking over the results, much of the data seems fairly reasonable. Some entries are clearly wrong and would need to be corrected. I'm not expecting Fedora to necessarily use this list, but it's there if you want it.
Incidentally, a CPE list has uses in other applications besides security. It seems cross-distro app installers want CPE info or package equivalencies, e.g. http://lists.debian.org/debian-devel/2011/01/msg00676.html
-- Silvio
security@lists.fedoraproject.org