Hello all, it took us a few years, but we are finally getting rid of the PDC project. Thanks to the ARC research, we have identified the use cases in our tooling and proposed a solution.
The essential functionality currently provided by PDC will be re-implemented in other applications within our release infrastructure, as there are no immediate plans for their replacement and they are currently maintained.
This work is expected to take several months. Before we start, we would like to share our proposed solution with all of you and gather your feedback.
Below, we outline our strategy to preserve the core functionality of PDC by leveraging existing applications within our ecosystem.
Current uses of PDC:
Currently, we rely on the Product Definition Center (PDC) for various data management tasks, including:
1. Critical Path Package Tracking: Bodhi leverages PDC to track packages on the critical path.
2. Retirement of Packages and Service Level Agreements (SLAs): PDC assists in managing the retirement of packages and their associated SLAs.
3. Metadata for Nightly Composes: Our Release Engineering and Fedora Quality Assurance teams rely on PDC for metadata related to nightly composes.
More info on the usage can be found here: https://fedora-arc.readthedocs.io/en/latest/pdc/users.html
Specific Endpoints in Use:
We interact with the following endpoints in PDC to access and manage our data:
/rest_api/v1/global-components: This endpoint stores package names.
/rest_api/v1/component-branches: Here, we store SLAs, track active/retired status, and manage the critical-path flag.
/rest_api/v1/component-branch-slas: An auxiliary endpoint dedicated to SLAs.
/rest_api/v1/compose-images: This endpoint stores essential files, such as composeinfo.json and image-manifest.json.
/rest_api/v1/releases: It houses metadata on various release types (e.g., GA, updates, EUS, AUS, ELS, fast, updates-testing) for product versions, indicating their active status.
/rest_api/v1/rpms: This endpoint establishes links between RPM packages and composes.
/rest_api/v1/product-versions: Here, we store major versions of Fedora.
More info on the endpoints: https://fedora-arc.readthedocs.io/en/latest/pdc/endpoints.html
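As a rough illustration of how the endpoints listed above are addressed, here is a minimal sketch that only builds query URLs and performs no network access. The base URL is the public PDC instance; the helper name `pdc_url` and the idea of passing filters as keyword arguments are assumptions made for this example, not part of any PDC client library.

```python
from urllib.parse import urlencode, urljoin

PDC_BASE = "https://pdc.fedoraproject.org/rest_api/v1/"

# Endpoint names copied from the list in this message.
ENDPOINTS = {
    "global-components",
    "component-branches",
    "component-branch-slas",
    "compose-images",
    "releases",
    "rpms",
    "product-versions",
}


def pdc_url(endpoint, **filters):
    """Build a PDC query URL (hypothetical helper, no request is sent)."""
    if endpoint not in ENDPOINTS:
        raise ValueError(f"unknown PDC endpoint: {endpoint}")
    url = urljoin(PDC_BASE, endpoint + "/")
    if filters:
        # Sort the filters so the resulting URL is deterministic.
        url += "?" + urlencode(sorted(filters.items()))
    return url
```

For example, `pdc_url("releases", version=39, name="Fedora")` yields a URL for the releases endpoint filtered by name and version. Which filters each endpoint accepts should be checked against the PDC API view itself.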
Upcoming Changes
Bodhi:
Bodhi will assume responsibility for the following tasks, reducing our reliance on PDC:
/rest_api/v1/releases/: Bodhi will now manage release-related data.
/rest_api/v1/component-branches/: Specifically, Bodhi will handle the critical-path flag.
Bodhi's existing framework aligns well with releases and components. To enhance this, we will create an auxiliary table that pairs this data with additional metadata, predominantly focusing on the critical-path flag. Previously, we had to query this information from PDC.
It's essential to note that PDC is not the definitive source of truth for critical-path packages. The Fedora Project Critical Path Package Wiki indicates that the source of truth lies within the Fedora Comps repository.
You can find specific information by searching for groups with the "critical-path-*" prefix, as demonstrated here.
While the data is accessible through DNF, generating it can be time-consuming. PDC serves as a pre-computed cache.
Previously, we followed this process to update it, but now we have transitioned to using this method.
The primary application of this information during the Fedora release cycle is in Bodhi, where it is used to enforce stricter requirements on critical-path package updates. For further details, please refer to this link.
Pagure-dist-git:
Pagure-dist-git will take over several responsibilities from PDC, including:
/rest_api/v1/product-versions
/rest_api/v1/global-components
/rest_api/v1/component-branches/
/rest_api/v1/component-branch-slas/
Pagure already has a robust database of global components (repositories) and product versions (repository branches).
It utilizes the PDC API to query component branches when a package is retired, and an auxiliary table in Pagure-dist-git will store the reasons for orphaning these components.
A list of all identified uses of PDC API can be found in the original ARC investigation: https://fedora-arc.readthedocs.io/en/latest/pdc/users.html
Projects not considered in the original ARC investigation:
MDapi
Toddlers
Toddlers took over the functionality of the fedscm-admin tool and is more or less a 1:1 rewrite of it, so the use cases should be the same as for fedscm-admin.
Remaining Endpoints:
A few endpoints will remain unchanged:
/rest_api/v1/compose-images/: Given that we primarily store JSON blobs here, we have decided, based on discussions, to store the JSON data on a network-accessible file server.
/rest_api/v1/rpms/: This endpoint does not currently have a dedicated application, but our Quality Assurance team uses it to query packages in specific composes.
CPE is scheduled to start work on this project in approximately two weeks. If you have any feedback on the proposed solution for decommissioning PDC, or if you have questions or need additional information, please don't hesitate to get in touch with us.
Your input and inquiries are greatly appreciated.
On Mon, 2023-09-04 at 16:51 +0200, Tomas Hrcka wrote:
Current uses of PDC:
Critical Path Package Tracking: Bodhi leverages PDC to track packages on the critical path.
/rest_api/v1/component-branches/: Specifically, Bodhi will handle the critical-path flag.
Bodhi's existing framework aligns well with releases and components. To enhance this, we will create an auxiliary table that pairs this data with additional metadata, predominantly focusing on the critical-path flag. Previously, we had to query this information from PDC.
It's essential to note that PDC is not the definitive source of truth for critical-path packages. The Fedora Project Critical Path Package Wiki indicates that the source of truth lies within the Fedora Comps repository.
You can find specific information by searching for groups with the "critical-path-*" prefix, as demonstrated here.
While the data is accessible through DNF, generating it can be time-consuming. PDC serves as a pre-computed cache.
Previously, we followed this process to update it, but now we have transitioned to using this method.
The primary application of this information during the Fedora release cycle is in Bodhi, where it is used to enforce stricter requirements on critical-path package updates. For further details, please refer to this link.
This is all done already. I did it months ago:
https://github.com/fedora-infra/bodhi/pull/4755 https://github.com/fedora-infra/bodhi/pull/4759 https://pagure.io/fedora-infra/ansible/pull-request/1294 https://pagure.io/fedora-infra/ansible/c/fea60aab95bd45960dbf4a0514c5df28a86... https://pagure.io/fedora-infra/ansible/c/842db118e88b987dbb5ab98f4d17556aba0...
We no longer use PDC for critical path information and, indeed, we dropped the code for doing so from Bodhi last month:
https://github.com/fedora-infra/bodhi/pull/5431
BTW, the email seems kind of weird. There are lots of places where it seems like it's referring to external information - "as demonstrated here", "Previously, we followed this process to update it, but now we have transitioned to using this method" - but it does not actually link to anything external, so I have no idea what the references are meant to be?
On Mon, Sep 04, 2023 at 04:51:22PM +0200, Tomas Hrcka wrote:
Current uses of PDC:
Currently, we rely on the Product Definition Center (PDC) for various data management tasks, including:
Critical Path Package Tracking: Bodhi leverages PDC to track packages on the critical path.
As Adam mentioned this is already not in pdc. ;)
Retirement of Packages and Service Level Agreements (SLAs): PDC assists in managing the retirement of packages and their associated SLAs.
Yeah. The super big one is that it's queried from a git commit hook for all src.fedoraproject.org git commits. Right now, if PDC is down, no one can commit anything.
Metadata for Nightly Composes: Our Release Engineering and Fedora Quality Assurance teams rely on PDC for metadata related to nightly composes.
More info on the usage can be found here: https://fedora-arc.readthedocs.io/en/latest/pdc/users.html
mass rebuild of modules can be dropped. ;)
fedscm-admin is now the scm requests toddler. It still uses pdc tho of course.
Specific Endpoints in Use:
...snip...
Upcoming Changes
Bodhi:
Bodhi will assume responsibility for the following tasks, reducing our reliance on PDC:
/rest_api/v1/releases/: Bodhi will now manage release-related data.
Do note that Bodhi still has a window after we are 'go' for a release where it thinks it's released, but it's not yet. We probably need to address this if we are moving this to Bodhi.
/rest_api/v1/component-branches/: Specifically, Bodhi will handle the critical-path flag.
Already done.
...snip...
Pagure-dist-git:
Pagure-dist-git will take over several responsibilities from PDC, including:
/rest_api/v1/product-versions
/rest_api/v1/global-components
/rest_api/v1/component-branches/
/rest_api/v1/component-branch-slas/
Pagure already has a robust database of global components (repositories) and product versions (repository branches).
It utilizes the PDC API to query component branches when a package is retired, and an auxiliary table in Pagure-dist-git will store the reasons for orphaning these components.
So, I know this will work... but it means more closely tying ourselves to pagure-dist-git. ;(
With modules going out of the picture, most branches just have the release cycle of the fedora or rhel release they are based on, so couldn't we just default that somewhere?
There's also flatpaks, but I think we could also tie them to release eol's.
So, is it possible to just not keep these things?
A list of all identified uses of PDC API can be found in the original ARC investigation: https://fedora-arc.readthedocs.io/en/latest/pdc/users.html
Projects not considered in the original arc investigation:
MDapi
Toddlers
Toddlers took over the functionality of the fedscm-admin tool and it's more or less a 1:1 rewrite of the tool, use cases should be the same as fedscm-admin.
yeah.
Remaining Endpoints:
A few endpoints will remain unchanged:
/rest_api/v1/compose-images/: Given that we primarily store JSON blobs here, we have decided, based on discussions, to store the JSON data on a network-accessible file server.
What server? Where? I think the only thing that uses this is fedfind?
I really suggest at the start of this work, we just plan out exactly what changes before doing anything. (ie, merge this exact PR that changes this).
kevin
On Tue, Sep 05, 2023 at 11:35:19AM -0700, Kevin Fenzi wrote:
With modules going out of the picture, most branches just have the release cycle of the fedora or rhel release they are based on, so couldn't we just default that somewhere?
In the pkgdb days, the EOL status was simply computed from the release status, i.e. what we still have at https://admin.fedoraproject.org/pkgdb/api/collections (looks like we should fix the branchname in that JSON), so we could just go back to this :)
Pierre
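The idea of deriving a branch's status from the release it belongs to can be sketched roughly as follows. This is only an illustration: the status table, the branch naming patterns (`fNN`, `epelN`, `rawhide`), and the function names are assumptions for the example, and real data would come from whichever service ends up holding release information.

```python
import re

# Hypothetical release status table, keyed by (product, version).
RELEASE_STATUS = {
    ("fedora", "rawhide"): "active",
    ("fedora", "38"): "active",
    ("fedora", "37"): "eol",
    ("epel", "9"): "active",
}


def release_for_branch(branch):
    """Map a dist-git branch name to (product, version).

    Returns None for branches that do not follow a release,
    such as module stream branches.
    """
    if branch == "rawhide":
        return ("fedora", "rawhide")
    match = re.fullmatch(r"f(\d+)", branch)
    if match:
        return ("fedora", match.group(1))
    match = re.fullmatch(r"epel(\d+)", branch)
    if match:
        return ("epel", match.group(1))
    return None


def branch_is_active(branch, statuses=RELEASE_STATUS):
    """Default a branch's status to the status of its release."""
    release = release_for_branch(branch)
    if release is None:
        # Stream branches have no release to default from.
        raise LookupError(f"no release known for branch {branch!r}")
    return statuses.get(release) == "active"
```

Branches that do not map to a release raise an error here, which makes explicit that they are the only case still needing per-branch data.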
Sorry for the confusion about work that is already done. We can drop the critpath part, thanks Adam!
As for EOL and package retirement: for the past few releases we have been saving the EOL date in Bodhi, so getting the EOL for a specific release is not a problem once the release is out.
As for storing the orphaning reason and other potential metadata, we can store some of it in git in the form of notes on branches, not necessarily in the pagure-dist-git code base.
With toddlers, I think the path is clear: we need to use Bodhi as the source of truth about releases. Work similar to that on toddlers will need to be done on mdapi.
For the compose metadata, we can store the JSON blobs on fedorapeople for now and look for a more stable place.
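To illustrate using a stored EOL date as the source of truth, here is a minimal sketch. It assumes each release is available as a dictionary with an `eol` field holding an ISO date string or `None`; that shape and the sample values are assumptions for the example and should be checked against what Bodhi actually returns.

```python
from datetime import date


def is_eol(release, today):
    """Return True if the release's recorded EOL date has passed.

    A release without a recorded EOL date is treated as not EOL,
    which covers releases that are not out yet.
    """
    eol = release.get("eol")
    if eol is None:
        return False
    return date.fromisoformat(eol) < today


# Made-up sample records for illustration only.
SAMPLE_RELEASES = [
    {"name": "F37", "branch": "f37", "eol": "2023-12-05"},
    {"name": "F40", "branch": "f40", "eol": None},
]
```

With these samples, `is_eol(SAMPLE_RELEASES[0], date(2024, 1, 1))` is true, while the record without a date is never reported as EOL.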
_______________________________________________
devel mailing list -- devel@lists.fedoraproject.org
To unsubscribe send an email to devel-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@lists.fedoraproject.org
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
On Mon, Sep 11, 2023 at 03:08:50PM +0200, Tomas Hrcka wrote:
Sorry for the confusion about work that is already done. We can drop the critpath part, thanks Adam!
As for EOL and package retirement: for the past few releases we have been saving the EOL date in Bodhi, so getting the EOL for a specific release is not a problem once the release is out.
yeah, the reason we needed it in pdc before was stream branches.
I think once flatpaks are moved to the new setup we won't have any _new_ stream branches. However, if we are going to support updating modules for f37/f38, we may need to figure out something there...
As for storing the orphaning reason and other potential metadata, we can store some of it in git in the form of notes on branches, not necessarily in the pagure-dist-git code base.
yeah, I think moving some of this that makes sense into git is reasonable.
With toddlers, I think the path is clear: we need to use Bodhi as the source of truth about releases. Work similar to that on toddlers will need to be done on mdapi.
For the compose metadata, we can store the JSON blobs on fedorapeople for now and look for a more stable place.
I don't think we should use fedorapeople for anything like this. If we need just a space we could use /pub/alt/something/ ?
These are the things that fedfind/QA uses? Do we have examples of this data?
Thanks for working on this!
kevin
On Mon, 2023-09-11 at 09:40 -0700, Kevin Fenzi wrote:
These are the things that fedfind/QA uses? Do we have examples of this data?
Okay, here is (again) a list of the stuff fedfind uses. You can see sample data for each compose in the web UI itself, PDC actually has a rather good interactive API view.
1. WHAT: https://pdc.fedoraproject.org/rest_api/v1/composes/
   HOW: ?release=fedora-NN&compose_label=label
        ?compose_id=cid
   WHY: To find a compose's compose ID from its label, or a compose's label from its compose ID. These functions are used on various paths. The most important one in real life is probably when we are creating and updating validation event pages for candidate composes, between the time the compose exists and the time it is synced to alt.fp.o - during this time wikitcms may need to look the compose up by label, but for fedfind to actually find it, cid_from_label has to work, because it can only 'natively' find a compose by label if it's been synced to alt.fp.o; if the compose has not yet been synced, the only way fedfind can find the compose is to use cid_from_label then locate it on kojipkgs under its compose ID.
   WORKAROUND: it would be difficult to work around this if we did not have the ability. For wikitcms we could try 'embedding' the compose ID in the event pages somehow. For some other less important uses there is probably no workaround.
2. WHAT: https://pdc.fedoraproject.org/rest_api/v1/compose-images/
   HOW: compose-images/(composeid)
   WHY: To provide image metadata for nightly composes that have been garbage-collected, and releases in the mirror system after they have been split across directories and their metadata stripped.
   WORKAROUND: if there is no store of metadata by compose ID then fedfind just can't do this any more. It's not critical functionality to anything important I'm aware of, though - it's just a nice feature that can be used to e.g. run long-term analysis of image size changes across multiple releases, stuff like that. If this goes away I will just drop this feature from fedfind and you will no longer be able to find the real metadata for these composes (it would provide synthesized metadata for mirrored releases, but not garbage-collected nightlies). Note, retrieving the original metadata for mirrored releases depends on the cid_from_label feature (see 1).
3. WHAT: https://pdc.fedoraproject.org/rest_api/v1/releases/
   HOW: ?version=39&name=Fedora (for e.g.)
   WHY: To find the compose that was "previous" to any given compose. The first result for this PDC query for any given release is a dict with a handy 'compose_set' key whose value is a list of all the composes for that release, in reverse order by date. So we can easily find the compose that came 'before' any given compose. This is, I *think*, only used by the 'check-compose' script which sends out those "compose check report" emails.
   WORKAROUND: fedfind has a hideous alternative implementation of this which involves just blindly trying possible decrements and seeing if they exist. e.g. if you ask for the compose previous to Fedora-39-20230911.n.1, it will try 20230911.n.0, if that doesn't exist, it will try 20230910.n.5 (we start counting backwards from 5 because, hey, gotta start somewhere, and doing more than 5 composes on one day is unlikely), then 20230910.n.4, and so on, until it hits something that exists. So, if we can't get a nice data set like this any more...we'll just have to use that. Bleh. We might also just drop check-compose and kill the feature, I suppose. I don't pay as much attention to those reports as I used to.
4. WHAT: https://pdc.fedoraproject.org/rest_api/v1/rpms/
   HOW: ?compose=cid&arch=src&name=^package1$&name=^package2$....
   WHY: To get the NEVR (name, epoch, version, release) for a given set of packages from a compose. This can be used by relvalconsumer (the thing that creates the wiki validation events) to decide whether any "important" packages changed between the last tested compose and the compose it's currently deciding whether to create a validation event for: if it's been more than 3 days but less than 14 days since the last event, it will create a new event *if* any important package's version changed.
   WORKAROUND: Looking back on the history of this, I don't think anything's actually using it right now. There is an ugly alternative version of this one, too, which scrapes dl.fedoraproject.org's HTML output (told you it was ugly!). I initially replaced that version with the PDC one for some compose types, but then ran into problems with that, and ultimately decided to provide both side-by-side; I think relvalconsumer is the only thing that actually uses this feature, and it currently uses the ugly HTML scraping version, not the PDC version. I probably meant to try and make it use the PDC version but never got around to it. The ugly version is working okay in practice, so I suppose the workaround here would just be to drop the pretty PDC version and rely on HTML scraping forever (sob).
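The "blindly try decrements" fallback described in item 3 can be sketched roughly like this. It is not fedfind's actual code: the function names, the `exists` callback, the `max_days` cut-off, and the assumption that compose IDs look like `Fedora-39-20230911.n.1` are all made up for the illustration.

```python
import re
from datetime import datetime, timedelta

# Assumed compose ID shape: <prefix>-<YYYYMMDD>.<type>.<respin>
COMPOSE_RE = re.compile(
    r"^(?P<prefix>.+)-(?P<date>\d{8})\.(?P<type>[a-z]+)\.(?P<respin>\d+)$"
)


def candidates(compose_id, max_days=14, max_respin=5):
    """Yield possible earlier compose IDs, newest first."""
    match = COMPOSE_RE.match(compose_id)
    if not match:
        raise ValueError(f"unrecognised compose ID: {compose_id}")
    prefix, ctype = match["prefix"], match["type"]
    day = datetime.strptime(match["date"], "%Y%m%d").date()
    # Lower respins on the same day come first.
    for respin in range(int(match["respin"]) - 1, -1, -1):
        yield f"{prefix}-{day:%Y%m%d}.{ctype}.{respin}"
    # Then walk back one day at a time, counting down from max_respin.
    for _ in range(max_days):
        day -= timedelta(days=1)
        for respin in range(max_respin, -1, -1):
            yield f"{prefix}-{day:%Y%m%d}.{ctype}.{respin}"


def previous_compose(compose_id, exists, **limits):
    """Return the first candidate for which exists(candidate) is true."""
    for candidate in candidates(compose_id, **limits):
        if exists(candidate):
            return candidate
    return None
```

Here `exists` stands in for whatever check is used to see whether a compose is present, for example a request against the compose location. A stored list of composes per release would replace all of this with a simple lookup.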
I thought this was exactly the JSON blob we will implement. If you look at the model of PDC: https://product-definition-center.github.io/product-definition-center/_image...
Those endpoints are covered by the relations Release, Compose, and RPM, through some others. The PDC approach is more or less a normalized RDBMS model. This should be replaceable by a much simpler denormalized approach that creates one entry for each compose, with all the data in one place.
It is also mentioned in the arc docs https://fedora-arc.readthedocs.io/en/latest/pdc/users.html#fedoraqa-fedfind
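A denormalized "one entry per compose" record could look roughly like the sketch below. The class names, field names, and lookup methods are all hypothetical and only meant to show that the lookups discussed in this thread (compose ID from label, previous compose, per-compose images and RPMs) can be served from a single flat record.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class ComposeRecord:
    """Everything known about one compose, stored in one place."""
    compose_id: str
    release: str
    label: Optional[str] = None
    images: list = field(default_factory=list)  # image metadata entries
    rpms: dict = field(default_factory=dict)    # package name -> NEVR


class ComposeStore:
    """In-memory stand-in for wherever the records end up living."""

    def __init__(self):
        self._records = []  # kept in the order added, oldest first

    def add(self, record):
        self._records.append(record)

    def cid_from_label(self, release, label):
        for rec in self._records:
            if rec.release == release and rec.label == label:
                return rec.compose_id
        return None

    def previous(self, compose_id):
        """Return the ID of the compose added before this one in the same release."""
        release = next(
            rec.release for rec in self._records if rec.compose_id == compose_id
        )
        earlier = None
        for rec in self._records:
            if rec.compose_id == compose_id:
                return earlier
            if rec.release == release:
                earlier = rec.compose_id
        return earlier

    def to_json(self, compose_id):
        for rec in self._records:
            if rec.compose_id == compose_id:
                return json.dumps(asdict(rec), sort_keys=True)
        raise KeyError(compose_id)
```

Each record serializes to a single JSON document, so the same structure could be written out as one file per compose if a plain file store is preferred over a database.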
On Tue, Sep 12, 2023 at 2:18 AM Adam Williamson adamwill@fedoraproject.org wrote:
On Mon, 2023-09-11 at 09:40 -0700, Kevin Fenzi wrote:
These are the things that fedfind/qa users? Do we have examples of this data?
Okay, here is (again) a list of the stuff fedfind uses. You can see sample data for each compose in the web UI itself, PDC actually has a rather good interactive API view.
1. WHAT: https://pdc.fedoraproject.org/rest_api/v1/composes/
   HOW: ?release=fedora-NN&compose_label=label
        ?compose_id=cid
   WHY: To find a compose's compose ID from its label, or a compose's label from its compose ID. These functions are used on various paths. The most important one in real life is probably when we are creating and updating validation event pages for candidate composes, between the time the compose exists and the time it is synced to alt.fp.o - during this time wikitcms may need to look the compose up by label, but for fedfind to actually find it, cid_from_label has to work, because it can only 'natively' find a compose by label if it's been synced to alt.fp.o; if the compose has not yet been synced, the only way fedfind can find the compose is to use cid_from_label then locate it on kojipkgs under its compose ID.
   WORKAROUND: it would be difficult to work around this if we did not have the ability. For wikitcms we could try 'embedding' the compose ID in the event pages somehow. For some other less important uses there is probably no workaround.

2. WHAT: https://pdc.fedoraproject.org/rest_api/v1/compose-images/
   HOW: compose-images/(composeid)
   WHY: To provide image metadata for nightly composes that have been garbage-collected, and releases in the mirror system after they have been split across directories and their metadata stripped.
   WORKAROUND: if there is no store of metadata by compose ID then fedfind just can't do this any more. It's not critical functionality to anything important I'm aware of, though - it's just a nice feature that can be used to e.g. run long-term analysis of image size changes across multiple releases, stuff like that. If this goes away I will just drop this feature from fedfind and you will no longer be able to find the real metadata for these composes (it would provide synthesized metadata for mirrored releases, but not garbage-collected nightlies). Note, retrieving the original metadata for mirrored releases depends on the cid_from_label feature (see 1).

3. WHAT: https://pdc.fedoraproject.org/rest_api/v1/releases/
   HOW: ?version=39&name=Fedora (for e.g.)
   WHY: To find the compose that was "previous" to any given compose. The first result for this PDC query for any given release is a dict with a handy 'compose_set' key whose value is a list of all the composes for that release, in reverse order by date. So we can easily find the compose that came 'before' any given compose. This is, I *think*, only used by the 'check-compose' script which sends out those "compose check report" emails.
   WORKAROUND: fedfind has a hideous alternative implementation of this which involves just blindly trying possible decrements and seeing if they exist. e.g. if you ask for the compose previous to Fedora-39-20230911.n.1, it will try 20230911.n.0, if that doesn't exist, it will try 20230910.n.5 (we start counting backwards from 5 because, hey, gotta start somewhere, and doing more than 5 composes on one day is unlikely), then 20230910.n.4, and so on, until it hits something that exists. So, if we can't get a nice data set like this any more...we'll just have to use that. Bleh. We might also just drop check-compose and kill the feature, I suppose. I don't pay as much attention to those reports as I used to.

4. WHAT: https://pdc.fedoraproject.org/rest_api/v1/rpms/
   HOW: ?compose=cid&arch=src&name=^package1$&name=^package2$....
   WHY: To get the NEVR (name, epoch, version, release) for a given set of packages from a compose. This can be used by relvalconsumer (the thing that creates the wiki validation events) to decide whether any "important" packages changed between the last tested compose and the compose it's currently deciding whether to create a validation event for: if it's been more than 3 days but less than 14 days since the last event, it will create a new event *if* any important package's version changed.
   WORKAROUND: Looking back on the history of this, I don't think anything's actually using it right now. There is an ugly alternative version of this one, too, which scrapes dl.fedoraproject.org's HTML output (told you it was ugly!). I initially replaced that version with the PDC one for some compose types, but then ran into problems with that, and ultimately decided to provide both side-by-side; I think relvalconsumer is the only thing that actually uses this feature, and it currently uses the ugly HTML scraping version, not the PDC version. I probably meant to try and make it use the PDC version but never got around to it. The ugly version is working okay in practice, so I suppose the workaround here would just be to drop the pretty PDC version and rely on HTML scraping forever (sob).

--
Adam Williamson (he/him/his) Fedora QA
Fedora Chat: @adamwill:fedora.im | Mastodon: @adamw@fosstodon.org
https://www.happyassassin.net
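For illustration only (this is not fedfind's actual code), the decrement-and-probe fallback described in the WORKAROUND for the releases endpoint could be sketched roughly as follows. The function name, the `exists` callback and the `MAX_DAYS_BACK` cut-off are assumptions made for this sketch; the message above only specifies the order in which candidates are tried.

```python
from datetime import date, timedelta
from typing import Callable, Optional

MAX_RESPIN = 5      # per the description: counting backwards starts at 5
MAX_DAYS_BACK = 30  # assumption: give up eventually instead of probing forever


def previous_compose_id(
    compose_id: str,
    exists: Callable[[str], bool],
) -> Optional[str]:
    """Guess the compose before `compose_id` by probing earlier candidates.

    `exists` is a hypothetical callback that reports whether a compose ID
    can be found (for example by checking a compose server).
    """
    # "Fedora-39-20230911.n.1" -> "Fedora-39" and "20230911.n.1"
    prefix, rest = compose_id.rsplit("-", 1)
    datestr, ctype, respin = rest.split(".")
    day = date(int(datestr[:4]), int(datestr[4:6]), int(datestr[6:8]))

    # First try lower respins on the same day.
    candidates = [(day, r) for r in range(int(respin) - 1, -1, -1)]
    # Then walk back one day at a time, trying respins 5, 4, ..., 0.
    for back in range(1, MAX_DAYS_BACK + 1):
        earlier = day - timedelta(days=back)
        candidates.extend((earlier, r) for r in range(MAX_RESPIN, -1, -1))

    for cand_day, cand_respin in candidates:
        cid = f"{prefix}-{cand_day:%Y%m%d}.{ctype}.{cand_respin}"
        if exists(cid):
            return cid
    return None
```

Every miss costs one probe, which is why an ordered list of composes per release (like the 'compose_set' key) is the nicer data source.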
On Wed, 2023-09-13 at 16:49 +0200, Tomas Hrcka wrote:
I thought this was exactly the JSON blob we will implement. If you look at the model of PDC: https://product-definition-center.github.io/product-definition-center/_image...
Those endpoints are covered by the relations Release, Compose, and RPM, through some others. The PDC approach is more or less a normalized RDBMS model. It should be replaceable by a much simpler denormalized approach that creates one entry for each compose, with all the data in one place.
That's missing the 'composes' endpoint use (#1 in my list below). Not sure if that's your fault or mine, sorry if I left it out of a previous list.
Does your replacement plan cover the "previous_release" use case, where we need a list of all the composes for each release in the order they happened?
Thanks!
Ok, now we are on the same page.
On Wed, Sep 13, 2023 at 8:04 PM Adam Williamson adamwill@fedoraproject.org wrote:
That's missing the 'composes' endpoint use (#1 in my list below). Not sure if that's your fault or mine, sorry if I left it out of a previous list.
Does your replacement plan cover the "previous_release" use case, where we need a list of all the composes for each release in the order they happened?
Yes, the idea was to have the compose_id as the document root, since it is unique for each compose. Something like this:
Fedora-Rawhide-20230913.n.0: {
    compose_related_metadata: [metadata0: metadata, metadata1: metadata]
    rpms: [nvr: other_rdata, nvr1: other_rdata1]
    images: {Fedora-Sway-Live-x86_64-Rawhide-20230913.n.0.iso: "some metadata"}
}
If we store the data in the right format for the fedfind use case, we can minimize the number of operations needed compared to the REST API.
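As a rough sketch of that layout (an illustration under assumptions, not the planned implementation), one entry per compose keyed by compose ID could look like this in Python. The helper name `build_compose_document` is hypothetical, the three key names are taken from the example above, and the values are the same placeholders, not real metadata.

```python
import json
from typing import Any, Dict


def build_compose_document(
    compose_id: str,
    composeinfo: Dict[str, Any],
    rpms: Dict[str, Any],
    images: Dict[str, Any],
) -> Dict[str, Dict[str, Any]]:
    """Bundle everything known about one compose under its compose ID."""
    return {
        compose_id: {
            "compose_related_metadata": composeinfo,
            "rpms": rpms,
            "images": images,
        }
    }


# Hypothetical in-memory store standing in for whatever backend is chosen.
store: Dict[str, Dict[str, Any]] = {}
store.update(
    build_compose_document(
        "Fedora-Rawhide-20230913.n.0",
        composeinfo={"metadata0": "metadata", "metadata1": "metadata"},
        rpms={"nvr": "other_rdata", "nvr1": "other_rdata1"},
        images={
            "Fedora-Sway-Live-x86_64-Rawhide-20230913.n.0.iso": "some metadata"
        },
    )
)

# A single lookup by compose ID returns all data for that compose.
doc = store["Fedora-Rawhide-20230913.n.0"]
print(json.dumps(doc, indent=2))
```

With this shape, a consumer that already knows the compose ID needs one read instead of separate queries per endpoint.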
I will start working on this on the 15th. My idea is to start with the data format and proceed once that is sorted out.
Just a friendly reminder: we will not drop PDC suddenly. The plan is to implement the new solution and run it in parallel with PDC for a while. Once we are sure everything is working, we will start to sunset PDC.
Thanks!
On Thu, 2023-09-14 at 07:41 +0200, Tomas Hrcka wrote:
Ok, now we are on the same page.
On Wed, Sep 13, 2023 at 8:04 PM Adam Williamson adamwill@fedoraproject.org wrote:
That's missing the 'composes' endpoint use (#1 in my list below). Not sure if that's your fault or mine, sorry if I left it out of a previous list.
Does your replacement plan cover the "previous_release" use case, where we need a list of all the composes for each release in the order they happened?
Yes, the idea was to have the compose_id as the document root, since it is unique for each compose. Something like this:
Fedora-Rawhide-20230913.n.0: {
    compose_related_metadata: [metadata0: metadata, metadata1: metadata]
    rpms: [nvr: other_rdata, nvr1: other_rdata1]
    images: {Fedora-Sway-Live-x86_64-Rawhide-20230913.n.0.iso: "some metadata"}
}
Sure, that works. These dicts should just be straight dumps of the metadata files from the composes - composeinfo.json, rpms.json and images.json - e.g. the ones at https://kojipkgs.fedoraproject.org/compose/rawhide/Fedora-Rawhide-20230913.n... . I'm pretty sure that's what they are in PDC already.
Note, I still don't think this covers my case #1 exactly: well, it would probably make cid_to_label possible, but as described, I'm not sure it would make label_to_cid possible. We can query PDC with a release number and a label - a label is something like Beta-1.1 - and it will return a single compose, from which we can read the compose ID. If the proposal is essentially just a JSON dict whose keys are compose IDs...well, I suppose technically we could grab the whole thing and do some operations on it to find the compose with a given label, but it's a bit more awkward than just querying. And if it's just a flat dict of every compose ever, presumably it'll be paged, and finding the one you want might take a while?
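To make the concern concrete, here is a minimal sketch of what a client (or the service itself) would have to do to support label lookups on top of a dict keyed only by compose ID: build a secondary index from (release, label) to compose ID. Everything here is an assumption for illustration: the `release` and `label` field names, the sample entries, and the function names are made up and are not part of PDC, fedfind, or the proposal.

```python
from typing import Any, Dict, Optional, Tuple

Store = Dict[str, Dict[str, Any]]
LabelIndex = Dict[Tuple[str, str], str]

# Made-up sample data; nightly composes carry no label in this sketch.
STORE: Store = {
    "Example-39-20230911.n.0": {"release": "fedora-39", "label": None},
    "Example-39-20230912.0": {"release": "fedora-39", "label": "Beta-1.1"},
    "Example-38-20230301.0": {"release": "fedora-38", "label": "Beta-1.1"},
}


def build_label_index(store: Store) -> LabelIndex:
    """Scan the whole store once and map (release, label) -> compose ID."""
    index: LabelIndex = {}
    for cid, doc in store.items():
        label = doc.get("label")
        if label:
            index[(doc["release"], label)] = cid
    return index


def cid_from_label(index: LabelIndex, release: str, label: str) -> Optional[str]:
    """Label -> compose ID: needs the secondary index (or a full scan)."""
    return index.get((release, label))


def label_from_cid(store: Store, cid: str) -> Optional[str]:
    """Compose ID -> label: a direct lookup, since the ID is the key."""
    doc = store.get(cid)
    return doc.get("label") if doc else None


index = build_label_index(STORE)
print(cid_from_label(index, "fedora-39", "Beta-1.1"))
print(label_from_cid(STORE, "Example-39-20230912.0"))
```

The asymmetry is the point: compose ID to label is a single key lookup, while label to compose ID needs either a server-side query or an index like this that has to be built from the full data set.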
On Mon, Sep 04, 2023 at 04:51:22PM +0200, Tomas Hrcka wrote:
Critical Path Package Tracking: Bodhi leverages PDC to track packages on the critical path.
Retirement of Packages and Service Level Agreements (SLAs): PDC assists in managing the retirement of packages and their associated SLAs.
Hmm, what SLAs? With the removal of modularity, the idea of SLAs also died. Packages are supported until the release for which they are built goes EOL. So this whole idea and related functionality is not needed.
Metadata for Nightly Composes: Our Release Engineering and Fedora Quality Assurance teams rely on PDC for metadata related to nightly composes.
Zbyszek