This is an updated version of the DNF UUID proposal based on feedback
received on the mailing list.
== Summary ==
Right now, we estimate installed Fedora systems by counting unique IP
addresses which show up in our updates mirror statistics. We need
better data than that. There are some proposals for more complicated
systems, but a quick thing we can
This is an update of a
previous proposal to use a UUID to distinguish unique systems, as
openSUSE does (see https://metrics.opensuse.org/). See also
this previous Fedora Council discussion and
this devel list thread.
== Owner ==
* Name: [[User:mattdm|Matthew Miller]]
* Email: mattdm
== Detailed Description ==
=== The problem ===
* A. Currently, we can only count Fedora OS use by observing IP
addresses. This is subject to undercounting due to NAT — and to
overcounting due to short DHCP leases and laptops moving between work
or school and home or coffee shop.
* B. We can count what releases are observed, but we can’t distinguish variants.
* C. We can’t count quickly because various logs are copied back to a
central server and data is not consistent for several days.
=== Constraints ===
* The Fedora community cares about privacy and is averse to tracking
measures. We don't want to track; just count.
* For this reason, we don’t want to use any identifier like
/etc/machine-id, which may be used for other purposes, or in fact any
UUID at all.
* And, also for that reason, there needs to be a relatively easy way to opt out.
* This needs to work with Yum/DNF, MicroDNF, PackageKit, Cockpit,
rpm-ostree, GNOME Software, Muon, and software update mechanisms used
in other spins.
* We need to be able to distinguish between short-lived instances
(like temporary containers or test machines) and actual installations.
=== Non-Goals ===
* We don’t want to track users, just count systems.
* Except for distinguishing temporary installations from “real” use,
we don’t need to track systems over time. We just want a daily or
weekly moment-in-time count.
* Being able to see how systems are upgraded over time might be
interesting but isn’t as important as privacy concerns.
== Proposal ==
# Add VARIANT_ID (see [[Changes/Label Our Variants]]) to the string
reported when metadata is requested from Fedora update servers
# Current requests include machine architecture and Fedora OS version
as part of the path; we may want to also put those in a standard
format for easy processing (implementation detail)
# Add a new "countme" variable. This variable will:
#* Start as a "true" value,
#* Reset to a "false" value the first time the client successfully
makes a request to Fedora mirror servers, and
#* Be reset to a "true" value after seven days.
This way, rather than filtering by unique IP addresses, we can count
only the "true" requests, so each machine is counted once, and no more
than once, in each seven-day period.
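The "countme" behaviour described above can be sketched in a few lines. This is only an illustration under stated assumptions, not DNF's implementation: the class <code>CountmeState</code>, the function <code>count_systems</code>, and the dictionary-shaped request records are hypothetical names invented for this example, and the sketch assumes the client can remember when it was last counted.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional

RESET_INTERVAL = timedelta(days=7)  # "after seven days" in the proposal


@dataclass
class CountmeState:
    """Hypothetical client-side state; not an actual DNF data structure."""

    last_counted: Optional[datetime] = None  # None: never counted so far

    def flag(self, now: datetime) -> bool:
        """Return True if the next request should ask to be counted."""
        if self.last_counted is None:
            return True  # the variable starts as "true"
        # It becomes "true" again once seven days have passed.
        return now - self.last_counted >= RESET_INTERVAL

    def record_success(self, now: datetime) -> None:
        """Call after a successful request; a sent "true" flips to "false"."""
        if self.flag(now):
            self.last_counted = now


def count_systems(requests: Iterable[dict]) -> int:
    """Server side: count only the requests that carried a true flag."""
    return sum(1 for request in requests if request.get("countme"))


# One simulated machine fetching metadata once a day for two weeks.
state = CountmeState()
start = datetime(2024, 1, 1)
log = []
for day in range(14):
    now = start + timedelta(days=day)
    log.append({"countme": state.flag(now)})
    state.record_success(now)

print(count_systems(log))  # 14 requests, but the machine is counted twice
```

In this simulation the machine sends fourteen requests and is counted on day 0 and day 7 only, which is the once-per-week behaviour the proposal aims for, independent of the IP address the requests come from.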
=== Options for "true" values ===
Rather than a simple boolean, we'd like the "countme" variable to act
as an increment-counter. That is, it would be "1" the first week, "2"
the second week, "3" the third week, and so on. This will let us sort
out short-lived test or CI infrastructure machines and get a better
picture of how systems are used over time, without tracking individual
systems. Optionally, we could have a cap on the maximum value to
mitigate risk of uniqueness for systems which have been running for a
very long time (it may be that there are only a few systems running
for exactly 327 weeks, for example). As the supported lifetime of a
Fedora release is about 13 months, a logical cutoff would be around 60
weeks: the counter could go from "59" to "old".
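A minimal sketch of the capped counter follows, assuming the cutoff of 60 weeks suggested above. The names <code>countme_label</code> and <code>CAP_WEEKS</code> are hypothetical, and the exact cap is an open option in this proposal rather than a decided value.

```python
from collections import Counter

CAP_WEEKS = 60  # assumed cutoff; the proposal only says "around 60 weeks"


def countme_label(week: int) -> str:
    """Map a 1-based week number to the value the client would report.

    Weeks 1 through 59 report their own number; anything from the cap
    onwards collapses into "old" so long-running systems do not stand out.
    """
    if week < 1:
        raise ValueError("week numbers start at 1")
    return str(week) if week < CAP_WEEKS else "old"


# Server side: a histogram of reported values, with no per-system history.
reported_weeks = [1, 1, 2, 59, 60, 327]
histogram = Counter(countme_label(week) for week in reported_weeks)
print(histogram)
```

With this mapping a system in its 327th week reports the same value as one in its 60th, so a rare uptime no longer identifies an individual machine, while the low values still separate short-lived test or CI instances from longer-term installations.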
== Benefit to Fedora ==
* Better metrics overall
* Public stats page updated automatically
* Better knowledge of relative use of different variants
* Insight into Fedora's use in short-lived test systems and temporary
containers vs. longer-term installations
== Scope ==
* Proposal owners: work with DNF team and infrastructure to implement
the countme feature and corresponding backend data collection
* DNF team: feature work
* Maintainers of other package management tools: make sure feature
works in these cases as well
* Other developers: Spin maintainers should make sure that VARIANT_ID
is being set in /etc/os-release
* Release engineering: may need changes to fedora-repos package
** List of deliverables: affects all deliverables
* Policies and guidelines: none
* Trademark approval: none
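For the Scope item asking spin maintainers to check that VARIANT_ID is set in /etc/os-release, a small checker could look like the sketch below. It parses the plain <code>KEY=value</code> format of that file; the function names and the sample content are illustrative assumptions, not taken from any Fedora tool, and the parser ignores the shell-style escape sequences the format also allows.

```python
from typing import Dict, Optional


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse os-release style KEY=value lines into a dictionary."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        fields[key.strip()] = value
    return fields


def read_variant_id(text: str) -> Optional[str]:
    """Return VARIANT_ID, or None when the variant does not set it."""
    return parse_os_release(text).get("VARIANT_ID")


# Sample content for illustration; on a real system, read /etc/os-release.
SAMPLE = '''NAME="Fedora Linux"
VERSION_ID=39
VARIANT="Workstation Edition"
VARIANT_ID=workstation
'''

print(read_variant_id(SAMPLE))
```

A result of <code>None</code> would indicate a variant that cannot yet be distinguished in the statistics.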
== Upgrade/compatibility impact ==
Older versions will not have the counting enabled; we will keep
collecting stats in the traditional way for those systems.
== How To Test ==
Once the system is in place, we will see data collected.
== User Experience ==
User experience will not change. Users who wish to opt out of counting
will have an easy way to do so.
== Dependencies ==
== Contingency Plan ==
* Contingency mechanism: continue counting the old way
* Contingency deadline: does not block release; we can ship with the
feature incomplete, although it would certainly be most useful to have
it available at GA
* Blocks release? No
* Blocks product? No
== Documentation ==
Release notes need to be written, along with documentation describing how to opt out.
== Release Notes ==
This needs to be written but depends on exact implementation.