----- Original Message -----
From: "Sam Whited" sam@samwhited.com To: "rprichard via golang-dev" golang-dev@googlegroups.com Sent: Monday, March 4, 2019 7:06:37 PM Subject: Re: [golang-dev] proposal: public module authentication with the Go notary
TL;DR — Right now when fetching a module Google isn't involved, and I'd like it to stay that way. Please let me configure and use different notaries without also using a proxy.
There is no plan to allow use of alternate notaries, which would add complexity and potentially reduce the overall security of the system, allowing different users to be attacked by compromising different notaries.
If you're somewhere that Google doesn't service (e.g. due to U.S. export laws), does this mean you're entirely out of luck and can't use the new security features? It seems unfortunate that a developer in Iran would have to fall back to the TOFU model, or jump through extra hoops to configure or run a proxy, just because Google decided to bless themselves and not allow anyone else to run a notary.
We originally considered having multiple notaries signing individual go.sum entries and requiring the go command to collect signatures from a quorum of notaries before accepting an entry. That design depended on the uptime of multiple services and could still be compromised undetectably by compromising enough notaries. That is, that design would blindly trust a quorum of notaries.
This seems strictly superior to blindly trusting a single notary. Why does Google think it will be more secure than Google and Mozilla, or Google and Microsoft, or Google and <your company>?
As far as uptime is concerned, you're always limited to the uptime of your least available notary. I don't necessarily think Google's uptime will be any better than that of any other large company that might choose to run a notary, so that seems fine: I have the option of trusting only notaries with high availability, meaning that it's likely no worse than if I just trusted Google.
The design presented here uses a transparent log to eliminate blind trust in a quorum of notaries and instead uses a “trust but verify” model with a single notary.
We could also "trust but verify" a quorum of notaries, so this seems like a false dichotomy.
On a more personal note, having a non-Google controlled notary seems like an absolute requirement to me and I would prefer to avoid using a Google run service entirely if possible. Google doesn't need to know anything about what modules my company is using. I do appreciate the privacy section, which addresses this, but tying the notary use to using a proxy or running your own proxy is likely out of reach for many individuals (including me).
—Sam
I have similar fears and worries here. The first thing that popped into my mind (personally, as an external observer) is that this is a move by Google to fully productize (data mine) the Go user community (for very limited payback); even if that is not the main driver, it will make it irresistibly easy to do so (with a Google-hosted notary).
IMHO this feature should be opt-in, and the actual default instance of the notary should be run by a trustworthy and independent third party (ideally an NPO) with a clear and transparent privacy policy, on community-accessible infrastructure (so that community members can actually participate in its maintenance and verify the policies surrounding it, i.e. not like the current Go infra is run, where only Google employees can contribute/touch it).
While I will carefully evaluate all the details of the final implementation, from this first look I'm currently leaning toward "patching out" or de-configuring this feature by default in Fedora when/if it lands in upstream GC, to preserve the privacy of the users.
JC
PS: This is just my personal Fedora's GC maintainer take on this issue.
On Mon, Mar 4, 2019, at 17:32, Russ Cox wrote:
Hi all,
I wanted to let people here know about the proposal design we just published: golang.org/design/25530-notary, for golang.org/issue/25530. (We also mentioned this general idea in our blog post from back in December, https://blog.golang.org/modules2019.)
As usual, comments are welcome on the issue or, if you prefer not to use the issue tracker, here in this thread.
Thanks. Russ
-- You received this message because you are subscribed to the Google Groups "golang-dev" group. To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+unsubscribe@googlegroups.com. For more options, visit https://groups.google.com/d/optout.
-- Sam Whited
----- Original Message -----
From: "Russ Cox" rsc@golang.org To: "Jakub Cajka" jcajka@redhat.com Cc: "Sam Whited" sam@samwhited.com, "rprichard via golang-dev" golang-dev@googlegroups.com, golang@lists.fedoraproject.org Sent: Thursday, March 7, 2019 9:02:50 PM Subject: Re: [golang-dev] proposal: public module authentication with the Go notary
On Thu, Mar 7, 2019 at 5:52 AM Jakub Cajka jcajka@redhat.com wrote:
While I will carefully evaluate all the details of the final implementation, from this first look I'm currently leaning toward "patching out" or de-configuring this feature by default in Fedora when/if it lands in upstream GC, to preserve the privacy of the users.
Thanks for this feedback. I would only ask that you remember that publishing the design proposal is the *first* step in the public discussion, not the last. Please remember to check back in for the final result before you decide to apply local patches in Fedora.
Russ
I will be observing the future iterations. I'm looking forward to them :).
JC
----- Original Message -----
From: "Nicolas Mailhot" nicolas.mailhot@laposte.net To: golang@lists.fedoraproject.org, "Sam Whited" sam@samwhited.com Cc: "rprichard via golang-dev" golang-dev@googlegroups.com Sent: Friday, March 8, 2019 11:06:44 AM Subject: Re: [golang-dev] proposal: public module authentication with the Go notary
Hi Jakub,
The notary part of Go modules, like the rest of the module implementation, suffers from a lack of understanding of integration and QA workflows, and a simplistic dev-centric worldview.
As designed, it will have to be patched out by every single entity trying to perform large scale Go code integration and QA, starting with the usual suspects, Linux distributions. Where that will leave upstream Go, since its main target platform is Linux, I have no idea (I suspect there will be some angst @Google about it).
I write this from the POV of the person trying to update the Fedora Linux Go tooling so we can ship Go with module mode on in August. Whatever we do at that time will be mirrored by Fedora downstreams like RHEL and CentOS, and all the other Linux variants that take their inspiration from Fedora.
This is not armchair analysis: I have already written way too much module-oriented custom code because the upstream tooling is deficient, and I've read my share of upstream issue reports.
CONTINUOUS TRANSPARENT INTERNET LOOKUPS ARE NOT ACCEPTABLE POST-DEV
When you integrate large volumes of code, targeting multiple hardware architectures, it’s not acceptable for the result of the Arm QA run to differ from the result of the x86_64 run just because it was scheduled some minutes later on the QA farm, the internet state changed in the meantime, and the Go tools decided to “help” you by looking at the internet behind your back and transparently updating the state being QAed. That's nothing new or Go-specific, and that’s why our build system disables network access in build containers after populating them with the code to be built and tested, and already did so before someone decided to invent Go.
But that QA reproducibility constraint was not understood by the Go developers, and pretty much all the Go tools that were ported to module mode will attempt Internet access at the slightest occasion, and change the project state based on the results of those accesses, with no ability to control or disable this, and nasty failure modes if the Internet access fails.
The Go issue tracker is filling up with reports of people expressing their incredulity (or more) after hitting a variation of this problem on their QA systems.
QA-ED CODE IS NOT PRISTINE DEV CODE
No code is ever perfect.
When you integrate large volumes of code, targeting multiple hardware architectures, you *will* hit problems missed by the original upstream dev team. And you *will* have to patch them up, and you *will* have to build the result without waiting for upstream to look at it, because in production the show must go on: some of those problems will eventually have zero-day security implications, and the upstreaming lag is not acceptable (assuming upstream is available and friendly, which is not necessarily the case).
So no serious shop is ever going to build its information system from pristine upstream dev code or pristine upstream Go modules. Pristine upstream code building is a nice ideal, but in the real world it’s a utopia.
If you don’t believe me, take your favorite app and try to build a system containing it from genuine, unadulterated upstream code (that means without relying on Fedora, Debian, or whatever, since we all know they are full of “nasty” patched code).
Again, this real-world-is-not-perfect technical constraint was not understood by the Go developers. Sure, they gave us the replace directive in Go modules. But it is opt-in, so it only works on a small scale, when fixing third-party code is an exception and you only have a handful of projects that use this third-party code.
On a large scale, everyone will just preventively rename everything by default, just to retain the ability to perform fixes easily. You'll end up with blanket replacement of every "upstream" module name with "qaed/upstream" forks. And it will probably be easier to just rewrite all the imports to point to qaed/upstream by default in the source code.
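[To make the scenario concrete: here is a sketch of what such a blanket-replacement go.mod could look like. All module paths and versions are hypothetical; the point is that every upstream module gets redirected to a local fork preventively, whether or not it currently carries patches.]

```
module example.com/qaed/someapp

require (
    github.com/upstream/libbar v0.4.0
    github.com/upstream/libfoo v1.2.2
)

// Preventive blanket replacements: every upstream dependency is
// pointed at a locally controlled fork, patched or not.
replace github.com/upstream/libfoo => example.com/qaed/libfoo v1.2.2
replace github.com/upstream/libbar => example.com/qaed/libbar v0.4.0
```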
That will result in a huge mess, much larger than the current GOPATH vendoring mess.
All because the design does not take into account the last mile QA patching that occurs before putting code in production.
Blindly enforcing renaming rules because you don't understand or accept the existence of QA middlemen produced things like Iceweasel. And I don't think Mozilla or Debian were ever happy about it.
PUBLISHING QA-ED CODE CAN NOT DEPEND ON A THIRD PARTY
At least, not if you want the result to work in a free software ecosystem. Freedom 4 “The freedom to distribute copies of your modified versions to others” is not “The freedom to distribute copies of your modified versions to others, but only if you tell this third party first”.
And Google is free to choose not to target a free software ecosystem, but that means all the entities that do target a free software ecosystem (like Linux distributions), or have happily built their infrastructure on free software products (like every single major cloud operator out there), will have to fork Go, or reorient their software investments somewhere else, or some combination of those. And I don't think Google wants that, or will be happy about it.
But, that's the situation the current notary design will create. Because in trying to enforce a dev utopia, it is going to break the workflows of a large proportion of the current Go ecosystem.
Any working system will need a way to declare trust in a local authority (the shop QA team) or a set of authorities (the shop QA team, and trusted third parties), and not have this trust rely on continuous access to a server, regardless of who controls it (so detached digital signatures, not direct server hash lookups).
And I’ll stop here before I write things I will regret later.
Regards,
-- Nicolas Mailhot
Hello,
First, this is kind of off-topic to the notary issue discussed in this thread, so I'm starting a new one.
I'm happy to see that you have noticed Go modules and the complexities they will bring along :). I have been trying to bring this up in the SIG and at meetings for the past several months (thanks for bringing it up at the last meeting; I have just read the log), but nobody has been interested in discussing it ahead of time (I don't blame you/them). I don't believe that I'm the right person for you to focus this long list of notes on (I can't really do anything about what you are writing). You should definitely bring this up on the upstream golang-dev list (AFAIK you need to be subscribed there to post); they are really keen for any feedback.
JC
golang mailing list -- golang@lists.fedoraproject.org
To unsubscribe send an email to golang-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/golang@lists.fedoraproject.org
On Fri, Mar 8, 2019 at 8:09 AM 'Nicolas Mailhot' via golang-dev <golang-dev@googlegroups.com> wrote:
The notary part of Go modules, like the rest of the module implementation, suffers from a lack of understanding of integration and QA workflows, and a simplistic dev-centric worldview.
This is not an auspicious beginning. This mail came across as trying more to be antagonistic than constructive. Even so, I really would like to understand your concerns and either help address misunderstandings or adjust the design appropriately.
CONTINUOUS TRANSPARENT INTERNET LOOKUPS ARE NOT ACCEPTABLE POST-DEV
[snip]
But that QA reproducibility constraint was not understood by the Go developers, and pretty much all the Go tools that were ported to module mode will attempt Internet access at the slightest occasion, and change the project state based on the results of those accesses, with no ability to control or disable this, and nasty failure modes if the Internet access fails.
This really couldn't be farther from the truth.
You can turn off module changes using -mod=readonly in CI/CD systems. We added it explicitly for that use case. Quoting golang.org/issue/26361 (July 2018):
CI systems need a way to test what's in the go.mod and fail if something is missing instead of looking to satisfy dependencies automatically. We meant for -getmode=local to mean this, but it really means something a little bit different. We should add a -getmode that does mean "you can't change go.mod". Maybe -getmode=noauto.
The eventual spelling was -mod=readonly, and it shipped in Go 1.11 with the initial module preview.
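[As a concrete sketch of the use case Russ describes: a CI recipe (build steps and module layout hypothetical) can pin the flag through $GOFLAGS so that every go invocation fails instead of editing go.mod or go.sum.]

```sh
# CI build step (sketch, Go 1.11+): forbid module graph changes.
export GOFLAGS=-mod=readonly
go build ./...   # fails if go.mod/go.sum would need updating
go test ./...
```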
More generally, we understand the importance of reproducibility. For example, see this post https://research.swtch.com/vgo-repro and this talk https://www.youtube.com/watch?v=F8nrpe0XWRg.
The Go issue tracker is filling up with reports of people expressing
their incredulity (or more) after hitting a variation of this problem on their QA systems.
If you could point to specific examples, that would help me understand your concerns a bit better. I did try to find some but failed:
$ issue cmd/go QA
26746 proposal: doc/install: define minimum supported VCS versions
14812 runtime: GC causes latency spikes
$ issue is:closed cmd/go QA
24301 cmd/go: add package version support to Go toolchain
22491 proposal: cmd/link: support BUILD_PATH_PREFIX_MAP for reproducible binaries when built under varying path
27160 x/build: set up AIX builder
12660 x/mobile: support Xcode 7
$
QA-ED CODE IS NOT PRISTINE DEV CODE
No code is ever perfect.
When you integrate large volumes of code, targeting multiple hardware architectures, you *will* hit problems missed by the original upstream dev team. And you *will* have to patch them up, and you *will* have to build the result without waiting for upstream to look at it, because in production the show must go on: some of those problems will eventually have zero-day security implications, and the upstreaming lag is not acceptable (assuming upstream is available and friendly, which is not necessarily the case).
So no serious shop is ever going to build its information system from pristine upstream dev code or pristine upstream Go modules. Pristine upstream code building is a nice ideal, but in the real world it’s a utopia.
If you don’t believe me, take your favorite app and try to build a system containing it from genuine, unadulterated upstream code (that means without relying on Fedora, Debian, or whatever, since we all know they are full of “nasty” patched code).
Again, this real-world-is-not-perfect technical constraint was not understood by the Go developers. Sure, they gave us the replace directive in Go modules. But it is opt-in, so it only works on a small scale, when fixing third-party code is an exception and you only have a handful of projects that use this third-party code.
On a large scale, everyone will just preventively rename everything by default, just to retain the ability to perform fixes easily. You'll end up with blanket replacement of every "upstream" module name with "qaed/upstream" forks. And it will probably be easier to just rewrite all the imports to point to qaed/upstream by default in the source code.
That will result in a huge mess, much larger than the current GOPATH vendoring mess.
All because the design does not take into account the last mile QA patching that occurs before putting code in production.
Blindly enforcing renaming rules because you don't understand or accept the existence of QA middlemen produced things like Iceweasel. And I don't think Mozilla or Debian were ever happy about it.
I've read this a couple times and found it a little hard to follow, but I think what you are saying is that the go.sum and notary checks are going to cause serious problems because Fedora and other distributions want to create modified copies of module versions and use them for building the software they ship. I don't see why that would be the case.
I would have expected that if Fedora modified a library, they would give it a different version number, so that for example modifying v1.2.2 would produce v1.2.3-fedora.1 (it would be nice if SemVer had a kind of 'post-release' syntax; maybe some day). Go.sum and notary checks would only trigger at all if Fedora were to take v1.2.2, modify it, and then try to pass it off as the original v1.2.2, but that's indistinguishable from a man-in-the-middle attack. Adding a suffix to the version string to indicate the presence of Fedora-specific patches does not seem, at least to me, to be anywhere near the level of renaming Firefox to IceWeasel, and it avoids confusion about "which" v1.2.2 a build is using.
Furthermore, once you have created those versions and want to build software using them, all it takes is to create a new module with just a go.mod file (no source code) listing those versions as requirements and then run builds referring to the unmodified original top-level targets. Those targets' dependencies will be bumped forward to the Fedora-patched versions listed in the go.mod file, without any need to modify any of the client modules at all.
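[For illustration, such a source-less "pin" module would consist of nothing but a go.mod along these lines. The module path and versions are hypothetical; minimal version selection would then bump every build's dependencies forward to these patched releases.]

```
// go.mod of a hypothetical distro pin module: no Go source files,
// only requirements that force the patched versions during
// minimal version selection.
module example.com/fedora/pins

require (
    github.com/upstream/libfoo v1.2.3-fedora.1
    github.com/upstream/libbar v0.4.1-fedora.1
)
```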
Or maybe I misunderstood what you were trying to say in this section.
PUBLISHING QA-ED CODE CAN NOT DEPEND ON A THIRD PARTY
At least, not if you want the result to work in a free software ecosystem. Freedom 4 “The freedom to distribute copies of your modified versions to others” is not “The freedom to distribute copies of your modified versions to others, but only if you tell this third party first”.
And Google is free to choose not to target a free software ecosystem, but that means all the entities that do target a free software ecosystem (like Linux distributions), or have happily built their infrastructure on free software products (like every single major cloud operator out there), will have to fork Go, or reorient their software investments somewhere else, or some combination of those. And I don't think Google wants that, or will be happy about it.
But, that's the situation the current notary design will create. Because in trying to enforce a dev utopia, it is going to break the workflows of a large proportion of the current Go ecosystem.
I'm still not sure exactly what the dev utopia is. Can you help me understand better?
Is the utopia the fact there would be a consistent meaning for mymodule@v1.2.2 across all possible references and that any modified source code would have to present a modified version number? Is the inability to silently modify code without changing the version number what's going to break the workflows of a large proportion of the Go ecosystem? I'd like to understand that better if so.
Any working system will need a way to declare trust in a local authority (the shop QA team) or a set of authorities (the shop QA team, and trusted third parties), and not have this trust rely on continuous access to a server, regardless of who controls it (so detached digital signatures, not direct server hash lookups).
Your use of the word "continuous", both in this quoted text and in the heading earlier, makes me think you might not realize that the notary access is only needed when adding a new line to the go.sum file.
The steady state is that a module's go.mod file lists specific versions of all its direct dependencies, and its go.sum file lists the cryptographic hashes of all its direct and indirect dependencies. The only time the system falls out of the steady state is when a code change adds an import of a new module; the next build of that changed code reestablishes the steady state. And the 'go mod tidy' command's job is exactly to establish that state for all possible builds in the module.
Once that state is established - once go.sum has a hash for each dependency needed in a build - there is *zero* notary access during any builds of that module. And as I mentioned above, a CI/CD system or any other system can use -mod=readonly (either on the command line or in the $GOFLAGS environment variable) to disable any attempt to reestablish the steady state (that is, disable any attempt to modify go.mod or go.sum) except by an explicit 'go get' or 'go mod tidy' command. So the notary access is occasional, predictable, and controllable. It is *not* continuous.
Put a different way, the notary is how the go command populates go.sum by default. If a QA system needs to populate it a different way, that's fine: one option is 'go get -insecure', another option is GONOVERIFY=*, and a third option is to just write the desired go.sum directly: it's a simple, line-oriented text file. Even if a QA team insists on modifying mymodule@v1.2.2 without giving it a new version, if they also update the go.sum files of modules using mymodule@v1.2.2, the builds will succeed, all without any notary access.
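[For reference, the go.sum format Russ mentions is indeed simple: one whitespace-separated line per hash, with one entry for the module tree and one for its go.mod file. Module path, version, and hash values below are illustrative placeholders, not real digests.]

```
github.com/upstream/libfoo v1.2.2 h1:AbCdEf0123456789AbCdEf0123456789AbCdEf0=
github.com/upstream/libfoo v1.2.2/go.mod h1:ZyXwVu9876543210ZyXwVu9876543210ZyXwVu9=
```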
Stepping back from the specific details, again I am more than happy to understand everyone's concerns with the notary design and work to address them. We've been having a productive conversation on the GitHub issue, and I hope we can have a productive conversation here too.
Best, Russ
On Friday, March 8, 2019 at 10:28 -0500, Russ Cox wrote:
On Fri, Mar 8, 2019 at 8:09 AM 'Nicolas Mailhot' via golang-dev <golang-dev@googlegroups.com> wrote:
The notary part of Go modules, like the rest of the module implementation, suffers from a lack of understanding of integration and QA workflows, and a simplistic dev-centric worldview.
This is not an auspicious beginning. This mail came across as trying more to be antagonistic than constructive.
I'm sorry, my level of English is not sufficient to convey information you want to ignore, and dress it up so you feel good about it. I've tried to stick to plain facts. If you object to plain facts I can't do anything about it.
Even so, I really would like to understand your concerns and either help address misunderstandings or adjust the design appropriately.
CONTINUOUS TRANSPARENT INTERNET LOOKUPS ARE NOT ACCEPTABLE POST-DEV
[snip]
But that QA reproducibility constraint was not understood by the Go developers, and pretty much all the Go tools that were ported to module mode will attempt Internet access at the slightest occasion, and change the project state based on the results of those accesses, with no ability to control or disable this, and nasty failure modes if the Internet access fails.
This really couldn't be farther from the truth.
You can turn off module changes using -mod=readonly in CI/CD systems.
It would be nice if that were true. Unfortunately, even a simple "tell me what you know about the local code tree" command like

go list -json -mod=readonly ./...

will abort if the CI/CD system cuts network access to make sure builds do not depend on Internet state.
That's what the nasty failure modes are about. If the go command is not sure about something, it will “solve” things by trying to download new bits from the Internet. If the Internet is not available, it aborts violently instead of degrading gracefully and working with what it has.
The no-modification, no-Internet CI/CD constraint was not taken into account in the original design. It was bolted on later, and the bolting is imperfect. Just take a vacation in some paradisiacal place with no Internet access and see how much Go coding in module mode you can get done. *That* will replicate our CI/CD constraints (the paradisiacal place is a bonus; we don't have those in our build farms. See, I'm trying not to be antagonistic).
QA-ED CODE IS NOT PRISTINE DEV CODE
I've read this a couple times and found it a little hard to follow, but I think what you are saying is that the go.sum and notary checks are going to cause serious problems because Fedora and other distributions want to create modified copies of module versions and use them for building the software they ship. I don't see why that would be the case.
I would have expected that if Fedora modified a library, they would give it a different version number, so that for example modifying v1.2.2 would produce v1.2.3-fedora.1
And that's not the case for Fedora, nor RHEL, nor Debian, nor pretty much any large-scale integrator, because when you integrate masses of third-party code you will eventually hit bugs in pretty much every component, so having to run patched code at every layer is the norm, not the exception.
It would be terribly inconvenient to have to rename or renumber or replace everything, and then have to convince all the other components to use the renamed or renumbered versions of the components they depend on. It would be even more inconvenient to do it just in time and continuously flip flop between upstream and local names and numbers.
That would actually add friction to merging back and returning to pristine upstream code, because merging back would now cost a rename/renumber instead of being a net win. Patching is forced on you (you hit a bug in upstream code). Upstreaming is a deliberate virtuous choice (you've already fixed your problem locally). If you make upstreaming more expensive, people just stop doing it.
There are legal ways to force distributions to do the wasteful renaming/renumbering dance. They will take it as a hostile imposition. Where do you think Iceweasel came from? No love lost here.
Go.sum and notary checks would only trigger at all if Fedora were to take v1.2.2, modify it, and then try to pass it off as the original v1.2.2
That's the standard Linux distribution workflow.
but that's indistinguishable from a man-in-the-middle attack.
The whole purpose of a distribution is to be a giant middleman, freely chosen by the end user, between the code released upstream and this end user¹. Not all middlemen are hostile. You need to understand that, or your system will not work.
In the real world you have friendly middlemen. Sometimes layers of them (it is quite common for local organisations to add another level of changes over distro changes). Sometimes those middlemen make changes. Sometimes they just check things for nastiness.
A correct trust test is not “it exists exactly this way on the Internet”. What kind of assurance is that? If I dump malware on a public URL, it’s trusted as long as no AV touches it mid-flight?
A correct trust test is “has the state been signed by an entity the user trusts”.
So: digital signatures, and a way to configure the public keys of trusted third parties. And, if no signature by a trusted third party is found locally, as a last resort, and only if the user asked for it, look on the Internet whether a notary signature exists. Of course, that's a piss-poor level of trust, but better that than nothing.
Furthermore, once you have created those versions and want to build software using them, all it takes is to create a new module with just a go.mod file (no source code) listing those versions as requirements and then run builds referring to the unmodified original top-level targets. Those targets' dependencies will be bumped forward to the Fedora-patched versions listed in the go.mod file, without any need to modify any of the client modules at all.
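[Editor's note: a sketch of the requirements-only override module described above, following the v1.2.3→v1.2.4-fedora.1 numbering convention suggested earlier in the thread. The module path and version suffixes are invented for illustration.]

```shell
# Create a go.mod-only module (no source code) whose require block pins
# dependencies forward to hypothetical Fedora-patched versions.
mkdir -p fedora-overrides
cat > fedora-overrides/go.mod <<'EOF'
module fedora.example/overrides

require (
    github.com/some/dependency v1.2.4-fedora.1
    github.com/another/dependency/v4 v4.0.1-fedora.1
)
EOF
cat fedora-overrides/go.mod
```

Builds run against this module would then have their dependencies bumped forward to the patched versions without touching any client module.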
# dnf repoquery --disablerepo='*' --enablerepo=rawhide \
      --whatprovides 'golang(*)' | wc -l
766
(here Debian people are laughing at me because we are lagging behind them on the Go integration front)
We have several hundreds more Go components in the integration queue waiting for review.
Some of those will end up as several Go modules.
And you want us to renumber all this, and patch all the other module files that use those with the new numbers, and redo it all every time something changes, all year round, just because the go command can't accept that the code state Fedora builds from may differ from the one observed on the Internet?
Really?
REALLY?
Do you really REALLY think any sane integrator will play this dance for long, instead of patching the notary checks out of its go command and being done with it?
Is the utopia the fact there would be a consistent meaning for mymodule@v1.2.2 across all possible references and that any modified source code would have to present a modified version number?
The utopia is thinking that everything is released in a perfect state by upstream, so changes downstream need not happen, and therefore any downstream change is necessarily an attempt to inject malware by Mr Nasty.
Is the inability to silently modify code without changing the version number what's going to break the workflows of a large proportion of the Go ecosystem? I'd like to understand that better if so.
Distributions distinguish between the upstream version number and their own build id. So you have
mymodule@v1.2.2 release 1 mymodule@v1.2.2 release 2
and so on. Each release id is a separate build that can involve different patches (some release ids are more complex than a single number).
Only a single release can exist within the distribution at a given time, so makefiles, go.mod files and so on need not differentiate between release X and release Y: they will only see one of those at any time, and they had better work with it, because they won't be given or allowed any other.
System artefacts (libraries, binaries, go modules) are built from plain upstream source code, as downloaded by distribution processes directly from the upstream VCS or website, after preparation (removal of problem parts, patching, etc.) in pristine containers, isolated from the Internet, and populated only with a minimal distribution installation and the content of the distribution components necessary for the build.
So to build github.com/my/thing version x.y.z that declares in its module file
module github.com/my/thing
require ( github.com/some/dependency v1.2.3 github.com/another/dependency/v4 v4.0.0 )
A. We will populate a clean container with
1. a minimal system
2. the go compiler
3. the most recent system component that provides the go module github.com/some/dependency ≥ 1.2.3 and < 2, and all its dependencies (the go module as produced by our own build of this other component, not as downloaded by go get from the Internet)
4. the most recent system component that provides the go module github.com/another/dependency/v4 ≥ 4.0.0 and < 5, and all its dependencies (ditto)
5. the source code for github.com/my/thing x.y.z, as downloaded and checked from the Internet by a human (not by go get), and then sealed
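[Editor's note: steps A.3/A.4 sketched as dry-run commands. The golang-module() names follow the namespacing proposed later in this thread and are placeholders; nothing below actually invokes dnf.]

```shell
# Dry-run sketch of populating the build container with the system
# components that provide the required go modules. --installroot targets
# the container root, not the running system.
root=/var/lib/mock/my-thing-build/root
cmds=$(for pkg in golang \
    'golang-module(github.com/some/dependency)' \
    'golang-module(github.com/another/dependency/v4)'; do
  printf "dnf --installroot %s install '%s'\n" "$root" "$pkg"
done)
echo "$cmds"
```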
B. We will cut internet access of this container
C. We will prepare the github.com/my/thing x.y.z source (remove problem parts, patch bugs, remove vendored code, remove at least local replaces in go.mod)
I'm fairly certain we will nuke go.sum because we want to build from our reference state against our reference versions, not reproduce whatever upstream tested on ubuntu or windows.
D. We will point the go compiler to the local module files (via GOPROXY)
E. We will ask a go command to transform our github.com/my/thing x.y.z source state into the module files other Go code can use (zip, ziphash, info, mod files) once deployed in our GOPROXY directory. It would be nice if go mod pack existed upstream; otherwise we will write and use our own utility.
That will involve sanitizing the upstream mod file (removing indirect requires and local replaces; I'm not sure about non-local replaces yet, my gut feeling is that we should remove them too and force the fixing of imports during source preparation, but I'm not sure).
For various technical reasons it's not possible to deploy directly in GOPROXY at this stage, so at this point you have the prepared github.com/my/thing x.y.z source code in unpacked state in a directory, the modules it needs in GOPROXY, and a packed version of github.com/my/thing x.y.z in a staging directory.
F. We will ask the go compiler to build the various binaries that need to be produced from github.com/my/thing version x.y.z
G. We will pack the result files (binaries and go modules) in system components, so they can be deployed as needed. The component containing github.com/my/thing x.y.z will record a need for github.com/some/dependency ≥ 1.2.3 and < 2 and github.com/another/dependency/v4 ≥ 4.0.0 and < 5
So the Internet access won't happen where you think it happens, the files arrive on disk without go get, and the state they are available in is not the upstream state.
Having worked in a proprietary integration shop before, and working with proprietary integrators today, the workflows are not so different, except that proprietary shops tend to be a lot laxer, allowing Internet access when they should not, and either processing code as downloaded from the Internet, with no checks, or forking it to death, without trying to reattach to upstream.
Best regards,
¹ The function of this middleman is to beat upstream code into shape so the user does not have to do it themselves. Beating into shape does involve large-scale changing of upstream code.
Users choose the distribution system because their alternatives are:
1. hoping every single upstream they use never makes a mistake requiring a last-mile fix (fat chance), or
2. wasting their time doing the last-mile fixing themselves, instead of delegating this function to distributors.
It's all free software. An unhappy distribution user can take the source code he needs and get it integrated somewhere else. No strings attached.
On Fri, Mar 8, 2019 at 4:52 PM 'Nicolas Mailhot' via golang-dev golang-dev@googlegroups.com wrote:
On Friday, March 8, 2019 at 10:28 -0500, Russ Cox wrote:
On Fri, Mar 8, 2019 at 8:09 AM 'Nicolas Mailhot' via golang-dev < golang-dev@googlegroups.com> wrote:
The notary part of Go modules, like the rest of the module implementation, suffers from a lack of understanding of integration and QA workflows, and a simplistic dev-centric worldview.
This is not an auspicious beginning. This mail came across as trying more to be antagonistic than constructive.
I'm sorry, my level of English is not sufficient to convey information you want to ignore, and dress it up so you feel good about it. I've tried to stick to plain facts. If you object to plain facts I can't do anything about it.
You say that this is information that we want to ignore. What leads you to say such a thing? What makes you think that we want to ignore it? We have never claimed to know everything. Adding modules to the go tool is a work in progress.
Problems can be identified and solved but a necessary step is that everyone treat each other with respect. When you accuse others of wanting to ignore you, that makes your message seem like an attack, and indeed makes people more likely to ignore it. Your message would be a lot easier to read if you did just stick to plain facts.
Thanks.
Ian
On Fri, Mar 8, 2019 at 4:52 PM 'Nicolas Mailhot' via golang-dev < golang-dev@googlegroups.com> wrote:
It would be nice if it were true. Unfortunately, even a simple "tell me what you know about the local code tree" command like go list -json -mod=readonly ./...
will abort if the CI/CD system cuts network access to make sure builds do not depend on Internet state.
The behavior you described sounds like an issue to me. However, I wasn't able to easily reproduce it locally: I tried running "strace -f go list -json -mod=readonly ./... |& grep connect" under a few scenarios (e.g., missing go.sum, or incomplete go.mod file), and none of them seemed to result in network access.
Can you provide more detailed reproduction steps?
The no-modification no-Internet CI/CD constraint was not taken into
account in the original design. It was bolted on later, and the bolting is imperfect.
I don't think this is true. Google's internal build system has used this same "no-modification no-internet" constraint for around a decade:
http://google-engtools.blogspot.com/2011/09/build-in-cloud-distributing-buil... https://mike-bland.com/2012/10/01/tools.html#blaze-forge-srcfs-objfs
Supporting this usage pattern has been and remains very important to Go.
I would have expected that if Fedora modified a library, they
would give it a different version number, so that for example modifying v1.2.2 would produce v1.2.3-fedora.1
And that's not the case for Fedora, nor RHEL, nor Debian, nor pretty much any large-scale integrator,
When I run "gcc --version" on my work Debian machine, I see:
gcc (Debian 7.3.0-5) 7.3.0
If Debian is okay naming their version of GCC "Debian 7.3.0-5", I'd think they're okay releasing a patched Go package as v1.2.3-debian.1.
You even seem to suggest this later when talking about "mymodule@v1.2.2 release 1" and "mymodule@v1.2.2 release 2." Russ is just talking about a different way of encoding those version strings.
because when you integrate
masses of third-party code you will eventually hit bugs in pretty much every component, so having to run patched code at every layer is the norm not the exception.
The need to make changes to open source packages is familiar to Google. E.g., Google's internal build system incorporates a lot of open source packages, and many need to be modified to accommodate Google's internal development idiosyncrasies. Even the Go toolchain and standard library themselves are patched internally.
I think it would help if you could highlight the specific technical hurdles you're running into trying to bundle Go packages into RPMs. I expect the mechanisms are in place to do what you need, but I can believe the tooling could use improvements to handle Linux-distribution-scale integration efforts better, and I think the Go project is interested in supporting those efforts.
And you want us to renumber all this, and patch all the other module
files that use those with the new numbers, and redo it all every time something changes, all year round, just because the go command can’t accept that the code state Fedora builds from, may differ from the one observed on the Internet?
Really?
REALLY ?
Do you really REALLY think any sane integrator will play this dance long instead of patching out notary checks from its go command, and be done with it?
I would expect the amount of manual integration work needed would scale with the amount of local changes required, not with the amount of dependencies. E.g., if you have to modify a core Go module to work better on Fedora, I would think you make that one change, and your build system and tooling would handle automatically rebuilding dependencies as appropriate. I wouldn't expect you to have to manually renumber/rename every downstream dependency.
If you're finding that's not the case, please share the issues you're running into so they can be discussed concretely and addressed.
The utopia is thinking that everything is released in a perfect state by
upstream, so changes downstream need not happen,
As pointed out above, Google and Go's build systems are not built for this utopia, but for the real world where downstream changes are needed.
Finally, you included a detailed description of your package build system (which sounds very similar to Google's internal build system), but didn't seem to highlight how the Go module system or proposed Go notary system cause problems.
E.g., you mention that in step A.3 that you download and install the Fedora version of any package dependencies. I would think these packages can contain any extra metadata for step C to be able to rename/renumber the current package (if even necessary) without Internet access.
Last time this discussion came up in person, someone from the Go team suggested replacing all the go.mod files with new ones that are entirely based on package metadata -- i.e. declared dependencies, their installed location etc.
This seems like a reasonable solution, and it would be a pity if the notary made this impossible.
On Sat, Mar 9, 2019 at 3:01 AM 'Matthew Dempsky' via golang-dev golang-dev@googlegroups.com wrote:
On Friday, March 8, 2019 at 18:01 -0800, Matthew Dempsky wrote:
On Fri, Mar 8, 2019 at 4:52 PM 'Nicolas Mailhot' via golang-dev < golang-dev@googlegroups.com> wrote:
It would be nice if it were true. Unfortunately, even a simple "tell me what you know about the local code tree" command like go list -json -mod=readonly ./...
will abort if the CI/CD system cuts network access to make sure builds do not depend on Internet state.
The behavior you described sounds like an issue to me. However, I wasn't able to easily reproduce it locally: I tried running "strace -f go list -json -mod=readonly ./... |& grep connect" under a few scenarios (e.g., missing go.sum, or incomplete go.mod file), and none of them seemed to result in network access.
Can you provide more detailed reproduction steps?
That's because upstream Go workflows (from a distribution POV) cheat, by blindly downloading things from the Internet before they have been checked, silently filling the go cache with them, and grandfathering third-party code downloaded directly during a build as if it were original source code.
We do not allow that. We make the promise to our users that we only make available things that we created from direct upstream source code, and third-party things that have already passed distribution QA checks in their own separate builds.
And that, only if those third-party things were identified as necessary for the build¹ and installed (for go modules) in the GOPROXY directory used for this specific build, via system components.
At the very first steps of a build recipe, when we ask the copy of the codebase being built "what are you, what are you composed of, what do you need for building?", there is literally only the codebase being inspected and the go compiler available, and GOPROXY is empty. We need the answer to this question to translate it into system component ids, so the build container can be populated with those system components, making the module files available inside GOPROXY.
But in module mode, go will assume everything is available all the time, and that if it is not available it can be downloaded directly from the Internet. So instead of answering the question from direct first-level elements (the codebase and nothing else), it will try to work transitively and abort, because second-level code is not available yet.
I would have expected that if Fedora modified a library, they would give it a different version number, so that for example modifying v1.2.2 would produce v1.2.3-fedora.1
And that's not the case for Fedora, nor RHEL, nor Debian, nor pretty much any large-scale integrator,
When I run "gcc --version" on my work Debian machine, I see:
gcc (Debian 7.3.0-5) 7.3.0
And when you run $ ls -l /usr/lib
you won't see any libfoo.so.x.y.z-debian or libfoo.so.x.y.z-5
Because the only libfoo.so.x.y.z allowed on system at any given time is the one built from (debian-patched) foo x.y.z source code, no matter if it's release 1 2 3 or 5 of the Debian build, no matter the level of Debian patching and massaging each build carries.
So giving the underlying build command the "you're building 7.3.0, build release 5, Debian" info, so that it appears in the version info of the binary, is OK and encouraged (we'll even pass it as ldflags to make sure it is recorded in each binary).
But having the underlying build command try to discriminate between different build releases, or between the build releases of the third-party artefacts brought into the build root (as would happen if the build release were exposed to the golang semver resolution engine), is not OK.
The underlying build command has no business doing any processing on release info; it's system component metadata. The system component engine will give it the release it needs to work with at any given time, and if the build command tries to second-guess the system component engine, things start breaking fast.
If Debian is okay naming their version of GCC "Debian 7.3.0-5", I'd think they're okay releasing a patched Go package as v1.2.3-debian.1.
You even seem to suggest this later when talking about " mymodule@v1.2.2 release 1" and "mymodule@v1.2.2 release 2." Russ is just talking about a different way of encoding those version strings.
because when you integrate masses of third-party code you will eventually hit bugs in pretty much every component, so having to run patched code at every layer is the norm not the exception.
The need to make changes to open source packages is familiar to Google. E.g., Google's internal build system incorporates a lot of open source packages, and many need to be modified to accommodate Google's internal development idiosyncrasies. Even the Go toolchain and standard library themselves are patched internally.
I think it would help if you could highlight the specific technical hurdles you're running into trying to bundle Go packages into RPMs. I expect the mechanisms are in place to do what you need, but I can believe the tooling could use improvements to handle Linux- distribution-scale integration efforts better, and I think the Go project is interested in supporting those efforts.
Ok, thanks a lot, that would make things a lot simpler.
I'll list here the various commands we need implemented to plug go modules into our system. Implemented means by upstream Go or by ourselves, though we would obviously prefer it to be on the upstream Go side, because that ensures the commands are not broken in the future by upstream Go tooling changes, that other distros use the same commands as us, and that we can limit divergence and share the QA and fixing burden.
I believe that the commands needed apt-side for Debian and Ubuntu are pretty much the same, because apt/deb and dnf/yum/rpm have been on a convergent evolution track for a long time, and at this point their high-level architecture is the same, only implementation details differ (we customarily scrounge Debian patches and fixes and they scrounge ours, and we me-too each other's problem reports in upstream issue trackers).
Like I already wrote, our build process starts with a bare raw unpacked copy of upstream sources in a specific directory. This copy will have already been modified to remove things we do not like and patch out problems². Obviously the patching is an iterative process, which is why each release of a build may carry slightly different patches.
This copy typically does not include any VCS info³. Since go expects to find the module version information there, and does not record it in the project go.mod like other languages do, life already sucks for us because we need to carry a variable with this info and pass it manually to later commands.
[available resources: — minimal system, — go compiler, — empty system GOPROXY, — prepared unpacked project sources in a specific directory]
A. We need a command to identify all the go module trees present in the source directory. A variation of $ find . -type f -name go.mod would probably work, but that lacks all the sanity checks upstream Go may want to apply, now or in the future, to make sure a "go.mod" file is a legitimate go module descriptor.
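[Editor's note: a toy stand-in for that discovery command. The directory layout, module names, and the vendor/ exclusion are assumptions for illustration, not existing go tooling.]

```shell
# List every go.mod under the prepared source tree, skipping vendored
# copies. A real implementation would add the sanity checks mentioned
# above before trusting a file named go.mod.
mkdir -p srctree/proj/sub srctree/proj/vendor/dep
printf 'module example.com/proj\n'     > srctree/proj/go.mod
printf 'module example.com/proj/sub\n' > srctree/proj/sub/go.mod
printf 'module example.com/vendored\n' > srctree/proj/vendor/dep/go.mod
find srctree -type f -name go.mod -not -path '*/vendor/*' | sort
```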
[available resources: — minimal system, — go compiler, — empty system GOPROXY, — prepared unpacked project sources in a specific directory]
B. We need another command (or the same) that lists:
1. the go module name of each of those trees,
2. their consolidated first-level requirements (not the list of their individual first-level requirements)
Probably not both at once, or if both, in a structured format we can reprocess. The most convenient form is:
<command> <directory> --provides → one module name per line
<command> <directory> --requires → one constraint (module + associated version constraint) per line
The --requires command needs to filter out the module names already present in the source tree, because in a multi-module project X that contains X/foo and X/bar modules, we want to build X/foo against the local version of X/bar, not some other one.
(and we will have some auditing work on the --provides output to make sure every provides line actually belongs to the project being built and is not a copy of someone else's code)
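[Editor's note: a toy illustration of that --provides/--requires interface, assuming a straightforward go.mod. A real implementation would parse the file properly (e.g. via go mod edit -json or golang.org/x/mod) rather than pattern-match.]

```shell
# Write a sample go.mod, then extract provides/requires lines from it.
cat > go.mod <<'EOF'
module example.com/x/foo

require (
    github.com/some/dependency v1.2.3
    github.com/another/dependency/v4 v4.0.0
)
EOF
awk '/^module /{print $2}' go.mod        # --provides: one module name per line
awk '/ v[0-9]/{print $1, $2}' go.mod     # --requires: module + version per line
```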
I lack the necessary perspective today to say how replaces should be handled.
Local replaces clearly need ignoring: if some code needs the X third-party module, we want it to use our curated X module, not some project-local directory containing a fork of X.
Non-local replaces? That's less clear to me.
If a non-local replace of B by C in module A means that every downstream user of module A will then use C instead of B when processing A code, that's fine with us; the fact that B exists is internal go compiler information, and we only need to know that C is needed in the A build container and in any container that uses the A module.
But if a non-local replace of B by C in module A only affects direct A builds, and indirect A builds will still use B, we'll just cry and yell and curse in a quiet corner, and then either remove the replace in the A module file, or make it transitive by patching our version of A's import statements to use C everywhere instead of B.
The result of the listing command will then be translated into rpm system identifiers, and used to populate the build GOPROXY.
Translation basically means:
1. applying some form of namespacing: "I need the go module named X" becomes "I need golang-module(X)" (sadly the golang() namespace is already taken by GOPATH sources, and can not be shared because go keeps GOPATH and module mode separate)
2. translating the golang version constraint syntax into rpm version constraint syntax. That's pretty easy for semver: "I need X@v1.2.3" becomes "I need golang-module(X) ≥ 1.2.3 and < 2"
For non-semver constraints, like a specific hash-identified snapshot, that's less nice. We do not allow hashes as valid version IDs rpm-side (because hash ordering can not be deduced from the hashes themselves), so anything that absolutely requires a specific snapshot of X will be translated into "I need golang-module(X)(commit=hash)", and golang-module(X)(commit=hash) is a separate ID from golang-module(X).
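[Editor's note: the semver half of that translation is mechanical enough to sketch. The output syntax is illustrative, not exact rpm rich-dependency syntax.]

```shell
# Sketch of the semver constraint translation: "X v1.2.3" becomes a
# ranged requirement on golang-module(X), bounded above by the next
# semver major.
translate() {
  mod=$1
  ver=${2#v}                 # strip the leading "v"
  major=${ver%%.*}           # semver major bounds the range above
  printf 'golang-module(%s) >= %s and < %s\n' "$mod" "$ver" "$((major + 1))"
}
translate github.com/some/dependency v1.2.3
```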
Because dependency cycles exist in the real world, and because we are extra careful to build only from things we have already vetted, there will be cases where we won't be able to satisfy upstream project requirements.
If A needs B, which needs C, which needs A, we will have to build one of A, B, C in reduced mode at some point to break the cycle (we call that bootstrapping). No idea whether it's better to inform the listing command that one of the module needs it reports will be ignored, or better to ignore it silently.
[available resources: — minimal system, — go compiler, — system GOPROXY populated with distro-produced go modules corresponding to identified project requirements, — prepared unpacked project sources in a specific directory]
C. We need a go mod pack command that creates packed module files from the unpacked project sources into a staging GOPROXY directory.
We can pass it the project version as argument (since it can not read it in the upstream go.mod files).
Since VCS info is not available, the time component of the info file can not be populated from it. And we definitely do not want the command to record the current time, because then the file content would depend on when it was built; we have processes that check that arch-agnostic files produced by different architecture builders are bit-for-bit identical, and we have security auditors that replay builds later in their own systems and are alarmed when the replay produces different results. And, frankly, the time info in VCSes is messy and unreliable, and mtimes can not be used (the source preparation process can apply patches, and that will change mtimes, even if you apply the very same patch in two different runs).
Thus, we'd prefer this time info not to exist in our own info files, or be set to zero, or, if go really needs it, to be passed as a parameter to go mod pack (but that's more manual work on our side to decide what to pass).
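For reference, the .info file is a small JSON document; a reproducible build would want its Time field pinned to a fixed value. A sketch (the zero time shown here is illustrative, not a value the go command is documented to accept):

```json
{"Version":"v1.2.3","Time":"0001-01-01T00:00:00Z"}
```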
This command needs to operate in bulk on all the module trees previously identified (so if it found 3 module trees, create 3 packed modules in the staging GOPROXY), or a subset (explicitly provided module list).
This command needs to output the list of created module files (the zip, ziphash, info, and mod files). The list file is a special case: we would consider its file envelope as belonging to each of the versions of a module, and the file content as belonging to no one in particular (ghost files in rpm terms). So whether it is listed or not, we will apply special processing to it either way. Probably better to list it, for cleanliness.
The "files created" list can either be written to stdout (in which case the command outputs nothing else to stdout) or to a list file specified as an argument. No particular preference; we can handle both.
This list is used to distribute the produced files into specific system components: upstream project A may contain sources for the A/B/C, A/B/D, and A/E modules, and we can choose to put all of those in a single system component, or split in two (A/B/C+A/B/D and A/E), or go fully granular with three system components. The level of splitting depends on the consequences in the system component dependency graph.
This go mod pack command will need to be more discriminating than the “bulk pack every file under the go.mod root directory, .gitignore included” behaviour I see go using right now.
If a project includes font files (for example the go font), we'll want to expose those in /usr/share/fonts so they are not restricted to the go compiler. If a project includes documentation files, we will deploy them in /usr/share/doc, not in the module zip file. And so on (protobuf files come to mind).
So, typically, we'd want go mod pack to pack *only* the module tree's Go source code (things that can be used by the go compiler), ideally only the source code that can be used on our platform (GOOS=linux), and to vet the packing of any other file.
So a basic go mod pack invocation only packs go source files and testdata.
There should be an info or dry-run mode that says "default packing ignored/will ignore all those files in the origin tree".
One could pass lists of regexes to the go mod pack command to either include other non-source-code files, or to exclude already-selected files from the packing (--include regex and --exclude regex flags).
The regexes are obviously very module specific; they depend on the module's resource needs and the distribution's packaging policies. The generic go tooling need not worry about those policies, just apply the provided regexes.
Ideally, we could pass go mod pack a --without module[@version] argument that causes it not to pack any of the code that requires module[@version]. That's necessary while bootstrapping, and to cull very expensive unit or integration tests that require hundreds of third-party modules to run. Otherwise we can approximate it via --exclude (but that will require more manual work on the Fedora side to identify the required excludes).
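Pulling the flags above together, a packing invocation might look something like this (entirely hypothetical syntax — go mod pack and all of these flags are proposals, not existing tooling; the module names and patterns are made up):

```text
# pack only Go sources + testdata, plus the fonts, minus the examples,
# and skip anything that needs the not-yet-bootstrapped module C
go mod pack --version v1.2.3 \
            --include '.*\.ttf$' \
            --exclude '^examples/' \
            --without example.com/C@v2 \
            --output-list packed-files.txt
```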
Please understand, we are not doing all this filtering and culling in opposition to upstream Go projects. We have no special wish to deviate from them. Our ideal world is upstream code releases that can be used directly, as-is, without any patching or filtering.
But, a lot of upstream projects lack discipline.
They will include files they have no legal right to distribute (and we will remove those before step A).
They will include files that can be distributed, but not modified (and entities like Fedora and Debian will think very hard if they really want to relay those, because they are contrary to our free software policies, and our users like to know that everything we ship can be modified without legal problems).
They will reference third-party modules with incompatible (or completely missing) licensing.
They will ship integration code that makes no sense outside of their own information system.
They will ship for years broken and bitrotten project/test/example code.
For them, continuing to include this code within their project is free: their go get will happily download hundreds of unvetted and unnecessary Go modules from the internet, and no one is likely to sue them for small legal mistakes.
But shipping this code is *not* free for us. A distribution is big enough that it can be sued. Any failing unit test costs human time to check that it is a harmless failure. Any module dep pulled in by this unnecessary code is yet another software project that needs to be checked, audited, integrated, and maintained at the system component level before it is available in the CI system.
So, we need to aggressively cull anything not necessary, to keep our integration costs and risks down.
And we really, really would appreciate it if project unit tests (+ testdata) and project production code were split into separate zip files, with separate build requirements, so the unit test dependency costs were optional (we'd typically pay project A's unit test costs when integrating project A, not when integrating something that uses project A).
For all those reasons, the go module files produced at this stage will definitely *not* match the hashes produced by the notary, even if we do not change a single line of go code.
[available resources: – minimal system, – go compiler, – system GOPROXY populated with distro-produced go modules corresponding to identified project requirements – prepared unpacked project sources in a specific directory, — staging GOPROXY containing candidate go modules corresponding to the built project]
D. we need a command to build the project binaries from the files in the system GOPROXY and the unpacked project sources, OR from the system GOPROXY and the staging GOPROXY contents
(ideally, the second option, to make sure the candidate module files in the staging directory are complete)
After this step we've hopefully finished producing files from the prepared unpacked project sources. At this stage rpm deploys every produced candidate file in a location that mirrors the target deployment paths, under a specific prefix.
So: – a file in /usr/bin is a binary installed from existing system components – a file in /prefix/usr/bin is a candidate binary produced from the ongoing build — a file in GOPROXY is a Go module installed from existing system components — a file in /prefix/GOPROXY is a candidate Go module file produced from the ongoing build
[available resources: – minimal system, – go compiler, – system GOPROXY populated with distro-produced go modules corresponding to identified project requirements – prepared unpacked project sources in a specific directory, — candidate tree under /prefix containing all the files that will end up in new system modules. Some of those will replace existing files in /. ]
E. we need a command to run the unit tests of every produced GOPROXY module (in the staging GOPROXY). Any failure (non-zero exit code) aborts the build process.
We'd really like this command to only take into account the candidate tree, and the existing / tree (for files that do not belong to the target system components).
[available resources: – minimal system, – go compiler, – system GOPROXY populated with distro-produced go modules corresponding to identified project requirements – prepared unpacked project sources in a specific directory, — candidate tree under /prefix containing all the files that will end up in new system modules. Some of those will replace existing files in /. ]
F. At this point rpm starts distributing the files contained in the staging tree into new system components. Therefore, it needs to compute the metadata of each of those system components.
So, we have a new run of the "what are you and what do you need" command, this time on every mod/zip/ziphash/info fileset in /prefix/GOPROXY, instead of on the content of the prepared unpacked project sources. So, probably some variation of the command in B.
Practically, those inspection runs cannot be triggered by complex objects like filesets, only by individual files, so we'll probably trigger them on the mod files, and assume that if the mod file is present, the rest is likely to be there too.
rpm keeps track of where each file will end up, and attributes the result of the query to the corresponding system component (it does not tell the queried fileset where it will end up).
So we now have brand-new shiny system components that contain clean, audited GOPROXY module filesets, declare they provide golang-module(foo) = x.y.z, and declare they need golang-module(bar) ≥ a.b.c and < a+1.
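In RPM metadata terms, that might look like the following sketch (module paths and versions are made up; the ranged requirement uses RPM's boolean-dependency syntax, which needs rpm ≥ 4.13):

```text
Provides: golang-module(example.com/foo) = 1.2.3
Requires: (golang-module(example.com/bar) >= 2.0.0 with golang-module(example.com/bar) < 3)
```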
Installation of one of those system components will make the module fileset exist in GOPROXY
But, that is not sufficient for go, because of the list files.
So, we need a final command, that rpm can invoke when it adds or removes files in GOPROXY, to recompute the list version indexes.
It could be approximated by a simple stupid shell script, but the go mod cache code upstream performs more sanity checks than that, and we'd like those sanity checks to be replicated in the "update list files" command.
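A minimal sketch of that "simple stupid shell script" baseline, assuming the standard GOPROXY cache layout (module/@v/version.info files); the real "update list files" command should layer the upstream sanity checks on top of this:

```shell
# regen_lists rebuilds the @v/list index for every module under a
# GOPROXY-style cache directory ($1): one version per line, derived
# from the *.info files, sorted.
regen_lists() {
    find "$1" -type d -name '@v' | while read -r vdir; do
        # each <version>.info file marks one published version
        ls "$vdir" | sed -n 's/\.info$//p' | sort > "$vdir/list"
    done
}

# demo on a throwaway proxy tree
proxy=$(mktemp -d)
mkdir -p "$proxy/example.com/foo/@v"
touch "$proxy/example.com/foo/@v/v1.0.0.info" "$proxy/example.com/foo/@v/v1.2.0.info"
regen_lists "$proxy"
cat "$proxy/example.com/foo/@v/list"
```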
And, that's pretty much all. We create clean new module files, we put them in system components, installation of those components causes the module files to exist in GOPROXY, and we'd like the compiler to make use of those module files without bothering us with notary checks (which will fail pretty much all the time, due to how the whole integration process is structured). The system components are digitally signed, but the build process has no access to the signing key (that is done on a separate system with an HSM, for security reasons), so the build process cannot generate detached signature files.
I would expect the amount of manual integration work needed would scale with the amount of local changes required, not with the amount of dependencies. E.g., if you have to modify a core Go module to work better on Fedora, I would think you make that one change, and your build system and tooling would handle automatically rebuilding dependencies as appropriate.
Yes, that's how it works, but only because the underlying build commands are not supposed to discriminate between Fedora build ids (releases).
We build mymodule@v1.2.2 with some level of fixing and patching (release 1). That creates a mod/zip/ziphash/info fileset.
We build all the things that need mymodule@v1.2.2
Some time later a new issue causes us to adjust the fixing done for mymodule@v1.2.2
So we build mymodule@v1.2.2 release 2 with the new fixes. For a lot of languages that would be enough: it would produce a new shared library, transparently used by all downstream users.
With Go's static linking, that creates a new mod/zip/ziphash/info fileset for mymodule@v1.2.2 (with new hashes),
and we have to perform the additional step of rebuilding everything that uses mymodule@v1.2.2, recursively.
We don't want the compiler to complain about the hash change or to compare it to some internet notary.
We can certainly pass the release id info to the pack command in C so it is recorded somewhere in the produced mod/zip/ziphash/info fileset and each release fileset is unique.
That is, as long as it is only recorded for informational purposes, and the go command does not try to treat the result as different from mymodule@v1.2.2, to discriminate between releases, or to request a specific release.
Best regards,
¹ Because first, sometimes the on-disk representation of two components will clash. That's not supposed to happen with Go modules, but our system is not Go specific, so it was not specced around Go module particularities, and who knows at this point whether the module no-clash design will be robust WRT the weird things upstreams like to invent in the real world.
And second, we want to be sure that what we know about component relationships is the truth, so security teams can rely on the system component dependency graph when analysing incidents. The best way to make sure it is the truth is to disallow anything not identified in the CI container.
² (vendored copies of third-party code, things we cannot distribute legally, like for example arial.ttf when upstream needs a "free" font and does not understand copyright law, etc.).
³ For historical reasons, because our CI infrastructure design antedates modern VCSes, and because we absolutely do not want the build commands to start pulling things from VCS history that do not match the version we're attempting to build.
Quoting 'Matthew Dempsky' via golang-dev (2019-03-08 21:01:32)
Can you provide more detailed reproduction steps?
Here's a reproducer:
$ go mod init example.org/my-project
go: creating new go.mod: module example.org/my-project
$ cat >> main.go << "EOF"
package main
import _ "example.com/other-module"
func main() {}
EOF
$ go list -mod=readonly ./...
build example.org/my-project: cannot load example.com/other-module: import lookup disabled by -mod=readonly
I think we'd both misunderstood Nicolas's complaint the first time -- it's not that it still tries to access the network with -mod=readonly, but that even for basic commands for which a read-only mod file / no network access really *shouldn't* be a problem, it still just bombs out.
I'm not familiar with the code, but this sounds like early in the common program logic there's a check as to whether any of the module data needs updating, then abort if it can't be done -- before even looking at what the user has asked us to do.
I agree this is kinda sad; it seems like it would be better if the go tool still tried to fulfill the user's request to the extent possible when -mod=readonly is set.
-Ian
On Sat, Mar 9, 2019 at 2:55 PM Ian Denhardt ian@zenhack.net wrote:
Here's a reproducer:
$ go mod init example.org/my-project
go: creating new go.mod: module example.org/my-project
$ cat >> main.go << "EOF"
package main
import _ "example.com/other-module"
func main() {}
EOF
$ go list -mod=readonly ./...
build example.org/my-project: cannot load example.com/other-module: import lookup disabled by -mod=readonly
I think we'd both misunderstood Nicolas's complaint the first time -- it's not that it still tries to access the network with -mod=readonly, but that even for basic commands for which a read-only mod file / no network access really *shouldn't* be a problem, it still just bombs out.
The go command is "bombing out" here because the code in question is trying to import "example.com/other-module", yet no dependency listed in go.mod provides that package (there are no dependencies in go.mod at all!). [One of the things go list computes is the dependency graph (visible if you use go list -json or go list -f '{{.Deps}}').]
It's true that if you have a CI/CD system, then for a successful build you need to push complete code to it, and that includes a go.mod with listed dependencies sufficient to cover the relevant imports, as I described in my original reply to Nicolas. If go.mod were up to date (for example, if 'go mod tidy' found no work to do) and the local Go module cache already contained the listed dependencies, then the 'go list -mod=readonly' command you showed would complete just fine. You can't push code with an incomplete dependency list to the CI/CD system and expect it to build, any more than you can push code with missing function definitions and expect it to build.
Best, Russ
On 2019-03-11 15:43, Russ Cox wrote:
On Sat, Mar 9, 2019 at 2:55 PM Ian Denhardt ian@zenhack.net wrote:
Here's a reproducer:
$ go mod init example.org/my-project
go: creating new go.mod: module example.org/my-project
$ cat >> main.go << "EOF"
package main
import _ "example.com/other-module"
func main() {}
EOF
$ go list -mod=readonly ./...
build example.org/my-project: cannot load example.com/other-module: import lookup disabled by -mod=readonly
I think we'd both misunderstood Nicolas's complaint the first time --
it's not that it still tries to access the network with -mod=readonly, but that even for basic commands for which a read-only mod file/no network access really *shouldn't* be a problem, it will still just bomb out.
The go command is "bombing out" here because the code in question is trying to import "example.com/other-module", yet no dependency listed in go.mod provides that package (there are no dependencies in go.mod at all!). [One of the things go list computes is the dependency graph (visible if you use go list -json or go list -f '{{.Deps}}').]
It's true that if you have a CI/CD system, then for a successful build you need to push complete code to it, and that includes a go.mod with listed dependencies sufficient to cover the relevant imports, as I described in my original reply to Nicolas. If go.mod were up to date (for example, if 'go mod tidy' found no work to do) and the local Go module cache already contained the listed dependencies, then the 'go list -mod=readonly' command you showed would complete just fine. You can't push code with an incomplete dependency list to the CI/CD system and expect it to build, any more than you can push code with missing function definitions and expect it to build.
And as I have explained in the detailed description Matthew requested, our construction of the CI/CD environment is incremental, so the assumption in the go tool code that "everything is there and if it is not it can be downloaded directly" does not work for us.
Regards,
On Mon, Mar 11, 2019 at 8:01 AM Nicolas Mailhot nicolas.mailhot@laposte.net wrote:
And as I have explained in the detailed description Matthew requested,
To be clear, I was asking for details to reproduce the technical issues you're running into. I may have missed them, but I don't believe you've provided these.
our construction of the CI/CD environment is incremental, so the
assumption in the go tool code that "everything is there and if it is not it can be downloaded directly" does not work for us
If your build system provides the Go source/packages, then Go won't try to download them directly itself. This is how Go works within Google's build system (which, again, has the same no-network-access limitation as the build system you're describing, and yet supports building Go programs/packages).
It sounds like you're having trouble with providing the Go source/packages in the format expected by cmd/go. If you would provide reproduction steps of what you're doing and the problems you're running into, we can help advise what changes you need to make, and maybe identify tooling improvements to make that easier.
* via golang-dev:
If your build system provides the Go source/packages, then Go won't try to download them directly itself. This is how Go works within Google's build system (which, again, has the same no-network-access limitation as the build system you're describing, and yet supports building Go programs/packages).
How do you plan to bypass the notary requirement? Do you have a non-Internet connection to it?
Or do you pre-populate the expected data before starting the builds, similar to OCSP stapling?
Thanks, Florian
On Mon, Mar 11, 2019 at 11:09 AM Florian Weimer fweimer@redhat.com wrote:
How do you plan to bypass the notary requirement?
I don't work on the Go-inside-Google integration work, so someone else may have to weigh in if it's appropriate to share those plans. My point with mentioning Google's hermetic build system is more to emphasize that the Go developers are very familiar with that build model and care about keeping it working. The claims that the Go module system is being built without consideration of those requirements seem very mistaken.
That said, the notary is only involved when adding new lines to the go.sum file to handle adding or updating dependencies. There's no requirement to contact the Go notary when the go.sum file is already complete.
On 2019-03-11 19:15, Matthew Dempsky wrote:
On Mon, Mar 11, 2019 at 11:09 AM Florian Weimer fweimer@redhat.com wrote:
How do you plan to bypass the notary requirement?
I don't work on the Go-inside-Google integration work, so someone else may have to weigh in if it's appropriate to share those plans. My point with mentioning Google's hermetic build system is more to emphasize that the Go developers are very familiar with that build model and care about keeping it working. The claims that the Go module system is being built without consideration of those requirements seem very mistaken.
That said, the notary is only involved when adding new lines to the go.sum file to handle adding or updating dependencies. There's no requirement to contact the Go notary when the go.sum file is already complete.
Which will be pretty much the case all the time for us, since we build against our own curated set of modules, not the ones upstream found on the Internet.
In fact, I'm pretty sure we will start each build by removing the upstream go.sum file altogether.
Regards,
Quoting Matthew Dempsky (2019-03-11 13:54:07)
On Mon, Mar 11, 2019 at 8:01 AM Nicolas Mailhot <nicolas.mailhot@laposte.net> wrote:
And as I have explained in the detailed description Matthew requested,
To be clear, I was asking for details to reproduce the technical issues you're running into. I may have missed them, but I don't believe you've provided these.
our construction of the CI/CD environment is incremental, so the assumption in the go tool code that "everything is there and if it is not it can be downloaded directly" does not work for us
If your build system provides the Go source/packages, then Go won't try to download them directly itself. This is how Go works within Google's build system (which, again, has the same no-network-access limitation as the build system you're describing, and yet supports building Go programs/packages). It sounds like you're having trouble with providing the Go source/packages in the format expected by cmd/go. If you would provide reproduction steps of what you're doing and the problems you're running into, we can help advise what changes you need to make, and maybe identify tooling improvements to make that easier.
In a parallel thread, a Nix developer was asking about basically the same use case, and was pointed at `go build -mod=vendor`. It seems like this does exactly what is wanted here -- just use the code we have locally. Nicolas, does that address your use case?
-Ian
On 2019-03-11 19:18, Ian Denhardt wrote:
Quoting Matthew Dempsky (2019-03-11 13:54:07)
On Mon, Mar 11, 2019 at 8:01 AM Nicolas Mailhot <nicolas.mailhot@laposte.net> wrote:
And as I have explained in the detailed description Matthew requested,
To be clear, I was asking for details to reproduce the technical issues you're running into. I may have missed them, but I don't believe you've provided these.
our construction of the CI/CD environment is incremental, so the assumption in the go tool code that "everything is there and if it is not it can be downloaded directly" does not work for us
If your build system provides the Go source/packages, then Go won't try to download them directly itself. This is how Go works within Google's build system (which, again, has the same no-network-access limitation as the build system you're describing, and yet supports building Go programs/packages). It sounds like you're having trouble with providing the Go source/packages in the format expected by cmd/go. If you would provide reproduction steps of what you're doing and the problems you're running into, we can help advise what changes you need to make, and maybe identify tooling improvements to make that easier.
In a parallel thread, a Nix developer was asking about basically the same use case, and was pointed at `go build -mod=vendor`. It seems like this does exactly what is wanted here -- just use the code we have locally. Nicolas, does that address your use case?
That's definitely *not* what we want (our processes call for removing vendor and its equivalents in other languages before going to the build step). We do *not* like vendor. We do *not* like GOPATH. Modules are good. As long as we can create and use *our* modules, not someone else's idea of what a module's content should look like. Our modules are basically the same thing as the upstream modules (API compatible) with lots of warts removed.
As I already noted, there is a deep lack of understanding on the Go upstream side of how we work (for all computer system languages, not just Go). What they imagine is not what we need or want.
Regards,
Quoting Nicolas Mailhot (2019-03-12 04:22:45)
In a parallel thread, a Nix developer was asking about basically the same use case, and was pointed at `go build -mod=vendor`. It seems like this does exactly what is wanted here -- just use the code we have locally. Nicolas, does that address your use case?
That's definitely *not* what we want (our processes call for removing vendor and its equivalents in other languages before going to the build step).
Not sure we're on the same page re: what was suggested. The idea is you'd point the vendor directory at something that contains the distro's versions of the dependencies, and use -mod=vendor to bypass all of the smarts around fetching modules and such. I was not proposing using an existing vendor directory provided by upstream or anything like that.
This is more or less what Jakub described in a sibling comment, as far as I can tell.
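For what it's worth, the vendor tree that -mod=vendor consumes is just a directory of sources plus a vendor/modules.txt manifest, so a distro could in principle generate it from packaged modules. A minimal manifest looks like this (module path, version, and package list are illustrative):

```text
# example.com/foo v1.2.3
example.com/foo
example.com/foo/internal/util
```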
As I already noted, there is a deep lack of understanding on the Go upstream side of how we work (for all computer system languages, not just Go). What they imagine is not what we need or want.
This definitely jibes with my own memories from the last time I was a maintainer on a distro -- language package managers and module systems generally tend to be much more oriented towards developers using the modules than towards distro maintainers packaging them. It's to the point where, as an end-user of a distro, I have a boatload of executables in places like ~/.local/bin, because I use enough oddball tools that aren't popular enough to be maintained by the distros themselves, and packaging them myself just isn't worth the trouble. I do wish these tools played more nicely together.
On 2019-03-13 03:24, Ian Denhardt wrote:
Quoting Nicolas Mailhot (2019-03-12 04:22:45)
In a parallel thread, a Nix developer was asking about basically the same use case, and was pointed at `go build -mod=vendor`. It seems like this does exactly what is wanted here -- just use the code we have locally. Nicolas, does that address your use case?
That's definitely *not* what we want (our processes call for removing vendor and its equivalents in other languages before going to the build step).
Not sure we're on the same page re: what was suggested. The idea is you'd point the vendor directory at something that contains the distro's versions of the dependencies, and use -mod=vendor to bypass all of the smarts around fetching modules and such.
But, again, I don't *want* a vendor directory, distro or otherwise.
vendor (and GOPATH) have always been a major PITA to assemble and maintain (and I speak as the person who wrote at least half of the code that assembles and maintains them Fedora-side).
Ideally, I'd like proper shared libs, because rebuilding every dependent on changes is an obstacle to robust security (you forget one rebuild and poof, you're owned).
Barring that, I'd settle for a directory of modules, which was what GOPROXY was supposed to deliver, before Go upstream embarked on its mad crusade to squeeze out anything between dev and production.
On Wed, Mar 13, 2019, 09:14 Nicolas Mailhot nicolas.mailhot@laposte.net wrote:
On 2019-03-13 03:24, Ian Denhardt wrote:
Quoting Nicolas Mailhot (2019-03-12 04:22:45)
In a parallel thread, a Nix developer was asking about basically the same use case, and was pointed at `go build -mod=vendor`. It seems
like
this does exactly what is wanted here -- just use the code we have locally. Nicolas, does that address your use case?
That's definitely *not* what we want (our processes call for removing vendor and its equivalents in other languages before going to the build step).
Not sure we're on the same page re: what was suggested. The idea is you'd point the vendor directory at something that contains the distro's versions of the dependencies, and use -mod=vendor to bypass all of the smarts around fetching modules and such.
But, again, I don't *want* a vendor directory, distro or otherwise.
vendor (and GOPATH) have always been a major PITA to assemble and maintain (and I speak as the person who wrote at least half of the code that assembles and maintains them Fedora-side).
Ideally, I'd like proper shared libs, because rebuilding every dependent on changes is an obstacle to robust security (you forget one rebuild and poof, you're owned).
Go has supported -buildmode=shared on all major architectures for some time. I'm curious why nobody uses that yet.
Fabio
Barring that, I'd settle for a directory of modules, which was what GOPROXY was supposed to deliver, before Go upstream embarked on its mad crusade to squeeze out anything between dev and production.
-- Nicolas Mailhot _______________________________________________ golang mailing list -- golang@lists.fedoraproject.org To unsubscribe send an email to golang-leave@lists.fedoraproject.org Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines List Archives: https://lists.fedoraproject.org/archives/list/golang@lists.fedoraproject.org
----- Original Message -----
From: "Fabio Valentini" decathorpe@gmail.com To: golang@lists.fedoraproject.org Sent: Wednesday, March 13, 2019 10:07:35 AM Subject: Re: [golang-dev] proposal: public module authentication with the Go notary
On Wed, Mar 13, 2019, 09:14 Nicolas Mailhot < nicolas.mailhot@laposte.net > wrote:
On 2019-03-13 03:24, Ian Denhardt wrote:
Quoting Nicolas Mailhot (2019-03-12 04:22:45)
In a parallel thread, a Nix developer was asking about basically the same use case, and was pointed at `go build -mod=vendor`. It seems like this does exactly what is wanted here -- just use the code we have locally. Nicolas, does that address your use case?
That's definitely *not* what we want (our processes call for removing vendor and its equivalents in other languages before going to the build step).
Not sure we're on the same page re: what was suggested. The idea is you'd point the vendor directory at something that contains the distro's versions of the dependencies, and use -mod=vendor to bypass all of the smarts around fetching modules and such.
But, again, I don't *want* a vendor directory, distro or otherwise.
vendor (and GOPATH) have always been a major PITA to assemble and maintain (and I speak as the person who wrote at least half of the code that assembles and maintains them Fedora-side).
Ideally, I'd like proper shared libs, because rebuilding every dependent on changes is an obstacle to robust security (you forget one rebuild and poof, you're owned).
Go has supported -buildmode=shared on all major architectures for some time. I'm curious why nobody uses that yet.
Fabio
From the Fedora perspective, due to the (in)stability of the upstream code (API changes), there is nearly no effective difference compared to non-shared builds (you will have to rebuild "everything" anyway). I have been toying for a long time with switching at least the stdlib to be dynamically linked via BuildRequires, but haven't gotten around to pushing it out (i.e. coercing maintainers to BR it; IMHO we have bigger nuts to crack atm). The other thing is that the ABI of the dynamic libs is not guaranteed between Go releases; on the other hand, in my experiments I haven't seen breaking changes, but it is not guaranteed. I might have just been lucky with my fairly small and simple test cases.
On a practical note, nobody is barring you from BR-ing golang-shared (it currently exists on all arches) and actually starting to build with a dynamically linked stdlib. With the current pre-Go-module environment it should just work; you don't even need to explicitly require it at runtime, RPM magic will require it automatically.
I guess, from the regular Go user's point of view, static linking (of the Go code bits) is still the new shiny thing ("hey, I don't need any runtime deps, apart from libc"), and nobody has really rediscovered all its pitfalls in production yet.
JC
Barring that, I'd settle for a directory of modules, which was what GOPROXY was supposed to deliver before Go upstream embarked on its mad crusade to squeeze out anything between dev and production.
-- Nicolas Mailhot
_______________________________________________
golang mailing list -- golang@lists.fedoraproject.org
To unsubscribe send an email to golang-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/golang@lists.fedoraproject.org
* Jakub Cajka:
From the Fedora perspective it is due to the stability of the upstream code(API changes) there is nearly no effective difference compared to the non shared(you will have to rebuild "everything" anyway). I have been toying for a long time with switching at least sdtlib to be dynamically linked in the BR, but haven't got around to push it out(i.e. coerce maintainers to BR it, IMHO we have bigger nuts to crack atm).
I don't think linking the runtime dynamically is feasible.
The Go 1 compatibility promise does not treat adding unexported fields to a struct as a breaking change, for example:
https://golang.org/doc/go1compat
But each time a struct is allocated, the caller inlines its size, so adding unexported fields is very much an ABI-breaking change.
Thanks, Florian
----- Original Message -----
From: "Florian Weimer" fweimer@redhat.com To: "Jakub Cajka" jcajka@redhat.com Cc: golang@lists.fedoraproject.org Sent: Wednesday, March 13, 2019 10:58:16 AM Subject: Re: [golang-dev] proposal: public module authentication with the Go notary
This is not (or shouldn't be) the runtime, just the standard library.
JC
* Jakub Cajka:
This is not(shouldn't be) a runtime, just the standard library.
Uhm, what's the difference between the two? There is no separate run-time library.
Thanks, Florian
----- Original Message -----
From: "Florian Weimer" fweimer@redhat.com To: "Jakub Cajka" jcajka@redhat.com Cc: golang@lists.fedoraproject.org Sent: Wednesday, March 13, 2019 11:44:09 AM Subject: Re: [golang-dev] proposal: public module authentication with the Go notary
Uhm, what's the difference between the two? There is no separate run-time library.
Thanks, Florian
I have a feeling that we have had this discussion in the past. If I'm not mistaken, https://golang.org/pkg/ is the stdlib (including the "runtime" package that enables interaction with the runtime), while the runtime is a separate part of the language/compiler/produced binary, i.e. it is baked into each binary; it is not a "public" lib in the C sense and AFAIK can't be linked in dynamically. For reference: https://golang.org/doc/faq#runtime . Hope I'm getting it right.
JC
* Jakub Cajka:
I have a feeling that we have had this discussion in the past. If I'm not mistaken then https://golang.org/pkg/ is the stdlib(including "runtime" package that enables interaction with runtime) and runtime is separated part of a language/compiler/produced binary, i.e. it is baked in to each binary it is not "public" lib in C sense and AFAIK can't be linked in dynamically. For reference https://golang.org/doc/faq#runtime . Hope I'm getting it right.
I expect that the garbage collector would be linked dynamically as well. It should be easy enough to check by disassembling a dynamically linked Go program. However, -linkshared does not currently work in Fedora 29 (after installing golang-shared):
$ go build -linkshared t.go
go build runtime/internal/atomic: open /usr/lib/golang/pkg/linux_amd64_dynlink/runtime/internal/atomic.a: permission denied
go build internal/cpu: open /usr/lib/golang/pkg/linux_amd64_dynlink/internal/cpu.a: permission denied
go build sync/atomic: open /usr/lib/golang/pkg/linux_amd64_dynlink/sync/atomic.a: permission denied
go build runtime/cgo: open /usr/lib/golang/pkg/linux_amd64_dynlink/runtime/cgo.a: permission denied
go build vendor/golang_org/x/crypto/curve25519: open /usr/lib/golang/pkg/linux_amd64_dynlink/vendor/golang_org/x/crypto/curve25519.a: permission denied
Thanks, Florian
----- Original Message -----
From: "Florian Weimer" fweimer@redhat.com To: "Jakub Cajka" jcajka@redhat.com Cc: golang@lists.fedoraproject.org Sent: Wednesday, March 13, 2019 12:56:57 PM Subject: Re: [golang-dev] proposal: public module authentication with the Go notary
Interesting. I haven't touched it or put much time into it since doing the initial scoping ~3 years ago (so I would take anything I have said with a grain of salt). Would you mind opening a BZ?
JC
* Jakub Cajka:
Interesting. I haven't touched and put much time in to that since doing the initial scoping ~3y ago(so I would take anything that I have said with grain of salt). Would you mind opening BZ?
Fair enough: https://bugzilla.redhat.com/show_bug.cgi?id=1688261
Thanks, Florian
----- Original Message -----
From: "Florian Weimer" fweimer@redhat.com To: "Jakub Cajka" jcajka@redhat.com Cc: golang@lists.fedoraproject.org Sent: Wednesday, March 13, 2019 1:17:16 PM Subject: Re: [golang-dev] proposal: public module authentication with the Go notary
Thanks :).
JC
* Fabio Valentini:
Go has supported -buildmode=shared on all major architectures for some time. I'm curious why nobody uses that yet.
There is no ABI document like this one for Go:
https://github.com/itanium-cxx-abi/cxx-abi
I assume that the Go compiler changes ABI between releases, just like GCC for C++ did before GCC 3.4 or thereabouts.
In addition to toolchain support, participating packages would have to follow a set of rules like this:
https://community.kde.org/Policies/Binary_Compatibility_Issues_With_C%2B%2B
Of course, the exact rules will be different for Go, but some of the issues are similar.
Thanks, Florian
* Florian Weimer:
I assume that the Go compiler changes ABI between releases, just like GCC for C++ did before GCC 3.4 or thereabouts.
Here's a recent example of such a change:
| This patch by Cherry Zhang changes the Go frontend and libgo to pass
| the old slice's ptr/len/cap by value to growslice. In the C calling
| convention, on AMD64, and probably a number of other architectures,
| a 3-word struct argument is passed on stack. This is less efficient
| than passing in three registers. Further, this may affect the code
| generation in other part of the program, even if the function is not
| actually called.
On 2019-03-13 10:07, Fabio Valentini wrote:
Hi Fabio
Go has supported -buildmode=shared on all major architectures for some time. I'm curious why nobody uses that yet.
Probably because no one has figured out yet how to derive "standard" dynamic library names from upstream artefact naming and versioning.
Go modules are a step forward, because they give us versions and the module name as a derivation root. But someone still needs to define the derivation recipe.
As we've all seen while naming Go distribution packages, it's not so easy to define names that do not exist upstream.
Regards,
----- Original Message -----
From: "Ian Denhardt" ian@zenhack.net To: "Matthew Dempsky" mdempsky@google.com, "Nicolas Mailhot" nicolas.mailhot@laposte.net Cc: "Russ Cox" rsc@golang.org, "'Matthew Dempsky' via golang-dev" golang-dev@googlegroups.com, golang@lists.fedoraproject.org Sent: Monday, March 11, 2019 7:18:49 PM Subject: Re: [golang-dev] proposal: public module authentication with the Go notary
Quoting Matthew Dempsky (2019-03-11 13:54:07)
On Mon, Mar 11, 2019 at 8:01 AM Nicolas Mailhot <[1]nicolas.mailhot@laposte.net> wrote:
And as I have explained in the detailed description Matthew requested,

To be clear, I was asking for details to reproduce the technical issues you're running into. I may have missed them, but I don't believe you've provided these.

our construction of the CI/CD environment is incremental, so the assumption in the go tool code that "everything is there and if it is not it can be downloaded directly" does not work for us

If your build system provides the Go source/packages, then Go won't try to download them directly itself. This is how Go works within Google's build system (which, again, has the same no-network-access limitation as the build system you're describing, and yet supports building Go programs/packages). It sounds like you're having trouble with providing the Go source/packages in the format expected by cmd/go. If you would provide reproduction steps of what you're doing and the problems you're running into, we can help advise what changes you need to make, and maybe identify tooling improvements to make that easier.
In a parallel thread, a Nix developer was asking about basically the same use case, and was pointed at `go build -mod=vendor`. It seems like this does exactly what is wanted here -- just use the code we have locally. Nicolas, does that address your use case?
-Ian
I think that our issue in Fedora is that we would like to continue using something that resembles the current "GOPATH/vendor" approach, where the tooling is not trying to outsmart us. In the distribution environment we have full control over the build environment and dependencies, and we want to (and will have to) retain it. I believe that having some "dumb mode" for operating all the go commands/tools (without the hard need for module verification, even with reduced features) would be much appreciated, so you can use it like any other compiler/interpreter by just throwing code at it.
IMHO we will tap the metadata that Go modules carry to simplify our processes and automation, but, for example, having to run our own notary (even just in the build environment) just to be able to use go build/go list/go vet... might be really costly, and we will most probably be looking for any alternative.
It has been mentioned several times, in this bunch of threads, that Google's internal build infrastructure has the same requirements and is facing similar challenges. It would be great to hear how they plan to work around the upcoming module changes.
JC
-- You received this message because you are subscribed to the Google Groups "golang-dev" group. To unsubscribe from this group and stop receiving emails from it, send an email to golang-dev+unsubscribe@googlegroups.com. For more options, visit https://groups.google.com/d/optout.
Doesn't Google use Blaze/Bazel for builds internally? I don't think go get and modules are hugely relevant in that context (see https://github.com/bazelbuild/rules_go#does-this-work-with-go-modules).
Quoting Russ Cox (2019-03-11 10:43:28)
[One of the things go list computes is the dependency graph (visible if you use go list -json or go list -f '{{.Deps}}').]
Ah, that makes more sense.
[resending to keep the golang-dev listserver happy]
Hi Jakub,
The notary part of Go modules, like the rest of the module implementation, suffers from a lack of understanding of integration and QA workflows, and a simplistic dev-centric worldview.
As designed, it will have to be patched out by every single entity trying to perform large scale Go code integration and QA, starting with the usual suspects, Linux distributions. Where that will leave upstream Go, since its main target platform is Linux, I have no idea (I suspect there will be some angst @Google about it).
I write this from the POV of the person trying to update the Fedora Linux Go tooling so we can ship Go with module mode on in August. Whatever we do at that time will be mirrored by Fedora downstreams like RHEL and CentOS, and by all the other Linux variants that take their inspiration from Fedora.
It's not an armchair analysis: I have already written way too much module-oriented custom code because the upstream tooling is deficient, and I've read my share of upstream issue reports.
CONTINUOUS TRANSPARENT INTERNET LOOKUPS ARE NOT ACCEPTABLE POST-DEV
When you integrate large volumes of code, targeting multiple hardware architectures, it's not acceptable to have the result of the Arm QA run differ from the result of the x86_64 run just because it was scheduled some minutes later on the QA farm, the state of the internet changed in the meantime, and the Go tools decided to "help" you by looking at the internet behind your back and transparently updating the state being QAed. That's nothing new or Go-specific, and that's why our build system disables network access in build containers after populating them with the code to be built and tested; it already did so before someone decided to invent Go.
But that QA reproducibility constraint was not understood by the Go developers, and pretty much all the Go tools that have been ported to module mode will attempt internet accesses at the slightest opportunity, changing the project state based on the results of those accesses, with no way to control or disable this, and with nasty failure modes if the internet access fails.
The Go issue tracker is filling up with reports of people expressing their incredulity (or more) after hitting a variation of this problem on their QA systems.
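For reference, a sketch (not a verified recipe) of the environment settings one would try, in the Go releases current around this thread, to keep the go tool from reaching the network during a build; whether every module-mode tool honors these settings is exactly what this thread disputes.

```shell
# Assumed knobs: GOFLAGS (Go 1.11+) and the GOPROXY=off value; the
# go commands themselves are shown as comments only.
export GOFLAGS='-mod=vendor'  # resolve imports from the vendor tree only
export GOPROXY='off'          # fail instead of fetching over the network
# The QA build itself would then run, e.g.:
#   go build ./... && go vet ./...
echo "GOFLAGS=$GOFLAGS GOPROXY=$GOPROXY"
```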
QA-ED CODE IS NOT PRISTINE DEV CODE
No code is ever perfect.
When you integrate large volumes of code, targeting multiple hardware architectures, you *will* hit problems missed by the original upstream dev team. And you *will* have to patch them up, and you *will* have to build the result without waiting for upstream to look at it, because in production the show must go on: some of those problems will eventually have zero-day security implications, and the upstreaming lag is not acceptable (assuming upstream is available and friendly, which is not necessarily the case).
So no serious shop is ever going to build its information system from pristine upstream dev code or pristine upstream Go modules. Building from pristine upstream code is a nice ideal, but in the real world it's utopia.
If you don't believe me, take your favorite app and try to build a system containing it from genuine unadulterated upstream code (that means without relying on Fedora, Debian or whatever, since we all know they are full of "nasty" patched code).
Again, this the-real-world-is-not-perfect technicality was not understood by the Go developers. Sure, they gave us the replace directive in Go modules. But it is opt-in, so it only works on a small scale, when fixing third-party code is the exception and only a handful of projects use that third-party code.
On a large scale, everyone will just preventively rename everything by default, just to retain the ability to perform fixes easily. You'll end up with a blanket replacement of every "upstream" module name with a "qaed/upstream" fork. And it will probably be easier to rewrite all the imports in the source code to point to qaed/upstream by default.
That will result in a huge mess, much larger than the current GOPATH vendoring mess.
All because the design does not take into account the last mile QA patching that occurs before putting code in production.
Blindly enforcing renaming rules, because you don't understand or accept the existence of QA middlemen, is what produced things like Iceweasel. And I don't think Mozilla or Debian were ever happy about it.
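For reference, the opt-in replace mechanism discussed above looks roughly like this in a go.mod file; all module paths here are hypothetical, and the directory form shown points a dependency at a locally patched copy:

```
module example.com/app

go 1.12

require github.com/upstream/lib v1.2.3

// Redirect the pristine upstream module to the QAed, locally patched
// copy; every consuming module must carry such a line, which is the
// scaling problem described above.
replace github.com/upstream/lib => ../qaed/upstream-lib
```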
PUBLISHING QA-ED CODE CAN NOT DEPEND ON A THIRD PARTY
At least, not if you want the result to work in a free software ecosystem. Freedom 4 “The freedom to distribute copies of your modified versions to others” is not “The freedom to distribute copies of your modified versions to others, but only if you tell this third party first”.
And Google is free to choose not to target a free software ecosystem, but that means all the entities that do target a free software ecosystem (like Linux distributions), or that have happily built their infrastructure on free software products (like every single major cloud operator out there), will have to fork Go, or reorient their software investments somewhere else, or some combination of those. And I don't think Google wants that, or will be happy about it.
But, that's the situation the current notary design will create. Because in trying to enforce a dev utopia, it is going to break the workflows of a large proportion of the current Go ecosystem.
Any working system will need a way to declare trust in a local authority (the shop QA team) or a set of authorities (the shop QA team, and trusted third parties), and not have this trust rely on continuous access to a server, regardless of who controls it (so detached digital signatures, not direct server hash lookups).
And I’ll stop here before I write things I will regret later.
Regards,