diff mbox series

[v4,1/2] package/pkg-golang.mk: allow packages to override download GOPROXY

Message ID 20240805163133.4126564-1-james.hilliard1@gmail.com
State Changes Requested
Headers show
Series [v4,1/2] package/pkg-golang.mk: allow packages to override download GOPROXY | expand

Commit Message

James Hilliard Aug. 5, 2024, 4:31 p.m. UTC
Some packages like the upcoming tailscale package need to set GOPROXY
to get the correct dependencies.

Signed-off-by: James Hilliard <james.hilliard1@gmail.com>
---
 package/pkg-golang.mk | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Christian Stewart Aug. 5, 2024, 5:34 p.m. UTC | #1
James,

On Mon, Aug 5, 2024, 9:31 AM James Hilliard <james.hilliard1@gmail.com>
wrote:

> Some packages like the upcoming tailscale package need to set GOPROXY
> to get the correct dependencies..
>

I still highly disagree with setting GOPROXY per package.

I have been working with the tailscale devs to fix the dependency issues.
We fixed most of it.

Additionally I'm waiting for the anchore/quill developers to get back to me
about removing some Git LFS files from their module that also break GOPROXY.

The decision to use the proxy or not (run by Google) should be up to the
user. If I want to download everything from Git, that should be fully
possible. (And usually is with GOPROXY direct)

I'll follow up with those GitHub issues today and see if I can get the ball
rolling on another tailscale release with fixed dependencies.

Thanks,
Christian Stewart
Thomas Petazzoni Aug. 5, 2024, 8:31 p.m. UTC | #2
Hello Christian,

On Mon, 5 Aug 2024 10:34:44 -0700
Christian Stewart via buildroot <buildroot@buildroot.org> wrote:

> I still highly disagree with setting GOPROXY per package.
> 
> I have been working with the tailscale devs to fix the dependency issues.
> We fixed most of it.
> 
> Additionally I'm waiting for the anchore/quill developers to get back to me
> about removing some Git LFS files from their module that also break GOPROXY.
> 
> The decision to use the proxy or not (run by Google) should be up to the
> user. If I want to download everything from Git, that should be fully
> possible. (And usually is with GOPROXY direct)
> 
> I'll follow up with those GitHub issues today and see if I can get the ball
> rolling on another tailscale release with fixed dependencies.

Thanks a lot for your feedback. When seeing the new iteration from
James, my initial reaction was "I need the feedback from Christian on
this", and you had already given your feedback.

That being said: should we still have a way to patch/fix those kind of
issues on our side, until upstream fixes it? My understanding is that
this could be fixed on our side if we were able to apply a patch before
the Go vendoring happens. Is that correct?

Thomas
James Hilliard Aug. 5, 2024, 10:01 p.m. UTC | #3
On Mon, Aug 5, 2024 at 2:31 PM Thomas Petazzoni
<thomas.petazzoni@bootlin.com> wrote:
>
> Hello Christian,
>
> On Mon, 5 Aug 2024 10:34:44 -0700
> Christian Stewart via buildroot <buildroot@buildroot.org> wrote:
>
> > I still highly disagree with setting GOPROXY per package.
> >
> > I have been working with the tailscale devs to fix the dependency issues.
> > We fixed most of it.
> >
> > Additionally I'm waiting for the anchore/quill developers to get back to me
> > about removing some Git LFS files from their module that also break GOPROXY.
> >
> > The decision to use the proxy or not (run by Google) should be up to the
> > user. If I want to download everything from Git, that should be fully
> > possible. (And usually is with GOPROXY direct)

So this allows packages to choose the GOPROXY settings, previously
that was not possible as it was hard-coded as direct.

I don't really see much wrong with simply setting compatible GOPROXY
settings as required for a specific package. It's not really much different
from setting the download locations in general for packages IMO.

> >
> > I'll follow up with those GitHub issues today and see if I can get the ball
> > rolling on another tailscale release with fixed dependencies.
>
> Thanks a lot for your feedback. When seeing the new iteration from
> James, my initial reaction was "I need the feedback from Christian on
> this", and you had already given your feedback.
>
> That being said: should we still have a way to patch/fix those kind of
> issues on our side, until upstream fixes it? My understanding is that
> this could be fixed on our side if we were able to apply a patch before
> the Go vendoring happens. Is that correct?

Well I think since upstreams often expect the google goproxy then it's
easiest to just have us set that when required rather than try and patch
prior to vendoring.

>
> Thomas
> --
> Thomas Petazzoni, co-owner and CEO, Bootlin
> Embedded Linux and Kernel engineering and training
> https://bootlin.com
Christian Stewart Aug. 6, 2024, 1:07 a.m. UTC | #4
Hi James, Thomas,

On Mon, Aug 5, 2024 at 3:01 PM James Hilliard <james.hilliard1@gmail.com> wrote:
> On Mon, Aug 5, 2024 at 2:31 PM Thomas Petazzoni
> <thomas.petazzoni@bootlin.com> wrote:
> > On Mon, 5 Aug 2024 10:34:44 -0700
> > Christian Stewart via buildroot <buildroot@buildroot.org> wrote:
> >
> > > I still highly disagree with setting GOPROXY per package.
> > >
> > > I have been working with the tailscale devs to fix the dependency issues.
> > > We fixed most of it.
> > >
> > > Additionally I'm waiting for the anchore/quill developers to get back to me
> > > about removing some Git LFS files from their module that also break GOPROXY.
> > >
> > > The decision to use the proxy or not (run by Google) should be up to the
> > > user. If I want to download everything from Git, that should be fully
> > > possible. (And usually is with GOPROXY direct)
>
> So this allows packages to choose the GOPROXY settings, previously
> that was not possible as it was hard-coded as direct.
>
> I don't really see much wrong with simply setting compatible GOPROXY
> settings as required for a specific package. It's not really much different
> from setting the download locations in general for packages IMO.
>
> > >
> > > I'll follow up with those GitHub issues today and see if I can get the ball
> > > rolling on another tailscale release with fixed dependencies.
> >
> > Thanks a lot for your feedback. When seeing the new iteration from
> > James, my initial reaction was "I need the feedback from Christian on
> > this", and you had already given your feedback.
> >
> > That being said: should we still have a way to patch/fix those kind of
> > issues on our side, until upstream fixes it? My understanding is that
> > this could be fixed on our side if we were able to apply a patch before
> > the Go vendoring happens. Is that correct?
>
> Well I think since upstreams often expect the google goproxy then it's
> easiest to just have us set that when required rather than try and patch
> prior to vendoring.

My suggestion is to either set GOPROXY=direct for all packages, or for
none of them.

We could default to setting GOPROXY to empty, which uses the default
Google proxy.

The best approach is likely to add a Config.in option:

Something like: HOST_GO_GOPROXY

Then we can default that to: "https://proxy.golang.org,direct"

Which is the default value in the Go source code.

I think this would be the best approach if we do want to enable
GOPROXY. This way, users that don't want to use the proxy can override
HOST_GO_GOPROXY to "direct"

We could keep it as "direct" once this issue is closed:
https://github.com/tailscale/tailscale/issues/12859

Best regards,
Christian Stewart
Thomas Petazzoni Aug. 6, 2024, 2:50 p.m. UTC | #5
On Mon, 5 Aug 2024 18:07:37 -0700
Christian Stewart <christian@aperture.us> wrote:

> My suggestion is to either set GOPROXY=direct for all packages, or for
> none of them.
> 
> We could default to setting GOPROXY to empty, which uses the default
> Google proxy.
> 
> The best approach is likely to add a Config.in option:
> 
> Something like: HOST_GO_GOPROXY
> 
> Then we can default that to: "https://proxy.golang.org,direct"
> 
> Which is the default value in the Go source code.
> 
> I think this would be the best approach if we do want to enable
> GOPROXY. This way, users that don't want to use the proxy can override
> HOST_GO_GOPROXY to "direct"

But that is independent of the current problem we're trying to solve
with tailscale.

> We could keep it as "direct" once this issue is closed:
> https://github.com/tailscale/tailscale/issues/12859

Still, I think it's a problem if we can't fix problems on our side and
that we have to wait for upstream to make a new release. For all other
problems, we can apply a patch in Buildroot that gets us going until
upstream fixes the problem and makes a release. I would like to have
the same ability to fix Go vendoring issues.

Could you explain to a Go ignorant like me how is this proxy thing
helping in solving this issue? Should a proxy just "speed up" the
download? Why does it exhibit a different behavior between a
GOPROXY=direct build and a GOPROXY=somethingelse ?

Thanks!

Thomas
James Hilliard Aug. 6, 2024, 3:49 p.m. UTC | #6
On Tue, Aug 6, 2024 at 8:50 AM Thomas Petazzoni
<thomas.petazzoni@bootlin.com> wrote:
>
> On Mon, 5 Aug 2024 18:07:37 -0700
> Christian Stewart <christian@aperture.us> wrote:
>
> > My suggestion is to either set GOPROXY=direct for all packages, or for
> > none of them.
> >
> > We could default to setting GOPROXY to empty, which uses the default
> > Google proxy.

Why weren't we just using the default settings in the first place? I mean
if upstreams basically only test with default settings here then using
something different seems problematic in general.

> >
> > The best approach is likely to add a Config.in option:
> >
> > Something like: HOST_GO_GOPROXY
> >
> > Then we can default that to: "https://proxy.golang.org,direct"
> >
> > Which is the default value in the Go source code.
> >
> > I think this would be the best approach if we do want to enable
> > GOPROXY. This way, users that don't want to use the proxy can override
> > HOST_GO_GOPROXY to "direct"
>
> But that is independent of the current problem we're trying to solve
> with tailscale.
>
> > We could keep it as "direct" once this issue is closed:
> > https://github.com/tailscale/tailscale/issues/12859
>
> Still, I think it's a problem if we can't fix problems on our side and
> that we have to wait for upstream to make a new release. For all other
> problems, we can apply a patch in Buildroot that gets us going until
> upstream fixes the problem and makes a release. I would like to have
> the same ability to fix Go vendoring issues.

To me it sounds like just switching to the default GOPROXY settings
may be the best solution since that's typically what most projects use.

There's reports that GOPROXY=direct is kinda broken:
https://drewdevault.com/2021/08/06/goproxy-breaks-go.html

>
> Could you explain to a Go ignorant like me how is this proxy thing
> helping in solving this issue? Should a proxy just "speed up" the
> download? Why does it exhibit a different behavior between a
> GOPROXY=direct build and a GOPROXY=somethingelse ?

Presumably the proxy acts as a cache which prevents breakage if
the dependency direct source changes or something like that.

>
> Thanks!
>
> Thomas
> --
> Thomas Petazzoni, co-owner and CEO, Bootlin
> Embedded Linux and Kernel engineering and training
> https://bootlin.com
Christian Stewart Aug. 6, 2024, 4:13 p.m. UTC | #7
Thomas, James,

On Tue, Aug 6, 2024, 8:49 AM James Hilliard <james.hilliard1@gmail.com>
wrote:

> On Tue, Aug 6, 2024 at 8:50 AM Thomas Petazzoni
> <thomas.petazzoni@bootlin.com> wrote:
> >
> > On Mon, 5 Aug 2024 18:07:37 -0700
> > Christian Stewart <christian@aperture.us> wrote:
> >
> > > My suggestion is to either set GOPROXY=direct for all packages, or for
> > > none of them.
> > >
> > > We could default to setting GOPROXY to empty, which uses the default
> > > Google proxy.
>
> Why weren't we just using the default settings in the first place? I mean
> if upstreams basically only test with default settings here then using
> something different seems problematic in general.
>

> There's reports that GOPROXY=direct is kinda broken:
> https://drewdevault.com/2021/08/06/goproxy-breaks-go.html
>
> >
> > Could you explain to a Go ignorant like me how is this proxy thing
> > helping in solving this issue? Should a proxy just "speed up" the
> > download? Why does it exhibit a different behavior between a
> > GOPROXY=direct build and a GOPROXY=somethingelse ?
>
> Presumably the proxy acts as a cache which prevents breakage if
> the dependency direct source changes or something like that.
>

The reasoning for using GOPROXY=direct is as follows:

When we enable the proxy, the Go tool will download the sources for the
dependencies from Google.

This is OK in general, but in the context of buildroot we typically
download sources from their upstream releases or Git repositories.

GOPROXY is not broken.

That said, a lot of dependencies cause problems when they change what tag a
git release points to retroactively (rare but happens), or delete a older
commit from their repository (which happens extremely rarely).

The proxy smooths this over by always serving the same thing for a git tag
or commit hash.

The problem with this is, now we have something different being returned
for a dependency than what it says in go.mod.

The go.mod might say we are fetching foo at version 1.0.0, but in fact, if
you go to the repository for foo, and go to version 1.0.0, it could be a
completely different source tree from the one we get in the proxied
download.

This opens up the opportunity for bad actors to hide code in the proxied
version and then retroactively change the git tag to hide that bad code.

For this reason the default was set to always use Git to fetch the
dependencies for Buildroot as we would prefer to detect the mismatch and
deal with it up front, possibly falling back to fetching the .tar.gz from
our own mirror, rather than depend on this behavior from the Google proxy
to always be there and save us.

As you can see we have never had an issue with this setting until now, when
we have an actual broken dependency somewhere, and its when we are adding
the package, not retroactively, that the issue is found and solved.

This is my pitch for keeping GOPROXY=direct. If we want to use the Google
proxy with all the cavets I mentioned above, we could add the option
HOST_GO_GOPROXY and default it to the Google proxy. But personally I will
still always override this value to direct in my projects.

For now we can just include the latest commit hash of tailscale until they
make a new release and nothing else needs to be changed.

Thanks,
Christian Stewart
James Hilliard Aug. 6, 2024, 4:43 p.m. UTC | #8
On Tue, Aug 6, 2024 at 10:14 AM Christian Stewart <christian@aperture.us> wrote:
>
> Thomas, James,
>
> On Tue, Aug 6, 2024, 8:49 AM James Hilliard <james.hilliard1@gmail.com> wrote:
>>
>> On Tue, Aug 6, 2024 at 8:50 AM Thomas Petazzoni
>> <thomas.petazzoni@bootlin.com> wrote:
>> >
>> > On Mon, 5 Aug 2024 18:07:37 -0700
>> > Christian Stewart <christian@aperture.us> wrote:
>> >
>> > > My suggestion is to either set GOPROXY=direct for all packages, or for
>> > > none of them.
>> > >
>> > > We could default to setting GOPROXY to empty, which uses the default
>> > > Google proxy.
>>
>> Why weren't we just using the default settings in the first place? I mean
>> if upstreams basically only test with default settings here then using
>> something different seems problematic in general.
>>
>>
>> There's reports that GOPROXY=direct is kinda broken:
>> https://drewdevault.com/2021/08/06/goproxy-breaks-go.html
>>
>> >
>> > Could you explain to a Go ignorant like me how is this proxy thing
>> > helping in solving this issue? Should a proxy just "speed up" the
>> > download? Why does it exhibit a different behavior between a
>> > GOPROXY=direct build and a GOPROXY=somethingelse ?
>>
>> Presumably the proxy acts as a cache which prevents breakage if
>> the dependency direct source changes or something like that.
>
>
> The reasoning for using GOPROXY=direct is as follows:
>
> When we enable the proxy, the Go tool will download the sources for the dependencies from Google.
>
> This is OK in general, but in the context of buildroot we typically download sources from their upstream releases or Git repositories.

For cargo which I think is the most similar to how we handle dependencies
for golang we vendor dependencies from https://crates.io/ downloads AFAIU.

So I don't think it's normal for us to download vendored sources from their
upstream locations directly.

>
> GOPROXY is not broken.
>
> That said, a lot of dependencies cause problems when they change what tag a git release points to retroactively (rare but happens), or delete a older commit from their repository (which happens extremely rarely).
>
> The proxy smooths this over by always serving the same thing for a git tag or commit hash.

So upstreams seem to expect this caching behavior.

>
> The problem with this is, now we have something different being returned for a dependency than what it says in go.mod.
>
> The go.mod might say we are fetching foo at version 1.0.0, but in fact, if you go to the repository for foo, and go to version 1.0.0, it could be a completely different source tree from the one we get in the proxied download.
>
> This opens up the opportunity for bad actors to hide code in the proxied version and then retroactively change the git tag to hide that bad code.

I mean, the dependencies are all locked to the upstream hashes either way.

>
> For this reason the default was set to always use Git to fetch the dependencies for Buildroot as we would prefer to detect the mismatch and deal with it up front, possibly falling back to fetching the .tar.gz from our own mirror, rather than depend on this behavior from the Google proxy to always be there and save us.

I don't really see anything wrong with using google if that's what all
upstreams seem to expect.

>
> As you can see we have never had an issue with this setting until now, when we have an actual broken dependency somewhere, and its when we are adding the package, not retroactively, that the issue is found and solved.

We also don't seem to have a lot of golang packages.

>
> This is my pitch for keeping GOPROXY=direct. If we want to use the Google proxy with all the cavets I mentioned above, we could add the option HOST_GO_GOPROXY and default it to the Google proxy. But personally I will still always override this value to direct in my projects.

I'm not seeing much difference when the upstream project is using hash
based dependency lock files however, like as long as the hashes match
we know we have the sources specified by the upstream. If the upstream
itself can't be trusted then that's a whole different issue IMO and the
GOPROXY settings won't make much of a difference there.

>
> For now we can just include the latest commit hash of tailscale until they make a new release and nothing else needs to be changed.

IMO switching to the default GOPROXY settings is probably best.

I'm somewhat wary of using a commit hash for something like tailscale
which interfaces with an external service.

>
> Thanks,
> Christian Stewart
Thomas Petazzoni Aug. 6, 2024, 4:57 p.m. UTC | #9
Hello,

On Tue, 6 Aug 2024 09:13:50 -0700
Christian Stewart <christian@aperture.us> wrote:


> The reasoning for using GOPROXY=direct is as follows:
> 
> When we enable the proxy, the Go tool will download the sources for the
> dependencies from Google.
> 
> This is OK in general, but in the context of buildroot we typically
> download sources from their upstream releases or Git repositories.
> 
> GOPROXY is not broken.
> 
> That said, a lot of dependencies cause problems when they change what tag a
> git release points to retroactively (rare but happens), or delete a older
> commit from their repository (which happens extremely rarely).

Crap, those projects do that? They change a tag, or remove an older
commit from their repo? Yikes for reproducibility :-/

> The proxy smooths this over by always serving the same thing for a git tag
> or commit hash.
> 
> The problem with this is, now we have something different being returned
> for a dependency than what it says in go.mod.
> 
> The go.mod might say we are fetching foo at version 1.0.0, but in fact, if
> you go to the repository for foo, and go to version 1.0.0, it could be a
> completely different source tree from the one we get in the proxied
> download.

And that obviously sucks, and using a proxy doesn't seem like a great
workaround.

> This opens up the opportunity for bad actors to hide code in the proxied
> version and then retroactively change the git tag to hide that bad code.
> 
> For this reason the default was set to always use Git to fetch the
> dependencies for Buildroot as we would prefer to detect the mismatch and
> deal with it up front, possibly falling back to fetching the .tar.gz from
> our own mirror, rather than depend on this behavior from the Google proxy
> to always be there and save us.
> 
> As you can see we have never had an issue with this setting until now, when
> we have an actual broken dependency somewhere, and its when we are adding
> the package, not retroactively, that the issue is found and solved.

I'm not sure to follow you on this. Why would this only happen when we
add the package? If we use GOPROXY=direct, and 3 months after we've
added the package some upstream dependency decides to change its tag or
remove some commits, then it will start failing (of course,
sources.buildroot.net will have a backup, but sources.buildroot.net is
not meant to really be used in "normal" circumstances).

> This is my pitch for keeping GOPROXY=direct. If we want to use the Google
> proxy with all the cavets I mentioned above, we could add the option
> HOST_GO_GOPROXY and default it to the Google proxy. But personally I will
> still always override this value to direct in my projects.
> 
> For now we can just include the latest commit hash of tailscale until they
> make a new release and nothing else needs to be changed.

Don't understand this last part. What is your proposal to fix the
tailscale situation?

Thomas
Christian Stewart Aug. 6, 2024, 6:53 p.m. UTC | #10
Thomas,

On Tue, Aug 6, 2024, 9:57 AM Thomas Petazzoni <thomas.petazzoni@bootlin.com>
wrote:

> > For now we can just include the latest commit hash of tailscale until
> they
> > make a new release and nothing else needs to be changed.
>
> Don't understand this last part. What is your proposal to fix the
> tailscale situation?
>

They fixed the issue upstream and now the main branch of tailscale works
fine with GOPROXY=direct. See:

https://github.com/tailscale/tailscale/issues/12859#issuecomment-2239157563

It is fine to use the main branch of tailscale here even if it's an
external service as tailscale keeps compatibility between releases - I even
have many machines running quite out of date tailscale clients which still
connect and run perfectly. The same can be said for forwards compatibility.

All that I ask with regards to GOPROXY is that:

- we do not override it per package,
- if we do decide to change the default to use Google, please make it a
configurable option (like BR2_MIRROR) so that I (and other users) can
override it back to be direct.

Best regards,
Christian Stewart
diff mbox series

Patch

diff --git a/package/pkg-golang.mk b/package/pkg-golang.mk
index 8e27602d41..118908a159 100644
--- a/package/pkg-golang.mk
+++ b/package/pkg-golang.mk
@@ -85,10 +85,12 @@  define $(2)_GEN_GOMOD
 endef
 $(2)_POST_PATCH_HOOKS += $(2)_GEN_GOMOD
 
+$(2)_DL_GOPROXY ?= direct
+
 $(2)_DOWNLOAD_POST_PROCESS = go
 $(2)_DL_ENV += \
 	$$(HOST_GO_COMMON_ENV) \
-	GOPROXY=direct \
+	GOPROXY=$$($(2)_DL_GOPROXY) \
 	$$($(2)_GO_ENV)
 
 # Because we append vendored info, we can't rely on the values being empty