We went to great lengths to ensure garble builds are reproducible.
This includes how the tool itself works,
as its behavior should be the same given the same inputs.
However, we made one crucial mistake with the runtime package.
It has go:linkname directives pointing at other packages,
and some of those pointed packages aren't its dependencies.
Imagine two scenarios where garble builds the runtime package:
1) We run "garble build runtime". The way we handle linkname directives
calls listPackage on the target package, to obfuscate the target's
import path and object name. However, since we only obtained build
info of runtime and its deps, calls for some linknames such as
listPackage("sync/atomic") will fail. The linkname directive will
leave its target untouched.
2) We run "garble build std". Unlike the first scenario, all listPackage
calls issued by runtime's linkname directives will succeed, so its
linkname directive targets will be obfuscated.
At best, this can result in inconsistent builds, depending on how the
runtime package was built. At worst, the mismatching object names can
result in errors at link time, if the target packages are actually used.
The modified test reproduces the worst case scenario reliably,
when the fix is reverted:
> env GOCACHE=${WORK}/gocache-empty
> garble build -a runtime
> garble build -o=out_rebuild ./stdimporter
[stderr]
# test/main/stdimporter
JZzQivnl.NtQJu0H3: relocation target JZzQivnl.iioHinYT not defined
JZzQivnl.NtQJu0H3.func9: relocation target JZzQivnl.yz5z0NaH not defined
JZzQivnl.(*ypvqhKiQ).String: relocation target JZzQivnl.eVciBQeI not defined
JZzQivnl.(*ypvqhKiQ).PkgPath: relocation target JZzQivnl.eVciBQeI not defined
[...]
The fix consists of two steps. First, if we're building the runtime and
listPackage fails on a package, that means we ran into scenario 1 above.
To avoid the inconsistency, we fill ListedPackages with "go list [...] std".
This means we'll always build runtime as described in scenario 2 above.
Second, when building packages other than the runtime,
we only allow listPackage to succeed if we're listing a dependency of
the current package.
This ensures we won't run into similar reproducibility bugs in the future.
Finally, re-enable test-gotip on CI since this was the last test flake.
While investigating a bug report,
I noticed that garble was writing to the same temp file twice.
At best, writing to the same path on disk twice is wasteful,
as the design is careful to be deterministic and use unique paths.
At worst, the two writes could cause races at the filesystem level.
To prevent either of those situations,
we now create files with os.OpenFile and os.O_EXCL,
meaning that we will error if the file already exists.
That change uncovered a number of such unintended cases.
First, transformAsm would write obfuscated Go files twice.
This is because the Go toolchain actually runs:
[...]/asm -gensymabis [...] foo.s bar.s
[...]/asm [...] foo.s bar.s
That is, the first run is only meant to generate symbol ABIs,
which are then used by the compiler.
We need to obfuscate at that first stage,
because the symbol ABI descriptions need to use obfuscated names.
However, having already obfuscated the assembly on the first stage,
there is no need to do so again on the second stage.
If we detect gensymabis is missing, we simply reuse the previous files.
This first situation doesn't seem racy,
but obfuscating the Go assembly files twice is certainly unnecessary.
Second, saveKnownReflectAPIs wrote a gob file to the build cache.
Since the build cache can be kept between builds,
and since the build cache uses reproducible paths for each build,
running the same "garble build" twice could overwrite those files.
This could actually cause races at the filesystem level;
if two concurrent builds write to the same gob file on disk,
one of them could end up using a partially-written file.
Note that this is the only of the three cases not using temporary files.
As such, it is expected that the file may already exist.
In such a case, we simply avoid overwriting it rather than failing.
Third, when "garble build -a" was used,
and when we needed an export file not listed in importcfg,
we would end up calling roughly:
go list -export -toolexec=garble -a <dependency>
This meant we would re-build and re-obfuscate those packages.
Which is unfortunate, because the parent process already did via:
go build -toolexec=garble -a <main>
The repeated dependency builds tripped the new os.O_EXCL check,
as we would try to overwrite the same obfuscated Go files.
Beyond being wasteful, this could again cause subtle filesystem races.
To fix the problem, avoid passing flags like "-a" to nested go commands.
Overall, we should likely be using safer ways to write to disk,
be it via either atomic writes or locked files.
However, for now, catching duplicate writes is a big step.
I have left a self-assigned TODO for further improvements.
CI on the pull request found a failure on test-gotip.
The failure reproduces on master, so it seems to be related to gotip,
and not a regression introduced by this change.
For now, disable test-gotip until we can investigate.
That is, use a very specific build tag and git commit,
and ensure that neither ends up in the binary.
Luckily, we have nothing to do here.
We were already removing _gomod_.go from the build entirely,
and that is still the mechanism that "go build" uses to bundle the data.
Note that the test will still work if git is not installed,
but it will simply not check the VCS side.
Finally, we use "go version -m" to check the existing fields,
which is easier than calling the Go APIs directly.
It seems like "go test" passes on yesterday's Go master, now.
So, enable test-gotip again with that commit hash.
Fixes#385.
Spotted by our friend "go test -race":
WARNING: DATA RACE
Write at 0x0000010522d8 by goroutine 69:
mvdan.cc/garble.readFile()
garble/main_test.go:124 +0x23a
mvdan.cc/garble.binsubstr()
garble/main_test.go:141 +0xc4
github.com/rogpeppe/go-internal/testscript.(*TestScript).run()
github.com/rogpeppe/go-internal@v1.8.1-0.20211023094830-115ce09fd6b4/testscript/testscript.go:496 +0x9e8
[...]
Previous write at 0x0000010522d8 by goroutine 60:
mvdan.cc/garble.readFile()
garble/main_test.go:124 +0x23a
mvdan.cc/garble.binsubstr()
garble/main_test.go:141 +0xc4
github.com/rogpeppe/go-internal/testscript.(*TestScript).run()
github.com/rogpeppe/go-internal@v1.8.1-0.20211023094830-115ce09fd6b4/testscript/testscript.go:496 +0x9e8
[...]
This wasn't a data race that we spotted via failures in practice,
as it only affected test code since July.
The race is due to the fact that each test script runs as a parallel
sub-test within the same Go program, sharing all globals.
As such, a single "cached binary" global is read and written with races.
Moreover, note that the caching always missed.
I briefly rewrote the code to avoid the race via a sync.Map keyed by
absolute filenames, and while that removed the data race,
the caching never actually hit.
To have a cache hit, we need an absolute path to already be in the cache
and for it to not have been modified since it was last cached. That is:
modify-bin-1 foo
binsubstr foo 'abc' # miss
binsubstr foo 'def' # hit; use the cached "/tmp/[...]/foo" entry
modify-bin-2 foo
binsubstr foo 'abc' # miss
However, the test scripts don't do contiguous binsubstr calls like
these. Instead, they join repeated binsubstr calls:
modify-bin-1 foo
binsubstr foo 'abc' 'def' # miss
modify-bin-2 foo
binsubstr foo 'abc' # miss
For that reason, remove the extra code entirely.
I didn't notice any change to the performance of "go test -short"
with a warm build cache, with:
go test -c
./garble.test -test.short #warm cache
benchcmd -n 5 TestShort ./garble.test -test.short
name old time/op new time/op delta
TestShort 4.62s ±12% 4.35s ±12% ~ (p=0.310 n=5+5)
name old user-time/op new user-time/op delta
TestShort 16.8s ± 3% 16.7s ± 3% ~ (p=0.690 n=5+5)
name old sys-time/op new sys-time/op delta
TestShort 7.28s ± 1% 7.26s ± 2% ~ (p=0.841 n=5+5)
name old peak-RSS-bytes new peak-RSS-bytes delta
TestShort 305MB ± 0% 306MB ± 0% ~ (p=0.421 n=5+5)
Finally, start using "go test -race" on Linux on CI,
which should have made the PR back in July red before merging.
In Go 1.18 runtime/internal/sys is replaced with internal/goarch. This
change makes sure that when running garble after the runtime package
change, runtime stripping correctly removes the unused import.
Also bump go-internal version to work with go1.18, but disable go1.18 in
CI tests.
We can now use pruned module graphs in go.mod files,
and we no longer need to worry about runtime/internal/sys.
Note that I had to update testdata/mod slightly,
as the new pruned module graphs algorithm downloads an extra go.mod file.
This change also paves the way towards future Go 1.18 support.
Thanks to lu4p for cleaning up two TODOs as well.
Co-Authored-By: lu4p <lu4p@pm.me>
Also bump the gotip commit to the first after the merge window was open,
which simply bumps the internal version from 1.17 to 1.18.
We're leaving GO_COMMIT there for now,
as the 1.18 merge window is currently very noisy with generics changes.
We're just a couple of weeks away from the final release, so
double-check that we still work well with tip.
Update x/tools as well, as it has had some minor fixes for Go 1.17.
Back in early April we added initial support for Go 1.17,
working on a commit from master at that time. For that to work, we just
needed to add a couple of packages to runtimeRelated and tweak printFile
a bit to not break the new "//go:build" directives.
A significant amount of changes have landed since, though, and the tests
broke in multiple ways.
Most notably, the new register ABI is enabled by default for GOOS=amd64.
That affected garble indirectly in two ways: there's a new internal
package to add to runtimeRelated, and we must make reverse.txt more
clever in making its output constant across ABIs.
Another noticeable change is that Go 1.17 changes how its own version is
injected into the runtime package. It used to be via a constant in
runtime/internal/sys, such as:
const TheVersion = `devel ...`
Since we couldn't override such constants via the linker's -X flag,
we had to directly alter the declaration while compiling.
Thankfully, Go 1.17 simply uses a "var buildVersion string" in the
runtime package, and its value is injected by the linker.
This means we can now override it with the linker's -X flag.
We make the code to alter TheVersion for Go 1.16 a bit more clever,
to not break the package when building with Go 1.17.
Finally, our hack to work around ambiguous TOOLEXEC_IMPORTPATH values
now only kicks in for non-test packages, since Go 1.17 includes our
upstream fix. Otherwise, some tests would end up with the ".test"
variant suffix added a second time:
test/bar [test/bar.test] [test/bar [test/bar.test].test]
All the code to keep compatibility with Go 1.16.x remains in place.
We're still leaving TODOs to remind ourselves to remove it or simplify
it once we remove support for 1.16.x.
The 1.17 development freeze has already been in place for a month,
and beta1 is due to come this week, so it's unlikely that Go will change
in any considerable way at this point. Hence, we can say that support
for 1.17 is done.
Fixes#347.
Since it changes rapidly, especially during merge cycles, and we don't
want CI to surprisingly blow up in our faces from one day to another.
Pin this to a commit from yesterday which works, since some changes
merged today moved where the Go build version is recorded and broke
garble.
While at it, replace "git clone" with a wget of a source archive. This
is much, much faster, mainly because a tarball is significantly smaller.
We now download about 20MiB instead of over 350MiB.
One downside is that, without git, make.bash can't construct a devel
version on its own. For that reason, add a pretty basic manual version
via the VERSION file.
This means that we must not reject custom devel version strings. This is
a good thing anyway, because custom devel strings are already common
when building Go in custom ways. Those people tend to be advanced users,
such as CI, so fall back to assuming they know what they are doing and
don't error.
Plus, starting last week, devel versions in Go master now contain the
major Go version like in build tags, such as "go1.17-commit...", so we
will soon start relying on that instead of parsing dates:
$ go version
go version devel go1.17-a7e16abb22 Thu Apr 8 07:33:58 2021 +0000 linux/amd64
Just two minor tweaks were necessary to get "go test" to pass on:
go version devel go1.17-a25c584629 Tue Apr 6 04:48:09 2021 +0000 linux/amd64
Re-enable the CI for it, too. The config needed changing since the
set-env and add-path commands now use special files instead, due to some
security issues uncovered last winter.
It's possible that CI on master could suddenly break, if Go master
changes in some substantial way that requires more tweaks. If that turns
out to be an issue pretty often, we could always pin a specific git repo
commit and update it every few weeks.
This mainly cleans up the few bits of code where we explicitly kept
support for Go 1.15.x. With v0.1.0 released, we can drop support now,
since the next v0.2.0 release will only support Go 1.16.x.
Also updates all modules, including test ones, to 'go 1.16'.
Note that the TOOLEXEC_IMPORTPATH refactor is not done here, despite all
the TODOs about doing so when we drop 1.15 support. This is because that
refactor needs to be done carefully and might have side effects, so it's
best to keep it to a separate commit.
Finally, update the deps.
There are three minor bugs breaking Go 1.16 with the current version of
garble, after the import path obfuscation refactor:
1) Stripping the runtime results in an unused import error. This PR
fixes that.
2) The asm.txt test seems to be broken; something to do with the export
data not being right for the exported assembly func.
3) The obfuscated build of std fails, since our runtimeRelated table was
generated for Go 1.15, not 1.16.
This PR fixes the first issue, adds conditional skip lines for 1.16 for
the other two issues, and enables 1.16 on CI.
Note that 1.16 support is not here just yet, because of the other two
issues. As such, no doc changes.
Updates #124.
Since it's been failing for weeks, it's practically useless for now.
Even with continue-on-error, the failures still look scary at first
glance.
We can re-enable this job once we fix master.
Many files were missing copyright, so also add a short script to add the
missing lines with the current year, and run it.
The AUTHORS file is also self-explanatory. Contributors can add
themselves there, or we can simply update it from time to time via
git-shortlog.
Since we have two scripts now, set up a directory for them.
Most notably, x/mod now includes the GOPRIVATE pattern-matching API we
were copying before, so we can use it directly.
Also bump the Go version requirement to 1.15, in preparation for the
import path obfuscation PR, and don't let the gotip job fail the entire
workflow.
First, our original append line was completely ineffective; we never
used that "flags" slice again. Second, we only attempted to use the flag
when we obfuscated a package.
In fact, we never care about debugging information here, so for any
package we compile, we can add "-dwarf=false". At the moment, we compile
all packages, even if they aren't to be obfuscated, due to the lack of
access to the build cache.
As such, we save a significant amount of work. The numbers below were
obtained on a quiet machine with "go test -bench=. -benchtime=10x", six
times before and after the change.
name old time/op new time/op delta
Build-8 2.06s ± 4% 1.87s ± 2% -9.21% (p=0.002 n=6+6)
name old sys-time/op new sys-time/op delta
Build-8 1.51s ± 2% 1.46s ± 1% -3.12% (p=0.004 n=6+5)
name old user-time/op new user-time/op delta
Build-8 11.9s ± 2% 10.8s ± 1% -8.71% (p=0.002 n=6+6)
While at it, only do CI builds on pushes and PRs to the master branch,
so that my PRs created from the same repo don't trigger duplicate
builds.
Since the new linker was failing on our crypto/aes shenanigans until the
recent commit to remove it for literal obfuscation.
Building Go does take about two minutes on the CI machine, but that's
fast enough. One can see the exact version that was used via the 'go
version' line.
It fails on all three GitHub platforms, for some reason. Something
related to permission denied errors in the module cache.
It's unclear to me why we're trying to modify the build cache. For now,
un-break master.