Panicking in small helpers or in funcs that don't return error
has proved useful to keep code easier to maintain,
particularly for cases that should typically never happen.
However, in these cases we can error just as easily.
In particular, I was getting a panic whenever I forgot
that I was running garble with Go master (1.23), which is over the top.
When updating Garble to support Go 1.22.0, CI on MacOS spotted
that the syscall package was failing to build given that it uses
assembly code which is only allowed in some std packages.
That allowlist is based on import paths, and we were obfuscating
the syscall package's import path, so that was breaking GOOS=darwin.
As a fix, I added syscall to runtimeAndDeps to not obfuscate it.
That wasn't a great fix; it's not part of runtime and its dependencies,
and there's no reason we should avoid obfuscating the package contents.
Not obfuscating the contents in fact broke x/sys/unix,
as it contains a copy of syscall.Rlimit which it type converted with.
Undo that fix and reinstate the gogarble.txtar syscall test.
Implement the fix where we only leave syscall's import path alone.
Add a regression test, and add a note about adding x/net and x/sys
to check-third-party.sh so that we can catch these bugs earlier.
Fixes#830.
It seems like building with Go 1.22.0 for GOOS=darwin started
running into some issues with the syscall package's use of ABIInternal
in assembly source code:
> exec garble build
[stderr]
# syscall
[...].s:16: ABI selector only permitted when compiling runtime, reference was to "runtime.entersyscall"
The error can be reproduced from another platform like GOOS=linux
as long as we have any test that cross-compiles std to GOOS=darwin.
We had crossbuild.txtar which only ensured we covered GOOS=windows
and GOOS=linux, so add a third case to ensure MacOS is covered too.
This will slow down the tests a bit, but is important for the sake
of ensuring that we catch these bugs early, even without MacOS on CI.
In fact, we hadn't caught this earlier for Go 1.22 precisely because
on CI we only tested on Go tip with GOOS=linux, for the sake of speed.
Adding the rest of the package import paths from objabi.allowAsmABIPkgs
to our runtimeAndDeps generated map solves this error.
Go 1.21.0 was released in August 2023, so our upcoming release
will no longer support the Go 1.20 release series.
The first Go 1.22 release candidate is also due in December 2023,
less than a month from now, so dropping 1.20 will simplify 1.22 work.
Two new packages linknamed with the runtime package,
one new intrinsic function, and one that is being removed in Go 1.22
but we want to keep around as long as we support Go 1.21.
Also note that, since math/rand/v2 simply does not exist until Go 1.22,
we need to adjust appendListedPackages to not fail on older versions.
First, teach scripts/gen-go-std-tables.sh to omit test packages,
since runtime/metrics_test would always result in an error.
Instead, make transformLinkname explicitly skip that package,
leaving a comment about a potential improvement if needed.
Second, the only remaining "not found" error we had was "maps" on 1.20,
so rewrite that check based on ImportPath and GoVersionSemver.
Third, detect packages with the "exclude all Go files" error
by looking at CompiledGoFiles and IgnoredGoFiles, which is less brittle.
This means that we are no longer doing any filtering on pkg.Error.Err,
which means we are less likely to break with Go error message changes.
Fourth, the check on pkg.Incomplete is now obsolete given the above,
meaning that the CompiledGoFiles length check is plenty.
Finally, stop trying to be clever about how we print errors.
Now that we're no longer skipping packages based on pkg.Error values,
printing pkg.DepsErrors was causing duplicate messages in the output.
Simply print pkg.Error values with only minimal tweaks:
including the position if there is any, and avoiding double newlines.
Overall, this makes our logic a lot less complicated,
and garble still works the way we want it to.
computeLinkerVariableStrinsg had an unusedargument.
Only skip obfuscating the name "FS" in the "embed" package.
The reflect methods no longer use the transformer receiver type,
so that TODO now feels unnecessary. Many methods need to be aware
of what the current types.Package is, and that seems reasonable.
We no longer use writeFileExclusive for our own cache on disk,
so the TODO about using locking or atomic writes is no longer relevant.
This means we now have a unified cache directory for garble,
which is now documented in the README.
I considered using the same hash-based cache used for pkgCache,
but decided against it since that cache implementation only stores
regular files without any executable bits set.
We could force the cache package to do what we want here,
but I'm leaning against it for now given that single files work OK.
It is true that each garble process only obfuscates up to one package,
which is why we made them globals to begin with.
However, garble does quite a lot more now,
such as reversing the obfuscation of many packages at once.
Having a global "current package" variable makes mistakes easier.
Some funcs, like those in transformFuncs, are now transformer methods.
Packages like os and sync have started using go:linknames pointing to
packages outside their dependency tree, much like runtime already did.
This started causing warnings to be printed while obfuscsating std:
> exec garble build -o=out_rebuild ./stdimporter
[stderr]
# sync
//go:linkname refers to syscall.hasWaitingReaders - add `import _ "syscall"` for garble to find the package
# os
//go:linkname refers to net.newUnixFile - add `import _ "net"` for garble to find the package
> bincmp out_rebuild out
PASS
Relax the restriction in listPackage so that any package in std
is now allowed to list packages in runtimeLinknamed,
which makes the warnings and any potential problems go away.
Also make these std test cases check that no warnings are printed,
since I only happened to notice this problem by chance.
This is in preparation for the switch to Go's cache package,
whose ActionID type is also a full sha256 hash with 32 bytes.
We were using "short" hashes as shown by `go tool buildid`,
since that was consistent and 15 bytes was generally enough.
First, rename "component" to "hash", since it's shorter and more useful.
A full build ID is two or four hashes joined with slashes.
Second, add sanity checks that buildIDHashLength is being followed.
Otherwise the use of []byte could lead to human error.
Third, move all the hash encoding and decoding logic together.
A couple of new packages in runtimeAndDeps,
and go list's Package.DepsErrors may now include package build errors
which we want to ignore as we would print those as duplicates.
printOneCgoTraceback now returns a boolean rather than an int.
Since we need to have different logic based on the Go version,
and toolchainVersionSemver was only set for the main process,
move the string to the shared cache global.
This is a nice thing to do anyway, to reduce the number of globals.
While here, update actions/setup-go to v4, which starts caching
GOMODCACHE and GOCACHE by default now.
Disable it, because it still doesn't help in our case,
and GitHub's Actions caching is still really inefficient.
And update staticcheck too.
I mistakenly understood that, when the DepsErrors field has errors,
the Error field would contain an error as well.
That is not always the case; for example,
the imports_missing package in the added test script
had DepsErrors set but Error empty, causing a nil dereference panic.
Make the code more robust, and report both kinds of load errors.
Fixes#694.
We're building the linker binary for the host GOOS,
not the target GOOS that we happen to be building for.
I noticed that, after running `go test`, my garble cache
would contain both link and link.exe, which made no sense
as I run linux and not windows.
`go env` has GOHOSTOS to mirror GOOS, but there is no
GOHOSTEXE to mirror GOEXE, so we reconstruct it from
runtime.GOOS, which is equivalent to GOHOSTOS.
Add a regression test as well.
When we use `go list` on the standard library, we need to be careful
about what flags are passed from the top-level build command,
because some flags are not going to be appropriate.
In particular, GOFLAGS=-modfile=... resulted in a failure,
reproduced via the GOFLAGS variable added to linker.txtar:
go: inconsistent vendoring in /home/mvdan/tip/src:
golang.org/x/crypto@v0.5.1-0.20230203195927-310bfa40f1e4: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod
golang.org/x/net@v0.7.0: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod
golang.org/x/sys@v0.5.1-0.20230208141308-4fee21c92339: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod
golang.org/x/text@v0.7.1-0.20230207171107-30dadde3188b: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod
To ignore the vendor directory, use -mod=readonly or -mod=mod.
To sync the vendor directory, run:
go mod vendor
To work around this problem, reset the -mod and -modfile flags when
calling "go list" on the standard library, as those are the only two
flags which alter how we load the main module in a build.
The code which builds a modified cmd/link has a similar problem;
it already reset GOOS and GOARCH, but it could similarly run into
problems if other env vars like GOFLAGS were set.
To be on the safe side, we also disable GOENV and GOEXPERIMENT,
which we borrow from Go's bootstrapping commands.
We obfuscate import paths as well as their declared names.
The compiler treats some packages and APIs in special ways,
and the way it detects those is by looking at import paths and names.
In the past, we have avoided obfuscating some names like embed.FS or
reflect.Value.MethodByName for this reason. Otherwise,
go:embed or the linker's deadcode elimination might be broken.
This matching by path and name also happens with compiler intrinsics.
Intrinsics allow the compiler to rewrite some standard library calls
with small and efficient assembly, depending on the target GOARCH.
For example, math/bits.TrailingZeros32 gets replaced with ssa.OpCtz32,
which on amd64 may result in using the TZCNTL instruction.
We never noticed that we were breaking many of these intrinsics.
The intrinsics for funcs declared in the runtime and its dependencies
still worked properly, as we do not obfuscate those packages yet.
However, for other packages like math/bits and sync/atomic,
the intrinsics were being entirely disabled due to obfuscated names.
Skipping intrinsics is particularly bad for performance,
and it also leads to slightly larger binaries:
│ old │ new │
│ bin-B │ bin-B vs base │
Build-16 5.450Mi ± ∞ ¹ 5.333Mi ± ∞ ¹ -2.15% (p=0.029 n=4)
Finally, the main reason we noticed that intrinsics were broken
is that apparently GOARCH=mips fails to link without them,
as some symbols end up being not defined at all.
This patch fixes builds for the MIPS family of architectures.
Rather than building and linking all of std for every GOARCH,
test that intrinsics work by asking the compiler to print which
intrinsics are being applied, and checking that math/bits gets them.
This fix is relatively unfortunate, as it means we stop obfuscating
about 120 function names and a handful of package paths.
However, fixing builds and intrinsics is much more important.
We can figure out better ways to deal with intrinsics in the future.
Fixes#646.
The added test case would panic, because we would try to hash a name
with a broken package's GarbleActionID, which was empty.
We skipped over all package errors in appendListedPackages because two
kinds of errors were OK in the standard library.
However, this also meant we ignored real errors we should stop at,
because obfuscating those user packages is pointless.
Add more assertions, check for the OK errors explicitly,
and fail on any other error immediately.
Note that, in the process, I also found a bug in cmd/go.
Uncovered by github.com/bytedance/sonic,
whose internal/loader package fails to build on Go 1.20.
This value is hard-coded in the linker and written in a header.
We could rewrite the final binary, like we used to do with import paths,
but that would require once again maintaining libraries to do so.
Instead, we're now modifying the linker to do what we want.
It's not particularly hard, as every Go install has its source code,
and rebuilding a slightly modified linker only takes a few seconds at most.
Thanks to `go build -overlay`, we only need to copy the files we modify,
and right now we're just modifying one file in the toolchain.
We use a git patch, as the change is fairly static and small,
and the patch is easier to understand and maintain.
The other side of this change is in the runtime,
as it also hard-codes the magic value when loading information.
We modify the code via syntax trees in that case, like `-tiny` does,
because the change is tiny (one literal) and the affected lines of code
are modified regularly between major Go releases.
Since rebuilding a slightly modified linker can take a few seconds,
and Go's build cache does not cache linked binaries,
we keep our own cached version of the rebuilt binary in `os.UserCacheDir`.
The feature isn't perfect, and will be improved in the future.
See the TODOs about the added dependency on `git`,
or how we are currently only able to cache one linker binary at once.
Fixes#622.
We were obfuscating reflect's package path and its declared names,
but the toolchain wants to detect the presence of method reflection
to turn down the aggressiveness of dead code elimination.
Given that the obfuscation broke the detection,
we could easily end up in crashes when making reflect calls:
fatal error: unreachable method called. linker bug?
goroutine 1 [running]:
runtime.throw({0x50c9b3?, 0x2?})
runtime/panic.go:1047 +0x5d fp=0xc000063660 sp=0xc000063630 pc=0x43245d
runtime.unreachableMethod()
runtime/iface.go:532 +0x25 fp=0xc000063680 sp=0xc000063660 pc=0x40a845
runtime.call16(0xc00010a360, 0xc00000e0a8, 0x0, 0x0, 0x0, 0x8, 0xc000063bb0)
runtime/wcS9OpRFL:728 +0x49 fp=0xc0000636a0 sp=0xc000063680 pc=0x45eae9
runtime.reflectcall(0xc00001c120?, 0x1?, 0x1?, 0x18110?, 0xc0?, 0x1?, 0x1?)
<autogenerated>:1 +0x3c fp=0xc0000636e0 sp=0xc0000636a0 pc=0x462e9c
Avoid obfuscating the three names which cause problems: "reflect",
"Method", and "MethodByName".
While here, we also teach obfuscatedImportPath to skip "runtime",
as I also saw that the toolchain detects it for many reasons.
That wasn't a problem yet, as we do not obfuscate the runtime,
but it was likely going to become a problem in the future.
We can drop the code that kicked in when GOGARBLE was empty.
We can also add the value in addGarbleToHash unconditionally,
as we never allow it to be empty.
In the tests, remove all GOGARBLE lines where it just meant "obfuscate
everything" or "obfuscate the entire main module".
cgo.txtar had "obfuscate everything" as a separate step,
so remove it entirely.
linkname.txtar started failing because the imported package did not
import strings, so listPackage errored out. This wasn't a problem when
strings itself wasn't obfuscated, as transformLinkname silently left
strings.IndexByte untouched. It is a problem when IndexByte does get
obfuscated. Make that kind of listPackage error visible, and fix it.
reflect.txtar started failing with "unreachable method" runtime throws.
It's not clear to me why; it appears that GOGARBLE=* makes the linker
think that ExportedMethodName is suddenly unreachable.
Work around the problem by making the method explicitly reachable,
and leave a TODO as a reminder to investigate.
Finally, gogarble.txtar no longer needs to test for GOPRIVATE.
The rest of the test is left the same, as we still want the various
values for GOGARBLE to continue to work just like before.
Fixes#594.
Some big changes landed in Go for the upcoming 1.20.
While here, remove the use of GOGC=off with make.bash,
as https://go.dev/cl/436235 makes that unnecessary now.
We were still leaking the filenames for assembly files.
In our existing asm.txtar test's output binary,
the string `test/main/garble_main_amd64.s` was present.
This leaked full import paths on one hand,
and the filenames of each assembly file on the other.
We avoid this in Go files by using `/*line` directives,
but those are not supported in assembly files.
Instead, obfuscate the paths in the temporary directory.
Note that we still need a separate temporary directory per package,
because otherwise any included header files might collide.
We must remove the `main` package panic in obfuscatedImportPath,
as we now need to use that function for all packages.
While here, remove the outdated comment about `-trimpath`.
Fixes#605.
One more package that further unblocks obfuscating the runtime.
The issue was the TODO we already had about go:linkname directives with
just one argument, which are used in the syscall package.
While here, factor out the obfuscation of linkname directives into
transformLinkname, as it was starting to get a bit complex.
We now support debug logging as well, while still being able to use
"early returns" for some cases where we bail out.
We also need listPackage to treat all runtime sub-packages like it does
runtime itself, as `runtime/internal/syscall` linknames into `syscall`
without it being a dependency as well.
Finally, add a regression test that, without the fix,
properly spots that the syscall package was not obfuscated:
FAIL: testdata/script/gogarble.txtar:41: unexpected match for ["syscall.RawSyscall6"] in out
Updates #193.
This failed at link time because transformAsm did not know how to handle
the fact that the runtime package's assembly code implements the
`time.now` function via:
TEXT time·now<ABIInternal>(SB),NOSPLIT,$16-24
First, we need transformAsm to happen for all packages, not just the
ones that we are obfuscating. This is because the runtime can implement
APIs in other packages which are themselves obfuscated, whereas runtime
may not itself be getting obfuscated. This is currently the case with
`GOGARBLE=*` as we do not yet support obfuscating the runtime.
Second, we need to teach replaceAsmNames to handle qualified names with
import paths. Not just to look up the right package information for the
name, but also to obfuscate the package path if necessary.
Third, we need to relax the Deps requirement on listPackage, since the
runtime package and its dependencies are always implicit dependencies.
This is a big step towards being able to obfuscate the runtime, as there
is now just one package left that we cannot obfuscate outside the runtime.
Updates #193.
The generics issue has been fixed for the upcoming Go 1.20.
Include that version as a reminder for when we can drop Go 1.19.
The fs.SkipAll proposal is also implemented for Go 1.20.
The BinaryContentID comment was a little bit trickier.
We did get stamped VCS information some time ago,
but it only provides us with the current commit info and a dirty bit.
That is not enough for our use of the build cache,
because we want any uncommitted changes to garble to cause rebuilds.
I don't think we'll get any better than using garble's own build ID.
Reword the quasi-TODO to instead explain what we're doing and why.
See https://golang.org/issue/28749. The improved asm test would fail:
go parse: $WORK/imported/imported_amd64.s:1:1: expected 'package', found TEXT (and 2 more errors)
because we would incorrectly parse a non-Go file as a Go file.
Add a workaround. The original reporter's reproducer with go-ethereum
works now, as this was the last hiccup.
Fixes#555.
The reverse feature relied on `GoFiles` from `go list`,
but that list may not be enough to typecheck a package:
typecheck error: $WORK/main.go:3:15: undeclared name: longMain
`go help list` shows:
GoFiles []string // .go source files (excluding CgoFiles, TestGoFiles, XTestGoFiles)
CgoFiles []string // .go source files that import "C"
CompiledGoFiles []string // .go files presented to compiler (when using -compiled)
In other words, to mimic the same list of Go files fed to the compiler,
we want CompiledGoFiles.
Note that, since the cgo files show up as generated files,
we currently do not support reversing their filenames.
That is left as a TODO for now.
Updates #555.
This way, the child process knows that it's running a toolchain command
via -toolexec without having to guess via filepath.IsAbs.
While here, improve the docs and tests a bit.
There used to be a reason to keep these maps separate, but ever since we
became better at obfuscating the standard library, that has gone away.
It's still a good idea to keep `go list -deps runtime` as a group,
but we can do that via a comment inside a joint map literal.
I also noticed that one comment still referred to cannotObfuscateNames,
which hasn't existed for some time. Fix that up.
It's also not documented how cachedOutput contains info for all deps,
so clarify that while we're improving the docs.
Finally, the reason we cannot obfuscate the syscall package was out of
date; it's not part of the runtime. It is a go:linkname bug.
A chunk from crypto/internal/boring has been split away as a separate
package very recently, shortly before 1.19rc1 is due for release.
See https://go.dev/cl/407135 for more information.
Makes garble work on the latest Go tip again.
It appears that we already support obfuscating them,
and nothing seems to break when they are pulled in.
While here, add runtime/internal/syscall to runtimeAndDeps.
It first appeared in Go 1.18, but we missed adding it.
It seems like not having it there didn't cause any issues,
which makes sense given it's got almost zero Go code.
We also teach garble about the -work boolean build flag,
which has existed for multiple years but we forgot about.
It's likely that noone noticed as it's a rarely used flag.
First, I tried to follow my own past advice to only set GarbleActionID
if ToObfuscate is true. However, that broke at least three parts of
transformCompile, as the hash is used for more than I recalled.
Give up on that idea, because the current code is working as intended.
Better document what GarbleActionID is and what we use it for.
Second, now that https://go.dev/cl/348741 was shipped with Go 1.18,
using the logger when its output is io.Discard is already a no-op.
So we no longer need our debugf wrapper to apply the no-op logic.
First, two cleanups: unsafe and internal/abi are already in
runtimeAndDeps, so they are already not being obfuscated.
No need to repeat them in the other map.
Then, via trial and error, remove:
* runtime/pprof; it seems like we handle its runtime linknames well now.
* os/signal; unclear what "rebuilds don't work" meant, but it works now.
Our gogarble.txt test already does a full reproducible rebuild.
* crypto/x509/internal/macos; another linkname user that works now.
It is likely that we could remove one or two more packages already,
but it's best to move slowly and watch out for unexpected regressions.
It added packages which are only built with the boringcrypto build tag,
so trying to `go list` them will fail even though it doesn't matter.
While here, a few more minor cleanups:
1) Hide GarbleActionID and ToObfuscate from encoding/json, so that they
can't possibly collide with the fields consumed from `go list -json`.
2) Add test cases for `garble build` with packages that fail to load.
Note that this requires GOGARBLE=* to avoid its "does not match any
package to be built" error.
3) Remove the last use of interface{}, in a testdata file.
Fixes#531.
Returning a json or gob error directly to the user is generally not
helpful, as it lacks any form of context.
For example, from the json unmarshal of "go env -json":
$ garble build
invalid character 'w' looking for beginning of value
Also improve the error when the user ran garble in the wrong way,
resulting in no shared gob file. The context is now shorter, and we also
include the os.Open error in case it contains any useful details.
While here, apply Go tip's gofmt, which reformatted a godoc list.
For #523.
Now that we've released v0.6.0, that will be the last feature release to
feature support for Go 1.17. The upcoming v0.7.0 will be Go 1.18+.
Code-wise, the cleanup here isn't super noticeable,
but it will be easier to work on features like VCS-aware version
information and generics support without worrying about Go 1.17.
Plus, now CI is back to being much faster.
Note how "go 1.18" in go.mod makes "go mod tidy" more aggressive.
The added comment in main.go explains the situation in detail.
The added test is a minimization of the scenario, which failed:
> cd mod1
> garble -seed=${SEED1} build -v gopkg.in/garbletest.v2
> cd ../mod2
> garble -seed=${SEED1} build -v
[stderr]
test/main/mod2
# test/main/mod2
cannot load garble export file for gopkg.in/garbletest.v2: open […]/go-build/ed/[…]-garble-ZV[…]-d: no such file or directory
To work around the problem, we'll always add each package's
GarbleActionID to its build artifact, even when not using -seed.
This will get us the previous behavior with regards to the build cache,
meaning that we fix the recent regression.
The added variable doesn't make it to the final binary either.
While here, improve the cached file loading error's context,
and add an extra sanity check for duplicates on ListedPackages.
The default behavior of garble is to seed via the build inputs,
including the build IDs of the entire Go build of each package.
This works well as a default, and does give us determinism,
but it means that building for different platforms
will result in different obfuscation per platform.
Instead, when -seed is provided, don't use any other hash seed or salt.
This means that a particular Go name will be obfuscated the same way
as long as the seed, package path, and name itself remain constant.
In other words, when the user supplies a custom -seed,
we assume they know what they're doing in terms of storage and rotation.
Expand the README docs with more examples and detail.
Fixes#449.
Back in the day, we used to call toObfuscate anytime we needed to know
whether a package should be obfuscated.
More recently, we started computing via the ToObfuscate field,
which then gets shared with all sub-processes via sharedCache.
We still had two places that directly called toObfuscate.
Replace those with ToObfuscate, and inline toObfuscate into shared.go.
obfuscatedImportPath is also a potential footgun for main packages.
Some use cases always want the original "main" package name,
such as for use in the compiler's "-p main" flag,
while other cases want the obfuscated package import path,
such as the entries in importcfg files.
Since each of these call sites handles the edge case well,
obfuscatedImportPath now panics on main packages to avoid any misuse.
Finally, test that we never leak main package paths via ldflags.txt.
We never did, but it's good to make sure.
Overall, this avoids confusion and trims the size of main.go a bit.