Go 1.21 already breaks one of the three patches.
The complexity of the patches will likely increase,
and we only ever need to actively maintain the highest set of patches,
so splitting by major version feels natural.
We're building the linker binary for the host GOOS,
not the target GOOS that we happen to be building for.
I noticed that, after running `go test`, my garble cache
would contain both link and link.exe, which made no sense
as I run linux and not windows.
`go env` has GOHOSTOS to mirror GOOS, but there is no
GOHOSTEXE to mirror GOEXE, so we reconstruct it from
runtime.GOOS, which is equivalent to GOHOSTOS.
Add a regression test as well.
When we use `go list` on the standard library, we need to be careful
about what flags are passed from the top-level build command,
because some flags are not going to be appropriate.
In particular, GOFLAGS=-modfile=... resulted in a failure,
reproduced via the GOFLAGS variable added to linker.txtar:
go: inconsistent vendoring in /home/mvdan/tip/src:
golang.org/x/crypto@v0.5.1-0.20230203195927-310bfa40f1e4: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod
golang.org/x/net@v0.7.0: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod
golang.org/x/sys@v0.5.1-0.20230208141308-4fee21c92339: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod
golang.org/x/text@v0.7.1-0.20230207171107-30dadde3188b: is marked as explicit in vendor/modules.txt, but not explicitly required in go.mod
To ignore the vendor directory, use -mod=readonly or -mod=mod.
To sync the vendor directory, run:
go mod vendor
To work around this problem, reset the -mod and -modfile flags when
calling "go list" on the standard library, as those are the only two
flags which alter how we load the main module in a build.
The code which builds a modified cmd/link has a similar problem;
it already reset GOOS and GOARCH, but it could similarly run into
problems if other env vars like GOFLAGS were set.
To be on the safe side, we also disable GOENV and GOEXPERIMENT,
which we borrow from Go's bootstrapping commands.
At linker stage, we now encrypt funcInfo.entryoff value with a simple algorithm (1 xor + 1 mul).
This makes it harder to relate function metadata (e.g. name) to function itself in binary, almost without affecting performance.
Even when setting git's current directory to a temporary directory,
it could find a git repository in a parent directory and still malfunction.
Use the --git-dir flag to ensure that walking doesn't happen at all.
While here, ensure that "git apply" is always applying our patches,
and add a regression test to linker.txtar when not testing with -short.
We were noticing sporadic `go test` failures where running
an obfuscated binary could panic with:
fatal error: invalid function symbol table
It turns out that this could happen when modinfo.txtar runs first;
when it patched the linker with `git -C apply`, its current directory
had a valid git repository initialized, and apparently that somehow
prevents the patches from being applied.
It's unclear whether this is git's fault or our own, but in any case,
using a temporary working directory is easier and fixes the bug.
Do the same for `go build`, just in case.
The patch to the linker does this when generating the pclntab,
which is the binary section containing func names.
When `-tiny` is being used, we look for unexported funcs,
and set their names to the offset `0` - a shared empty string.
We also avoid including the original name in the binary,
which saves a significant amount of space.
The following stats were collected on GOOS=linux,
which show that `-tiny` is now about 4% smaller:
go build 1203067
garble build 782336
(old) garble -tiny build 688128
(new) garble -tiny build 659456
This value is hard-coded in the linker and written in a header.
We could rewrite the final binary, like we used to do with import paths,
but that would require once again maintaining libraries to do so.
Instead, we're now modifying the linker to do what we want.
It's not particularly hard, as every Go install has its source code,
and rebuilding a slightly modified linker only takes a few seconds at most.
Thanks to `go build -overlay`, we only need to copy the files we modify,
and right now we're just modifying one file in the toolchain.
We use a git patch, as the change is fairly static and small,
and the patch is easier to understand and maintain.
The other side of this change is in the runtime,
as it also hard-codes the magic value when loading information.
We modify the code via syntax trees in that case, like `-tiny` does,
because the change is tiny (one literal) and the affected lines of code
are modified regularly between major Go releases.
Since rebuilding a slightly modified linker can take a few seconds,
and Go's build cache does not cache linked binaries,
we keep our own cached version of the rebuilt binary in `os.UserCacheDir`.
The feature isn't perfect, and will be improved in the future.
See the TODOs about the added dependency on `git`,
or how we are currently only able to cache one linker binary at once.
Fixes#622.