Go code can retrieve and use field and method names via the `reflect` package.
For that reason, historically we did not obfuscate names of fields and methods
underneath types that we detected as used for reflection, via e.g. `reflect.TypeOf`.
However, that caused a number of issues. Since we obfuscate and build one package
at a time, we could only detect when types were used for reflection in their own package
or in upstream packages. Use of reflection in downstream packages would be detected
too late, causing one package to obfuscate the names and the other not to, leading to a build failure.
A different approach is implemented here. All names are obfuscated now, but we collect
those types used for reflection, and at the end of a build in `package main`,
we inject a function into the runtime's `internal/abi` package to reverse the obfuscation
for those names which can be used for reflection.
This does mean that the obfuscation for these names is very weak, as the binary
contains a one-to-one mapping to their original names, but they cannot be obfuscated
without breaking too many Go packages out in the wild. There is also some amount
of overhead in `internal/abi` due to this, but we aim to make the overhead insignificant.
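
As a rough illustration of the one-to-one mapping described above, here is a minimal sketch. The table contents, the `obfuscatedToOriginal` variable, and the `originalName` helper are all hypothetical; the function actually injected into `internal/abi` is not shown in this message and may look quite different.

```go
package main

import "fmt"

// obfuscatedToOriginal is a hypothetical lookup table from obfuscated names
// to the original names of fields and methods reachable via reflection.
// The entries are made up for this sketch.
var obfuscatedToOriginal = map[string]string{
	"kX3fQ": "Name",
	"zP9aB": "String",
}

// originalName reverses the obfuscation for names present in the table,
// and returns any other name unchanged.
func originalName(name string) string {
	if orig, ok := obfuscatedToOriginal[name]; ok {
		return orig
	}
	return name
}

func main() {
	fmt.Println(originalName("kX3fQ"))
	fmt.Println(originalName("notInTable"))
}
```

The sketch also shows why the obfuscation of these names is weak: anyone who can read the table out of the binary recovers the original names.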
Fixes #884, #799, #817, #881, #858, #843, #842.
Closes #406.
We used it to detect GOOS-specific packages and ignore their load errors
without having to do a substring search.
However, it turns out that repeatedly loading the string slice
from gob files in the cache is rather slow, particularly since many
Go packages have dozens of GOOS-specific files which can be ignored.
            │      old      │                new                 │
            │ cached-sec/op │ cached-sec/op  vs base              │
    Build-8     340.3m ± 1%     335.8m ± 2%  -1.32% (p=0.002 n=10)

            │     old     │               new                │
            │ mallocs/op  │ mallocs/op   vs base             │
    Build-8   35.73M ± 0%   35.09M ± 0%  -1.79% (p=0.000 n=10)
In looking at the cpu and memory profiles, it surfaced that we spent
a lot of time in garbage collection, and a significant amount of the
garbage was produced by gob decoding string slices.
listedPackage.Deps is a list of a package's transitive dependencies,
so as a Go build gets larger, the list also gets larger and larger.
Given that Imports is the list of direct dependencies,
we can reconstruct Deps ourselves when needed, which is not always the case.
Moreover, since we want to do lookups, we can build a map directly.
This doesn't directly result in a wall time speed-up,
but it does result in a significant reduction in allocations.
The gob files we store in the disk cache should also be a bit smaller.
            │      old      │                new                 │
            │ cached-sec/op │ cached-sec/op  vs base              │
    Build-8     339.5m ± 2%     340.3m ± 1%  ~ (p=0.218 n=10)

            │     old     │               new                │
            │ mallocs/op  │ mallocs/op   vs base             │
    Build-8   38.08M ± 0%   35.73M ± 0%  -6.18% (p=0.000 n=10)
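
The idea of rebuilding the transitive dependency set from Imports can be sketched as follows. The `pkg` type and `transitiveDeps` function are hypothetical stand-ins for this illustration, not garble's actual code.

```go
package main

import "fmt"

// pkg is a hypothetical, trimmed-down stand-in for a listed package;
// only the direct imports are stored.
type pkg struct {
	ImportPath string
	Imports    []string
}

// transitiveDeps walks the Imports edges starting at root and returns the
// set of every package reachable from it, as a map ready for lookups.
func transitiveDeps(pkgs map[string]*pkg, root string) map[string]struct{} {
	deps := make(map[string]struct{})
	var visit func(path string)
	visit = func(path string) {
		p := pkgs[path]
		if p == nil {
			return // unknown package; nothing to walk
		}
		for _, imp := range p.Imports {
			if _, seen := deps[imp]; seen {
				continue
			}
			deps[imp] = struct{}{}
			visit(imp)
		}
	}
	visit(root)
	return deps
}

func main() {
	pkgs := map[string]*pkg{
		"a": {ImportPath: "a", Imports: []string{"b"}},
		"b": {ImportPath: "b", Imports: []string{"c"}},
		"c": {ImportPath: "c"},
	}
	deps := transitiveDeps(pkgs, "a")
	_, hasC := deps["c"]
	fmt.Println(len(deps), hasC)
}
```

Because the set is only computed on demand, nothing proportional to the size of the whole build has to be stored per package.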
This allows me to collect a full CPU profile, showing us that we clearly
spend too much CPU time in garbage collection.
When collecting a full memory profile, we can see where the allocations
come from now:
Showing nodes accounting for 5636770, 31.07% of 18141579 total
Dropped 630 nodes (cum <= 90707)
Showing top 10 nodes out of 278
flat flat% sum% cum cum%
1692229 9.33% 9.33% 1910679 10.53% encoding/gob.decStringSlice
1005596 5.54% 14.87% 1005596 5.54% golang.org/x/tools/go/ssa.buildReferrers
458753 2.53% 17.40% 458753 2.53% go/scanner.(*Scanner).scanIdentifier
458752 2.53% 19.93% 458752 2.53% reflect.(*structType).Field
425984 2.35% 22.28% 448136 2.47% go/parser.(*parser).parseIdent
390049 2.15% 24.43% 390049 2.15% golang.org/x/tools/go/ssa.(*BasicBlock).emit
327683 1.81% 26.23% 371373 2.05% golang.org/x/tools/go/ssa.NewConst
311296 1.72% 27.95% 1024551 5.65% mvdan.cc/garble.(*transformer).transformGoFile.func1
287891 1.59% 29.54% 287891 1.59% encoding/gob.decString
278537 1.54% 31.07% 657043 3.62% golang.org/x/tools/go/ssa.(*builder).compLit
This method can work for any invocation of garble,
but for now we only directly wire it up for `go test -bench`.
It can still be used for regular invocations of `garble build`.
x/exp/rand was being used for no apparent reason; use math/rand.
x/exp/maps and x/exp/slices can be replaced with maps and slices
respectively now that we require Go 1.23 or later.
Note that the APIs are slightly different due to iterators.
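
To illustrate the API difference: `maps.Keys` in x/exp/maps returned a slice, whereas the standard library's `maps.Keys` returns an iterator, so it is typically combined with `slices.Sorted` or `slices.Collect`. A minimal sketch, where the `sortedKeys` helper is made up for the example and Go 1.23 or later is assumed:

```go
package main

import (
	"fmt"
	"maps"
	"slices"
)

// sortedKeys returns the keys of m in sorted order.
// maps.Keys yields an iterator, which slices.Sorted collects and sorts.
func sortedKeys(m map[string]int) []string {
	return slices.Sorted(maps.Keys(m))
}

func main() {
	fmt.Println(sortedKeys(map[string]int{"b": 1, "a": 2, "c": 3}))
}
```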
This lets us start taking advantage of features from Go 1.23,
particularly tracking aliases in go/types and iterators.
Note that we need to add code to properly handle or skip over the new
*types.Alias type which go/types produces for Go type aliases.
Also note that we actually turn this mode off entirely for now,
due to the bug reported at https://go.dev/issue/70394.
We don't remove our own alias tracking code yet due to the above.
We hope to be able to remove it very soon.
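
For context, code that switches over go/types types needs a case for `*types.Alias`, or needs to call `types.Unalias` first. The sketch below uses only public go/types APIs available since Go 1.22; the `describe` and `newIntAlias` helpers are hypothetical and are not taken from garble.

```go
package main

import (
	"fmt"
	"go/token"
	"go/types"
)

// describe reports what kind of type t is, looking through aliases
// rather than treating *types.Alias as an unknown type.
func describe(t types.Type) string {
	switch t := t.(type) {
	case *types.Alias:
		return "alias of " + describe(types.Unalias(t))
	case *types.Basic:
		return "basic " + t.Name()
	default:
		return t.String()
	}
}

// newIntAlias builds the equivalent of "type MyInt = int" by hand,
// so that the example does not need to type-check any source code.
func newIntAlias() types.Type {
	obj := types.NewTypeName(token.NoPos, nil, "MyInt", nil)
	return types.NewAlias(obj, types.Typ[types.Int])
}

func main() {
	fmt.Println(describe(newIntAlias()))
}
```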
Otherwise we miscalculate int sizes, type sizes, alignments, and so on.
Caught by the GOARCH=386 go test on CI, since the os package imports
internal/syscall/unix, which uses arch-dependent padding.
The different padding between our incorrect use of go/types
and the correct typechecking done by the compiler caused different
obfuscation of fields, as the struct types stringified differently,
and they are used as a hash salt for field name obfuscation.
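
The effect of the target architecture on struct layout can be seen with `types.SizesFor`. This is only a sketch of the underlying issue: whether garble obtains its sizes this way is an assumption, and `paddedStructSize` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"go/token"
	"go/types"
)

// paddedStructSize returns the size in bytes of struct{ a int8; b int64 }
// as computed by go/types for the gc compiler on the given GOARCH.
// It returns -1 if go/types has no sizes for that architecture.
func paddedStructSize(goarch string) int64 {
	sizes := types.SizesFor("gc", goarch)
	if sizes == nil {
		return -1
	}
	fields := []*types.Var{
		types.NewField(token.NoPos, nil, "a", types.Typ[types.Int8], false),
		types.NewField(token.NoPos, nil, "b", types.Typ[types.Int64], false),
	}
	return sizes.Sizeof(types.NewStruct(fields, nil))
}

func main() {
	for _, goarch := range []string{"386", "amd64"} {
		fmt.Printf("GOARCH=%s: %d bytes\n", goarch, paddedStructSize(goarch))
	}
}
```

On 386 an int64 field is only 4-byte aligned, so the struct needs less padding than on amd64. Any computation that depends on such layouts gives different results if the wrong architecture is assumed.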
Recently, a patch changed the argument `-mod=` to `-mod=readonly`
as the former is not really a valid flag value, and broke with go.work.
However, the latter seems to break our tests on Go 1.22.6
when listing all of runtimeLinknamed:
panic: failed to load missing runtime-linknamed packages: golang.org/x/crypto@v0.16.1-0.20231129163542-152cdb1503eb:
reading http://127.0.0.1:43357/mod/golang.org/x/crypto/@v/v0.16.1-0.20231129163542-152cdb1503eb.mod: 404 Not Found
It seems like, somehow, listing std packages was trying to download
x/crypto from GOPROXY - which is a local server with testdata/mod,
and so it does not contain x/crypto. However, this is entirely wrong,
as std vendors dependencies, including this very version of x/crypto.
Reverting the change to `-mod=readonly` resolves this issue,
which explains why we hadn't encountered this surprising GOPROXY error,
but the revert would also break users of go.work files.
Luckily, we have a better alternative: rather than trying to override
the value of the flags by adding more arguments, delete them entirely.
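
Deleting a flag from an argument list, rather than appending an override, could look roughly like the sketch below. The `dropFlag` helper is hypothetical and assumes the flag always takes a value, either attached with `=` or as the following argument; garble's real flag handling may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// dropFlag returns args without any occurrence of the named flag.
// It assumes the flag takes a value, given as "-name=value" or "-name value".
func dropFlag(args []string, name string) []string {
	out := make([]string, 0, len(args))
	for i := 0; i < len(args); i++ {
		arg := args[i]
		switch {
		case arg == "-"+name || arg == "--"+name:
			i++ // also skip the value passed as a separate argument
		case strings.HasPrefix(arg, "-"+name+"="),
			strings.HasPrefix(arg, "--"+name+"="):
			// the value is attached; dropping this argument is enough
		default:
			out = append(out, arg)
		}
	}
	return out
}

func main() {
	args := []string{"build", "-mod=vendor", "-v", "-mod", "readonly", "./..."}
	fmt.Println(dropFlag(args, "mod"))
}
```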
And update some actions and staticcheck while here.
Drop the testing of Go master as well, as I haven't used or maintained
such a setup for a while now. We can simply add Go 1.24 RC versions
to the go-version matrix once they come out.
Fixes #859.
This teaches the program how to collect information from multiple
Go versions and join it together. For this to work, it needs to
select the Go versions itself, which is now possible via GOTOOLCHAIN.
The merging of data is fairly simple; we join the results from all
versions, and we remove duplicates from older Go versions.
Start producing output with the Go version noted on every data point,
so that we can easily scan what each Go version is contributing.
The empty string is not a valid value for the -mod flag, and it fails when using a workspace too:
go: -mod may only be set to readonly or vendor when in workspace mode, but it is set to ""
gopls correctly pointed out that the err==nil check was never met,
as err was assigned and we returned early when err!=nil.
This was an oversight when I wrote this; when Encode fails,
we shouldn't return, because we still want to close the file.
We don't defer because we want to check the error; explain that.
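
The pattern described above, closing the file even when Encode fails and still checking the Close error, can be sketched like this. The `writeGob` and `roundTrip` helpers are illustrative names, not the functions changed by this commit.

```go
package main

import (
	"encoding/gob"
	"fmt"
	"os"
)

// writeGob encodes v into a new file at path.
// Close is not deferred because its error must be checked: a failed Close
// can mean the data never reached the disk. The file is closed even when
// Encode fails, and the Encode error takes priority.
func writeGob(path string, v any) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	encodeErr := gob.NewEncoder(f).Encode(v)
	closeErr := f.Close()
	if encodeErr != nil {
		return encodeErr
	}
	return closeErr
}

// roundTrip writes in to a temporary file and reads it back.
func roundTrip(in []string) ([]string, error) {
	tmp, err := os.CreateTemp("", "example-*.gob")
	if err != nil {
		return nil, err
	}
	path := tmp.Name()
	tmp.Close()
	defer os.Remove(path)

	if err := writeGob(path, in); err != nil {
		return nil, err
	}
	f, err := os.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close() // read-only file, so the Close error is not interesting
	var out []string
	err = gob.NewDecoder(f).Decode(&out)
	return out, err
}

func main() {
	out, err := roundTrip([]string{"a", "b"})
	fmt.Println(out, err)
}
```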
Keeping the original lexical sorting of Go packages would be very hard,
as a Go program may import an unknown number of Go packages,
and we load and obfuscate one package at a time by design.
One option would be to load all packages upfront when obfuscating
main packages, but that would break the per-package caching of
obfuscated Go packages, causing a huge slow-down in builds.
Another option would be to not obfuscate import paths,
which would clearly cause a worsening of the obfuscation quality.
The third option is to not attempt to keep the original order,
and document that as a caveat in the README.
I suspect the vast majority of Go projects won't be affected by this,
and those few that might be can always use imports to enforce the order.
Closes #693, per the decision above to not change what we do.
Right now, we only have linker patches for Go 1.22.x.
We only ever maintain those for one or two major Go versions at a time.
If a user tries to use the Go toolchain from 1.21, we already fail
with "Go version too old" messages early on, but we don't for 1.23,
causing a relatively confusing error later on when we link a binary:
cannot get modified linker: cannot retrieve linker patches: open patches/go1.23: file does not exist
Instead, fail early and with a good error message.
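
An early check along these lines might look like the sketch below. The supported range, the `checkGoVersion` helper, and the "too new" wording are assumptions for illustration; only the "Go version too old" message is mentioned above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Hypothetical supported range of Go 1.x minor versions,
// matching a state where only Go 1.22.x linker patches exist.
const (
	minMinor = 22
	maxMinor = 22
)

// checkGoVersion validates a version string such as "go1.22.3" and
// returns a descriptive error if it falls outside the supported range.
func checkGoVersion(v string) error {
	rest, ok := strings.CutPrefix(v, "go1.")
	if !ok {
		return fmt.Errorf("cannot parse Go version %q", v)
	}
	// Keep only the leading digits, so "23rc1" and "22.3" both parse.
	end := 0
	for end < len(rest) && rest[end] >= '0' && rest[end] <= '9' {
		end++
	}
	minor, err := strconv.Atoi(rest[:end])
	if err != nil {
		return fmt.Errorf("cannot parse Go version %q", v)
	}
	switch {
	case minor < minMinor:
		return fmt.Errorf("Go version %q is too old; please upgrade to Go 1.%d or newer", v, minMinor)
	case minor > maxMinor:
		return fmt.Errorf("Go version %q is too new; Go 1.%d is the newest supported version", v, maxMinor)
	}
	return nil
}

func main() {
	for _, v := range []string{"go1.21.0", "go1.22.3", "go1.23rc1"} {
		fmt.Printf("%s: %v\n", v, checkGoVersion(v))
	}
}
```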
Panicking in small helpers or in funcs that don't return error
has proved useful to keep code easier to maintain,
particularly for cases that should typically never happen.
However, in these cases we can error just as easily.
In particular, I was getting a panic whenever I forgot
that I was running garble with Go master (1.23), which is over the top.
When updating Garble to support Go 1.22.0, CI on MacOS spotted
that the syscall package was failing to build given that it uses
assembly code which is only allowed in some std packages.
That allowlist is based on import paths, and we were obfuscating
the syscall package's import path, so that was breaking GOOS=darwin.
As a fix, I added syscall to runtimeAndDeps to not obfuscate it.
That wasn't a great fix; it's not part of runtime and its dependencies,
and there's no reason we should avoid obfuscating the package contents.
Not obfuscating the contents in fact broke x/sys/unix,
as it contains a copy of syscall.Rlimit which it type converted with.
Undo that fix and reinstate the gogarble.txtar syscall test.
Implement the fix where we only leave syscall's import path alone.
Add a regression test, and add a note about adding x/net and x/sys
to check-third-party.sh so that we can catch these bugs earlier.
Fixes #830.
We no longer need to worry about the scope of range variables,
we can iterate over integers directly, and we can use cmp.Or too.
I haven't paid close attention to using these everywhere.
This is mainly testing out the new features where I saw some benefit.
The trash block generator docs aren't ready yet; they will come soon.
This is not a release blocker, given that the control flow obfuscator
is experimental and opt-in for now.
It seems like building with Go 1.22.0 for GOOS=darwin started
running into some issues with the syscall package's use of ABIInternal
in assembly source code:
> exec garble build
[stderr]
# syscall
[...].s:16: ABI selector only permitted when compiling runtime, reference was to "runtime.entersyscall"
The error can be reproduced from another platform like GOOS=linux
as long as we have any test that cross-compiles std to GOOS=darwin.
We had crossbuild.txtar which only ensured we covered GOOS=windows
and GOOS=linux, so add a third case to ensure MacOS is covered too.
This will slow down the tests a bit, but is important for the sake
of ensuring that we catch these bugs early, even without MacOS on CI.
In fact, we hadn't caught this earlier for Go 1.22 precisely because
on CI we only tested on Go tip with GOOS=linux, for the sake of speed.
Adding the rest of the package import paths from objabi.allowAsmABIPkgs
to our runtimeAndDeps generated map solves this error.