The generics issue has been fixed for the upcoming Go 1.20.
Include that version as a reminder for when we can drop Go 1.19.
The fs.SkipAll proposal is also implemented for Go 1.20.
The BinaryContentID comment was a little bit trickier.
We did get stamped VCS information some time ago,
but it only provides us with the current commit info and a dirty bit.
That is not enough for our use of the build cache,
because we want any uncommitted changes to garble to cause rebuilds.
I don't think we'll get any better than using garble's own build ID.
Reword the quasi-TODO to instead explain what we're doing and why.
See https://golang.org/issue/28749.
The improved asm test would fail:
go parse: $WORK/imported/imported_amd64.s:1:1: expected 'package', found TEXT (and 2 more errors)
because we would incorrectly parse a non-Go file as a Go file.
Add a workaround. The original reporter's reproducer with go-ethereum
works now, as this was the last hiccup.
Fixes #555.
The reverse feature relied on `GoFiles` from `go list`,
but that list may not be enough to typecheck a package:
typecheck error: $WORK/main.go:3:15: undeclared name: longMain
`go help list` shows:
GoFiles []string // .go source files (excluding CgoFiles, TestGoFiles, XTestGoFiles)
CgoFiles []string // .go source files that import "C"
CompiledGoFiles []string // .go files presented to compiler (when using -compiled)
In other words, to mimic the same list of Go files fed to the compiler,
we want CompiledGoFiles.
Note that, since the cgo files show up as generated files,
we currently do not support reversing their filenames.
That is left as a TODO for now.
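A small sketch of the difference, decoding a subset of the `go list -json` fields. The JSON input is made up for illustration, and the listedPackage type and filesToTypecheck helper are hypothetical names, not garble's own.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// listedPackage holds a subset of the fields documented by `go help list`.
type listedPackage struct {
	ImportPath      string
	GoFiles         []string
	CgoFiles        []string
	CompiledGoFiles []string
}

// filesToTypecheck returns the files that were presented to the compiler,
// which is the list needed to typecheck the package as the compiler saw it.
func filesToTypecheck(data []byte) ([]string, error) {
	var pkg listedPackage
	if err := json.Unmarshal(data, &pkg); err != nil {
		return nil, err
	}
	return pkg.CompiledGoFiles, nil
}

func main() {
	// Hypothetical output for a package with one plain file and one cgo file.
	data := []byte(`{
		"ImportPath": "test/main",
		"GoFiles": ["main.go"],
		"CgoFiles": ["cgo.go"],
		"CompiledGoFiles": ["main.go", "cgo.cgo1.go", "_cgo_gotypes.go"]
	}`)
	files, err := filesToTypecheck(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(files)
}
```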
Updates #555.
Assembly files can include header files within the same Go module,
and those header files can include "defines" which refer to Go names.
Since those Go names are likely being obfuscated,
we need to replace them just like we do in assembly files.
The added mechanism is rather basic; we add two TODOs to improve it.
This should help when building projects like go-ethereum.
Fixes #553.
This way, the child process knows that it's running a toolchain command
via -toolexec without having to guess via filepath.IsAbs.
While here, improve the docs and tests a bit.
Following the best practices from upstream.
In particular, the "txt" extension is somewhat ambiguous.
This may cause some conflicts due to the git diff noise,
but hopefully we won't ever do this again.
I wrongly assumed that, if `used` has an `Elem` method,
then `origin` must too. But it does not if it's a type parameter.
Add a test case too, which panicked before the fix.
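To illustrate the wrong assumption, a minimal sketch using go/types directly: a pointer type has an Elem method, while a type parameter does not, so the type assertion must be checked rather than assumed. The elemOf and demo helpers are hypothetical and do not mirror garble's code.

```go
package main

import (
	"fmt"
	"go/token"
	"go/types"
)

// elemOf returns the element type of t, if t has an Elem method at all.
func elemOf(t types.Type) (types.Type, bool) {
	if e, ok := t.(interface{ Elem() types.Type }); ok {
		return e.Elem(), true
	}
	return nil, false
}

// demo reports whether a pointer type and a type parameter have Elem.
func demo() (pointerHasElem, typeParamHasElem bool) {
	ptr := types.NewPointer(types.Typ[types.Int])
	// A free-standing type parameter named T, with its constraint left unset.
	tparam := types.NewTypeParam(types.NewTypeName(token.NoPos, nil, "T", nil), nil)
	_, pointerHasElem = elemOf(ptr)
	_, typeParamHasElem = elemOf(tparam)
	return pointerHasElem, typeParamHasElem
}

func main() {
	fmt.Println(demo())
}
```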
Fixes #577.
While here, start the changelog for the upcoming release,
which will likely be a bugfix release as it's a bit early to drop 1.18.
We also bump staticcheck to get a version that supports 1.19.
I also noticed the "Go version X or newer" messages were slightly weird
and inconsistent. Our policy, per the README, is "Go version X or newer",
so the errors given to the user were unnecessarily confusing.
For example, now that Go 1.19 is out, we shouldn't simply recommend that
they upgrade to 1.18; we should recommend 1.18 or later.
When obfuscating the following piece of code:
func issue_573(s struct{ f int }) {
var _ *int = &s.f
/*x*/
}
the function body would roughly end up printed as:
var _ *int = &dZ4xYx3N
/*x*/.rbg1IM3V
Note that the /*x*/ comment got moved earlier in the source code.
This happens because the new identifiers are longer, so the printer
thinks that the selector now ends past the comment.
That would be fine - we don't really mind where comments end up,
because these non-directive comments end up being removed anyway.
However, the resulting syntax is wrong, as the period for the selector
must be on the first line rather than the second.
This is a go/printer bug that we should fix upstream,
but until then, we must work around it in Go 1.18.x and 1.19.x.
The fix is somewhat obvious in hindsight. To reduce the chances that
go/printer will trip over comments and produce invalid syntax,
get rid of most comments before we use the printer.
We still keep the removal of comments after printing,
since go/printer consumes some comments in ast.Node Doc fields.
Add the minimized unit test case above, and add the upstream project
that found this bug to check-third-party.
andybalholm/brotli helps cover a compression algorithm and ccgo code
generation from C to Go, and it's also a fairly popular module,
particularly with HTTP implementations which want pure-Go brotli.
While here, fix the check-third-party script: it was setting GOFLAGS
a bit too late, so it may run `go get` on the wrong mod file.
Fixes #573.
Every now and then, I get test failures in the goenv test like:
> [!windows] cp $EXEC_PATH $NAME/garble$exe
> [!windows] exec $NAME/garble$exe build
[fork/exec with"double"quotes/garble: text file busy]
FAIL: testdata/scripts/goenv.txt:21: unexpected command failure
The root cause is https://go.dev/issue/22315, which isn't going to be
fixed anytime soon, as it is a race condition in Linux itself, triggered
by how heavily concurrent Go tends to be.
For now, try to make the race much less likely to happen.
The changes are all fairly minor and non-breaking,
and the last release was less than two months ago,
so a bugfix release sounds like the right choice.
That is, since Go 1.18.1, released back in April 2022.
We no longer need to worry about the buggy Go 1.18.0.
While here, use a clearer env var name; the settings are build settings.
Make the literals section easier to follow by using the word
"expression" more consistently.
The tiny section was misleading as well, as the README made it seem like
position information was stripped both with and without the flag.
Clarify what happens with that information in each case.
We obfuscate import paths in import declarations like:
"domain.com/somepkg"
by replacing them with the obfuscated package path:
somepkg "HPS4Mskq"
Note how we add a name to the import if there wasn't one,
so that references like somepkg.Foo keep working in the code.
This could break in some edge cases involving comments between imports
in the Go code, because go/printer is somewhat brittle with positions:
> garble build -tags buildtag
[stderr]
# test/main/importedpkg
:16: syntax error: missing import path
exit status 2
exit status 2
To prevent that, ensure the name has a reasonable position.
This was preventing github.com/gorilla/websocket from being obfuscated.
It is a fairly popular library in Go, but we don't add it to
scripts/check-third-party.sh for now as wireguard already gives us
coverage over networking and cryptography.
These lines get executed for every identifier in every package in each
Go build, so one allocation per log.Printf call can quickly add up to
millions of allocations across a build.
Until https://go.dev/issue/53465 is fixed, the best way to avoid the
escaping due to `...any` is to not perform the function call at all.
name      old time/op         new time/op         delta
Build-16          10.5s ± 1%          10.5s ± 2%    ~     (p=0.604 n=9+10)

name      old bin-B           new bin-B           delta
Build-16          5.52M ± 0%          5.52M ± 0%    ~     (all equal)

name      old cached-time/op  new cached-time/op  delta
Build-16          506ms ±13%          500ms ± 7%    ~     (p=0.739 n=10+10)

name      old mallocs/op      new mallocs/op      delta
Build-16          31.7M ± 0%          30.1M ± 0%  -5.33%  (p=0.000 n=10+9)

name      old sys-time/op     new sys-time/op     delta
Build-16          5.70s ± 5%          5.78s ± 6%    ~     (p=0.278 n=9+10)
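The pattern is roughly the following sketch. The names debugEnabled, debugLog, recordIdent, and countingWriter are hypothetical and only serve the illustration. When the flag is off, the call is skipped entirely, so the arguments are never boxed into `...any`.

```go
package main

import (
	"fmt"
	"io"
	"log"
)

// debugEnabled and debugLog are hypothetical stand-ins for a debug flag
// and the logger it controls.
var (
	debugEnabled = false
	debugLog     = log.New(io.Discard, "", 0)
)

// recordIdent is called very often, so it guards the log call
// instead of relying on the logger to discard its output.
func recordIdent(name string) {
	if debugEnabled {
		debugLog.Printf("obfuscating %q", name)
	}
}

// countingWriter counts how many times the logger writes to it.
type countingWriter struct{ writes int }

func (w *countingWriter) Write(p []byte) (int, error) {
	w.writes++
	return len(p), nil
}

func main() {
	w := &countingWriter{}
	debugLog.SetOutput(w)
	recordIdent("foo") // skipped: the flag is off
	debugEnabled = true
	recordIdent("bar") // logged
	fmt.Println("writes:", w.writes)
}
```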
There used to be a reason to keep these maps separate, but ever since we
became better at obfuscating the standard library, that has gone away.
It's still a good idea to keep `go list -deps runtime` as a group,
but we can do that via a comment inside a joint map literal.
I also noticed that one comment still referred to cannotObfuscateNames,
which hasn't existed for some time. Fix that up.
It's also not documented how cachedOutput contains info for all deps,
so clarify that while we're improving the docs.
Finally, the reason we cannot obfuscate the syscall package was out of
date; it's not part of the runtime. It is a go:linkname bug.
A chunk from crypto/internal/boring has been split away as a separate
package very recently, shortly before 1.19rc1 is due for release.
See https://go.dev/cl/407135 for more information.
Makes garble work on the latest Go tip again.
It's not a problem to leak filenames like _cgo_gotypes.go,
but it is a problem when it includes the import path:
$ strings main | grep _cgo_gotypes
test/main/_cgo_gotypes.go
Here, "test/main" is the module path, which we want to hide.
We hadn't caught this before because the cgo.txt test did not check that
module paths aren't being leaked - it does now.
The fix is rather simple; we let printFile handle cgo-generated files.
We used to avoid that due to compiler errors, as the compiler only
allows some special cgo comment directives to work in cgo-generated
code, to prevent misuse in user code.
Working around that check is easy: the obfuscated filenames should begin with
"_cgo_" to appease the compiler's check.
Back in February 2021, we changed the obfuscation logic so that the
entire `garble build` process would use one shared temporary directory
across all package builds, reducing the amount of files we created in
the top-level system temporary directory.
However, we made one mistake: we didn't swap os.Remove for os.RemoveAll.
Ever since then, we've been leaving temporary files behind.
Add regression tests, which failed before the fix, and fix the bug.
Note that we need to test `garble reverse` as well, as it calls
toolexecCmd separately, so it needs its own cleanup as well.
The cleanup happens via the env var, which doesn't feel worse than
having toolexecCmd return an extra string or cleanup func.
While here, also test that we support TMPDIRs with special characters.
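The difference between the two calls is easy to reproduce in isolation. This is a standalone sketch with a hypothetical demoCleanup helper, not the regression test added here: os.Remove refuses to delete a directory that still has files in it, while os.RemoveAll deletes the whole tree.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// demoCleanup creates a temporary directory with one file in it,
// then tries os.Remove followed by os.RemoveAll.
func demoCleanup() (removeFailed, removeAllWorked bool, err error) {
	dir, err := os.MkdirTemp("", "garble-sketch-")
	if err != nil {
		return false, false, err
	}
	if err := os.WriteFile(filepath.Join(dir, "obj.o"), []byte("x"), 0o600); err != nil {
		return false, false, err
	}
	// os.Remove only deletes files and empty directories.
	removeFailed = os.Remove(dir) != nil
	if err := os.RemoveAll(dir); err != nil {
		return removeFailed, false, err
	}
	_, statErr := os.Stat(dir)
	removeAllWorked = os.IsNotExist(statErr)
	return removeFailed, removeAllWorked, nil
}

func main() {
	fmt.Println(demoCleanup())
}
```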
printFile is one of the functions to blame for most of the CPU cost and
allocations for garble itself, as reported by `perf record` for a clean build.
One contributor is how we print each file and then parse it again,
which we did for the sake of inserting line directives correctly.
With a bit of care, we can do this by tokenizing after printing,
as opposed to parsing into a full go/ast again.
This is moderately cheaper, but more than anything, allocates far less.
That is to be expected given how go/ast is a tree of pointers,
whereas go/scanner simply gives us a stream of tokens.
name      old time/op         new time/op         delta
Build-16          10.4s ± 2%          10.3s ± 1%    ~     (p=0.393 n=10+10)

name      old bin-B           new bin-B           delta
Build-16          5.51M ± 0%          5.51M ± 0%    ~     (all equal)

name      old cached-time/op  new cached-time/op  delta
Build-16          398ms ±12%          391ms ±10%    ~     (p=0.529 n=10+10)

name      old mallocs/op      new mallocs/op      delta
Build-16          34.4M ± 0%          31.8M ± 0%  -7.65%  (p=0.000 n=10+10)

name      old sys-time/op     new sys-time/op     delta
Build-16          5.80s ± 6%          5.86s ± 4%    ~     (p=0.218 n=10+10)
The new code is shorter, but perhaps a bit trickier,
so I also added more comments to explain what's going on.
Note how the time/op change is practically noise,
but mallocs/op goes down significantly, which is always a good sign.
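For reference, tokenizing with go/scanner looks roughly like this. The tokenKinds helper is a hypothetical sketch; it only collects token kinds and does not insert any line directives.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// tokenKinds scans src and returns the kind of each token up to EOF.
// Unlike parsing, no syntax tree is built; we only see a flat stream.
func tokenKinds(src string) []string {
	fset := token.NewFileSet()
	file := fset.AddFile("src.go", fset.Base(), len(src))
	var s scanner.Scanner
	s.Init(file, []byte(src), nil, scanner.ScanComments)
	var kinds []string
	for {
		_, tok, _ := s.Scan()
		if tok == token.EOF {
			break
		}
		kinds = append(kinds, tok.String())
	}
	return kinds
}

func main() {
	fmt.Println(tokenKinds("package p\n\nfunc f() {}\n"))
}
```

Each call to Scan also returns the token's position, which is what a caller would need in order to decide where a line directive belongs.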
Reuse a buffer and a map across loop iterations, because we can.
Make recordTypeDone only track named types, as that is enough to detect
type cycles. Without named types, there can be no cycles.
These two reduce allocs by a fraction of a percent:
name      old time/op         new time/op         delta
Build-16          10.4s ± 2%          10.4s ± 1%    ~     (p=0.739 n=10+10)

name      old bin-B           new bin-B           delta
Build-16          5.51M ± 0%          5.51M ± 0%    ~     (all equal)

name      old cached-time/op  new cached-time/op  delta
Build-16          391ms ± 9%          407ms ± 7%    ~     (p=0.095 n=10+9)

name      old mallocs/op      new mallocs/op      delta
Build-16          34.5M ± 0%          34.4M ± 0%  -0.12%  (p=0.000 n=10+10)

name      old sys-time/op     new sys-time/op     delta
Build-16          5.87s ± 5%          5.82s ± 5%    ~     (p=0.182 n=10+9)
It doesn't seem like much, but remember that these stats are for the
entire set of processes, where garble only accounts for about 10% of the
total wall time when compared to the compiler or linker. So a ~0.1%
decrease globally is still significant.
linkerVariableStrings is also indexed by *types.Var rather than types.Object,
since -ldflags=-X only supports setting the string value of variables.
This shouldn't make a significant difference in terms of allocs,
but at least the map is less prone to confusion with other object types.
To ensure the new code doesn't trip up on non-variables, we add test cases.
Finally, for the sake of clarity, index into the types.Info maps like
Defs and Uses rather than calling ObjectOf if we know whether the
identifier we have is a definition of a name or the use of a defined name.
This isn't better in terms of performance, as ObjectOf is a tiny method,
but just like with linkerVariableStrings before, the new code is clearer.
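A small sketch of indexing Defs and Uses directly, and of narrowing to *types.Var. The countDefsAndUses helper and the sample source are hypothetical. An identifier appears in Defs where it declares a name and in Uses where it refers to one, so the lookup states which of the two we expect.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"go/types"
)

// countDefsAndUses typechecks src and counts how often the variable name
// is defined and how often it is used.
func countDefsAndUses(src, name string) (int, int, error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "src.go", src, 0)
	if err != nil {
		return 0, 0, err
	}
	info := &types.Info{
		Defs: make(map[*ast.Ident]types.Object),
		Uses: make(map[*ast.Ident]types.Object),
	}
	var conf types.Config
	if _, err := conf.Check("p", fset, []*ast.File{file}, info); err != nil {
		return 0, 0, err
	}
	defs, uses := 0, 0
	ast.Inspect(file, func(n ast.Node) bool {
		id, ok := n.(*ast.Ident)
		if !ok || id.Name != name {
			return true
		}
		// Only variables are of interest; other object kinds are skipped.
		if _, ok := info.Defs[id].(*types.Var); ok {
			defs++
		}
		if _, ok := info.Uses[id].(*types.Var); ok {
			uses++
		}
		return true
	})
	return defs, uses, nil
}

func main() {
	fmt.Println(countDefsAndUses("package p\n\nvar x = 1\nvar y = x\n", "x"))
}
```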
The _gomod_.go file inserted by the Go toolchain no longer shows up;
it's likely that either the -trimpath or -buildvcs=false flags are
preventing that extra bit of work from happening entirely.
The modinfo.txt test ensures that we're not breaking,
and the inner lines of code weren't hit as part of `go test`.
It also appears that we don't need to avoid obfuscating functions
defined with an `//export` directive. This is likely because cgo runs as
a pre-processing step before the compiler, so our removing the
directive later does not make a difference.
We might need to revisit this in the future if we implement obfuscating
Go code instead of builds, e.g. `garble export`.
Just in case, I've expanded the cgo.txt test to also include one more
kind of cgo integration: an "import C" block including a C header file.
Both of these changes are slightly risky, as our tests don't cover all
edge cases. We've just done a release, so now is the time to try them.
Now that we've required Go 1.18 or later for some time,
stop supporting `// +build` directives entirely.
That should be fine, given that `go build` will fail too.
The TODO about ToObfuscate is also obsolete; see the added comment.
Finally, tweak the comments a bit after reading them again.
We don't use go/ast.Objects, as we use go/types instead.
Avoiding this work saves a bit of CPU and memory allocs.
name      old time/op         new time/op         delta
Build-16          10.2s ± 1%          10.2s ± 1%    ~     (p=0.937 n=6+6)

name      old bin-B           new bin-B           delta
Build-16          5.47M ± 0%          5.47M ± 0%    ~     (all equal)

name      old cached-time/op  new cached-time/op  delta
Build-16          328ms ±14%          321ms ± 6%    ~     (p=0.589 n=6+6)

name      old mallocs/op      new mallocs/op      delta
Build-16          34.8M ± 0%          34.0M ± 0%  -2.26%  (p=0.010 n=6+4)

name      old sys-time/op     new sys-time/op     delta
Build-16          5.89s ± 3%          5.89s ± 3%    ~     (p=0.937 n=6+6)
See golang/go#52463.
It appears that we already support obfuscating them,
and nothing seems to break when they are pulled in.
While here, add runtime/internal/syscall to runtimeAndDeps.
It first appeared in Go 1.18, but we missed adding it.
It seems like not having it there didn't cause any issues,
which makes sense given it's got almost zero Go code.
We also teach garble about the -work boolean build flag,
which has existed for years but which we had forgotten about.
It's likely that no one noticed as it's a rarely used flag.
If we don't quote it, paths containing spaces or quote characters will
fail. For instance, the added test without the fix fails:
> env NAME='with spaces'
> mkdir $NAME
> cp $EXEC_PATH $NAME/garble$exe
> exec $NAME/garble$exe build main.go
[stderr]
go tool compile: fork/exec $WORK/with: no such file or directory
exit status 1
Luckily, the fix is easy: we bundle Go's cmd/internal/quoted package,
which implements a QuotedJoin API for this very purpose.
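Since cmd/internal/quoted cannot be imported from outside the Go toolchain, the following is only a simplified stand-in to show the kind of quoting involved. The quotedJoin name and its exact rules are assumptions for this sketch, not the bundled code.

```go
package main

import (
	"fmt"
	"strings"
)

// quotedJoin joins args with spaces, wrapping any argument that contains
// whitespace or a quote character in whichever quote it does not contain.
func quotedJoin(args []string) (string, error) {
	var parts []string
	for _, arg := range args {
		hasSpace := strings.ContainsAny(arg, " \t\n\r")
		hasSingle := strings.Contains(arg, "'")
		hasDouble := strings.Contains(arg, `"`)
		switch {
		case !hasSpace && !hasSingle && !hasDouble:
			parts = append(parts, arg)
		case !hasSingle:
			parts = append(parts, "'"+arg+"'")
		case !hasDouble:
			parts = append(parts, `"`+arg+`"`)
		default:
			return "", fmt.Errorf("cannot quote %q: it contains both quote characters", arg)
		}
	}
	return strings.Join(parts, " "), nil
}

func main() {
	joined, err := quotedJoin([]string{"with spaces/garble", "build"})
	if err != nil {
		panic(err)
	}
	fmt.Println(joined)
}
```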
Fixes #544.
First, I tried to follow my own past advice to only set GarbleActionID
if ToObfuscate is true. However, that broke at least three parts of
transformCompile, as the hash is used for more than I recalled.
Give up on that idea, because the current code is working as intended.
Better document what GarbleActionID is and what we use it for.
Second, now that https://go.dev/cl/348741 was shipped with Go 1.18,
using the logger when its output is io.Discard is already a no-op.
So we no longer need our debugf wrapper to apply the no-op logic.
First, two cleanups: unsafe and internal/abi are already in
runtimeAndDeps, so they are already not being obfuscated.
No need to repeat them in the other map.
Then, via trial and error, remove:
* runtime/pprof; it seems like we handle its runtime linknames well now.
* os/signal; unclear what "rebuilds don't work" meant, but it works now.
Our gogarble.txt test already does a full reproducible rebuild.
* crypto/x509/internal/macos; another linkname user that works now.
It is likely that we could remove one or two more packages already,
but it's best to move slowly and watch out for unexpected regressions.
When splitFlagsFromFiles saw "-p foo/bar.go",
it thought that was the first Go file, when in fact it's not.
We didn't notice because such import paths are pretty rare,
but they do exist, like github.com/nats-io/nats.go.
Before the fix, the added test case fails as expected:
> garble build -tags buildtag
[stderr]
# test/main/goextension.go
open test/main/goextension.go: no such file or directory
We could go through the trouble of teaching splitFlagsFromFiles about
all of the flags understood by the compiler and linker, but that feels
like far more code than the small alternative we went with.
And I'm pretty sure the alternative will work pretty reliably for now.
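One small alternative, shown here purely as an assumption about what such a fix could look like, is to scan the arguments from the end, since the files come last. The splitFromEnd name is hypothetical. Note that it relies on at least one flag or non-matching argument sitting between a value like "foo/bar.go" and the first real file.

```go
package main

import (
	"fmt"
	"strings"
)

// splitFromEnd splits a tool's arguments into flags and trailing files.
// It walks backwards and stops at the first argument that cannot be a file.
func splitFromEnd(all []string, ext string) (flags, files []string) {
	for i := len(all) - 1; i >= 0; i-- {
		arg := all[i]
		if strings.HasPrefix(arg, "-") || !strings.HasSuffix(arg, ext) {
			cutoff := i + 1 // arg belongs to the flags, not to the files
			return all[:cutoff], all[cutoff:]
		}
	}
	return nil, all
}

func main() {
	args := []string{"-p", "foo/bar.go", "-o", "out.a", "main.go", "util.go"}
	flags, files := splitFromEnd(args, ".go")
	fmt.Println(flags)
	fmt.Println(files)
}
```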
Fixes #539.
Our tests should already be pretty extensive,
and any bug fixes should result in more regression test cases,
but testing against a few diverse and popular third party modules
will help prevent unintended regressions while developing garble.
The list is short for now. More can be added later.
This adds protobuf and wireguard from the original issue,
but not cobra and logrus, as they aren't particularly complex nor add
significant variety on top of protobuf and wireguard.
While here, we remove the job that only runs crlf-test.sh,
as we don't really need a separate job for a tiny script.
Fixes #240.
It added packages which are only built with the boringcrypto build tag,
so trying to `go list` them will fail even though it doesn't matter.
While here, a few more minor cleanups:
1) Hide GarbleActionID and ToObfuscate from encoding/json, so that they
can't possibly collide with the fields consumed from `go list -json`.
2) Add test cases for `garble build` with packages that fail to load.
Note that this requires GOGARBLE=* to avoid its "does not match any
package to be built" error.
3) Remove the last use of interface{}, in a testdata file.
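For the first cleanup, the mechanism is the `json:"-"` struct tag. A minimal sketch follows; the listedPackage type, the field types, and the decodePackage helper are assumptions made for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// listedPackage mixes a field decoded from `go list -json` with fields
// that the tool fills in itself. The field types here are assumptions.
type listedPackage struct {
	ImportPath string

	// Hidden from encoding/json, so input JSON can never set them.
	GarbleActionID string `json:"-"`
	ToObfuscate    bool   `json:"-"`
}

// decodePackage decodes one JSON object into a listedPackage.
func decodePackage(data string) (listedPackage, error) {
	var pkg listedPackage
	err := json.Unmarshal([]byte(data), &pkg)
	return pkg, err
}

func main() {
	// Hypothetical input that happens to use the same key names.
	pkg, err := decodePackage(`{"ImportPath":"test/main","ToObfuscate":true}`)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", pkg)
}
```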
Fixes #531.
The seed obfuscator uses a type declaration in order to declare a function,
which returns a function with the same type.
This breaks when obfuscating literals inside generic functions, because
type declarations inside generic functions are not currently supported.
Therefore the obfuscator gets disabled until
https://github.com/golang/go/issues/47631 is fixed.
Trying to make Go master work, I noticed that crypto/tls still failed to
build. The reason was generic structs; we would badly obfuscate their
field names when the types are instantiated:
> garble build
[stderr]
# test/main
Z4ZpcbMj.go:4: unknown field 'FOpszkrN' in struct literal of type SYdpWfK5[string]
Z4ZpcbMj.go:5: m8hLTotb.FypXrbTd undefined (type SYdpWfK5[string] has no field or method FypXrbTd)
exit status 2
See the added comment for what happened and how we fixed it. And add tests.
The proposal at https://go.dev/issue/50603 has been approved,
so Go will at some point start producing module pseudo-versions
even if the main module was built from a VCS clone.
To not wait until a future release like Go 1.20,
implement that ourselves with the help of module.PseudoVersion.
The result is a friendlier version output; what used to be
$ go install && garble version
mvdan.cc/garble (devel)
Build settings:
[...]
will now look like
$ go install && garble version
mvdan.cc/garble v0.0.0-20220505210747-22e3d30216be
Build settings:
[...]
Note that we don't use VCS tags in any way, so the prefix is hard-coded
as v0.0.0. That seems fine for development builds, and Go doesn't embed
VCS tag information in binaries anyway.
Finally, note that we stop printing the module sum, as it's redundant.
The VCS commit hash, at least in git, should be unique enough.
This ensures that we support obfuscating builds containing the use of
type parameters, the new feature in Go 1.18.
The test is small for now, but we can extend it over time.
There was just one bug that kept the code from obfuscating properly;
that has been fixed in https://go.dev/cl/405194,
and we update x/tools to the latest master version to include it.
Fixes #414.
We don't use sub-matches captured by these groups,
so avoiding that extra work will save some CPU cycles.
It is likely insignificant compared to the rest of a Go build,
but it's a very easy little win.
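For illustration, the difference between the two kinds of group, using made-up patterns rather than the ones changed here:

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// capturing records which alternative matched as a sub-match.
	capturing = regexp.MustCompile(`(foo|bar)\.go`)
	// nonCapturing matches the same strings without recording sub-matches.
	nonCapturing = regexp.MustCompile(`(?:foo|bar)\.go`)
)

func main() {
	fmt.Println(capturing.NumSubexp(), capturing.MatchString("bar.go"))
	fmt.Println(nonCapturing.NumSubexp(), nonCapturing.MatchString("bar.go"))
}
```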
The added test case reproduces the failure if we uncomment the added
"continue" line in processImportCfg:
# test/bar/exporttest [test/bar/exporttest.test]
panic: refusing to list non-dependency package: test/bar/exporttest
goroutine 1 [running]:
mvdan.cc/garble.processImportCfg({0xc000166780?, 0xc0001f4a70?, 0x2?})
/home/mvdan/src/garble/main.go:983 +0x58b
mvdan.cc/garble.transformCompile({0xc000124020?, 0x11?, 0x12?})
/home/mvdan/src/garble/main.go:736 +0x338
It seems like a quirk of cmd/go that it includes a redundant packagefile
line in this particular edge case, but it's generally harmless for "go
build". For "garble build" it's also harmless in principle, but in
practice we had sanity checks that got upset by the unexpected line.
For now, notice the edge case and ignore it.
Fixes #522.