We were still leaking the filenames for assembly files.
In our existing asm.txtar test's output binary,
the string `test/main/garble_main_amd64.s` was present.
This leaked full import paths on one hand,
and the filenames of each assembly file on the other.
We avoid this in Go files by using `/*line` directives,
but those are not supported in assembly files.
Instead, obfuscate the paths in the temporary directory.
Note that we still need a separate temporary directory per package,
because otherwise any included header files might collide.
We must remove the `main` package panic in obfuscatedImportPath,
as we now need to use that function for all packages.
While here, remove the outdated comment about `-trimpath`.
Fixes#605.
A main package can be imported in one edge case we weren't testing:
when the same main package has any tests, which implicitly import main.
Before the fix, we would panic:
> garble test -v ./...
# test/bar/somemaintest.test
panic: main packages should never need to obfuscate their import paths
The last change made it so that hashWithCustomSalt does not always end
up with 8 base64 characters, which is a good change for the sake of
avoiding easy patterns in obfuscated code.
However, the docs weren't updated accordingly, and it wasn't
particularly clear that the byte giving us randomness wasn't part of the
resulting base64-encoded name.
First, refactor the code to only feed as many checksum bytes to the
base64 encoder as necessary, which is 12.
This shrinks b64NameBuffer and saves us some base64 encoding work.
Second, use the first checksum byte that we don't use, the 13th,
as the source of the randomness.
Note how before we used a base64-encoded byte for the randomness,
which isn't great as that byte was only one of 63 characters,
whereas a checksum byte is one of 256.
Third, update the docs so that the code is as clear as possible.
This is particularly important given that we have no tests.
With debug prints in the gogarble.txt test, we can see that the
randomness in hash lengths is working as intended:
# test/main/stdimporter
hashLength = 13
hashLength = 8
hashLength = 12
hashLength = 15
hashLength = 10
hashLength = 15
hashLength = 9
hashLength = 8
hashLength = 15
hashLength = 15
hashLength = 12
hashLength = 10
hashLength = 13
hashLength = 13
hashLength = 8
hashLength = 15
hashLength = 11
Finally, add a regression test that will complain if we end up with
hashed names that reuse the same length too often.
Out of eight hashed names, the test will fail if six end up with the
same length, as that is incredibly unlikely given that each should pick
one of eight lengths with a fair distribution.
One more package that further unblocks obfuscating the runtime.
The issue was the TODO we already had about go:linkname directives with
just one argument, which are used in the syscall package.
While here, factor out the obfuscation of linkname directives into
transformLinkname, as it was starting to get a bit complex.
We now support debug logging as well, while still being able to use
"early returns" for some cases where we bail out.
We also need listPackage to treat all runtime sub-packages like it does
runtime itself, as `runtime/internal/syscall` linknames into `syscall`
without it being a dependency as well.
Finally, add a regression test that, without the fix,
properly spots that the syscall package was not obfuscated:
FAIL: testdata/script/gogarble.txtar:41: unexpected match for ["syscall.RawSyscall6"] in out
Updates #193.
See https://golang.org/issue/28749. The improved asm test would fail:
go parse: $WORK/imported/imported_amd64.s:1:1: expected 'package', found TEXT (and 2 more errors)
because we would incorrectly parse a non-Go file as a Go file.
Add a workaround. The original reporter's reproducer with go-ethereum
works now, as this was the last hiccup.
Fixes#555.
The reverse feature relied on `GoFiles` from `go list`,
but that list may not be enough to typecheck a package:
typecheck error: $WORK/main.go:3:15: undeclared name: longMain
`go help list` shows:
GoFiles []string // .go source files (excluding CgoFiles, TestGoFiles, XTestGoFiles)
CgoFiles []string // .go source files that import "C"
CompiledGoFiles []string // .go files presented to compiler (when using -compiled)
In other words, to mimic the same list of Go files fed to the compiler,
we want CompiledGoFiles.
Note that, since the cgo files show up as generated files,
we currently do not support reversing their filenames.
That is left as a TODO for now.
Updates #555.
Assembly files can include header files within the same Go module,
and those header files can include "defines" which refer to Go names.
Since those Go names are likely being obfuscated,
we need to replace them just like we do in assembly files.
The added mechanism is rather basic; we add two TODOs to improve it.
This should help when building projects like go-ethereum.
Fixes#553.
This way, the child process knows that it's running a toolchain command
via -toolexec without having to guess via filepath.IsAbs.
While here, improve the docs and tests a bit.
Following the best practices from upstream.
In particular, the "txt" extension is somewhat ambiguous.
This may cause some conflicts due to the git diff noise,
but hopefully we won't ever do this again.
I was wrongly assumed that, if `used` has an `Elem` method,
then `origin` must too. But it does not if it's a type parameter.
Add a test case too, which panicked before the fix.
Fixes#577.
While here, start the changelog for the upcoming release,
which will likely be a bugfix release as it's a bit early to drop 1.18.
We also bump staticcheck to get a version that supports 1.19.
I also noticed the "Go version X or newer" messages were slightly weird
and inconsistent. Our policy, per the README, is "Go version X or newer",
so the errors given to the user were unnecessarily confusing.
For example, now that Go 1.19 is out, we shouldn't simply recommend that
they upgrade to 1.18; we should recommend 1.18 or later.
When obfuscating the following piece of code:
func issue_573(s struct{ f int }) {
var _ *int = &s.f
/*x*/
}
the function body would roughly end up printed as:
we would roughly end up with:
var _ *int = &dZ4xYx3N
/*x*/.rbg1IM3V
Note that the /*x*/ comment got moved earlier in the source code.
This happens because the new identifiers are longer, so the printer
thinks that the selector now ends past the comment.
That would be fine - we don't really mind where comments end up,
because these non-directive comments end up being removed anyway.
However, the resulting syntax is wrong, as the period for the selector
must be on the first line rather than the second.
This is a go/printer bug that we should fix upstream,
but until then, we must work around it in Go 1.18.x and 1.19.x.
The fix is somewhat obvious in hindsight. To reduce the chances that
go/printer will trip over comments and produce invalid syntax,
get rid of most comments before we use the printer.
We still keep the removal of comments after printing,
since go/printer consumes some comments in ast.Node Doc fields.
Add the minimized unit test case above, and add the upstream project
that found this bug to check-third-party.
andybalholm/brotli helps cover a compression algorithm and ccgo code
generation from C to Go, and it's also a fairly popular module,
particular with HTTP implementations which want pure-Go brotli.
While here, fix the check-third-party script: it was setting GOFLAGS
a bit too late, so it may run `go get` on the wrong mod file.
Fixes#573.
Every now and then, I get test failures in the goenv test like:
> [!windows] cp $EXEC_PATH $NAME/garble$exe
> [!windows] exec $NAME/garble$exe build
[fork/exec with"double"quotes/garble: text file busy]
FAIL: testdata/scripts/goenv.txt:21: unexpected command failure
The root cause is https://go.dev/issue/22315, which isn't going to be
fixed anytime soon, as it is a race condition in Linux itself, triggered
by how heavily concurrent Go tends to be.
For now, try to make the race much less likely to happen.
That is, since Go 1.18.1, released back in April 2022.
We no longer need to worry about the buggy Go 1.18.0.
While here, use a clearer env var name; the settings are build settings.
We obfuscate import paths in import declarations like:
"domain.com/somepkg"
by replacing them with the obfuscated package path:
somepkg "HPS4Mskq"
Note how we add a name to the import if there wasn't one,
so that references like somepkg.Foo keep working in the code.
This could break in some edge cases involving comments between imports
in the Go code, because go/printer is somewhat brittle with positions:
> garble build -tags buildtag
[stderr]
# test/main/importedpkg
:16: syntax error: missing import path
exit status 2
exit status 2
To prevent that, ensure the name has a reasonable position.
This was preventing github.com/gorilla/websocket from being obufscated.
It is a fairly popular library in Go, but we don't add it to
scripts/check-third-party.sh for now as wireguard already gives us
coverage over networking and cryptography.
It's not a problem to leak filenames like _cgo_gotypes.go,
but it is a problem when it includes the import path:
$ strings main | grep _cgo_gotypes
test/main/_cgo_gotypes.go
Here, "test/main" is the module path, which we want to hide.
We hadn't caught this before because the cgo.txt test did not check that
module paths aren't being leaked - it does now.
The fix is rather simple; we let printFile handle cgo-generated files.
We used to avoid that due to compiler errors, as the compiler only
allows some special cgo comment directives to work in cgo-generated
code, to prevent misuse in user code.
The fix is rather easy: the obfuscated filenames should begin with
"_cgo_" to appease the compiler's check.
Back in February 2021, we changed the obfuscation logic so that the
entire `garble build` process would use one shared temporary directory
across all package builds, reducing the amount of files we created in
the top-level system temporary directory.
However, we made one mistake: we didn't swap os.Remove for os.RemoveAll.
Ever since then, we've been leaving temporary files behind.
Add regression tests, which failed before the fix, and fix the bug.
Note that we need to test `garble reverse` as well, as it calls
toolexecCmd separately, so it needs its own cleanup as well.
The cleanup happens via the env var, which doesn't feel worse than
having toolexecCmd return an extra string or cleanup func.
While here, also test that we support TMPDIRs with special characters.
Reuse a buffer and a map across loop iterations, because we can.
Make recordTypeDone only track named types, as that is enough to detect
type cycles. Without named types, there can be no cycles.
These two reduce allocs by a fraction of a percent:
name old time/op new time/op delta
Build-16 10.4s ± 2% 10.4s ± 1% ~ (p=0.739 n=10+10)
name old bin-B new bin-B delta
Build-16 5.51M ± 0% 5.51M ± 0% ~ (all equal)
name old cached-time/op new cached-time/op delta
Build-16 391ms ± 9% 407ms ± 7% ~ (p=0.095 n=10+9)
name old mallocs/op new mallocs/op delta
Build-16 34.5M ± 0% 34.4M ± 0% -0.12% (p=0.000 n=10+10)
name old sys-time/op new sys-time/op delta
Build-16 5.87s ± 5% 5.82s ± 5% ~ (p=0.182 n=10+9)
It doesn't seem like much, but remember that these stats are for the
entire set of processes, where garble only accounts for about 10% of the
total wall time when compared to the compiler or linker. So a ~0.1%
decrease globally is still significant.
linkerVariableStrings is also indexed by *types.Var rather than types.Object,
since -ldflags=-X only supports setting the string value of variables.
This shouldn't make a significant difference in terms of allocs,
but at least the map is less prone to confusion with other object types.
To ensure the new code doesn't trip up on non-variables, we add test cases.
Finally, for the sake of clarity, index into the types.Info maps like
Defs and Uses rather than calling ObjectOf if we know whether the
identifier we have is a definition of a name or the use of a defined name.
This isn't better in terms of performance, as ObjectOf is a tiny method,
but just like with linkerVariableStrings before, the new code is clearer.
The _gomod_.go file inserted by the Go toolchain no longer shows up;
it's likely that either the -trimpath or -buildvcs=false flags are
preventing that extra bit of work from happening entirely.
The modinfo.txt test ensures that we're not breaking,
and the inner lines of code weren't hit as part of `go test`.
It also appears that we don't need to avoid obfuscating functions
defined with an `//export` directive. This is likely because cgo runs as
a pre-process step compared to the compiler, so us removing the
directive later does not make a difference.
We might need to revisit this in the future if we implement obfuscating
Go code instead of builds, e.g. `garble export`.
Just in case, I've expanded the cgo.txt test to also include one more
kind of cgo integration: an "import C" block including a C header file.
Either of these changes are slightly risky, as our tests don't cover all
edge cases. We've just done a release, so now is the time to try them.
If we don't quote it, paths containing spaces or quote characters will
fail. For instance, the added test without the fix fails:
> env NAME='with spaces'
> mkdir $NAME
> cp $EXEC_PATH $NAME/garble$exe
> exec $NAME/garble$exe build main.go
[stderr]
go tool compile: fork/exec $WORK/with: no such file or directory
exit status 1
Luckily, the fix is easy: we bundle Go's cmd/internal/quoted package,
which implements a QuotedJoin API for this very purpose.
Fixes#544.
When splitFlagsFromFiles saw "-p foo/bar.go",
it thought that was the first Go file, when in fact it's not.
We didn't notice because such import paths are pretty rare,
but they do exist, like github.com/nats-io/nats.go.
Before the fix, the added test case fails as expected:
> garble build -tags buildtag
[stderr]
# test/main/goextension.go
open test/main/goextension.go: no such file or directory
We could go through the trouble of teaching splitFlagsFromFiles about
all of the flags understood by the compiler and linker, but that feels
like far more code than the small alternative we went with.
And I'm pretty sure the alternative will work pretty reliably for now.
Fixes#539.
It added packages which are only built with the boringcrypto build tag,
so trying to `go list` them will fail even though it doesn't matter.
While here, a few more minor cleanups:
1) Hide GarbleActionID and ToObfuscate from encoding/json, so that they
can't possibly collide with the fields consumed from `go list -json`.
2) Add test cases for `garble build` with packages that fail to load.
Note that this requires GOGARBLE=* to avoid its "does not match any
package to be built" error.
3) Remove the last use of interface{}, in a testdata file.
Fixes#531.
The seed obfuscator uses a type declaration in order to declare a function,
which returns a function with the same type.
This breaks when obfuscating literals inside generic functions, because
type declarations inside generic functions are not currently supported.
Therefore the obfuscator gets disabled until
https://github.com/golang/go/issues/47631 is fixed.
Trying to make Go master work, I noticed that crypto/tls still failed to
build. The reason was generic structs; we would badly obfuscate their
field names when the types are instantiated:
> garble build
[stderr]
# test/main
Z4ZpcbMj.go:4: unknown field 'FOpszkrN' in struct literal of type SYdpWfK5[string]
Z4ZpcbMj.go:5: m8hLTotb.FypXrbTd undefined (type SYdpWfK5[string] has no field or method FypXrbTd)
exit status 2
See the added comment for what happened and how we fixed it. And add tests.
The proposal at https://go.dev/issue/50603 has been approved,
so Go will at some point start producing module pseudo-versions
even if the main module was built from a VCS clone.
To not wait until a future release like Go 1.20,
implement that ourselves with the help of module.PseudoVersion.
The result is a friendlier version output; what used to be
$ go install && garble version
mvdan.cc/garble (devel)
Build settings:
[...]
will now look like
$ go install && garble version
mvdan.cc/garble v0.0.0-20220505210747-22e3d30216be
Build settings:
[...]
Note that we don't use VCS tags in any way, so the prefix is hard-coded
as v0.0.0. That seems fine for development builds, and Go doesn't embed
VCS tag information in binaries anyway.
Finally, note that we start printing the module sum, as it's redundant.
The VCS commit hash, at least in git, should be unique enough.
This ensures that we support obfuscating builds containing the use of
type parameters, the new feature in Go 1.18.
The test is small for now, but we can extend it over time.
There was just one bug that kept the code from obfuscating properly;
that has been fixed in https://go.dev/cl/405194,
and we update x/tools to the latest master version to include it.
Fixes#414.
The added test case reproduces the failure if we uncomment the added
"continue" line in processImportCfg:
# test/bar/exporttest [test/bar/exporttest.test]
panic: refusing to list non-dependency package: test/bar/exporttest
goroutine 1 [running]:
mvdan.cc/garble.processImportCfg({0xc000166780?, 0xc0001f4a70?, 0x2?})
/home/mvdan/src/garble/main.go:983 +0x58b
mvdan.cc/garble.transformCompile({0xc000124020?, 0x11?, 0x12?})
/home/mvdan/src/garble/main.go:736 +0x338
It seems like a quirk of cmd/go that it includes a redundant packagefile
line in this particular edge case, but it's generally harmless for "go
build". For "garble build" it's also harmless in principle, but in
practice we had sanity checks that got upset by the unexpected line.
For now, notice the edge case and ignore it.
Fixes#522.
Returning a json or gob error directly to the user is generally not
helpful, as it lacks any form of context.
For example, from the json unmarshal of "go env -json":
$ garble build
invalid character 'w' looking for beginning of value
Also improve the error when the user ran garble in the wrong way,
resulting in no shared gob file. The context is now shorter, and we also
include the os.Open error in case it contains any useful details.
While here, apply Go tip's gofmt, which reformatted a godoc list.
For #523.
When a function definition is variadic,
the number of parameters may not match the number of calling arguments.
Our existing code was a bit naive with this edge case,
leading to a panic when indexing call.Args.
Keep track of that information, and properly handle all variadic
arguments that may be used indirectly with reflection.
We add a test case that used to panic, where 0 arguments are used for a
variadic parameter, as well as a case where we need to disable
obfuscation for a Go type used as the second variadic argument rather
than the first.
Finally, the old code in findReflectFunctions looked at Go function
types from the point of view of go/ast.FuncType.
Use go/types.Signature instead, where we don't need to deal with the
grouping of variables in the original syntax, and which is more
consistent with the rest of the garble codebase.
Fixes#524.
When someone builds garble from a git clone,
the resulting binary used to not contain any information:
$ garble version
(devel)
Since Go 1.18, VCS information is stamped by default into binaries.
We now print it, alongside any other available build settings:
$ garble version
mvdan.cc/garble (devel)
Build settings:
-compiler gc
CGO_ENABLED 1
GOARCH amd64
GOOS linux
GOAMD64 v3
vcs git
vcs.revision 91ea246349
vcs.time 2022-03-18T13:45:11Z
vcs.modified true
Note that it's still possible for a garble build to contain no useful
version information, such as when built via "go build -buildvcs=false".
However, if a user opts into omitting the information, it's on them to
figure out what version of garble they actually built.
While here, bump test-gotip.
Fixes#491.
strings.Cut makes some string handling code more intuitive.
Note that we can't use it everywhere, as some places need LastIndexByte.
Start using x/exp/slices, too, which is our first use of generics.
Note that its API is experimental and may still change,
but since we are not a library, we can control its version updates.
I also noticed that we were using TrimSpace for importcfg files.
It's actually unnecessary if we swap strings.SplitAfter for Split,
as the only whitespace present was the trailing newline.
While here, I noticed an unused copy of printfWithoutPackage.
Now that we've released v0.6.0, that will be the last feature release to
feature support for Go 1.17. The upcoming v0.7.0 will be Go 1.18+.
Code-wise, the cleanup here isn't super noticeable,
but it will be easier to work on features like VCS-aware version
information and generics support without worrying about Go 1.17.
Plus, now CI is back to being much faster.
Note how "go 1.18" in go.mod makes "go mod tidy" more aggressive.
For instance, Go 1.18 added support for generics, so its compiler output
files changed format to accomodate for the new language feature.
If garble is built with Go 1.17 and then used to perform builds on Go
1.18, it will fail in a very confusing way, because garble's go/types
and go/importer packages will not know how to deal with that.
As already discussed in #269, require the version that built the garble
binary to be equal or newer. In that thread we discussed only comparing
the major version, so for example garble built on go1.18 could be used
on the toolchain go1.18.5. However, that could still fail in confusing
ways if a fix to go/types or go/importer happened in a point release.
While here, I noticed that we were still using Go 1.17 for some CI
checks. Fix that, except for staticcheck.
Fixes#269.
The added comment in main.go explains the situation in detail.
The added test is a minimization of the scenario, which failed:
> cd mod1
> garble -seed=${SEED1} build -v gopkg.in/garbletest.v2
> cd ../mod2
> garble -seed=${SEED1} build -v
[stderr]
test/main/mod2
# test/main/mod2
cannot load garble export file for gopkg.in/garbletest.v2: open […]/go-build/ed/[…]-garble-ZV[…]-d: no such file or directory
To work around the problem, we'll always add each package's
GarbleActionID to its build artifact, even when not using -seed.
This will get us the previous behavior with regards to the build cache,
meaning that we fix the recent regression.
The added variable doesn't make it to the final binary either.
While here, improve the cached file loading error's context,
and add an extra sanity check for duplicates on ListedPackages.
The default behavior of garble is to seed via the build inputs,
including the build IDs of the entire Go build of each package.
This works well as a default, and does give us determinism,
but it means that building for different platforms
will result in different obfuscation per platform.
Instead, when -seed is provided, don't use any other hash seed or salt.
This means that a particular Go name will be obfuscated the same way
as long as the seed, package path, and name itself remain constant.
In other words, when the user supplies a custom -seed,
we assume they know what they're doing in terms of storage and rotation.
Expand the README docs with more examples and detail.
Fixes#449.
Noticed while debugging #492,
as adding -debug to a previously run command would unexpectedly
rebuild all packages in the build, as if -a was given.
While here, remove commented out testscript that were kept in error.
In particular, using -ldflags with -
In particular, a command like:
garble -literals build -ldflags='-X "main.foo=foo bar"'
would fail, because we would try to use "\"main" as the package name for
the -X qualified name, with the leading quote character.
This is because we used strings.Split(ldflags, " ").
Instead, use the same quoted.Split that cmd/go uses,
copied over thanks to x/tools/cmd/bundle and go:generate.
Updates #492.
Back in the day, we used to call toObfuscate anytime we needed to know
whether a package should be obfuscated.
More recently, we started computing via the ToObfuscate field,
which then gets shared with all sub-processes via sharedCache.
We still had two places that directly called toObfuscate.
Replace those with ToObfuscate, and inline toObfuscate into shared.go.
obfuscatedImportPath is also a potential footgun for main packages.
Some use cases always want the original "main" package name,
such as for use in the compiler's "-p main" flag,
while other cases want the obfuscated package import path,
such as the entries in importcfg files.
Since each of these call sites handles the edge case well,
obfuscatedImportPath now panics on main packages to avoid any misuse.
Finally, test that we never leak main package paths via ldflags.txt.
We never did, but it's good to make sure.
Overall, this avoids confusion and trims the size of main.go a bit.
There are two scenarios when it comes to embedding fields.
The first is easy, and we always handled it well:
type Named struct { Foo int }
type T struct { Named }
In this scenario, T ends up with an embedded field named "Named",
and a promoted field named "Foo".
Then there's the form with a type alias:
type Named struct { Foo int }
type Alias = Named
type T struct { Alias }
This case is different: T ends up with an embedded field named "Alias",
and a promoted field named "Foo".
Note how the field gets its name from the referenced type,
even if said type is just an alias to another type.
This poses two problems.
First, we must obfuscate the field T.Alias as the name "Alias",
and not as the name "Named" that the alias points to.
Second, we must be careful of cases where Named and Alias are declared
in different packages, as they will obfuscate the same name differently.
Both of those problems compounded in the reported issue.
The actual reason is that quic-go has a type alias in the form of:
type ConnectionState = qtls.ConnectionState
In other words, the entire problem boils down to a type alias which
points to a named type in a different package, where both types share
the same name. For example:
package parent
import "parent/p1"
type T struct { p1.SameName }
[...]
package p1
import "parent/p2"
type SameName = p2.SameName
[...]
package p2
type SameName struct { Foo int }
This broke garble because we had a heuristic to detect when an embedded
field was a type alias:
// Instead, detect such a "foreign alias embed".
// If we embed a final named type,
// but the field name does not match its name,
// then it must have been done via an alias.
// We dig out the alias's TypeName via locateForeignAlias.
if named.Obj().Name() != node.Name {
As the reader can deduce, this heuristic would incorrectly assume that
the snippet above does not embed a type alias, when in fact it does.
When obfuscating the field T.SameName, which uses a type alias,
we would correctly obfuscate the name "SameName",
but we would incorrectly obfuscate it with the package p2, not p1.
This would then result in build errors.
To fix this problem for good, we need to get rid of the heuristic.
Instead, we now mimic what was done for KnownCannotObfuscate,
but for embedded fields which use type aliases.
KnownEmbeddedAliasFields is now filled for each package
and stored in the cache as part of cachedOutput.
We can then detect the "embedded alias" case reliably,
even when the field is declared in an imported package.
On the plus side, we get to remove locateForeignAlias.
We also add a couple of TODOs to record further improvements.
Finally, add a test.
Fixes#466.
If package P1 imports package P2, P1 needs to know which names from P2
weren't obfuscated. For instance, if P2 declares T2 and does
"reflect.TypeOf(T2{...})", then P2 won't obfuscate the name T2, and
neither should P1.
This information should flow from P2 to P1, as P2 builds before
P1. We do this via obfuscatedTypesPackage; P1 loads the type information
of the obfuscated version of P2, and does a lookup for T2. If T2 exists,
then it wasn't obfuscated.
This mechanism has served us well, but it has downsides:
1) It wastes CPU; we load the type information for the entire package.
2) It's complex; for instance, we need KnownObjectFiles as an extra.
3) It makes our code harder to understand, as we load both the original
and obfuscated type informaiton.
Instead, we now have each package record what names were not obfuscated
as part of its cachedOuput file. Much like KnownObjectFiles, the map
records incrementally through the import graph, to avoid having to load
cachedOutput files for indirect dependencies.
We shouldn't need to worry about those maps getting large;
we only skip obfuscating declared names in a few uncommon scenarios,
such as the use of reflection or cgo's "//export".
Since go/types is relatively allocation-heavy, and the export files
contain a lot of data, we get a nice speed-up:
name old time/op new time/op delta
Build-16 11.5s ± 2% 11.1s ± 3% -3.77% (p=0.008 n=5+5)
name old bin-B new bin-B delta
Build-16 5.15M ± 0% 5.15M ± 0% ~ (all equal)
name old cached-time/op new cached-time/op delta
Build-16 375ms ± 3% 341ms ± 6% -8.96% (p=0.008 n=5+5)
name old sys-time/op new sys-time/op delta
Build-16 283ms ±17% 289ms ±13% ~ (p=0.841 n=5+5)
name old user-time/op new user-time/op delta
Build-16 687ms ± 6% 664ms ± 7% ~ (p=0.548 n=5+5)
Fixes#456.
Updates #475.
First, join the two benchmarks into one.
The previous "cached" benchmark was borderline pointless,
as it built the same package with the existing output binary,
so it would quickly realise it had nothing to do and take ~100ms.
The previous "noncached" benchmark input had no dependencies,
so it was only really benchmarking the non-obfuscation of the runtime.
All in all, neither benchmark measured obfuscating multiple packages.
The new benchmark reuses the "cached" input, but with GOCACHE="*",
meaning that we now obfuscate dozens of standard library packages.
Each iteration first does a built from scratch, the worst case scenario,
and then does an incremental rebuild of just the main package,
which is the closest to a best case scenario without being a no-op.
Since each iteration now performs both kinds of builds,
we include a new "cached-time" metric to report what portion of the
"time" metric corresponds to the incremental build.
Thus, we can see a clean build takes ~11s, and a cached takes ~0.3s:
name time/op
Build-16 11.6s ± 1%
name bin-B
Build-16 5.34M ± 0%
name cached-time/op
Build-16 326ms ± 5%
name sys-time/op
Build-16 184ms ±13%
name user-time/op
Build-16 611ms ± 5%
The benchmark is also no logner parallel; see the docs.
Note that the old benchmark also reported bin-B incorrectly,
as it looked at the binary size of garble itself, not the input program.
The -X linker flag sets a string variable to a given value,
which is often used to inject strings such as versions.
The way garble's literal obfuscation works,
we replace string literals with anonymous functions which,
when evaluated, result in the original string.
Both of these features work fine separately,
but when intersecting, they break. For example, given:
var myVar = "original"
[...]
-ldflags=-X=main.myVar=replaced
The -X flag effectively replaces the initial value,
and -literals adds code to be run at init time:
var myVar = "replaced"
func init() { myVar = func() string { ... } }
Since the init func runs later, -literals breaks -X.
To avoid that problem,
don't obfuscate literals whose variables are set via -ldflags=-X.
We also leave TODOs about obfuscating those in the future,
but we're also leaving regression tests to ensure we get it right.
Fixes#323.
We recently made an important change when obfuscating the runtime,
so that if it's missing any linkname packages in ListedPackages,
it does an extra "go list" call to obtain their information.
This works very well, but we missed an edge case.
In main.go, we disable flagLiterals for the runtime package,
but not for other packages like sync/atomic.
And, since the runtime's extra "go list" has to compute GarbleActionIDs,
it uses the list of garble flags via appendFlags.
Unfortunately, it thinks "-literals" isn't set, when it is,
and the other packages see it as being set.
This discrepancy results in link time errors,
as each end of the linkname obfuscates with a different hash:
> garble -literals build
[stderr]
# test/main
jccGkbFG.(*yijmzGHo).String: relocation target jccGkbFG.e_77sflf not defined
jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.ce66Fmzl not defined
jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.e5kPa1qY not defined
jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.aQ_3sL3Q not defined
jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.zls3wmws not defined
jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.g69WgKIS not defined
To fix the problem, treat flagLiterals as read-only after flag.Parse,
just like we already do with the other flags except flagDebugDir.
The code that turned flagLiterals to false is no longer needed,
as literals.Obfuscate is only called when ToObfuscate is true,
and ToObfuscate is false for runtimeAndDeps already.
Pointers are of the form "0x" and a number of hex characters.
Those are all part of the obfuscated filename alphabet,
so if we're unlucky enough, we can get hits:
> cmp stdout reverse.stdout
--- stdout
+++ reverse.stdout
@@ -4,7 +4,7 @@
runtime/debug.Stack(...)
runtime/debug/stack.go:24 +0x??
test/main/lib.printStackTrace(...)
- pv0x??mRa.go:1 +0x??
+ test/main/lib/long_lib.go:32 +0x??
test/main/lib.(*ExportedLibType).ExportedLibMethod(...)
test/main/lib/long_lib.go:19 +0x??
main.unexportedMainFunc.func1(...)
Include the plus sign in the regular expression,
which is used for showing pointers but isn't part of our alphabet.
cmd/go treats "--foo=bar" juts like "-foo=bar",
just like any other program using the flag package.
However, we didn't support this longer form in filterForwardBuildFlags.
Because of it, "garble build -tags=foo" worked,
but "garble build --tags=foo" did not,
as we wouldn't forward "--tags=foo" as a build flag for "go list".
Fixes#429.
Just ran into this failure while fixing a different bug:
> ! grep 'ExportedLib(Type|Field)|unexportedMainFunc|test/main|main\.go|lib\.go' main.stderr
[main.stderr]
lib filename: A4Spnz0u.go
goroutine 1 [running]:
runtime/debug.Stack(...)
runtime/debug/stack.go:24 +0x??
kso0S_A6.at6JKzwa(...)
p8_ovZPW.go:1 +0x??
kso0S_A6.(*JKArRn6w).ExportedLibMethod(...)
h4JncykI.go:1 +0x??
main.cQXA3D6d.func1(...)
FaQ2WcAJ.go:1
main.cQXA3D6d(...)
T7Ztgy1Q.go:1 +0x??
main.main(...)
OHzKYhm3.go:1 +0x??
main filename: Myi8glib.go
FAIL: testdata/scripts/reverse.txt:16: unexpected match for `ExportedLib(Type|Field)|unexportedMainFunc|test/main|main\.go|lib\.go` found in main.stderr: lib.go
Note that "main.go" ended up obfuscated as "Myi8glib.go",
which just so happens to match against "lib.go".
Use longer filenames, so that the chances of collisions are near-zero.
A recent change added the -debugdir value to addGarbleToHash,
which is part of the hash seed for all obfuscation taking place.
In principle, it was okay to add, just like any other garble flag.
In practice, it broke the added test case:
> garble -debugdir ./debug1 build
[stderr]
# test/main
FBi9xa6e.(*ac0bCOhR).String: relocation target FBi9xa6e.rV55e6H9 not defined
qmECK6zf.init.0: relocation target qmECK6zf.eUU08z98 not defined
[...]
This is because -debugdir gets turned into an absolute path,
and not all garble processes ended up using it consistently.
The fix is rather simple; since -debugdir doesn't affect obfuscation,
don't include it in the build hash seeding at all.
Fixes#451.
We were listing all of std, which certainly worked,
but was quite slow at over 200 packages.
In practice, we can only be missing up to 20-30 packages.
It was a good change as it fixed a severe bug,
but it also introduced a fairly noticeable slow-down.
The numbers are clear; this change shaves off multiple seconds when
obfuscating the runtime with a cold cache:
name old time/op new time/op delta
Build/NoCache-16 5.06s ± 1% 1.94s ± 1% -61.64% (p=0.008 n=5+5)
name old bin-B new bin-B delta
Build/NoCache-16 6.70M ± 0% 6.71M ± 0% +0.05% (p=0.008 n=5+5)
name old sys-time/op new sys-time/op delta
Build/NoCache-16 13.4s ± 2% 5.0s ± 2% -62.45% (p=0.008 n=5+5)
name old user-time/op new user-time/op delta
Build/NoCache-16 60.6s ± 1% 19.8s ± 1% -67.34% (p=0.008 n=5+5)
Since we only want to call "go list" one extra time,
instead of once for every package we find out we're missing,
we want to know what packages we could be missing in advance.
Resurrect a smarter version of the runtime-related script.
Finally, remove the runtime-related.txt test script,
as it has now been superseeded by the sanity checks in listPackage.
That is, obfuscating the runtime package will now panic if we are
missing any necessary package information.
To double check that we get the runtime's linkname edge case right,
make gogarble.txt use runtime/debug.WriteHeapDump,
which is implemented via a direct runtime linkname.
This ensures we don't lose test coverage from runtime-related.txt.
Note that this cross-compilation disables cgo by default,
and so the cgo.txt test script isn't run on GOARCH=386.
That seems fine for now, as the test isn't arch-specific.
This testing uncovered one build failure in internal/literals;
the comparison between int and math.MaxUint32 is invalid on 32-bit.
To fix that build failure, use int64 consistently.
One test also incorrectly assumed amd64; it now supports 386 too.
For any other architecture, it's being skipped for now.
I also had to increase the -race test timeout,
as it usually takes 8-9m on GitHub Actions,
and the timeout would sometimes trigger.
Finally, use "go env" rather than "go version" on CI,
which gives us much more useful information,
and also includes Go's own version now via GOVERSION.
Fixes#426.
We've had confused users a handful of times by now.
And it's reasonable to expect flags to be after the command,
as that's how flags work for cmd/go itself.
I don't think we want to mix our flags with Go's,
or start accepting flags in either place.
Both seem like worse solutions long-term, as they can add confusion.
However, we can quickly give a useful hint when a flag is misplaced.
That should get new users unblocked without asking for help.
We use a regular expression for this purpose,
because it doesn't seem like a FlagSet supports what we need;
to detect whether an argument is one of our flags,
without actually applying its value to the flagset.
Our flagset would also error on Go's flags, which we don't want.
We went to great lengths to ensure garble builds are reproducible.
This includes how the tool itself works,
as its behavior should be the same given the same inputs.
However, we made one crucial mistake with the runtime package.
It has go:linkname directives pointing at other packages,
and some of those pointed packages aren't its dependencies.
Imagine two scenarios where garble builds the runtime package:
1) We run "garble build runtime". The way we handle linkname directives
calls listPackage on the target package, to obfuscate the target's
import path and object name. However, since we only obtained build
info of runtime and its deps, calls for some linknames such as
listPackage("sync/atomic") will fail. The linkname directive will
leave its target untouched.
2) We run "garble build std". Unlike the first scenario, all listPackage
calls issued by runtime's linkname directives will succeed, so its
linkname directive targets will be obfuscated.
At best, this can result in inconsistent builds, depending on how the
runtime package was built. At worst, the mismatching object names can
result in errors at link time, if the target packages are actually used.
The modified test reproduces the worst case scenario reliably,
when the fix is reverted:
> env GOCACHE=${WORK}/gocache-empty
> garble build -a runtime
> garble build -o=out_rebuild ./stdimporter
[stderr]
# test/main/stdimporter
JZzQivnl.NtQJu0H3: relocation target JZzQivnl.iioHinYT not defined
JZzQivnl.NtQJu0H3.func9: relocation target JZzQivnl.yz5z0NaH not defined
JZzQivnl.(*ypvqhKiQ).String: relocation target JZzQivnl.eVciBQeI not defined
JZzQivnl.(*ypvqhKiQ).PkgPath: relocation target JZzQivnl.eVciBQeI not defined
[...]
The fix consists of two steps. First, if we're building the runtime and
listPackage fails on a package, that means we ran into scenario 1 above.
To avoid the inconsistency, we fill ListedPackages with "go list [...] std".
This means we'll always build runtime as described in scenario 2 above.
Second, when building packages other than the runtime,
we only allow listPackage to succeed if we're listing a dependency of
the current package.
This ensures we won't run into similar reproducibility bugs in the future.
Finally, re-enable test-gotip on CI since this was the last test flake.
Add a regression test in gogarble.txt,
as that test is already set up with packages to not obfuscate.
This bug manifested in the form of a build failure for GOOS=plan9
with -literals turned on:
[...]/os/file_plan9.go:151:12: invalid operation: cannot call non-function append (variable of type bool)
In this case, the "os" package is not to be obfuscated,
but we would still obfuscate its literals as per the bug.
But, since the package's identifiers were not obfuscated,
names like "append" were not replaced as per ea2e0bdf71,
meaning that the shadowing would still affect us.
Fixes#417.
Piggybacking off of GOPRIVATE is great for a number of reasons:
* People tend to obfuscate private code, whose package paths will
generally be in GOPRIVATE already
* Its meaning and syntax are well understood
* It allows all the flexibility we need without adding our own env var
or config option
However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:
* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
Downloading modules, such as when adding or updating dependencies,
or when the local cache is cold, can be less reliable.
* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
Adding entries to go.sum, such as when adding or updating dependencies,
can be less secure.
We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.
Fixes#276.
For packages that we alter, we parse and modify the importcfg file.
Parsing is necessary so we can locate obfuscated object files,
which we use to remember what identifiers were obfuscated.
Modifying the files is necessary when we obfuscate import paths,
and those import paths have entries in an importcfg file.
However, we made one crucial mistake when writing the code.
When handling importmap entries such as:
importmap golang.org/x/net/idna=vendor/golang.org/x/net/idna
we would name the two sides beforePath and afterPath, respectively.
They were added to importMap with afterPath as the key,
but when we iterated over the map to write a modified importcfg file,
we would then assume the key is beforepath.
All in all, we would end up writing the opposite direction:
importmap vendor/golang.org/x/net/idna=golang.org/x/net/idna
This would ultimately result in the importmap never being useful,
and some rather confusing error messages such as:
cannot find package golang.org/x/net/idna (using -importcfg)
Add a test case that reproduces this error,
and fix the code so it always uses beforePath as the key.
Note that we were also updating importCfgEntries with such entries.
I could not reproduce any failure when just removing that code,
nor could I explain why it was added there in the first place.
As such, remove that bit of code as well.
Finally, a reasonable question might be why we never noticed the bug.
In practice, such "importmap"s, represented as ImportMap by "go list",
only currently appear for packages vendored into the standard library.
Until very recently, we didn't support obfuscating most of std,
so we would usually not alter the affected importcfg files.
Now that we do parse and modify them, the bug surfaced.
Fixes#408.
In the added test case, "garble -literals build" would fail:
--- FAIL: TestScripts/literals (8.29s)
testscript.go:397:
> env GOPRIVATE=test/main
> garble -literals build
[stderr]
# test/main
Usz1FmFm.go:1: cannot call non-function string (type int), declared at Usz1FmFm.go:1
Usz1FmFm.go:1: string is not a type
Usz1FmFm.go:1: cannot call non-function append (type int), declared at Usz1FmFm.go:1
That is, for input code such as:
var append int
println("foo")
_ = append
We'd end up with obfuscated code like:
var append int
println(func() string {
// obfuscation...
x = append(x, ...)
// obfuscation...
return string(x)
})
_ = append
Which would then break, as the code is shadowing the "append" builtin.
To work around this, always obfuscate variable names, so we end up with:
var mwu1xuNz int
println(func() string {
// obfuscation...
x = append(x, ...)
// obfuscation...
return string(x)
})
_ = mwu1xuNz
This change shouldn't make the quality of our obfuscation stronger,
as local variable names do not currently end up in Go binaries.
However, this does make garble more consistent in treating identifiers,
and it completely avoids any issues related to shadowing builtins.
Moreover, this also paves the way for publishing obfuscated source code,
such as #369.
Fixes#417.
That is, use a very specific build tag and git commit,
and ensure that neither ends up in the binary.
Luckily, we have nothing to do here.
We were already removing _gomod_.go from the build entirely,
and that is still the mechanism that "go build" uses to bundle the data.
Note that the test will still work if git is not installed,
but it will simply not check the VCS side.
Finally, we use "go version -m" to check the existing fields,
which is easier than calling the Go APIs directly.
It seems like "go test" passes on yesterday's Go master, now.
So, enable test-gotip again with that commit hash.
Fixes#385.
Obfuscating newName arguments of linkname directives
with dots in the importpath didn't work before.
We had a test which covers this,
but the corresponding package wasn't actually obfuscated.
Also update linkname directives of public packages,
to allow the package where something is linknamed to to be
obfuscated regardless.
Public packages can now depend on private packages.
We can now use pruned module graphs in go.mod files,
and we no longer need to worry about runtime/internal/sys.
Note that I had to update testdata/mod slightly,
as the new pruned module graphs algorithm downloads an extra go.mod file.
This change also paves the way towards future Go 1.18 support.
Thanks to lu4p for cleaning up two TODOs as well.
Co-Authored-By: lu4p <lu4p@pm.me>
The "reverse" command had many levels of optional arguments:
garble [garble flags] reverse [build flags] [package] [files]
This was pretty confusing,
and could easily lead to people running the command incorrectly:
# note that output.txt isn't a Go package!
garble reverse output.txt
Moreover, it made the handling of Go build flags pretty confusing.
Should the command below work?
garble reverse -tags=mytag
It also made it easy to not notice that one must supply the main package
to properly reverse some text that it produced, like a panic message.
With the package path being implicit,
one could mistakenly provide the wrong package by running garble
in a directory containing a different package.
See #394.
Functions which use reflection on one of their parameters are,
now added to knownReflectAPIs automatically.
This makes most explicit hints for reflection redundant.
Simple protobuf code now works correctly when obfuscated.
Fixes#162Fixes#373
Before, we would just notice direct calls to reflect's TypeOf and
ValueOf. Any other uses of reflection, such as encoding/json or
google.golang.org/protobuf, would require hints as documented by the
README.
Issue #162 outlines some ways we could fix this issue in a general way,
automatically detecting what functions use reflection on their parameters,
even for third party API funcs.
However, that goal is pretty significant in terms of code and effort.
As a temporary improvement, we can expand the list of "known" reflection
APIs via a static table.
Since this table is keyed by "func full name" strings, we could
potentially include third party APIs, such as:
google.golang.org/protobuf/proto.Marshal
However, for now simply include all the std APIs we know about.
If we fail to do the proper fix for automatic detection in the future,
we can then fall back to expanding this global table for third parties.
Update the README's docs, to clarify that the hint is not always
necessary anymore.
Also update the reflect.txt test to stop using the hint for encoding/json,
and to also start testing text/template with a method call.
While at it, I noticed that we weren't testing the println outputs,
as they'd go to stderr - fix that too.
Updates #162.
Otherwise, the added test case would fail, as we don't modify the C code
and so there would be a name mismatch.
In the far future we might start modifying Go names in C code,
similar to what we did for Go assembly,
but right now that seems out of scope and too complex.
An easier fix is to simply record those (hopefully few) names in ignoreObjects.
While at it, recordReflectArgs started to really outgrow its name, as it
also collected expressions used as constants for literal obfuscation.
Give it a better name.
Fixes#366.
With the -literals flag, we try to convert some const declarations to
vars, as long as that doesn't break typechecking. We really only do that
for typed constant strings, really.
There was a quirk: if a numerical constant had a type and used iota, we
would not obfuscate its value, but we would still convert the
declaration from const to var. Since iotas only work within const
declarations, that would break compilation:
> garble -literals build
[stderr]
# test/main
FeWE3zwi.go:19: undefined: iota
exit status 2
To fix the problem, make the logic more conservative: only obfuscate
constant declarations where the values are typed strings, meaning that
any numerical constants are left entirely untouched.
This fixes the build of google.golang.org/protobuf/runtime/protoiface
with -literals turned on.
Back in early April we added initial support for Go 1.17,
working on a commit from master at that time. For that to work, we just
needed to add a couple of packages to runtimeRelated and tweak printFile
a bit to not break the new "//go:build" directives.
A significant amount of changes have landed since, though, and the tests
broke in multiple ways.
Most notably, the new register ABI is enabled by default for GOOS=amd64.
That affected garble indirectly in two ways: there's a new internal
package to add to runtimeRelated, and we must make reverse.txt more
clever in making its output constant across ABIs.
Another noticeable change is that Go 1.17 changes how its own version is
injected into the runtime package. It used to be via a constant in
runtime/internal/sys, such as:
const TheVersion = `devel ...`
Since we couldn't override such constants via the linker's -X flag,
we had to directly alter the declaration while compiling.
Thankfully, Go 1.17 simply uses a "var buildVersion string" in the
runtime package, and its value is injected by the linker.
This means we can now override it with the linker's -X flag.
We make the code to alter TheVersion for Go 1.16 a bit more clever,
to not break the package when building with Go 1.17.
Finally, our hack to work around ambiguous TOOLEXEC_IMPORTPATH values
now only kicks in for non-test packages, since Go 1.17 includes our
upstream fix. Otherwise, some tests would end up with the ".test"
variant suffix added a second time:
test/bar [test/bar.test] [test/bar [test/bar.test].test]
All the code to keep compatibility with Go 1.16.x remains in place.
We're still leaving TODOs to remind ourselves to remove it or simplify
it once we remove support for 1.16.x.
The 1.17 development freeze has already been in place for a month,
and beta1 is due to come this week, so it's unlikely that Go will change
in any considerable way at this point. Hence, we can say that support
for 1.17 is done.
Fixes#347.
Our recent work in fieldToAlias worked well when the embedded field
declaration (using an alias) was in the same package as the use of that
field. We would have the *ast.Ident for the field declaration, so
types.Info.Uses would give us the TypeName for the alias.
Unfortunately, if the declaration was in a dependency package, we did
not have that same *ast.Ident, as we weren't parsing the source code for
dependencies for type-checking. This resulted in us incorrectly
obfuscating the use of such an embedded field:
> garble build
[stderr]
# test/main
JtzmzxWf.go:4: unknown field 'ExternalForeignAlias' in struct literal of type _BdSNiEL.Vcs_smer
To fix this, look through the direct imports of the package defining the
field to find an alias under the exact same name. Not a foolproof
solution, as there's a TODO, but it should work for most cases.
Fixes the obfuscation of google.golang.org/grpc/internal/status, too.
Updates #349.
The added test case, which is obfuscating and linking os/user, would fail
before this fix:
> garble build
[stderr]
# test/main
/usr/lib/go/pkg/tool/linux_amd64/link: running gcc failed: exit status 1
/usr/bin/ld: $WORK/.tmp/go-link-073246656/go.o: in function `Chz0Yfs2._cgo_cmalloc':
go.go:(.text+0x993cc): undefined reference to `Chz0Yfs2.runtime_throw'
/usr/bin/ld: $WORK/.tmp/go-link-073246656/go.o: in function `Chz0Yfs2.tDfhQ8uK':
go.go:(.text+0x99801): undefined reference to `Chz0Yfs2._cgo_runtime_gostring'
/usr/bin/ld: go.go:(.text+0x9982a): undefined reference to `Chz0Yfs2._cgo_runtime_gostring'
/usr/bin/ld: go.go:(.text+0x99853): undefined reference to `Chz0Yfs2._cgo_runtime_gostring'
collect2: error: ld returned 1 exit status
The reason is that we would alter the linkname directives of cgo-generated
code, but we would not obfuscate the code itself at all.
The generated code would end up being transformed into:
//go:linkname zh_oKZIy runtime.throw
func runtime_throw(string)
One can clearly see the error there; handleDirectives obfuscated the
local linkname name, but since transformGo didn't run, the actual Go
declaration was not obfuscated in the same way. Thus, the linker fails
to find a function body for runtime_throw, and fails.
The solution is simple: handleDirectives assumes that it's running on
code being obfuscated, so only run it when transformGo is running.
We can also remove the cgo skip check in handleDirectives, as it never
runs on cgo-generated code now.
Fixes a number of build errors that have been noticed since
907aebd770.
When such an alias name was used to define an embedded field, we handled
that case gracefully via the code using:
tf.info.Uses[node].(*types.TypeName)
Unfortunately, when the same field name was used elsewhere, such as a
composite literal, tf.Info.Uses gave us a *types.Var, not a
*types.TypeName, meaning we could no longer tell if this was an alias,
or what it pointed to.
Thus, we failed to obfuscate the name properly in the added test case:
> garble build
[stderr]
# test/main/sub
xxWZf66u.go:36: unknown field 'foreignAlias' in struct literal of type smhWelwn
It doesn't seem like any of the go/types APIs allows us to obtain the
*types.TypeName directly in this scenario. Thus, use a trick that we
used before: after typechecking, but before obfuscating, record all
embedded struct field *types.Var which are aliases via a map, where the
value holds the *types.TypeName for the alias.
Updates #349.
We already added support for "//go:embed" with string and []byte,
by not obfuscating the "embed" import path.
However, embed.FS was still failing:
> garble build
[stderr]
# test/main
:13: go:embed cannot apply to var of type embed.WtKNvwbN
The compiler detects the type by matching its name to exactly "embed.FS",
so don't obfuscate the name "FS" either.
While at it, ensure that the embed code behaves the same with "go build".
Updates #349.
Before, "garble build -h" would print the same as "garble -h", which is
too much and confusing, as it doesn't tell us much about "build".
Now it's far better, and includes the output of "go build -h":
$ garble build -h
usage: garble [garble flags] build [arguments]
This command wraps "go build". Below is its help:
usage: go build [-o output] [build flags] [packages]
Run 'go help build' for details.
We do the same for "garble reverse -h", since it doesn't wrap a Go tool
command.
I added a second one recently without noticing, since the original was
too far below and I had written commands above it.
Deduplicate them, and put the new env line as early as possible, to
prevent further issues and confusion.
Every now and then, a CI run would fail:
FAIL: testdata/scripts/reflect.txt:7: unexpected match for ["main.go"] in main
These were rare, and very hard to reproduce or debug.
My best guess is that, since "main.go" is a short string and we use
random eight-character obfuscated filenames ending with ".go", it was
possible that the random filename happened to end in "main" in some
cases.
Given the base64 encoding, the chances of a single suffix collision are
about 0.000006%. Note, however, that a single obfuscated build will most
likely obfuscate many filenames, especially for the tests obfuscating
multiple packages. For a single CI run with many tests across three OSs,
the chances of any collision are likely very low, but realistic.
All this has a simple fix: use longer filenames to match with. We choose
"garble_main.go" since it's long enough, but also because it's still
clear it's a "main" Go file, and it's very unlikely to cause conflicts
with filenames in upstream Go given the "garble_" prefix.
Before this change, we would try to never obfuscate alias names. That
was far from ideal, as they can end up in field names via anonymous
fields.
Even then, we would sometimes still fail to build, because we would
inconsistently obfuscate alias names. For example, in the added test
case:
--- FAIL: TestScripts/syntax (0.23s)
testscript.go:397:
> env GOPRIVATE='test/main,private.source'
> garble build
[stderr]
# test/main/sub
Lv_a8gRD.go:15: undefined: KCvSpxmQ
To fix this problem, we set obj to be the TypeName corresponding to the
alias when it is used as an embedded field. We can then make the right
choice when obfuscating the name.
Right now, all aliases will be obfuscated. A TODO exists about not
obfuscating alias names when they're used as embedded fields in a struct
type in the same package, and that package is used for reflection -
since then, the alias name ends up as the field name.
With these changes, the protobuf module now builds.
obfuscatedTypesPackage is used to figure out if a name in a dependency
package was obfuscated or not. For example, if that package used
reflection on a named type, it wasn't obfuscated, so we must have the
same information to not obfuscate the same name downstream.
obfuscatedTypesPackage could return nil if the package was indirectly
imported, though. This can happen if a direct import has a function that
returns an indirect type, or if a direct import exposes a name that's a
type alias to an indirect type.
We sort of dealt with this in two pieces of code by checking for
obfPkg!=nil, but a third one did not have this check and caused a panic
in the added test case:
--- FAIL: TestScripts/reflect (0.81s)
testscript.go:397:
> env GOPRIVATE=test/main
> garble build
[stderr]
# test/main
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x8a5e39]
More importantly though, the nil check only avoids panics. It doesn't
fix the root cause of the problem: that importcfg does not contain
indirectly imported packages. The added test case would still fail, as
we would obfuscate a type in the main package, but not in the indirectly
imported package where the type is defined.
To fix this, resurrect a bit of code from earlier garble versions, which
uses "go list -toolexec=garble" to fetch a package's export file. This
lets us fill the indirect import gaps in importcfg, working around the
problem entirely.
This solution is still not particularly great, so we add a TODO about
possibly rethinking this in the future. It does add some overhead and
complexity, though thankfully indirect imports should be uncommon.
This fixes a few panics while building the protobuf module.
Our allPkgs boolean logic was wrong, because it could still lead to
garble obfuscating a type when used in the main package, but not in its
defining package. The added test case shows such a case.
To fix that, use a package path to only record the named objects from
the target package, which is a narrower operation without this problem,
but still makes all our tests pass.
This makes the google.golang.org/protobuf/internal/filetype package
start building.
I've wanted this to more easily debug build failures.
To not force a build failure in a test script, as that would require
some trickery to remain stable, we use runtime.Caller without printing
the line number. Before this patch, those filenames without line numbers
would not be reversed at all.
If a package A imports package B, and uses reflect.TypeOf on an unnamed
struct type B.T (such as an alias), we don't want to record B.T's fields
as "do not obfuscate". This is for the same reason that we don't if B.T
is a named struct type: the detection only works for the package
defining the type, as otherwise it's inconsistent.
We failed to handle this case well, because we assumed all struct types
would be under some named type. This is not the case for type aliases.
Fortunately, struct fields are named, and as such they are objects.
Check their package too, just like we do for named types.
Fixes another build error when obfuscating the protobuf module.
We add a simplified version of the example above as a test case,
which originated from debugging the protobuf build failure.
The detection of reflection usage is tricky and there are plenty of edge
cases to test for. We definitely want one script for it, rather than
splitting those cases between other scripts like imports.txt and
syntax.txt.
Moreover, those two were rather generic and large, so this helps keep a
balance.
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:
/tmp/garble-shared123/encode.go.456.go
where "123" and "456" are random numbers, usually longer.
This was usually fine for two reasons:
1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
directory and its random number would be invisible.
2) We would add "//line" directives to the source files, replacing
the filename with obfuscated versions excluding any random number.
Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.
Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.
To fix the issues above, the same "encoding/json/encode.go" would now
end up as:
/tmp/garble-shared123/encoding/json/encode.go
Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.
This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.
We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.
Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.
For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.
This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.
When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.
Fixes#328.
I've tripped over this GODEBUG env var four or five times already.
Since it affects any Go program by making them print tons of runtime
information, it also affects "go env", breaking garble horribly.
To prevent further issues, unset the env var when done.
Packages P1 and P2 can define identical struct types T1 and T2, and one
can convert from type T1 to T2 or vice versa.
The spec defines two identical struct types as:
Two struct types are identical if they have the same sequence of
fields, and if corresponding fields have the same names, and
identical types, and identical tags. Non-exported field names
from different packages are always different.
Unfortunately, garble broke this: since we obfuscated field names
differently depending on the package, cross-package conversions like the
case above would result in typechecking errors.
To fix this, implement Joe Tsai's idea: hash struct field names with the
string representation of the entire struct. This way, identical struct
types will have their field names obfuscated in the same way in all
packages across a build.
Note that we had to refactor "reverse" a bit to start using transformer,
since now it needs to keep track of struct types as well.
This failure was affecting the build of google.golang.org/protobuf,
since it makes regular use of cross-package struct conversions.
Note that the protobuf module still fails to build, but for other
reasons. The package that used to fail now succeeds, so the build gets a
bit further than before. #240 tracks adding relevant third-party Go
modules to CI, so we'll track the other remaining failures there.
Fixes#310.
Obfuscating some std packages for windows/arm triggered a bug; when
encountering a call to runtime·memmove, we'd hash "memmove" with the
current package's action ID.
This is wrong on two levels: First, we aren't obfuscating the runtime
package yet. And second, if we did, we would have to hash the symbol
appropriately, with that package's action ID.
For now, only hashing the local names does the trick. That's all that
the code currently supports, anyway.