It's common for asset bundling code generators to produce huge literals,
for example in strings. Our literal obfuscators are meant for relatively
small string-like literals that a human would write, such as URLs, file
paths, and English text.
I ran some quick experiments, and "garble build -literals" appears to
hang when obfuscating literals starting at 5-20KiB. It's
not really hung; it's just doing a lot of busy work obfuscating those
literals. The code it produces is also far from ideal, so it also takes
some time to finally compile.
The generated code also led to crashes. For example, using "garble build
-literals -tiny" on a package containing literals of over a megabyte,
our use of asthelper to remove comments and shuffle line numbers could
run out of stack memory.
This all points in one direction: we never designed "-literals" to deal
with large sizes. Set a source-code-size limit of 2KiB.
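A minimal sketch of such a size check follows. The names maxSizeBytes and
shouldObfuscate are hypothetical, and the exact comparison used in garble
may differ:

```go
package main

import (
	"fmt"
	"strings"
)

// maxSizeBytes is a hypothetical name for the 2KiB limit described above.
const maxSizeBytes = 2 << 10

// shouldObfuscate reports whether a literal's source is small enough to be
// obfuscated. Larger literals are left untouched.
func shouldObfuscate(litSource string) bool {
	return len(litSource) <= maxSizeBytes
}

func main() {
	small := `"https://example.com/path"`
	huge := `"` + strings.Repeat("x", 128<<10) + `"`
	fmt.Println(shouldObfuscate(small)) // true
	fmt.Println(shouldObfuscate(huge))  // false
}
```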
We alter the literals.txt test as well, to include a few 128KiB string
literals. Before this fix, "go test" would seemingly hang on that test
for over a minute (I did not wait any longer). With the fix, those large
literals are not obfuscated, so the test ends in its usual 1-3s.
As said in the const comment, I don't believe any of this is a big
problem. Come Go 1.16, most developers should stop using asset-bundling
code generators and use go:embed instead. If we wanted to somehow
obfuscate those, it would be an entirely separate feature.
And, if someone wants to work on obfuscating truly large literals for
any reason, we need good tests and benchmarks to ensure garble does not
consume CPU for minutes or run out of memory.
I also simplified the generate-literals test command. The only argument
that matters to the script is the filename, since it's used later on.
Fixes #178.
main.go includes a lengthy comment that documents this edge case, why it
happened, and how we are fixing it. To summarize, we should no longer
fail with a build error in those cases. Read the comment for details.
A few other minor changes were done to allow writing this patch.
First, the actionID and contentID funcs were renamed, since they started
to collide with variable names.
Second, the logging has been improved a bit, which allowed me to debug
the issue.
Third, the "cache" global shared by all garble sub-processes now
includes the necessary parameters to run "go list -toolexec", including
the path to garble and the build flags being used.
Thanks to lu4p for writing a test case, which also applied gofmt to that
testdata Go file.
Fixes #180.
Closes #181, since it includes its test case.
If code includes a linkname directive pointing at a name in an imported
package, like:
//go:linkname localName importedpackage.RemoteName
func localName()
We should rewrite the comment to replace "RemoteName" with its
obfuscated counterpart, if the package in question was obfuscated and
that name was as well.
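The rewrite could look roughly like the sketch below. It is only an
illustration: rewriteLinkname and the lookup callback are hypothetical
names, and the obfuscated name "zXy12" is made up.

```go
package main

import (
	"fmt"
	"strings"
)

// rewriteLinkname replaces the remote name in a three-field go:linkname
// directive with its obfuscated counterpart. lookup is a hypothetical
// callback which reports the obfuscated name, if there is one.
func rewriteLinkname(comment string, lookup func(pkgPath, name string) (string, bool)) string {
	fields := strings.Fields(comment)
	if len(fields) != 3 || fields[0] != "//go:linkname" {
		return comment // not a directive pointing at another package
	}
	target := fields[2]
	dot := strings.LastIndex(target, ".")
	if dot < 0 {
		return comment
	}
	pkgPath, name := target[:dot], target[dot+1:]
	newName, ok := lookup(pkgPath, name)
	if !ok {
		return comment // the package or the name was not obfuscated
	}
	// The local name (fields[1]) is left untouched.
	return fields[0] + " " + fields[1] + " " + pkgPath + "." + newName
}

func main() {
	lookup := func(pkgPath, name string) (string, bool) {
		if pkgPath == "importedpackage" && name == "RemoteName" {
			return "zXy12", true // made-up obfuscated name
		}
		return "", false
	}
	fmt.Println(rewriteLinkname("//go:linkname localName importedpackage.RemoteName", lookup))
	fmt.Println(rewriteLinkname("//go:linkname localName other.Name", lookup))
}
```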
We already had some code to handle linkname directives, but only to
ensure that "localName" was never obfuscated. This behavior is kept, to
ensure that the directive applies to the right name. In the future, we
could instead rewrite "localName" in the directive, like we do with
"RemoteName".
Add plenty of tests, too. The linkname directive used to be tested in
imports.txt and syntax.txt, but that was hard to maintain as each file
tested different edge cases.
Now that we have build caching, adding one extra testscript file isn't a
big problem anymore. Add linkname.txt, which is self-explanatory. The
other two scripts also get a bit less complex.
Fixes #197.
First, we don't need the nameSpecialDirectives list as a separate thing.
cgo types aren't obfuscated anymore, so the only item in that list that
made a difference in the tests was go:linkname, which we'll overhaul
soon. For now, keep its code around.
Second, processDetachedDirectives can be replaced by just seven lines.
Third, we don't need to separate build tag directives from the rest of
the detached directives. Their relative order (with other comments) does
not matter.
Fourth and last, ranging over a nil slice is a no-op, so a nil check
around a slice range is unnecessary.
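As a small illustration of that last point, the following sketch ranges
over a nil slice without any guard. The countItems helper is made up for
this example:

```go
package main

import "fmt"

// countItems ranges over items without a nil check; a nil slice simply
// yields zero iterations.
func countItems(items []string) int {
	n := 0
	for range items {
		n++
	}
	return n
}

func main() {
	var none []string // nil slice
	fmt.Println(countItems(none))               // 0
	fmt.Println(countItems([]string{"a", "b"})) // 2
}
```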
This is some prep work to make the patch to support go:linkname smaller
and easier to review.
It turns out that the modules we include in testdata/mod via
txtar-addmod don't result in the same h1 hash that one gets when using
proxy.golang.org.
As proof:
$ go clean -modcache && go get -d rsc.io/quote@v1.5.2 && go test -short
[...]
--- FAIL: TestScripts/imports (0.06s)
> garble build -tags buildtag
go list error: exit status 1: verifying rsc.io/quote@v1.5.2: checksum mismatch
downloaded: h1:w5fcysjrx7yqtD/aO+QwRjYZOKnaM9Uh2b40tElTs3Y=
go.sum: h1:3fEykkD9k7lYzXqCYrwGAf7iNhbk4yCjHmKBN9td4L0=
The added comment explains the situation in detail.
For now, simply work around the issue by not sharing GOMODCACHE with the
host.
Previously, we were never obfuscating runtime and its direct
dependencies. Unfortunately, due to linkname, the runtime package is
actually closely related to dozens of other std packages as well.
Until we can obfuscate the runtime and properly support go:linkname
directives, obfuscating fewer std packages is a better outcome than
breaking and not producing any obfuscated code at all.
The added test case is building runtime/pprof, which used to cause
failures:
# runtime/pprof
/go/src/runtime/pprof/label.go:27:21: undefined: context.Context
/go/src/runtime/pprof/label.go:59:21: undefined: context.Context
/go/src/runtime/pprof/label.go:93:16: undefined: context.Context
/go/src/runtime/pprof/label.go:101:20: undefined: context.Context
The net package was also very close to obfuscating properly thanks to
this change, so its test is now run as well. The only other remaining
fix was to not obfuscate fields on cgo types, since those aren't
obfuscated at the moment.
The map is pretty long, but it's only a temporary solution and the
command to obtain the list again is included. Never obfuscating the
entire std library is also an option, but it's a bit unnecessary.
Fixes #134.
The first fix is to stop garbling package names that don't contain any code. For example, given the import path "github.com/foo/bar", "github.com" was treated as a package name that should be garbled, which doesn't make sense.
The other bug was incorrectly matching private package names inside import paths. Before, if "internal" was an imported package matched by GOPRIVATE, but "internal/foo" was not, any instance of "internal/foo" would still be garbled as "<garbled>/foo", which is incorrect.
Previously garble heavily used env vars to share data between processes.
This also makes it easy to share complex data between processes.
The complexity of main.go is considerably reduced.
The previous globals worked, but were unnecessarily complex. For
example, we passed the fromPath variable around, but it's really a
static global, since we only compile or link a single package in each Go
process. Use such global variables instead of passing them around, which
currently include the package's import path, its build ID, and its
import config path.
Also split all the hashing and build ID code into hash.go, since that's
a relatively well contained 200 lines of code that doesn't need to make
main.go any bigger. We also split the code to alter Go's own version to
a separate function, so that it can be moved out of main.go as well.
* Use latest Binject/debug version to support importmap directives in the importcfg file
* Uncomment line in goprivate testscript to test ImportMap
* Fixed issue where a package specified in importmap would be hashed differently in a package that imported it, due to the mapping of import paths.
Also commented out the 'net' import in the goprivate testscript (again) due to cgo compile errors
We also update the "original types importer" to support ImportMap.
The test now gets further along, no longer getting stuck on "path not
found in listed packages". Instead, we get stuck on:
error parsing importcfg: <...>/importcfg:2: unknown directive "importmap"
This bug has been filed at https://github.com/Binject/debug/issues/17.
Until it's fixed, we can't really proceed on #146, so the net import in
the test file (which triggers this case) is commented out for now.
Updates #146.
In Go 1.15, if a dependency is required but not listed in go.mod/go.sum,
it's resolved and added automatically.
This is changing in 1.16. From that release, one will have to explicitly
update the mod files via 'go mod tidy' or 'go get'.
To get ahead of the curve, start using -mod=readonly to get the same
behavior in 1.15, and fix all existing tests.
The only tests that failed were imports.txt and syntax.txt, the only
ones to require other modules. But since we're here, let's add the 'go'
line to all go.mod files as well.
That is, a package that is built without obfuscation imports an
obfuscated package. This will result in confusing compilation error
messages, because the importer can't find the exported names from the
imported package by their non-obfuscated names:
> ! garble build ./importer
[stderr]
# test/main/importer
importer/importer.go:5:9: undefined: imported.Name
exit status 2
Instead, detect this bad input case and provide a nice error:
public package "test/main/importer" can't depend on obfuscated package "test/main/imported" (matched via GOPRIVATE="test/main/imported")
For now, this is by design. It also makes little sense for a public
package to import an obfuscated package in general, because the public
package would have to leak details about the private package's API and
behavior.
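The check itself can be sketched as below. The checkImport function and
its parameters are hypothetical; only the error message format is taken
from the text above.

```go
package main

import "fmt"

// checkImport returns an error if a public package imports a private
// (obfuscated) one. The function and its parameters are hypothetical.
func checkImport(pkgPath string, pkgPrivate bool, importPath string, importPrivate bool, goprivate string) error {
	if !pkgPrivate && importPrivate {
		return fmt.Errorf("public package %q can't depend on obfuscated package %q (matched via GOPRIVATE=%q)",
			pkgPath, importPath, goprivate)
	}
	return nil
}

func main() {
	err := checkImport("test/main/importer", false, "test/main/imported", true, "test/main/imported")
	fmt.Println(err)
}
```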
While at it, fix a quirk where we thought the unsafe package could be
private. It can't be, because the runtime package is always public and
it imports the unsafe package:
public package "internal/bytealg" can't depend on obfuscated package "unsafe" (matched via GOPRIVATE="*")
Instead of trying to obfuscate "unsafe" and doing nothing, simply add it
to the neverPrivate list, which is also a better name than
"privateBlacklist" (for #169).
Fixes #164.
Co-authored-by: lu4p <lu4p@pm.me>
Means that we no longer have to pass a dozen parameters around, mainly
to transformGo. We can also start documenting what each of the fields
actually does, and group them better.
While at it, pkgPath and pkgScope can both be replaced by a
*types.Package, since they're both accessible via trivially cheap
methods.
By avoiding the fmt import, we save some work: 'go test -run Script/tiny'
goes down from 0.8s to 0.5s even with a warm build cache.
Since the output changes between runs, we use stderr grep lines instead
of cmp. This way we can also check that the "oh noes" panic is entirely
hidden in the tiny mode.
We can use println instead of fmt.Println. Similarly, we can avoid
strings.Join by just appending bytes to a []byte.
This is less important now that we have build caching, but it still
helps to do less work overall and link smaller binaries.
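A sketch of what such a test program might look like without the fmt and
strings imports. The joinBytes helper is made up for illustration:

```go
package main

// joinBytes concatenates parts with sep between them by appending to a
// []byte, which avoids importing the strings package.
func joinBytes(parts []string, sep byte) string {
	var buf []byte
	for i, p := range parts {
		if i > 0 {
			buf = append(buf, sep)
		}
		buf = append(buf, p...)
	}
	return string(buf)
}

func main() {
	// The println builtin needs no imports. Unlike fmt.Println, it writes
	// to standard error.
	println(joinBytes([]string{"a", "b", "c"}, ','))
}
```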
Reduces 'go test -run Script/init' from 0.5s to 0.3s on my laptop.
Also, properly format the Go in that file, since the space indentation
wasn't noticed during code review. We might want to enforce gofmt in Go
files within txtar files if this keeps happening.
* fix bug where structs would get garbled in some packages but not in others
* only check if struct/field was not defined in current package
* fix a related bug when two objects share the same name in the same package and one is garbled but the other one is not
* renamed parameter for clarity
Since it's been failing for weeks, it's practically useless for now.
Even with continue-on-error, the failures still look scary at first
glance.
We can re-enable this job once we fix master.
Use a static main.stderr file, like in the other tests. This means we
don't need to always start the test with a 'go build', and the output is
also obvious by just reading the txtar file.
We can also move generate-literals to a later stage, so that 'go test
-short' needs to do even less work.
'go test -short -run Script/literals' drops from ~0.4s to ~0.2s on my
laptop.
Finally, make the printing of byte lists not use trailing spaces, so
that the txtar file itself doesn't have trailing whitespace in its lines
either.
Fixes #103.
The test intended to use an extra module to be obfuscated, rsc.io/quote,
which we were bundling in the local proxy as well. Unfortunately, the
use of GOPRIVATE also meant that we did not actually fetch the module
from the proxy, and we would instead do a full roundtrip to the internet
to "git clone" the actual upstream repository.
To prevent that roundtrip, instead use a locally replaced module. This
fits the syntax.txt test too, since it's one more edge case that we want
to make sure works well with garble. Since rsc.io/quote is used in
another test, simply make up our own tiny module.
Reduces a 'go test -run Syntax/syntax' run with warm cache from ~5s to
~0.5s, thanks to removing the multiple roundtrips. A warm 'go test' run
still sits at ~6s, since we still need that much CPU time in total.
While at it, fix a staticcheck warning and fix inconsistent indentation
in a couple of tests.
As per the discussion in https://github.com/golang/go/issues/41145, it
turns out that we don't need special support for build caching in
-toolexec. We can simply modify the behavior of "[...]/compile -V=full"
and "[...]/link -V=full" so that they include garble's own version and
options in the printed build ID.
The part of the build ID that matters is the last, since it's the
"content ID" which is used to work out whether there is a need to redo
the action (build) or not. Since cmd/go parses the last word in the
output as "buildID=...", we simply add "+garble buildID=_/_/_/${hash}".
The slashes let us imitate a full binary build ID, but we assume that
the other components such as the action ID are not necessary, since the
only reader here is cmd/go and it only consumes the content ID.
The reported content ID includes the tool's original content ID,
garble's own content ID from the built binary, and the garble options
which modify how we obfuscate code. If any of the three changes, we
should use a different build cache key. GOPRIVATE also affects caching,
since a different GOPRIVATE value means that we might have to garble a
different set of packages.
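The idea can be sketched as follows. The function names, the use of
SHA-256 with URL-safe base64, and the sample inputs are all assumptions
made for illustration; only the "+garble buildID=_/_/_/" suffix format
comes from the description above.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// garbleContentID hashes every input that should affect the build cache
// key. The hashing scheme here is an assumption, not garble's actual one.
func garbleContentID(toolID, garbleID, options, goprivate string) string {
	h := sha256.New()
	for _, part := range []string{toolID, garbleID, options, goprivate} {
		h.Write([]byte(part))
		h.Write([]byte{0}) // separator, so "ab"+"c" differs from "a"+"bc"
	}
	// URL-safe base64 contains no slashes, which separate build ID parts.
	return base64.RawURLEncoding.EncodeToString(h.Sum(nil))
}

// versionLine appends the content ID as the last word of the tool's
// -V=full output, which is the word that cmd/go parses.
func versionLine(toolLine, contentID string) string {
	return toolLine + " +garble buildID=_/_/_/" + contentID
}

func main() {
	id := garbleContentID("toolContentID", "garbleContentID", "-literals", "test/main")
	fmt.Println(versionLine("compile version go1.15.2", id))
}
```

Changing any one of the four inputs yields a different content ID, and
hence a different build cache key.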
Include tests, which mainly check that 'garble build -v' prints package
lines when we expect to always need to rebuild packages, and that it
prints nothing when we should be reusing the build cache even when the
built binary is missing.
After this change, 'go test' on Go 1.15.2 stabilizes at about 8s on my
machine, whereas it used to be at around 25s before.
What obfuscateImports did was valid, but unfortunately made the build
cache redo work. This is because we were modifying object files in-place
in the build cache, meaning that the Go tool would think it had to
re-compile those packages.
Instead, write the modified object files in a temporary directory, and
leave the input object files untouched. We require a bit of extra code
to keep track of this and adjust the link argument as well as its
importcfg file.
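Adjusting the importcfg file could look like the sketch below, which
points "packagefile" lines at the modified copies. The rewriteImportCfg
function and the file paths are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// rewriteImportCfg returns a copy of an importcfg file's contents where
// "packagefile" lines point at replacement object files. replaced maps an
// original object file path to its modified copy in a temporary directory.
func rewriteImportCfg(cfg string, replaced map[string]string) string {
	lines := strings.Split(cfg, "\n")
	for i, line := range lines {
		if !strings.HasPrefix(line, "packagefile ") {
			continue
		}
		rest := strings.TrimPrefix(line, "packagefile ")
		eq := strings.Index(rest, "=")
		if eq < 0 {
			continue
		}
		importPath, objPath := rest[:eq], rest[eq+1:]
		if newPath, ok := replaced[objPath]; ok {
			lines[i] = "packagefile " + importPath + "=" + newPath
		}
	}
	return strings.Join(lines, "\n")
}

func main() {
	cfg := "# import config\npackagefile fmt=/cache/aa/fmt.a\npackagefile test/main/imported=/cache/bb/imported.a"
	replaced := map[string]string{
		"/cache/bb/imported.a": "/tmp/garble-objs/imported.a", // made-up paths
	}
	fmt.Println(rewriteImportCfg(cfg, replaced))
}
```

The original files in the build cache are never written to; only the
copy of the importcfg handed to the linker changes.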
The function of obfuscateImports, as well as the reasoning above, is now
summarized in its godoc as well.
This should be the last change in preparation for proper build caching
support. Rebasing the build caching branch on this commit finally makes
caching work reliably every single time.
Implemented a more correct transformation of comments.
Added processing of the "//go:linkname localname [importpath.name]" directive, so that localname is no longer renamed. This is safe and does not disclose any names, because functions marked with //go:linkname do not have a name in the resulting binary.
Added support for cgo directives.
Fixed filename leak protection for cgo.
Part of #149
Fix a bug where generated short names could conflict with local
variables, functions, types, or structs.
The already existing names are collected and if the generated short name
already exists, the package counter is increased until a free name is found.
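A sketch of that retry loop follows. The nextFreeName function and the
"z" plus counter naming scheme are assumptions made for illustration:

```go
package main

import (
	"fmt"
	"strconv"
)

// nextFreeName derives a short name from *counter, increasing the counter
// until the name is not present in existing. The "z" prefix is made up.
func nextFreeName(counter *int, existing map[string]bool) string {
	for {
		name := "z" + strconv.Itoa(*counter)
		*counter++
		if !existing[name] {
			existing[name] = true // reserve it for later calls
			return name
		}
	}
}

func main() {
	// Names already declared in the package being obfuscated.
	existing := map[string]bool{"z0": true, "z1": true}
	counter := 0
	fmt.Println(nextFreeName(&counter, existing)) // z2
	fmt.Println(nextFreeName(&counter, existing)) // z3
}
```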
Part of #149.
Added cleanup of the Comment field.
In some cases, the appearance of a comment in a random place
may break the compilation (e.g. cgo and runtime package).
This is safe because the Comment field cannot contain any directives.
Part of #149.
The struct type for buildInfo doesn't need to be named. Plus, the
"packageInfo" name was actually pretty misleading, because buildInfo
contains data from many packages.
Add an importCfg field, so that we don't need to fetch the flag value
many times.
Simplify reading the importCfg file; we used to also write to it, but
that's no longer the case, so we can just use ioutil.ReadFile.
Finally, give the function that fills buildInfo a better name, a godoc,
and fix the origTypesConfig godoc.
We also add a TODO to reuse goobj.ParseImportCfg in the future.
We now store how we obfuscated unexported names in the object file
itself, not a separate file. This means that the data can survive in the
build cache, whereas the separate file was being lost. Luckily, we can
just add an extra header to the archive, and other programs like the Go
linker will just ignore it.
In tiny.txt, we already check line numbers via stderr, so there's no
need to do that via -debugdir.
In syntax.txt, we only really care about what names remain in the
binary, not the names which remain in the source but don't affect the
binary.
These changes are important because -debugdir adds a non-trivial amount
of work, which will impede build caching once that feature lands. We
will likely make -debugdir support build caching eventually, but for
now, this preliminary change will make 'go test' much faster with build
caching.
And of course, the tests get simpler, which is nice.
Give the func a name that tells what the return value means.
Add missing newlines to printfs, use consistent quoting, and replace
"%s" with %q.
Document the Go 1.15 date.
Finally, fix the imports via goimports.
See https://github.com/burrowers/garble/issues/121#issuecomment-695935859.
In some rare cases, it's nearly impossible to write a test for a change,
but they are truly so rare that we shouldn't give any ideas here.
By default, all contributors should try to write a test for every
change that changes what the code is meant to do.
Many files were missing copyright, so also add a short script to add the
missing lines with the current year, and run it.
The AUTHORS file is also self-explanatory. Contributors can add
themselves there, or we can simply update it from time to time via
git-shortlog.
Since we have two scripts now, set up a directory for them.