Commit Graph

12 Commits (b71ff42877a4ce5d1fefd98f920e2f6e400718d2)

Author SHA1 Message Date
Daniel Martí 4e9ee17ec8
refactor "current package" with TOOLEXEC_IMPORTPATH (#266)
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.

Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.

To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.

The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.

Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.

isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".

Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.

Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.

While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
3 years ago
Daniel Martí ff0bea73b5
all: drop support for Go 1.15.x (#265)
This mainly cleans up the few bits of code where we explicitly kept
support for Go 1.15.x. With v0.1.0 released, we can drop support now,
since the next v0.2.0 release will only support Go 1.16.x.

Also updates all modules, including test ones, to 'go 1.16'.

Note that the TOOLEXEC_IMPORTPATH refactor is not done here, despite all
the TODOs about doing so when we drop 1.15 support. This is because that
refactor needs to be done carefully and might have side effects, so it's
best to keep it to a separate commit.

Finally, update the deps.
3 years ago
Daniel Martí 1267e2eced
fix link errors when importing crypto/ecdsa (#262)
First, we had some link errors such as:

	cannot find package J6OzO8GN (using -importcfg)

This was caused by the code that writes an updated importcfg, which did
not handle import maps well. That code is now fixed, and we also add an
obfuscatedImportPath method for clarity.

Once fixed, we ran into other link errors:

	Pw3g97ww.addVW: relocation target Pw3g97ww.addVWlarge not defined

After some digging, the cause of those is assembly code that we do not
yet support obfuscating. #261 tracks that.

Meanwhile, to fix "GOPRIVATE=* garble build" and to be able to have a
test for the original import path bug, we add the packages which use
that form of assembly code to runtimeRelated - math/big and
crypto/sha512. There might be more, but these were the ones found by
trying to link crypto/tls, a fairly common dependency.

Fixes #256.
3 years ago
Daniel Martí b03cd08c09
avoid one more call to 'go tool buildid' (#253)
We use it to get the content ID of garble's binary, which is used for
both the garble action IDs, as well as 'go tool compile -V=full'.

Since those two happen in separate processes, both used to call 'go tool
buildid' separately. Store it in the gob cache the first time, and reuse
it the second time.

Since each call to cmd/go costs about 10ms (new process, running its
many init funcs, etc), this results in a nice speed-up for our small
benchmark. Most builds will take many seconds though, so note that a
~15ms speedup there will likely not be noticeable.

While at it, simplify the buildInfo global, as now it just contains a
map representation of the -importcfg contents. It now has better names,
docs, and a simpler representation.

We also stop using the term "garbled import", as it was a bit confusing.
"obfuscated types.Package" is a much better description.

	name     old time/op       new time/op       delta
	Build-8        106ms ± 1%         92ms ± 0%  -14.07%  (p=0.010 n=6+4)

	name     old bin-B         new bin-B         delta
	Build-8        6.60M ± 0%        6.60M ± 0%   -0.01%  (p=0.002 n=6+6)

	name     old sys-time/op   new sys-time/op   delta
	Build-8        208ms ± 5%        149ms ± 3%  -28.27%  (p=0.004 n=6+5)

	name     old user-time/op  new user-time/op  delta
	Build-8        433ms ± 3%        384ms ± 3%  -11.35%  (p=0.002 n=6+6)
3 years ago
Daniel Martí 6898d61637
start using original action IDs (#251)
When we obfuscate a name, what we do is hash the name with the action ID
of the package that contains the name. To ensure that the hash changes
if the garble tool changes, we used the action ID of the obfuscated
build, which is different than the original action ID, as we include
garble's own content ID in "go tool compile -V=full" via -toolexec.

Let's call that the "obfuscated action ID". Remember that a content ID
is roughly the hash of a binary or object file, and an action ID
contains the hash of a package's source code plus the content IDs of its
dependencies.

This had the advantage that it did what we wanted. However, it had one
massive drawback: when we compile a package, we only have the obfuscated
action IDs of its dependencies. This is because one can't have the
content ID of dependent packages before they are built.

Usually, this is not a problem, because hashing a foreign name means it
comes from a dependency, where we already have the obfuscated action ID.
However, that's not always the case.

First, go:linkname directives can point to any symbol that ends up in
the binary, even if the package is not a dependency. So garble could
only support linkname targets belonging to dependencies. This is at the
root of why we could not obfuscate the runtime; it contains linkname
directives targeting the net package, for example, which depends on runtime.

Second, some other places did not have an easy access to obfuscated
action IDs, like transformAsm, which had to recover it from a temporary
file stored by transformCompile.

Plus, this was all pretty expensive, as each toolexec sub-process had to
make repeated calls to buildidOf with the object files of dependencies.
We even had to use extra calls to "go list" in the case of indirect
dependencies, as their export files do not appear in importcfg files.

All in all, the old method was complex and expensive. A better mechanism
is to use the original action IDs directly, as listed by "go list"
without garble in the picture.

This would mean that the hashing does not change if garble changes,
meaning weaker obfuscation. To regain that property, we define the
"garble action ID", which is just the original action ID hashed together
with garble's own content ID.

This is practically the same as the obfuscated build ID we used before,
but since it doesn't go through "go tool compile -V=full" and the
obfuscated build itself, we can work out *all* the garble action IDs
upfront, before the obfuscated build even starts.

This fixes all of our problems. Now we know all garble build IDs
upfront, so a bunch of hacks can be entirely removed. Plus, since we
know them upfront, we can also cache them and avoid repeated calls to
"go tool buildid".

While at it, make use of the new BuildID field in Go 1.16's "list -json
-export". This avoids the vast majority of "go tool buildid" calls, as
the only ones that remain are 2 on the garble binary itself.

The numbers for Go 1.16 look very good:

	name     old time/op       new time/op       delta
	Build-8        146ms ± 4%        101ms ± 1%  -31.01%  (p=0.002 n=6+6)

	name     old bin-B         new bin-B         delta
	Build-8        6.61M ± 0%        6.60M ± 0%   -0.09%  (p=0.002 n=6+6)

	name     old sys-time/op   new sys-time/op   delta
	Build-8        321ms ± 7%        202ms ± 6%  -37.11%  (p=0.002 n=6+6)

	name     old user-time/op  new user-time/op  delta
	Build-8        538ms ± 4%        414ms ± 4%  -23.12%  (p=0.002 n=6+6)
3 years ago
Daniel Martí 63c42c3cc7
support typechecking all of std (#236)
There was one bug keeping the command below from working:

	GOPRIVATE='*' garble build std

The bug is rather obscure; I'm still working on a minimal reproducer
that I can submit upstream, and I'm not yet convinced about where the
bug lives and how it can be fixed.

In short, the command would fail with:

	typecheck error: /go/src/crypto/ecdsa/ecdsa.go:122:12: cannot use asn1.SEQUENCE (constant 48 of type asn1.Tag) as asn1.Tag value in argument to b.AddASN1

Note that the error is ambiguous; there are two asn1 packages, but they
are actually mismatching. We can see that by manually adding debug
prints to go/types:

	constant: asn1.SEQUENCE (constant 48 of type golang.org/x/crypto/cryptobyte/asn1.Tag)
	argument type: vendor/golang.org/x/crypto/cryptobyte/asn1.Tag

It's clear that, for some reason, go/types ends up confused and loading
a vendored and non-vendored version of asn1. There also seems to be no
way to work around this with our lookup function, as it just receives an
import path as a parameter, and returns an object file reader.

For now, work around the issue by *not* using a custom lookup function
in this rare edge case involving vendored dependencies in std packages.
The added code has a lengthy comment explaining the reasoning.

I still intend to investigate this further, but there's no reason to
keep garble failing if we can work around the bug.

Fixes #223.
3 years ago
Daniel Martí e33179d480
reverse: support unexported names and package paths (#233)
Unexported names are a bit tricky, since they are not listed in the
export data file. Perhaps unsurprisingly, it's only meant to expose
exported objects.

One option would be to go back to adding an extra header to the export
data file, containing the unexported methods in a map[string]T or
[]string. However, we have an easier route: just parse the Go files and
look up the names directly.

This does mean that we parse the Go files every time "reverse" runs,
even if the build cache is warm, but that should not be an issue.
Parsing Go files without any typechecking is very cheap compared to
everything else we do. Plus, we save having to load go/types information
from the build cache, or having to load extra headers from export files.

It should be noted that the obfuscation process does need type
information, mainly to be careful about which names can be obfuscated
and how they should be obfuscated. Neither is a worry here; all names
belong to a single package, and it doesn't matter if some aren't
actually obfuscated, since the string replacements would simply never
trigger in practice.

The test includes an unexported func, to test the new feature. We also
start reversing the obfuscation of import paths. Now, the test's reverse
output is as follows:

	goroutine 1 [running]:
	runtime/debug.Stack(0x??, 0x??, 0x??)
		runtime/debug/stack.go:24 +0x??
	test/main/lib.ExportedLibFunc(0x??, 0x??, 0x??, 0x??)
		p.go:6 +0x??
	main.unexportedMainFunc(...)
		C.go:2
	main.main()
		z.go:3 +0x??

The only major missing feature is positions and filenames. A follow-up
PR will take care of those.

Updates #5.
3 years ago
Daniel Martí d8e8738216
initial support for reversing panic output (#225)
For now, this only implements reversing of exported names which are
hashed with action IDs. Many other kinds of obfuscation, like positions
and private names, are not yet implemented.

Note that we don't document this new command yet on purpose, since it's
not finished.

Some other minor cleanups were done for future changes, such as making
transformLineInfo into a method that also receives the original
filename, and making header names more self-describing.

Updates #5.
3 years ago
Daniel Martí 78b69bbdab share a single temporary directory between all processes
Each compile and link sub-process created its own temporary directory,
to be cleaned up shortly after. Moreover, we also had the global
gob-encoded temporary file.

Instead, place all of those under a single, start-to-end temporary
directory. This is cleaner for the end user, and easier to maintain for
us.

A big plus is that we can also get rid of the confusing deferred global,
as it was mostly used to clean up these extra temp dirs. The only
remaining use was post-compile code, which is now an explicit func
returned by each "transform" func.

While at it, clean up the math/rand seeding code a bit and add a debug
log line, and stop shadowing a cmd string with a cmd *exec.Cmd.

Fixes #147.
3 years ago
Daniel Martí 249501b5e9
fix garbling names belonging to indirect imports (#203)
main.go includes a lengthy comment that documents this edge case, why it
happened, and how we are fixing it. To summarize, we should no longer
error with a build error in those cases. Read the comment for details.

A few other minor changes were done to allow writing this patch.

First, the actionID and contentID funcs were renamed, since they started
to collide with variable names.

Second, the logging has been improved a bit, which allowed me to debug
the issue.

Third, the "cache" global shared by all garble sub-processes now
includes the necessary parameters to run "go list -toolexec", including
the path to garble and the build flags being used.

Thanks to lu4p for writing a test case, which also applied gofmt to that
testdata Go file.

Fixes #180.
Closes #181, since it includes its test case.
4 years ago
Daniel Martí c9deff810b
obfuscate fewer std packages (#196)
Previously, we were never obfuscating runtime and its direct
dependencies. Unfortunately, due to linkname, the runtime package is
actually closely related to dozens of other std packages as well.

Until we can obfuscate the runtime and properly support go:linkname
directives, obfuscating fewer std packages is a better outcome than
breaking and not producing any obfuscated code at all.

The added test case is building runtime/pprof, which used to cause
failures:

	# runtime/pprof
	/go/src/runtime/pprof/label.go:27:21: undefined: context.Context
	/go/src/runtime/pprof/label.go:59:21: undefined: context.Context
	/go/src/runtime/pprof/label.go:93:16: undefined: context.Context
	/go/src/runtime/pprof/label.go:101:20: undefined: context.Context

The net package was also very close to obfuscating properly thanks to
this change, so its test is now run as well. The only other remaining
fix was to not obfuscate fields on cgo types, since those aren't
obfuscated at the moment.

The map is pretty long, but it's only a temporary solution and the
command to obtain the list again is included. Never obfuscating the
entire std library is also an option, but it's a bit unnecessary.

Fixes #134.
4 years ago
lu4p cf290b8e6d
Share data between processes via a shared file. (#192)
Previously garble heavily used env vars to share data between processes.
This also makes it easy to share complex data between processes.

The complexity of main.go is considerably reduced.
4 years ago