Commit Graph

179 Commits (ff361f6b1599c542c389904e1666357a0a558cb2)

Author SHA1 Message Date
Daniel Martí 1d31a139f5 support aliases as embedded fields in dependencies
Our recent work in fieldToAlias worked well when the embedded field
declaration (using an alias) was in the same package as the use of that
field. We would have the *ast.Ident for the field declaration, so
types.Info.Uses would give us the TypeName for the alias.

Unfortunately, if the declaration was in a dependency package, we did
not have that same *ast.Ident, as we weren't parsing the source code for
dependencies for type-checking. This resulted in us incorrectly
obfuscating the use of such an embedded field:

	> garble build
	[stderr]
	# test/main
	JtzmzxWf.go:4: unknown field 'ExternalForeignAlias' in struct literal of type _BdSNiEL.Vcs_smer

To fix this, look through the direct imports of the package defining the
field to find an alias under the exact same name. Not a foolproof
solution, as there's a TODO, but it should work for most cases.

Fixes the obfuscation of google.golang.org/grpc/internal/status, too.

Updates #349.
3 years ago
Daniel Martí 68b39e8195 fix a link issue when obfuscating complex cgo packages
The added test case, which is obfuscating and linking os/user, would fail
before this fix:

	> garble build
	[stderr]
	# test/main
	/usr/lib/go/pkg/tool/linux_amd64/link: running gcc failed: exit status 1
	/usr/bin/ld: $WORK/.tmp/go-link-073246656/go.o: in function `Chz0Yfs2._cgo_cmalloc':
	go.go:(.text+0x993cc): undefined reference to `Chz0Yfs2.runtime_throw'
	/usr/bin/ld: $WORK/.tmp/go-link-073246656/go.o: in function `Chz0Yfs2.tDfhQ8uK':
	go.go:(.text+0x99801): undefined reference to `Chz0Yfs2._cgo_runtime_gostring'
	/usr/bin/ld: go.go:(.text+0x9982a): undefined reference to `Chz0Yfs2._cgo_runtime_gostring'
	/usr/bin/ld: go.go:(.text+0x99853): undefined reference to `Chz0Yfs2._cgo_runtime_gostring'
	collect2: error: ld returned 1 exit status

The reason is that we would alter the linkname directives of cgo-generated
code, but we would not obfuscate the code itself at all.

The generated code would end up being transformed into:

	//go:linkname zh_oKZIy runtime.throw
	func runtime_throw(string)

One can clearly see the error there; handleDirectives obfuscated the
local linkname name, but since transformGo didn't run, the actual Go
declaration was not obfuscated in the same way. Thus, the linker fails
to find a function body for runtime_throw, and fails.

The solution is simple: handleDirectives assumes that it's running on
code being obfuscated, so only run it when transformGo is running.

We can also remove the cgo skip check in handleDirectives, as it never
runs on cgo-generated code now.

Fixes a number of build errors that have been noticed since
907aebd770.
3 years ago
Daniel Martí 64317883c9 handle aliases to foreign named types properly
When such an alias name was used to define an embedded field, we handled
that case gracefully via the code using:

	tf.info.Uses[node].(*types.TypeName)

Unfortunately, when the same field name was used elsewhere, such as a
composite literal, tf.Info.Uses gave us a *types.Var, not a
*types.TypeName, meaning we could no longer tell if this was an alias,
or what it pointed to.

Thus, we failed to obfuscate the name properly in the added test case:

	> garble build
	[stderr]
	# test/main/sub
	xxWZf66u.go:36: unknown field 'foreignAlias' in struct literal of type smhWelwn

It doesn't seem like any of the go/types APIs allows us to obtain the
*types.TypeName directly in this scenario. Thus, use a trick that we
used before: after typechecking, but before obfuscating, record all
embedded struct field *types.Var which are aliases via a map, where the
value holds the *types.TypeName for the alias.

Updates #349.
3 years ago
Daniel Martí e2ddce75a7 support embedding via embed.FS
We already added support for "//go:embed" with string and []byte,
by not obfuscating the "embed" import path.

However, embed.FS was still failing:

	> garble build
	[stderr]
	# test/main
	:13: go:embed cannot apply to var of type embed.WtKNvwbN

The compiler detects the type by matching its name to exactly "embed.FS",
so don't obfuscate the name "FS" either.

While at it, ensure that the embed code behaves the same with "go build".

Updates #349.
3 years ago
Daniel Martí 7fc424ca26 make "garble command -h" give command-specific help
Before, "garble build -h" would print the same as "garble -h", which is
too much and confusing, as it doesn't tell us much about "build".

Now it's far better, and includes the output of "go build -h":

	$ garble build -h
	usage: garble [garble flags] build [arguments]

	This command wraps "go build". Below is its help:

	usage: go build [-o output] [build flags] [packages]
	Run 'go help build' for details.

We do the same for "garble reverse -h", since it doesn't wrap a Go tool
command.
3 years ago
Daniel Martí 8edde922ee remove unused code spotted by -coverprofile
Remove some asthelper APIs that haven't been used for some time.
They can be recovered from the git history if needed again.

One type assertion in the literals package is always true.

Embedded field objects are handled near the top of transformGo, so the
extra !obj.Embedded() check was always true. Remove it.

We always obfuscate standalone funcs now, so the obfuscatedTypesPackage
check is no longer necessary. This was necessary when we used to not
obfuscate func names when they were used in linkname directives.

The workaround for test package imports in obfuscatedTypesPackage I had
to add a few commits ago no longer seems to be necessary. This might be
thanks to the simplification with functions in the paragraph just above.

It's impossible to run garble without -trimpath nowadays, as we error
before the build even starts:

	$ go build -toolexec=garble
	go tool compile: exit status 1
	cannot open shared file, this is most likely due to not running "garble [command]"

When run as "garble build", the trimpath flag is always set. So the
check in alterTrimpath never triggers anymore, and couldn't be tested.

Finally, simplify the handling of comment syntax in printFile, and add a
few TODOs for other code paths not covered by our existing tests.

Total code coverage is up from 90.3% to 91.0%.
3 years ago
Daniel Martí 65ff07875b obfuscate alias names like any other objects
Before this change, we would try to never obfuscate alias names. That
was far from ideal, as they can end up in field names via anonymous
fields.

Even then, we would sometimes still fail to build, because we would
inconsistently obfuscate alias names. For example, in the added test
case:

	--- FAIL: TestScripts/syntax (0.23s)
	    testscript.go:397:
	        > env GOPRIVATE='test/main,private.source'
	        > garble build
	        [stderr]
	        # test/main/sub
	        Lv_a8gRD.go:15: undefined: KCvSpxmQ

To fix this problem, we set obj to be the TypeName corresponding to the
alias when it is used as an embedded field. We can then make the right
choice when obfuscating the name.

Right now, all aliases will be obfuscated. A TODO exists about not
obfuscating alias names when they're used as embedded fields in a struct
type in the same package, and that package is used for reflection -
since then, the alias name ends up as the field name.

With these changes, the protobuf module now builds.
3 years ago
Daniel Martí 68f07389b2 fix a number of issues involving types from indirect imports
obfuscatedTypesPackage is used to figure out if a name in a dependency
package was obfuscated or not. For example, if that package used
reflection on a named type, it wasn't obfuscated, so we must have the
same information to not obfuscate the same name downstream.

obfuscatedTypesPackage could return nil if the package was indirectly
imported, though. This can happen if a direct import has a function that
returns an indirect type, or if a direct import exposes a name that's a
type alias to an indirect type.

We sort of dealt with this in two pieces of code by checking for
obfPkg!=nil, but a third one did not have this check and caused a panic
in the added test case:

	--- FAIL: TestScripts/reflect (0.81s)
	    testscript.go:397:
	        > env GOPRIVATE=test/main
	        > garble build
	        [stderr]
	        # test/main
	        panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	        	panic: runtime error: invalid memory address or nil pointer dereference
	        [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x8a5e39]

More importantly though, the nil check only avoids panics. It doesn't
fix the root cause of the problem: that importcfg does not contain
indirectly imported packages. The added test case would still fail, as
we would obfuscate a type in the main package, but not in the indirectly
imported package where the type is defined.

To fix this, resurrect a bit of code from earlier garble versions, which
uses "go list -toolexec=garble" to fetch a package's export file. This
lets us fill the indirect import gaps in importcfg, working around the
problem entirely.

This solution is still not particularly great, so we add a TODO about
possibly rethinking this in the future. It does add some overhead and
complexity, though thankfully indirect imports should be uncommon.

This fixes a few panics while building the protobuf module.
3 years ago
Daniel Martí 654841e1fb skip reflection detection for sibling packages
Our allPkgs boolean logic was wrong, because it could still lead to
garble obfuscating a type when used in the main package, but not in its
defining package. The added test case shows such a case.

To fix that, use a package path to only record the named objects from
the target package, which is a narrower operation without this problem,
but still makes all our tests pass.

This makes the google.golang.org/protobuf/internal/filetype package
start building.
3 years ago
Daniel Martí b6dee63b32 improve handling of reflect on foreign unnamed types
If a package A imports package B, and uses reflect.TypeOf on an unnamed
struct type B.T (such as an alias), we don't want to record B.T's fields
as "do not obfuscate". This is for the same reason that we don't if B.T
is a named struct type: the detection only works for the package
defining the type, as otherwise it's inconsistent.

We failed to handle this case well, because we assumed all struct types
would be under some named type. This is not the case for type aliases.

Fortunately, struct fields are named, and as such they are objects.
Check their package too, just like we do for named types.

Fixes another build error when obfuscating the protobuf module.
We add a simplified version of the example above as a test case,
which originated from debugging the protobuf build failure.
3 years ago
Daniel Martí d8de5a4306 avoid reproducibility issues with full rebuilds
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:

	/tmp/garble-shared123/encode.go.456.go

where "123" and "456" are random numbers, usually longer.

This was usually fine for two reasons:

1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
   directory and its random number would be invisible.

2) We would add "//line" directives to the source files, replacing
   the filename with obfuscated versions excluding any random number.

Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.

Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.

To fix the issues above, the same "encoding/json/encode.go" would now
end up as:

	/tmp/garble-shared123/encoding/json/encode.go

Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.

This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.

We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.

Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.

For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.

This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.

When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.

Fixes #328.
3 years ago
Daniel Martí d34406d832 clarify the status of the TOOLEXEC_IMPORTPATH hack
Now that upstream has merged our fix, it will ship with 1.17.
We'll be able to remove this entire chunk of code soon enough.
3 years ago
Daniel Martí c9b0b07853 hash field names equally in all packages
Packages P1 and P2 can define identical struct types T1 and T2, and one
can convert from type T1 to T2 or vice versa.

The spec defines two identical struct types as:

	Two struct types are identical if they have the same sequence of
	fields, and if corresponding fields have the same names, and
	identical types, and identical tags. Non-exported field names
	from different packages are always different.

Unfortunately, garble broke this: since we obfuscated field names
differently depending on the package, cross-package conversions like the
case above would result in typechecking errors.

To fix this, implement Joe Tsai's idea: hash struct field names with the
string representation of the entire struct. This way, identical struct
types will have their field names obfuscated in the same way in all
packages across a build.

Note that we had to refactor "reverse" a bit to start using transformer,
since now it needs to keep track of struct types as well.

This failure was affecting the build of google.golang.org/protobuf,
since it makes regular use of cross-package struct conversions.

Note that the protobuf module still fails to build, but for other
reasons. The package that used to fail now succeeds, so the build gets a
bit further than before. #240 tracks adding relevant third-party Go
modules to CI, so we'll track the other remaining failures there.

Fixes #310.
3 years ago
Andrew LeFevre b3db7d6fa7 fix obfuscating linkname directives that where the package name contained a dot 3 years ago
Andrew LeFevre 907aebd770 obfuscate local names in linkname directives
Previously, the local name part of linkname directives were never
obfuscated. Now that we can obfuscate function names in Go assembly,
that limitation is required no longer.

Fixes #309
3 years ago
Daniel Martí b4fc735a1e fix windows/arm cross-build linking
Obfuscating some std packages for windows/arm triggered a bug; when
encountering a call to runtime·memmove, we'd hash "memmove" with the
current package's action ID.

This is wrong on two levels: First, we aren't obfuscating the runtime
package yet. And second, if we did, we would have to hash the symbol
appropriately, with that package's action ID.

For now, only hashing the local names does the trick. That's all that
the code currently supports, anyway.
3 years ago
Daniel Martí 05d35350cf record types into ignoreObjects more reliably
Our previous logic only took care of fairly simple types, such as a
simple struct or a pointer to a struct. If we had a struct embedding
another struct, we'd fail to record the objects for the fields in the
inner struct, and that would lead to miscompilation:

	> garble build
	[stderr]
	# test/main
	LZmt64Nm.go:7: outer.InnerField undefined (type *CcUt1wkQ.EmbeddingOuter has no field or method InnerField)

To fix this issue, make the function that records all objects under a
types.Type smarter. Since it now does more than just dealing with
structs, it's also renamed.

Since the function now walks types properly, we get to remove the extra
ast.Inspect in recordReflectArgs, which is nice.

We also make it a method, to avoid the map parameter. A boolean
parameter is also added, since we need this feature to only look at the
current package when looking at reflect calls.

Finally, we add a test case, a simplified version of the original bug
report.

Fixes #315.
3 years ago
Daniel Martí 2fad0e1583 wrap types.Importer to canonicalize import paths
The docs for go/importer.ForCompiler say:

	The lookup function is called each time the resulting importer
	needs to resolve an import path. In this mode the importer can
	only be invoked with canonical import paths (not relative or
	absolute ones); it is assumed that the translation to canonical
	import paths is being done by the client of the importer.

We use a lookup func for two reasons: first, to support modules, and
second, to be able to use our information from "go list -json -export".

However, go/types does not canonicalize import paths before calling
ImportFrom. This is somewhat understandable; it doesn't know whether an
importer was created with a lookup func, and ImportFrom only requires
the input path to be canonicalized in that scenario. When the lookup
func is nil, the importer canonicalizes by itself via go/build.Import.

Before this change, the added crossbuild test would fail:

	> garble build net/http
	[stderr]
	# vendor/golang.org/x/crypto/chacha20
	typecheck error: /usr/lib/go/src/vendor/golang.org/x/crypto/chacha20/chacha_generic.go:10:2: could not import crypto/cipher (can't find import: "crypto/cipher")
	# vendor/golang.org/x/text/secure/bidirule
	typecheck error: /usr/lib/go/src/vendor/golang.org/x/text/secure/bidirule/bidirule.go:12:2: could not import errors (can't find import: "errors")
	# vendor/golang.org/x/crypto/cryptobyte
	typecheck error: /usr/lib/go/src/vendor/golang.org/x/crypto/cryptobyte/asn1.go:8:16: could not import encoding/asn1 (can't find import: "encoding/asn1")
	# vendor/golang.org/x/text/unicode/norm
	typecheck error: /usr/lib/go/src/vendor/golang.org/x/text/unicode/norm/composition.go:7:8: could not import unicode/utf8 (can't find import: "unicode/utf8")

This is because we'd fall back to importer.Default, which only knows how
to find packages in $GOROOT/pkg. Those are missing for cross-builds,
unsurprisingly, as those built archives end up in the build cache.

After this change, we properly support importing std-vendored packages,
so we can get rid of the importer.Default workaround. And, by extension,
cross-builds now work as well.

Note that, in the added test script, the full build of the binary fails,
as there seems to be some sort of linker problem:

	> garble build
	[stderr]
	# test/main
	d9rqJyxo.uoqIiDs5: relocation target runtime.os9A16A3 not defined

We leave that as a TODO for now, as this change is subtle enough as it
is.
3 years ago
lu4p 1a9fdb4e8e
Fix calls to linkname functions (#314) 3 years ago
Daniel Martí 3afc993266 use "go env -json" to collect env info all at once
In the worst case scenario, when GOPRIVATE isn't set at all, we would
run these three commands:

* "go env GOPRIVATE", to fetch GOPRIVATE itself
* "go list -m", for GOPRIVATE's fallback
* "go version", to check the version of Go being used

Now that we support Go 1.16 and later, all these three can be obtained
via "go env -json":

	$ go env -json GOPRIVATE GOMOD GOVERSION
	{
		"GOMOD": "/home/mvdan/src/garble/go.mod",
		"GOPRIVATE": "",
		"GOVERSION": "go1.16.3"
	}

Note that we don't get the module path directly, but we can use the
x/mod/modfile Go API to parse it from the GOMOD file cheaply.

Notably, this also simplifies our Go version checking logic, as now we
get just the version string without the "go version" prefix and
"GOOS/GOARCH" suffix we don't care about.

This makes our code a bit more maintainable and robust. When running a
short incremental build, we can also see a small speed-up, as saving two
"go" invocations can save a few milliseconds:

	name           old time/op       new time/op       delta
	Build/Cache-8        168ms ± 0%        166ms ± 1%  -1.26%  (p=0.009 n=6+6)

	name           old bin-B         new bin-B         delta
	Build/Cache-8        6.36M ± 0%        6.36M ± 0%  +0.12%  (p=0.002 n=6+6)

	name           old sys-time/op   new sys-time/op   delta
	Build/Cache-8        222ms ± 2%        219ms ± 3%    ~     (p=0.589 n=6+6)

	name           old user-time/op  new user-time/op  delta
	Build/Cache-8        857ms ± 1%        846ms ± 1%  -1.31%  (p=0.041 n=6+6)
3 years ago
Daniel Martí 24d5ff362c fix a regression involving imported linkname funcs
In ce2c45440a, we simplified the code a bit and removed one call to
obfuscatedTypesPackage.

Unfortunately, we introduced a regression; if an exported function is
linknamed to another symbol name, and it's called from an importer
package, we would have a build failure now:

	> garble build
	[stderr]
	# test/main
	ZiOACuw7.go:1: undefined: ODC0xN52.BaDqbhkj

This is because the imported package would not hash the original name,
via its ignoreObjects logic. And, since the importer package has no
access to that knowledge, it would hash the same name, and fail to find
it in the final build.

The regression happened because we used to have a types.Scope Lookup
that saved us in this scenario. Add the test, and re-add the Lookup,
this time only for this particular scenario with function names.

Thanks to Andrew LeFevre for reporting and describing the test case.

While at it, replace more uses of "garbled" to "obfuscated".
3 years ago
Daniel Martí 5de519694a CI: pin a commit when testing against Go tip
Since it changes rapidly, especially during merge cycles, and we don't
want CI to surprisingly blow up in our faces from one day to another.

Pin this to a commit from yesterday which works, since some changes
merged today moved where the Go build version is recorded and broke
garble.

While at it, replace "git clone" with a wget of a source archive. This
is much, much faster, mainly because a tarball is significantly smaller.
We now download about 20MiB instead of over 350MiB.

One downside is that, without git, make.bash can't construct a devel
version on its own. For that reason, add a pretty basic manual version
via the VERSION file.

This means that we must not reject custom devel version strings. This is
a good thing anyway, because custom devel strings are already common
when building Go in custom ways. Those people tend to be advanced users,
such as CI, so fall back to assuming they know what they are doing and
don't error.

Plus, starting last week, devel versions in Go master now contain the
major Go version like in build tags, such as "go1.17-commit...", so we
will soon start relying on that instead of parsing dates:

	$ go version
	go version devel go1.17-a7e16abb22 Thu Apr 8 07:33:58 2021 +0000 linux/amd64
3 years ago
Daniel Martí d38dfd4e90 make garble work on Go tip again
Just two minor tweaks were necessary to get "go test" to pass on:

	go version devel go1.17-a25c584629 Tue Apr 6 04:48:09 2021 +0000 linux/amd64

Re-enable the CI for it, too. The config needed changing since the
set-env and add-path commands now use special files instead, due to some
security issues uncovered last winter.

It's possible that CI on master could suddenly break, if Go master
changes in some substantial way that requires more tweaks. If that turns
out to be an issue pretty often, we could always pin a specific git repo
commit and update it every few weeks.
3 years ago
Daniel Martí b995c1b589
obfuscate literals as part of transformGo (#299)
This is easier to understand, since now the modification of the
*ast.File is all within a single chunk of code. We can also simplify
literals.Obfuscate to work on a single file, as transformGo runs in a
loop.

We also remove the "use receiver" TODOs, since the code is now in a
different package and it can't declare methods on a type here.
3 years ago
Daniel Martí 081b69eec2 join the import-rewrite ast.Inspect with transformGo
Walking the entire Go file again seems unnecessary, when we have an
astutil.Apply just next to it.

We could add to its "pre" function, but it's a bit easier to use the
"post" instead, which is empty.

Another advantage here is that astutil.Apply is more powerful than
ast.Inspect, which can come in handy in the future.
3 years ago
Daniel Martí e94b8e3750 make "help" refuse arguments for now
Since they are otherwise silently ignored.
3 years ago
Daniel Martí 664f834906 document "garble reverse"
Fixes #5.
3 years ago
Daniel Martí fe095ef132
handle unknown flags in reverse (#290)
While at it, expand the tests for build and test too.
3 years ago
Daniel Martí 1a8e32227f
improve "reverse" even further (#289)
Fix up a few TODOs, and simplify the way we handle comments.

We now add whitespace around inline /*line*/ directives, to ensure we
don't break programs. A test case is added too.

We now add line directives to call sites, not function declarations,
since those are what actually shows up in stack traces.
It's unclear if we care about any other lines inside functions at all.
This also fixes reversing with -literals, since that feature adds a
significant amount of code which shuffles line numbers around.

Finally, we extend the tests with types, methods, and anonymous
functions, and we make all of them work well.

Updates #5.
3 years ago
Daniel Martí ce2c45440a
simplify the uses of obfuscatedTypesPackage (#284)
obfuscatedTypesPackage is now only useful for one scenario: a type which
is used with reflection in a dependency, so neither its name nor any of
its potential struct field names are obfuscated.

This is because we can only detect the use of reflection with go/ast,
which we don't have for dependencies.

As such, we only need obfuscatedTypesPackage in two places - when
considering to obfuscate a field or a type name.

There were two other calls to the function, which we remove.

The first was used for linkname directives. Those directives only work
for variables and functions, neither of which is affected by the
reflection detection.

The second was used for all identifiers, at the very end of the
transformGo inner func. It's entirely unnecessary right now, as it never
triggers anymore. It's possible it was necessary some time ago when we
still didn't obfuscate assembly functions.

While at it, improve some comments and add a few TODOs for edge cases
which do not have code coverage.
3 years ago
Daniel Martí a1f11fb231 add a writeTemp helper func
We had two nearly identical copies of the code to open a temp file with
CreateTemp, write to it, and handle closing properly. Unify them.

With the added docs this isn't exactly a net win, but we want the long
funcs such as transformCompile to be easy to follow, and this helps.
3 years ago
Daniel Martí 13e4ba2ae0 use "obfuscate" instead of "garble" in some more places
Mainly comments. "garble" refers to the tool, but the verb and adjective
is more intuitive as "obfuscate" and "obfuscated" instead of "garble"
and "garbled".
3 years ago
Daniel Martí 961daf20c4
rework the position obfuscator (#282)
First, rename line_obfuscator.go to position.go. We obfuscate filenames,
not just line numbers, and "obfuscator" is a bit redundant.

Second, use "/*line :x*/" comments rather than the "//line :x" form, as
the former allows us to insert them in any position without adding
unnecessary newlines. This will be important for changing the position
of call sites, which will be important for "garble reverse".

Third, do not rely on go/ast to remove and add comments. Since they are
free-floating, we can very easily end up with misplaced comments,
especially as the literal obfuscator heavily modifies the AST.

The new method prints and re-parses the file, to ensure all node
positions are consistent with a buffer, buf1. Then, we copy the contents
into a new buffer, buf2, while inserting the comments that we need.

The new method also modifies line numbers at the very end of obfuscating
a Go file, instead of at the very beginning. That's going to be more
robust long-term, as we will also obfuscate line numbers for any
additions or modifications to the AST.

Fourth, detachedDirectives is unnecessary, as we can accomplish the same
with two simple prefix matches.

Finally, this means we can stop using detachedComments entirely, as
printFile already inserts the comments we need.

For #5.
3 years ago
Daniel Martí ea19e39aa4 use hashWith for obfuscation position information
Position information was obfuscated with math/rand manually, which meant
that the resulting positions were pretty small like "x.go:34", but they
were also very hard to reverse due to their short length and difficulty
to reproduce.

We now hash them with hashWith and the package's GarbleActionID:

	"main.go:203" hashed with 933ad1c700755b7c3a9913c55cade1 to "mwu1xuNz.go"

The input to the hash is the base filename and the byte offset of the
declaration within the file, meaning that it's unique within a package.
The output filename is long enough to allow easy reversal.

The line number is always 1, since the information needed for reversing
is contained entirely within the filename. It doesn't really matter if
we encode data in the filename or line number, but it's easier for us to
use a string.

For #5.
3 years ago
Daniel Martí 5d74ab07f5 all: replace uses of the deprecated ioutil
Now that we require Go 1.16, we can simplify code by removing ioutil.
3 years ago
Daniel Martí a328a487f8 improve the code comments around -seed
Explain why we print the random seed on failure, and why we accept
base64 padding when we don't use it.

While at it, the flagOptions.Random field is unused, so remove it. It
also seems wrong for it to exist; the random value is only for the seed
flag, which already has a field of its own.
3 years ago
Daniel Martí f3ea42230a remove the last remaining -debugdir buffer
We were still printing files into a buffer, which got left behind from
when we used to store the obfuscated source in object files.

All that code is gone, so this buffer is now just wasting CPU cycles and
memory.
3 years ago
Daniel Martí 9312938650
remove obsolete TODOs (#274)
obfuscatedTypesPackage can't be outright removed, since we still do
require knowledge of what was obfuscated for one reason: types used with
reflection. Since we can only figure that out with go/ast and not just
go/types, we rely on the original compilation to tell us that
information.

IgnoreFuncBodies=true for typechecking the original source code would be
nice, as we would save time, but ultimately it doesn't work. When we
rename top-level declarations such as functions and types, we also need
to amend their references in func bodies. We depend on type information
for that.

Finally, we've been randomizing filenames for a while now. Randomizing
the order of the files doesn't seem to be useful.
3 years ago
Daniel Martí 748c6a0538
obfuscate asm function names as well (#273)
Historically, it was impossible to rename those funcs as the
implementation was in assembly files, and we only transformed Go code.

Now that transformAsm exists, it's only about fifty lines to do some
very basic parsing and rewriting of assembly files.

This fixes the obfuscated builds of multiple std packages, including a
few dependencies of net/http, since they included assembly funcs which
called pure Go functions. Those pure Go functions had their names
obfuscated, breaking the call sites in assembly.

Fixes #258.
Fixes #261.
3 years ago
Daniel Martí ffeea469e6
remove a couple of pieces of dead code (#272)
We no longer cache the obfuscated source code for -debugdir, so
obfSrcArchive is entirely useless at this point. We likely didn't notice
because the Go compiler doesn't realise it's unused.

symAbis is also unnecessary. It used to be necessary as we could only
collect the action ID from transformCompile in the second asm
invocation. Since we no longer need the temporary file dance with
transformCompile, we can simplify that code too.
3 years ago
Daniel Martí def9351b25
fix and re-enable "garble test" (#268)
With the many refactors building up to v0.1.0, we broke "garble test" as
we no longer dealt with test packages well.

Luckily, now that we can depend on TOOLEXEC_IMPORTPATH, we can support
the test command again, as we can always figure out what package we're
currently compiling, without having to track a "main" package.

Note that one major pitfall there is test packages, where
TOOLEXEC_IMPORTPATH does not agree with ImportPath from "go list -json".
However, we can still work around that with a bit of glue code, which is
also copiously documented.

The second change necessary is to consider test packages private
depending on whether their non-test package is private or not. This can
be done via the ForTest field in "go list -json".

The third change is to obfuscate "_testmain.go" files, which are the
code-generated main functions which actually run tests. We used to not
need to obfuscate them, since test function names are never obfuscated
and we used to not obfuscate import paths at compilation time. Now we do
rewrite import paths, so we must do that for "_testmain.go" too.

The fourth change is to re-enable test.txt, and expand it with more
sanity checks and edge cases.

Finally, document "garble test" again.

Fixes #241.
3 years ago
Daniel Martí 4e9ee17ec8
refactor "current package" with TOOLEXEC_IMPORTPATH (#266)
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.

Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.

To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.

The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.

Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.

isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".

Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.

Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.

While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
3 years ago
Daniel Martí ff0bea73b5
all: drop support for Go 1.15.x (#265)
This mainly cleans up the few bits of code where we explicitly kept
support for Go 1.15.x. With v0.1.0 released, we can drop support now,
since the next v0.2.0 release will only support Go 1.16.x.

Also updates all modules, including test ones, to 'go 1.16'.

Note that the TOOLEXEC_IMPORTPATH refactor is not done here, despite all
the TODOs about doing so when we drop 1.15 support. This is because that
refactor needs to be done carefully and might have side effects, so it's
best to keep it to a separate commit.

Finally, update the deps.
3 years ago
Daniel Martí 2a9c0b7bf4
prepare for the first release (#264)
First, write a changelog file. We will use GitHub releases, but the
content in those is not stored in git nor is it portable or machine
readable. The canonical place for the changelog is here.

Second, disable 'garble test', as it is entirely broken. Issue #241
tracks fixing and re-enabling it, which will most likely happen for the
next release.

Third, disable the undocumented 'garble list'. This was added as part of
'garble reverse', but it never got used. I can't think of any reason why
any end user would prefer it over 'go list', either.

'garble reverse' remains enabled, but undocumented as it isn't fully
functional yet. Until it supports position information, it's not
particularly useful to end users. But it's not broken either, so it can
remain where it is.

Fourth, update the '-tiny' size reduction numbers in the README. Since
we removed the in-place modification of object files, we are no longer
able to do such an aggressive stripping of info. Garble itself drops in
size by 2%, so replace the old 6-10% estimate by 2-5%. We probably will
gain some of this back in the near future.

Finally, fix the indentation formatting of the README to consistently
use tabs.
3 years ago
Daniel Martí 1267e2eced
fix link errors when importing crypto/ecdsa (#262)
First, we had some link errors such as:

	cannot find package J6OzO8GN (using -importcfg)

This was caused by the code that writes an updated importcfg, which did
not handle import maps well. That code is now fixed, and we also add an
obfuscatedImportPath method for clarity.

Once fixed, we ran into other link errors:

	Pw3g97ww.addVW: relocation target Pw3g97ww.addVWlarge not defined

After some digging, the cause of those is assembly code that we do not
yet support obfuscating. #261 tracks that.

Meanwhile, to fix "GOPRIVATE=* garble build" and to be able to have a
test for the original import path bug, we add the packages which use
that form of assembly code to runtimeRelated - math/big and
crypto/sha512. There might be more, but these were the ones found by
trying to link crypto/tls, a fairly common dependency.

Fixes #256.
3 years ago
Daniel Martí 99887c13c2 ignore -ldflags=-X flags mentioning unknown packages
That would panic, since the *listedPackage would be nil for a package
path we aren't aware of:

	panic: runtime error: invalid memory address or nil pointer dereference
	[signal SIGSEGV: segmentation violation code=0x1 addr=0x88 pc=0x126b57d]

	goroutine 1 [running]:
	main.transformLink.func1(0x7ffeefbff28b, 0x5d)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:1260 +0x17d
	main.flagValueIter(0xc0000a8e20, 0x2f, 0x2f, 0x12e278e, 0x2, 0xc000129e28)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:1410 +0x1e9
	main.transformLink(0xc0000a8e20, 0x30, 0x36, 0x4, 0xc000114648, 0x23, 0x12dfd60, 0x0)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:1241 +0x1b9
	main.mainErr(0xc0000a8e10, 0x31, 0x37, 0x37, 0x0)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:287 +0x389
	main.main1(0xc000096058)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:150 +0xe7
	main.main()
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:83 +0x25

The linker ignores such unknown references, so we should too.

Fixes #259.
3 years ago
Daniel Martí b03cd08c09
avoid one more call to 'go tool buildid' (#253)
We use it to get the content ID of garble's binary, which is used for
both the garble action IDs, as well as 'go tool compile -V=full'.

Since those two happen in separate processes, both used to call 'go tool
buildid' separately. Store it in the gob cache the first time, and reuse
it the second time.

Since each call to cmd/go costs about 10ms (new process, running its
many init funcs, etc), this results in a nice speed-up for our small
benchmark. Most builds will take many seconds though, so note that a
~15ms speedup there will likely not be noticeable.

While at it, simplify the buildInfo global, as now it just contains a
map representation of the -importcfg contents. It now has better names,
docs, and a simpler representation.

We also stop using the term "garbled import", as it was a bit confusing.
"obfuscated types.Package" is a much better description.

	name     old time/op       new time/op       delta
	Build-8        106ms ± 1%         92ms ± 0%  -14.07%  (p=0.010 n=6+4)

	name     old bin-B         new bin-B         delta
	Build-8        6.60M ± 0%        6.60M ± 0%   -0.01%  (p=0.002 n=6+6)

	name     old sys-time/op   new sys-time/op   delta
	Build-8        208ms ± 5%        149ms ± 3%  -28.27%  (p=0.004 n=6+5)

	name     old user-time/op  new user-time/op  delta
	Build-8        433ms ± 3%        384ms ± 3%  -11.35%  (p=0.002 n=6+6)
3 years ago
Daniel Martí 6898d61637
start using original action IDs (#251)
When we obfuscate a name, what we do is hash the name with the action ID
of the package that contains the name. To ensure that the hash changes
if the garble tool changes, we used the action ID of the obfuscated
build, which is different than the original action ID, as we include
garble's own content ID in "go tool compile -V=full" via -toolexec.

Let's call that the "obfuscated action ID". Remember that a content ID
is roughly the hash of a binary or object file, and an action ID
contains the hash of a package's source code plus the content IDs of its
dependencies.

This had the advantage that it did what we wanted. However, it had one
massive drawback: when we compile a package, we only have the obfuscated
action IDs of its dependencies. This is because one can't have the
content ID of dependent packages before they are built.

Usually, this is not a problem, because hashing a foreign name means it
comes from a dependency, where we already have the obfuscated action ID.
However, that's not always the case.

First, go:linkname directives can point to any symbol that ends up in
the binary, even if the package is not a dependency. So garble could
only support linkname targets belonging to dependencies. This is at the
root of why we could not obfuscate the runtime; it contains linkname
directives targeting the net package, for example, which depends on runtime.

Second, some other places did not have an easy access to obfuscated
action IDs, like transformAsm, which had to recover it from a temporary
file stored by transformCompile.

Plus, this was all pretty expensive, as each toolexec sub-process had to
make repeated calls to buildidOf with the object files of dependencies.
We even had to use extra calls to "go list" in the case of indirect
dependencies, as their export files do not appear in importcfg files.

All in all, the old method was complex and expensive. A better mechanism
is to use the original action IDs directly, as listed by "go list"
without garble in the picture.

This would mean that the hashing does not change if garble changes,
meaning weaker obfuscation. To regain that property, we define the
"garble action ID", which is just the original action ID hashed together
with garble's own content ID.

This is practically the same as the obfuscated build ID we used before,
but since it doesn't go through "go tool compile -V=full" and the
obfuscated build itself, we can work out *all* the garble action IDs
upfront, before the obfuscated build even starts.

This fixes all of our problems. Now we know all garble build IDs
upfront, so a bunch of hacks can be entirely removed. Plus, since we
know them upfront, we can also cache them and avoid repeated calls to
"go tool buildid".

While at it, make use of the new BuildID field in Go 1.16's "list -json
-export". This avoids the vast majority of "go tool buildid" calls, as
the only ones that remain are 2 on the garble binary itself.

The numbers for Go 1.16 look very good:

	name     old time/op       new time/op       delta
	Build-8        146ms ± 4%        101ms ± 1%  -31.01%  (p=0.002 n=6+6)

	name     old bin-B         new bin-B         delta
	Build-8        6.61M ± 0%        6.60M ± 0%   -0.09%  (p=0.002 n=6+6)

	name     old sys-time/op   new sys-time/op   delta
	Build-8        321ms ± 7%        202ms ± 6%  -37.11%  (p=0.002 n=6+6)

	name     old user-time/op  new user-time/op  delta
	Build-8        538ms ± 4%        414ms ± 4%  -23.12%  (p=0.002 n=6+6)
3 years ago
Daniel Martí 09e244986e formally add support for Go 1.16
This was pretty much just fixing the README and closing the issue. The
only other noteworthy user-facing change is that, if the Go version is
detected to be too old, we now suggest 1.16.x instead of 1.15.x.

While at it, refactor goversion.txt a bit. I wanted it to print a
clearer "mocking the go build" error if another command was used like
"go build", but I didn't want to learn BAT. So, instead use a simple Go
program and build it, which will work on all platforms. The added
"go build" step barely takes 100ms on my machine, given how simple the
program is.

The [short] line also doesn't seem necessary to me. The entire script
runs in under 200ms for me, so it's well within the realm of "short", at
least compared to many of the other test scripts.

Fixes #124.
3 years ago
Daniel Martí 2ee6604408
replace the -p path when assembling (#247)
The asm tool runs twice for a package with assembly. The second time it
does, the path given to the -p flag matters, just like in the compiler,
as we generate an object file.

We don't have a -buildid flag in the asm tool, so obtaining the action
ID to obfuscate the package path with is a bit tricky. We store it from
transformCompile, and read it from transformAsm. See the detailed docs
for more.

This was the last "skip" line in the tests due to Go 1.16. After all PRs
are merged, one last PR documenting that 1.16 is supported will be sent,
closing the issue for good.

It's unclear why this wasn't an issue in Go 1.15. My best guess is that
the ABI changes only happened in Go 1.16, and this causes exported asm
funcs to start showing up in object files with their package paths.

Updates #124.
3 years ago