Commit Graph

343 Commits (d2beda1f0016a11b0b0a364eff06a153eb4a448a)

Author SHA1 Message Date
Daniel Martí 79b6e4db3d skip unnecessary "refusing to list" test errors
The added test case reproduces the failure if we uncomment the added
"continue" line in processImportCfg:

	# test/bar/exporttest [test/bar/exporttest.test]
	panic: refusing to list non-dependency package: test/bar/exporttest

	goroutine 1 [running]:
	mvdan.cc/garble.processImportCfg({0xc000166780?, 0xc0001f4a70?, 0x2?})
		/home/mvdan/src/garble/main.go:983 +0x58b
	mvdan.cc/garble.transformCompile({0xc000124020?, 0x11?, 0x12?})
		/home/mvdan/src/garble/main.go:736 +0x338

It seems like a quirk of cmd/go that it includes a redundant packagefile
line in this particular edge case, but it's generally harmless for "go
build". For "garble build" it's also harmless in principle, but in
practice we had sanity checks that got upset by the unexpected line.

For now, notice the edge case and ignore it.

Fixes #522.
2 years ago
Daniel Martí e3a59eae07 add missing context to two unmarshal errors
Returning a json or gob error directly to the user is generally not
helpful, as it lacks any form of context.
For example, from the json unmarshal of "go env -json":

	$ garble build
	invalid character 'w' looking for beginning of value

Also improve the error when the user ran garble in the wrong way,
resulting in no shared gob file. The context is now shorter, and we also
include the os.Open error in case it contains any useful details.

While here, apply Go tip's gofmt, which reformatted a godoc list.

For #523.
2 years ago
Daniel Martí 20ff64128e make KnownReflectAPIs aware of variadic funcs
When a function definition is variadic,
the number of parameters may not match the number of calling arguments.
Our existing code was a bit naive with this edge case,
leading to a panic when indexing call.Args.

Keep track of that information, and properly handle all variadic
arguments that may be used indirectly with reflection.

We add a test case that used to panic, where 0 arguments are used for a
variadic parameter, as well as a case where we need to disable
obfuscation for a Go type used as the second variadic argument rather
than the first.

Finally, the old code in findReflectFunctions looked at Go function
types from the point of view of go/ast.FuncType.
Use go/types.Signature instead, where we don't need to deal with the
grouping of variables in the original syntax, and which is more
consistent with the rest of the garble codebase.

Fixes #524.
2 years ago
pagran c8e0abf9c9 Fix removing named imports and fix removing imports with init() methods 2 years ago
Daniel Martí 6a39ad2d81 make "garble version" include VCS information
When someone builds garble from a git clone,
the resulting binary used to not contain any information:

	$ garble version
	(devel)

Since Go 1.18, VCS information is stamped by default into binaries.
We now print it, alongside any other available build settings:

	$ garble version
	mvdan.cc/garble (devel)

	Build settings:
	       -compiler gc
	     CGO_ENABLED 1
	          GOARCH amd64
	            GOOS linux
	         GOAMD64 v3
	             vcs git
	    vcs.revision 91ea246349
	        vcs.time 2022-03-18T13:45:11Z
	    vcs.modified true

Note that it's still possible for a garble build to contain no useful
version information, such as when built via "go build -buildvcs=false".
However, if a user opts into omitting the information, it's on them to
figure out what version of garble they actually built.

While here, bump test-gotip.

Fixes #491.
2 years ago
Daniel Martí 1c564ef091 slightly improve code thanks to Go 1.18 APIs
strings.Cut makes some string handling code more intuitive.
Note that we can't use it everywhere, as some places need LastIndexByte.

Start using x/exp/slices, too, which is our first use of generics.
Note that its API is experimental and may still change,
but since we are not a library, we can control its version updates.

I also noticed that we were using TrimSpace for importcfg files.
It's actually unnecessary if we swap strings.SplitAfter for Split,
as the only whitespace present was the trailing newline.

While here, I noticed an unused copy of printfWithoutPackage.
2 years ago
lu4p 237e0b7b7c Detect unnecessary imports instead of hardcoding 2 years ago
lu4p 1a0b028db7 all: drop support for Go 1.17
Now that we've released v0.6.0, that will be the last feature release to
feature support for Go 1.17. The upcoming v0.7.0 will be Go 1.18+.

Code-wise, the cleanup here isn't super noticeable,
but it will be easier to work on features like VCS-aware version
information and generics support without worrying about Go 1.17.
Plus, now CI is back to being much faster.

Note how "go 1.18" in go.mod makes "go mod tidy" more aggressive.
2 years ago
lu4p d555639657 Remove unused imports via go/types.
Fixes #481
2 years ago
Daniel Martí 0b6769c807 remove duplicate go:generate directive
I hadn't noticed that cmd/bundle prints its own go:generate directive.
I guess that makes sense for the average user running it directly,
but that doesn't apply to us, and we end up with duplicate directives.

Before:

	$ go generate -n
	bundle -o cmdgo_quoted.go -prefix cmdgoQuoted cmd/internal/quoted
	go run golang.org/x/tools/cmd/bundle@v0.1.9 -o cmdgo_quoted.go -prefix cmdgoQuoted cmd/internal/quoted

After:

	$ go generate -n
	go run golang.org/x/tools/cmd/bundle@v0.1.9 -o cmdgo_quoted.go -prefix cmdgoQuoted cmd/internal/quoted
	sed -i /go:generate/d cmdgo_quoted.go

While here, I made a typo in the last release notes, because of course.
I already edited that out in the GitHub release.
2 years ago
Daniel Martí ab39ee804d fail if the current Go version is newer than what built garble
For instance, Go 1.18 added support for generics, so its compiler output
files changed format to accomodate for the new language feature.
If garble is built with Go 1.17 and then used to perform builds on Go
1.18, it will fail in a very confusing way, because garble's go/types
and go/importer packages will not know how to deal with that.

As already discussed in #269, require the version that built the garble
binary to be equal or newer. In that thread we discussed only comparing
the major version, so for example garble built on go1.18 could be used
on the toolchain go1.18.5. However, that could still fail in confusing
ways if a fix to go/types or go/importer happened in a point release.

While here, I noticed that we were still using Go 1.17 for some CI
checks. Fix that, except for staticcheck.

Fixes #269.
2 years ago
Daniel Martí 434de2e472 make early errors count towards code coverage
I recently added TODOs for bits of code we should cover in the tests.
I was looking at that again just now, and was puzzled;
we do indeed have test cases for many of them already.

We just weren't counting them towards code coverage due to a bug.
errJustExit works as expected, except that it calls os.Exit directly,
whereas testscript wants a non-zero return to run its "after" code.
Part of that code is what handles joining code coverage files.

The total code coverage jumps from 86.2% to 87.6%.
2 years ago
Daniel Martí 91ea246349 minor code tidying up
We don't need to nest the RemoveAll and MkdirAll calls for debugdir.

Fill the string for -toolexec= when we actually append it to goArgs.

Consistently use the objectString type alias.

Remove unnecessary parameter names in transformFuncs.

Unindent the code for switch variables by reversing the conditional.
2 years ago
Daniel Martí cd797e6e95 add a few TODOs with uncovered code that should not be
This will hopefully give new contributors a place to start.
Some of these, like verifying that the "help" commands work,
will be relatively simple additions to our test scripts.
2 years ago
Daniel Martí 8b55dd4bd2 work around a build cache regression in the previous commit
The added comment in main.go explains the situation in detail.
The added test is a minimization of the scenario, which failed:

        > cd mod1
        > garble -seed=${SEED1} build -v gopkg.in/garbletest.v2
        > cd ../mod2
        > garble -seed=${SEED1} build -v
        [stderr]
        test/main/mod2
        # test/main/mod2
        cannot load garble export file for gopkg.in/garbletest.v2: open […]/go-build/ed/[…]-garble-ZV[…]-d: no such file or directory

To work around the problem, we'll always add each package's
GarbleActionID to its build artifact, even when not using -seed.
This will get us the previous behavior with regards to the build cache,
meaning that we fix the recent regression.
The added variable doesn't make it to the final binary either.

While here, improve the cached file loading error's context,
and add an extra sanity check for duplicates on ListedPackages.
2 years ago
Daniel Martí c1c90fee13 make obfuscation fully deterministic with -seed
The default behavior of garble is to seed via the build inputs,
including the build IDs of the entire Go build of each package.
This works well as a default, and does give us determinism,
but it means that building for different platforms
will result in different obfuscation per platform.

Instead, when -seed is provided, don't use any other hash seed or salt.
This means that a particular Go name will be obfuscated the same way
as long as the seed, package path, and name itself remain constant.

In other words, when the user supplies a custom -seed,
we assume they know what they're doing in terms of storage and rotation.

Expand the README docs with more examples and detail.

Fixes #449.
2 years ago
Daniel Martí cf0351bdf5 remove ErrExist check on -debugdir's RemoveAll
This code initially used os.Stat, where it made sense to use
os.IsNotExist to catch whether the directory didn't exist.

However, we later transitioned to os.RemoveAll, which will never return
neither ErrExist nor ErrNotExist:

	If the path does not exist, RemoveAll returns nil (no error).

Simplify the code.
2 years ago
Daniel Martí bd3f950799 add a TODO about a possible cache bug with -literals and -ldflags
I spent some time trying to reproduce the bug but couldn't,
so for now, make a detailed note for it.
We can come back to it if we actually run into it in the future.

Fixes #492.
2 years ago
Daniel Martí 88a27d491b add support for -ldflags using quotes
In particular, using -ldflags with -

In particular, a command like:

	garble -literals build -ldflags='-X "main.foo=foo bar"'

would fail, because we would try to use "\"main" as the package name for
the -X qualified name, with the leading quote character.

This is because we used strings.Split(ldflags, " ").
Instead, use the same quoted.Split that cmd/go uses,
copied over thanks to x/tools/cmd/bundle and go:generate.

Updates #492.
2 years ago
Daniel Martí d8f6f308bd clarify how each "cannot obfuscate" map works
We used to record all objects in cannotObfuscateNames,
and then we'd add the exported ones to KnownCannotObfuscate.

Instead, teach recordAsNotObfuscated to store each object in either
knownCannotObfuscateUnexported or KnownCannotObfuscate, but not both.
The former isn't cached so it uses in-memory pointers as keys,
and the latter uses the cross-process objectStrings like before.

Functionally, this is all the same, but with the difference that the map
indexed by types.Object will not contain objects already recorded in
KnownCannotObfuscate, reducing the amount of duplicate memory use.

While here, give recordIgnore a less ambiguous name,
and remove the second parameter as it was always tf.pkg.Path().
This also means we can compare *types.Package pointers directly.

Finally, add more TODOs for further improvement ideas.
It does mean that we end up with more TODOs than before,
even though I'm fixing one, but I reckon that's a good thing.
Recording these ideas can give first-time contributors ways to help,
and it ensures I don't forget about ideas just in my head.
2 years ago
Daniel Martí a9a721e352 concentrate and simplify "to obfuscate" logic
Back in the day, we used to call toObfuscate anytime we needed to know
whether a package should be obfuscated.
More recently, we started computing via the ToObfuscate field,
which then gets shared with all sub-processes via sharedCache.

We still had two places that directly called toObfuscate.
Replace those with ToObfuscate, and inline toObfuscate into shared.go.

obfuscatedImportPath is also a potential footgun for main packages.
Some use cases always want the original "main" package name,
such as for use in the compiler's "-p main" flag,
while other cases want the obfuscated package import path,
such as the entries in importcfg files.

Since each of these call sites handles the edge case well,
obfuscatedImportPath now panics on main packages to avoid any misuse.

Finally, test that we never leak main package paths via ldflags.txt.
We never did, but it's good to make sure.

Overall, this avoids confusion and trims the size of main.go a bit.
2 years ago
Daniel Martí 34f85e3286 remove TODO for tf.pkg.Imports
I tried to fix the TODO, but ran into the problem described by the added
documentation - that some packages in the import graph are incomplete,
as go/types was clever and didn't fully load them.

While here, also make the panics a bit more descriptive,
which helped me debug what was going wrong after the attempted refactor.
2 years ago
Daniel Martí 70b1cb2fd8 CI: start enforcing vet and staticcheck
Fix a staticcheck warning about unused code,
as well as an unparam warning and a missing copyright header.

We also bump the action versions to their latest releases,
and drop unnecessary "name" fields for self-describing steps.

Note that we drop the "go env" commands, as setup-go does that now.

Finally, I did briefly try to add caching,
but then realised it didn't help us at all. Document why.
2 years ago
Daniel Martí 345ecda999 implement TODO to use a name variable
Slightly simplifies the main chunk of code.
While here, improve the wording for when Go isn't installed correctly.
2 years ago
Daniel Martí 955c24856c properly record when type aliases are embedded as fields
There are two scenarios when it comes to embedding fields.
The first is easy, and we always handled it well:

	type Named struct { Foo int }

	type T struct { Named }

In this scenario, T ends up with an embedded field named "Named",
and a promoted field named "Foo".

Then there's the form with a type alias:

	type Named struct { Foo int }

	type Alias = Named

	type T struct { Alias }

This case is different: T ends up with an embedded field named "Alias",
and a promoted field named "Foo".
Note how the field gets its name from the referenced type,
even if said type is just an alias to another type.

This poses two problems.
First, we must obfuscate the field T.Alias as the name "Alias",
and not as the name "Named" that the alias points to.
Second, we must be careful of cases where Named and Alias are declared
in different packages, as they will obfuscate the same name differently.

Both of those problems compounded in the reported issue.
The actual reason is that quic-go has a type alias in the form of:

	type ConnectionState = qtls.ConnectionState

In other words, the entire problem boils down to a type alias which
points to a named type in a different package, where both types share
the same name. For example:

	package parent

	import "parent/p1"

	type T struct { p1.SameName }

	[...]

	package p1

	import "parent/p2"

	type SameName = p2.SameName

	[...]

	package p2

	type SameName struct { Foo int }

This broke garble because we had a heuristic to detect when an embedded
field was a type alias:

	// Instead, detect such a "foreign alias embed".
	// If we embed a final named type,
	// but the field name does not match its name,
	// then it must have been done via an alias.
	// We dig out the alias's TypeName via locateForeignAlias.
	if named.Obj().Name() != node.Name {

As the reader can deduce, this heuristic would incorrectly assume that
the snippet above does not embed a type alias, when in fact it does.
When obfuscating the field T.SameName, which uses a type alias,
we would correctly obfuscate the name "SameName",
but we would incorrectly obfuscate it with the package p2, not p1.
This would then result in build errors.

To fix this problem for good, we need to get rid of the heuristic.
Instead, we now mimic what was done for KnownCannotObfuscate,
but for embedded fields which use type aliases.
KnownEmbeddedAliasFields is now filled for each package
and stored in the cache as part of cachedOutput.
We can then detect the "embedded alias" case reliably,
even when the field is declared in an imported package.

On the plus side, we get to remove locateForeignAlias.
We also add a couple of TODOs to record further improvements.
Finally, add a test.

Fixes #466.
2 years ago
Daniel Martí 91d4a8b6af start reporting total allocs by garble in the benchmark
This is like allocs/op by testing.B.ReportAllocs,
but it combines all allocations by garble sub-processes.
As we currently generate quite a bit of garbage,
and reductions in it may only reduce time/op very slowly,
this new metric will help us visualize small improvements.

The regular ReportAllocs would not help us at all,
as the main process simply executes "garble build".

We remove user-ns/op to make space for mallocs/op,
and also since it's a bit redundant given sys-ns/op and time-ns/op.

	name      time/op
	Build-16      9.20s ± 1%

	name      bin-B
	Build-16      5.16M ± 0%

	name      cached-time/op
	Build-16      304ms ± 4%

	name      mallocs/op
	Build-16      30.7M ± 0%

	name      sys-time/op
	Build-16      4.78s ± 4%
2 years ago
Daniel Martí 4c3b90c051 stop loading obfuscated type information from deps
If package P1 imports package P2, P1 needs to know which names from P2
weren't obfuscated. For instance, if P2 declares T2 and does
"reflect.TypeOf(T2{...})", then P2 won't obfuscate the name T2, and
neither should P1.

This information should flow from P2 to P1, as P2 builds before
P1. We do this via obfuscatedTypesPackage; P1 loads the type information
of the obfuscated version of P2, and does a lookup for T2. If T2 exists,
then it wasn't obfuscated.

This mechanism has served us well, but it has downsides:

1) It wastes CPU; we load the type information for the entire package.

2) It's complex; for instance, we need KnownObjectFiles as an extra.

3) It makes our code harder to understand, as we load both the original
   and obfuscated type informaiton.

Instead, we now have each package record what names were not obfuscated
as part of its cachedOuput file. Much like KnownObjectFiles, the map
records incrementally through the import graph, to avoid having to load
cachedOutput files for indirect dependencies.

We shouldn't need to worry about those maps getting large;
we only skip obfuscating declared names in a few uncommon scenarios,
such as the use of reflection or cgo's "//export".

Since go/types is relatively allocation-heavy, and the export files
contain a lot of data, we get a nice speed-up:

	name      old time/op         new time/op         delta
	Build-16          11.5s ± 2%          11.1s ± 3%  -3.77%  (p=0.008 n=5+5)

	name      old bin-B           new bin-B           delta
	Build-16          5.15M ± 0%          5.15M ± 0%    ~     (all equal)

	name      old cached-time/op  new cached-time/op  delta
	Build-16          375ms ± 3%          341ms ± 6%  -8.96%  (p=0.008 n=5+5)

	name      old sys-time/op     new sys-time/op     delta
	Build-16          283ms ±17%          289ms ±13%    ~     (p=0.841 n=5+5)

	name      old user-time/op    new user-time/op    delta
	Build-16          687ms ± 6%          664ms ± 7%    ~     (p=0.548 n=5+5)

Fixes #456.
Updates #475.
2 years ago
Daniel Martí 8652271db2 slightly simplify how we deal with linknamed runtime deps
Obfuscating the runtime only needs to list the linknamed packages,
and doesn't need to know about their dependencies directly.

Refactor the script to return a "flat" list that includes all packages
we need, except those that we know the runtime already pulled in.

This allows us to simplify the script and avoid passing -deps to cmd/go.
Performance is unaffected, but I reckon it's worthwhile given how much
we simplified the script.

Longer term, it's also best to avoid using -deps when we don't need it,
as cmd/go could avoid computing information we don't need.

	name              old time/op       new time/op       delta
	Build/NoCache-16        1.68s ± 1%        1.68s ± 0%    ~     (p=1.000 n=5+5)

	name              old bin-B         new bin-B         delta
	Build/NoCache-16        6.72M ± 0%        6.72M ± 0%  +0.01%  (p=0.008 n=5+5)

	name              old sys-time/op   new sys-time/op   delta
	Build/NoCache-16        1.88s ± 1%        1.89s ± 2%    ~     (p=0.548 n=5+5)

	name              old user-time/op  new user-time/op  delta
	Build/NoCache-16        19.9s ± 1%        19.8s ± 0%    ~     (p=0.421 n=5+5)
2 years ago
Andrew LeFevre 474a9194c8 obfuscate exported names in main
Currently exported names in the main package are not hashed, because they
might be a Go plugin API. We don't currently support Go plugins though,
so hash them anyway.

Updates #87.
2 years ago
Daniel Martí c9341790d4 avoid obfuscating literals set via -ldflags=-X
The -X linker flag sets a string variable to a given value,
which is often used to inject strings such as versions.

The way garble's literal obfuscation works,
we replace string literals with anonymous functions which,
when evaluated, result in the original string.

Both of these features work fine separately,
but when intersecting, they break. For example, given:

	var myVar = "original"
	[...]
	-ldflags=-X=main.myVar=replaced

The -X flag effectively replaces the initial value,
and -literals adds code to be run at init time:

	var myVar = "replaced"
	func init() { myVar = func() string { ... } }

Since the init func runs later, -literals breaks -X.
To avoid that problem,
don't obfuscate literals whose variables are set via -ldflags=-X.

We also leave TODOs about obfuscating those in the future,
but we're also leaving regression tests to ensure we get it right.

Fixes #323.
2 years ago
Daniel Martí 96b15e0ac5 support GOGARBLE=* with -literals again
We recently made an important change when obfuscating the runtime,
so that if it's missing any linkname packages in ListedPackages,
it does an extra "go list" call to obtain their information.

This works very well, but we missed an edge case.
In main.go, we disable flagLiterals for the runtime package,
but not for other packages like sync/atomic.

And, since the runtime's extra "go list" has to compute GarbleActionIDs,
it uses the list of garble flags via appendFlags.
Unfortunately, it thinks "-literals" isn't set, when it is,
and the other packages see it as being set.

This discrepancy results in link time errors,
as each end of the linkname obfuscates with a different hash:

	> garble -literals build
	[stderr]
	# test/main
	jccGkbFG.(*yijmzGHo).String: relocation target jccGkbFG.e_77sflf not defined
	jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.ce66Fmzl not defined
	jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.e5kPa1qY not defined
	jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.aQ_3sL3Q not defined
	jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.zls3wmws not defined
	jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.g69WgKIS not defined

To fix the problem, treat flagLiterals as read-only after flag.Parse,
just like we already do with the other flags except flagDebugDir.
The code that turned flagLiterals to false is no longer needed,
as literals.Obfuscate is only called when ToObfuscate is true,
and ToObfuscate is false for runtimeAndDeps already.
2 years ago
Daniel Martí 321fa85441 remove the use of -buildinfo=false
It looks like the flag will be scrapped from Go 1.18.
Stop using it before 1.18rc1 releases without it.

See: https://github.com/golang/go/issues/50501#issuecomment-1010225207
3 years ago
Daniel Martí d25e718d0c stop passing ignoreObjects to literals.Obfuscate
Literal obfuscation uses constant folding now,
so it no longer needs to record identifiers to ignore.
Remove the parameter and the outdated bit of docs.
3 years ago
Daniel Martí c506f02763 handle --long build flags properly
cmd/go treats "--foo=bar" juts like "-foo=bar",
just like any other program using the flag package.

However, we didn't support this longer form in filterForwardBuildFlags.
Because of it, "garble build -tags=foo" worked,
but "garble build --tags=foo" did not,
as we wouldn't forward "--tags=foo" as a build flag for "go list".

Fixes #429.
3 years ago
Daniel Martí ff803004d0 avoid build ID mismatches when using -debugdir
A recent change added the -debugdir value to addGarbleToHash,
which is part of the hash seed for all obfuscation taking place.

In principle, it was okay to add, just like any other garble flag.
In practice, it broke the added test case:

	> garble -debugdir ./debug1 build
	[stderr]
	# test/main
	FBi9xa6e.(*ac0bCOhR).String: relocation target FBi9xa6e.rV55e6H9 not defined
	qmECK6zf.init.0: relocation target qmECK6zf.eUU08z98 not defined
	[...]

This is because -debugdir gets turned into an absolute path,
and not all garble processes ended up using it consistently.

The fix is rather simple; since -debugdir doesn't affect obfuscation,
don't include it in the build hash seeding at all.

Fixes #451.
3 years ago
Daniel Martí 34cbd1b841 only list missing packages when obfuscating the runtime
We were listing all of std, which certainly worked,
but was quite slow at over 200 packages.
In practice, we can only be missing up to 20-30 packages.
It was a good change as it fixed a severe bug,
but it also introduced a fairly noticeable slow-down.

The numbers are clear; this change shaves off multiple seconds when
obfuscating the runtime with a cold cache:

	name              old time/op       new time/op       delta
	Build/NoCache-16        5.06s ± 1%        1.94s ± 1%  -61.64%  (p=0.008 n=5+5)

	name              old bin-B         new bin-B         delta
	Build/NoCache-16        6.70M ± 0%        6.71M ± 0%   +0.05%  (p=0.008 n=5+5)

	name              old sys-time/op   new sys-time/op   delta
	Build/NoCache-16        13.4s ± 2%         5.0s ± 2%  -62.45%  (p=0.008 n=5+5)

	name              old user-time/op  new user-time/op  delta
	Build/NoCache-16        60.6s ± 1%        19.8s ± 1%  -67.34%  (p=0.008 n=5+5)

Since we only want to call "go list" one extra time,
instead of once for every package we find out we're missing,
we want to know what packages we could be missing in advance.
Resurrect a smarter version of the runtime-related script.

Finally, remove the runtime-related.txt test script,
as it has now been superseeded by the sanity checks in listPackage.
That is, obfuscating the runtime package will now panic if we are
missing any necessary package information.

To double check that we get the runtime's linkname edge case right,
make gogarble.txt use runtime/debug.WriteHeapDump,
which is implemented via a direct runtime linkname.
This ensures we don't lose test coverage from runtime-related.txt.
3 years ago
Daniel Martí 4e97811a62 give a useful error for "garble build -tiny"
We've had confused users a handful of times by now.
And it's reasonable to expect flags to be after the command,
as that's how flags work for cmd/go itself.

I don't think we want to mix our flags with Go's,
or start accepting flags in either place.
Both seem like worse solutions long-term, as they can add confusion.

However, we can quickly give a useful hint when a flag is misplaced.
That should get new users unblocked without asking for help.

We use a regular expression for this purpose,
because it doesn't seem like a FlagSet supports what we need;
to detect whether an argument is one of our flags,
without actually applying its value to the flagset.
Our flagset would also error on Go's flags, which we don't want.
3 years ago
Daniel Martí 5abd3c468d update cmd/go flags for 1.18
As of 1.18beta1. I used bash commands like:

	diff -u <(go1.17.5 help build) <(gotip help build)

While here, supply -buildinfo=false and -buildvcs=false to go build.
We already remove that information by discarding the _gomod_.go file,
but we might as well pass the flags too.
If anything, it lets the toolchain avoid the work entirely.
Note that we can't use these flags on Go 1.17 for now, though.

Add a TODO that came to mind while writing this, too.
3 years ago
Daniel Martí 93b2873c28 ensure the runtime is built in a reproducible way
We went to great lengths to ensure garble builds are reproducible.
This includes how the tool itself works,
as its behavior should be the same given the same inputs.

However, we made one crucial mistake with the runtime package.
It has go:linkname directives pointing at other packages,
and some of those pointed packages aren't its dependencies.

Imagine two scenarios where garble builds the runtime package:

1) We run "garble build runtime". The way we handle linkname directives
   calls listPackage on the target package, to obfuscate the target's
   import path and object name. However, since we only obtained build
   info of runtime and its deps, calls for some linknames such as
   listPackage("sync/atomic") will fail. The linkname directive will
   leave its target untouched.

2) We run "garble build std". Unlike the first scenario, all listPackage
   calls issued by runtime's linkname directives will succeed, so its
   linkname directive targets will be obfuscated.

At best, this can result in inconsistent builds, depending on how the
runtime package was built. At worst, the mismatching object names can
result in errors at link time, if the target packages are actually used.

The modified test reproduces the worst case scenario reliably,
when the fix is reverted:

	> env GOCACHE=${WORK}/gocache-empty
	> garble build -a runtime
	> garble build -o=out_rebuild ./stdimporter
	[stderr]
	# test/main/stdimporter
	JZzQivnl.NtQJu0H3: relocation target JZzQivnl.iioHinYT not defined
	JZzQivnl.NtQJu0H3.func9: relocation target JZzQivnl.yz5z0NaH not defined
	JZzQivnl.(*ypvqhKiQ).String: relocation target JZzQivnl.eVciBQeI not defined
	JZzQivnl.(*ypvqhKiQ).PkgPath: relocation target JZzQivnl.eVciBQeI not defined
	[...]

The fix consists of two steps. First, if we're building the runtime and
listPackage fails on a package, that means we ran into scenario 1 above.
To avoid the inconsistency, we fill ListedPackages with "go list [...] std".
This means we'll always build runtime as described in scenario 2 above.

Second, when building packages other than the runtime,
we only allow listPackage to succeed if we're listing a dependency of
the current package.
This ensures we won't run into similar reproducibility bugs in the future.

Finally, re-enable test-gotip on CI since this was the last test flake.
3 years ago
Daniel Martí bd0705a536 amend redirecting URLs via "xurls -fix" 3 years ago
Daniel Martí 06d5972223
add -debug to aid in debugging (#431)
Not really meant for end users,
but they might still debug failures before filing bugs.

We add the -debug flag itself,
as well as machinery to deduplicate output lines.
There are quite a lot of them otherwise,
which aren't helpful and simply add noise.

In the future, if we always want to output a debug log line,
such as "choosing not to obfuscate here because X",
we can simply insert the unique position string.

Finally, turn all commented-out log.Printf calls to debugf.
Improve a few log lines to be more human-friendly,
and also add a few extras like how long it takes to load files.

We can improve the logging further in the future.
This seems like a good starting point.
3 years ago
Daniel Martí 5f74a1c9f0 unify the definition and storage of flag values
The parent garble process parses the original flags,
as provided by the user via the command line.
Previously, those got stored in the shared cache file,
so that child processes spawned by toolexec could see them.

Unfortunately, this made the code relatively easy to misuse.
A child process would always see flagLiterals as zero value,
given that it should never see such a flag argument directly.
Similarly, one would have to be careful with cached options,
as they could only be consumed after the cache file is loaded.

Simplify the situation by deduplicating the storage of flags.
Now, the parent passes all flags onto children via toolexec.

One exception is GarbleDir, which now becomes an env var.
This seems in line with other top-level dirs like GARBLE_SHARED.

Finally, we turn -seed into a flag.Value,
which lets us implement its "set" behavior as part of flag.Parse.

Overall, we barely reduce the amount of code involved,
but we certainly remove a couple of footguns.
As part of the cleanup, we also introduce appendFlags.
3 years ago
Daniel Martí a144789910 only obfuscate literals in packages to obfuscate
Add a regression test in gogarble.txt,
as that test is already set up with packages to not obfuscate.

This bug manifested in the form of a build failure for GOOS=plan9
with -literals turned on:

	[...]/os/file_plan9.go:151:12: invalid operation: cannot call non-function append (variable of type bool)

In this case, the "os" package is not to be obfuscated,
but we would still obfuscate its literals as per the bug.

But, since the package's identifiers were not obfuscated,
names like "append" were not replaced as per ea2e0bdf71,
meaning that the shadowing would still affect us.

Fixes #417.
3 years ago
Daniel Martí fceb19f6da
deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:

* People tend to obfuscate private code, whose package paths will
  generally be in GOPRIVATE already

* Its meaning and syntax are well understood

* It allows all the flexibility we need without adding our own env var
  or config option

However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
However, using "GOPRIVATE=*" will result in two main downsides:

* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
  Downloading modules, such as when adding or updating dependencies,
  or when the local cache is cold, can be less reliable.

* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
  Adding entries to go.sum, such as when adding or updating dependencies,
  can be less secure.

We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.

While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.

Fixes #276.
3 years ago
lu4p a645929151
obfuscate literals via constant folding
Constants don't need to be added to ignoreObjs anymore,
because go/types now does this work for us.

Fixes #360
3 years ago
Daniel Martí b5bef981ee
stop relying on nested "go list -toolexec" calls (#422)
We rely on importcfg files to load type info for obfuscated packages.
We use this type information to remember what names we didn't obfuscate.
Unfortunately, indirect dependencies aren't listed in importcfg files,
so we relied on extra "go list -toolexec" calls to locate object files.

This worked fine, but added a significant amount of complexity.
The extra "go list -export -toolexec=garble" invocations weren't slow,
as they avoided rebuilding or re-obfuscating thanks to the build cache.
Still, it was hard to reason about how garble runs during a build
if we might have multiple layers of -toolexec invocations.

Instead, record the export files we encounter in an incremental map,
and persist it in the build cache via the gob file we're already using.
This way, each garble invocation knows where all object files are,
even those for indirect imports.

One wrinkle is that importcfg files can point to temporary object files.
In that case, figure out its final location in the build cache.
This requires hard-coding a bit of knowledge about how GOCACHE works,
but it seems relatively harmless given how it's very little code.
Plus, if GOCACHE ever changes, it will be obvious when our code breaks.

Finally, add a TODO about potentially saving even more work.
3 years ago
Daniel Martí d5d1131b75 keep importmap entries in the right direction
For packages that we alter, we parse and modify the importcfg file.
Parsing is necessary so we can locate obfuscated object files,
which we use to remember what identifiers were obfuscated.
Modifying the files is necessary when we obfuscate import paths,
and those import paths have entries in an importcfg file.

However, we made one crucial mistake when writing the code.
When handling importmap entries such as:

	importmap golang.org/x/net/idna=vendor/golang.org/x/net/idna

we would name the two sides beforePath and afterPath, respectively.
They were added to importMap with afterPath as the key,
but when we iterated over the map to write a modified importcfg file,
we would then assume the key is beforepath.
All in all, we would end up writing the opposite direction:

	importmap vendor/golang.org/x/net/idna=golang.org/x/net/idna

This would ultimately result in the importmap never being useful,
and some rather confusing error messages such as:

	cannot find package golang.org/x/net/idna (using -importcfg)

Add a test case that reproduces this error,
and fix the code so it always uses beforePath as the key.
Note that we were also updating importCfgEntries with such entries.
I could not reproduce any failure when just removing that code,
nor could I explain why it was added there in the first place.
As such, remove that bit of code as well.

Finally, a reasonable question might be why we never noticed the bug.
In practice, such "importmap"s, represented as ImportMap by "go list",
only currently appear for packages vendored into the standard library.
Until very recently, we didn't support obfuscating most of std,
so we would usually not alter the affected importcfg files.
Now that we do parse and modify them, the bug surfaced.

Fixes #408.
3 years ago
Daniel Martí ea2e0bdf71
obfuscate all variable names, even local ones (#420)
In the added test case, "garble -literals build" would fail:

	--- FAIL: TestScripts/literals (8.29s)
		testscript.go:397:
			> env GOPRIVATE=test/main
			> garble -literals build
			[stderr]
			# test/main
			Usz1FmFm.go:1: cannot call non-function string (type int), declared at Usz1FmFm.go:1
			Usz1FmFm.go:1: string is not a type
			Usz1FmFm.go:1: cannot call non-function append (type int), declared at Usz1FmFm.go:1

That is, for input code such as:

	var append int
	println("foo")
	_ = append

We'd end up with obfuscated code like:

	var append int
	println(func() string {
		// obfuscation...
		x = append(x, ...)
		// obfuscation...
		return string(x)
	})
	_ = append

Which would then break, as the code is shadowing the "append" builtin.
To work around this, always obfuscate variable names, so we end up with:

	var mwu1xuNz int
	println(func() string {
		// obfuscation...
		x = append(x, ...)
		// obfuscation...
		return string(x)
	})
	_ = mwu1xuNz

This change shouldn't make the quality of our obfuscation stronger,
as local variable names do not currently end up in Go binaries.
However, this does make garble more consistent in treating identifiers,
and it completely avoids any issues related to shadowing builtins.

Moreover, this also paves the way for publishing obfuscated source code,
such as #369.

Fixes #417.
3 years ago
Daniel Martí caa9831a63
fail if we are unexpectedly overwriting files (#418)
While investigating a bug report,
I noticed that garble was writing to the same temp file twice.
At best, writing to the same path on disk twice is wasteful,
as the design is careful to be deterministic and use unique paths.
At worst, the two writes could cause races at the filesystem level.

To prevent either of those situations,
we now create files with os.OpenFile and os.O_EXCL,
meaning that we will error if the file already exists.
That change uncovered a number of such unintended cases.

First, transformAsm would write obfuscated Go files twice.
This is because the Go toolchain actually runs:

	[...]/asm -gensymabis [...] foo.s bar.s
	[...]/asm [...] foo.s bar.s

That is, the first run is only meant to generate symbol ABIs,
which are then used by the compiler.
We need to obfuscate at that first stage,
because the symbol ABI descriptions need to use obfuscated names.

However, having already obfuscated the assembly on the first stage,
there is no need to do so again on the second stage.
If we detect gensymabis is missing, we simply reuse the previous files.

This first situation doesn't seem racy,
but obfuscating the Go assembly files twice is certainly unnecessary.

Second, saveKnownReflectAPIs wrote a gob file to the build cache.
Since the build cache can be kept between builds,
and since the build cache uses reproducible paths for each build,
running the same "garble build" twice could overwrite those files.

This could actually cause races at the filesystem level;
if two concurrent builds write to the same gob file on disk,
one of them could end up using a partially-written file.

Note that this is the only of the three cases not using temporary files.
As such, it is expected that the file may already exist.
In such a case, we simply avoid overwriting it rather than failing.

Third, when "garble build -a" was used,
and when we needed an export file not listed in importcfg,
we would end up calling roughly:

	go list -export -toolexec=garble -a <dependency>

This meant we would re-build and re-obfuscate those packages.
Which is unfortunate, because the parent process already did via:

	go build -toolexec=garble -a <main>

The repeated dependency builds tripped the new os.O_EXCL check,
as we would try to overwrite the same obfuscated Go files.
Beyond being wasteful, this could again cause subtle filesystem races.
To fix the problem, avoid passing flags like "-a" to nested go commands.

Overall, we should likely be using safer ways to write to disk,
be it via either atomic writes or locked files.
However, for now, catching duplicate writes is a big step.
I have left a self-assigned TODO for further improvements.

CI on the pull request found a failure on test-gotip.
The failure reproduces on master, so it seems to be related to gotip,
and not a regression introduced by this change.
For now, disable test-gotip until we can investigate.
3 years ago
Daniel Martí 29356f30f7 update runtimeAndDeps for Go 1.18
In particular, internal/abi now has some actual code,
so obfuscating those literals was breaking as expected.
Document how to update the list in the future as well.

The change above gets "go test" to just one test failure on:

	go version devel go1.18-578ada410d Tue Nov 9 22:58:24 2021 +0000 linux/amd64

We also move the doc about why we disable GarbleLiterals,
so that it's next to where the disabling happens.

While here, we also rename GarbleLiterals to ObfuscateLiterals,
as we have been trying to move away from "to garble" as a verb.

Finally, limit the verbosity of diffoscope.
One test was failing for me, and diffoscope printed thousands of lines.
Not particularly useful when I'm trying to skim test results.
Usually, seeing a few dozen lines of output is enough.

Updates #385.
3 years ago
lu4p 3d19605782
Fix linkname directives with dots in importpath (#407)
Obfuscating newName arguments of linkname directives
with dots in the importpath didn't work before.

We had a test which covers this,
but the corresponding package wasn't actually obfuscated.
3 years ago
lu4p 88f238e558
Obfuscate more packages of the standard library (#312)
Also update linkname directives of public packages,
to allow the package where something is linknamed to to be
obfuscated regardless.

Public packages can now depend on private packages.
3 years ago
Daniel Martí 7ede21c981 drop support for Go 1.16.x
We can now use pruned module graphs in go.mod files,
and we no longer need to worry about runtime/internal/sys.

Note that I had to update testdata/mod slightly,
as the new pruned module graphs algorithm downloads an extra go.mod file.

This change also paves the way towards future Go 1.18 support.

Thanks to lu4p for cleaning up two TODOs as well.

Co-Authored-By: lu4p <lu4p@pm.me>
3 years ago
Daniel Martí 1682e8ee10 always require one argument for "reverse"
The "reverse" command had many levels of optional arguments:

	garble [garble flags] reverse [build flags] [package] [files]

This was pretty confusing,
and could easily lead to people running the command incorrectly:

	# note that output.txt isn't a Go package!
	garble reverse output.txt

Moreover, it made the handling of Go build flags pretty confusing.
Should the command below work?

	garble reverse -tags=mytag

It also made it easy to not notice that one must supply the main package
to properly reverse some text that it produced, like a panic message.
With the package path being implicit,
one could mistakenly provide the wrong package by running garble
in a directory containing a different package.

See #394.
3 years ago
Daniel Martí 08ec70e9a9 avoid a filesystem race with build cache entries
After the last commit, I started seeing seemingly random test failures:

    --- FAIL: TestScripts/cgo (1.17s)
        testscript.go:397:
            > env GOPRIVATE=test/main
            > garble build
            [stderr]
            # runtime/internal/math
            EOF
            # internal/bytealg
            EOF
            exit status 2
            [exit status 1]
            FAIL: testdata/scripts/cgo.txt:3: unexpected command failure

    --- FAIL: TestScripts/reflect (8.63s)
        testscript.go:397:
            > env GOPRIVATE=test/main
            > garble build
            [stderr]
            # math
            EOF
            exit status 2
            [exit status 1]
            FAIL: testdata/scripts/reflect.txt:3: unexpected command failure

    --- FAIL: TestScripts/linkname (8.72s)
        testscript.go:397:
            > env GOPRIVATE=test/main
            > garble build
            [stderr]
            # math
            EOF
            exit status 2
            [exit status 1]
            FAIL: testdata/scripts/linkname.txt:3: unexpected command failure

After some investigation,
it turned out that concurrent "garble build" processes
were writing to the same build cache paths at the same time.
This is effectively a filesystem race,
and encoding/gob could error when reading partly written files.

To fix this problem,
use a cache path that changes according to the obfuscated build.
See garbleExportFile.

Note that the data we store in that file does not vary with obfuscation.
We could fix the filesystem race by adding locking around the old path.
However, we'll want to cache data that does vary with garble flags,
such as the -debugdir source code.
3 years ago
lu4p aafd845418 More robust reflection detection
Functions which use reflection on one of their parameters are,
now added to knownReflectAPIs automatically.

This makes most explicit hints for reflection redundant.
Simple protobuf code now works correctly when obfuscated.

Fixes #162
Fixes #373
3 years ago
Daniel Martí 0eba9c210c document version command in help text
Back in February we added the version command,
similar to Go's own "go version",
but we failed to add a mention in the help text.
3 years ago
Daniel Martí 5c70681fee detect more std API calls which use reflection
Before, we would just notice direct calls to reflect's TypeOf and
ValueOf. Any other uses of reflection, such as encoding/json or
google.golang.org/protobuf, would require hints as documented by the
README.

Issue #162 outlines some ways we could fix this issue in a general way,
automatically detecting what functions use reflection on their parameters,
even for third party API funcs.

However, that goal is pretty significant in terms of code and effort.
As a temporary improvement, we can expand the list of "known" reflection
APIs via a static table.

Since this table is keyed by "func full name" strings, we could
potentially include third party APIs, such as:

	google.golang.org/protobuf/proto.Marshal

However, for now simply include all the std APIs we know about.
If we fail to do the proper fix for automatic detection in the future,
we can then fall back to expanding this global table for third parties.

Update the README's docs, to clarify that the hint is not always
necessary anymore.

Also update the reflect.txt test to stop using the hint for encoding/json,
and to also start testing text/template with a method call.
While at it, I noticed that we weren't testing the println outputs,
as they'd go to stderr - fix that too.

Updates #162.
3 years ago
Daniel Martí 04b4ae390e update build and test flag list for Go 1.16
Go 1.16 just added one flag, -overlay, as documented in
https://golang.org/doc/go1.16#go-command.

We forgot to update these tables for 1.16, though it probably doesn't
matter as this flag is pretty much only used by gopls.
3 years ago
Daniel Martí d338be9941 give a usage example for combining flags
Users are still filing bugs about this, presumably because the docs
aren't clear enough.

Fixes #372.
3 years ago
Daniel Martí c9e15e18cf make the positioning of Go flags clearer in the help text
We did say "garble flags before the command" towards the end,
but let's also make that clearer at the top.

For #342.
3 years ago
Daniel Martí ec0bdc4012 keep cgo-exported Go names non-obfuscated
Otherwise, the added test case would fail, as we don't modify the C code
and so there would be a name mismatch.

In the far future we might start modifying Go names in C code,
similar to what we did for Go assembly,
but right now that seems out of scope and too complex.

An easier fix is to simply record those (hopefully few) names in ignoreObjects.

While at it, recordReflectArgs started to really outgrow its name, as it
also collected expressions used as constants for literal obfuscation.
Give it a better name.

Fixes #366.
3 years ago
Daniel Martí 5f8bae06b7 small improvements towards obfuscating the runtime
I spent a couple of days trying to obfuscate all of std.
Ultimately I failed at making it fully work,
especially when it comes to the runtime package,
but I did fix a few problems along the way, as seen here.

First, fix the TODO to allow handleDirectives and transformGo to run on
runtime packages as well, if they are considered private. Note that this
is never true right now, but it matters once we remove runtimeRelated.

Second, modify parsedebugvars in a way that doesn't break typechecking.
We can remove AST nodes or even modify them in simple ways,
but if we add new AST nodes after typechecking,
those will lack type information.

We were replacing the entire body, running into that problem.
Instead, carefully cut the body to set some defaults,
but remove everything from the point GODEBUG is read.

Finally, add commented-out debug prints of transformed asm files.

For #193.
3 years ago
Daniel Martí 5e3cdf89a8 update support for Go 1.17 in time for beta1
Back in early April we added initial support for Go 1.17,
working on a commit from master at that time. For that to work, we just
needed to add a couple of packages to runtimeRelated and tweak printFile
a bit to not break the new "//go:build" directives.

A significant amount of changes have landed since, though, and the tests
broke in multiple ways.

Most notably, the new register ABI is enabled by default for GOOS=amd64.
That affected garble indirectly in two ways: there's a new internal
package to add to runtimeRelated, and we must make reverse.txt more
clever in making its output constant across ABIs.

Another noticeable change is that Go 1.17 changes how its own version is
injected into the runtime package. It used to be via a constant in
runtime/internal/sys, such as:

	const TheVersion = `devel ...`

Since we couldn't override such constants via the linker's -X flag,
we had to directly alter the declaration while compiling.

Thankfully, Go 1.17 simply uses a "var buildVersion string" in the
runtime package, and its value is injected by the linker.
This means we can now override it with the linker's -X flag.

We make the code to alter TheVersion for Go 1.16 a bit more clever,
to not break the package when building with Go 1.17.

Finally, our hack to work around ambiguous TOOLEXEC_IMPORTPATH values
now only kicks in for non-test packages, since Go 1.17 includes our
upstream fix. Otherwise, some tests would end up with the ".test"
variant suffix added a second time:

	test/bar [test/bar.test] [test/bar [test/bar.test].test]

All the code to keep compatibility with Go 1.16.x remains in place.
We're still leaving TODOs to remind ourselves to remove it or simplify
it once we remove support for 1.16.x.

The 1.17 development freeze has already been in place for a month,
and beta1 is due to come this week, so it's unlikely that Go will change
in any considerable way at this point. Hence, we can say that support
for 1.17 is done.

Fixes #347.
3 years ago
Daniel Martí 1d31a139f5 support aliases as embedded fields in dependencies
Our recent work in fieldToAlias worked well when the embedded field
declaration (using an alias) was in the same package as the use of that
field. We would have the *ast.Ident for the field declaration, so
types.Info.Uses would give us the TypeName for the alias.

Unfortunately, if the declaration was in a dependency package, we did
not have that same *ast.Ident, as we weren't parsing the source code for
dependencies for type-checking. This resulted in us incorrectly
obfuscating the use of such an embedded field:

	> garble build
	[stderr]
	# test/main
	JtzmzxWf.go:4: unknown field 'ExternalForeignAlias' in struct literal of type _BdSNiEL.Vcs_smer

To fix this, look through the direct imports of the package defining the
field to find an alias under the exact same name. Not a foolproof
solution, as there's a TODO, but it should work for most cases.

Fixes the obfuscation of google.golang.org/grpc/internal/status, too.

Updates #349.
3 years ago
Daniel Martí 68b39e8195 fix a link issue when obfuscating complex cgo packages
The added test case, which is obfuscating and linking os/user, would fail
before this fix:

	> garble build
	[stderr]
	# test/main
	/usr/lib/go/pkg/tool/linux_amd64/link: running gcc failed: exit status 1
	/usr/bin/ld: $WORK/.tmp/go-link-073246656/go.o: in function `Chz0Yfs2._cgo_cmalloc':
	go.go:(.text+0x993cc): undefined reference to `Chz0Yfs2.runtime_throw'
	/usr/bin/ld: $WORK/.tmp/go-link-073246656/go.o: in function `Chz0Yfs2.tDfhQ8uK':
	go.go:(.text+0x99801): undefined reference to `Chz0Yfs2._cgo_runtime_gostring'
	/usr/bin/ld: go.go:(.text+0x9982a): undefined reference to `Chz0Yfs2._cgo_runtime_gostring'
	/usr/bin/ld: go.go:(.text+0x99853): undefined reference to `Chz0Yfs2._cgo_runtime_gostring'
	collect2: error: ld returned 1 exit status

The reason is that we would alter the linkname directives of cgo-generated
code, but we would not obfuscate the code itself at all.

The generated code would end up being transformed into:

	//go:linkname zh_oKZIy runtime.throw
	func runtime_throw(string)

One can clearly see the error there; handleDirectives obfuscated the
local linkname name, but since transformGo didn't run, the actual Go
declaration was not obfuscated in the same way. Thus, the linker fails
to find a function body for runtime_throw, and fails.

The solution is simple: handleDirectives assumes that it's running on
code being obfuscated, so only run it when transformGo is running.

We can also remove the cgo skip check in handleDirectives, as it never
runs on cgo-generated code now.

Fixes a number of build errors that have been noticed since
907aebd770.
3 years ago
Daniel Martí 64317883c9 handle aliases to foreign named types properly
When such an alias name was used to define an embedded field, we handled
that case gracefully via the code using:

	tf.info.Uses[node].(*types.TypeName)

Unfortunately, when the same field name was used elsewhere, such as a
composite literal, tf.Info.Uses gave us a *types.Var, not a
*types.TypeName, meaning we could no longer tell if this was an alias,
or what it pointed to.

Thus, we failed to obfuscate the name properly in the added test case:

	> garble build
	[stderr]
	# test/main/sub
	xxWZf66u.go:36: unknown field 'foreignAlias' in struct literal of type smhWelwn

It doesn't seem like any of the go/types APIs allows us to obtain the
*types.TypeName directly in this scenario. Thus, use a trick that we
used before: after typechecking, but before obfuscating, record all
embedded struct field *types.Var which are aliases via a map, where the
value holds the *types.TypeName for the alias.

Updates #349.
3 years ago
Daniel Martí e2ddce75a7 support embedding via embed.FS
We already added support for "//go:embed" with string and []byte,
by not obfuscating the "embed" import path.

However, embed.FS was still failing:

	> garble build
	[stderr]
	# test/main
	:13: go:embed cannot apply to var of type embed.WtKNvwbN

The compiler detects the type by matching its name to exactly "embed.FS",
so don't obfuscate the name "FS" either.

While at it, ensure that the embed code behaves the same with "go build".

Updates #349.
3 years ago
Daniel Martí 7fc424ca26 make "garble command -h" give command-specific help
Before, "garble build -h" would print the same as "garble -h", which is
too much and confusing, as it doesn't tell us much about "build".

Now it's far better, and includes the output of "go build -h":

	$ garble build -h
	usage: garble [garble flags] build [arguments]

	This command wraps "go build". Below is its help:

	usage: go build [-o output] [build flags] [packages]
	Run 'go help build' for details.

We do the same for "garble reverse -h", since it doesn't wrap a Go tool
command.
3 years ago
Daniel Martí 8edde922ee remove unused code spotted by -coverprofile
Remove some asthelper APIs that haven't been used for some time.
They can be recovered from the git history if needed again.

One type assertion in the literals package is always true.

Embedded field objects are handled near the top of transformGo, so the
extra !obj.Embedded() check was always true. Remove it.

We always obfuscate standalone funcs now, so the obfuscatedTypesPackage
check is no longer necessary. This was necessary when we used to not
obfuscate func names when they were used in linkname directives.

The workaround for test package imports in obfuscatedTypesPackage I had
to add a few commits ago no longer seems to be necessary. This might be
thanks to the simplification with functions in the paragraph just above.

It's impossible to run garble without -trimpath nowadays, as we error
before the build even starts:

	$ go build -toolexec=garble
	go tool compile: exit status 1
	cannot open shared file, this is most likely due to not running "garble [command]"

When run as "garble build", the trimpath flag is always set. So the
check in alterTrimpath never triggers anymore, and couldn't be tested.

Finally, simplify the handling of comment syntax in printFile, and add a
few TODOs for other code paths not covered by our existing tests.

Total code coverage is up from 90.3% to 91.0%.
3 years ago
Daniel Martí 65ff07875b obfuscate alias names like any other objects
Before this change, we would try to never obfuscate alias names. That
was far from ideal, as they can end up in field names via anonymous
fields.

Even then, we would sometimes still fail to build, because we would
inconsistently obfuscate alias names. For example, in the added test
case:

	--- FAIL: TestScripts/syntax (0.23s)
	    testscript.go:397:
	        > env GOPRIVATE='test/main,private.source'
	        > garble build
	        [stderr]
	        # test/main/sub
	        Lv_a8gRD.go:15: undefined: KCvSpxmQ

To fix this problem, we set obj to be the TypeName corresponding to the
alias when it is used as an embedded field. We can then make the right
choice when obfuscating the name.

Right now, all aliases will be obfuscated. A TODO exists about not
obfuscating alias names when they're used as embedded fields in a struct
type in the same package, and that package is used for reflection -
since then, the alias name ends up as the field name.

With these changes, the protobuf module now builds.
3 years ago
Daniel Martí 68f07389b2 fix a number of issues involving types from indirect imports
obfuscatedTypesPackage is used to figure out if a name in a dependency
package was obfuscated or not. For example, if that package used
reflection on a named type, it wasn't obfuscated, so we must have the
same information to not obfuscate the same name downstream.

obfuscatedTypesPackage could return nil if the package was indirectly
imported, though. This can happen if a direct import has a function that
returns an indirect type, or if a direct import exposes a name that's a
type alias to an indirect type.

We sort of dealt with this in two pieces of code by checking for
obfPkg!=nil, but a third one did not have this check and caused a panic
in the added test case:

	--- FAIL: TestScripts/reflect (0.81s)
	    testscript.go:397:
	        > env GOPRIVATE=test/main
	        > garble build
	        [stderr]
	        # test/main
	        panic: runtime error: invalid memory address or nil pointer dereference [recovered]
	        	panic: runtime error: invalid memory address or nil pointer dereference
	        [signal SIGSEGV: segmentation violation code=0x1 addr=0x20 pc=0x8a5e39]

More importantly though, the nil check only avoids panics. It doesn't
fix the root cause of the problem: that importcfg does not contain
indirectly imported packages. The added test case would still fail, as
we would obfuscate a type in the main package, but not in the indirectly
imported package where the type is defined.

To fix this, resurrect a bit of code from earlier garble versions, which
uses "go list -toolexec=garble" to fetch a package's export file. This
lets us fill the indirect import gaps in importcfg, working around the
problem entirely.

This solution is still not particularly great, so we add a TODO about
possibly rethinking this in the future. It does add some overhead and
complexity, though thankfully indirect imports should be uncommon.

This fixes a few panics while building the protobuf module.
3 years ago
Daniel Martí 654841e1fb skip reflection detection for sibling packages
Our allPkgs boolean logic was wrong, because it could still lead to
garble obfuscating a type when used in the main package, but not in its
defining package. The added test case shows such a case.

To fix that, use a package path to only record the named objects from
the target package, which is a narrower operation without this problem,
but still makes all our tests pass.

This makes the google.golang.org/protobuf/internal/filetype package
start building.
3 years ago
Daniel Martí b6dee63b32 improve handling of reflect on foreign unnamed types
If a package A imports package B, and uses reflect.TypeOf on an unnamed
struct type B.T (such as an alias), we don't want to record B.T's fields
as "do not obfuscate". This is for the same reason that we don't if B.T
is a named struct type: the detection only works for the package
defining the type, as otherwise it's inconsistent.

We failed to handle this case well, because we assumed all struct types
would be under some named type. This is not the case for type aliases.

Fortunately, struct fields are named, and as such they are objects.
Check their package too, just like we do for named types.

Fixes another build error when obfuscating the protobuf module.
We add a simplified version of the example above as a test case,
which originated from debugging the protobuf build failure.
3 years ago
Daniel Martí d8de5a4306 avoid reproducibility issues with full rebuilds
We were using temporary filenames for modified Go and assembly files.
For example, an obfuscated "encoding/json/encode.go" would end up as:

	/tmp/garble-shared123/encode.go.456.go

where "123" and "456" are random numbers, usually longer.

This was usually fine for two reasons:

1) We would add "/tmp/garble-shared123/" to -trimpath, so the temporary
   directory and its random number would be invisible.

2) We would add "//line" directives to the source files, replacing
   the filename with obfuscated versions excluding any random number.

Unfortunately, this broke in multiple ways. Most notably, assembly files
do not have any line directives, and it's not clear that there's any
support for them. So the random number in their basename could end up in
the binary, breaking reproducibility.

Another issue is that the -trimpath addition described above was only
done for cmd/compile, not cmd/asm, so assembly filenames included the
randomized temporary directory.

To fix the issues above, the same "encoding/json/encode.go" would now
end up as:

	/tmp/garble-shared123/encoding/json/encode.go

Such a path is still unique even though the "456" random number is gone,
as import paths are unique within a single build.

This fixes issues with the base name of each file, so we no longer rely
on line directives as the only way to remove the second original random
number.

We still rely on -trimpath to get rid of the temporary directory in
filenames. To fix its problem with assembly files, also amend the
-trimpath flag when running the assembler tool.

Finally, add a test that reproducible builds still work when a full
rebuild is done. We choose goprivate.txt for such a test as its
stdimporter package imports a number of std packages, including uses of
assembly and cgo.

For the time being, we don't use such a "full rebuild" reproducibility
test in other test scripts, as this step is expensive, rebuilding many
packages from scratch.

This issue went unnoticed for over a year because such random numbers
"123" and "456" were created when a package was obfuscated, and that
only happened once per package version as long as the build cache was
kept intact.

When clearing the build cache, or forcing a rebuild with -a, one gets
new random numbers, and thus a different binary resulting from the same
build input. That's not something that most users would do regularly,
and our tests did not cover that edge case either, until now.

Fixes #328.
3 years ago
Daniel Martí d34406d832 clarify the status of the TOOLEXEC_IMPORTPATH hack
Now that upstream has merged our fix, it will ship with 1.17.
We'll be able to remove this entire chunk of code soon enough.
3 years ago
Daniel Martí c9b0b07853 hash field names equally in all packages
Packages P1 and P2 can define identical struct types T1 and T2, and one
can convert from type T1 to T2 or vice versa.

The spec defines two identical struct types as:

	Two struct types are identical if they have the same sequence of
	fields, and if corresponding fields have the same names, and
	identical types, and identical tags. Non-exported field names
	from different packages are always different.

Unfortunately, garble broke this: since we obfuscated field names
differently depending on the package, cross-package conversions like the
case above would result in typechecking errors.

To fix this, implement Joe Tsai's idea: hash struct field names with the
string representation of the entire struct. This way, identical struct
types will have their field names obfuscated in the same way in all
packages across a build.

Note that we had to refactor "reverse" a bit to start using transformer,
since now it needs to keep track of struct types as well.

This failure was affecting the build of google.golang.org/protobuf,
since it makes regular use of cross-package struct conversions.

Note that the protobuf module still fails to build, but for other
reasons. The package that used to fail now succeeds, so the build gets a
bit further than before. #240 tracks adding relevant third-party Go
modules to CI, so we'll track the other remaining failures there.

Fixes #310.
3 years ago
Andrew LeFevre b3db7d6fa7 fix obfuscating linkname directives that where the package name contained a dot 3 years ago
Andrew LeFevre 907aebd770 obfuscate local names in linkname directives
Previously, the local name part of linkname directives were never
obfuscated. Now that we can obfuscate function names in Go assembly,
that limitation is required no longer.

Fixes #309
3 years ago
Daniel Martí b4fc735a1e fix windows/arm cross-build linking
Obfuscating some std packages for windows/arm triggered a bug; when
encountering a call to runtime·memmove, we'd hash "memmove" with the
current package's action ID.

This is wrong on two levels: First, we aren't obfuscating the runtime
package yet. And second, if we did, we would have to hash the symbol
appropriately, with that package's action ID.

For now, only hashing the local names does the trick. That's all that
the code currently supports, anyway.
3 years ago
Daniel Martí 05d35350cf record types into ignoreObjects more reliably
Our previous logic only took care of fairly simple types, such as a
simple struct or a pointer to a struct. If we had a struct embedding
another struct, we'd fail to record the objects for the fields in the
inner struct, and that would lead to miscompilation:

	> garble build
	[stderr]
	# test/main
	LZmt64Nm.go:7: outer.InnerField undefined (type *CcUt1wkQ.EmbeddingOuter has no field or method InnerField)

To fix this issue, make the function that records all objects under a
types.Type smarter. Since it now does more than just dealing with
structs, it's also renamed.

Since the function now walks types properly, we get to remove the extra
ast.Inspect in recordReflectArgs, which is nice.

We also make it a method, to avoid the map parameter. A boolean
parameter is also added, since we need this feature to only look at the
current package when looking at reflect calls.

Finally, we add a test case, a simplified version of the original bug
report.

Fixes #315.
3 years ago
Daniel Martí 2fad0e1583 wrap types.Importer to canonicalize import paths
The docs for go/importer.ForCompiler say:

	The lookup function is called each time the resulting importer
	needs to resolve an import path. In this mode the importer can
	only be invoked with canonical import paths (not relative or
	absolute ones); it is assumed that the translation to canonical
	import paths is being done by the client of the importer.

We use a lookup func for two reasons: first, to support modules, and
second, to be able to use our information from "go list -json -export".

However, go/types does not canonicalize import paths before calling
ImportFrom. This is somewhat understandable; it doesn't know whether an
importer was created with a lookup func, and ImportFrom only requires
the input path to be canonicalized in that scenario. When the lookup
func is nil, the importer canonicalizes by itself via go/build.Import.

Before this change, the added crossbuild test would fail:

	> garble build net/http
	[stderr]
	# vendor/golang.org/x/crypto/chacha20
	typecheck error: /usr/lib/go/src/vendor/golang.org/x/crypto/chacha20/chacha_generic.go:10:2: could not import crypto/cipher (can't find import: "crypto/cipher")
	# vendor/golang.org/x/text/secure/bidirule
	typecheck error: /usr/lib/go/src/vendor/golang.org/x/text/secure/bidirule/bidirule.go:12:2: could not import errors (can't find import: "errors")
	# vendor/golang.org/x/crypto/cryptobyte
	typecheck error: /usr/lib/go/src/vendor/golang.org/x/crypto/cryptobyte/asn1.go:8:16: could not import encoding/asn1 (can't find import: "encoding/asn1")
	# vendor/golang.org/x/text/unicode/norm
	typecheck error: /usr/lib/go/src/vendor/golang.org/x/text/unicode/norm/composition.go:7:8: could not import unicode/utf8 (can't find import: "unicode/utf8")

This is because we'd fall back to importer.Default, which only knows how
to find packages in $GOROOT/pkg. Those are missing for cross-builds,
unsurprisingly, as those built archives end up in the build cache.

After this change, we properly support importing std-vendored packages,
so we can get rid of the importer.Default workaround. And, by extension,
cross-builds now work as well.

Note that, in the added test script, the full build of the binary fails,
as there seems to be some sort of linker problem:

	> garble build
	[stderr]
	# test/main
	d9rqJyxo.uoqIiDs5: relocation target runtime.os9A16A3 not defined

We leave that as a TODO for now, as this change is subtle enough as it
is.
3 years ago
lu4p 1a9fdb4e8e
Fix calls to linkname functions (#314) 3 years ago
Daniel Martí 3afc993266 use "go env -json" to collect env info all at once
In the worst case scenario, when GOPRIVATE isn't set at all, we would
run these three commands:

* "go env GOPRIVATE", to fetch GOPRIVATE itself
* "go list -m", for GOPRIVATE's fallback
* "go version", to check the version of Go being used

Now that we support Go 1.16 and later, all these three can be obtained
via "go env -json":

	$ go env -json GOPRIVATE GOMOD GOVERSION
	{
		"GOMOD": "/home/mvdan/src/garble/go.mod",
		"GOPRIVATE": "",
		"GOVERSION": "go1.16.3"
	}

Note that we don't get the module path directly, but we can use the
x/mod/modfile Go API to parse it from the GOMOD file cheaply.

Notably, this also simplifies our Go version checking logic, as now we
get just the version string without the "go version" prefix and
"GOOS/GOARCH" suffix we don't care about.

This makes our code a bit more maintainable and robust. When running a
short incremental build, we can also see a small speed-up, as saving two
"go" invocations can save a few milliseconds:

	name           old time/op       new time/op       delta
	Build/Cache-8        168ms ± 0%        166ms ± 1%  -1.26%  (p=0.009 n=6+6)

	name           old bin-B         new bin-B         delta
	Build/Cache-8        6.36M ± 0%        6.36M ± 0%  +0.12%  (p=0.002 n=6+6)

	name           old sys-time/op   new sys-time/op   delta
	Build/Cache-8        222ms ± 2%        219ms ± 3%    ~     (p=0.589 n=6+6)

	name           old user-time/op  new user-time/op  delta
	Build/Cache-8        857ms ± 1%        846ms ± 1%  -1.31%  (p=0.041 n=6+6)
3 years ago
Daniel Martí 24d5ff362c fix a regression involving imported linkname funcs
In ce2c45440a, we simplified the code a bit and removed one call to
obfuscatedTypesPackage.

Unfortunately, we introduced a regression; if an exported function is
linknamed to another symbol name, and it's called from an importer
package, we would have a build failure now:

	> garble build
	[stderr]
	# test/main
	ZiOACuw7.go:1: undefined: ODC0xN52.BaDqbhkj

This is because the imported package would not hash the original name,
via its ignoreObjects logic. And, since the importer package has no
access to that knowledge, it would hash the same name, and fail to find
it in the final build.

The regression happened because we used to have a types.Scope Lookup
that saved us in this scenario. Add the test, and re-add the Lookup,
this time only for this particular scenario with function names.

Thanks to Andrew LeFevre for reporting and describing the test case.

While at it, replace more uses of "garbled" to "obfuscated".
3 years ago
Daniel Martí 5de519694a CI: pin a commit when testing against Go tip
Since it changes rapidly, especially during merge cycles, and we don't
want CI to surprisingly blow up in our faces from one day to another.

Pin this to a commit from yesterday which works, since some changes
merged today moved where the Go build version is recorded and broke
garble.

While at it, replace "git clone" with a wget of a source archive. This
is much, much faster, mainly because a tarball is significantly smaller.
We now download about 20MiB instead of over 350MiB.

One downside is that, without git, make.bash can't construct a devel
version on its own. For that reason, add a pretty basic manual version
via the VERSION file.

This means that we must not reject custom devel version strings. This is
a good thing anyway, because custom devel strings are already common
when building Go in custom ways. Those people tend to be advanced users,
such as CI, so fall back to assuming they know what they are doing and
don't error.

Plus, starting last week, devel versions in Go master now contain the
major Go version like in build tags, such as "go1.17-commit...", so we
will soon start relying on that instead of parsing dates:

	$ go version
	go version devel go1.17-a7e16abb22 Thu Apr 8 07:33:58 2021 +0000 linux/amd64
3 years ago
Daniel Martí d38dfd4e90 make garble work on Go tip again
Just two minor tweaks were necessary to get "go test" to pass on:

	go version devel go1.17-a25c584629 Tue Apr 6 04:48:09 2021 +0000 linux/amd64

Re-enable the CI for it, too. The config needed changing since the
set-env and add-path commands now use special files instead, due to some
security issues uncovered last winter.

It's possible that CI on master could suddenly break, if Go master
changes in some substantial way that requires more tweaks. If that turns
out to be an issue pretty often, we could always pin a specific git repo
commit and update it every few weeks.
3 years ago
Daniel Martí b995c1b589
obfuscate literals as part of transformGo (#299)
This is easier to understand, since now the modification of the
*ast.File is all within a single chunk of code. We can also simplify
literals.Obfuscate to work on a single file, as transformGo runs in a
loop.

We also remove the "use receiver" TODOs, since the code is now in a
different package and it can't declare methods on a type here.
3 years ago
Daniel Martí 081b69eec2 join the import-rewrite ast.Inspect with transformGo
Walking the entire Go file again seems unnecessary, when we have an
astutil.Apply just next to it.

We could add to its "pre" function, but it's a bit easier to use the
"post" instead, which is empty.

Another advantage here is that astutil.Apply is more powerful than
ast.Inspect, which can come in handy in the future.
3 years ago
Daniel Martí e94b8e3750 make "help" refuse arguments for now
Since they are otherwise silently ignored.
3 years ago
Daniel Martí 664f834906 document "garble reverse"
Fixes #5.
3 years ago
Daniel Martí fe095ef132
handle unknown flags in reverse (#290)
While at it, expand the tests for build and test too.
3 years ago
Daniel Martí 1a8e32227f
improve "reverse" even further (#289)
Fix up a few TODOs, and simplify the way we handle comments.

We now add whitespace around inline /*line*/ directives, to ensure we
don't break programs. A test case is added too.

We now add line directives to call sites, not function declarations,
since those are what actually shows up in stack traces.
It's unclear if we care about any other lines inside functions at all.
This also fixes reversing with -literals, since that feature adds a
significant amount of code which shuffles line numbers around.

Finally, we extend the tests with types, methods, and anonymous
functions, and we make all of them work well.

Updates #5.
3 years ago
Daniel Martí ce2c45440a
simplify the uses of obfuscatedTypesPackage (#284)
obfuscatedTypesPackage is now only useful for one scenario: a type which
is used with reflection in a dependency, so neither its name nor any of
its potential struct field names are obfuscated.

This is because we can only detect the use of reflection with go/ast,
which we don't have for dependencies.

As such, we only need obfuscatedTypesPackage in two places - when
considering to obfuscate a field or a type name.

There were two other calls to the function, which we remove.

The first was used for linkname directives. Those directives only work
for variables and functions, neither of which is affected by the
reflection detection.

The second was used for all identifiers, at the very end of the
transformGo inner func. It's entirely unnecessary right now, as it never
triggers anymore. It's possible it was necessary some time ago when we
still didn't obfuscate assembly functions.

While at it, improve some comments and add a few TODOs for edge cases
which do not have code coverage.
3 years ago
Daniel Martí a1f11fb231 add a writeTemp helper func
We had two nearly identical copies of the code to open a temp file with
CreateTemp, write to it, and handle closing properly. Unify them.

With the added docs this isn't exactly a net win, but we want the long
funcs such as transformCompile to be easy to follow, and this helps.
3 years ago
Daniel Martí 13e4ba2ae0 use "obfuscate" instead of "garble" in some more places
Mainly comments. "garble" refers to the tool, but the verb and adjective
is more intuitive as "obfuscate" and "obfuscated" instead of "garble"
and "garbled".
3 years ago
Daniel Martí 961daf20c4
rework the position obfuscator (#282)
First, rename line_obfuscator.go to position.go. We obfuscate filenames,
not just line numbers, and "obfuscator" is a bit redundant.

Second, use "/*line :x*/" comments rather than the "//line :x" form, as
the former allows us to insert them in any position without adding
unnecessary newlines. This will be important for changing the position
of call sites, which will be important for "garble reverse".

Third, do not rely on go/ast to remove and add comments. Since they are
free-floating, we can very easily end up with misplaced comments,
especially as the literal obfuscator heavily modifies the AST.

The new method prints and re-parses the file, to ensure all node
positions are consistent with a buffer, buf1. Then, we copy the contents
into a new buffer, buf2, while inserting the comments that we need.

The new method also modifies line numbers at the very end of obfuscating
a Go file, instead of at the very beginning. That's going to be more
robust long-term, as we will also obfuscate line numbers for any
additions or modifications to the AST.

Fourth, detachedDirectives is unnecessary, as we can accomplish the same
with two simple prefix matches.

Finally, this means we can stop using detachedComments entirely, as
printFile already inserts the comments we need.

For #5.
3 years ago
Daniel Martí ea19e39aa4 use hashWith for obfuscation position information
Position information was obfuscated with math/rand manually, which meant
that the resulting positions were pretty small like "x.go:34", but they
were also very hard to reverse due to their short length and difficulty
to reproduce.

We now hash them with hashWith and the package's GarbleActionID:

	"main.go:203" hashed with 933ad1c700755b7c3a9913c55cade1 to "mwu1xuNz.go"

The input to the hash is the base filename and the byte offset of the
declaration within the file, meaning that it's unique within a package.
The output filename is long enough to allow easy reversal.

The line number is always 1, since the information needed for reversing
is contained entirely within the filename. It doesn't really matter if
we encode data in the filename or line number, but it's easier for us to
use a string.

For #5.
3 years ago
Daniel Martí 5d74ab07f5 all: replace uses of the deprecated ioutil
Now that we require Go 1.16, we can simplify code by removing ioutil.
3 years ago
Daniel Martí a328a487f8 improve the code comments around -seed
Explain why we print the random seed on failure, and why we accept
base64 padding when we don't use it.

While at it, the flagOptions.Random field is unused, so remove it. It
also seems wrong for it to exist; the random value is only for the seed
flag, which already has a field of its own.
3 years ago
Daniel Martí f3ea42230a remove the last remaining -debugdir buffer
We were still printing files into a buffer, which got left behind from
when we used to store the obfuscated source in object files.

All that code is gone, so this buffer is now just wasting CPU cycles and
memory.
3 years ago
Daniel Martí 9312938650
remove obsolete TODOs (#274)
obfuscatedTypesPackage can't be outright removed, since we still do
require knowledge of what was obfuscated for one reason: types used with
reflection. Since we can only figure that out with go/ast and not just
go/types, we rely on the original compilation to tell us that
information.

IgnoreFuncBodies=true for typechecking the original source code would be
nice, as we would save time, but ultimately it doesn't work. When we
rename top-level declarations such as functions and types, we also need
to amend their references in func bodies. We depend on type information
for that.

Finally, we've been randomizing filenames for a while now. Randomizing
the order of the files doesn't seem to be useful.
3 years ago
Daniel Martí 748c6a0538
obfuscate asm function names as well (#273)
Historically, it was impossible to rename those funcs as the
implementation was in assembly files, and we only transformed Go code.

Now that transformAsm exists, it's only about fifty lines to do some
very basic parsing and rewriting of assembly files.

This fixes the obfuscated builds of multiple std packages, including a
few dependencies of net/http, since they included assembly funcs which
called pure Go functions. Those pure Go functions had their names
obfuscated, breaking the call sites in assembly.

Fixes #258.
Fixes #261.
3 years ago
Daniel Martí ffeea469e6
remove a couple of pieces of dead code (#272)
We no longer cache the obfuscated source code for -debugdir, so
obfSrcArchive is entirely useless at this point. We likely didn't notice
because the Go compiler doesn't realise it's unused.

symAbis is also unnecessary. It used to be necessary as we could only
collect the action ID from transformCompile in the second asm
invocation. Since we no longer need the temporary file dance with
transformCompile, we can simplify that code too.
3 years ago
Daniel Martí def9351b25
fix and re-enable "garble test" (#268)
With the many refactors building up to v0.1.0, we broke "garble test" as
we no longer dealt with test packages well.

Luckily, now that we can depend on TOOLEXEC_IMPORTPATH, we can support
the test command again, as we can always figure out what package we're
currently compiling, without having to track a "main" package.

Note that one major pitfall there is test packages, where
TOOLEXEC_IMPORTPATH does not agree with ImportPath from "go list -json".
However, we can still work around that with a bit of glue code, which is
also copiously documented.

The second change necessary is to consider test packages private
depending on whether their non-test package is private or not. This can
be done via the ForTest field in "go list -json".

The third change is to obfuscate "_testmain.go" files, which are the
code-generated main functions which actually run tests. We used to not
need to obfuscate them, since test function names are never obfuscated
and we used to not obfuscate import paths at compilation time. Now we do
rewrite import paths, so we must do that for "_testmain.go" too.

The fourth change is to re-enable test.txt, and expand it with more
sanity checks and edge cases.

Finally, document "garble test" again.

Fixes #241.
3 years ago
Daniel Martí 4e9ee17ec8
refactor "current package" with TOOLEXEC_IMPORTPATH (#266)
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.

Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.

To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.

The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.

Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.

isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".

Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.

Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.

While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
3 years ago
Daniel Martí ff0bea73b5
all: drop support for Go 1.15.x (#265)
This mainly cleans up the few bits of code where we explicitly kept
support for Go 1.15.x. With v0.1.0 released, we can drop support now,
since the next v0.2.0 release will only support Go 1.16.x.

Also updates all modules, including test ones, to 'go 1.16'.

Note that the TOOLEXEC_IMPORTPATH refactor is not done here, despite all
the TODOs about doing so when we drop 1.15 support. This is because that
refactor needs to be done carefully and might have side effects, so it's
best to keep it to a separate commit.

Finally, update the deps.
3 years ago
Daniel Martí 2a9c0b7bf4
prepare for the first release (#264)
First, write a changelog file. We will use GitHub releases, but the
content in those is not stored in git nor is it portable or machine
readable. The canonical place for the changelog is here.

Second, disable 'garble test', as it is entirely broken. Issue #241
tracks fixing and re-enabling it, which will most likely happen for the
next release.

Third, disable the undocumented 'garble list'. This was added as part of
'garble reverse', but it never got used. I can't think of any reason why
any end user would prefer it over 'go list', either.

'garble reverse' remains enabled, but undocumented as it isn't fully
functional yet. Until it supports position information, it's not
particularly useful to end users. But it's not broken either, so it can
remain where it is.

Fourth, update the '-tiny' size reduction numbers in the README. Since
we removed the in-place modification of object files, we are no longer
able to do such an aggressive stripping of info. Garble itself drops in
size by 2%, so replace the old 6-10% estimate by 2-5%. We probably will
gain some of this back in the near future.

Finally, fix the indentation formatting of the README to consistently
use tabs.
3 years ago
Daniel Martí 1267e2eced
fix link errors when importing crypto/ecdsa (#262)
First, we had some link errors such as:

	cannot find package J6OzO8GN (using -importcfg)

This was caused by the code that writes an updated importcfg, which did
not handle import maps well. That code is now fixed, and we also add an
obfuscatedImportPath method for clarity.

Once fixed, we ran into other link errors:

	Pw3g97ww.addVW: relocation target Pw3g97ww.addVWlarge not defined

After some digging, the cause of those is assembly code that we do not
yet support obfuscating. #261 tracks that.

Meanwhile, to fix "GOPRIVATE=* garble build" and to be able to have a
test for the original import path bug, we add the packages which use
that form of assembly code to runtimeRelated - math/big and
crypto/sha512. There might be more, but these were the ones found by
trying to link crypto/tls, a fairly common dependency.

Fixes #256.
3 years ago
Daniel Martí 99887c13c2 ignore -ldflags=-X flags mentioning unknown packages
That would panic, since the *listedPackage would be nil for a package
path we aren't aware of:

	panic: runtime error: invalid memory address or nil pointer dereference
	[signal SIGSEGV: segmentation violation code=0x1 addr=0x88 pc=0x126b57d]

	goroutine 1 [running]:
	main.transformLink.func1(0x7ffeefbff28b, 0x5d)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:1260 +0x17d
	main.flagValueIter(0xc0000a8e20, 0x2f, 0x2f, 0x12e278e, 0x2, 0xc000129e28)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:1410 +0x1e9
	main.transformLink(0xc0000a8e20, 0x30, 0x36, 0x4, 0xc000114648, 0x23, 0x12dfd60, 0x0)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:1241 +0x1b9
	main.mainErr(0xc0000a8e10, 0x31, 0x37, 0x37, 0x0)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:287 +0x389
	main.main1(0xc000096058)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:150 +0xe7
	main.main()
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:83 +0x25

The linker ignores such unknown references, so we should too.

Fixes #259.
3 years ago
Daniel Martí b03cd08c09
avoid one more call to 'go tool buildid' (#253)
We use it to get the content ID of garble's binary, which is used for
both the garble action IDs, as well as 'go tool compile -V=full'.

Since those two happen in separate processes, both used to call 'go tool
buildid' separately. Store it in the gob cache the first time, and reuse
it the second time.

Since each call to cmd/go costs about 10ms (new process, running its
many init funcs, etc), this results in a nice speed-up for our small
benchmark. Most builds will take many seconds though, so note that a
~15ms speedup there will likely not be noticeable.

While at it, simplify the buildInfo global, as now it just contains a
map representation of the -importcfg contents. It now has better names,
docs, and a simpler representation.

We also stop using the term "garbled import", as it was a bit confusing.
"obfuscated types.Package" is a much better description.

	name     old time/op       new time/op       delta
	Build-8        106ms ± 1%         92ms ± 0%  -14.07%  (p=0.010 n=6+4)

	name     old bin-B         new bin-B         delta
	Build-8        6.60M ± 0%        6.60M ± 0%   -0.01%  (p=0.002 n=6+6)

	name     old sys-time/op   new sys-time/op   delta
	Build-8        208ms ± 5%        149ms ± 3%  -28.27%  (p=0.004 n=6+5)

	name     old user-time/op  new user-time/op  delta
	Build-8        433ms ± 3%        384ms ± 3%  -11.35%  (p=0.002 n=6+6)
3 years ago
Daniel Martí 6898d61637
start using original action IDs (#251)
When we obfuscate a name, what we do is hash the name with the action ID
of the package that contains the name. To ensure that the hash changes
if the garble tool changes, we used the action ID of the obfuscated
build, which is different than the original action ID, as we include
garble's own content ID in "go tool compile -V=full" via -toolexec.

Let's call that the "obfuscated action ID". Remember that a content ID
is roughly the hash of a binary or object file, and an action ID
contains the hash of a package's source code plus the content IDs of its
dependencies.

This had the advantage that it did what we wanted. However, it had one
massive drawback: when we compile a package, we only have the obfuscated
action IDs of its dependencies. This is because one can't have the
content ID of dependent packages before they are built.

Usually, this is not a problem, because hashing a foreign name means it
comes from a dependency, where we already have the obfuscated action ID.
However, that's not always the case.

First, go:linkname directives can point to any symbol that ends up in
the binary, even if the package is not a dependency. So garble could
only support linkname targets belonging to dependencies. This is at the
root of why we could not obfuscate the runtime; it contains linkname
directives targeting the net package, for example, which depends on runtime.

Second, some other places did not have an easy access to obfuscated
action IDs, like transformAsm, which had to recover it from a temporary
file stored by transformCompile.

Plus, this was all pretty expensive, as each toolexec sub-process had to
make repeated calls to buildidOf with the object files of dependencies.
We even had to use extra calls to "go list" in the case of indirect
dependencies, as their export files do not appear in importcfg files.

All in all, the old method was complex and expensive. A better mechanism
is to use the original action IDs directly, as listed by "go list"
without garble in the picture.

This would mean that the hashing does not change if garble changes,
meaning weaker obfuscation. To regain that property, we define the
"garble action ID", which is just the original action ID hashed together
with garble's own content ID.

This is practically the same as the obfuscated build ID we used before,
but since it doesn't go through "go tool compile -V=full" and the
obfuscated build itself, we can work out *all* the garble action IDs
upfront, before the obfuscated build even starts.

This fixes all of our problems. Now we know all garble build IDs
upfront, so a bunch of hacks can be entirely removed. Plus, since we
know them upfront, we can also cache them and avoid repeated calls to
"go tool buildid".

While at it, make use of the new BuildID field in Go 1.16's "list -json
-export". This avoids the vast majority of "go tool buildid" calls, as
the only ones that remain are 2 on the garble binary itself.

The numbers for Go 1.16 look very good:

	name     old time/op       new time/op       delta
	Build-8        146ms ± 4%        101ms ± 1%  -31.01%  (p=0.002 n=6+6)

	name     old bin-B         new bin-B         delta
	Build-8        6.61M ± 0%        6.60M ± 0%   -0.09%  (p=0.002 n=6+6)

	name     old sys-time/op   new sys-time/op   delta
	Build-8        321ms ± 7%        202ms ± 6%  -37.11%  (p=0.002 n=6+6)

	name     old user-time/op  new user-time/op  delta
	Build-8        538ms ± 4%        414ms ± 4%  -23.12%  (p=0.002 n=6+6)
3 years ago
Daniel Martí 09e244986e formally add support for Go 1.16
This was pretty much just fixing the README and closing the issue. The
only other noteworthy user-facing change is that, if the Go version is
detected to be too old, we now suggest 1.16.x instead of 1.15.x.

While at it, refactor goversion.txt a bit. I wanted it to print a
clearer "mocking the go build" error if another command was used like
"go build", but I didn't want to learn BAT. So, instead use a simple Go
program and build it, which will work on all platforms. The added
"go build" step barely takes 100ms on my machine, given how simple the
program is.

The [short] line also doesn't seem necessary to me. The entire script
runs in under 200ms for me, so it's well within the realm of "short", at
least compared to many of the other test scripts.

Fixes #124.
3 years ago
Daniel Martí 2ee6604408
replace the -p path when assembling (#247)
The asm tool runs twice for a package with assembly. The second time it
does, the path given to the -p flag matters, just like in the compiler,
as we generate an object file.

We don't have a -buildid flag in the asm tool, so obtaining the action
ID to obfuscate the package path with is a bit tricky. We store it from
transformCompile, and read it from transformAsm. See the detailed docs
for more.

This was the last "skip" line in the tests due to Go 1.16. After all PRs
are merged, one last PR documenting that 1.16 is supported will be sent,
closing the issue for good.

It's unclear why this wasn't an issue in Go 1.15. My best guess is that
the ABI changes only happened in Go 1.16, and this causes exported asm
funcs to start showing up in object files with their package paths.

Updates #124.
3 years ago
Daniel Martí 5e3ba2fc09
update the list of runtime-related packages for 1.16 (#246)
With a few extra lines, we can keep Go 1.15 support in the table too.

Re-enables the goprivate.txt test for Go 1.16.

While at it, make the script's use of grep a bit simpler with -E, which
also uses the same syntax as Go's regexp. Its skip logic was also buggy,
resulting in the macos results always being empty.

Updates #124.
3 years ago
Daniel Martí 180d64236a
properly fix obfuscated imports with their package names (#245)
The TODO I left there didn't take long to surface as a bug. If the
package path ends with a word containing a hyphen, that's not a valid
identifier, so we end up with invalid Go syntax.

Add that test case, as well as one where an import was already named.

To fix the issue, we just need to use the package name we got from
'go list -json'.

Fixes #243.
3 years ago
Daniel Martí 05d0dd1801
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.

That worked well, in most cases. Unfortunately, it had some flaws:

* Complexity. Even when most of the code is maintained in a separate
  module, the import_obfuscation.go file was still close to a thousand
  lines of code.

* Go compatibility. The object file format changes between Go releases,
  so we were supporting Go 1.15, but not 1.16. Fixing the object file
  package to work with 1.16 would probably break 1.15 support.

* Bugs. For example, we recently had to add a workaround for #224, since
  import paths containing dots after the domain would end up escaped.
  Another example is #190, which seems to be caused by the object file
  parser or writer corrupting the compiled code and causing segfaults in
  some rare edge cases.

Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:

1) Replace its "package foo" lines with the obfuscated package path. No
   need to separate the package path and name, since the obfuscated path
   does not contain slashes.

2) Replace the "-p pkg/foo" flag with the obfuscated path.

3) Replace the "import" spec lines with the obfuscated package paths,
   for those dependencies which were obfuscated.

4) Replace the "-importcfg [...]" file with a version that uses the
   obfuscated paths instead.

The linker also needs that last step, since it also uses an importcfg
file to find object files.

There are three noteworthy drawbacks to this new method:

1) Since we no longer write object files, we can't use them to store
   data to be cached. As such, the -debugdir flag goes back to using the
   "-a" build flag to always rebuild all packages. On the plus side,
   that caching didn't work very well; see #176.

2) The package name "main" remains in all declarations under it, not
   just "func main", since we can only rename entire packages. This
   seems fine, as it gives little information to the end user.

3) The -tiny mode no longer sets all lines to 0, since it did that by
   modifying object files. As a temporary measure, we instead set all
   top-level declarations to be on line 1. A TODO is added to hopefully
   improve this again in the near future.

The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.

Fixes #176.
Fixes #190.
3 years ago
Daniel Martí e2a32634a6
simplify, improve, and test line obfuscation (#239)
First, remove the shuffling of the declarations list within each file.
This is what we used at the very start to shuffle positions. Ever since
we started obfuscating positions via //line comments, that has been
entirely unnecessary.

Second, add a proper test that will fail if we don't obfuscate line
numbers well enough. Filenames were already decently covered by other
tests.

Third, simplify the line obfuscation code. It does not require
astutil.Apply, and ranging over file.Decls is easier.

Finally, also obfuscate the position of top-level vars, since we only
used to do it for top-level funcs. Without that fix, the test would fail
as varLines was unexpectedly sorted.
3 years ago
Daniel Martí 63c42c3cc7
support typechecking all of std (#236)
There was one bug keeping the command below from working:

	GOPRIVATE='*' garble build std

The bug is rather obscure; I'm still working on a minimal reproducer
that I can submit upstream, and I'm not yet convinced about where the
bug lives and how it can be fixed.

In short, the command would fail with:

	typecheck error: /go/src/crypto/ecdsa/ecdsa.go:122:12: cannot use asn1.SEQUENCE (constant 48 of type asn1.Tag) as asn1.Tag value in argument to b.AddASN1

Note that the error is ambiguous; there are two asn1 packages, but they
are actually mismatching. We can see that by manually adding debug
prints to go/types:

	constant: asn1.SEQUENCE (constant 48 of type golang.org/x/crypto/cryptobyte/asn1.Tag)
	argument type: vendor/golang.org/x/crypto/cryptobyte/asn1.Tag

It's clear that, for some reason, go/types ends up confused and loading
a vendored and non-vendored version of asn1. There also seems to be no
way to work around this with our lookup function, as it just receives an
import path as a parameter, and returns an object file reader.

For now, work around the issue by *not* using a custom lookup function
in this rare edge case involving vendored dependencies in std packages.
The added code has a lengthy comment explaining the reasoning.

I still intend to investigate this further, but there's no reason to
keep garble failing if we can work around the bug.

Fixes #223.
3 years ago
Daniel Martí 2fa5697189
avoid panic on funcs that almost look like tests (#235)
The added test case used to crash garble:

	panic: runtime error: invalid memory address or nil pointer dereference [recovered]
		panic: runtime error: invalid memory address or nil pointer dereference
	[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x8fe71b]

	goroutine 1 [running]:
	golang.org/x/tools/go/ast/astutil.Apply.func1(0xc0001e8880, 0xc000221570)
		/go/pkg/mod/golang.org/x/tools@v0.0.0-20210115202250-e0d201561e39/go/ast/astutil/rewrite.go:47 +0x97
	panic(0x975bc0, 0xd6c610)
		/sdk/go1.15.8/src/runtime/panic.go:969 +0x1b9
	go/types.(*Named).Obj(...)
		/sdk/go1.15.8/src/go/types/type.go:473
	mvdan.cc/garble.isTestSignature(0xc0001e7080, 0xa02e84)
		/src/garble/main.go:1170 +0x7b
	mvdan.cc/garble.(*transformer).transformGo.func2(0xc000122df0, 0xaac301)
		/src/garble/main.go:1028 +0xff1

We were assuming that the first parameter was a named type, but that
might not be the case.

This crash was found out in the wild, from which a minimal repro was
written. We add two variants of it to the test data, just in case.
3 years ago
Daniel Martí e64fccd367
better document and position the hash base64 encoding (#234)
We now document why we use a custom base64 charset.

The old "b64" name was also too generic, so it might have been misused
for other purposes.
3 years ago
Daniel Martí af517a20f8 make the handling of import paths more robust
First, make isPrivate panic on malformed import paths, since that should
never happen. This catches the errors that some users had run into with
packages like gopkg.in/yaml.v2 and github.com/satori/go.uuid:

	panic: malformed import path "gopkg.in/garbletest%2ev2": invalid char '%'

This seems to trigger when a module path contains a dot after the first
element, *and* that module is fetched via the proxy. This results in the
toolchain URL-encoding the second dot, and garble ends up seeing that
encoded path.

We reproduce this behavior with a fake gopkg.in module added to the test
module proxy. Using yaml.v2 directly would have been easier, but it's
pretty large. Note that we tried a replace directive, but that does not
trigger the URL-encoding bug.

Also note that we do not obfuscate the gopkg.in package; that's fine, as
the isPrivate path validity check catches the bug either way.

For now, make initImport use url.PathUnescape to work around this issue.
The underlying bug is likely in either the goobj2 fork, or in the
upstream Go toolchain itself.

hashImport also gives a better error if it cannot find a package now,
rather than just an "empty seed" panic.

Finally, the sanity check in isPrivate unearthed the fact that we do not
support garbling test packages at all, since they were invalid paths
which never matched GOPRIVATE. Add an explicit check and TODO about
that.

Fixes #224.
Fixes #228.
3 years ago
Daniel Martí 79c775e218
obfuscate unexported names like exported ones (#227)
In 90fa325da7, the obfuscation logic was changed to use hashes for
exported names, but incremental names starting at just one letter for
unexported names. Presumably, this was done for the sake of binary size.

I argue that this is not a good idea for the default mode for a number
of reasons:

1) It makes reversing of stack traces nearly impossible for unexported
   names, since replacing an obfuscated name "c" with "originalName"
   would trigger too many false positives by matching single characters.

2) Exported and unexported names aren't different. We need to know how
   names were obfuscated at a later time in both cases, thanks to use
   cases like -ldflags=-X. Using short names for one but not the other
   doesn't make a lot of sense, and makes the logic inconsistent.

3) Shaving off three bytes for unexported names doesn't seem like a huge
   deal for the default mode, when we already have -tiny to optimize for
   size.

This saves us a bit of work, but most importantly, simplifies the
obfuscation state as we no longer need to carry privateNameMap between
the compile and link stages.

	name     old time/op       new time/op       delta
	Build-8        153ms ± 2%        150ms ± 2%    ~     (p=0.065 n=6+6)

	name     old bin-B         new bin-B         delta
	Build-8        7.09M ± 0%        7.08M ± 0%  -0.24%  (p=0.002 n=6+6)

	name     old sys-time/op   new sys-time/op   delta
	Build-8        296ms ± 5%        277ms ± 6%  -6.50%  (p=0.026 n=6+6)

	name     old user-time/op  new user-time/op  delta
	Build-8        562ms ± 1%        558ms ± 3%    ~     (p=0.329 n=5+6)

Note that I do not oppose using short names for both exported and
unexported names in the future for -tiny, since reversing of stack
traces will by design not work there. The code can be resurrected from
the git history if we want to improve -tiny that way in the future, as
we'd need to store state in header files again.

Another major cleanup we can do here is to no longer use the
garbledImports map. From a look at obfuscateImports, we hash a package's
import path with its action ID, much like exported names, so we can
simply re-do that hashing for the linker's -X flag.

garbledImports does have some logic to handle duplicate package names,
but it's worth noting that should not affect package paths, as they are
always unique. That area of code could probably do with some
simplification in the future, too.

While at it, make hashWith panic if either parameter is empty.
obfuscateImports was hashing the main package path without a salt due to
a bug, so we want to catch those in the future.

Finally, make some tiny spacing and typo tweaks to the README.
3 years ago
Daniel Martí d8e8738216
initial support for reversing panic output (#225)
For now, this only implements reversing of exported names which are
hashed with action IDs. Many other kinds of obfuscation, like positions
and private names, are not yet implemented.

Note that we don't document this new command yet on purpose, since it's
not finished.

Some other minor cleanups were done for future changes, such as making
transformLineInfo into a method that also receives the original
filename, and making header names more self-describing.

Updates #5.
3 years ago
Daniel Martí 78b69bbdab share a single temporary directory between all processes
Each compile and link sub-process created its own temporary directory,
to be cleaned up shortly after. Moreover, we also had the global
gob-encoded temporary file.

Instead, place all of those under a single, start-to-end temporary
directory. This is cleaner for the end user, and easier to maintain for
us.

A big plus is that we can also get rid of the confusing deferred global,
as it was mostly used to clean up these extra temp dirs. The only
remaining use was post-compile code, which is now an explicit func
returned by each "transform" func.

While at it, clean up the math/rand seeding code a bit and add a debug
log line, and stop shadowing a cmd string with a cmd *exec.Cmd.

Fixes #147.
3 years ago
Daniel Martí 39d60f91e6
include a "garble version" test (#221)
Not sure why the last commit came with none.

While at it, I noticed that the version command would ignore arguments.
Error instead.
3 years ago
Daniel Martí c7ee0e08e5
add a "version" command (#220)
Mimicking "go version", this tells the user garble's own version.

The code is exactly the same that is used for another tool written in
Go, shfmt. It uses runtime/debug to fetch the module version embedded in
binaries built by Go. For example:

	$ go get mvdan.cc/sh/v3/cmd/shfmt@latest
	$ shfmt -version
	v3.2.2
	$ go get mvdan.cc/sh/v3/cmd/shfmt@master
	$ shfmt -version
	v3.3.0-0.dev.0.20210203135509-56c9918c980d

Note that this will not work for a plain "go build" or "go install"
after a "git clone", since in that case the Go tool can't know garble's
own version via go.mod - since it's the current main module:

	$ go build
	$ ./garble version
	(devel)

For the use case of the power user building from source directly, they
are probably clever enough to tell us what git commit they are on, so
this is not a big problem right now. It will also get better once
golang/go#37475 is fixed in the future.

Until then, if we need to do "release" builds locally, we can embed an
explicit version into the binary via ldflags:

	$ go build -ldflags=-X=main.version=v1.2.3
	$ ./garble version
	v1.2.3

Fixes #217.
3 years ago
Andrew LeFevre e014f480f9
if the seed is random and the build fails, print the seed (#213)
Fixes #212
3 years ago
Andrew LeFevre 2d720ae155
fixed some bugs related to additional linkname corner cases (#210) 3 years ago
Daniel Martí f667a7ad31
all: use better names than "blacklist", and docs (#206)
The three transformer map fields are now very well documented, which was
badly needed for anyone trying to understand the source code.

ignoreObjects is also a better field name than blacklist, as it says
what the map is indexed by (types.Object) and what we do with those:
ignore them when we obfuscate code.

The rewriting of go:linkname directives is moved to a separate func, so
that we can name that func from the docs.

Finally, the docs are overall improved a bit, as I was re-tracing all
the pieces of code that used the ambiguous "blacklist" terminology.

Fixes #169.
4 years ago
Daniel Martí ff3d62f9c3
prevent exiting early with early errors (#205)
The point of main1 returning an int is that testscript can run code
afterwards, such as to collect coverage information when running with
-coverprofile.

We were using plain os.Exit in a couple of places: when help was
requested, and when the Go version could not be fetched.

In those cases, return an error to main1, and let it do the right thing.

For #35.
4 years ago
Daniel Martí 249501b5e9
fix garbling names belonging to indirect imports (#203)
main.go includes a lengthy comment that documents this edge case, why it
happened, and how we are fixing it. To summarize, we should no longer
error with a build error in those cases. Read the comment for details.

A few other minor changes were done to allow writing this patch.

First, the actionID and contentID funcs were renamed, since they started
to collide with variable names.

Second, the logging has been improved a bit, which allowed me to debug
the issue.

Third, the "cache" global shared by all garble sub-processes now
includes the necessary parameters to run "go list -toolexec", including
the path to garble and the build flags being used.

Thanks to lu4p for writing a test case, which also applied gofmt to that
testdata Go file.

Fixes #180.
Closes #181, since it includes its test case.
4 years ago
lu4p 2e2bd09b5e Simplify maps to boolean value 4 years ago
Daniel Martí c0731921c2
rewrite go:linkname directives with garbled names (#200)
If code includes a linkname directive pointing at a name in an imported
package, like:

	//go:linkname localName importedpackage.RemoteName
	func localName()

We should rewrite the comment to replace "RemoteName" with its
obfuscated counterpart, if the package in question was obfuscated and
that name was as well.

We already had some code to handle linkname directives, but only to
ensure that "localName" was never obfuscated. This behavior is kept, to
ensure that the directive applies to the right name. In the future, we
could instead rewrite "localName" in the directive, like we do with
"RemoteName".

Add plenty of tests, too. The linkname directive used to be tested in
imports.txt and syntax.txt, but that was hard to maintain as each file
tested different edge cases.

Now that we have build caching, adding one extra testscript file isn't a
big problem anymoree. Add linkname.txt, which is self-explanatory. The
other two scripts also get a bit less complex.

Fixes #197.
4 years ago
Daniel Martí 2b26183253
reduce the amount of code to handle compiler directives (#199)
First, we don't need the nameSpecialDirectives list as a separate thing.
cgo types aren't obfuscated anymore, so the only item in that list that
made a difference in the tests was go:linkname, which we'll overhaul
soon. For now, keep its code around.

Second, processDetachedDirectives can be replaced by just seven lines.

Third, we don't need to separate build tag directives from the rest of
the detached directives. Their relative order (with other comments) does
not matater.

Fourth and last, ranging over a nil slice is a no-op, so a nil check
around a slice range is unnecessary.

This is some prep work to make the patch to support go:linkname smaller
and easier to review.
4 years ago
Daniel Martí c9deff810b
obfuscate fewer std packages (#196)
Previously, we were never obfuscating runtime and its direct
dependencies. Unfortunately, due to linkname, the runtime package is
actually closely related to dozens of other std packages as well.

Until we can obfuscate the runtime and properly support go:linkname
directives, obfuscating fewer std packages is a better outcome than
breaking and not producing any obfuscated code at all.

The added test case is building runtime/pprof, which used to cause
failures:

	# runtime/pprof
	/go/src/runtime/pprof/label.go:27:21: undefined: context.Context
	/go/src/runtime/pprof/label.go:59:21: undefined: context.Context
	/go/src/runtime/pprof/label.go:93:16: undefined: context.Context
	/go/src/runtime/pprof/label.go:101:20: undefined: context.Context

The net package was also very close to obfuscating properly thanks to
this change, so its test is now run as well. The only other remaining
fix was to not obfuscate fields on cgo types, since those aren't
obfuscated at the moment.

The map is pretty long, but it's only a temporary solution and the
command to obtain the list again is included. Never obfuscating the
entire std library is also an option, but it's a bit unnecessary.

Fixes #134.
4 years ago
lu4p cf290b8e6d
Share data between processes via a shared file. (#192)
Previously garble heavily used env vars to share data between processes.
This also makes it easy to share complex data between processes.

The complexity of main.go is considerably reduced.
4 years ago
Daniel Martí dfa622fe50
simplify globals, split hash.go (#191)
The previous globals worked, but were unnecessarily complex. For
example, we passed the fromPath variable around, but it's really a
static global, since we only compile or link a single package in each Go
process. Use such global variables instead of passing them around, which
currently include the package's import path, its build ID, and its
import config path.

Also split all the hashing and build ID code into hash.go, since that's
a relatively well contained 200 lines of code that doesn't need to make
main.go any bigger. We also split the code to alter Go's own version to
a separate function, so that it can be moved out of main.go as well.
4 years ago
Andrew LeFevre 8c03afee95
Use latest Binject/debug version to support importmap directives, fixes #146 (#189)
* Use latest Binject/debug version to support importmap directives in the importcfg file

* Uncomment line in goprivate testscript to test ImportMap

* Fixed issue where a package in specified in importmap would be hashed differently in a package that imported it, due to the mapping of import paths.

Also commented out the 'net' import in the goprivate testscript (again) due to cgo compile errors
4 years ago
Daniel Martí 3e7416ee9e
add test case for ImportMap support (#186)
We also update the "original types importer" to support ImportMap.

The test now gets further along, no longer getting stuck on "path not
found in listed packages". Instead, we get stuck on:

	error parsing importcfg: <...>/importcfg:2: unknown directive "importmap"

This bug has been filed at https://github.com/Binject/debug/issues/17.
Until it's fixed, we can't really proceed on #146, so the net import in
the test file (which triggers this case) is commented out for now.

Updates #146.
4 years ago
Daniel Martí 4e79bfc01a
warn when a public package imports a private package (#182)
That is, a package that is built without obfuscation imports an
obfuscated package. This will result in confusing compilation error
messages, because the importer can't find the exported names from the
imported package by their non-obfuscated names:

	> ! garble build ./importer
	[stderr]
	# test/main/importer
	importer/importer.go:5:9: undefined: imported.Name
	exit status 2

Instead, detect this bad input case and provide a nice error:

	public package "test/main/importer" can't depend on obfuscated package "test/main/imported" (matched via GOPRIVATE="test/main/imported")

For now, this is by design. It also makes little sense for a public
package to import an obfuscated package in general, because the public
package would have to leak details about the private package's API and
behavior.

While at it, fix a quirk where we thought the unsafe package could be
private. It can't be, because the runtime package is always public and
it imports the runtime package:

	public package "internal/bytealg" can't depend on obfuscated package "unsafe" (matched via GOPRIVATE="*")

Instead of trying to obfuscate "unsafe" and doing nothing, simply add it
to the neverPrivate list, which is also a better name than
"privateBlacklist" (for #169).

Fixes #164.

Co-authored-by: lu4p <lu4p@pm.me>
4 years ago
Daniel Martí 07fd9d5beb introduce a receiver type to transform a Go package
Means that we no longer have to pass a dozen parameters around, mainly
to transformGo. We can also start documenting what each of the fields
actually does, and group them better.

While at it, pkgPath and pkgScope can both be replaced by a
*types.Package, since they're both accessible via trivially cheap
methods.
4 years ago
Andrew LeFevre 6cf1eb6d49
keep init funcs in original order (#165) 4 years ago
Andrew LeFevre 1fc990dcf8
Fix bug where structs would get garbled in some packages but not in others (#161)
* fix bug where structs would get garbled in some packages but not in others

* only check if struct/field was not defined in current package

* fix a related bug when two objects share the same name in the same package and one is garbled but the other one is not

* renamed parameter for clarity
4 years ago
Andrew LeFevre 0e0a9fc594
Allow struct fields to be garbled, fixes #48 (#159)
* Allow struct fields to be garbled, fixes #48

* fix syntax test script

* simplified code according to review
4 years ago
pagran 803c1d9439
Store obfuscated sources in object files (#158)
Now the flag "-debugdir" does not trigger a full recompilation.
Obfuscated source files are saved to object files and are extracted during linking.
4 years ago
Daniel Martí 1e19d136c7 testdata: make syntax.txt pass when offline
The test intended to use an extra module to be obfuscated, rsc.io/quote,
which we were bundling in the local proxy as well. Unfortunately, the
use of GOPRIVATE also meant that we did not actually fetch the module
from the proxy, and we would instead do a full roundtrip to the internet
to "git clone" the actual upstream repository.

To prevent that roundtrip, instead use a locally replaced module. This
fits the syntax.txt test too, since it's one more edge case that we want
to make sure works well with garble. Since rsc.io/quote is used in
another test, simply make up our own tiny module.

Reduces a 'go test -run Syntax/syntax' run with warm cache from ~5s to
~0.5s, thanks to removing the multiple roundtrips. A warm 'go test' run
still sits at ~6s, since we still need that much CPU time in total.

While at it, fix a staticcheck warning and fix inconsistent indentation
in a couple of tests.
4 years ago
Daniel Martí 2a0ac434fb
initial support for build caching (#142)
As per the discussion in https://github.com/golang/go/issues/41145, it
turns out that we don't need special support for build caching in
-toolexec. We can simply modify the behavior of "[...]/compile -V=full"
and "[...]/link -V=full" so that they include garble's own version and
options in the printed build ID.

The part of the build ID that matters is the last, since it's the
"content ID" which is used to work out whether there is a need to redo
the action (build) or not. Since cmd/go parses the last word in the
output as "buildID=...", we simply add "+garble buildID=_/_/_/${hash}".
The slashes let us imitate a full binary build ID, but we assume that
the other components such as the action ID are not necessary, since the
only reader here is cmd/go and it only consumes the content ID.

The reported content ID includes the tool's original content ID,
garble's own content ID from the built binary, and the garble options
which modify how we obfuscate code. If any of the three changes, we
should use a different build cache key. GOPRIVATE also affects caching,
since a different GOPRIVATE value means that we might have to garble a
different set of packages.

Include tests, which mainly check that 'garble build -v' prints package
lines when we expect to always need to rebuild packages, and that it
prints nothing when we should be reusing the build cache even when the
built binary is missing.

After this change, 'go test' on Go 1.15.2 stabilizes at about 8s on my
machine, whereas it used to be at around 25s before.
4 years ago
Daniel Martí 859221a950 make import path obfuscation work with the build cache
What obfuscateImports did was valid, but unfortunately made the build
cache redo work. This is because we were modifying object files in-place
in the build cache, meaning that the Go tool would think it had to
re-compile those packages.

Instead, write the modified object files in a temporary directory, and
leave the input object files untouched. We require a bit of extra code
to keep track of this and adjust the link argument as well as its
importcfg file.

The function of obfuscateImports, as well as the reasoning above, is now
summarized in its godoc as well.

This should be the last change in preparation for proper build caching
support. Rebasing the build caching branch on this commit finally makes
caching work reliably every single time.
4 years ago
pagran ea4a01df87
More correct comments transformation (#152)
More correct comments transformation was implemented.

Added processing of //go:linkname localname [importpath.name] directive, now localname is not renamed. This is safe and does not cause a name disclosure because the functions marked //linkname do not have a name in the resulting binary.

Added cgo directives support

Fixed filename leak protection for cgo

Part of #149
4 years ago