Commit Graph

404 Commits (1db12b7118b6e04fca1464e89ac2a04972061d98)
 

Author SHA1 Message Date
Daniel Martí 1db12b7118 update Go module deps
Mainly for the x/tools update before the upcoming release,
and for Go 1.18 support.
2 years ago
Daniel Martí cd797e6e95 add a few TODOs with uncovered code that should not be
This will hopefully give new contributors a place to start.
Some of these, like verifying that the "help" commands work,
will be relatively simple additions to our test scripts.
2 years ago
Daniel Martí 8b55dd4bd2 work around a build cache regression in the previous commit
The added comment in main.go explains the situation in detail.
The added test is a minimization of the scenario, which failed:

        > cd mod1
        > garble -seed=${SEED1} build -v gopkg.in/garbletest.v2
        > cd ../mod2
        > garble -seed=${SEED1} build -v
        [stderr]
        test/main/mod2
        # test/main/mod2
        cannot load garble export file for gopkg.in/garbletest.v2: open […]/go-build/ed/[…]-garble-ZV[…]-d: no such file or directory

To work around the problem, we'll always add each package's
GarbleActionID to its build artifact, even when not using -seed.
This will get us the previous behavior with regards to the build cache,
meaning that we fix the recent regression.
The added variable doesn't make it to the final binary either.

While here, improve the cached file loading error's context,
and add an extra sanity check for duplicates on ListedPackages.
2 years ago
Daniel Martí c1c90fee13 make obfuscation fully deterministic with -seed
The default behavior of garble is to seed via the build inputs,
including the build IDs of the entire Go build of each package.
This works well as a default, and does give us determinism,
but it means that building for different platforms
will result in different obfuscation per platform.

Instead, when -seed is provided, don't use any other hash seed or salt.
This means that a particular Go name will be obfuscated the same way
as long as the seed, package path, and name itself remain constant.

In other words, when the user supplies a custom -seed,
we assume they know what they're doing in terms of storage and rotation.

Expand the README docs with more examples and detail.

Fixes #449.
2 years ago
Daniel Martí cf0351bdf5 remove ErrExist check on -debugdir's RemoveAll
This code initially used os.Stat, where it made sense to use
os.IsNotExist to catch whether the directory didn't exist.

However, we later transitioned to os.RemoveAll, which will never return
neither ErrExist nor ErrNotExist:

	If the path does not exist, RemoveAll returns nil (no error).

Simplify the code.
2 years ago
Daniel Martí bd3f950799 add a TODO about a possible cache bug with -literals and -ldflags
I spent some time trying to reproduce the bug but couldn't,
so for now, make a detailed note for it.
We can come back to it if we actually run into it in the future.

Fixes #492.
2 years ago
Daniel Martí ad87a0e2bc don't let -debug affect the build cache hashes
Noticed while debugging #492,
as adding -debug to a previously run command would unexpectedly
rebuild all packages in the build, as if -a was given.

While here, remove commented out testscript that were kept in error.
2 years ago
Daniel Martí 88a27d491b add support for -ldflags using quotes
In particular, using -ldflags with -

In particular, a command like:

	garble -literals build -ldflags='-X "main.foo=foo bar"'

would fail, because we would try to use "\"main" as the package name for
the -X qualified name, with the leading quote character.

This is because we used strings.Split(ldflags, " ").
Instead, use the same quoted.Split that cmd/go uses,
copied over thanks to x/tools/cmd/bundle and go:generate.

Updates #492.
2 years ago
Daniel Martí d8f6f308bd clarify how each "cannot obfuscate" map works
We used to record all objects in cannotObfuscateNames,
and then we'd add the exported ones to KnownCannotObfuscate.

Instead, teach recordAsNotObfuscated to store each object in either
knownCannotObfuscateUnexported or KnownCannotObfuscate, but not both.
The former isn't cached so it uses in-memory pointers as keys,
and the latter uses the cross-process objectStrings like before.

Functionally, this is all the same, but with the difference that the map
indexed by types.Object will not contain objects already recorded in
KnownCannotObfuscate, reducing the amount of duplicate memory use.

While here, give recordIgnore a less ambiguous name,
and remove the second parameter as it was always tf.pkg.Path().
This also means we can compare *types.Package pointers directly.

Finally, add more TODOs for further improvement ideas.
It does mean that we end up with more TODOs than before,
even though I'm fixing one, but I reckon that's a good thing.
Recording these ideas can give first-time contributors ways to help,
and it ensures I don't forget about ideas just in my head.
2 years ago
Daniel Martí a9a721e352 concentrate and simplify "to obfuscate" logic
Back in the day, we used to call toObfuscate anytime we needed to know
whether a package should be obfuscated.
More recently, we started computing via the ToObfuscate field,
which then gets shared with all sub-processes via sharedCache.

We still had two places that directly called toObfuscate.
Replace those with ToObfuscate, and inline toObfuscate into shared.go.

obfuscatedImportPath is also a potential footgun for main packages.
Some use cases always want the original "main" package name,
such as for use in the compiler's "-p main" flag,
while other cases want the obfuscated package import path,
such as the entries in importcfg files.

Since each of these call sites handles the edge case well,
obfuscatedImportPath now panics on main packages to avoid any misuse.

Finally, test that we never leak main package paths via ldflags.txt.
We never did, but it's good to make sure.

Overall, this avoids confusion and trims the size of main.go a bit.
2 years ago
Daniel Martí 34f85e3286 remove TODO for tf.pkg.Imports
I tried to fix the TODO, but ran into the problem described by the added
documentation - that some packages in the import graph are incomplete,
as go/types was clever and didn't fully load them.

While here, also make the panics a bit more descriptive,
which helped me debug what was going wrong after the attempted refactor.
2 years ago
Daniel Martí 5c1b2f17f8 CI: start testing on Go 1.18rc1, bump gotip to 1.19
Our upcoming release will focus on supporting 1.17.x and 1.18.x.
Since gotip is now merging features for the future 1.19.x,
let's add 1.18rc1 to the matrix and bump gotip.

This will make our CI jobs a bit slower for the time being,
but that's the price to pay to ensure we support both versions.
After the next garble version is released,
we can consider dropping support for 1.17.x.
2 years ago
Daniel Martí 70b1cb2fd8 CI: start enforcing vet and staticcheck
Fix a staticcheck warning about unused code,
as well as an unparam warning and a missing copyright header.

We also bump the action versions to their latest releases,
and drop unnecessary "name" fields for self-describing steps.

Note that we drop the "go env" commands, as setup-go does that now.

Finally, I did briefly try to add caching,
but then realised it didn't help us at all. Document why.
2 years ago
Daniel Martí 345ecda999 implement TODO to use a name variable
Slightly simplifies the main chunk of code.
While here, improve the wording for when Go isn't installed correctly.
2 years ago
Daniel Martí 955c24856c properly record when type aliases are embedded as fields
There are two scenarios when it comes to embedding fields.
The first is easy, and we always handled it well:

	type Named struct { Foo int }

	type T struct { Named }

In this scenario, T ends up with an embedded field named "Named",
and a promoted field named "Foo".

Then there's the form with a type alias:

	type Named struct { Foo int }

	type Alias = Named

	type T struct { Alias }

This case is different: T ends up with an embedded field named "Alias",
and a promoted field named "Foo".
Note how the field gets its name from the referenced type,
even if said type is just an alias to another type.

This poses two problems.
First, we must obfuscate the field T.Alias as the name "Alias",
and not as the name "Named" that the alias points to.
Second, we must be careful of cases where Named and Alias are declared
in different packages, as they will obfuscate the same name differently.

Both of those problems compounded in the reported issue.
The actual reason is that quic-go has a type alias in the form of:

	type ConnectionState = qtls.ConnectionState

In other words, the entire problem boils down to a type alias which
points to a named type in a different package, where both types share
the same name. For example:

	package parent

	import "parent/p1"

	type T struct { p1.SameName }

	[...]

	package p1

	import "parent/p2"

	type SameName = p2.SameName

	[...]

	package p2

	type SameName struct { Foo int }

This broke garble because we had a heuristic to detect when an embedded
field was a type alias:

	// Instead, detect such a "foreign alias embed".
	// If we embed a final named type,
	// but the field name does not match its name,
	// then it must have been done via an alias.
	// We dig out the alias's TypeName via locateForeignAlias.
	if named.Obj().Name() != node.Name {

As the reader can deduce, this heuristic would incorrectly assume that
the snippet above does not embed a type alias, when in fact it does.
When obfuscating the field T.SameName, which uses a type alias,
we would correctly obfuscate the name "SameName",
but we would incorrectly obfuscate it with the package p2, not p1.
This would then result in build errors.

To fix this problem for good, we need to get rid of the heuristic.
Instead, we now mimic what was done for KnownCannotObfuscate,
but for embedded fields which use type aliases.
KnownEmbeddedAliasFields is now filled for each package
and stored in the cache as part of cachedOutput.
We can then detect the "embedded alias" case reliably,
even when the field is declared in an imported package.

On the plus side, we get to remove locateForeignAlias.
We also add a couple of TODOs to record further improvements.
Finally, add a test.

Fixes #466.
2 years ago
Daniel Martí 73e91fd8c0 README: clarify that the first "garble build" may be slow
The section said that "garble build" should take about twice as long as
"go build", which is typically true, with the glaring exception that is
the first "garble build" after normal use of "go build".

This scenario will be rather common for first-time users,
and the difference will be most noticeable with large projects.
A large build may take about a second to rebuild with a warm cache,
but could take tens or even hundreds of seconds to rebuild from scratch.

See #479.
2 years ago
Daniel Martí 7994877a52 allocate slightly less in printFile
Reuse buffers, so they only grow once.
Join a fmt.Sprintf with a string addition.

	name      old time/op         new time/op         delta
	Build-16          9.39s ± 2%          9.28s ± 2%    ~     (p=0.548 n=5+5)

	name      old bin-B           new bin-B           delta
	Build-16          5.16M ± 0%          5.16M ± 0%    ~     (all equal)

	name      old cached-time/op  new cached-time/op  delta
	Build-16          324ms ± 6%          317ms ± 2%    ~     (p=0.421 n=5+5)

	name      old mallocs/op      new mallocs/op      delta
	Build-16          29.4M ± 0%          29.3M ± 0%  -0.25%  (p=0.008 n=5+5)

	name      old sys-time/op     new sys-time/op     delta
	Build-16          4.66s ± 2%          4.59s ± 2%    ~     (p=0.310 n=5+5)
2 years ago
Daniel Martí cffb5acd11 remove another allocation per hashed name
We use a buffer to turn a hash into a name via base64.
We can reuse that buffer to save one repeated allocation.
The returned value is still unique, as we convert to a string.

	name      old time/op         new time/op         delta
	Build-16          9.26s ± 3%          9.39s ± 2%    ~     (p=0.151 n=5+5)

	name      old bin-B           new bin-B           delta
	Build-16          5.16M ± 0%          5.16M ± 0%    ~     (all equal)

	name      old cached-time/op  new cached-time/op  delta
	Build-16          314ms ± 5%          324ms ± 6%    ~     (p=0.310 n=5+5)

	name      old mallocs/op      new mallocs/op      delta
	Build-16          29.8M ± 0%          29.4M ± 0%  -1.38%  (p=0.008 n=5+5)

	name      old sys-time/op     new sys-time/op     delta
	Build-16          4.70s ± 4%          4.66s ± 2%    ~     (p=0.548 n=5+5)
2 years ago
Daniel Martí cdc1efd95b avoid allocating twice for every name we hash
name      old time/op         new time/op         delta
	Build-16          9.25s ± 2%          9.26s ± 3%    ~     (p=0.841 n=5+5)

	name      old bin-B           new bin-B           delta
	Build-16          5.16M ± 0%          5.16M ± 0%    ~     (all equal)

	name      old cached-time/op  new cached-time/op  delta
	Build-16          316ms ± 5%          314ms ± 5%    ~     (p=1.000 n=5+5)

	name      old mallocs/op      new mallocs/op      delta
	Build-16          30.7M ± 0%          29.8M ± 0%  -2.90%  (p=0.008 n=5+5)

	name      old sys-time/op     new sys-time/op     delta
	Build-16          4.78s ± 4%          4.70s ± 4%    ~     (p=0.690 n=5+5)
2 years ago
Daniel Martí 91d4a8b6af start reporting total allocs by garble in the benchmark
This is like allocs/op by testing.B.ReportAllocs,
but it combines all allocations by garble sub-processes.
As we currently generate quite a bit of garbage,
and reductions in it may only reduce time/op very slowly,
this new metric will help us visualize small improvements.

The regular ReportAllocs would not help us at all,
as the main process simply executes "garble build".

We remove user-ns/op to make space for mallocs/op,
and also since it's a bit redundant given sys-ns/op and time-ns/op.

	name      time/op
	Build-16      9.20s ± 1%

	name      bin-B
	Build-16      5.16M ± 0%

	name      cached-time/op
	Build-16      304ms ± 4%

	name      mallocs/op
	Build-16      30.7M ± 0%

	name      sys-time/op
	Build-16      4.78s ± 4%
2 years ago
Daniel Martí 2d4cc49d50 CI: bump gotip to February
While here, fix two typos.
2 years ago
Daniel Martí 4c3b90c051 stop loading obfuscated type information from deps
If package P1 imports package P2, P1 needs to know which names from P2
weren't obfuscated. For instance, if P2 declares T2 and does
"reflect.TypeOf(T2{...})", then P2 won't obfuscate the name T2, and
neither should P1.

This information should flow from P2 to P1, as P2 builds before
P1. We do this via obfuscatedTypesPackage; P1 loads the type information
of the obfuscated version of P2, and does a lookup for T2. If T2 exists,
then it wasn't obfuscated.

This mechanism has served us well, but it has downsides:

1) It wastes CPU; we load the type information for the entire package.

2) It's complex; for instance, we need KnownObjectFiles as an extra.

3) It makes our code harder to understand, as we load both the original
   and obfuscated type informaiton.

Instead, we now have each package record what names were not obfuscated
as part of its cachedOuput file. Much like KnownObjectFiles, the map
records incrementally through the import graph, to avoid having to load
cachedOutput files for indirect dependencies.

We shouldn't need to worry about those maps getting large;
we only skip obfuscating declared names in a few uncommon scenarios,
such as the use of reflection or cgo's "//export".

Since go/types is relatively allocation-heavy, and the export files
contain a lot of data, we get a nice speed-up:

	name      old time/op         new time/op         delta
	Build-16          11.5s ± 2%          11.1s ± 3%  -3.77%  (p=0.008 n=5+5)

	name      old bin-B           new bin-B           delta
	Build-16          5.15M ± 0%          5.15M ± 0%    ~     (all equal)

	name      old cached-time/op  new cached-time/op  delta
	Build-16          375ms ± 3%          341ms ± 6%  -8.96%  (p=0.008 n=5+5)

	name      old sys-time/op     new sys-time/op     delta
	Build-16          283ms ±17%          289ms ±13%    ~     (p=0.841 n=5+5)

	name      old user-time/op    new user-time/op    delta
	Build-16          687ms ± 6%          664ms ± 7%    ~     (p=0.548 n=5+5)

Fixes #456.
Updates #475.
2 years ago
Daniel Martí d49c2446ee apply benchmark suggestions by lu4p
I had these ready for the PR, but forgot to push before merging.
2 years ago
Daniel Martí f497821174 redesign benchmark to be more useful and realistic
First, join the two benchmarks into one.
The previous "cached" benchmark was borderline pointless,
as it built the same package with the existing output binary,
so it would quickly realise it had nothing to do and take ~100ms.

The previous "noncached" benchmark input had no dependencies,
so it was only really benchmarking the non-obfuscation of the runtime.
All in all, neither benchmark measured obfuscating multiple packages.

The new benchmark reuses the "cached" input, but with GOCACHE="*",
meaning that we now obfuscate dozens of standard library packages.

Each iteration first does a built from scratch, the worst case scenario,
and then does an incremental rebuild of just the main package,
which is the closest to a best case scenario without being a no-op.

Since each iteration now performs both kinds of builds,
we include a new "cached-time" metric to report what portion of the
"time" metric corresponds to the incremental build.
Thus, we can see a clean build takes ~11s, and a cached takes ~0.3s:

	name      time/op
	Build-16      11.6s ± 1%

	name      bin-B
	Build-16      5.34M ± 0%

	name      cached-time/op
	Build-16      326ms ± 5%

	name      sys-time/op
	Build-16      184ms ±13%

	name      user-time/op
	Build-16      611ms ± 5%

The benchmark is also no logner parallel; see the docs.

Note that the old benchmark also reported bin-B incorrectly,
as it looked at the binary size of garble itself, not the input program.
2 years ago
Daniel Martí 8652271db2 slightly simplify how we deal with linknamed runtime deps
Obfuscating the runtime only needs to list the linknamed packages,
and doesn't need to know about their dependencies directly.

Refactor the script to return a "flat" list that includes all packages
we need, except those that we know the runtime already pulled in.

This allows us to simplify the script and avoid passing -deps to cmd/go.
Performance is unaffected, but I reckon it's worthwhile given how much
we simplified the script.

Longer term, it's also best to avoid using -deps when we don't need it,
as cmd/go could avoid computing information we don't need.

	name              old time/op       new time/op       delta
	Build/NoCache-16        1.68s ± 1%        1.68s ± 0%    ~     (p=1.000 n=5+5)

	name              old bin-B         new bin-B         delta
	Build/NoCache-16        6.72M ± 0%        6.72M ± 0%  +0.01%  (p=0.008 n=5+5)

	name              old sys-time/op   new sys-time/op   delta
	Build/NoCache-16        1.88s ± 1%        1.89s ± 2%    ~     (p=0.548 n=5+5)

	name              old user-time/op  new user-time/op  delta
	Build/NoCache-16        19.9s ± 1%        19.8s ± 0%    ~     (p=0.421 n=5+5)
2 years ago
Daniel Martí f7ed99aa25 mod: update dependencies
Now that we've done the bugfix release.
2 years ago
Daniel Martí f1880b688c finish changelog for v0.5.1
Include Andrew's last change,
which was an easy improvement and has a low chance of regressions.
2 years ago
Andrew LeFevre 474a9194c8 obfuscate exported names in main
Currently exported names in the main package are not hashed, because they
might be a Go plugin API. We don't currently support Go plugins though,
so hash them anyway.

Updates #87.
2 years ago
Daniel Martí 87799a5e4e CHANGELOG: draft for v0.5.1 2 years ago
Daniel Martí 66f5157f0f CI: use a newer gotip commit
And drop the continue-on-error line,
as we haven't had test-gotip failures in months.
Knowing when master fails can be useful.

While here, use GOGC=off to build faster, and explain why.
2 years ago
Daniel Martí c9341790d4 avoid obfuscating literals set via -ldflags=-X
The -X linker flag sets a string variable to a given value,
which is often used to inject strings such as versions.

The way garble's literal obfuscation works,
we replace string literals with anonymous functions which,
when evaluated, result in the original string.

Both of these features work fine separately,
but when intersecting, they break. For example, given:

	var myVar = "original"
	[...]
	-ldflags=-X=main.myVar=replaced

The -X flag effectively replaces the initial value,
and -literals adds code to be run at init time:

	var myVar = "replaced"
	func init() { myVar = func() string { ... } }

Since the init func runs later, -literals breaks -X.
To avoid that problem,
don't obfuscate literals whose variables are set via -ldflags=-X.

We also leave TODOs about obfuscating those in the future,
but we're also leaving regression tests to ensure we get it right.

Fixes #323.
2 years ago
Daniel Martí 96b15e0ac5 support GOGARBLE=* with -literals again
We recently made an important change when obfuscating the runtime,
so that if it's missing any linkname packages in ListedPackages,
it does an extra "go list" call to obtain their information.

This works very well, but we missed an edge case.
In main.go, we disable flagLiterals for the runtime package,
but not for other packages like sync/atomic.

And, since the runtime's extra "go list" has to compute GarbleActionIDs,
it uses the list of garble flags via appendFlags.
Unfortunately, it thinks "-literals" isn't set, when it is,
and the other packages see it as being set.

This discrepancy results in link time errors,
as each end of the linkname obfuscates with a different hash:

	> garble -literals build
	[stderr]
	# test/main
	jccGkbFG.(*yijmzGHo).String: relocation target jccGkbFG.e_77sflf not defined
	jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.ce66Fmzl not defined
	jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.e5kPa1qY not defined
	jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.aQ_3sL3Q not defined
	jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.zls3wmws not defined
	jQg9GEkg.(*NLxfRPAP).pB5p2ZP0: relocation target jQg9GEkg.g69WgKIS not defined

To fix the problem, treat flagLiterals as read-only after flag.Parse,
just like we already do with the other flags except flagDebugDir.
The code that turned flagLiterals to false is no longer needed,
as literals.Obfuscate is only called when ToObfuscate is true,
and ToObfuscate is false for runtimeAndDeps already.
2 years ago
Daniel Martí 321fa85441 remove the use of -buildinfo=false
It looks like the flag will be scrapped from Go 1.18.
Stop using it before 1.18rc1 releases without it.

See: https://github.com/golang/go/issues/50501#issuecomment-1010225207
3 years ago
Daniel Martí 716f007c2a testdata: make pointer regexp less prone to flakes
Pointers are of the form "0x" and a number of hex characters.
Those are all part of the obfuscated filename alphabet,
so if we're unlucky enough, we can get hits:

	> cmp stdout reverse.stdout
	--- stdout
	+++ reverse.stdout
	@@ -4,7 +4,7 @@
	 runtime/debug.Stack(...)
		runtime/debug/stack.go:24 +0x??
	 test/main/lib.printStackTrace(...)
	-	pv0x??mRa.go:1 +0x??
	+	test/main/lib/long_lib.go:32 +0x??
	 test/main/lib.(*ExportedLibType).ExportedLibMethod(...)
		test/main/lib/long_lib.go:19 +0x??
	 main.unexportedMainFunc.func1(...)

Include the plus sign in the regular expression,
which is used for showing pointers but isn't part of our alphabet.
3 years ago
Daniel Martí d25e718d0c stop passing ignoreObjects to literals.Obfuscate
Literal obfuscation uses constant folding now,
so it no longer needs to record identifiers to ignore.
Remove the parameter and the outdated bit of docs.
3 years ago
Daniel Martí c506f02763 handle --long build flags properly
cmd/go treats "--foo=bar" juts like "-foo=bar",
just like any other program using the flag package.

However, we didn't support this longer form in filterForwardBuildFlags.
Because of it, "garble build -tags=foo" worked,
but "garble build --tags=foo" did not,
as we wouldn't forward "--tags=foo" as a build flag for "go list".

Fixes #429.
3 years ago
Daniel Martí d8e5351175 fix test flake in reverse.txt
Just ran into this failure while fixing a different bug:

	> ! grep 'ExportedLib(Type|Field)|unexportedMainFunc|test/main|main\.go|lib\.go' main.stderr
	[main.stderr]
	lib filename: A4Spnz0u.go

	goroutine 1 [running]:
	runtime/debug.Stack(...)
		runtime/debug/stack.go:24 +0x??
	kso0S_A6.at6JKzwa(...)
		p8_ovZPW.go:1 +0x??
	kso0S_A6.(*JKArRn6w).ExportedLibMethod(...)
		h4JncykI.go:1 +0x??
	main.cQXA3D6d.func1(...)
		FaQ2WcAJ.go:1
	main.cQXA3D6d(...)
		T7Ztgy1Q.go:1 +0x??
	main.main(...)
		OHzKYhm3.go:1 +0x??

	main filename: Myi8glib.go

	FAIL: testdata/scripts/reverse.txt:16: unexpected match for `ExportedLib(Type|Field)|unexportedMainFunc|test/main|main\.go|lib\.go` found in main.stderr: lib.go

Note that "main.go" ended up obfuscated as "Myi8glib.go",
which just so happens to match against "lib.go".

Use longer filenames, so that the chances of collisions are near-zero.
3 years ago
Daniel Martí ff803004d0 avoid build ID mismatches when using -debugdir
A recent change added the -debugdir value to addGarbleToHash,
which is part of the hash seed for all obfuscation taking place.

In principle, it was okay to add, just like any other garble flag.
In practice, it broke the added test case:

	> garble -debugdir ./debug1 build
	[stderr]
	# test/main
	FBi9xa6e.(*ac0bCOhR).String: relocation target FBi9xa6e.rV55e6H9 not defined
	qmECK6zf.init.0: relocation target qmECK6zf.eUU08z98 not defined
	[...]

This is because -debugdir gets turned into an absolute path,
and not all garble processes ended up using it consistently.

The fix is rather simple; since -debugdir doesn't affect obfuscation,
don't include it in the build hash seeding at all.

Fixes #451.
3 years ago
Daniel Martí 34cbd1b841 only list missing packages when obfuscating the runtime
We were listing all of std, which certainly worked,
but was quite slow at over 200 packages.
In practice, we can only be missing up to 20-30 packages.
It was a good change as it fixed a severe bug,
but it also introduced a fairly noticeable slow-down.

The numbers are clear; this change shaves off multiple seconds when
obfuscating the runtime with a cold cache:

	name              old time/op       new time/op       delta
	Build/NoCache-16        5.06s ± 1%        1.94s ± 1%  -61.64%  (p=0.008 n=5+5)

	name              old bin-B         new bin-B         delta
	Build/NoCache-16        6.70M ± 0%        6.71M ± 0%   +0.05%  (p=0.008 n=5+5)

	name              old sys-time/op   new sys-time/op   delta
	Build/NoCache-16        13.4s ± 2%         5.0s ± 2%  -62.45%  (p=0.008 n=5+5)

	name              old user-time/op  new user-time/op  delta
	Build/NoCache-16        60.6s ± 1%        19.8s ± 1%  -67.34%  (p=0.008 n=5+5)

Since we only want to call "go list" one extra time,
instead of once for every package we find out we're missing,
we want to know what packages we could be missing in advance.
Resurrect a smarter version of the runtime-related script.

Finally, remove the runtime-related.txt test script,
as it has now been superseeded by the sanity checks in listPackage.
That is, obfuscating the runtime package will now panic if we are
missing any necessary package information.

To double check that we get the runtime's linkname edge case right,
make gogarble.txt use runtime/debug.WriteHeapDump,
which is implemented via a direct runtime linkname.
This ensures we don't lose test coverage from runtime-related.txt.
3 years ago
Daniel Martí ff09a7c672 testdata: ensure go:noinline works via runtime.Callers
While here, I spotted that reverse.txt didn't declare ExportedLibField,
even though the test used a regular expression for it.
3 years ago
Dan Kortschak a207a139bb fix binsubstr calls to not prepare for a regexp 3 years ago
Daniel Martí 4f0657a19a prepare for v0.5.0
While here, add a TODO I forgot about, and run gofumpt.

Also bump all test timeouts slightly,
as the Mac and Windows hosted runners are a bit slow
and I've hit failures twice recently.
3 years ago
Daniel Martí 29ea99fc5f CI: test on GOARCH=386
Note that this cross-compilation disables cgo by default,
and so the cgo.txt test script isn't run on GOARCH=386.
That seems fine for now, as the test isn't arch-specific.

This testing uncovered one build failure in internal/literals;
the comparison between int and math.MaxUint32 is invalid on 32-bit.
To fix that build failure, use int64 consistently.

One test also incorrectly assumed amd64; it now supports 386 too.
For any other architecture, it's being skipped for now.

I also had to increase the -race test timeout,
as it usually takes 8-9m on GitHub Actions,
and the timeout would sometimes trigger.

Finally, use "go env" rather than "go version" on CI,
which gives us much more useful information,
and also includes Go's own version now via GOVERSION.

Fixes #426.
3 years ago
Daniel Martí d146d82d33 prepare changelog for 0.5.0 3 years ago
Daniel Martí 4e97811a62 give a useful error for "garble build -tiny"
We've had confused users a handful of times by now.
And it's reasonable to expect flags to be after the command,
as that's how flags work for cmd/go itself.

I don't think we want to mix our flags with Go's,
or start accepting flags in either place.
Both seem like worse solutions long-term, as they can add confusion.

However, we can quickly give a useful hint when a flag is misplaced.
That should get new users unblocked without asking for help.

We use a regular expression for this purpose,
because it doesn't seem like a FlagSet supports what we need;
to detect whether an argument is one of our flags,
without actually applying its value to the flagset.
Our flagset would also error on Go's flags, which we don't want.
3 years ago
Daniel Martí 5abd3c468d update cmd/go flags for 1.18
As of 1.18beta1. I used bash commands like:

	diff -u <(go1.17.5 help build) <(gotip help build)

While here, supply -buildinfo=false and -buildvcs=false to go build.
We already remove that information by discarding the _gomod_.go file,
but we might as well pass the flags too.
If anything, it lets the toolchain avoid the work entirely.
Note that we can't use these flags on Go 1.17 for now, though.

Add a TODO that came to mind while writing this, too.
3 years ago
Daniel Martí 93b2873c28 ensure the runtime is built in a reproducible way
We went to great lengths to ensure garble builds are reproducible.
This includes how the tool itself works,
as its behavior should be the same given the same inputs.

However, we made one crucial mistake with the runtime package.
It has go:linkname directives pointing at other packages,
and some of those pointed packages aren't its dependencies.

Imagine two scenarios where garble builds the runtime package:

1) We run "garble build runtime". The way we handle linkname directives
   calls listPackage on the target package, to obfuscate the target's
   import path and object name. However, since we only obtained build
   info of runtime and its deps, calls for some linknames such as
   listPackage("sync/atomic") will fail. The linkname directive will
   leave its target untouched.

2) We run "garble build std". Unlike the first scenario, all listPackage
   calls issued by runtime's linkname directives will succeed, so its
   linkname directive targets will be obfuscated.

At best, this can result in inconsistent builds, depending on how the
runtime package was built. At worst, the mismatching object names can
result in errors at link time, if the target packages are actually used.

The modified test reproduces the worst case scenario reliably,
when the fix is reverted:

	> env GOCACHE=${WORK}/gocache-empty
	> garble build -a runtime
	> garble build -o=out_rebuild ./stdimporter
	[stderr]
	# test/main/stdimporter
	JZzQivnl.NtQJu0H3: relocation target JZzQivnl.iioHinYT not defined
	JZzQivnl.NtQJu0H3.func9: relocation target JZzQivnl.yz5z0NaH not defined
	JZzQivnl.(*ypvqhKiQ).String: relocation target JZzQivnl.eVciBQeI not defined
	JZzQivnl.(*ypvqhKiQ).PkgPath: relocation target JZzQivnl.eVciBQeI not defined
	[...]

The fix consists of two steps. First, if we're building the runtime and
listPackage fails on a package, that means we ran into scenario 1 above.
To avoid the inconsistency, we fill ListedPackages with "go list [...] std".
This means we'll always build runtime as described in scenario 2 above.

Second, when building packages other than the runtime,
we only allow listPackage to succeed if we're listing a dependency of
the current package.
This ensures we won't run into similar reproducibility bugs in the future.

Finally, re-enable test-gotip on CI since this was the last test flake.
3 years ago
Daniel Martí bd0705a536 amend redirecting URLs via "xurls -fix" 3 years ago
Daniel Martí e71dbc6f9e testdata: adjust for newer Go 1.18 tip version
The output of "go version -m" has changed format slightly:

	main: devel go1.18-b357b05b70 Thu Dec 23 20:03:38 2021 +0000
		path	test/main
		mod	test/main	(devel)
		build	-compiler=gc
		build	-tags=veryuniquebuildtag
		build	CGO_ENABLED=1
		build	CGO_CFLAGS=
		build	CGO_CPPFLAGS=
		build	CGO_CXXFLAGS=
		build	CGO_LDFLAGS=
		build	GOARCH=amd64
		build	GOOS=linux
		build	GOAMD64=v3
		build	vcs=git
		build	vcs.revision=5c92c49896a08d9152669b5bcbf0a9d85c5cb2ed
		build	vcs.time=2021-12-24T18:52:32Z
		build	vcs.modified=true
3 years ago
Daniel Martí 06d5972223
add -debug to aid in debugging (#431)
Not really meant for end users,
but they might still debug failures before filing bugs.

We add the -debug flag itself,
as well as machinery to deduplicate output lines.
There are quite a lot of them otherwise,
which aren't helpful and simply add noise.

In the future, if we always want to output a debug log line,
such as "choosing not to obfuscate here because X",
we can simply insert the unique position string.

Finally, turn all commented-out log.Printf calls to debugf.
Improve a few log lines to be more human-friendly,
and also add a few extras like how long it takes to load files.

We can improve the logging further in the future.
This seems like a good starting point.
3 years ago