19 Commits (d2beda1f0016a11b0b0a364eff06a153eb4a448a)

Author SHA1 Message Date
Daniel Martí 4f743b0861 move curPkg and origImporter out of the globals
It is true that each garble process only obfuscates up to one package,
which is why we made them globals to begin with.
However, garble does quite a lot more now,
such as reversing the obfuscation of many packages at once.
Having a global "current package" variable makes mistakes easier.

Some funcs, like those in transformFuncs, are now transformer methods.
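A rough sketch of the shape of this change, using only hypothetical names (pkgInfo, its ImportPath field, and describe are illustrative stand-ins, not garble's actual definitions). Holding the package on a receiver means two packages can be handled side by side, which a single global cannot express:

```go
package main

import "fmt"

// pkgInfo is a hypothetical stand-in for whatever describes a package.
type pkgInfo struct {
	ImportPath string
}

// transformer carries the per-package state that used to live in a global.
type transformer struct {
	curPkg *pkgInfo
}

// describe is a method, so it can only see the package it was given.
func (tf *transformer) describe(name string) string {
	return tf.curPkg.ImportPath + "." + name
}

func main() {
	// Two transformers coexist without stepping on each other's state.
	a := &transformer{curPkg: &pkgInfo{ImportPath: "example.com/a"}}
	b := &transformer{curPkg: &pkgInfo{ImportPath: "example.com/b"}}
	fmt.Println(a.describe("Foo"))
	fmt.Println(b.describe("Foo"))
}
```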
1 year ago
Daniel Martí 60dbece24f work around another go/printer bug to fix andybalholm/brotli
When obfuscating the following piece of code:

	func issue_573(s struct{ f int }) {
		var _ *int = &s.f
		/*x*/
	}

the function body would roughly end up printed as:

	var _ *int = &dZ4xYx3N
	/*x*/.rbg1IM3V

Note that the /*x*/ comment got moved earlier in the source code.
This happens because the new identifiers are longer, so the printer
thinks that the selector now ends past the comment.

That would be fine - we don't really mind where comments end up,
because these non-directive comments end up being removed anyway.

However, the resulting syntax is wrong, as the period for the selector
must be on the first line rather than the second.
This is a go/printer bug that we should fix upstream,
but until then, we must work around it in Go 1.18.x and 1.19.x.

The fix is somewhat obvious in hindsight. To reduce the chances that
go/printer will trip over comments and produce invalid syntax,
get rid of most comments before we use the printer.
We still keep the removal of comments after printing,
since go/printer consumes some comments in ast.Node Doc fields.
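A minimal sketch of the idea, not the actual patch: drop comment groups from the parsed file before handing it to go/printer. Treating a "//go:" prefix as the marker for directives worth keeping is an assumption made for this example, and printWithoutComments is a hypothetical helper name.

```go
package main

import (
	"bytes"
	"fmt"
	"go/ast"
	"go/parser"
	"go/printer"
	"go/token"
	"strings"
)

// printWithoutComments parses src, drops every comment group that holds no
// "//go:" directive, and prints the result. It also reports how many comment
// groups were dropped.
func printWithoutComments(src string) (out string, dropped int, err error) {
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "", src, parser.ParseComments)
	if err != nil {
		return "", 0, err
	}
	var kept []*ast.CommentGroup
	for _, group := range file.Comments {
		isDirective := false
		for _, c := range group.List {
			if strings.HasPrefix(c.Text, "//go:") {
				isDirective = true
				break
			}
		}
		if isDirective {
			kept = append(kept, group)
		} else {
			dropped++
		}
	}
	file.Comments = kept

	var buf bytes.Buffer
	if err := printer.Fprint(&buf, fset, file); err != nil {
		return "", 0, err
	}
	return buf.String(), dropped, nil
}

func main() {
	src := "package p\n\n//go:noinline\nfunc f(s struct{ f int }) {\n\tvar _ *int = &s.f\n\t/*x*/\n}\n"
	out, dropped, err := printWithoutComments(src)
	if err != nil {
		panic(err)
	}
	fmt.Printf("dropped %d comment group(s)\n%s", dropped, out)
}
```

Comments attached to Doc fields can still reach the output in some cases, which is why a second removal pass after printing remains useful.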

Add the minimized unit test case above, and add the upstream project
that found this bug to check-third-party.
andybalholm/brotli helps cover a compression algorithm and ccgo code
generation from C to Go, and it's also a fairly popular module,
particularly with HTTP implementations which want pure-Go brotli.

While here, fix the check-third-party script: it was setting GOFLAGS
a bit too late, so it may run `go get` on the wrong mod file.

Fixes #573.
2 years ago
Daniel Martí 21dfbd3379 obfuscate cgo-generated-Go filenames
It's not a problem to leak filenames like _cgo_gotypes.go,
but it is a problem when it includes the import path:

	$ strings main | grep _cgo_gotypes
	test/main/_cgo_gotypes.go

Here, "test/main" is the module path, which we want to hide.
We hadn't caught this before because the cgo.txt test did not check that
module paths aren't being leaked - it does now.

The fix is rather simple; we let printFile handle cgo-generated files.
We used to avoid that due to compiler errors, as the compiler only
allows some special cgo comment directives to work in cgo-generated
code, to prevent misuse in user code.

Satisfying the compiler is rather easy: the obfuscated filenames should
begin with "_cgo_" to pass its check.
2 years ago
Daniel Martí d2622e8223 replace go/parser with go/scanner in printFile
printFile is one of the functions to blame for most of the CPU cost and
allocations for garble itself, as reported by `perf record` for a clean build.

One contributor is how we print each file and then parse it again,
which we did for the sake of inserting line directives correctly.

With a bit of care, we can do this by tokenizing after printing,
as opposed to parsing into a full go/ast again.

This is moderately cheaper, but more than anything, allocates far less.
That is to be expected given how go/ast is a tree of pointers,
whereas go/scanner simply gives us a stream of tokens.
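To illustrate the difference in shape, here is a small sketch that finds token offsets with go/scanner alone, without building any tree. The helper funcKeywordOffsets is hypothetical and only stands in for "find the places where something needs inserting"; it is not what printFile does.

```go
package main

import (
	"fmt"
	"go/scanner"
	"go/token"
)

// funcKeywordOffsets returns the byte offset of every "func" keyword in src,
// found by scanning tokens rather than parsing into a go/ast tree.
func funcKeywordOffsets(src []byte) []int {
	fset := token.NewFileSet()
	file := fset.AddFile("", fset.Base(), len(src))

	var s scanner.Scanner
	s.Init(file, src, nil, scanner.ScanComments)

	var offsets []int
	for {
		pos, tok, _ := s.Scan()
		if tok == token.EOF {
			break
		}
		if tok == token.FUNC {
			offsets = append(offsets, file.Offset(pos))
		}
	}
	return offsets
}

func main() {
	src := []byte("package p\n\nfunc f() {}\n\nfunc g() {}\n")
	fmt.Println(funcKeywordOffsets(src))
}
```

The loop holds one token at a time, so nothing comparable to a tree of pointers needs allocating.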

	name      old time/op         new time/op         delta
	Build-16          10.4s ± 2%          10.3s ± 1%    ~     (p=0.393 n=10+10)

	name      old bin-B           new bin-B           delta
	Build-16          5.51M ± 0%          5.51M ± 0%    ~     (all equal)

	name      old cached-time/op  new cached-time/op  delta
	Build-16          398ms ±12%          391ms ±10%    ~     (p=0.529 n=10+10)

	name      old mallocs/op      new mallocs/op      delta
	Build-16          34.4M ± 0%          31.8M ± 0%  -7.65%  (p=0.000 n=10+10)

	name      old sys-time/op     new sys-time/op     delta
	Build-16          5.80s ± 6%          5.86s ± 4%    ~     (p=0.218 n=10+10)

The new code is shorter, but perhaps a bit trickier,
so I also added more comments to explain what's going on.

Note how the time/op change is practically noise,
but mallocs/op goes down significantly, which is always a good sign.
2 years ago
Daniel Martí 99f9b88363 fix a couple of TODOs in position.go
Now that we've required Go 1.18 or later for some time,
stop supporting `// +build` directives entirely.
That should be fine, given that `go build` will fail too.

The TODO about ToObfuscate is also obsolete; see the added comment.

Finally, tweak the comments a bit after reading them again.
2 years ago
Daniel Martí d18dd73556 use go/parser.SkipObjectResolution
We don't use go/ast.Objects, as we use go/types instead.
Avoiding this work saves a bit of CPU and memory allocs.
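A short sketch of what the flag changes, assuming Go 1.17 or later: with parser.SkipObjectResolution, identifiers no longer carry an *ast.Object. The helper resolvedIdents is a hypothetical name for this example.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// resolvedIdents parses src and counts the identifiers whose Obj field was
// filled in by the parser's object resolution.
func resolvedIdents(src string, skip bool) (int, error) {
	var mode parser.Mode
	if skip {
		mode |= parser.SkipObjectResolution
	}
	fset := token.NewFileSet()
	file, err := parser.ParseFile(fset, "", src, mode)
	if err != nil {
		return 0, err
	}
	n := 0
	ast.Inspect(file, func(node ast.Node) bool {
		if id, ok := node.(*ast.Ident); ok && id.Obj != nil {
			n++
		}
		return true
	})
	return n, nil
}

func main() {
	src := "package p\n\nfunc f() int { x := 1; return x }\n"
	with, _ := resolvedIdents(src, false)
	without, _ := resolvedIdents(src, true)
	fmt.Println("resolved by default:", with)
	fmt.Println("resolved with SkipObjectResolution:", without)
}
```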

	name      old time/op         new time/op         delta
	Build-16          10.2s ± 1%          10.2s ± 1%    ~     (p=0.937 n=6+6)

	name      old bin-B           new bin-B           delta
	Build-16          5.47M ± 0%          5.47M ± 0%    ~     (all equal)

	name      old cached-time/op  new cached-time/op  delta
	Build-16          328ms ±14%          321ms ± 6%    ~     (p=0.589 n=6+6)

	name      old mallocs/op      new mallocs/op      delta
	Build-16          34.8M ± 0%          34.0M ± 0%  -2.26%  (p=0.010 n=6+4)

	name      old sys-time/op     new sys-time/op     delta
	Build-16          5.89s ± 3%          5.89s ± 3%    ~     (p=0.937 n=6+6)

See golang/go#52463.
2 years ago
Daniel Martí 1c564ef091 slightly improve code thanks to Go 1.18 APIs
strings.Cut makes some string handling code more intuitive.
Note that we can't use it everywhere, as some places need LastIndexByte.
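As a small illustration with made-up inputs: strings.Cut always splits at the first separator, so code that needs the last one still has to use LastIndexByte. Both helper names below are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// splitKeyValue splits "key=value" at the first "=".
func splitKeyValue(s string) (key, value string, ok bool) {
	return strings.Cut(s, "=")
}

// splitLastDot splits at the last ".", which strings.Cut cannot do
// because it only ever cuts at the first occurrence.
func splitLastDot(s string) (before, after string, ok bool) {
	i := strings.LastIndexByte(s, '.')
	if i < 0 {
		return s, "", false
	}
	return s[:i], s[i+1:], true
}

func main() {
	fmt.Println(splitKeyValue("a=b=c"))
	fmt.Println(splitLastDot("example.com/pkg.Name"))
}
```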

Start using x/exp/slices, too, which is our first use of generics.
Note that its API is experimental and may still change,
but since we are not a library, we can control its version updates.

I also noticed that we were using TrimSpace for importcfg files.
It's actually unnecessary if we swap strings.SplitAfter for Split,
as the only whitespace present was the trailing newline.
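The difference is easy to see on importcfg-style input. The file paths below are made up for the example, and the helper names are hypothetical:

```go
package main

import (
	"fmt"
	"strings"
)

// linesSplitAfter keeps the "\n" separator at the end of each element.
func linesSplitAfter(data string) []string {
	return strings.SplitAfter(data, "\n")
}

// linesSplit drops the separator, so no trimming is needed afterwards.
func linesSplit(data string) []string {
	return strings.Split(data, "\n")
}

func main() {
	data := "packagefile fmt=/tmp/fmt.a\npackagefile os=/tmp/os.a\n"
	fmt.Printf("%q\n", linesSplitAfter(data))
	fmt.Printf("%q\n", linesSplit(data))
}
```

Both forms yield a final empty element for input ending in a newline, so callers still need to skip empty lines.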

While here, I noticed an unused copy of printfWithoutPackage.
2 years ago
lu4p 1a0b028db7 all: drop support for Go 1.17
Now that we've released v0.6.0, that will be the last feature release to
support Go 1.17. The upcoming v0.7.0 will require Go 1.18 or later.

Code-wise, the cleanup here isn't super noticeable,
but it will be easier to work on features like VCS-aware version
information and generics support without worrying about Go 1.17.
Plus, now CI is back to being much faster.

Note how "go 1.18" in go.mod makes "go mod tidy" more aggressive.
2 years ago
Daniel Martí c1c90fee13 make obfuscation fully deterministic with -seed
The default behavior of garble is to seed via the build inputs,
including the build IDs of the entire Go build of each package.
This works well as a default, and does give us determinism,
but it means that building for different platforms
will result in different obfuscation per platform.

Instead, when -seed is provided, don't use any other hash seed or salt.
This means that a particular Go name will be obfuscated the same way
as long as the seed, package path, and name itself remain constant.
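A minimal sketch of that property, not garble's actual hashing: the output depends only on the seed, the package path, and the name. The choice of SHA-256, the base64 encoding, the 8-character length, and the hashName helper are all assumptions for illustration, and the result is not guaranteed to be a valid Go identifier.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// hashName derives a short string from the seed, package path, and name only.
// No build IDs or platform details are mixed in, so the result is stable
// across platforms for the same three inputs.
func hashName(seed []byte, pkgPath, name string) string {
	h := sha256.New()
	h.Write(seed)
	h.Write([]byte{0}) // separators keep ("ab","c") distinct from ("a","bc")
	h.Write([]byte(pkgPath))
	h.Write([]byte{0})
	h.Write([]byte(name))
	sum := h.Sum(nil)
	return base64.RawURLEncoding.EncodeToString(sum)[:8]
}

func main() {
	seed := []byte("example-seed")
	fmt.Println(hashName(seed, "example.com/pkg", "Foo"))
	fmt.Println(hashName(seed, "example.com/pkg", "Foo")) // same inputs, same output
	fmt.Println(hashName(seed, "example.com/pkg", "Bar"))
}
```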

In other words, when the user supplies a custom -seed,
we assume they know what they're doing in terms of storage and rotation.

Expand the README docs with more examples and detail.

Fixes #449.
2 years ago
Daniel Martí 7994877a52 allocate slightly less in printFile
Reuse buffers, so they only grow once.
Join a fmt.Sprintf with a string addition.
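For reference, the buffer reuse pattern looks roughly like the sketch below. The renderer type and bracket method are hypothetical and unrelated to printFile's real code; the point is only that Reset keeps the buffer's capacity between calls.

```go
package main

import (
	"bytes"
	"fmt"
)

// renderer owns a buffer that is reused across calls, so its backing array
// only needs to grow to the largest size once.
type renderer struct {
	buf bytes.Buffer
}

// bracket writes "[s]" into the shared buffer and returns it as a string.
func (r *renderer) bracket(s string) string {
	r.buf.Reset() // empties the buffer but keeps its allocated capacity
	r.buf.WriteByte('[')
	r.buf.WriteString(s)
	r.buf.WriteByte(']')
	return r.buf.String()
}

func main() {
	var r renderer
	for _, s := range []string{"first", "second", "third"} {
		fmt.Println(r.bracket(s))
	}
}
```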

	name      old time/op         new time/op         delta
	Build-16          9.39s ± 2%          9.28s ± 2%    ~     (p=0.548 n=5+5)

	name      old bin-B           new bin-B           delta
	Build-16          5.16M ± 0%          5.16M ± 0%    ~     (all equal)

	name      old cached-time/op  new cached-time/op  delta
	Build-16          324ms ± 6%          317ms ± 2%    ~     (p=0.421 n=5+5)

	name      old mallocs/op      new mallocs/op      delta
	Build-16          29.4M ± 0%          29.3M ± 0%  -0.25%  (p=0.008 n=5+5)

	name      old sys-time/op     new sys-time/op     delta
	Build-16          4.66s ± 2%          4.59s ± 2%    ~     (p=0.310 n=5+5)
2 years ago
Daniel Martí 5f74a1c9f0 unify the definition and storage of flag values
The parent garble process parses the original flags,
as provided by the user via the command line.
Previously, those got stored in the shared cache file,
so that child processes spawned by toolexec could see them.

Unfortunately, this made the code relatively easy to misuse.
A child process would always see flagLiterals as zero value,
given that it should never see such a flag argument directly.
Similarly, one would have to be careful with cached options,
as they could only be consumed after the cache file is loaded.

Simplify the situation by deduplicating the storage of flags.
Now, the parent passes all flags onto children via toolexec.

One exception is GarbleDir, which now becomes an env var.
This seems in line with other top-level dirs like GARBLE_SHARED.

Finally, we turn -seed into a flag.Value,
which lets us implement its "set" behavior as part of flag.Parse.
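A sketch of the flag.Value approach, under the assumption that the seed is given as unpadded base64. The seedFlag type and parseSeed helper are hypothetical names, and the decoding rules here are not necessarily the ones garble applies.

```go
package main

import (
	"encoding/base64"
	"flag"
	"fmt"
	"io"
)

// seedFlag implements flag.Value, so decoding happens during flag parsing.
type seedFlag struct {
	bytes []byte
}

func (f *seedFlag) String() string {
	return base64.RawStdEncoding.EncodeToString(f.bytes)
}

// Set is called by the flag package with the raw command-line value.
func (f *seedFlag) Set(s string) error {
	b, err := base64.RawStdEncoding.DecodeString(s)
	if err != nil {
		return fmt.Errorf("invalid seed: %w", err)
	}
	f.bytes = b
	return nil
}

// parseSeed parses args and returns the decoded seed, if any.
func parseSeed(args []string) ([]byte, error) {
	var seed seedFlag
	fs := flag.NewFlagSet("sketch", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text quiet in this example
	fs.Var(&seed, "seed", "base64-encoded seed")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return seed.bytes, nil
}

func main() {
	b, err := parseSeed([]string{"-seed=aGVsbG8"})
	fmt.Printf("%q %v\n", b, err)
}
```

Because validation lives in Set, a malformed value is rejected by the parser itself rather than by later code.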

Overall, we barely reduce the amount of code involved,
but we certainly remove a couple of footguns.
As part of the cleanup, we also introduce appendFlags.
3 years ago
Daniel Martí fceb19f6da deprecate using GOPRIVATE in favor of GOGARBLE (#427)
Piggybacking off of GOPRIVATE is great for a number of reasons:

* People tend to obfuscate private code, whose package paths will
  generally be in GOPRIVATE already

* Its meaning and syntax are well understood

* It allows all the flexibility we need without adding our own env var
  or config option

However, using GOPRIVATE directly has one main drawback.
It's fairly common to also want to obfuscate public dependencies,
to make the code in private packages even harder to follow.
Doing that with "GOPRIVATE=*" has two main downsides:

* GONOPROXY defaults to GOPRIVATE, so the proxy would be entirely disabled.
  Downloading modules, such as when adding or updating dependencies,
  or when the local cache is cold, can be less reliable.

* GONOSUMDB defaults to GOPRIVATE, so the sumdb would be entirely disabled.
  Adding entries to go.sum, such as when adding or updating dependencies,
  can be less secure.

We will continue to consume GOPRIVATE as a fallback,
but we now expect users to set GOGARBLE instead.
The new logic is documented in the README.
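The fallback order alone can be sketched as follows. The obfuscatePatterns helper is a hypothetical name, and this covers only the lookup order described above, not the pattern matching that the README documents.

```go
package main

import (
	"fmt"
	"os"
)

// obfuscatePatterns returns the GOGARBLE value if set, and otherwise falls
// back to GOPRIVATE. The getenv parameter makes the lookup easy to test.
func obfuscatePatterns(getenv func(string) string) string {
	if v := getenv("GOGARBLE"); v != "" {
		return v
	}
	return getenv("GOPRIVATE")
}

func main() {
	fmt.Printf("patterns to obfuscate: %q\n", obfuscatePatterns(os.Getenv))
}
```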

While here, rewrite some uses of "private" with "to obfuscate",
to make the code easier to follow and harder to misunderstand.

Fixes #276.
3 years ago
Daniel Martí 8edde922ee remove unused code spotted by -coverprofile
Remove some asthelper APIs that haven't been used for some time.
They can be recovered from the git history if needed again.

One type assertion in the literals package is always true.

Embedded field objects are handled near the top of transformGo, so the
extra !obj.Embedded() check was always true. Remove it.

We always obfuscate standalone funcs now, so the obfuscatedTypesPackage
check is no longer necessary. It was only needed back when we left func
names unobfuscated if they were used in linkname directives.

The workaround for test package imports in obfuscatedTypesPackage I had
to add a few commits ago no longer seems to be necessary. This might be
thanks to the simplification with functions in the paragraph just above.

It's impossible to run garble without -trimpath nowadays, as we error
before the build even starts:

	$ go build -toolexec=garble
	go tool compile: exit status 1
	cannot open shared file, this is most likely due to not running "garble [command]"

When run as "garble build", the trimpath flag is always set. So the
check in alterTrimpath never triggers anymore, and couldn't be tested.

Finally, simplify the handling of comment syntax in printFile, and add a
few TODOs for other code paths not covered by our existing tests.

Total code coverage is up from 90.3% to 91.0%.
3 years ago
Daniel Martí 2ef9386942 use an empty filename when re-parsing source files
In printFile, we print and re-parse the modified AST to be able to have
reliable position information.

The re-parsing step can fail if something goes very wrong, such as a bug
in -literals. It should generally not happen. However, in rare cases it
has happened, and it's confusing for the end user to see syntax errors
pointing at an existing file on disk, when the code doesn't align,
since we're working on a modified copy.

To prevent such confusion, use an empty filename. Syntax errors will
still not be terribly helpful, but they should be extremely rare and
promptly fixed, so that's not a huge concern.
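The effect on error messages can be seen with go/parser directly. In this sketch, parseError is a hypothetical helper and "main.go" is an arbitrary example filename:

```go
package main

import (
	"fmt"
	"go/parser"
	"go/token"
)

// parseError parses src under the given filename and returns the error text,
// or an empty string if parsing succeeded.
func parseError(filename, src string) string {
	fset := token.NewFileSet()
	_, err := parser.ParseFile(fset, filename, src, 0)
	if err == nil {
		return ""
	}
	return err.Error()
}

func main() {
	bad := "package p\n\nfunc {\n" // deliberately invalid source

	// With a filename, the error points at a file the user may have on disk.
	fmt.Println(parseError("main.go", bad))

	// With an empty filename, only the line and column are reported.
	fmt.Println(parseError("", bad))
}
```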

For that same reason, we can't really add a good test here. We could
perhaps add a test that forces garble to mess up the src slice in some
way, but that would be a weird test, and not particularly worth it.

Fixes #286.
3 years ago
Daniel Martí d38dfd4e90 make garble work on Go tip again
Just two minor tweaks were necessary to get "go test" to pass on:

	go version devel go1.17-a25c584629 Tue Apr 6 04:48:09 2021 +0000 linux/amd64

Re-enable the CI for it, too. The config needed changing since the
set-env and add-path commands now use special files instead, due to some
security issues uncovered last winter.

It's possible that CI on master could suddenly break, if Go master
changes in some substantial way that requires more tweaks. If that turns
out to be an issue pretty often, we could always pin a specific git repo
commit and update it every few weeks.
3 years ago
Daniel Martí e2f06cce94 set positions when using cursor.Replace
The regular obfuscation process simply modifies some simple nodes, such
as identifiers and strings. In those cases, we modify the nodes
in-place, meaning that their positions remain the same. This hasn't
caused any problems.

Literal obfuscation is trickier. Since we replace one expression with an
entirely different one, we use cursor.Replace. The new expression is
entirely made up on the spot, so it lacks position information.

This was causing problems. For example, in the added test input:

	> garble -literals build
	[stderr]
	# test/main
	dgcm4t6w.go:3: misplaced compiler directive
	dgcm4t6w.go:4: misplaced compiler directive
	dgcm4t6w.go:3: misplaced compiler directive
	dgcm4t6w.go:6: misplaced compiler directive
	dgcm4t6w.go:7: misplaced compiler directive
	dgcm4t6w.go:3: misplaced compiler directive
	dgcm4t6w.go:9: misplaced compiler directive
	dgcm4t6w.go:3: misplaced compiler directive
	dgcm4t6w.go:3: too many errors

The build errors are because we'd move the compiler directives, which
makes the compiler unhappy as they must be directly followed by a
function declaration.

The root cause there seems to be that, since the replacement nodes lack
position information, go/printer would try to estimate its printing
position by adding to the last known position. Since -literals adds
code, this would result in the printer position increasing rapidly, and
potentially printing directive comments earlier than needed.

For now, making the replacement nodes have the same position as the
original node seems to stop go/printer from making this mistake.
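A stdlib-only sketch of giving a made-up replacement the original node's span. The real code uses cursor.Replace; the "decode" call, replacementCall, and sameSpan below are hypothetical and exist only to show which position fields get filled in.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/token"
)

// replacementCall builds decode(<lit>) as a stand-in for an obfuscated
// literal, giving the new nodes the position span of the original literal.
func replacementCall(orig *ast.BasicLit) *ast.CallExpr {
	return &ast.CallExpr{
		Fun:    &ast.Ident{Name: "decode", NamePos: orig.Pos()},
		Lparen: orig.Pos(),
		Args: []ast.Expr{&ast.BasicLit{
			Kind:     token.STRING,
			Value:    orig.Value,
			ValuePos: orig.Pos(),
		}},
		Rparen: orig.End() - 1, // CallExpr.End() is Rparen+1
	}
}

// sameSpan reports whether the replacement covers exactly the original span.
func sameSpan(pos token.Pos, value string) bool {
	orig := &ast.BasicLit{Kind: token.STRING, Value: value, ValuePos: pos}
	call := replacementCall(orig)
	return call.Pos() == orig.Pos() && call.End() == orig.End()
}

func main() {
	fmt.Println(sameSpan(42, `"hello"`))
}
```

Without the position fields, every new node would report token.NoPos, leaving the printer to guess where it is in the file.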

It's possible that this workaround won't be bulletproof forever, but it
works well for now, and I don't see a simpler workaround right now.
It would be possible to use fancier mechanisms like go/ast.CommentMap or
dave/dst, but those are a significant amount of added complexity as well.

Fixes #285.
3 years ago
Daniel Martí 1a8e32227f improve "reverse" even further (#289)
Fix up a few TODOs, and simplify the way we handle comments.

We now add whitespace around inline /*line*/ directives, to ensure we
don't break programs. A test case is added too.

We now add line directives to call sites, not function declarations,
since those are what actually shows up in stack traces.
It's unclear if we care about any other lines inside functions at all.
This also fixes reversing with -literals, since that feature adds a
significant amount of code which shuffles line numbers around.

Finally, we extend the tests with types, methods, and anonymous
functions, and we make all of them work well.

Updates #5.
3 years ago
Daniel Martí a8c5d534d1 support reversing stack trace positions (#287)
In particular, the positions within function declarations, including the
positions of call sites to other functions.

Note that this isn't well tested just yet, particularly not with other
features like -literals. We can extend the tests and code over time.
This gets us the core basics.

The issue will be closed once the feature is documented for users, in a
follow-up PR.

Updates #5.
3 years ago
Daniel Martí 961daf20c4 rework the position obfuscator (#282)
First, rename line_obfuscator.go to position.go. We obfuscate filenames,
not just line numbers, and "obfuscator" is a bit redundant.

Second, use "/*line :x*/" comments rather than the "//line :x" form, as
the former allows us to insert them in any position without adding
unnecessary newlines. That matters for changing the position
of call sites, which "garble reverse" will rely on.

Third, do not rely on go/ast to remove and add comments. Since they are
free-floating, we can very easily end up with misplaced comments,
especially as the literal obfuscator heavily modifies the AST.

The new method prints and re-parses the file, to ensure all node
positions are consistent with a buffer, buf1. Then, we copy the contents
into a new buffer, buf2, while inserting the comments that we need.
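The copy step can be sketched as below. The insertion type, the withLineComments helper, and the choice to pad each comment with spaces are assumptions of this example rather than the code in position.go.

```go
package main

import (
	"bytes"
	"fmt"
	"sort"
)

// insertion is a line comment to place at a byte offset of the printed source.
type insertion struct {
	offset int
	line   int
}

// withLineComments copies src into a new buffer, inserting a "/*line :N*/"
// comment, padded with spaces, at each requested offset.
func withLineComments(src []byte, ins []insertion) []byte {
	// Work on a sorted copy so the caller's slice is left untouched.
	sorted := append([]insertion(nil), ins...)
	sort.Slice(sorted, func(i, j int) bool {
		return sorted[i].offset < sorted[j].offset
	})

	var out bytes.Buffer
	last := 0
	for _, in := range sorted {
		out.Write(src[last:in.offset])
		fmt.Fprintf(&out, " /*line :%d*/ ", in.line)
		last = in.offset
	}
	out.Write(src[last:])
	return out.Bytes()
}

func main() {
	src := []byte("f(); g()")
	ins := []insertion{{offset: 5, line: 9}, {offset: 0, line: 3}}
	fmt.Printf("%q\n", withLineComments(src, ins))
}
```

Since the offsets come from the already-printed buffer, they remain valid while the copy is built.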

The new method also modifies line numbers at the very end of obfuscating
a Go file, instead of at the very beginning. That's going to be more
robust long-term, as we will also obfuscate line numbers for any
additions or modifications to the AST.

Fourth, detachedDirectives is unnecessary, as we can accomplish the same
with two simple prefix matches.

Finally, this means we can stop using detachedComments entirely, as
printFile already inserts the comments we need.

For #5.
3 years ago