Commit Graph

312 Commits (ff361f6b1599c542c389904e1666357a0a558cb2)
 

Author SHA1 Message Date
Daniel Martí 664f834906 document "garble reverse"
Fixes #5.
3 years ago
Daniel Martí 1662fc93b4 make reverse error if no changes were made
Just the exit status code is used. Printing an error would be confusing,
since we're printing the input too. Plus, the error is binary anyway -
either we changed something, or we did not.

One could make a case for only printing if we changed anything, and
showing an error otherwise. But that would mean buffering the entire
input, which could be problematic or slow when e.g. reversing large log
files.
3 years ago
Daniel Martí fe095ef132
handle unknown flags in reverse (#290)
While at it, expand the tests for build and test too.
3 years ago
Daniel Martí 1a8e32227f
improve "reverse" even further (#289)
Fix up a few TODOs, and simplify the way we handle comments.

We now add whitespace around inline /*line*/ directives, to ensure we
don't break programs. A test case is added too.

We now add line directives to call sites, not function declarations,
since those are what actually shows up in stack traces.
It's unclear if we care about any other lines inside functions at all.
This also fixes reversing with -literals, since that feature adds a
significant amount of code which shuffles line numbers around.

Finally, we extend the tests with types, methods, and anonymous
functions, and we make all of them work well.

Updates #5.
3 years ago
Daniel Martí 10ec00b37a
make flags like -literals and GOPRIVATE affect hashing (#288)
In 6898d61637, we switched from using action IDs from "go list
-toolexec=garble" to those from the original "go list". We still wanted
the obfuscation and hashing to change if the version of garble changes,
so we hashed that "original action ID" with garble's own content ID, and
called the new hash "garble action ID".

While working on a different patch, I noticed something weird: with the
new mechanism, adding or removing flags like -literals did not alter
those hashes, unlike the old method. This is because the old method used
ownContentID, which includes such bits of information, but the new
method does not.

Change that, and add a test that locks in the behavior we want. In
seed.txt, we check that a single function name gets hashed in particular
ways in different scenarios.

Note that we use a mix of "cmp" and "! bincmp", since the former has no
negated form.

While at it, the seed.txt test is revamped a bit. Now, we only run with
-literals once, as this test is mainly about -seed. We also declare seed
strings once, as environment variables, which makes it easier to track
what each step is doing.
3 years ago
Daniel Martí a8c5d534d1
support reversing stack trace positions (#287)
In particular, the positions within function declarations, including the
positions of call sites to other functions.

Note that this isn't well tested just yet, particularly not with other
features like -literals. We can extend the tests and code over time.
This gets us the core basics.

The issue will be closed once the feature is documented for users, in a
follow-up PR.

Updates #5.
3 years ago
Daniel Martí ce2c45440a
simplify the uses of obfuscatedTypesPackage (#284)
obfuscatedTypesPackage is now only useful for one scenario: a type which
is used with reflection in a dependency, so neither its name nor any of
its potential struct field names are obfuscated.

This is because we can only detect the use of reflection with go/ast,
which we don't have for dependencies.

As such, we only need obfuscatedTypesPackage in two places - when
considering to obfuscate a field or a type name.

There were two other calls to the function, which we remove.

The first was used for linkname directives. Those directives only work
for variables and functions, neither of which is affected by the
reflection detection.

The second was used for all identifiers, at the very end of the
transformGo inner func. It's entirely unnecessary right now, as it never
triggers anymore. It's possible it was necessary some time ago when we
still didn't obfuscate assembly functions.

While at it, improve some comments and add a few TODOs for edge cases
which do not have code coverage.
3 years ago
Daniel Martí 091f8239c0 rework the build benchmarks
First, stop writing binaries into the current directory, which pollutes
the git clone.

Second, split the benchmark into two. The old benchmark always used the
build cache after the first iteration, meaning that we weren't really
measuring the cost of cold fresh builds.

The new benchmarks show a build with an always-warm cache, and one
without any cache.

Note that NoCache with the main package importing "fmt" took about 4s
wall time, which makes benchmarking too slow. For that reason, the new
bench-nocache program has no std dependencies other than runtime, which
already pulls in half a dozen dependencies we recompile at every
iteration. This reduces the wall time to 2s, which is bearable.

On the other hand, Cache is already fast, so we add a second and
slightly heavier dependency, net/http. The build still takes under 300ms
of wall time. This also helps the Cache benchmark imitate larger rebuilds
with a warm cache.

Longer term, both benchmarks will be useful, because we want both
scenarios to be as efficient as possible.

	name             time/op
	Build/Cache-8      161ms ± 1%
	Build/NoCache-8    1.21s ± 1%

	name             bin-B
	Build/Cache-8      6.35M ± 0%
	Build/NoCache-8    6.35M ± 0%

	name             sys-time/op
	Build/Cache-8      218ms ± 7%
	Build/NoCache-8    522ms ± 4%

	name             user-time/op
	Build/Cache-8      825ms ± 1%
	Build/NoCache-8    8.17s ± 1%
3 years ago
Daniel Martí a1f11fb231 add a writeTemp helper func
We had two nearly identical copies of the code to open a temp file with
CreateTemp, write to it, and handle closing properly. Unify them.

With the added docs this isn't exactly a net win, but we want the long
funcs such as transformCompile to be easy to follow, and this helps.
3 years ago
Daniel Martí 13e4ba2ae0 use "obfuscate" instead of "garble" in some more places
Mainly comments. "garble" refers to the tool, but the verb and adjective
is more intuitive as "obfuscate" and "obfuscated" instead of "garble"
and "garbled".
3 years ago
Daniel Martí 961daf20c4
rework the position obfuscator (#282)
First, rename line_obfuscator.go to position.go. We obfuscate filenames,
not just line numbers, and "obfuscator" is a bit redundant.

Second, use "/*line :x*/" comments rather than the "//line :x" form, as
the former allows us to insert them in any position without adding
unnecessary newlines. This will be important for changing the position
of call sites, which will be important for "garble reverse".

Third, do not rely on go/ast to remove and add comments. Since they are
free-floating, we can very easily end up with misplaced comments,
especially as the literal obfuscator heavily modifies the AST.

The new method prints and re-parses the file, to ensure all node
positions are consistent with a buffer, buf1. Then, we copy the contents
into a new buffer, buf2, while inserting the comments that we need.

The new method also modifies line numbers at the very end of obfuscating
a Go file, instead of at the very beginning. That's going to be more
robust long-term, as we will also obfuscate line numbers for any
additions or modifications to the AST.

Fourth, detachedDirectives is unnecessary, as we can accomplish the same
with two simple prefix matches.

Finally, this means we can stop using detachedComments entirely, as
printFile already inserts the comments we need.

For #5.
3 years ago
Daniel Martí ea19e39aa4 use hashWith for obfuscation position information
Position information was obfuscated with math/rand manually, which meant
that the resulting positions were pretty small like "x.go:34", but they
were also very hard to reverse due to their short length and difficulty
to reproduce.

We now hash them with hashWith and the package's GarbleActionID:

	"main.go:203" hashed with 933ad1c700755b7c3a9913c55cade1 to "mwu1xuNz.go"

The input to the hash is the base filename and the byte offset of the
declaration within the file, meaning that it's unique within a package.
The output filename is long enough to allow easy reversal.

The line number is always 1, since the information needed for reversing
is contained entirely within the filename. It doesn't really matter if
we encode data in the filename or line number, but it's easier for us to
use a string.

For #5.
3 years ago
Andrew LeFevre 24518ceead fix failing to build user packages named "embed" 3 years ago
Daniel Martí 5d74ab07f5 all: replace uses of the deprecated ioutil
Now that we require Go 1.16, we can simplify code by removing ioutil.
3 years ago
Daniel Martí a328a487f8 improve the code comments around -seed
Explain why we print the random seed on failure, and why we accept
base64 padding when we don't use it.

While at it, the flagOptions.Random field is unused, so remove it. It
also seems wrong for it to exist; the random value is only for the seed
flag, which already has a field of its own.
3 years ago
Daniel Martí f3ea42230a remove the last remaining -debugdir buffer
We were still printing files into a buffer, which got left behind from
when we used to store the obfuscated source in object files.

All that code is gone, so this buffer is now just wasting CPU cycles and
memory.
3 years ago
Daniel Martí 9312938650
remove obsolete TODOs (#274)
obfuscatedTypesPackage can't be outright removed, since we still do
require knowledge of what was obfuscated for one reason: types used with
reflection. Since we can only figure that out with go/ast and not just
go/types, we rely on the original compilation to tell us that
information.

IgnoreFuncBodies=true for typechecking the original source code would be
nice, as we would save time, but ultimately it doesn't work. When we
rename top-level declarations such as functions and types, we also need
to amend their references in func bodies. We depend on type information
for that.

Finally, we've been randomizing filenames for a while now. Randomizing
the order of the files doesn't seem to be useful.
3 years ago
Daniel Martí 748c6a0538
obfuscate asm function names as well (#273)
Historically, it was impossible to rename those funcs as the
implementation was in assembly files, and we only transformed Go code.

Now that transformAsm exists, it's only about fifty lines to do some
very basic parsing and rewriting of assembly files.

This fixes the obfuscated builds of multiple std packages, including a
few dependencies of net/http, since they included assembly funcs which
called pure Go functions. Those pure Go functions had their names
obfuscated, breaking the call sites in assembly.

Fixes #258.
Fixes #261.
3 years ago
Daniel Martí ffeea469e6
remove a couple of pieces of dead code (#272)
We no longer cache the obfuscated source code for -debugdir, so
obfSrcArchive is entirely useless at this point. We likely didn't notice
because the Go compiler doesn't realise it's unused.

symAbis is also unnecessary. It used to be necessary as we could only
collect the action ID from transformCompile in the second asm
invocation. Since we no longer need the temporary file dance with
transformCompile, we can simplify that code too.
3 years ago
Daniel Martí 06a7a21e17
changelog: forgot to add github link (#271)
Otherwise the [0.1.0] link doesn't render at all.
3 years ago
lu4p 810bdf448e don't obfuscate the "embed" import path
Updates #263
3 years ago
Daniel Martí def9351b25
fix and re-enable "garble test" (#268)
With the many refactors building up to v0.1.0, we broke "garble test" as
we no longer dealt with test packages well.

Luckily, now that we can depend on TOOLEXEC_IMPORTPATH, we can support
the test command again, as we can always figure out what package we're
currently compiling, without having to track a "main" package.

Note that one major pitfall there is test packages, where
TOOLEXEC_IMPORTPATH does not agree with ImportPath from "go list -json".
However, we can still work around that with a bit of glue code, which is
also copiously documented.

The second change necessary is to consider test packages private
depending on whether their non-test package is private or not. This can
be done via the ForTest field in "go list -json".

The third change is to obfuscate "_testmain.go" files, which are the
code-generated main functions which actually run tests. We used to not
need to obfuscate them, since test function names are never obfuscated
and we used to not obfuscate import paths at compilation time. Now we do
rewrite import paths, so we must do that for "_testmain.go" too.

The fourth change is to re-enable test.txt, and expand it with more
sanity checks and edge cases.

Finally, document "garble test" again.

Fixes #241.
3 years ago
Daniel Martí b71ff42877
testdata: remove some unnecessary execs (#267)
Our use of go-internal's gotooltest.Setup means that "go" is set up as a
top-level command in the scripts, so we can simply "go build" instead of
having to "exec go build". The result is practically the same, but the
scripts are simpler.

While at it, I had left an "exec cat" behind; remove it.
3 years ago
Daniel Martí 4e9ee17ec8
refactor "current package" with TOOLEXEC_IMPORTPATH (#266)
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.

Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.

To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.

The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.

Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.

isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".

Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.

Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.

While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
3 years ago
Daniel Martí ff0bea73b5
all: drop support for Go 1.15.x (#265)
This mainly cleans up the few bits of code where we explicitly kept
support for Go 1.15.x. With v0.1.0 released, we can drop support now,
since the next v0.2.0 release will only support Go 1.16.x.

Also updates all modules, including test ones, to 'go 1.16'.

Note that the TOOLEXEC_IMPORTPATH refactor is not done here, despite all
the TODOs about doing so when we drop 1.15 support. This is because that
refactor needs to be done carefully and might have side effects, so it's
best to keep it to a separate commit.

Finally, update the deps.
3 years ago
Daniel Martí 1fcf50bbdd release v0.1.0 3 years ago
Daniel Martí 2a9c0b7bf4
prepare for the first release (#264)
First, write a changelog file. We will use GitHub releases, but the
content in those is not stored in git nor is it portable or machine
readable. The canonical place for the changelog is here.

Second, disable 'garble test', as it is entirely broken. Issue #241
tracks fixing and re-enabling it, which will most likely happen for the
next release.

Third, disable the undocumented 'garble list'. This was added as part of
'garble reverse', but it never got used. I can't think of any reason why
any end user would prefer it over 'go list', either.

'garble reverse' remains enabled, but undocumented as it isn't fully
functional yet. Until it supports position information, it's not
particularly useful to end users. But it's not broken either, so it can
remain where it is.

Fourth, update the '-tiny' size reduction numbers in the README. Since
we removed the in-place modification of object files, we are no longer
able to do such an aggressive stripping of info. Garble itself drops in
size by 2%, so replace the old 6-10% estimate by 2-5%. We probably will
gain some of this back in the near future.

Finally, fix the indentation formatting of the README to consistently
use tabs.
3 years ago
Daniel Martí 1267e2eced
fix link errors when importing crypto/ecdsa (#262)
First, we had some link errors such as:

	cannot find package J6OzO8GN (using -importcfg)

This was caused by the code that writes an updated importcfg, which did
not handle import maps well. That code is now fixed, and we also add an
obfuscatedImportPath method for clarity.

Once fixed, we ran into other link errors:

	Pw3g97ww.addVW: relocation target Pw3g97ww.addVWlarge not defined

After some digging, the cause of those is assembly code that we do not
yet support obfuscating. #261 tracks that.

Meanwhile, to fix "GOPRIVATE=* garble build" and to be able to have a
test for the original import path bug, we add the packages which use
that form of assembly code to runtimeRelated - math/big and
crypto/sha512. There might be more, but these were the ones found by
trying to link crypto/tls, a fairly common dependency.

Fixes #256.
3 years ago
Daniel Martí 99887c13c2 ignore -ldflags=-X flags mentioning unknown packages
That would panic, since the *listedPackage would be nil for a package
path we aren't aware of:

	panic: runtime error: invalid memory address or nil pointer dereference
	[signal SIGSEGV: segmentation violation code=0x1 addr=0x88 pc=0x126b57d]

	goroutine 1 [running]:
	main.transformLink.func1(0x7ffeefbff28b, 0x5d)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:1260 +0x17d
	main.flagValueIter(0xc0000a8e20, 0x2f, 0x2f, 0x12e278e, 0x2, 0xc000129e28)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:1410 +0x1e9
	main.transformLink(0xc0000a8e20, 0x30, 0x36, 0x4, 0xc000114648, 0x23, 0x12dfd60, 0x0)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:1241 +0x1b9
	main.mainErr(0xc0000a8e10, 0x31, 0x37, 0x37, 0x0)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:287 +0x389
	main.main1(0xc000096058)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:150 +0xe7
	main.main()
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:83 +0x25

The linker ignores such unknown references, so we should too.

Fixes #259.
3 years ago
lu4p a397a8e94e Literals: Skip constants with inferred values.
Obfuscating literals broke constants
with values inferred via iota before,
because it would be moved to a variable declaration instead.
3 years ago
Daniel Martí b03cd08c09
avoid one more call to 'go tool buildid' (#253)
We use it to get the content ID of garble's binary, which is used for
both the garble action IDs, as well as 'go tool compile -V=full'.

Since those two happen in separate processes, both used to call 'go tool
buildid' separately. Store it in the gob cache the first time, and reuse
it the second time.

Since each call to cmd/go costs about 10ms (new process, running its
many init funcs, etc), this results in a nice speed-up for our small
benchmark. Most builds will take many seconds though, so note that a
~15ms speedup there will likely not be noticeable.

While at it, simplify the buildInfo global, as now it just contains a
map representation of the -importcfg contents. It now has better names,
docs, and a simpler representation.

We also stop using the term "garbled import", as it was a bit confusing.
"obfuscated types.Package" is a much better description.

	name     old time/op       new time/op       delta
	Build-8        106ms ± 1%         92ms ± 0%  -14.07%  (p=0.010 n=6+4)

	name     old bin-B         new bin-B         delta
	Build-8        6.60M ± 0%        6.60M ± 0%   -0.01%  (p=0.002 n=6+6)

	name     old sys-time/op   new sys-time/op   delta
	Build-8        208ms ± 5%        149ms ± 3%  -28.27%  (p=0.004 n=6+5)

	name     old user-time/op  new user-time/op  delta
	Build-8        433ms ± 3%        384ms ± 3%  -11.35%  (p=0.002 n=6+6)
3 years ago
Daniel Martí c0c5a75454
testdata: fix flakiness of tiny.txt on Windows (#252)
Sometimes it prints relatively low addresses, like:

	(0xbc200,0xdd658)

We required 6 to 8 hex digits, and sometimes it printed 5. Make the
regular expression more conservative.

Fixes #184.
3 years ago
Daniel Martí 6898d61637
start using original action IDs (#251)
When we obfuscate a name, what we do is hash the name with the action ID
of the package that contains the name. To ensure that the hash changes
if the garble tool changes, we used the action ID of the obfuscated
build, which is different than the original action ID, as we include
garble's own content ID in "go tool compile -V=full" via -toolexec.

Let's call that the "obfuscated action ID". Remember that a content ID
is roughly the hash of a binary or object file, and an action ID
contains the hash of a package's source code plus the content IDs of its
dependencies.

This had the advantage that it did what we wanted. However, it had one
massive drawback: when we compile a package, we only have the obfuscated
action IDs of its dependencies. This is because one can't have the
content ID of dependent packages before they are built.

Usually, this is not a problem, because hashing a foreign name means it
comes from a dependency, where we already have the obfuscated action ID.
However, that's not always the case.

First, go:linkname directives can point to any symbol that ends up in
the binary, even if the package is not a dependency. So garble could
only support linkname targets belonging to dependencies. This is at the
root of why we could not obfuscate the runtime; it contains linkname
directives targeting the net package, for example, which depends on runtime.

Second, some other places did not have an easy access to obfuscated
action IDs, like transformAsm, which had to recover it from a temporary
file stored by transformCompile.

Plus, this was all pretty expensive, as each toolexec sub-process had to
make repeated calls to buildidOf with the object files of dependencies.
We even had to use extra calls to "go list" in the case of indirect
dependencies, as their export files do not appear in importcfg files.

All in all, the old method was complex and expensive. A better mechanism
is to use the original action IDs directly, as listed by "go list"
without garble in the picture.

This would mean that the hashing does not change if garble changes,
meaning weaker obfuscation. To regain that property, we define the
"garble action ID", which is just the original action ID hashed together
with garble's own content ID.

This is practically the same as the obfuscated build ID we used before,
but since it doesn't go through "go tool compile -V=full" and the
obfuscated build itself, we can work out *all* the garble action IDs
upfront, before the obfuscated build even starts.

This fixes all of our problems. Now we know all garble build IDs
upfront, so a bunch of hacks can be entirely removed. Plus, since we
know them upfront, we can also cache them and avoid repeated calls to
"go tool buildid".

While at it, make use of the new BuildID field in Go 1.16's "list -json
-export". This avoids the vast majority of "go tool buildid" calls, as
the only ones that remain are 2 on the garble binary itself.

The numbers for Go 1.16 look very good:

	name     old time/op       new time/op       delta
	Build-8        146ms ± 4%        101ms ± 1%  -31.01%  (p=0.002 n=6+6)

	name     old bin-B         new bin-B         delta
	Build-8        6.61M ± 0%        6.60M ± 0%   -0.09%  (p=0.002 n=6+6)

	name     old sys-time/op   new sys-time/op   delta
	Build-8        321ms ± 7%        202ms ± 6%  -37.11%  (p=0.002 n=6+6)

	name     old user-time/op  new user-time/op  delta
	Build-8        538ms ± 4%        414ms ± 4%  -23.12%  (p=0.002 n=6+6)
3 years ago
Daniel Martí 09e244986e formally add support for Go 1.16
This was pretty much just fixing the README and closing the issue. The
only other noteworthy user-facing change is that, if the Go version is
detected to be too old, we now suggest 1.16.x instead of 1.15.x.

While at it, refactor goversion.txt a bit. I wanted it to print a
clearer "mocking the go build" error if another command was used like
"go build", but I didn't want to learn BAT. So, instead use a simple Go
program and build it, which will work on all platforms. The added
"go build" step barely takes 100ms on my machine, given how simple the
program is.

The [short] line also doesn't seem necessary to me. The entire script
runs in under 200ms for me, so it's well within the realm of "short", at
least compared to many of the other test scripts.

Fixes #124.
3 years ago
Andrew LeFevre 081cba38d7 strip a few more unneeded runtime functions
Strip a few more traceback printing related functions in the runtime, new for Go 1.16. Also test that GODEBUG=inittrace=1 does not print anything when -tiny is passed.
3 years ago
Daniel Martí 2ee6604408
replace the -p path when assembling (#247)
The asm tool runs twice for a package with assembly. The second time it
does, the path given to the -p flag matters, just like in the compiler,
as we generate an object file.

We don't have a -buildid flag in the asm tool, so obtaining the action
ID to obfuscate the package path with is a bit tricky. We store it from
transformCompile, and read it from transformAsm. See the detailed docs
for more.

This was the last "skip" line in the tests due to Go 1.16. After all PRs
are merged, one last PR documenting that 1.16 is supported will be sent,
closing the issue for good.

It's unclear why this wasn't an issue in Go 1.15. My best guess is that
the ABI changes only happened in Go 1.16, and this causes exported asm
funcs to start showing up in object files with their package paths.

Updates #124.
3 years ago
Daniel Martí a223147093
use more bits for the obfuscated name hashes (#248)
We've been using four base64 characters for obfuscated names for a
while. And that has mostly worked, since most packages only have up to a
few hundred exported or unexported names at a time.

However, we have already encountered two collisions in the wild, which
can be reproduced with one seed but not another:

	[...] PsaN.hQyW is a field, not a method

	[...] byte is not a type

In both of those cases, we happened to run into a collision by chance.
And that's not terribly unlikely to begin with; even with just 100
names, the probability of a collision was about 0.03%. It dramatically
goes up if there are more names; with 500, we're already around 0.75%.

It's clear that four base64 chars is not enough to properly avoid
collisions in the vast majority of cases. But how many characters are
enough? The target should be that, even with a very large package and
lots of names, we should still practically never have a collision.

I did some basic estimation with "lots of names" being ten thousand,
with "practically never" being a one in a million chance. We need to go
all the way up to eight characters to reach that probability.

It's entirely possible that 7 or even 6 characters would be enough for
most users. However, collisions result in confusing errors which are
also hard to reproduce for us unless we can use exactly the same seed
and source code for a build.

So, play it safe, and use 8 characters. The constant now also has
documentation explaining how we arrived at that figure.
3 years ago
Daniel Martí 5e3ba2fc09
update the list of runtime-related packages for 1.16 (#246)
With a few extra lines, we can keep Go 1.15 support in the table too.

Re-enables the goprivate.txt test for Go 1.16.

While at it, make the script's use of grep a bit simpler with -E, which
also uses the same syntax as Go's regexp. Its skip logic was also buggy,
resulting in the macos results always being empty.

Updates #124.
3 years ago
Daniel Martí 89dbdb69a1
start working on Go 1.16 support (#244)
There are three minor bugs breaking Go 1.16 with the current version of
garble, after the import path obfuscation refactor:

1) Stripping the runtime results in an unused import error. This PR
   fixes that.

2) The asm.txt test seems to be broken; something to do with the export
   data not being right for the exported assembly func.

3) The obfuscated build of std fails, since our runtimeRelated table was
   generated for Go 1.15, not 1.16.

This PR fixes the first issue, adds conditional skip lines for 1.16 for
the other two issues, and enables 1.16 on CI.

Note that 1.16 support is not here just yet, because of the other two
issues. As such, no doc changes.

Updates #124.
3 years ago
Daniel Martí 180d64236a
properly fix obfuscated imports with their package names (#245)
The TODO I left there didn't take long to surface as a bug. If the
package path ends with a word containing a hyphen, that's not a valid
identifier, so we end up with invalid Go syntax.

Add that test case, as well as one where an import was already named.

To fix the issue, we just need to use the package name we got from
'go list -json'.

Fixes #243.
3 years ago
Daniel Martí 840cf9b68d
make hashWith a bit smarter (#238)
It used to return five-byte strings, and now it returns four bytes with
nearly the same number of bits of entropy.

It also avoids the exported vs unexported dance if the name isn't an
identifier, which is now common with import paths.

See the added docs for more details.
3 years ago
Daniel Martí 05d0dd1801
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.

That worked well, in most cases. Unfortunately, it had some flaws:

* Complexity. Even when most of the code is maintained in a separate
  module, the import_obfuscation.go file was still close to a thousand
  lines of code.

* Go compatibility. The object file format changes between Go releases,
  so we were supporting Go 1.15, but not 1.16. Fixing the object file
  package to work with 1.16 would probably break 1.15 support.

* Bugs. For example, we recently had to add a workaround for #224, since
  import paths containing dots after the domain would end up escaped.
  Another example is #190, which seems to be caused by the object file
  parser or writer corrupting the compiled code and causing segfaults in
  some rare edge cases.

Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:

1) Replace its "package foo" lines with the obfuscated package path. No
   need to separate the package path and name, since the obfuscated path
   does not contain slashes.

2) Replace the "-p pkg/foo" flag with the obfuscated path.

3) Replace the "import" spec lines with the obfuscated package paths,
   for those dependencies which were obfuscated.

4) Replace the "-importcfg [...]" file with a version that uses the
   obfuscated paths instead.

The linker also needs that last step, since it also uses an importcfg
file to find object files.

There are three noteworthy drawbacks to this new method:

1) Since we no longer write object files, we can't use them to store
   data to be cached. As such, the -debugdir flag goes back to using the
   "-a" build flag to always rebuild all packages. On the plus side,
   that caching didn't work very well; see #176.

2) The package name "main" remains in all declarations under it, not
   just "func main", since we can only rename entire packages. This
   seems fine, as it gives little information to the end user.

3) The -tiny mode no longer sets all lines to 0, since it did that by
   modifying object files. As a temporary measure, we instead set all
   top-level declarations to be on line 1. A TODO is added to hopefully
   improve this again in the near future.

The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.

Fixes #176.
Fixes #190.
3 years ago
Daniel Martí e2a32634a6
simplify, improve, and test line obfuscation (#239)
First, remove the shuffling of the declarations list within each file.
This is what we used at the very start to shuffle positions. Ever since
we started obfuscating positions via //line comments, that has been
entirely unnecessary.

Second, add a proper test that will fail if we don't obfuscate line
numbers well enough. Filenames were already decently covered by other
tests.

Third, simplify the line obfuscation code. It does not require
astutil.Apply, and ranging over file.Decls is easier.

Finally, also obfuscate the position of top-level vars, since we only
used to do it for top-level funcs. Without that fix, the test would fail
as varLines was unexpectedly sorted.
3 years ago
Daniel Martí 63c42c3cc7
support typechecking all of std (#236)
There was one bug keeping the command below from working:

	GOPRIVATE='*' garble build std

The bug is rather obscure; I'm still working on a minimal reproducer
that I can submit upstream, and I'm not yet convinced about where the
bug lives and how it can be fixed.

In short, the command would fail with:

	typecheck error: /go/src/crypto/ecdsa/ecdsa.go:122:12: cannot use asn1.SEQUENCE (constant 48 of type asn1.Tag) as asn1.Tag value in argument to b.AddASN1

Note that the error is ambiguous; there are two asn1 packages, but they
are actually mismatching. We can see that by manually adding debug
prints to go/types:

	constant: asn1.SEQUENCE (constant 48 of type golang.org/x/crypto/cryptobyte/asn1.Tag)
	argument type: vendor/golang.org/x/crypto/cryptobyte/asn1.Tag

It's clear that, for some reason, go/types ends up confused and loading
a vendored and non-vendored version of asn1. There also seems to be no
way to work around this with our lookup function, as it just receives an
import path as a parameter, and returns an object file reader.

For now, work around the issue by *not* using a custom lookup function
in this rare edge case involving vendored dependencies in std packages.
The added code has a lengthy comment explaining the reasoning.

I still intend to investigate this further, but there's no reason to
keep garble failing if we can work around the bug.

Fixes #223.
3 years ago
Daniel Martí 2fa5697189
avoid panic on funcs that almost look like tests (#235)
The added test case used to crash garble:

	panic: runtime error: invalid memory address or nil pointer dereference [recovered]
		panic: runtime error: invalid memory address or nil pointer dereference
	[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x8fe71b]

	goroutine 1 [running]:
	golang.org/x/tools/go/ast/astutil.Apply.func1(0xc0001e8880, 0xc000221570)
		/go/pkg/mod/golang.org/x/tools@v0.0.0-20210115202250-e0d201561e39/go/ast/astutil/rewrite.go:47 +0x97
	panic(0x975bc0, 0xd6c610)
		/sdk/go1.15.8/src/runtime/panic.go:969 +0x1b9
	go/types.(*Named).Obj(...)
		/sdk/go1.15.8/src/go/types/type.go:473
	mvdan.cc/garble.isTestSignature(0xc0001e7080, 0xa02e84)
		/src/garble/main.go:1170 +0x7b
	mvdan.cc/garble.(*transformer).transformGo.func2(0xc000122df0, 0xaac301)
		/src/garble/main.go:1028 +0xff1

We were assuming that the first parameter was a named type, but that
might not be the case.

This crash was found out in the wild, from which a minimal repro was
written. We add two variants of it to the test data, just in case.
3 years ago
Daniel Martí e64fccd367
better document and position the hash base64 encoding (#234)
We now document why we use a custom base64 charset.

The old "b64" name was also too generic, so it might have been misused
for other purposes.
3 years ago
Daniel Martí e33179d480
reverse: support unexported names and package paths (#233)
Unexported names are a bit tricky, since they are not listed in the
export data file. Perhaps unsurprisingly, it's only meant to expose
exported objects.

One option would be to go back to adding an extra header to the export
data file, containing the unexported methods in a map[string]T or
[]string. However, we have an easier route: just parse the Go files and
look up the names directly.

This does mean that we parse the Go files every time "reverse" runs,
even if the build cache is warm, but that should not be an issue.
Parsing Go files without any typechecking is very cheap compared to
everything else we do. Plus, we save having to load go/types information
from the build cache, or having to load extra headers from export files.

It should be noted that the obfuscation process does need type
information, mainly to be careful about which names can be obfuscated
and how they should be obfuscated. Neither is a worry here; all names
belong to a single package, and it doesn't matter if some aren't
actually obfuscated, since the string replacements would simply never
trigger in practice.

The test includes an unexported func, to test the new feature. We also
start reversing the obfuscation of import paths. Now, the test's reverse
output is as follows:

	goroutine 1 [running]:
	runtime/debug.Stack(0x??, 0x??, 0x??)
		runtime/debug/stack.go:24 +0x??
	test/main/lib.ExportedLibFunc(0x??, 0x??, 0x??, 0x??)
		p.go:6 +0x??
	main.unexportedMainFunc(...)
		C.go:2
	main.main()
		z.go:3 +0x??

The only major missing feature is positions and filenames. A follow-up
PR will take care of those.

Updates #5.
3 years ago
Daniel Martí a499a6bcd7
testdata: simplify goversion test (#232)
As per the TODO, just set up the files in the right places.
3 years ago
Daniel Martí af517a20f8 make the handling of import paths more robust
First, make isPrivate panic on malformed import paths, since that should
never happen. This catches the errors that some users had run into with
packages like gopkg.in/yaml.v2 and github.com/satori/go.uuid:

	panic: malformed import path "gopkg.in/garbletest%2ev2": invalid char '%'

This seems to trigger when a module path contains a dot after the first
element, *and* that module is fetched via the proxy. This results in the
toolchain URL-encoding the second dot, and garble ends up seeing that
encoded path.

We reproduce this behavior with a fake gopkg.in module added to the test
module proxy. Using yaml.v2 directly would have been easier, but it's
pretty large. Note that we tried a replace directive, but that does not
trigger the URL-encoding bug.

Also note that we do not obfuscate the gopkg.in package; that's fine, as
the isPrivate path validity check catches the bug either way.

For now, make initImport use url.PathUnescape to work around this issue.
The underlying bug is likely in either the goobj2 fork, or in the
upstream Go toolchain itself.

hashImport also gives a better error if it cannot find a package now,
rather than just an "empty seed" panic.

Finally, the sanity check in isPrivate unearthed the fact that we do not
support garbling test packages at all, since they were invalid paths
which never matched GOPRIVATE. Add an explicit check and TODO about
that.

Fixes #224.
Fixes #228.
3 years ago
Daniel Martí 5aa1a46cf7
README: document how to use different Go versions (#229)
This is a bit tricky right now, but will hopefully become simpler if
https://github.com/golang/go/issues/44329 gets approved and implemented.

Until then, suggest ways to do it in POSIX Shell.

While at it, stop using "garble" as a verb, and just use "obfuscate".
It's a bit silly if we say "garble garbles code"; it's much more
intuitive and less repetitive to say "garble obfuscates code".

The code still says "garble X" and "X was garbled" in a few places, and
we should fix those, but at least they are not publicly visible.
3 years ago