Commit Graph

353 Commits (a1447899103a0f0294d6931567f20a4eb32594a4)
 

Author SHA1 Message Date
Daniel Martí 13e4ba2ae0 use "obfuscate" instead of "garble" in some more places
Mainly comments. "garble" refers to the tool, but the verb and adjective
is more intuitive as "obfuscate" and "obfuscated" instead of "garble"
and "garbled".
3 years ago
Daniel Martí 961daf20c4
rework the position obfuscator (#282)
First, rename line_obfuscator.go to position.go. We obfuscate filenames,
not just line numbers, and "obfuscator" is a bit redundant.

Second, use "/*line :x*/" comments rather than the "//line :x" form, as
the former allows us to insert them in any position without adding
unnecessary newlines. This will be important for changing the position
of call sites, which will be important for "garble reverse".

Third, do not rely on go/ast to remove and add comments. Since they are
free-floating, we can very easily end up with misplaced comments,
especially as the literal obfuscator heavily modifies the AST.

The new method prints and re-parses the file, to ensure all node
positions are consistent with a buffer, buf1. Then, we copy the contents
into a new buffer, buf2, while inserting the comments that we need.

The new method also modifies line numbers at the very end of obfuscating
a Go file, instead of at the very beginning. That's going to be more
robust long-term, as we will also obfuscate line numbers for any
additions or modifications to the AST.

Fourth, detachedDirectives is unnecessary, as we can accomplish the same
with two simple prefix matches.

Finally, this means we can stop using detachedComments entirely, as
printFile already inserts the comments we need.

For #5.
3 years ago
Daniel Martí ea19e39aa4 use hashWith for obfuscation position information
Position information was obfuscated with math/rand manually, which meant
that the resulting positions were pretty small like "x.go:34", but they
were also very hard to reverse due to their short length and difficulty
to reproduce.

We now hash them with hashWith and the package's GarbleActionID:

	"main.go:203" hashed with 933ad1c700755b7c3a9913c55cade1 to "mwu1xuNz.go"

The input to the hash is the base filename and the byte offset of the
declaration within the file, meaning that it's unique within a package.
The output filename is long enough to allow easy reversal.

The line number is always 1, since the information needed for reversing
is contained entirely within the filename. It doesn't really matter if
we encode data in the filename or line number, but it's easier for us to
use a string.

For #5.
3 years ago
Andrew LeFevre 24518ceead fix failing to build user packages named "embed" 3 years ago
Daniel Martí 5d74ab07f5 all: replace uses of the deprecated ioutil
Now that we require Go 1.16, we can simplify code by removing ioutil.
3 years ago
Daniel Martí a328a487f8 improve the code comments around -seed
Explain why we print the random seed on failure, and why we accept
base64 padding when we don't use it.

While at it, the flagOptions.Random field is unused, so remove it. It
also seems wrong for it to exist; the random value is only for the seed
flag, which already has a field of its own.
3 years ago
Daniel Martí f3ea42230a remove the last remaining -debugdir buffer
We were still printing files into a buffer, which got left behind from
when we used to store the obfuscated source in object files.

All that code is gone, so this buffer is now just wasting CPU cycles and
memory.
3 years ago
Daniel Martí 9312938650
remove obsolete TODOs (#274)
obfuscatedTypesPackage can't be outright removed, since we still do
require knowledge of what was obfuscated for one reason: types used with
reflection. Since we can only figure that out with go/ast and not just
go/types, we rely on the original compilation to tell us that
information.

IgnoreFuncBodies=true for typechecking the original source code would be
nice, as we would save time, but ultimately it doesn't work. When we
rename top-level declarations such as functions and types, we also need
to amend their references in func bodies. We depend on type information
for that.

Finally, we've been randomizing filenames for a while now. Randomizing
the order of the files doesn't seem to be useful.
3 years ago
Daniel Martí 748c6a0538
obfuscate asm function names as well (#273)
Historically, it was impossible to rename those funcs as the
implementation was in assembly files, and we only transformed Go code.

Now that transformAsm exists, it's only about fifty lines to do some
very basic parsing and rewriting of assembly files.

This fixes the obfuscated builds of multiple std packages, including a
few dependencies of net/http, since they included assembly funcs which
called pure Go functions. Those pure Go functions had their names
obfuscated, breaking the call sites in assembly.

Fixes #258.
Fixes #261.
3 years ago
Daniel Martí ffeea469e6
remove a couple of pieces of dead code (#272)
We no longer cache the obfuscated source code for -debugdir, so
obfSrcArchive is entirely useless at this point. We likely didn't notice
because the Go compiler doesn't realise it's unused.

symAbis is also unnecessary. It used to be necessary as we could only
collect the action ID from transformCompile in the second asm
invocation. Since we no longer need the temporary file dance with
transformCompile, we can simplify that code too.
3 years ago
Daniel Martí 06a7a21e17
changelog: forgot to add github link (#271)
Otherwise the [0.1.0] link doesn't render at all.
3 years ago
lu4p 810bdf448e don't obfuscate the "embed" import path
Updates #263
3 years ago
Daniel Martí def9351b25
fix and re-enable "garble test" (#268)
With the many refactors building up to v0.1.0, we broke "garble test" as
we no longer dealt with test packages well.

Luckily, now that we can depend on TOOLEXEC_IMPORTPATH, we can support
the test command again, as we can always figure out what package we're
currently compiling, without having to track a "main" package.

Note that one major pitfall there is test packages, where
TOOLEXEC_IMPORTPATH does not agree with ImportPath from "go list -json".
However, we can still work around that with a bit of glue code, which is
also copiously documented.

The second change necessary is to consider test packages private
depending on whether their non-test package is private or not. This can
be done via the ForTest field in "go list -json".

The third change is to obfuscate "_testmain.go" files, which are the
code-generated main functions which actually run tests. We used to not
need to obfuscate them, since test function names are never obfuscated
and we used to not obfuscate import paths at compilation time. Now we do
rewrite import paths, so we must do that for "_testmain.go" too.

The fourth change is to re-enable test.txt, and expand it with more
sanity checks and edge cases.

Finally, document "garble test" again.

Fixes #241.
3 years ago
Daniel Martí b71ff42877
testdata: remove some unnecessary execs (#267)
Our use of go-internal's gotooltest.Setup means that "go" is set up as a
top-level command in the scripts, so we can simply "go build" instead of
having to "exec go build". The result is practically the same, but the
scripts are simpler.

While at it, I had left an "exec cat" behind; remove it.
3 years ago
Daniel Martí 4e9ee17ec8
refactor "current package" with TOOLEXEC_IMPORTPATH (#266)
Now that we've dropped support for Go 1.15.x, we can finally rely on
this environment variable for toolexec calls, present in Go 1.16.

Before, we had hacky ways of trying to figure out the current package's
import path, mostly from the -p flag. The biggest rough edge there was
that, for main packages, that was simply the package name, and not its
full import path.

To work around that, we had a restriction on a single main package, so
we could work around that issue. That restriction is now gone.

The new code is simpler, especially because we can set curPkg in a
single place for all toolexec transform funcs.

Since we can always rely on curPkg not being nil now, we can also start
reusing listedPackage.Private and avoid the majority of repeated calls
to isPrivate. The function is cheap, but still not free.

isPrivate itself can also get simpler. We no longer have to worry about
the "main" edge case. Plus, the sanity check for invalid package paths
is now unnecessary; we only got malformed paths from goobj2, and we now
require exact matches with the ImportPath field from "go list -json".

Another effect of clearing up the "main" edge case is that -debugdir now
uses the right directory for main packages. We also start using
consistent debugdir paths in the tests, for the sake of being easier to
read and maintain.

Finally, note that commandReverse did not need the extra call to "go
list -toolexec", as the "shared" call stored in the cache is enough. We
still call toolexecCmd to get said cache, which should probably be
simplified in a future PR.

While at it, replace the use of the "-std" compiler flag with the
Standard field from "go list -json".
3 years ago
Daniel Martí ff0bea73b5
all: drop support for Go 1.15.x (#265)
This mainly cleans up the few bits of code where we explicitly kept
support for Go 1.15.x. With v0.1.0 released, we can drop support now,
since the next v0.2.0 release will only support Go 1.16.x.

Also updates all modules, including test ones, to 'go 1.16'.

Note that the TOOLEXEC_IMPORTPATH refactor is not done here, despite all
the TODOs about doing so when we drop 1.15 support. This is because that
refactor needs to be done carefully and might have side effects, so it's
best to keep it to a separate commit.

Finally, update the deps.
3 years ago
Daniel Martí 1fcf50bbdd release v0.1.0 3 years ago
Daniel Martí 2a9c0b7bf4
prepare for the first release (#264)
First, write a changelog file. We will use GitHub releases, but the
content in those is not stored in git nor is it portable or machine
readable. The canonical place for the changelog is here.

Second, disable 'garble test', as it is entirely broken. Issue #241
tracks fixing and re-enabling it, which will most likely happen for the
next release.

Third, disable the undocumented 'garble list'. This was added as part of
'garble reverse', but it never got used. I can't think of any reason why
any end user would prefer it over 'go list', either.

'garble reverse' remains enabled, but undocumented as it isn't fully
functional yet. Until it supports position information, it's not
particularly useful to end users. But it's not broken either, so it can
remain where it is.

Fourth, update the '-tiny' size reduction numbers in the README. Since
we removed the in-place modification of object files, we are no longer
able to do such an aggressive stripping of info. Garble itself drops in
size by 2%, so replace the old 6-10% estimate by 2-5%. We probably will
gain some of this back in the near future.

Finally, fix the indentation formatting of the README to consistently
use tabs.
3 years ago
Daniel Martí 1267e2eced
fix link errors when importing crypto/ecdsa (#262)
First, we had some link errors such as:

	cannot find package J6OzO8GN (using -importcfg)

This was caused by the code that writes an updated importcfg, which did
not handle import maps well. That code is now fixed, and we also add an
obfuscatedImportPath method for clarity.

Once fixed, we ran into other link errors:

	Pw3g97ww.addVW: relocation target Pw3g97ww.addVWlarge not defined

After some digging, the cause of those is assembly code that we do not
yet support obfuscating. #261 tracks that.

Meanwhile, to fix "GOPRIVATE=* garble build" and to be able to have a
test for the original import path bug, we add the packages which use
that form of assembly code to runtimeRelated - math/big and
crypto/sha512. There might be more, but these were the ones found by
trying to link crypto/tls, a fairly common dependency.

Fixes #256.
3 years ago
Daniel Martí 99887c13c2 ignore -ldflags=-X flags mentioning unknown packages
That would panic, since the *listedPackage would be nil for a package
path we aren't aware of:

	panic: runtime error: invalid memory address or nil pointer dereference
	[signal SIGSEGV: segmentation violation code=0x1 addr=0x88 pc=0x126b57d]

	goroutine 1 [running]:
	main.transformLink.func1(0x7ffeefbff28b, 0x5d)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:1260 +0x17d
	main.flagValueIter(0xc0000a8e20, 0x2f, 0x2f, 0x12e278e, 0x2, 0xc000129e28)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:1410 +0x1e9
	main.transformLink(0xc0000a8e20, 0x30, 0x36, 0x4, 0xc000114648, 0x23, 0x12dfd60, 0x0)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:1241 +0x1b9
	main.mainErr(0xc0000a8e10, 0x31, 0x37, 0x37, 0x0)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:287 +0x389
	main.main1(0xc000096058)
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:150 +0xe7
	main.main()
		mvdan.cc/garble@v0.0.0-20210302140807-b03cd08c0946/main.go:83 +0x25

The linker ignores such unknown references, so we should too.

Fixes #259.
3 years ago
lu4p a397a8e94e Literals: Skip constants with inferred values.
Obfuscating literals broke constants
with values inferred via iota before,
because it would be moved to a variable declaration instead.
3 years ago
Daniel Martí b03cd08c09
avoid one more call to 'go tool buildid' (#253)
We use it to get the content ID of garble's binary, which is used for
both the garble action IDs, as well as 'go tool compile -V=full'.

Since those two happen in separate processes, both used to call 'go tool
buildid' separately. Store it in the gob cache the first time, and reuse
it the second time.

Since each call to cmd/go costs about 10ms (new process, running its
many init funcs, etc), this results in a nice speed-up for our small
benchmark. Most builds will take many seconds though, so note that a
~15ms speedup there will likely not be noticeable.

While at it, simplify the buildInfo global, as now it just contains a
map representation of the -importcfg contents. It now has better names,
docs, and a simpler representation.

We also stop using the term "garbled import", as it was a bit confusing.
"obfuscated types.Package" is a much better description.

	name     old time/op       new time/op       delta
	Build-8        106ms ± 1%         92ms ± 0%  -14.07%  (p=0.010 n=6+4)

	name     old bin-B         new bin-B         delta
	Build-8        6.60M ± 0%        6.60M ± 0%   -0.01%  (p=0.002 n=6+6)

	name     old sys-time/op   new sys-time/op   delta
	Build-8        208ms ± 5%        149ms ± 3%  -28.27%  (p=0.004 n=6+5)

	name     old user-time/op  new user-time/op  delta
	Build-8        433ms ± 3%        384ms ± 3%  -11.35%  (p=0.002 n=6+6)
3 years ago
Daniel Martí c0c5a75454
testdata: fix flakiness of tiny.txt on Windows (#252)
Sometimes it prints relatively low addresses, like:

	(0xbc200,0xdd658)

We required 6 to 8 hex digits, and sometimes it printed 5. Make the
regular expression more conservative.

Fixes #184.
3 years ago
Daniel Martí 6898d61637
start using original action IDs (#251)
When we obfuscate a name, what we do is hash the name with the action ID
of the package that contains the name. To ensure that the hash changes
if the garble tool changes, we used the action ID of the obfuscated
build, which is different than the original action ID, as we include
garble's own content ID in "go tool compile -V=full" via -toolexec.

Let's call that the "obfuscated action ID". Remember that a content ID
is roughly the hash of a binary or object file, and an action ID
contains the hash of a package's source code plus the content IDs of its
dependencies.

This had the advantage that it did what we wanted. However, it had one
massive drawback: when we compile a package, we only have the obfuscated
action IDs of its dependencies. This is because one can't have the
content ID of dependent packages before they are built.

Usually, this is not a problem, because hashing a foreign name means it
comes from a dependency, where we already have the obfuscated action ID.
However, that's not always the case.

First, go:linkname directives can point to any symbol that ends up in
the binary, even if the package is not a dependency. So garble could
only support linkname targets belonging to dependencies. This is at the
root of why we could not obfuscate the runtime; it contains linkname
directives targeting the net package, for example, which depends on runtime.

Second, some other places did not have an easy access to obfuscated
action IDs, like transformAsm, which had to recover it from a temporary
file stored by transformCompile.

Plus, this was all pretty expensive, as each toolexec sub-process had to
make repeated calls to buildidOf with the object files of dependencies.
We even had to use extra calls to "go list" in the case of indirect
dependencies, as their export files do not appear in importcfg files.

All in all, the old method was complex and expensive. A better mechanism
is to use the original action IDs directly, as listed by "go list"
without garble in the picture.

This would mean that the hashing does not change if garble changes,
meaning weaker obfuscation. To regain that property, we define the
"garble action ID", which is just the original action ID hashed together
with garble's own content ID.

This is practically the same as the obfuscated build ID we used before,
but since it doesn't go through "go tool compile -V=full" and the
obfuscated build itself, we can work out *all* the garble action IDs
upfront, before the obfuscated build even starts.

This fixes all of our problems. Now we know all garble build IDs
upfront, so a bunch of hacks can be entirely removed. Plus, since we
know them upfront, we can also cache them and avoid repeated calls to
"go tool buildid".

While at it, make use of the new BuildID field in Go 1.16's "list -json
-export". This avoids the vast majority of "go tool buildid" calls, as
the only ones that remain are 2 on the garble binary itself.

The numbers for Go 1.16 look very good:

	name     old time/op       new time/op       delta
	Build-8        146ms ± 4%        101ms ± 1%  -31.01%  (p=0.002 n=6+6)

	name     old bin-B         new bin-B         delta
	Build-8        6.61M ± 0%        6.60M ± 0%   -0.09%  (p=0.002 n=6+6)

	name     old sys-time/op   new sys-time/op   delta
	Build-8        321ms ± 7%        202ms ± 6%  -37.11%  (p=0.002 n=6+6)

	name     old user-time/op  new user-time/op  delta
	Build-8        538ms ± 4%        414ms ± 4%  -23.12%  (p=0.002 n=6+6)
3 years ago
Daniel Martí 09e244986e formally add support for Go 1.16
This was pretty much just fixing the README and closing the issue. The
only other noteworthy user-facing change is that, if the Go version is
detected to be too old, we now suggest 1.16.x instead of 1.15.x.

While at it, refactor goversion.txt a bit. I wanted it to print a
clearer "mocking the go build" error if another command was used like
"go build", but I didn't want to learn BAT. So, instead use a simple Go
program and build it, which will work on all platforms. The added
"go build" step barely takes 100ms on my machine, given how simple the
program is.

The [short] line also doesn't seem necessary to me. The entire script
runs in under 200ms for me, so it's well within the realm of "short", at
least compared to many of the other test scripts.

Fixes #124.
3 years ago
Andrew LeFevre 081cba38d7 strip a few more unneeded runtime functions
Strip a few more traceback printing related functions in the runtime, new for Go 1.16. Also test that GODEBUG=inittrace=1 does not print anything when -tiny is passed.
3 years ago
Daniel Martí 2ee6604408
replace the -p path when assembling (#247)
The asm tool runs twice for a package with assembly. The second time it
does, the path given to the -p flag matters, just like in the compiler,
as we generate an object file.

We don't have a -buildid flag in the asm tool, so obtaining the action
ID to obfuscate the package path with is a bit tricky. We store it from
transformCompile, and read it from transformAsm. See the detailed docs
for more.

This was the last "skip" line in the tests due to Go 1.16. After all PRs
are merged, one last PR documenting that 1.16 is supported will be sent,
closing the issue for good.

It's unclear why this wasn't an issue in Go 1.15. My best guess is that
the ABI changes only happened in Go 1.16, and this causes exported asm
funcs to start showing up in object files with their package paths.

Updates #124.
3 years ago
Daniel Martí a223147093
use more bits for the obfuscated name hashes (#248)
We've been using four base64 characters for obfuscated names for a
while. And that has mostly worked, since most packages only have up to a
few hundred exported or unexported names at a time.

However, we have already encountered two collisions in the wild, which
can be reproduced with one seed but not another:

	[...] PsaN.hQyW is a field, not a method

	[...] byte is not a type

In both of those cases, we happened to run into a collision by chance.
And that's not terribly unlikely to begin with; even with just 100
names, the probability of a collision was about 0.03%. It dramatically
goes up if there are more names; with 500, we're already around 0.75%.

It's clear that four base64 chars is not enough to properly avoid
collisions in the vast majority of cases. But how many characters are
enough? The target should be that, even with a very large package and
lots of names, we should still practically never have a collision.

I did some basic estimation with "lots of names" being ten thousand,
with "practically never" being a one in a million chance. We need to go
all the way up to eight characters to reach that probability.

It's entirely possible that 7 or even 6 characters would be enough for
most users. However, collisions result in confusing errors which are
also hard to reproduce for us unless we can use exactly the same seed
and source code for a build.

So, play it safe, and use 8 characters. The constant now also has
documentation explaining how we arrived at that figure.
3 years ago
Daniel Martí 5e3ba2fc09
update the list of runtime-related packages for 1.16 (#246)
With a few extra lines, we can keep Go 1.15 support in the table too.

Re-enables the goprivate.txt test for Go 1.16.

While at it, make the script's use of grep a bit simpler with -E, which
also uses the same syntax as Go's regexp. Its skip logic was also buggy,
resulting in the macos results always being empty.

Updates #124.
3 years ago
Daniel Martí 89dbdb69a1
start working on Go 1.16 support (#244)
There are three minor bugs breaking Go 1.16 with the current version of
garble, after the import path obfuscation refactor:

1) Stripping the runtime results in an unused import error. This PR
   fixes that.

2) The asm.txt test seems to be broken; something to do with the export
   data not being right for the exported assembly func.

3) The obfuscated build of std fails, since our runtimeRelated table was
   generated for Go 1.15, not 1.16.

This PR fixes the first issue, adds conditional skip lines for 1.16 for
the other two issues, and enables 1.16 on CI.

Note that 1.16 support is not here just yet, because of the other two
issues. As such, no doc changes.

Updates #124.
3 years ago
Daniel Martí 180d64236a
properly fix obfuscated imports with their package names (#245)
The TODO I left there didn't take long to surface as a bug. If the
package path ends with a word containing a hyphen, that's not a valid
identifier, so we end up with invalid Go syntax.

Add that test case, as well as one where an import was already named.

To fix the issue, we just need to use the package name we got from
'go list -json'.

Fixes #243.
3 years ago
Daniel Martí 840cf9b68d
make hashWith a bit smarter (#238)
It used to return five-byte strings, and now it returns four bytes with
nearly the same number of bits of entropy.

It also avoids the exported vs unexported dance if the name isn't an
identifier, which is now common with import paths.

See the added docs for more details.
3 years ago
Daniel Martí 05d0dd1801
reimplement import path obfuscation without goobj2 (#242)
We used to rely on a parallel implementation of an object file parser
and writer to be able to obfuscate import paths. After compiling each
package, we would parse the object file, replace the import paths, and
write the updated object file in-place.

That worked well, in most cases. Unfortunately, it had some flaws:

* Complexity. Even when most of the code is maintained in a separate
  module, the import_obfuscation.go file was still close to a thousand
  lines of code.

* Go compatibility. The object file format changes between Go releases,
  so we were supporting Go 1.15, but not 1.16. Fixing the object file
  package to work with 1.16 would probably break 1.15 support.

* Bugs. For example, we recently had to add a workaround for #224, since
  import paths containing dots after the domain would end up escaped.
  Another example is #190, which seems to be caused by the object file
  parser or writer corrupting the compiled code and causing segfaults in
  some rare edge cases.

Instead, let's drop that method entirely, and force the compiler and
linker to do the work for us. The steps necessary when compiling a
package to obfuscate are:

1) Replace its "package foo" lines with the obfuscated package path. No
   need to separate the package path and name, since the obfuscated path
   does not contain slashes.

2) Replace the "-p pkg/foo" flag with the obfuscated path.

3) Replace the "import" spec lines with the obfuscated package paths,
   for those dependencies which were obfuscated.

4) Replace the "-importcfg [...]" file with a version that uses the
   obfuscated paths instead.

The linker also needs that last step, since it also uses an importcfg
file to find object files.

There are three noteworthy drawbacks to this new method:

1) Since we no longer write object files, we can't use them to store
   data to be cached. As such, the -debugdir flag goes back to using the
   "-a" build flag to always rebuild all packages. On the plus side,
   that caching didn't work very well; see #176.

2) The package name "main" remains in all declarations under it, not
   just "func main", since we can only rename entire packages. This
   seems fine, as it gives little information to the end user.

3) The -tiny mode no longer sets all lines to 0, since it did that by
   modifying object files. As a temporary measure, we instead set all
   top-level declarations to be on line 1. A TODO is added to hopefully
   improve this again in the near future.

The upside is that we get rid of all the issues mentioned before. Plus,
garble now nearly works with Go 1.16, with the exception of two very
minor bugs that look fixable. A follow-up PR will take care of that and
start testing on 1.16.

Fixes #176.
Fixes #190.
3 years ago
Daniel Martí e2a32634a6
simplify, improve, and test line obfuscation (#239)
First, remove the shuffling of the declarations list within each file.
This is what we used at the very start to shuffle positions. Ever since
we started obfuscating positions via //line comments, that has been
entirely unnecessary.

Second, add a proper test that will fail if we don't obfuscate line
numbers well enough. Filenames were already decently covered by other
tests.

Third, simplify the line obfuscation code. It does not require
astutil.Apply, and ranging over file.Decls is easier.

Finally, also obfuscate the position of top-level vars, since we only
used to do it for top-level funcs. Without that fix, the test would fail
as varLines was unexpectedly sorted.
3 years ago
Daniel Martí 63c42c3cc7
support typechecking all of std (#236)
There was one bug keeping the command below from working:

	GOPRIVATE='*' garble build std

The bug is rather obscure; I'm still working on a minimal reproducer
that I can submit upstream, and I'm not yet convinced about where the
bug lives and how it can be fixed.

In short, the command would fail with:

	typecheck error: /go/src/crypto/ecdsa/ecdsa.go:122:12: cannot use asn1.SEQUENCE (constant 48 of type asn1.Tag) as asn1.Tag value in argument to b.AddASN1

Note that the error is ambiguous; there are two asn1 packages, but they
are actually mismatching. We can see that by manually adding debug
prints to go/types:

	constant: asn1.SEQUENCE (constant 48 of type golang.org/x/crypto/cryptobyte/asn1.Tag)
	argument type: vendor/golang.org/x/crypto/cryptobyte/asn1.Tag

It's clear that, for some reason, go/types ends up confused and loading
a vendored and non-vendored version of asn1. There also seems to be no
way to work around this with our lookup function, as it just receives an
import path as a parameter, and returns an object file reader.

For now, work around the issue by *not* using a custom lookup function
in this rare edge case involving vendored dependencies in std packages.
The added code has a lengthy comment explaining the reasoning.

I still intend to investigate this further, but there's no reason to
keep garble failing if we can work around the bug.

Fixes #223.
3 years ago
Daniel Martí 2fa5697189
avoid panic on funcs that almost look like tests (#235)
The added test case used to crash garble:

	panic: runtime error: invalid memory address or nil pointer dereference [recovered]
		panic: runtime error: invalid memory address or nil pointer dereference
	[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x8fe71b]

	goroutine 1 [running]:
	golang.org/x/tools/go/ast/astutil.Apply.func1(0xc0001e8880, 0xc000221570)
		/go/pkg/mod/golang.org/x/tools@v0.0.0-20210115202250-e0d201561e39/go/ast/astutil/rewrite.go:47 +0x97
	panic(0x975bc0, 0xd6c610)
		/sdk/go1.15.8/src/runtime/panic.go:969 +0x1b9
	go/types.(*Named).Obj(...)
		/sdk/go1.15.8/src/go/types/type.go:473
	mvdan.cc/garble.isTestSignature(0xc0001e7080, 0xa02e84)
		/src/garble/main.go:1170 +0x7b
	mvdan.cc/garble.(*transformer).transformGo.func2(0xc000122df0, 0xaac301)
		/src/garble/main.go:1028 +0xff1

We were assuming that the first parameter was a named type, but that
might not be the case.

This crash was found out in the wild, from which a minimal repro was
written. We add two variants of it to the test data, just in case.
3 years ago
Daniel Martí e64fccd367
better document and position the hash base64 encoding (#234)
We now document why we use a custom base64 charset.

The old "b64" name was also too generic, so it might have been misused
for other purposes.
3 years ago
Daniel Martí e33179d480
reverse: support unexported names and package paths (#233)
Unexported names are a bit tricky, since they are not listed in the
export data file. Perhaps unsurprisingly, it's only meant to expose
exported objects.

One option would be to go back to adding an extra header to the export
data file, containing the unexported methods in a map[string]T or
[]string. However, we have an easier route: just parse the Go files and
look up the names directly.

This does mean that we parse the Go files every time "reverse" runs,
even if the build cache is warm, but that should not be an issue.
Parsing Go files without any typechecking is very cheap compared to
everything else we do. Plus, we save having to load go/types information
from the build cache, or having to load extra headers from export files.

It should be noted that the obfuscation process does need type
information, mainly to be careful about which names can be obfuscated
and how they should be obfuscated. Neither is a worry here; all names
belong to a single package, and it doesn't matter if some aren't
actually obfuscated, since the string replacements would simply never
trigger in practice.

The test includes an unexported func, to test the new feature. We also
start reversing the obfuscation of import paths. Now, the test's reverse
output is as follows:

	goroutine 1 [running]:
	runtime/debug.Stack(0x??, 0x??, 0x??)
		runtime/debug/stack.go:24 +0x??
	test/main/lib.ExportedLibFunc(0x??, 0x??, 0x??, 0x??)
		p.go:6 +0x??
	main.unexportedMainFunc(...)
		C.go:2
	main.main()
		z.go:3 +0x??

The only major missing feature is positions and filenames. A follow-up
PR will take care of those.

Updates #5.
3 years ago
Daniel Martí a499a6bcd7
testdata: simplify goversion test (#232)
As per the TODO, just set up the files in the right places.
3 years ago
Daniel Martí af517a20f8 make the handling of import paths more robust
First, make isPrivate panic on malformed import paths, since that should
never happen. This catches the errors that some users had run into with
packages like gopkg.in/yaml.v2 and github.com/satori/go.uuid:

	panic: malformed import path "gopkg.in/garbletest%2ev2": invalid char '%'

This seems to trigger when a module path contains a dot after the first
element, *and* that module is fetched via the proxy. This results in the
toolchain URL-encoding the second dot, and garble ends up seeing that
encoded path.

We reproduce this behavior with a fake gopkg.in module added to the test
module proxy. Using yaml.v2 directly would have been easier, but it's
pretty large. Note that we tried a replace directive, but that does not
trigger the URL-encoding bug.

Also note that we do not obfuscate the gopkg.in package; that's fine, as
the isPrivate path validity check catches the bug either way.

For now, make initImport use url.PathUnescape to work around this issue.
The underlying bug is likely in either the goobj2 fork, or in the
upstream Go toolchain itself.

hashImport also gives a better error if it cannot find a package now,
rather than just an "empty seed" panic.

Finally, the sanity check in isPrivate unearthed the fact that we do not
support garbling test packages at all, since they were invalid paths
which never matched GOPRIVATE. Add an explicit check and TODO about
that.

Fixes #224.
Fixes #228.
3 years ago
Daniel Martí 5aa1a46cf7
README: document how to use different Go versions (#229)
This is a bit tricky right now, but will hopefully become simpler if
https://github.com/golang/go/issues/44329 gets approved and implemented.

Until then, suggest ways to do it in POSIX Shell.

While at it, stop using "garble" as a verb, and just use "obfuscate".
It's a bit silly if we say "garble garbles code"; it's much more
intuitive and less repetitive to say "garble obfuscates code".

The code still says "garble X" and "X was garbled" in a few places, and
we should fix those, but at least they are not publicly visible.
3 years ago
Daniel Martí af7f1fc4eb
README: clarify that Go 1.16 is not yet supported (#230)
Since that version is out now, this will help avoid confusion with
newcomers. The issue is pinned now, too.

While at it, stop saying why 1.14.x is no longer supported, as it's not
a maintained version of Go at this point anymore.

For #124.
3 years ago
Daniel Martí 79c775e218
obfuscate unexported names like exported ones (#227)
In 90fa325da7, the obfuscation logic was changed to use hashes for
exported names, but incremental names starting at just one letter for
unexported names. Presumably, this was done for the sake of binary size.

I argue that this is not a good idea for the default mode for a number
of reasons:

1) It makes reversing of stack traces nearly impossible for unexported
   names, since replacing an obfuscated name "c" with "originalName"
   would trigger too many false positives by matching single characters.

2) Exported and unexported names aren't different. We need to know how
   names were obfuscated at a later time in both cases, thanks to use
   cases like -ldflags=-X. Using short names for one but not the other
   doesn't make a lot of sense, and makes the logic inconsistent.

3) Shaving off three bytes for unexported names doesn't seem like a huge
   deal for the default mode, when we already have -tiny to optimize for
   size.

This saves us a bit of work, but most importantly, simplifies the
obfuscation state as we no longer need to carry privateNameMap between
the compile and link stages.

	name     old time/op       new time/op       delta
	Build-8        153ms ± 2%        150ms ± 2%    ~     (p=0.065 n=6+6)

	name     old bin-B         new bin-B         delta
	Build-8        7.09M ± 0%        7.08M ± 0%  -0.24%  (p=0.002 n=6+6)

	name     old sys-time/op   new sys-time/op   delta
	Build-8        296ms ± 5%        277ms ± 6%  -6.50%  (p=0.026 n=6+6)

	name     old user-time/op  new user-time/op  delta
	Build-8        562ms ± 1%        558ms ± 3%    ~     (p=0.329 n=5+6)

Note that I do not oppose using short names for both exported and
unexported names in the future for -tiny, since reversing of stack
traces will by design not work there. The code can be resurrected from
the git history if we want to improve -tiny that way in the future, as
we'd need to store state in header files again.

Another major cleanup we can do here is to no longer use the
garbledImports map. From a look at obfuscateImports, we hash a package's
import path with its action ID, much like exported names, so we can
simply re-do that hashing for the linker's -X flag.

garbledImports does have some logic to handle duplicate package names,
but it's worth noting that should not affect package paths, as they are
always unique. That area of code could probably do with some
simplification in the future, too.

While at it, make hashWith panic if either parameter is empty.
obfuscateImports was hashing the main package path without a salt due to
a bug, so we want to catch those in the future.

Finally, make some tiny spacing and typo tweaks to the README.
3 years ago
Daniel Martí d8e8738216
initial support for reversing panic output (#225)
For now, this only implements reversing of exported names which are
hashed with action IDs. Many other kinds of obfuscation, like positions
and private names, are not yet implemented.

Note that we don't document this new command yet on purpose, since it's
not finished.

Some other minor cleanups were done for future changes, such as making
transformLineInfo into a method that also receives the original
filename, and making header names more self-describing.

Updates #5.
3 years ago
Daniel Martí d33faabb94
remove unused test cmds (#226)
binsubint and binsubfloat haven't been used since 388ff7d1a4 over half a
year ago, and they're a significant amount of code.

Remove them for now; we can always re-add them from the git history if
needed.
3 years ago
Daniel Martí 78b69bbdab share a single temporary directory between all processes
Each compile and link sub-process created its own temporary directory,
to be cleaned up shortly after. Moreover, we also had the global
gob-encoded temporary file.

Instead, place all of those under a single, start-to-end temporary
directory. This is cleaner for the end user, and easier to maintain for
us.

A big plus is that we can also get rid of the confusing deferred global,
as it was mostly used to clean up these extra temp dirs. The only
remaining use was post-compile code, which is now an explicit func
returned by each "transform" func.

While at it, clean up the math/rand seeding code a bit and add a debug
log line, and stop shadowing a cmd string with a cmd *exec.Cmd.

Fixes #147.
3 years ago
Daniel Martí 39d60f91e6
include a "garble version" test (#221)
Not sure why the last commit came with none.

While at it, I noticed that the version command would ignore arguments.
Error instead.
3 years ago
Daniel Martí c7ee0e08e5
add a "version" command (#220)
Mimicking "go version", this tells the user garble's own version.

The code is exactly the same that is used for another tool written in
Go, shfmt. It uses runtime/debug to fetch the module version embedded in
binaries built by Go. For example:

	$ go get mvdan.cc/sh/v3/cmd/shfmt@latest
	$ shfmt -version
	v3.2.2
	$ go get mvdan.cc/sh/v3/cmd/shfmt@master
	$ shfmt -version
	v3.3.0-0.dev.0.20210203135509-56c9918c980d

Note that this will not work for a plain "go build" or "go install"
after a "git clone", since in that case the Go tool can't know garble's
own version via go.mod - since it's the current main module:

	$ go build
	$ ./garble version
	(devel)

For the use case of the power user building from source directly, they
are probably clever enough to tell us what git commit they are on, so
this is not a big problem right now. It will also get better once
golang/go#37475 is fixed in the future.

Until then, if we need to do "release" builds locally, we can embed an
explicit version into the binary via ldflags:

	$ go build -ldflags=-X=main.version=v1.2.3
	$ ./garble version
	v1.2.3

Fixes #217.
3 years ago
Daniel Martí 1db6e1e230
make -coverprofile include toolexec processes (#216)
testscript already included magic to also account for commands in the
total code coverage. That does not happen with plain tests, since those
only include coverage from the main test process.

The main problem was that, before, indirectly executed commands did not
properly save their coverage profile anywhere for testscript to collect
it at the end. In other words, we only collected coverage from direct
garble executions like "garble -help", but not indirect ones like "go
build -toolexec=garble".

	$ go test -coverprofile=cover.out
	PASS
	coverage: 3.6% of statements
	total coverage: 16.6% of statements
	ok  	mvdan.cc/garble	6.453s

After the delicate changes to testscript, any direct or indirect
executions of commands all go through $PATH and properly count towards
the total coverage:

	$ go test -coverprofile=cover.out
	PASS
	coverage: 3.6% of statements
	total coverage: 90.5% of statements
	ok  	mvdan.cc/garble	33.258s

Note that we can also get rid of our code to set up $PATH, since
testscript now does it for us.

goversion.txt needed minor tweaks, since we no longer set up $WORK/.bin.

Finally, note that we disable the reuse of $GOCACHE when collecting
coverage information. This is to do "full builds", as otherwise the
cached package builds would result in lower coverage.

Fixes #35.
3 years ago
Andrew LeFevre e014f480f9
if the seed is random and the build fails, print the seed (#213)
Fixes #212
3 years ago