Commit Graph

51 Commits (v0.9.2)

Author SHA1 Message Date
pagran e04c605c93 preallocate data in shuffle and split literal obfuscator 1 year ago
lu4p 5b2193351f Decrease binary size for -literals
Only string literals over 8 characters in length are now being
obfuscated. This leads to around 20% smaller binaries when building with
-literals.

Fixes #618
1 year ago
pagran 45394ac13f
update linker patches from our fork 1 year ago
pagran 22c177f088
remove all unexported func names with -tiny via the linker
The patch to the linker does this when generating the pclntab,
which is the binary section containing func names.
When `-tiny` is being used, we look for unexported funcs,
and set their names to the offset `0` - a shared empty string.

We also avoid including the original name in the binary,
which saves a significant amount of space.
The following stats were collected on GOOS=linux,
which show that `-tiny` is now about 4% smaller:

	go build                  1203067
	garble build               782336
	(old) garble -tiny build   688128
	(new) garble -tiny build   659456
1 year ago
pagran 6ace03322f
patch and rebuild cmd/link to modify the magic value in pclntab
This value is hard-coded in the linker and written in a header.
We could rewrite the final binary, like we used to do with import paths,
but that would require once again maintaining libraries to do so.

Instead, we're now modifying the linker to do what we want.
It's not particularly hard, as every Go install has its source code,
and rebuilding a slightly modified linker only takes a few seconds at most.

Thanks to `go build -overlay`, we only need to copy the files we modify,
and right now we're just modifying one file in the toolchain.
We use a git patch, as the change is fairly static and small,
and the patch is easier to understand and maintain.

The other side of this change is in the runtime,
as it also hard-codes the magic value when loading information.
We modify the code via syntax trees in that case, like `-tiny` does,
because the change is tiny (one literal) and the affected lines of code
are modified regularly between major Go releases.

Since rebuilding a slightly modified linker can take a few seconds,
and Go's build cache does not cache linked binaries,
we keep our own cached version of the rebuilt binary in `os.UserCacheDir`.

The feature isn't perfect, and will be improved in the future.
See the TODOs about the added dependency on `git`,
or how we are currently only able to cache one linker binary at once.

Fixes #622.
1 year ago
Daniel Martí d955196470 avoid using math/rand's global funcs like Seed and Intn
Go 1.20 is starting to deprecate the use of math/rand's global state,
per https://go.dev/issue/56319 and https://go.dev/issue/20661.
The reasoning is sound:

	Deprecated: Programs that call Seed and then expect a specific sequence
	of results from the global random source (using functions such as Int)
	can be broken when a dependency changes how much it consumes from the
	global random source. To avoid such breakages, programs that need a
	specific result sequence should use NewRand(NewSource(seed)) to obtain a
	random generator that other packages cannot access.

Aside from the tests, we used math/rand only for obfuscating literals,
which caused a deterministic series of calls like Intn. Our call to Seed
was also deterministic, per either GarbleActionID or the -seed flag.

However, our determinism was fragile. If any of our dependencies or
other packages made any calls to math/rand's global funcs, then our
determinism could be broken entirely, and it's hard to notice.

Start using separate math/rand.Rand objects for each use case.
Also make uses of crypto/rand use "cryptorand" for consistency.

Note that this requires a bit of a refactor in internal/literals
to start passing around Rand objects. We also do away with unnecessary
short funcs, especially since math/rand's Read never errors,
and we can obtain a byte via math/rand's Uint32.
2 years ago
Daniel Martí 3c7141e801 update the state of a few TODOs related to upstream Go
The generics issue has been fixed for the upcoming Go 1.20.
Include that version as a reminder for when we can drop Go 1.19.

The fs.SkipAll proposal is also implemented for Go 1.20.

The BinaryContentID comment was a little bit trickier.
We did get stamped VCS information some time ago,
but it only provides us with the current commit info and a dirty bit.
That is not enough for our use of the build cache,
because we want any uncommitted changes to garble to cause rebuilds.

I don't think we'll get any better than using garble's own build ID.
Reword the quasi-TODO to instead explain what we're doing and why.
2 years ago
Daniel Martí 21bd89ff73 slight simplifications and alloc reductions
Reuse a buffer and a map across loop iterations, because we can.

Make recordTypeDone only track named types, as that is enough to detect
type cycles. Without named types, there can be no cycles.

These two reduce allocs by a fraction of a percent:

	name      old time/op         new time/op         delta
	Build-16          10.4s ± 2%          10.4s ± 1%    ~     (p=0.739 n=10+10)

	name      old bin-B           new bin-B           delta
	Build-16          5.51M ± 0%          5.51M ± 0%    ~     (all equal)

	name      old cached-time/op  new cached-time/op  delta
	Build-16          391ms ± 9%          407ms ± 7%    ~     (p=0.095 n=10+9)

	name      old mallocs/op      new mallocs/op      delta
	Build-16          34.5M ± 0%          34.4M ± 0%  -0.12%  (p=0.000 n=10+10)

	name      old sys-time/op     new sys-time/op     delta
	Build-16          5.87s ± 5%          5.82s ± 5%    ~     (p=0.182 n=10+9)

It doesn't seem like much, but remember that these stats are for the
entire set of processes, where garble only accounts for about 10% of the
total wall time when compared to the compiler or linker. So a ~0.1%
decrease globally is still significant.

linkerVariableStrings is also indexed by *types.Var rather than types.Object,
since -ldflags=-X only supports setting the string value of variables.
This shouldn't make a significant difference in terms of allocs,
but at least the map is less prone to confusion with other object types.
To ensure the new code doesn't trip up on non-variables, we add test cases.

Finally, for the sake of clarity, index into the types.Info maps like
Defs and Uses rather than calling ObjectOf if we know whether the
identifier we have is a definition of a name or the use of a defined name.
This isn't better in terms of performance, as ObjectOf is a tiny method,
but just like with linkerVariableStrings before, the new code is clearer.
2 years ago
lu4p 84ba444b7c
Disable seed obfuscator (#535)
The seed obfuscator uses a type declaration in order to declare a function,
which returns a function with the same type.

This breaks when obfuscating literals inside generic functions, because
type declarations inside generic functions are not currently supported.

Therefore the obfuscator gets disabled until
https://github.com/golang/go/issues/47631 is fixed.
2 years ago
shellhazard 22e3d30216
support code taking the address of a []byte literal (#530) 2 years ago
lu4p d555639657 Remove unused imports via go/types.
Fixes #481
2 years ago
Daniel Martí 2d4cc49d50 CI: bump gotip to February
While here, fix two typos.
2 years ago
Daniel Martí c9341790d4 avoid obfuscating literals set via -ldflags=-X
The -X linker flag sets a string variable to a given value,
which is often used to inject strings such as versions.

The way garble's literal obfuscation works,
we replace string literals with anonymous functions which,
when evaluated, result in the original string.

Both of these features work fine separately,
but when intersecting, they break. For example, given:

	var myVar = "original"
	[...]
	-ldflags=-X=main.myVar=replaced

The -X flag effectively replaces the initial value,
and -literals adds code to be run at init time:

	var myVar = "replaced"
	func init() { myVar = func() string { ... } }

Since the init func runs later, -literals breaks -X.
To avoid that problem,
don't obfuscate literals whose variables are set via -ldflags=-X.

We also leave TODOs about obfuscating those in the future,
but we're also leaving regression tests to ensure we get it right.

Fixes #323.
2 years ago
Daniel Martí d25e718d0c stop passing ignoreObjects to literals.Obfuscate
Literal obfuscation uses constant folding now,
so it no longer needs to record identifiers to ignore.
Remove the parameter and the outdated bit of docs.
3 years ago
Daniel Martí 4f0657a19a prepare for v0.5.0
While here, add a TODO I forgot about, and run gofumpt.

Also bump all test timeouts slightly,
as the Mac and Windows hosted runners are a bit slow
and I've hit failures twice recently.
3 years ago
Daniel Martí 29ea99fc5f CI: test on GOARCH=386
Note that this cross-compilation disables cgo by default,
and so the cgo.txt test script isn't run on GOARCH=386.
That seems fine for now, as the test isn't arch-specific.

This testing uncovered one build failure in internal/literals;
the comparison between int and math.MaxUint32 is invalid on 32-bit.
To fix that build failure, use int64 consistently.

One test also incorrectly assumed amd64; it now supports 386 too.
For any other architecture, it's being skipped for now.

I also had to increase the -race test timeout,
as it usually takes 8-9m on GitHub Actions,
and the timeout would sometimes trigger.

Finally, use "go env" rather than "go version" on CI,
which gives us much more useful information,
and also includes Go's own version now via GOVERSION.

Fixes #426.
3 years ago
lu4p a645929151
obfuscate literals via constant folding
Constants don't need to be added to ignoreObjs anymore,
because go/types now does this work for us.

Fixes #360
3 years ago
hasheddan 6b632d07e2 Fix minor typo in RecordUsedAsConstants docstring
Updates RecordUsedAsConstants docstring with minor typo fix for
identifiers.

Signed-off-by: hasheddan <georgedanielmangum@gmail.com>
3 years ago
lu4p 3ab59000f3 Follow up: Obfuscate more byte slice literals 3 years ago
Daniel Martí 691a44cecb avoid breaking const declarations using iotas
With the -literals flag, we try to convert some const declarations to
vars, as long as that doesn't break typechecking. We really only do that
for typed constant strings, really.

There was a quirk: if a numerical constant had a type and used iota, we
would not obfuscate its value, but we would still convert the
declaration from const to var. Since iotas only work within const
declarations, that would break compilation:

	> garble -literals build
	[stderr]
	# test/main
	FeWE3zwi.go:19: undefined: iota
	exit status 2

To fix the problem, make the logic more conservative: only obfuscate
constant declarations where the values are typed strings, meaning that
any numerical constants are left entirely untouched.

This fixes the build of google.golang.org/protobuf/runtime/protoiface
with -literals turned on.
3 years ago
lu4p c1672cdc0d Obfuscate more byte slice literals
Slices with hex, octal, binary and rune elements are now obfuscated.
3 years ago
lu4p 552a6bcfb0 Obfuscate literals in string slices and arrays
Fixes #354
3 years ago
Daniel Martí 8edde922ee remove unused code spotted by -coverprofile
Remove some asthelper APIs that haven't been used for some time.
They can be recovered from the git history if needed again.

One type assertion in the literals package is always true.

Embedded field objects are handled near the top of transformGo, so the
extra !obj.Embedded() check was always true. Remove it.

We always obfuscate standalone funcs now, so the obfuscatedTypesPackage
check is no longer necessary. This was necessary when we used to not
obfuscate func names when they were used in linkname directives.

The workaround for test package imports in obfuscatedTypesPackage I had
to add a few commits ago no longer seems to be necessary. This might be
thanks to the simplification with functions in the paragraph just above.

It's impossible to run garble without -trimpath nowadays, as we error
before the build even starts:

	$ go build -toolexec=garble
	go tool compile: exit status 1
	cannot open shared file, this is most likely due to not running "garble [command]"

When run as "garble build", the trimpath flag is always set. So the
check in alterTrimpath never triggers anymore, and couldn't be tested.

Finally, simplify the handling of comment syntax in printFile, and add a
few TODOs for other code paths not covered by our existing tests.

Total code coverage is up from 90.3% to 91.0%.
3 years ago
Daniel Martí 6b1a062c6f make -literals succeed on all of std
Two bugs were remaining which made the build with -literals of std fail.

First, we were ignoring too many objects in constant expressions,
including type names. This resulted in type names declared in
dependencies which were incorrectly not obfuscated in the current
package:

	# go/constant
	O1ku7TCe.go:1: undefined: alzLJ5Fd.Word
	b0ieEGVQ.go:1: undefined: alzLJ5Fd.Word
	LEpgYKdb.go:4: undefined: alzLJ5Fd.Word
	FkhHJCfm.go:1: undefined: alzLJ5Fd.Word

This edge case is easy to reproduce, so a test case is added to
literals.txt.

The second issue is trickier; in some packages like os/user, we would
get syntax errors because of comments printed out of place:

	../tip/os/user/getgrouplist_unix.go:35:130: syntax error: unexpected newline, expecting comma or )

This is a similar kind of error that we tried to fix with e2f06cce94. In
particular, it's fixed by also setting CallExpr.Rparen in withPos. We
also add many other missing Pos fields for good measure, even though
we're not sure they help just yet.

Unfortunately, all my attempts to minimize this into a reproducible
failure have failed. We can't just copy the failing file from os/user,
as it only builds on some OSs. It seems like it was the perfect mix of
cgo (which adds line directive comments) plus unlucky positioning of
literals.

For that last reason, as well as for ensuring that -literals works well
with a wide variety of software, we add a build of all of std with
-literals when not testing with -short. This is akin to what we do in
goprivate.txt, but with the -literals flag. This does make "go test"
more expensive, but also more thorough.

Fixes #285, hopefully for good this time.
3 years ago
Daniel Martí e2f06cce94 set positions when using cursor.Replace
The regular obfuscation process simply modifies some simple nodes, such
as identifiers and strings. In those cases, we modify the nodes
in-place, meaning that their positions remain the same. This hasn't
caused any problems.

Literal obfuscation is trickier. Since we replace one expression with an
entirely different one, we use cursor.Replace. The new expression is
entirely made up on the spot, so it lacks position information.

This was causing problems. For example, in the added test input:

	> garble -literals build
	[stderr]
	# test/main
	dgcm4t6w.go:3: misplaced compiler directive
	dgcm4t6w.go:4: misplaced compiler directive
	dgcm4t6w.go:3: misplaced compiler directive
	dgcm4t6w.go:6: misplaced compiler directive
	dgcm4t6w.go:7: misplaced compiler directive
	dgcm4t6w.go:3: misplaced compiler directive
	dgcm4t6w.go:9: misplaced compiler directive
	dgcm4t6w.go:3: misplaced compiler directive
	dgcm4t6w.go:3: too many errors

The build errors are because we'd move the compiler directives, which
makes the compiler unhappy as they must be directly followed by a
function declaration.

The root cause there seems to be that, since the replacement nodes lack
position information, go/printer would try to estimate its printing
position by adding to the last known position. Since -literals adds
code, this would result in the printer position increasing rapidly, and
potentially printing directive comments earlier than needed.

For now, making the replacement nodes have the same position as the
original node seems to stop go/printer from making this mistake.

It's possible that this workaround won't be bulletproof forever, but it
works well for now, and I don't see a simpler workaround right now.
It would be possible to use fancier mechanisms like go/ast.CommentMap or
dave/dst, but those are a significant amount of added complexity as well.

Fixes #285.
3 years ago
Daniel Martí b995c1b589
obfuscate literals as part of transformGo (#299)
This is easier to understand, since now the modification of the
*ast.File is all within a single chunk of code. We can also simplify
literals.Obfuscate to work on a single file, as transformGo runs in a
loop.

We also remove the "use receiver" TODOs, since the code is now in a
different package and it can't declare methods on a type here.
3 years ago
lu4p a397a8e94e Literals: Skip constants with inferred values.
Obfuscating literals broke constants
with values inferred via iota before,
because it would be moved to a variable declaration instead.
3 years ago
Andrew LeFevre e014f480f9
if the seed is random and the build fails, print the seed (#213)
Fixes #212
3 years ago
Daniel Martí f667a7ad31
all: use better names than "blacklist", and docs (#206)
The three transformer map fields are now very well documented, which was
badly needed for anyone trying to understand the source code.

ignoreObjects is also a better field name than blacklist, as it says
what the map is indexed by (types.Object) and what we do with those:
ignore them when we obfuscate code.

The rewriting of go:linkname directives is moved to a separate func, so
that we can name that func from the docs.

Finally, the docs are overall improved a bit, as I was re-tracing all
the pieces of code that used the ambiguous "blacklist" terminology.

Fixes #169.
4 years ago
Daniel Martí ba19a1d49c
do not try to obfuscate huge literals (#204)
It's common for asset bundling code generators to produce huge literals,
for example in strings. Our literal obfuscators are meant for relatively
small string-like literals that a human would write, such as URLs, file
paths, and English text.

I ran some quick experiments, and it seems like "garble build -literals"
appears to hang trying to obfuscate literals starting at 5-20KiB. It's
not really hung; it's just doing a lot of busy work obfuscating those
literals. The code it produces is also far from ideal, so it also takes
some time to finally compile.

The generated code also led to crashes. For example, using "garble build
-literals -tiny" on a package containing literals of over a megabyte,
our use of asthelper to remove comments and shuffle line numbers could
run out of stack memory.

This all points in one direction: we never designed "-literals" to deal
with large sizes. Set a source-code-size limit of 2KiB.

We alter the literals.txt test as well, to include a few 128KiB string
literals. Before this fix, "go test" would seemingly hang on that test
for over a minute (I did not wait any longer). With the fix, those large
literals are not obfuscated, so the test ends in its usual 1-3s.

As said in the const comment, I don't believe any of this is a big
problem. Come Go 1.16, most developers should stop using asset-bundling
code generators and use go:embed instead. If we wanted to somehow
obfuscate those, it would be an entirely separate feature.

And, if someone wants to work on obfuscating truly large literals for
any reason, we need good tests and benchmarks to ensure garble does not
consume CPU for minutes or run out of memory.

I also simplified the generate-literals test command. The only argument
that matters to the script is the filename, since it's used later on.

Fixes #178.
4 years ago
lu4p 2e2bd09b5e Simplify maps to boolean value 4 years ago
Nick 43163c2e9b Remove the `usesUnsafe` global variable as it's unused
I've tested the code on unsafe code bases as well, I truly believe that
this variable is not necessary/used.
4 years ago
Nick d4eee0c9bc Replaced asthelper.Ident with ast.NewIdent
No point in having around a helper method that has been implemented for
us by `go/ast`
4 years ago
Daniel Martí 805c895d59 set up an AUTHORS file to attribute copyright
Many files were missing copyright, so also add a short script to add the
missing lines with the current year, and run it.

The AUTHORS file is also self-explanatory. Contributors can add
themselves there, or we can simply update it from time to time via
git-shortlog.

Since we have two scripts now, set up a directory for them.
4 years ago
Daniel Martí f764467e9b all: update the docs a bit
Rework the features section in the README, leaving optional features at
the end of the list. Simplify the caveats list, too; the build cache and
exported field/method bits only need one point each. Overall, the
section was far too wordy for little reason.

Also redo the help text a bit. There's now a line to briefly introduce
the tool, as well as a link to the README with all the details. Finally,
the flags have shorter and more consistent help strings.

While at it, remove two unused global vars as spotted by staticcheck.
4 years ago
lu4p 388ff7d1a4
remove buggy number literal obfuscation
Also remove boolean literal obfuscation.
4 years ago
Daniel Martí 75e904f6d4
various minor cleanups and fixes (#99)
Error strings should never be capitalized.

A binsubstr line in one of the tests was duplicate and thus useless.

Remove duplicate or trailing spaces in test scripts.

Finally, add a TODO for an optimization I just spotted.
4 years ago
lu4p 870cde9a0a
Remove xor from the name of literal obfuscators. (#91) 4 years ago
pagran 28adbaa73b
Randomize operator (xor, add, subtract) on all obfuscators (#90)
Co-authored-by: lu4p <lu4p@pm.me>
4 years ago
pagran 2eba744530
Add XorSeed obfuscator (#86)
Co-authored-by: lu4p <lu4p@pm.me>
4 years ago
pagran 9c25f4c2b2
Add xorShuffle obfuscator (#85)
* Refactoring

* Rename Xor2 to XorShuffle
4 years ago
lu4p 0dd97ed0fa
math/rand.Intn(n) generates a `value`, (#83)
which has the following properties `value >=0 && value < n`.

I previously thought it was `value >=0 && value <= n`.
4 years ago
pagran c51e08ef37
Add split obfuscator (#81) 4 years ago
pagran c2079ac0a1
Add test for literal obfuscators (#80)
* Combine literals-all-obfuscators.txt nad literals.txt
Rewrite literals.txt logic

* Remove unused \s

* Refactoring and add float ast helpers
4 years ago
lu4p 5cbbac56f3
move asthelper functions to separate package (#78) 4 years ago
Daniel Martí 846ddb4097
internal/literals: minor adjustments to the last commits (#77)
First, unindent some of the AST code.

Second, genRandInt is unused; delete it.

Third, genRandIntn is really just mathrand.Intn. Just use it directly.

Fourth, don't use inline comments if they result in super long lines.
4 years ago
pagran 14a19b3e6b
intLiteral helper now accepts int (#76) 4 years ago
pagran 4b73c37ed7
Add new obfuscators for literals - swap (#74)
Implement swap obfuscator
4 years ago
lu4p 50d24cdf51 Add float, int, and boolean literal obfuscation.
Add ast helper functions to reduce ast footprint.

Add binsubfloat and binsubint functions for testing.

Fixes #55.
4 years ago
lu4p 705f9d3a28 Fix byte array and untyped constant obfuscation.
Byte arrays were previously,
obfuscated as byte slices.

Untyped constants are now skipped,
because they cannot be replaced with typed variables.
4 years ago