Commit Graph

94 Commits (master)

Author SHA1 Message Date
Daniel Martí 8ee4c91196 make gotoolchain.txtar upgrade to the host's GOVERSION
On CI we test on go1.23.x and go1.24.x, so if we always upgrade
to the latest go1.24.x, that will cause garble to complain
when running on go1.23.x:

    garble was built with "go1.23.7" and can't be used with the newer "go1.24.1"

Moreover, the test hard-coded go1.24.1, which is currently the latest
go1.24.x but will not be for long, so this test was brittle.
5 days ago
Daniel Martí db3003b9fa use the correct toolchain "go" tool under GOTOOLCHAIN=auto
We call `go list` to collect information about all the packages
to obfuscate and build, which is crucial to be able to perform
obfuscation of names used across packages.

However, when GOTOOLCHAIN causes a toolchain upgrade,
we must ensure that we use the upgraded Go tool;
otherwise we are mixing information from different toolchain versions.

Fixes #934.
5 days ago
Daniel Martí 7f80dfb59d rebuild cmd/link with the correct toolchain under GOTOOLCHAIN=auto
When we build the patched cmd/link binary for use by garble,
we perform this build in a temporary directory so that the Go module
from the user does not get in the way.

When the user module made us upgrade the toolchain per GOTOOLCHAIN,
leaving that module's directory stops upgrading the toolchain,
so we patch a newer toolchain and build it with an older toolchain.
This is largely harmless, but it makes the newer toolchain think
it is actually an older toolchain, which leads to those pesky
"linker object header mismatch" version errors.

Updates #934.
5 days ago
Daniel Martí 6f7af3b785 add test reproducing gotoolchain upgrade errors
While here, make link.version more readable by adding a newline
and document the assumption it makes about GOVERSION.

For #934.
5 days ago
Daniel Martí 2adfc43326 bump unsupportedGo to mark Go 1.24 as supported
debugdir.txtar also needed tweaking as runtime/map.go is gone
starting in Go 1.24.

Finally, modinfo.txtar needed tweaking since Go 1.24 started stamping
Go binaries with VCS-derived module versions, so we no longer end up
with empty "(devel)" versions.
2 months ago
Paul Scheduikat 97833204f8 skip all type parameters in recordType
We only did this for Container in the type switch, but not for Struct.
The added test case panics otherwise.
Just like in the previous case, we still don't need to recurse
into type parameters for fieldToStruct to be filled correctly.

Fixes #899
3 months ago
Daniel Martí e0dbea2b3d hash structs via the bundled and altered typeutil.hash
As spotted by the protobuf package via check-third-party.sh,
the two structs below are identical:

    type alias1 = int64
    type Struct1 struct { Alias alias1 }

    type alias2 = int64
    type Struct2 struct { Alias alias2 }

Our previous approach with stripStructTags dealt with struct tags,
but it did not deal with aliases, which are now present in go/types
thanks to the new alias tracking.

The new approach properly ignores struct tags and unaliases any
type aliases, resulting in correct hashes of any type.
3 months ago
Daniel Martí 83a06019be rely on go/types alias tracking
Allows us to remove EmbeddedAliasFields, recordedObjectString,
and all the logic around using them.

Resolves an issue where a user was running into a panic in
our logic to record embedded aliases.

Note that this means we require Go 1.23.5 or later now,
which also meant some changes to goversion.txtar to keep it green.

Fixes #827.
3 months ago
Daniel Martí 76905ba3bc use the latest testscript to drop func() int
And don't fail when requesting help, which is what Go's flag package
has been moving towards.
3 months ago
Paul Scheduikat 926f3de60d obfuscate all names used in reflection
Go code can retrieve and use field and method names via the `reflect` package.
For that reason, historically we did not obfuscate names of fields and methods
underneath types that we detected as used for reflection, via e.g. `reflect.TypeOf`.

However, that caused a number of issues. Since we obfuscate and build one package
at a time, we could only detect when types were used for reflection in their own package
or in upstream packages. Use of reflection in downstream packages would be detected
too late, causing one package to obfuscate the names and the other not to, leading to a build failure.

A different approach is implemented here. All names are obfuscated now, but we collect
those types used for reflection, and at the end of a build in `package main`,
we inject a function into the runtime's `internal/abi` package to reverse the obfuscation
for those names which can be used for reflection.

This does mean that the obfuscation for these names is very weak, as the binary
contains a one-to-one mapping to their original names, but they cannot be obfuscated
without breaking too many Go packages out in the wild. There is also some amount
of overhead in `internal/abi` due to this, but we aim to make the overhead insignificant.

Fixes #884, #799, #817, #881, #858, #843, #842

Closes #406
4 months ago
Daniel Martí acec5954ce clarify and test that runtime.GOROOT is not available
Closes #878.
4 months ago
Daniel Martí 30357af923
drop Go 1.22 and require Go 1.23.0 or later (#876)
This lets us start taking advantage of featurs from Go 1.23,
particularly tracking aliases in go/types and iterators.

Note that we need to add code to properly handle or skip over the new
*types.Alias type which go/types produces for Go type aliases.
Also note that we actually turn this mode off entirely for now,
due to the bug reported at https://go.dev/issue/70394.

We don't yet remove our own alias tracking code yet due to the above.
We hope to be able to remove it very soon.
5 months ago
NHAS 69d7b84f35
Fix reflection detection for linknamed methods (#883) 5 months ago
Daniel Martí 48fac78ecc set up go/types.Config.Sizes according to GOARCH
Otherwise we miscalculate int sizes, type sizes, alignments, and so on.
Caught by the GOARCH=386 go test on CI, since the os package imports
internal/syscall/unix, which uses arch-dependent padding.

The different padding between our incorrect use of go/types
and the correct typechecking done by the compiler caused different
obfuscation of fields, as the struct types stringified differently,
and they are used as a hash salt for field name obfuscation.
7 months ago
Daniel Martí 92a7b5fe8a remove test linknames into std
As of Go 1.23, these are forbidden by https://go.dev/issue/67401.

Updates #859.
7 months ago
Daniel Martí 51ee956e90 disable linker.txtar as it's broken by design on Go 1.23
Updates #859.
7 months ago
Daniel Martí 36457412db tweak script/reverse.txtar to pass on both Go 1.22 and 1.23
Updates #859.
7 months ago
Daniel Martí 04df5ea4df add linker patches for Go 1.23
Rebasing the Go 1.22 patches on top of Go 1.23.0,
as published on https://github.com/burrowers/go-patches/pull/7.

Updates #859.
7 months ago
Daniel Martí 9f82b2bbfe make pointer regular expressions more flexible
We just got a failure on Mac on CI with pointers past eight digits:

    (0x10432dec0,0x10433c690)

Use `[[:xdigit:]]+` consistently.
11 months ago
Daniel Martí 9a2ef369b2 fail early if we know we lack Go linker patches
Right now, we only have linker patches for Go 1.22.x.
We only ever maintain those for one or two major Go versions at a time.

If a user tries to use the Go toolchain from 1.21, we already fail
with "Go version too old" messages early on, but we don't for 1.23,
causing a relatively confusing error later on when we link a binary:

    cannot get modified linker: cannot retrieve linker patches: open patches/go1.23: file does not exist

Instead, fail early and with a good error message.
1 year ago
Daniel Martí 66b61406c1 obfuscate syscall again to fix x/sys/unix
When updating Garble to support Go 1.22.0, CI on MacOS spotted
that the syscall package was failing to build given that it uses
assembly code which is only allowed in some std packages.

That allowlist is based on import paths, and we were obfuscating
the syscall package's import path, so that was breaking GOOS=darwin.
As a fix, I added syscall to runtimeAndDeps to not obfuscate it.

That wasn't a great fix; it's not part of runtime and its dependencies,
and there's no reason we should avoid obfuscating the package contents.
Not obfuscating the contents in fact broke x/sys/unix,
as it contains a copy of syscall.Rlimit which it type converted with.

Undo that fix and reinstate the gogarble.txtar syscall test.
Implement the fix where we only leave syscall's import path alone.
Add a regression test, and add a note about adding x/net and x/sys
to check-third-party.sh so that we can catch these bugs earlier.

Fixes #830.
1 year ago
Daniel Martí ad2ecc7f2f drop Go 1.21 and start using go/version
Needing to awkwardly treat Go versions as if they were semver
is no longer necessary thanks to go/version being in Go 1.22.0 now.
1 year ago
Daniel Martí 55921a06d4 fix building for GOOS=darwin on Go 1.22.0
It seems like building with Go 1.22.0 for GOOS=darwin started
running into some issues with the syscall package's use of ABIInternal
in assembly source code:

    > exec garble build
    [stderr]
    # syscall
    [...].s:16: ABI selector only permitted when compiling runtime, reference was to "runtime.entersyscall"

The error can be reproduced from another platform like GOOS=linux
as long as we have any test that cross-compiles std to GOOS=darwin.
We had crossbuild.txtar which only ensured we covered GOOS=windows
and GOOS=linux, so add a third case to ensure MacOS is covered too.

This will slow down the tests a bit, but is important for the sake
of ensuring that we catch these bugs early, even without MacOS on CI.
In fact, we hadn't caught this earlier for Go 1.22 precisely because
on CI we only tested on Go tip with GOOS=linux, for the sake of speed.

Adding the rest of the package import paths from objabi.allowAsmABIPkgs
to our runtimeAndDeps generated map solves this error.
1 year ago
pagran e8fe80d627
add trash block generator (#825)
add trash block generator

For making static code analysis even more difficult, added feature for
generating trash blocks that will never be executed. In combination
with control flow flattening makes it hard to separate trash code from
the real one, plus it causes a large number of trash references to
different methods.

Trash blocks contain 2 types of statements:
1. Function/method call with writing the results into local variables
and passing them to other calls
2. Shuffling or assigning random values to local variables
1 year ago
Daniel Martí bdfa619f77 support inline comments in asm #include lines
That is, the assembly line

    #include "foo.h" // bar

would make garble run into an error, as we would try to parse
the #include directive before we stripped comments,
effectively trying to unquote the string

    "foo.h" // bar

rather than just the included filename

    "foo.h"

Add test cases for asm but also cgo, while at it.

Fixes #812.
1 year ago
pagran 3a9c9aa3d4
fix shuffle obfuscation compiler optimization
In some cases, compiler could optimize the shuffle obfuscator,
causing exposing the obfuscated literal.
As a fix, added xor encryption of array indexes.
1 year ago
Paul Scheduikat 96d2d8b0de
track types used in make assigned to a reflected type
Fixes #690.
1 year ago
Paul Scheduikat bec8043790
track converted types when recording reflection usage
Fixes #763.
Fixes #782.
Fixes #785.
Fixes #807.
1 year ago
Daniel Martí 4271bc45ae avoid panic when embedding a builtin alias
TypeName.Pkg is documented as:

    Pkg returns the package to which the object belongs.
    The result is nil for labels and objects in the Universe scope.

When a struct type embeds a builtin alias type, such as byte,
this would lead to a panic since we assumed we could use the Pkg method.

Fixes #798.
1 year ago
Daniel Martí 6f0e46f80b strip struct tags when hashing structs for type identity
This was a long standing TODO, and a user finally ran into it.
The fix isn't terribly straightforward, but I'm pretty happy with it.

Add a test case where the same struct field is identical
with no tag, with the "tagged1" json tag, and again with "tagged2".
While here, we add a test case for a regular named field too.

Fixes #801.
1 year ago
Daniel Martí 126618a0d5 drop support for Go 1.20
Go 1.21.0 was released in August 2023, so our upcoming release
will no longer support the Go 1.20 release series.

The first Go 1.22 release candidate is also due in December 2023,
less than a month from now, so dropping 1.20 will simplify 1.22 work.
1 year ago
pagran 5e80f12be7
implement flattening hardening
Without hardening, obfuscation is vulnerable to analysis via symbolic
execution because all keys are opened, and it is easy to trace their
connections. Added extendable (contribution-friendly) hardening
mechanism that makes it harder to determine relationship between key and
execution block through key obfuscation.

There are 2 hardeners implemented and both are compatible with literal
obfuscation, which can make analysis even more difficult.
1 year ago
Daniel Martí 82834ace20 testdata: skip runtime rebuild test on darwin
CI ran into the failure again; reopened #609 for now.
2 years ago
Daniel Martí 716322cdf8 all: start suggesting Go 1.21 and testing on it
Also note that the first release is now 1.21.0,
so we no longer need to use the awkward 1.21.x notation in warnings.
2 years ago
Daniel Martí 23c8641855 propagate "uses reflection" through SSA stores
Up until now, the new SSA reflection detection relied on call sites
to propagate which objects (named types, struct fields) used reflection.
For example, given the code:

    json.Marshal(new(T))

we would first record that json.Marshal calls reflect.TypeOf,
and then that the user's code called json.Marshal with the type *T.

However, this would not catch a slight variation on the above:

    var t T
    reflect.TypeOf(t)
    t.foo = struct{bar int}{}

Here, type T's fields such as "foo" and "bar" are not obfuscated,
since our logic sees the call site and marks the type T recursively.
However, the unnamed `struct{bar int}` type was still obfuscated,
causing errors such as:

    cannot use struct{uKGvcJvD24 int}{} (value of type struct{uKGvcJvD24 int}) as struct{bar int} value in assignment

The solution is to teach the analysis about *ssa.Store instructions
in a similar way to how it already knows about *ssa.Call instructions.
If we see a store where the destination type is marked for reflection,
then we mark the source type as well, fixing the bug above.

This fixes obfuscating github.com/gogo/protobuf/proto.
A number of other Go modules fail with similar errors,
and they look like very similar bugs,
but this particular fix doesn't apply to them.
Future incremental fixes will try to deal with those extra cases.

Fixes #685.
2 years ago
pagran 260cad2a3f
add "max" flag value and limits for control flow obfuscation parameters 2 years ago
pagran 0e2e483472
add control flow obfuscation
Implemented control flow flattening with additional features such as block splitting and junk jumps
2 years ago
Daniel Martí d89a55687c add a regression test for type names and reflect
We were recently altering the logic in reflect.go for type names,
which could have broken this kind of valid use of reflection.

Add a regression test, which I verified would break before my last
change to "simplify" the logic, which actually changed the logic,
as xuannv112 correctly pointed out.

After thinking about the change in behavior for a little while,
I realised that the new behavior is more correct, hence the test.
2 years ago
Daniel Martí d60957d514 add more reflect test cases and simplify logic
A recent PR added a bigger regression test for go-spew,
and fixed an issue where we would obfuscate local named types
even if they were embedded into local structs used for reflection.
This would effectively mean we were obfuscating one field name,
the one derived from the embedding, which we didn't want to.

The fix did this by searching for embedded objects with extra code.
However, as far as I can tell, that isn't necessary;
we can do the right thing by recording all local type names
just like we already do for all field names.

This results in less complicated code, and avoids needing special logic
to handle embedding struct types, so I reckon it's a win.

Add even more tests to convince myself that we're still obfuscating
local types and field names which aren't used for reflection.
2 years ago
xuannv 404b2ce128
ignore embedded fields used in reflection (#768)
Fixes #765.
2 years ago
Daniel Martí c26734c668 simplify our handling of "go list" errors
First, teach scripts/gen-go-std-tables.sh to omit test packages,
since runtime/metrics_test would always result in an error.
Instead, make transformLinkname explicitly skip that package,
leaving a comment about a potential improvement if needed.

Second, the only remaining "not found" error we had was "maps" on 1.20,
so rewrite that check based on ImportPath and GoVersionSemver.

Third, detect packages with the "exclude all Go files" error
by looking at CompiledGoFiles and IgnoredGoFiles, which is less brittle.
This means that we are no longer doing any filtering on pkg.Error.Err,
which means we are less likely to break with Go error message changes.

Fourth, the check on pkg.Incomplete is now obsolete given the above,
meaning that the CompiledGoFiles length check is plenty.

Finally, stop trying to be clever about how we print errors.
Now that we're no longer skipping packages based on pkg.Error values,
printing pkg.DepsErrors was causing duplicate messages in the output.
Simply print pkg.Error values with only minimal tweaks:
including the position if there is any, and avoiding double newlines.

Overall, this makes our logic a lot less complicated,
and garble still works the way we want it to.
2 years ago
Daniel Martí 6dd5c53a91 internal/linker: place files under GARBLE_CACHE
This means we now have a unified cache directory for garble,
which is now documented in the README.

I considered using the same hash-based cache used for pkgCache,
but decided against it since that cache implementation only stores
regular files without any executable bits set.
We could force the cache package to do what we want here,
but I'm leaning against it for now given that single files work OK.
2 years ago
Daniel Martí 79376a15f9 support computing missing pkgCache entries
Some users had been running into "cannot load cache entry" errors,
which could happen if garble's cache files in GOCACHE were removed
when Go's own cache files were not.

Now that we've moved to our own separate cache directory,
and that we've refactored the codebase to depend less on globals
and no longer assume that we're loading info for the current package,
we can now compute a pkgCache entry for a dependency if needed.

We add a pkgCache.CopyFrom method to be able to append map entries
from one pkgCache to another without needing an encoding/gob roundtrip.

We also add a parseFiles helper, since we now have three bits of code
which need to parse a list of Go files from disk.

Fixes #708.
2 years ago
Daniel Martí da5ddfa45d avoid go:linkname warnings when building on tip
Packages like os and sync have started using go:linknames pointing to
packages outside their dependency tree, much like runtime already did.
This started causing warnings to be printed while obfuscsating std:

    > exec garble build -o=out_rebuild ./stdimporter
    [stderr]
    # sync
    //go:linkname refers to syscall.hasWaitingReaders - add `import _ "syscall"` for garble to find the package
    # os
    //go:linkname refers to net.newUnixFile - add `import _ "net"` for garble to find the package
    > bincmp out_rebuild out
    PASS

Relax the restriction in listPackage so that any package in std
is now allowed to list packages in runtimeLinknamed,
which makes the warnings and any potential problems go away.
Also make these std test cases check that no warnings are printed,
since I only happened to notice this problem by chance.
2 years ago
Daniel Martí 7d1bd13778 replace our caching inside GOCACHE with GARBLE_CACHE
For each Go package we obfuscate, we need to store information about
how we obfuscated it, which is needed when obfuscating its dependents.
For example, if A depends on B to use the type B.Foo, A needs to know
whether or not B.Foo was obfuscated; it depends on B's use of reflect.

We record this information in a gob file, which is cached on disk.
To avoid rolling our own custom cache, and since garble is so closely
connected with cmd/go already, we piggybacked off of Go's GOCACHE.
In particular, for each build cache entry per `go list`'s Export field,
we would store a "garble" sibling file with that gob content.

However, this was brittle for two reasons:

1) We were doing this without cmd/go's permission or knowledge.
   We were careful to use filename suffixes similar to Export files,
   meaning that `go clean` and other commands would treat them the same.
   However, this could confuse cmd/go at any point in the future.

2) cmd/go trims cache entries in GOCACHE regularly, to keep the size of
   the build and test caches under control. Right now, this means that
   every 24h, any file not accessed in the last five days is deleted.
   However, that trimming heuristic is done per-file.
   If the trimming removed Garble's sibling file but not the original
   Export file, this could cause errors such as
   "cannot load garble export file" which users already ran into.

Instead, start using github.com/rogpeppe/go-internal/cache,
an exported copy of cmd/go's own cache implementation for GOCACHE.
Since we need an entirely separate directory, we introduce GARBLE_CACHE,
defaulting to the "garble" directory inside the user's cache directory.
For example, on Linux this would be ~/.cache/garble.

Inside GARBLE_CACHE, our gob file cache will be under "build",
which helps clarify that this cache is used when obfuscating Go builds,
and allows placing other kinds of caches inside GARBLE_CACHE.
For example, we already have a need for storing linker binaries,
which for now still use their own caching mechanism.

This commit does not make our cache properly resistant to removed files.
The proof is that our seed.txtar testscript still fails the second case.
However, we do rewrite all of our caching logic away from Export files,
which in itself is a considerable refactor, and we add a few TODOs.

One notable change is how we load gob files from dependencies
when building the cache entry for the current package.
We used to load the gob files from all packages in the Deps field.
However, that is the list of all _transitive_ dependencies.
Since these gob files are already flat, meaning they contain information
about all of their transitive dependencies as well, we need only load
the gob files from the direct dependencies, the Imports field.

Performance is largely unchanged, since the behavior is similar.
However, the change from Deps to Imports saves us some work,
which can be seen in the reduced mallocs per obfuscated build.

It's unclear why the binary size isn't stable.
When reverting the Deps to Imports change, it then settles at 5.386Mi,
which is almost exactly in between the two measurements below.
I'm not sure why, but that metric appears to be slightly unstable.

    goos: linux
    goarch: amd64
    pkg: mvdan.cc/garble
    cpu: AMD Ryzen 7 PRO 5850U with Radeon Graphics
            │    old     │             new              │
            │   sec/op   │   sec/op    vs base          │
    Build-8   11.09 ± 1%   11.08 ± 1%  ~ (p=0.796 n=10)

            │     old      │                 new                 │
            │    bin-B     │    bin-B      vs base               │
    Build-8   5.390Mi ± 0%   5.382Mi ± 0%  -0.14% (p=0.000 n=10)

            │      old      │               new               │
            │ cached-sec/op │ cached-sec/op  vs base          │
    Build-8     415.5m ± 4%     421.6m ± 1%  ~ (p=0.190 n=10)

            │     old     │                new                 │
            │ mallocs/op  │ mallocs/op   vs base               │
    Build-8   35.43M ± 0%   34.05M ± 0%  -3.89% (p=0.000 n=10)

            │    old     │             new              │
            │ sys-sec/op │ sys-sec/op  vs base          │
    Build-8   5.662 ± 1%   5.701 ± 2%  ~ (p=0.280 n=10)
2 years ago
Daniel Martí b4fa94e45b use go:build in script/imports.txtar
This form has been preferred over +build since Go 1.17.
2 years ago
Daniel Martí bdcb80ee63 adapt to tip's error message change from "GOROOT" to "std" 2 years ago
Daniel Martí 744e9a375a suggest a command when asking the user to rebuild garble
See a user's apparent confusion in #738.
2 years ago
lu4p 1526ce7fd2
rework reflection detection with ssa (#732)
This is significantly more robust, than the ast based detection and can
record very complex cases of indirect parameter reflection.

Fixes #554
2 years ago
Daniel Martí b0ff2fb133 fix encoding/asn1 marshaling of pkix types
See the added comment and test, which failed before the fix when
the encoded certificate was decoded.

It took a while to find the culprit in asn1, but in hindsight it's easy:

	case reflect.Slice:
		if t.Elem().Kind() == reflect.Uint8 {
			return false, TagOctetString, false, true
		}
		if strings.HasSuffix(t.Name(), "SET") {
			return false, TagSet, true, true
		}
		return false, TagSequence, true, true

Fixes #711.
2 years ago