623 Commits (126618a0d5971456c7bba5a6f4081d432d812083)
 

Author SHA1 Message Date
Daniel Martí 126618a0d5 drop support for Go 1.20
Go 1.21.0 was released in August 2023, so our upcoming release
will no longer support the Go 1.20 release series.

The first Go 1.22 release candidate is also due in December 2023,
less than a month from now, so dropping 1.20 will simplify 1.22 work.
8 months ago
Daniel Martí abcdc1fcbf re-generate go_std_tables.go with Go master
Two new packages linknamed with the runtime package,
one new intrinsic function, and one that is being removed in Go 1.22
but we want to keep around as long as we support Go 1.21.

Also note that, since math/rand/v2 simply does not exist until Go 1.22,
we need to adjust appendListedPackages to not fail on older versions.
8 months ago
Daniel Martí e712e720ce use x/tools version from go.mod in go:generate
Otherwise we have to update that `@semver` string
alongside the regular x/tools updates in go.mod.
There's no reason to separate the two versions either.
8 months ago
Daniel Martí c314fcb61c update deps
In particular, the x/tools update fixes support for Go 1.22,
due to https://go.dev/issue/62167 happening in August 2023.
8 months ago
pagran 5e80f12be7 implement flattening hardening
Without hardening, obfuscation is vulnerable to analysis via symbolic
execution because all keys are exposed, and it is easy to trace their
connections. Add an extensible (contribution-friendly) hardening
mechanism that makes it harder to determine the relationship between a key
and its execution block through key obfuscation.

There are 2 hardeners implemented and both are compatible with literal
obfuscation, which can make analysis even more difficult.
8 months ago
Daniel Martí 978fd6d518 appease Go 1.22's stricter base64 sanity checks
We were using an alphabet with a duplicate character on purpose.
Go 1.21 was perfectly fine with that, but 1.22 started noticing:

    panic: encoding alphabet includes duplicate symbols

I can't fault the new sanity check, because it makes sense in general.
What we are doing here is slightly bizarre, because we don't care
about decoding the name hashes at all.

Appease the sanity check by replacing dashes with duplicate characters
as a follow-up step. While here, use "a" rather than "z",
which is more common and less likely to be noticeable.
9 months ago
Daniel Martí 82834ace20 testdata: skip runtime rebuild test on darwin
CI ran into the failure again; reopened #609 for now.
9 months ago
Daniel Martí 716322cdf8 all: start suggesting Go 1.21 and testing on it
Also note that the first release is now 1.21.0,
so we no longer need to use the awkward 1.21.x notation in warnings.
9 months ago
Daniel Martí 344cdd5e7b make `go test -race` fast again on Go 1.21
On my laptop, `go test -short -race` on Go 1.20 used to take about 62s.
Jumping to Go 1.21, I was surprised to see an increase to 152s,
more than double - which was weird given how often the CPU was idle.

This manifested when updating our CI to start testing on Go 1.21.
Where Go 1.20 on Linux took about 10m to run `go test -race`,
Go 1.21 hit the 20m timeout every single time.

After a bit of googling, I was reminded of https://go.dev/issues/20364
as well as https://go.dev/doc/articles/race_detector#Options:

    atexit_sleep_ms (default 1000): Amount of milliseconds to sleep in the main goroutine before exiting.

This default is a bit aggressive for Go, but usually harmless,
having each test binary sleep for 1s after the package has been tested.

However, this 1s sleep after main runs is horrendous for garble's tests;
the testscripts run `garble build` many times, running the test binary.
It then runs `go build -toolexec=garble`, which runs the test binary
many more times: for every compiler, linker, etc invocation.

This means that our testscripts would include dozens of 1s sleeps,
in many cases blocking the continuation of the entire test.
This seemed to not be happening on earlier Go versions due to a bug;
Go 1.21's race mode started obeying this default properly.

The added change sets atexit_sleep_ms to something more reasonable
if GORACE isn't set at all; 10ms doesn't disable this check entirely,
but its overhead is orders of magnitude less noticeable than 1000ms.
`go test -short -race` on Go 1.21 drops back down to 68s for me.
9 months ago
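The GORACE default from commit 344cdd5e7b could look something like the sketch below. The `withFastRaceExit` helper is hypothetical, and where exactly garble applies this is not stated in the commit message; the only details taken from it are that `atexit_sleep_ms` is lowered to 10ms, and only when GORACE is not set at all.

```go
package main

import (
	"fmt"
	"os"
	"strings"
)

// withFastRaceExit is a hypothetical helper. It returns env unchanged if
// the user already set GORACE; otherwise it adds a GORACE entry which
// lowers atexit_sleep_ms from its default of 1000 down to 10.
func withFastRaceExit(env []string) []string {
	for _, kv := range env {
		if strings.HasPrefix(kv, "GORACE=") {
			return env // respect the user's own settings
		}
	}
	// Copy before appending so that the caller's slice is never modified.
	out := make([]string, 0, len(env)+1)
	out = append(out, env...)
	return append(out, "GORACE=atexit_sleep_ms=10")
}

func main() {
	// The resulting environment would be passed to child processes,
	// for example via exec.Cmd's Env field.
	for _, kv := range withFastRaceExit(os.Environ()) {
		if strings.HasPrefix(kv, "GORACE=") {
			fmt.Println(kv)
		}
	}
}
```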
Hritik Vijay 66bdc8b124 Use go install instead of garble install
The `garble install` command does not exist.
9 months ago
Daniel Martí 23c8641855 propagate "uses reflection" through SSA stores
Up until now, the new SSA reflection detection relied on call sites
to propagate which objects (named types, struct fields) used reflection.
For example, given the code:

    json.Marshal(new(T))

we would first record that json.Marshal calls reflect.TypeOf,
and then that the user's code called json.Marshal with the type *T.

However, this would not catch a slight variation on the above:

    var t T
    reflect.TypeOf(t)
    t.foo = struct{bar int}{}

Here, type T's fields such as "foo" and "bar" are not obfuscated,
since our logic sees the call site and marks the type T recursively.
However, the unnamed `struct{bar int}` type was still obfuscated,
causing errors such as:

    cannot use struct{uKGvcJvD24 int}{} (value of type struct{uKGvcJvD24 int}) as struct{bar int} value in assignment

The solution is to teach the analysis about *ssa.Store instructions
in a similar way to how it already knows about *ssa.Call instructions.
If we see a store where the destination type is marked for reflection,
then we mark the source type as well, fixing the bug above.

This fixes obfuscating github.com/gogo/protobuf/proto.
A number of other Go modules fail with similar errors,
and they look like very similar bugs,
but this particular fix doesn't apply to them.
Future incremental fixes will try to deal with those extra cases.

Fixes #685.
11 months ago
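For reference, the pattern from commit 23c8641855 can be written as a complete program. This is only a reproduction sketch built from the snippet in the commit message; the `fieldNames` helper is added for illustration. Both the named type T and the unnamed `struct{ bar int }` stored into it are visible through reflection, which is why the store's source type needs to be marked as well.

```go
package main

import (
	"fmt"
	"reflect"
)

// T is used with reflection below, so its fields must keep their names.
type T struct {
	foo struct{ bar int }
}

// fieldNames is a hypothetical helper which returns the names of T's
// first field and of that field's own first field, as seen by reflect.
func fieldNames() (outer, inner string) {
	var t T
	typ := reflect.TypeOf(t)

	// The store below uses an unnamed struct type as its source.
	// If only T were marked as "uses reflection", this literal's field
	// would be renamed and the assignment would stop type-checking.
	t.foo = struct{ bar int }{}

	return typ.Field(0).Name, typ.Field(0).Type.Field(0).Name
}

func main() {
	outer, inner := fieldNames()
	fmt.Println(outer, inner) // prints "foo bar" when built without obfuscation
}
```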
pagran afe1aad916 Removing obsolete TODO
The indirect import issue has already been fixed.
11 months ago
pagran 9612b29423 add generic function support for control flow obfuscation
12 months ago
pagran 260cad2a3f add "max" flag value and limits for control flow obfuscation parameters
1 year ago
pagran 8fd5f10d1d add control flow obfuscation docs
1 year ago
pagran 0e2e483472 add control flow obfuscation
Implemented control flow flattening with additional features such as block splitting and junk jumps.
1 year ago
Daniel Martí d89a55687c add a regression test for type names and reflect
We were recently altering the logic in reflect.go for type names,
which could have broken this kind of valid use of reflection.

Add a regression test, which I verified would break before my last
change to "simplify" the logic, which actually changed the logic,
as xuannv112 correctly pointed out.

After thinking about the change in behavior for a little while,
I realised that the new behavior is more correct, hence the test.
1 year ago
Daniel Martí 8f7248939c CHANGELOG: add entry for v0.10.1
No longer linking to issues manually, since GitHub does it for us
in the releases markdown rendering.
1 year ago
Daniel Martí 2a10dc7f41 minor tweaks in preparation for Go 1.21
Update CI to use a newer version of Go master,
now that we're already getting release candidates.

Look at the diffs between Go 1.20 and master of `go help build`
and `go help testflag`, and add two flags that were recently added.

While here, bump a hopeful TODO for a feature request,
since that one definitely did not happen for 1.21.
1 year ago
Daniel Martí d60957d514 add more reflect test cases and simplify logic
A recent PR added a bigger regression test for go-spew,
and fixed an issue where we would obfuscate local named types
even if they were embedded into local structs used for reflection.
This would effectively mean we were obfuscating one field name,
the one derived from the embedding, which we didn't want.

The fix did this by searching for embedded objects with extra code.
However, as far as I can tell, that isn't necessary;
we can do the right thing by recording all local type names
just like we already do for all field names.

This results in less complicated code, and avoids needing special logic
to handle embedding struct types, so I reckon it's a win.

Add even more tests to convince myself that we're still obfuscating
local types and field names which aren't used for reflection.
1 year ago
xuannv 404b2ce128 ignore embedded fields used in reflection (#768)
Fixes #765.
1 year ago
Emmanuel Chee-zaram Okeke 47296634f1 Include actual count of files with `CRLF` endings found
The script has been updated to include the actual count of files
with CRLF endings found.

The exit status of the script now accurately reflects the number of
files with incorrect line endings.
1 year ago
Daniel Martí d3763143bd ready changelog for release
1 year ago
Daniel Martí 3364c5c38f update deps prior to release
Keeping x/tools up to date is particularly important.
1 year ago
Daniel Martí 155bc3228e CHANGELOG: add entry for the imminent v0.10.0
Starting to explain new features in more detail.
A bullet list of single lines can be enough for most bug fixes,
but some of the big refactors like SSA or caching need some context.
1 year ago
Daniel Martí c26734c668 simplify our handling of "go list" errors
First, teach scripts/gen-go-std-tables.sh to omit test packages,
since runtime/metrics_test would always result in an error.
Instead, make transformLinkname explicitly skip that package,
leaving a comment about a potential improvement if needed.

Second, the only remaining "not found" error we had was "maps" on 1.20,
so rewrite that check based on ImportPath and GoVersionSemver.

Third, detect packages with the "exclude all Go files" error
by looking at CompiledGoFiles and IgnoredGoFiles, which is less brittle.
This means that we are no longer doing any filtering on pkg.Error.Err,
which means we are less likely to break with Go error message changes.

Fourth, the check on pkg.Incomplete is now obsolete given the above,
meaning that the CompiledGoFiles length check is plenty.

Finally, stop trying to be clever about how we print errors.
Now that we're no longer skipping packages based on pkg.Error values,
printing pkg.DepsErrors was causing duplicate messages in the output.
Simply print pkg.Error values with only minimal tweaks:
including the position if there is any, and avoiding double newlines.

Overall, this makes our logic a lot less complicated,
and garble still works the way we want it to.
1 year ago
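The third point in commit c26734c668 can be sketched as below. `ImportPath`, `CompiledGoFiles`, and `IgnoredGoFiles` are real fields in the output of `go list -json`, but the `listedPackage` struct, the `excludesAllGoFiles` helper, and the exact condition are assumptions for illustration; garble's real check may differ.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// listedPackage mirrors a small subset of the `go list -json` output.
type listedPackage struct {
	ImportPath      string
	CompiledGoFiles []string
	IgnoredGoFiles  []string
}

// excludesAllGoFiles is a hypothetical helper. It reports whether build
// constraints excluded every Go file, by looking at the file lists
// rather than matching on the text of an error message.
func excludesAllGoFiles(pkg listedPackage) bool {
	return len(pkg.CompiledGoFiles) == 0 && len(pkg.IgnoredGoFiles) > 0
}

func main() {
	// Made-up input resembling one package object from `go list -json`.
	const input = `{"ImportPath":"example.com/onlywindows","IgnoredGoFiles":["file_windows.go"]}`

	var pkg listedPackage
	if err := json.Unmarshal([]byte(input), &pkg); err != nil {
		panic(err)
	}
	fmt.Println(pkg.ImportPath, excludesAllGoFiles(pkg))
}
```

Checking structured fields like this is what makes the logic less likely to break when Go rewords its error messages.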
Daniel Martí 59222cb14b various minor TODO cleanups
computeLinkerVariableStrings had an unused argument.

Only skip obfuscating the name "FS" in the "embed" package.

The reflect methods no longer use the transformer receiver type,
so that TODO now feels unnecessary. Many methods need to be aware
of what the current types.Package is, and that seems reasonable.

We no longer use writeFileExclusive for our own cache on disk,
so the TODO about using locking or atomic writes is no longer relevant.
1 year ago
Daniel Martí 31d2d9263a README: suggest how to install master
This might be known by some Go users, but a few users were unaware,
so make an explicit mention of it.
1 year ago
Daniel Martí eb18969adb finally remove garbleActionID hack
Our cache is now robust against different Go package build inputs
which result in exactly the same build outputs.
In the past, this caused Go's cache to be filled but not ours.
In the present, our cache is just as resilient as Go's is.
1 year ago
Daniel Martí 6dd5c53a91 internal/linker: place files under GARBLE_CACHE
This means we now have a unified cache directory for garble,
which is now documented in the README.

I considered using the same hash-based cache used for pkgCache,
but decided against it since that cache implementation only stores
regular files without any executable bits set.
We could force the cache package to do what we want here,
but I'm leaning against it for now given that single files work OK.
1 year ago
Daniel Martí 79376a15f9 support computing missing pkgCache entries
Some users had been running into "cannot load cache entry" errors,
which could happen if garble's cache files in GOCACHE were removed
when Go's own cache files were not.

Now that we've moved to our own separate cache directory,
and that we've refactored the codebase to depend less on globals
and no longer assume that we're loading info for the current package,
we can now compute a pkgCache entry for a dependency if needed.

We add a pkgCache.CopyFrom method to be able to append map entries
from one pkgCache to another without needing an encoding/gob roundtrip.

We also add a parseFiles helper, since we now have three bits of code
which need to parse a list of Go files from disk.

Fixes #708.
1 year ago
Daniel Martí c5af68cd80 move curPkgCache out of the global scope
loadPkgCache also uses different pkgCache variables
depending on whether it finds a direct cache hit or not.

Now we only initialize an entirely new pkgCache
with the first two ReflectAPIs entries when we get a direct cache miss,
since a direct cache hit will already load those from the cache.
1 year ago
Daniel Martí 4f743b0861 move curPkg and origImporter out of the globals
It is true that each garble process only obfuscates up to one package,
which is why we made them globals to begin with.
However, garble does quite a lot more now,
such as reversing the obfuscation of many packages at once.
Having a global "current package" variable makes mistakes easier.

Some funcs, like those in transformFuncs, are now transformer methods.
1 year ago
Daniel Martí 5fddfe1e61 rename func and update docs on handleDirectives
We've been obfuscating all linknamed names for a while now,
so the part in the docs about "recording" is no longer true.

All it does is transform the directives to use obfuscated names.
Give it a better name and rewrite the docs.
1 year ago
Daniel Martí d5cbf2edca separate and rename prefillObjectMaps
The name and docs on that func were wildly out of date,
since it no longer has anything to do with reflection at all.

We only use the linkerVariableStrings map with -literals,
so we can avoid the call entirely if the flag isn't set.
1 year ago
Daniel Martí f85492a728 split typecheck and loadPkgCache from transformer
Neither of them has anything to do with transforming Go code;
they simply load or compute the information necessary for doing so.

Split typecheck into two functions as well.
The new typecheck function only does typechecking and nothing else.
A new computeFieldToStruct func fills the fieldToStruct map,
which depends on typecheck, but is not needed when computing pkgCache.

This isolation also forces us to separate the code that fills pkgCache
from the code that fills the in-memory-only maps in transformer,
removing the need for the NOTE that we had left around as a reminder.
1 year ago
Daniel Martí 4b0b2acf6f isolate reflect.go from updating globals directly
That is, stop reusing "transformer" as the receiver on methods,
and stop writing the results to the global curPkgCache struct.

Soon we will need to support computing pkgCache for any dependency,
not just the current package, to make the caching properly robust.
This allows us to fill reflectInspector with different values.

The explicit isolation also helps prevent bugs.
For instance, we were calling recursivelyRecordAsNotObfuscated from
transformCompile, which happens after we have loaded or saved pkgCache.
Meaning, the current package sees a larger pkgCache than its dependents.
In this particular case it wasn't causing any bugs,
since the two reflect types in question only had unexported fields,
but it's still good to treat pkgCache as read-only in transformCompile.
1 year ago
Daniel Martí d108f21846 apply TODO to rename "cannot obfuscate" APIs
They have been exclusively about reflect for over a year now.
Make that clearer, and update the docs as well.
1 year ago
Daniel Martí e079c0af43 declare a type for cachedOutput
To properly make our cache robust, we'll need to be able to compute
cache entries for dependencies as needed if they are missing.
So we'll need to create more of these struct values in the code.

Rename cachedOutput to curPkgCache, to clarify that it relates
to the current package.

While here, remove the "known" prefix on all pkgCache fields.
All of the names still make perfect sense without it.
1 year ago
Daniel Martí da5ddfa45d avoid go:linkname warnings when building on tip
Packages like os and sync have started using go:linknames pointing to
packages outside their dependency tree, much like runtime already did.
This started causing warnings to be printed while obfuscating std:

    > exec garble build -o=out_rebuild ./stdimporter
    [stderr]
    # sync
    //go:linkname refers to syscall.hasWaitingReaders - add `import _ "syscall"` for garble to find the package
    # os
    //go:linkname refers to net.newUnixFile - add `import _ "net"` for garble to find the package
    > bincmp out_rebuild out
    PASS

Relax the restriction in listPackage so that any package in std
is now allowed to list packages in runtimeLinknamed,
which makes the warnings and any potential problems go away.
Also make these std test cases check that no warnings are printed,
since I only happened to notice this problem by chance.
1 year ago
Daniel Martí 0f2b59d794 merge the two "known cannot obfuscate" maps
Per the TODOs that I left myself in the last commit.
As expected, this change allows tidying up the code a bit,
makes our use of caching a bit more consistent,
and also allows us to load the current package from the cache.
1 year ago
Daniel Martí 7d1bd13778 replace our caching inside GOCACHE with GARBLE_CACHE
For each Go package we obfuscate, we need to store information about
how we obfuscated it, which is needed when obfuscating its dependents.
For example, if A depends on B to use the type B.Foo, A needs to know
whether or not B.Foo was obfuscated; it depends on B's use of reflect.

We record this information in a gob file, which is cached on disk.
To avoid rolling our own custom cache, and since garble is so closely
connected with cmd/go already, we piggybacked off of Go's GOCACHE.
In particular, for each build cache entry per `go list`'s Export field,
we would store a "garble" sibling file with that gob content.

However, this was brittle for two reasons:

1) We were doing this without cmd/go's permission or knowledge.
   We were careful to use filename suffixes similar to Export files,
   meaning that `go clean` and other commands would treat them the same.
   However, this could confuse cmd/go at any point in the future.

2) cmd/go trims cache entries in GOCACHE regularly, to keep the size of
   the build and test caches under control. Right now, this means that
   every 24h, any file not accessed in the last five days is deleted.
   However, that trimming heuristic is done per-file.
   If the trimming removed Garble's sibling file but not the original
   Export file, this could cause errors such as
   "cannot load garble export file" which users already ran into.

Instead, start using github.com/rogpeppe/go-internal/cache,
an exported copy of cmd/go's own cache implementation for GOCACHE.
Since we need an entirely separate directory, we introduce GARBLE_CACHE,
defaulting to the "garble" directory inside the user's cache directory.
For example, on Linux this would be ~/.cache/garble.

Inside GARBLE_CACHE, our gob file cache will be under "build",
which helps clarify that this cache is used when obfuscating Go builds,
and allows placing other kinds of caches inside GARBLE_CACHE.
For example, we already have a need for storing linker binaries,
which for now still use their own caching mechanism.

This commit does not make our cache properly resistant to removed files.
The proof is that our seed.txtar testscript still fails the second case.
However, we do rewrite all of our caching logic away from Export files,
which in itself is a considerable refactor, and we add a few TODOs.

One notable change is how we load gob files from dependencies
when building the cache entry for the current package.
We used to load the gob files from all packages in the Deps field.
However, that is the list of all _transitive_ dependencies.
Since these gob files are already flat, meaning they contain information
about all of their transitive dependencies as well, we need only load
the gob files from the direct dependencies, the Imports field.

Performance is largely unchanged, since the behavior is similar.
However, the change from Deps to Imports saves us some work,
which can be seen in the reduced mallocs per obfuscated build.

It's unclear why the binary size isn't stable.
When reverting the Deps to Imports change, it then settles at 5.386Mi,
which is almost exactly in between the two measurements below,
so that metric appears to be slightly unstable.

    goos: linux
    goarch: amd64
    pkg: mvdan.cc/garble
    cpu: AMD Ryzen 7 PRO 5850U with Radeon Graphics
            │    old     │             new              │
            │   sec/op   │   sec/op    vs base          │
    Build-8   11.09 ± 1%   11.08 ± 1%  ~ (p=0.796 n=10)

            │     old      │                 new                 │
            │    bin-B     │    bin-B      vs base               │
    Build-8   5.390Mi ± 0%   5.382Mi ± 0%  -0.14% (p=0.000 n=10)

            │      old      │               new               │
            │ cached-sec/op │ cached-sec/op  vs base          │
    Build-8     415.5m ± 4%     421.6m ± 1%  ~ (p=0.190 n=10)

            │     old     │                new                 │
            │ mallocs/op  │ mallocs/op   vs base               │
    Build-8   35.43M ± 0%   34.05M ± 0%  -3.89% (p=0.000 n=10)

            │    old     │             new              │
            │ sys-sec/op │ sys-sec/op  vs base          │
    Build-8   5.662 ± 1%   5.701 ± 2%  ~ (p=0.280 n=10)
1 year ago
Daniel Martí cee53a7868 make GarbleActionID a full sha256 hash
This is in preparation for the switch to Go's cache package,
whose ActionID type is also a full sha256 hash with 32 bytes.
We were using "short" hashes as shown by `go tool buildid`,
since that was consistent and 15 bytes was generally enough.
1 year ago
Daniel Martí 7872177381 CI: try macos-latest again
And bump go-internal to its latest version, to include its fix
for those pesky "signal: killed" failures on macos.

While here, run the tests with -short on GOARCH=386,
and make our use of actions/setup-go a bit more consistent.
1 year ago
Daniel Martí b4fa94e45b use go:build in script/imports.txtar
This form has been preferred over +build since Go 1.17.
1 year ago
Daniel Martí 414e3b7f70 tidy our build ID hash code a bit
First, rename "component" to "hash", since it's shorter and more useful.
A full build ID is two or four hashes joined with slashes.

Second, add sanity checks that buildIDHashLength is being followed.
Otherwise the use of []byte could lead to human error.

Third, move all the hash encoding and decoding logic together.
1 year ago
Daniel Martí 0c9a59127a rename cache global to sharedCache
Since we will start importing github.com/rogpeppe/go-internal/cache,
and I don't want to have to rename it or leave confusion around.
1 year ago
Daniel Martí bdcb80ee63 adapt to tip's error message change from "GOROOT" to "std"
1 year ago
Daniel Martí 9f50e1a8a5 tweak when we read and write cachedOutput files
We first called the typecheck method, which starts filling cachedOutput
with information from the current package, and later we would load the
gob files for all dependencies via loadCachedOutputs.

This was a bit confusing; instead, load the cached gob files first,
and then do all the operations which fill information for curPkg.

Similarly, we were waiting until the very end of transformCompile to
write curPkg's cachedOutput gob file to the disk cache.
We can write the file at an earlier point, before we have obfuscated and
re-printed all Go files for the current package.
We can also write the file before other work like processImportCfg.

None of these changes should affect garble's behavior,
but they will make the cache redesign for #708 easier.
1 year ago
Daniel Martí 744e9a375a suggest a command when asking the user to rebuild garble
See a user's apparent confusion in #738.
1 year ago