garble

Commit Graph

Author	SHA1	Message	Date
Daniel Martí	c012f08c66	store name pairs for _realName as a slice of pairs Iterating over a map is much more expensive than iterating over a slice, given how it needs to work out which keys are present in each bucket and then randomize the order in which to navigate the keys. None of this work needs to happen when iterating over a slice. A map would be nice if we were to actually do map lookups, but we don't. │ old │ new │ │ sec/op │ sec/op vs base │ AbiRealName-8 707.1µ ± 1% 196.7µ ± 1% -72.17% (p=0.001 n=7) │ old │ new │ │ B/s │ B/s vs base │ AbiRealName-8 517.6Ki ± 2% 1816.4Ki ± 1% +250.94% (p=0.001 n=7) │ old │ new │ │ B/op │ B/op vs base │ AbiRealName-8 5.362Ki ± 0% 5.359Ki ± 0% -0.05% (p=0.001 n=7) │ old │ new │ │ allocs/op │ allocs/op vs base │ AbiRealName-8 19.00 ± 0% 19.00 ± 0% ~ (p=1.000 n=7) ¹	4 months ago
Daniel Martí	210b19ac59	add benchmark for the injected _realName abi code │ new │ │ sec/op │ AbiRealName-8 1.026m ± 6% │ new │ │ B/s │ AbiRealName-8 351.6Ki ± 6% │ new │ │ B/op │ AbiRealName-8 5.363Ki ± 0% │ new │ │ allocs/op │ AbiRealName-8 19.00 ± 0%	4 months ago
Daniel Martí	ec5b6df439	support collecting deep cpu and memory profiles from benchmarks This allows me to collect a full CPU profile, showing us that we clearly spend too much CPU time in garbage collection. When collecting a full memory profile, we can see where the allocations come from now: Showing nodes accounting for 5636770, 31.07% of 18141579 total Dropped 630 nodes (cum <= 90707) Showing top 10 nodes out of 278 flat flat% sum% cum cum% 1692229 9.33% 9.33% 1910679 10.53% encoding/gob.decStringSlice 1005596 5.54% 14.87% 1005596 5.54% golang.org/x/tools/go/ssa.buildReferrers 458753 2.53% 17.40% 458753 2.53% go/scanner.(*Scanner).scanIdentifier 458752 2.53% 19.93% 458752 2.53% reflect.(*structType).Field 425984 2.35% 22.28% 448136 2.47% go/parser.(*parser).parseIdent 390049 2.15% 24.43% 390049 2.15% golang.org/x/tools/go/ssa.(*BasicBlock).emit 327683 1.81% 26.23% 371373 2.05% golang.org/x/tools/go/ssa.NewConst 311296 1.72% 27.95% 1024551 5.65% mvdan.cc/garble.(*transformer).transformGoFile.func1 287891 1.59% 29.54% 287891 1.59% encoding/gob.decString 278537 1.54% 31.07% 657043 3.62% golang.org/x/tools/go/ssa.(*builder).compLit This method can work for any invocation of garble, but for now we only directly wire it up for `go test -bench`. It can still be used for regular invocations of `garble build`.	4 months ago
Daniel Martí	20a92460d5	all: use cmd.Environ rather than os.Environ Added in Go 1.19, this keeps os/exec's default environment logic, such as ensuring that $PWD is always set.	1 year ago
Daniel Martí	d2beda1f00	switch frankban/quicktest for go-quicktest/qt The latter is newer and uses generics.	1 year ago
Daniel Martí	69bc62c56c	start using some Go 1.22 features We no longer need to worry about the scope of range variables, we can iterate over integers directly, and we can use cmp.Or too. I haven't paid close attention to using these everywhere. This is mainly testing out the new features where I saw some benefit.	1 year ago
Daniel Martí	7d1bd13778	replace our caching inside GOCACHE with GARBLE_CACHE For each Go package we obfuscate, we need to store information about how we obfuscated it, which is needed when obfuscating its dependents. For example, if A depends on B to use the type B.Foo, A needs to know whether or not B.Foo was obfuscated; it depends on B's use of reflect. We record this information in a gob file, which is cached on disk. To avoid rolling our own custom cache, and since garble is so closely connected with cmd/go already, we piggybacked off of Go's GOCACHE. In particular, for each build cache entry per `go list`'s Export field, we would store a "garble" sibling file with that gob content. However, this was brittle for two reasons: 1) We were doing this without cmd/go's permission or knowledge. We were careful to use filename suffixes similar to Export files, meaning that `go clean` and other commands would treat them the same. However, this could confuse cmd/go at any point in the future. 2) cmd/go trims cache entries in GOCACHE regularly, to keep the size of the build and test caches under control. Right now, this means that every 24h, any file not accessed in the last five days is deleted. However, that trimming heuristic is done per-file. If the trimming removed Garble's sibling file but not the original Export file, this could cause errors such as "cannot load garble export file" which users already ran into. Instead, start using github.com/rogpeppe/go-internal/cache, an exported copy of cmd/go's own cache implementation for GOCACHE. Since we need an entirely separate directory, we introduce GARBLE_CACHE, defaulting to the "garble" directory inside the user's cache directory. For example, on Linux this would be ~/.cache/garble. Inside GARBLE_CACHE, our gob file cache will be under "build", which helps clarify that this cache is used when obfuscating Go builds, and allows placing other kinds of caches inside GARBLE_CACHE. For example, we already have a need for storing linker binaries, which for now still use their own caching mechanism. This commit does not make our cache properly resistant to removed files. The proof is that our seed.txtar testscript still fails the second case. However, we do rewrite all of our caching logic away from Export files, which in itself is a considerable refactor, and we add a few TODOs. One notable change is how we load gob files from dependencies when building the cache entry for the current package. We used to load the gob files from all packages in the Deps field. However, that is the list of all _transitive_ dependencies. Since these gob files are already flat, meaning they contain information about all of their transitive dependencies as well, we need only load the gob files from the direct dependencies, the Imports field. Performance is largely unchanged, since the behavior is similar. However, the change from Deps to Imports saves us some work, which can be seen in the reduced mallocs per obfuscated build. It's unclear why the binary size isn't stable. When reverting the Deps to Imports change, it then settles at 5.386Mi, which is almost exactly in between the two measurements below. I'm not sure why, but that metric appears to be slightly unstable. goos: linux goarch: amd64 pkg: mvdan.cc/garble cpu: AMD Ryzen 7 PRO 5850U with Radeon Graphics │ old │ new │ │ sec/op │ sec/op vs base │ Build-8 11.09 ± 1% 11.08 ± 1% ~ (p=0.796 n=10) │ old │ new │ │ bin-B │ bin-B vs base │ Build-8 5.390Mi ± 0% 5.382Mi ± 0% -0.14% (p=0.000 n=10) │ old │ new │ │ cached-sec/op │ cached-sec/op vs base │ Build-8 415.5m ± 4% 421.6m ± 1% ~ (p=0.190 n=10) │ old │ new │ │ mallocs/op │ mallocs/op vs base │ Build-8 35.43M ± 0% 34.05M ± 0% -3.89% (p=0.000 n=10) │ old │ new │ │ sys-sec/op │ sys-sec/op vs base │ Build-8 5.662 ± 1% 5.701 ± 2% ~ (p=0.280 n=10)	2 years ago
Daniel Martí	a186419d3d	avoid rebuilding garble in the main benchmark Similar to what testscript does, we can reuse the test binary by telling TestMain to run the main function rather than the Go tests. This saves a few hundred milliseconds out of each benchmark run.	2 years ago
Daniel Martí	d4e7abc28c	reuse calls to testing.B.TempDir in the build benchmark Multiple calls to TempDir give new unique temporary directories. We don't need that, as we already used subdirectories.	2 years ago
Daniel Martí	481e3a1f09	default to GOGARBLE=*, stop using GOPRIVATE We can drop the code that kicked in when GOGARBLE was empty. We can also add the value in addGarbleToHash unconditionally, as we never allow it to be empty. In the tests, remove all GOGARBLE lines where it just meant "obfuscate everything" or "obfuscate the entire main module". cgo.txtar had "obfuscate everything" as a separate step, so remove it entirely. linkname.txtar started failing because the imported package did not import strings, so listPackage errored out. This wasn't a problem when strings itself wasn't obfuscated, as transformLinkname silently left strings.IndexByte untouched. It is a problem when IndexByte does get obfuscated. Make that kind of listPackage error visible, and fix it. reflect.txtar started failing with "unreachable method" runtime throws. It's not clear to me why; it appears that GOGARBLE=* makes the linker think that ExportedMethodName is suddenly unreachable. Work around the problem by making the method explicitly reachable, and leave a TODO as a reminder to investigate. Finally, gogarble.txtar no longer needs to test for GOPRIVATE. The rest of the test is left the same, as we still want the various values for GOGARBLE to continue to work just like before. Fixes #594.	2 years ago
Daniel Martí	91d4a8b6af	start reporting total allocs by garble in the benchmark This is like allocs/op by testing.B.ReportAllocs, but it combines all allocations by garble sub-processes. As we currently generate quite a bit of garbage, and reductions in it may only reduce time/op very slowly, this new metric will help us visualize small improvements. The regular ReportAllocs would not help us at all, as the main process simply executes "garble build". We remove user-ns/op to make space for mallocs/op, and also since it's a bit redundant given sys-ns/op and time-ns/op. name time/op Build-16 9.20s ± 1% name bin-B Build-16 5.16M ± 0% name cached-time/op Build-16 304ms ± 4% name mallocs/op Build-16 30.7M ± 0% name sys-time/op Build-16 4.78s ± 4%	3 years ago
Daniel Martí	d49c2446ee	apply benchmark suggestions by lu4p I had these ready for the PR, but forgot to push before merging.	3 years ago
Daniel Martí	f497821174	redesign benchmark to be more useful and realistic First, join the two benchmarks into one. The previous "cached" benchmark was borderline pointless, as it built the same package with the existing output binary, so it would quickly realise it had nothing to do and take ~100ms. The previous "noncached" benchmark input had no dependencies, so it was only really benchmarking the non-obfuscation of the runtime. All in all, neither benchmark measured obfuscating multiple packages. The new benchmark reuses the "cached" input, but with GOCACHE="*", meaning that we now obfuscate dozens of standard library packages. Each iteration first does a build from scratch, the worst case scenario, and then does an incremental rebuild of just the main package, which is the closest to a best case scenario without being a no-op. Since each iteration now performs both kinds of builds, we include a new "cached-time" metric to report what portion of the "time" metric corresponds to the incremental build. Thus, we can see a clean build takes ~11s, and a cached takes ~0.3s: name time/op Build-16 11.6s ± 1% name bin-B Build-16 5.34M ± 0% name cached-time/op Build-16 326ms ± 5% name sys-time/op Build-16 184ms ±13% name user-time/op Build-16 611ms ± 5% The benchmark is also no longer parallel; see the docs. Note that the old benchmark also reported bin-B incorrectly, as it looked at the binary size of garble itself, not the input program.	3 years ago
Daniel Martí	091f8239c0	rework the build benchmarks First, stop writing binaries into the current directory, which pollutes the git clone. Second, split the benchmark into two. The old benchmark always used the build cache after the first iteration, meaning that we weren't really measuring the cost of cold fresh builds. The new benchmarks show a build with an always-warm cache, and one without any cache. Note that NoCache with the main package importing "fmt" took about 4s wall time, which makes benchmarking too slow. For that reason, the new bench-nocache program has no std dependencies other than runtime, which already pulls in half a dozen dependencies we recompile at every iteration. This reduces the wall time to 2s, which is bearable. On the other hand, Cache is already fast, so we add a second and slightly heavier dependency, net/http. The build still takes under 300ms of wall time. This also helps the Cache benchmark imitate larger rebuilds with a warm cache. Longer term, both benchmarks will be useful, because we want both scenarios to be as efficient as possible. name time/op Build/Cache-8 161ms ± 1% Build/NoCache-8 1.21s ± 1% name bin-B Build/Cache-8 6.35M ± 0% Build/NoCache-8 6.35M ± 0% name sys-time/op Build/Cache-8 218ms ± 7% Build/NoCache-8 522ms ± 4% name user-time/op Build/Cache-8 825ms ± 1% Build/NoCache-8 8.17s ± 1%	4 years ago
Daniel Martí	5d74ab07f5	all: replace uses of the deprecated ioutil Now that we require Go 1.16, we can simplify code by removing ioutil.	4 years ago
Daniel Martí	79c775e218	obfuscate unexported names like exported ones (#227) In `90fa325da7`, the obfuscation logic was changed to use hashes for exported names, but incremental names starting at just one letter for unexported names. Presumably, this was done for the sake of binary size. I argue that this is not a good idea for the default mode for a number of reasons: 1) It makes reversing of stack traces nearly impossible for unexported names, since replacing an obfuscated name "c" with "originalName" would trigger too many false positives by matching single characters. 2) Exported and unexported names aren't different. We need to know how names were obfuscated at a later time in both cases, thanks to use cases like -ldflags=-X. Using short names for one but not the other doesn't make a lot of sense, and makes the logic inconsistent. 3) Shaving off three bytes for unexported names doesn't seem like a huge deal for the default mode, when we already have -tiny to optimize for size. This saves us a bit of work, but most importantly, simplifies the obfuscation state as we no longer need to carry privateNameMap between the compile and link stages. name old time/op new time/op delta Build-8 153ms ± 2% 150ms ± 2% ~ (p=0.065 n=6+6) name old bin-B new bin-B delta Build-8 7.09M ± 0% 7.08M ± 0% -0.24% (p=0.002 n=6+6) name old sys-time/op new sys-time/op delta Build-8 296ms ± 5% 277ms ± 6% -6.50% (p=0.026 n=6+6) name old user-time/op new user-time/op delta Build-8 562ms ± 1% 558ms ± 3% ~ (p=0.329 n=5+6) Note that I do not oppose using short names for both exported and unexported names in the future for -tiny, since reversing of stack traces will by design not work there. The code can be resurrected from the git history if we want to improve -tiny that way in the future, as we'd need to store state in header files again. Another major cleanup we can do here is to no longer use the garbledImports map. From a look at obfuscateImports, we hash a package's import path with its action ID, much like exported names, so we can simply re-do that hashing for the linker's -X flag. garbledImports does have some logic to handle duplicate package names, but it's worth noting that should not affect package paths, as they are always unique. That area of code could probably do with some simplification in the future, too. While at it, make hashWith panic if either parameter is empty. obfuscateImports was hashing the main package path without a salt due to a bug, so we want to catch those in the future. Finally, make some tiny spacing and typo tweaks to the README.	4 years ago
Daniel Martí	805c895d59	set up an AUTHORS file to attribute copyright Many files were missing copyright, so also add a short script to add the missing lines with the current year, and run it. The AUTHORS file is also self-explanatory. Contributors can add themselves there, or we can simply update it from time to time via git-shortlog. Since we have two scripts now, set up a directory for them.	5 years ago
pagran	5737cb7f8a	Add windows support for benchmark (#105) Benchmark on windows requires a .exe extension for garble	5 years ago
Daniel Martí	9c4b7d5a44	add the first benchmark and CONTRIBUTING doc	5 years ago

19 Commits (c012f08c66b3d906904c956477563f4d2117bdbd)
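
Commit c012f08c66 in the table above swaps a map for a slice of pairs because the names are only ever iterated, never looked up by key. The following is a minimal sketch of that idea, not garble's actual injected code: `namePair`, `realNameMap`, and `realNameSlice` are hypothetical names, and the assumption is that each pair maps an obfuscated name to its original and that every pair is applied to the input string in turn.

```go
package main

import (
	"fmt"
	"strings"
)

// namePair is a hypothetical stand-in for one obfuscated-to-original mapping.
type namePair struct {
	obfuscated, original string
}

// realNameMap applies every mapping by ranging over a map.
// Map iteration has to walk buckets and randomizes its starting point.
func realNameMap(s string, names map[string]string) string {
	for obfuscated, original := range names {
		s = strings.ReplaceAll(s, obfuscated, original)
	}
	return s
}

// realNameSlice applies every mapping by ranging over a slice,
// which is a plain sequential walk with a stable order.
func realNameSlice(s string, names []namePair) string {
	for _, p := range names {
		s = strings.ReplaceAll(s, p.obfuscated, p.original)
	}
	return s
}

func main() {
	pairs := []namePair{{"a1", "Foo"}, {"b2", "Bar"}}
	fmt.Println(realNameSlice("a1.b2", pairs)) // prints "Foo.Bar"
}
```

Both functions return the same result as long as no replacement text contains another obfuscated name. The slice version simply avoids the map's iteration overhead, which is the effect the benchmark numbers in that commit describe.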