garble/bench_test.go

// Copyright (c) 2020, The Garble Authors.
// See LICENSE for licensing information.

package main

import (
	"io/ioutil"
	"os"
	"os/exec"
	"path/filepath"
	"runtime"
	"sync/atomic"
	"testing"
)

// BenchmarkBuild is a parallel benchmark for 'garble build' on a fairly simple
// main package with a handful of standard library dependencies.
//
// We use a real garble binary and exec it, to simulate what the real user would
// run. The real obfuscation and compilation will happen in sub-processes
// anyway, so skipping one exec layer doesn't help us in any way.
//
// At the moment, each iteration takes 1-2s on a laptop, so we can't make the
// benchmark include any more features unless we make it significantly faster.
func BenchmarkBuild(b *testing.B) {
	tdir, err := ioutil.TempDir("", "garble-bench")
	if err != nil {
		b.Fatal(err)
	}
	defer os.RemoveAll(tdir)

	garbleBin := filepath.Join(tdir, "garble")
	if runtime.GOOS == "windows" {
		garbleBin += ".exe"
	}

	if err := exec.Command("go", "build", "-o="+garbleBin).Run(); err != nil {
		b.Fatalf("building garble: %v", err)
	}

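	// We collect extra metrics: the iteration count and the total user and
	// system CPU time of the garble processes. They are updated atomically,
	// since RunParallel runs the loop body in multiple goroutines.
	var n, userTime, systemTime int64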

	b.ResetTimer()
	b.RunParallel(func(pb *testing.PB) {
		for pb.Next() {
			cmd := exec.Command(garbleBin, "build", "./testdata/bench")
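			if out, err := cmd.CombinedOutput(); err != nil {
				// FailNow, and thus Fatalf, must be called from the
				// goroutine running the benchmark function, not from
				// the goroutines started by RunParallel. Report the
				// error and stop this goroutine instead.
				b.Errorf("%v: %s", err, out)
				return
			}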

			atomic.AddInt64(&n, 1)
			atomic.AddInt64(&userTime, int64(cmd.ProcessState.UserTime()))
			atomic.AddInt64(&systemTime, int64(cmd.ProcessState.SystemTime()))
		}
	})
	b.ReportMetric(float64(userTime)/float64(n), "user-ns/op")
	b.ReportMetric(float64(systemTime)/float64(n), "sys-ns/op")
	info, err := os.Stat(garbleBin)
	if err != nil {
		b.Fatal(err)
	}
	b.ReportMetric(float64(info.Size()), "bin-B")
}