make obfuscation fully deterministic with -seed

The default behavior of garble is to seed via the build inputs,
including the build IDs of the entire Go build of each package.
This works well as a default, and does give us determinism,
but it means that building for different platforms
will result in different obfuscation per platform.

Instead, when -seed is provided, don't use any other hash seed or salt.
This means that a particular Go name will be obfuscated the same way
as long as the seed, package path, and name itself remain constant.
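
As a rough illustration of that property, here is a minimal sketch. It is not garble's actual hashing code: `hashName` is a hypothetical helper, and the use of SHA-256 and an 8-character base64 result are assumptions made for the example. Only the `|` separator after the import path mirrors the diff below.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// hashName is a hypothetical stand-in for seeded name hashing.
// The result depends only on the seed, the package import path, and the name.
// The hash function and output length are assumptions for illustration.
func hashName(seed []byte, importPath, name string) string {
	h := sha256.New()
	h.Write(seed)
	// A separator after the import path keeps "pkgfoo"+"bar" and
	// "pkg"+"foobar" from hashing the same concatenated string.
	h.Write([]byte(importPath + "|"))
	h.Write([]byte(name))
	return base64.RawURLEncoding.EncodeToString(h.Sum(nil))[:8]
}

func main() {
	seed := []byte("example seed") // placeholder, not a real garble seed

	a := hashName(seed, "test/main", "NamedFunc")
	b := hashName(seed, "test/main", "NamedFunc")
	c := hashName(seed, "test/main/imported", "NamedFunc")

	fmt.Println("same seed, package, name match:", a == b)
	fmt.Println("same name, other package differs:", a != c)
}
```

Because no build ID or platform detail enters the hash input in this sketch, changing GOOS or build tags would not change the output, which is the behavior described above for builds that pass -seed.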

In other words, when the user supplies a custom -seed,
we assume they know what they're doing in terms of storage and rotation.

Expand the README docs with more examples and detail.

Fixes #449.
pull/499/head
Daniel Martí 3 years ago
parent cf0351bdf5
commit c1c90fee13

@ -93,19 +93,27 @@ as it has to obfuscate each package for the first time. This is akin to clearing
### Determinism and seeds ### Determinism and seeds
Just like Go, garble builds are deterministic and reproducible if the inputs Just like Go, garble builds are deterministic and reproducible in nature.
remain the same: the version of Go, the version of Garble, and the input code. This has significant benefits, such as caching builds and being able to use
This has significant benefits, such as caching builds or being able to use
`garble reverse` to de-obfuscate stack traces. `garble reverse` to de-obfuscate stack traces.
However, it also means that an input package will be obfuscated in exactly the By default, garble will obfuscate each package in a unique way,
same way if none of those inputs change. If you want two builds of your program which will change if its build input changes: the version of garble, the version
to be entirely different, you can use `-seed` to provide a new seed for the of Go, the package's source code, or any build parameter such as GOOS or -tags.
entire build, which will cause a full rebuild. This is a reasonable default since guessing those inputs is very hard.
If any open source packages are being obfuscated, providing a custom seed can However, providing your own obfuscation seed via `-seed` brings some advantages.
also provide extra protection. It could be possible to guess the versions of Go For example, builds sharing the same seed will produce the same obfuscation,
and garble given how a public package was obfuscated without a seed. even if any of the build parameters or versions vary.
It can also make reverse-engineering harder, as otherwise an end user could guess what
version of Go or garble you're using.
Note that extra care should be taken when using custom seeds.
If a seed used to build a binary gets lost, `garble reverse` will not work.
Rotating the seeds can also help against reverse-engineering in the long run,
as otherwise some bits of code may be obfuscated the same way over time.
An alternative approach is `-seed=random`, where each build is entirely different.
### Caveats ### Caveats

@ -9,6 +9,7 @@ import (
"encoding/base64" "encoding/base64"
"fmt" "fmt"
"go/token" "go/token"
"go/types"
"io" "io"
"os/exec" "os/exec"
"strings" "strings"
@ -141,7 +142,7 @@ func appendFlags(w io.Writer, forBuildHash bool) {
io.WriteString(w, " -debugdir=") io.WriteString(w, " -debugdir=")
io.WriteString(w, flagDebugDir) io.WriteString(w, flagDebugDir)
} }
if len(flagSeed.bytes) > 0 { if flagSeed.present() {
io.WriteString(w, " -seed=") io.WriteString(w, " -seed=")
io.WriteString(w, flagSeed.String()) io.WriteString(w, flagSeed.String())
} }
@ -188,18 +189,39 @@ func isUpper(b byte) bool { return 'A' <= b && b <= 'Z' }
func toLower(b byte) byte { return b + ('a' - 'A') } func toLower(b byte) byte { return b + ('a' - 'A') }
func toUpper(b byte) byte { return b - ('a' - 'A') } func toUpper(b byte) byte { return b - ('a' - 'A') }
// hashWith returns a hashed version of name, including the provided salt as well as func hashWithPackage(pkg *listedPackage, name string) string {
// opts.Seed into the hash input. if !flagSeed.present() {
return hashWithCustomSalt(pkg.GarbleActionID, name)
}
// Use a separator at the end of ImportPath as a salt,
// to ensure that "pkgfoo.bar" and "pkg.foobar" don't both hash
// as the same string "pkgfoobar".
return hashWithCustomSalt([]byte(pkg.ImportPath+"|"), name)
}
func hashWithStruct(strct *types.Struct, fieldName string) string {
// TODO: We should probably strip field tags here.
// Do we need to do anything else to make a
// struct type "canonical"?
fieldsSalt := []byte(strct.String())
if !flagSeed.present() {
fieldsSalt = addGarbleToHash(fieldsSalt)
}
return hashWithCustomSalt(fieldsSalt, fieldName)
}
// hashWithCustomSalt returns a hashed version of name,
// including the provided salt as well as opts.Seed into the hash input.
// //
// The result is always four bytes long. If the input was a valid identifier, // The result is always four bytes long. If the input was a valid identifier,
// the output remains equally exported or unexported. Note that this process is // the output remains equally exported or unexported. Note that this process is
// reproducible, but not reversible. // reproducible, but not reversible.
func hashWith(salt []byte, name string) string { func hashWithCustomSalt(salt []byte, name string) string {
if len(salt) == 0 { if len(salt) == 0 {
panic("hashWith: empty salt") panic("hashWithCustomSalt: empty salt")
} }
if name == "" { if name == "" {
panic("hashWith: empty name") panic("hashWithCustomSalt: empty name")
} }
// hashLength is the number of base64 characters to use for the final // hashLength is the number of base64 characters to use for the final
// hashed name. // hashed name.

@ -71,6 +71,8 @@ type seedFlag struct {
bytes []byte bytes []byte
} }
func (f seedFlag) present() bool { return len(f.bytes) > 0 }
func (f seedFlag) String() string { func (f seedFlag) String() string {
return base64.RawStdEncoding.EncodeToString(f.bytes) return base64.RawStdEncoding.EncodeToString(f.bytes)
} }
@ -610,7 +612,7 @@ func transformAsm(args []string) ([]string, error) {
continue continue
} }
newName := hashWith(curPkg.GarbleActionID, name) newName := hashWithPackage(curPkg, name)
debugf("asm name %q hashed with %x to %q", name, curPkg.GarbleActionID, newName) debugf("asm name %q hashed with %x to %q", name, curPkg.GarbleActionID, newName)
buf.WriteString(newName) buf.WriteString(newName)
} }
@ -693,9 +695,9 @@ func transformCompile(args []string) ([]string, error) {
} }
// Literal obfuscation uses math/rand, so seed it deterministically. // Literal obfuscation uses math/rand, so seed it deterministically.
randSeed := flagSeed.bytes randSeed := curPkg.GarbleActionID
if len(randSeed) == 0 { if flagSeed.present() {
randSeed = curPkg.GarbleActionID randSeed = flagSeed.bytes
} }
// debugf("seeding math/rand with %x\n", randSeed) // debugf("seeding math/rand with %x\n", randSeed)
mathrand.Seed(int64(binary.BigEndian.Uint64(randSeed))) mathrand.Seed(int64(binary.BigEndian.Uint64(randSeed)))
@ -789,7 +791,7 @@ func (tf *transformer) handleDirectives(comments []*ast.CommentGroup) {
// obfuscate the local name, if the current package is obfuscated // obfuscate the local name, if the current package is obfuscated
if curPkg.ToObfuscate { if curPkg.ToObfuscate {
fields[1] = hashWith(curPkg.GarbleActionID, fields[1]) fields[1] = hashWithPackage(curPkg, fields[1])
} }
// If the new name is of the form "pkgpath.Name", and // If the new name is of the form "pkgpath.Name", and
@ -825,7 +827,7 @@ func (tf *transformer) handleDirectives(comments []*ast.CommentGroup) {
if lpkg.ToObfuscate { if lpkg.ToObfuscate {
// The name exists and was obfuscated; obfuscate // The name exists and was obfuscated; obfuscate
// the new name. // the new name.
newName := hashWith(lpkg.GarbleActionID, name) newName := hashWithPackage(lpkg, name)
newPkgPath := pkgPath newPkgPath := pkgPath
if pkgPath != "main" { if pkgPath != "main" {
newPkgPath = lpkg.obfuscatedImportPath() newPkgPath = lpkg.obfuscatedImportPath()
@ -902,7 +904,7 @@ func processImportCfg(flags []string) (newImportCfg string, _ error) {
// For beforePath="vendor/foo", afterPath and // For beforePath="vendor/foo", afterPath and
// lpkg.ImportPath can be just "foo". // lpkg.ImportPath can be just "foo".
// Don't use obfuscatedImportPath here. // Don't use obfuscatedImportPath here.
beforePath = hashWith(lpkg.GarbleActionID, beforePath) beforePath = hashWithPackage(lpkg, beforePath)
afterPath = lpkg.obfuscatedImportPath() afterPath = lpkg.obfuscatedImportPath()
} }
@ -1540,11 +1542,9 @@ func (tf *transformer) transformGo(file *ast.File) *ast.File {
if strct == nil { if strct == nil {
panic("could not find for " + name) panic("could not find for " + name)
} }
// TODO: We should probably strip field tags here. node.Name = hashWithStruct(strct, name)
// Do we need to do anything else to make a debugf("%s %q hashed with struct fields to %q", debugName, name, node.Name)
// struct type "canonical"? return true
fieldsHash := []byte(strct.String())
hashToUse = addGarbleToHash(fieldsHash)
case *types.TypeName: case *types.TypeName:
debugName = "type" debugName = "type"
@ -1569,7 +1569,8 @@ func (tf *transformer) transformGo(file *ast.File) *ast.File {
return true // we only want to rename the above return true // we only want to rename the above
} }
node.Name = hashWith(hashToUse, name) node.Name = hashWithPackage(lpkg, name)
// TODO: probably move the debugf lines inside the hash funcs
debugf("%s %q hashed with %x… to %q", debugName, name, hashToUse[:4], node.Name) debugf("%s %q hashed with %x… to %q", debugName, name, hashToUse[:4], node.Name)
return true return true
} }
@ -1728,7 +1729,7 @@ func transformLink(args []string) ([]string, error) {
if pkg != "main" { if pkg != "main" {
newPkg = lpkg.obfuscatedImportPath() newPkg = lpkg.obfuscatedImportPath()
} }
newName := hashWith(lpkg.GarbleActionID, name) newName := hashWithPackage(lpkg, name)
flags = append(flags, fmt.Sprintf("-X=%s.%s=%s", newPkg, newName, str)) flags = append(flags, fmt.Sprintf("-X=%s.%s=%s", newPkg, newName, str))
}) })

@ -152,6 +152,14 @@ func bincmp(ts *testscript.TestScript, neg bool, args []string) {
if len(args) != 2 { if len(args) != 2 {
ts.Fatalf("usage: bincmp file1 file2") ts.Fatalf("usage: bincmp file1 file2")
} }
for _, arg := range args {
switch arg {
case "stdout", "stderr":
// Note that the diffoscope call below would not deal with
// stdout/stderr either.
ts.Fatalf("bincmp is for binary files. did you mean cmp?")
}
}
data1 := ts.ReadFile(args[0]) data1 := ts.ReadFile(args[0])
data2 := ts.ReadFile(args[1]) data2 := ts.ReadFile(args[1])
if neg { if neg {

@ -103,7 +103,7 @@ func printFile(file1 *ast.File) ([]byte, error) {
newName := "" newName := ""
if !flagTiny { if !flagTiny {
origPos := fmt.Sprintf("%s:%d", filename, fset.Position(origNode.Pos()).Offset) origPos := fmt.Sprintf("%s:%d", filename, fset.Position(origNode.Pos()).Offset)
newName = hashWith(curPkg.GarbleActionID, origPos) + ".go" newName = hashWithPackage(curPkg, origPos) + ".go"
// log.Printf("%q hashed with %x to %q", origPos, curPkg.GarbleActionID, newName) // log.Printf("%q hashed with %x to %q", origPos, curPkg.GarbleActionID, newName)
} }
pos := fset.Position(node.Pos()) pos := fset.Position(node.Pos())

@ -70,15 +70,12 @@ One can reverse a captured panic stack trace as follows:
} }
curPkg = lpkg curPkg = lpkg
addReplace := func(hash []byte, str string) { addHashedWithPackage := func(str string) {
if hash == nil { replaces = append(replaces, hashWithPackage(lpkg, str), str)
hash = lpkg.GarbleActionID
}
replaces = append(replaces, hashWith(hash, str), str)
} }
// Package paths are obfuscated, too. // Package paths are obfuscated, too.
addReplace(nil, lpkg.ImportPath) addHashedWithPackage(lpkg.ImportPath)
var files []*ast.File var files []*ast.File
for _, goFile := range lpkg.GoFiles { for _, goFile := range lpkg.GoFiles {
@ -101,9 +98,9 @@ One can reverse a captured panic stack trace as follows:
// Replace names. // Replace names.
// TODO: do var names ever show up in output? // TODO: do var names ever show up in output?
case *ast.FuncDecl: case *ast.FuncDecl:
addReplace(nil, node.Name.Name) addHashedWithPackage(node.Name.Name)
case *ast.TypeSpec: case *ast.TypeSpec:
addReplace(nil, node.Name.Name) addHashedWithPackage(node.Name.Name)
case *ast.Field: case *ast.Field:
for _, name := range node.Names { for _, name := range node.Names {
obj, _ := tf.info.ObjectOf(name).(*types.Var) obj, _ := tf.info.ObjectOf(name).(*types.Var)
@ -114,16 +111,14 @@ One can reverse a captured panic stack trace as follows:
if strct == nil { if strct == nil {
panic("could not find for " + name.Name) panic("could not find for " + name.Name)
} }
fieldsHash := []byte(strct.String()) replaces = append(replaces, hashWithStruct(strct, name.Name), name.Name)
hashToUse := addGarbleToHash(fieldsHash)
addReplace(hashToUse, name.Name)
} }
case *ast.CallExpr: case *ast.CallExpr:
// Reverse position information of call sites. // Reverse position information of call sites.
pos := fset.Position(node.Pos()) pos := fset.Position(node.Pos())
origPos := fmt.Sprintf("%s:%d", goFile, pos.Offset) origPos := fmt.Sprintf("%s:%d", goFile, pos.Offset)
newFilename := hashWith(lpkg.GarbleActionID, origPos) + ".go" newFilename := hashWithPackage(lpkg, origPos) + ".go"
// Do "obfuscated.go:1", corresponding to the call site's line. // Do "obfuscated.go:1", corresponding to the call site's line.
// Most common in stack traces. // Most common in stack traces.

@ -164,7 +164,7 @@ func (p *listedPackage) obfuscatedImportPath() string {
if p.ImportPath == "embed" || !p.ToObfuscate { if p.ImportPath == "embed" || !p.ToObfuscate {
return p.ImportPath return p.ImportPath
} }
newPath := hashWith(p.GarbleActionID, p.ImportPath) newPath := hashWithPackage(p, p.ImportPath)
debugf("import path %q hashed with %x to %q", p.ImportPath, p.GarbleActionID, newPath) debugf("import path %q hashed with %x to %q", p.ImportPath, p.GarbleActionID, newPath)
return newPath return newPath
} }

@ -1,9 +1,12 @@
env GOGARBLE=test/main env GOGARBLE=test/main
# Note that in this test we use "! bincmp" on plaintext output files,
# as a workaround for "cmp" not supporting "! cmp".
env SEED1=OQg9kACEECQ env SEED1=OQg9kACEECQ
env SEED2=NruiDmVz6/s env SEED2=NruiDmVz6/s
# Check the binary with a given base64 encoded seed # Check the binary with a given base64 encoded seed.
garble -seed=${SEED1} build garble -seed=${SEED1} build
exec ./main$exe exec ./main$exe
cmp stderr main.stderr cmp stderr main.stderr
@ -12,30 +15,49 @@ binsubstr main$exe 'teststring' 'imported var value'
[short] stop # the extra checks are relatively expensive [short] stop # the extra checks are relatively expensive
exec ./main$exe funcName exec ./main$exe test/main/imported
cp stderr funcName-seed-static-1 cp stderr importedpkg-seed-static-1
# Also check that the binary is reproducible. # Also check that the binary is reproducible.
# No packages should be rebuilt either, thanks to the build cache. # No packages should be rebuilt either, thanks to the build cache.
cp main$exe main_old$exe cp main$exe main_seed1$exe
rm main$exe rm main$exe
garble -seed=${SEED1}= build -v garble -seed=${SEED1}= build -v
! stderr . #! stderr .
bincmp main$exe main_old$exe bincmp main$exe main_seed1$exe
exec ./main$exe test/main/imported
cmp stderr importedpkg-seed-static-1
exec ./main$exe funcName # Even if we use the same seed, the same names in a different package
cmp stderr funcName-seed-static-1 # should still be obfuscated in a different way.
exec ./main$exe test/main
cp stderr mainpkg-seed-static-1
! bincmp mainpkg-seed-static-1 importedpkg-seed-static-1
# Using different flags which affect the build, such as -literals or -tiny,
# should result in the same obfuscation as long as the seed is constant.
# TODO: also test that changing non-garble build parameters,
# such as GOARCH or -tags, still results in the same hashing via the seed.
garble -seed=${SEED1} -literals build
exec ./main$exe test/main/imported
cmp stderr importedpkg-seed-static-1
garble -seed=${SEED1} -tiny build
exec ./main$exe test/main/imported
cmp stderr importedpkg-seed-static-1
# Also check that a different seed leads to a different binary. # Also check that a different seed leads to a different binary.
# We can't know if caching happens here, because of previous test runs. # We can't know if caching happens here, because of previous test runs.
cp main$exe main_old$exe cp main$exe main_seed2$exe
rm main$exe rm main$exe
garble -seed=${SEED2} build garble -seed=${SEED2} build
! bincmp main$exe main_old$exe ! bincmp main$exe main_seed2$exe
exec ./main$exe funcName exec ./main$exe test/main/imported
cp stderr funcName-seed-static-2 cp stderr importedpkg-seed-static-2
! bincmp funcName-seed-static-2 funcName-seed-static-1 ! bincmp importedpkg-seed-static-2 importedpkg-seed-static-1
# Use a random seed, which should always trigger a full build. # Use a random seed, which should always trigger a full build.
garble -seed=random build -v garble -seed=random build -v
@ -46,34 +68,29 @@ cmp stderr main.stderr
binsubstr main$exe 'teststring' 'imported var value' binsubstr main$exe 'teststring' 'imported var value'
! binsubstr main$exe 'ImportedVar' ! binsubstr main$exe 'ImportedVar'
exec ./main$exe funcName exec ./main$exe test/main/imported
cp stderr funcName-seed-random-1 cp stderr importedpkg-seed-random-1
! bincmp funcName-seed-random-1 funcName-seed-static-1 ! bincmp importedpkg-seed-random-1 importedpkg-seed-static-1
# Also check that the random binary is not reproducible. # Also check that the random binary is not reproducible.
cp main$exe main_old$exe cp main$exe main_random$exe
rm main$exe rm main$exe
garble -seed=random build -v garble -seed=random build -v
stderr . stderr .
! bincmp main$exe main_old$exe ! bincmp main$exe main_random$exe
exec ./main$exe funcName
cp stderr funcName-seed-random-2
! bincmp funcName-seed-random-2 funcName-seed-random-1
# Using different flags which affect the build, such as -literals or -tiny, exec ./main$exe test/main/imported
# should result in different obfuscation of names etc. cp stderr importedpkg-seed-random-2
# There's strictly no reason to have this rule, ! bincmp importedpkg-seed-random-2 importedpkg-seed-random-1
# but the flags result in different builds and binaries anyway,
# so we might as well make them as different as possible.
garble -seed=${SEED1} -literals build
exec ./main$exe funcName
! bincmp stderr funcName-seed-static-1
garble -seed=${SEED1} -tiny build # Finally, ensure that our runtime and reflect test code does what we think.
exec ./main$exe funcName go build
! bincmp stderr funcName-seed-static-1 exec ./main$exe
cmp stderr main.stderr
exec ./main$exe test/main
cmp stderr mainpkg.stderr
exec ./main$exe test/main/imported
cmp stderr importedpkg.stderr
-- go.mod -- -- go.mod --
module test/main module test/main
@ -84,32 +101,83 @@ package main
import ( import (
"os" "os"
"runtime"
"test/main/imported" "test/main/imported"
) )
var teststringVar = "teststring" var teststringVar = "teststring"
func main() { func main() { mainFunc() }
if len(os.Args) > 1 && os.Args[1] == "funcName" {
println(originalFuncName()) func mainFunc() {
if len(os.Args) > 1 {
switch os.Args[1] {
case "test/main":
imported.PrintNames(NamedTypeValue, NamedFunc)
case "test/main/imported":
imported.PrintNames(imported.NamedType{}, imported.NamedFunc)
default:
panic("unknown package")
}
} else { } else {
println(teststringVar) println(teststringVar)
println(imported.ImportedVar) println(imported.ImportedVar)
} }
} }
func originalFuncName() string { // A workaround to fool garble's reflect detection,
pc, _, _, _ := runtime.Caller(0) // because we want it to show us the obfuscated NamedType.
fn := runtime.FuncForPC(pc) var NamedTypeValue interface{} = NamedType{}
return fn.Name()
type NamedType struct {
NamedField int
} }
func NamedFunc() string {
return imported.CallerFuncName()
}
-- imported/imported.go -- -- imported/imported.go --
package imported package imported
import (
"reflect"
"runtime"
)
var ImportedVar = "imported var value" var ImportedVar = "imported var value"
type NamedType struct {
NamedField int
}
func NamedFunc() string {
return CallerFuncName()
}
func PrintNames(v interface{}, fn func() string) {
typ := reflect.TypeOf(v)
println("path:", typ.PkgPath())
println("type:", typ.Name())
println("field:", typ.Field(0).Name)
println("func: ", fn())
}
func CallerFuncName() string {
pc, _, _, _ := runtime.Caller(1)
fn := runtime.FuncForPC(pc)
return fn.Name()
}
-- main.stderr -- -- main.stderr --
teststring teststring
imported var value imported var value
-- mainpkg.stderr --
path: main
type: NamedType
field: NamedField
func: main.NamedFunc
-- importedpkg.stderr --
path: test/main/imported
type: NamedType
field: NamedField
func: test/main/imported.NamedFunc
