garble/testdata/script/seed.txtar

# Note that in this test we use "! bincmp" on plaintext output files,
# as a workaround for "cmp" not supporting "! cmp".
# TODO: now that obfuscation with -seed is deterministic,
# can we just rely on the regular "cmp" with fixed output files?
# TODO: consider setting these seeds globally,
# so we can reuse them across tests and make better use of the shared build cache.
env SEED1=OQg9kACEECQ
env SEED2=NruiDmVz6/s
# Check the binary with a given base64-encoded seed.
exec garble -seed=${SEED1} build
exec ./main$exe
cmp stderr main.stderr
binsubstr main$exe 'teststring' 'imported var value'
! binsubstr main$exe 'ImportedVar' ${SEED1}
[short] stop # the extra checks are relatively expensive
exec ./main$exe test/main/imported
cp stderr importedpkg-seed-static-1
# Also check that the binary is reproducible.
# No packages should be rebuilt either, thanks to the build cache.
cp main$exe main_seed1$exe
rm main$exe
exec garble -seed=${SEED1}= build -v
! stderr .
bincmp main$exe main_seed1$exe
exec ./main$exe test/main/imported
cmp stderr importedpkg-seed-static-1
# Even if we use the same seed, the same names in a different package
# should still be obfuscated in a different way.
exec ./main$exe test/main
cp stderr mainpkg-seed-static-1
! bincmp mainpkg-seed-static-1 importedpkg-seed-static-1
# Using different flags which affect the build, such as -literals or -tiny,
# should result in the same obfuscation as long as the seed is constant.
# TODO: also test that changing non-garble build parameters,
# such as GOARCH or -tags, still results in the same hashing via the seed.
exec garble -seed=${SEED1} -literals build
exec ./main$exe test/main/imported
cmp stderr importedpkg-seed-static-1
exec garble -seed=${SEED1} -tiny build
exec ./main$exe test/main/imported
cmp stderr importedpkg-seed-static-1
# Also check that a different seed leads to a different binary.
# We can't know if caching happens here, because of previous test runs.
cp main$exe main_seed2$exe
rm main$exe
exec garble -seed=${SEED2} build
! bincmp main$exe main_seed2$exe
exec ./main$exe test/main/imported
cp stderr importedpkg-seed-static-2
! bincmp importedpkg-seed-static-2 importedpkg-seed-static-1
# Use a random seed, which should always trigger a full build.
exec garble -seed=random build -v
stderr -count=1 '^-seed chosen at random: .+'
stderr -count=1 '^runtime$'
stderr -count=1 '^test/main$'
exec ./main$exe
cmp stderr main.stderr
binsubstr main$exe 'teststring' 'imported var value'
! binsubstr main$exe 'ImportedVar'
exec ./main$exe test/main/imported
cp stderr importedpkg-seed-random-1
! bincmp importedpkg-seed-random-1 importedpkg-seed-static-1
# Also check that the random binary is not reproducible.
cp main$exe main_random$exe
rm main$exe
exec garble -seed=random build -v
stderr .
! bincmp main$exe main_random$exe
exec ./main$exe test/main/imported
cp stderr importedpkg-seed-random-2
! bincmp importedpkg-seed-random-2 importedpkg-seed-random-1
# Finally, ensure that our runtime and reflect test code does what we think.
go build
exec ./main$exe
cmp stderr main.stderr
exec ./main$exe test/main
cmp stderr mainpkg.stderr
exec ./main$exe test/main/imported
cmp stderr importedpkg.stderr
-- go.mod --
module test/main

go 1.21
-- main.go --
package main

import (
	"os"

	"test/main/imported"
)

var teststringVar = "teststring"

func main() { mainFunc() }

func mainFunc() {
	if len(os.Args) > 1 {
		switch os.Args[1] {
		case "test/main":
			imported.PrintNames(NamedTypeValue, NamedFunc)
		case "test/main/imported":
			imported.PrintNames(imported.NamedType{}, imported.NamedFunc)
		default:
			panic("unknown package")
		}
	} else {
		println(teststringVar)
		println(imported.ImportedVar)
		// When we're obfuscating, check that the obfuscated name lengths vary.
		// With sixteen hashed names, and a range of 7 (between 6 and 12),
		// the chances of fourteen repeats are incredibly minuscule.
		// If that happens, then our randomness is clearly broken.
		if hashedNames[0] != "main.hashed_00" {
			var count [16]int
			for _, name := range hashedNames {
				name = name[len("main."):]
				if len(name) < 6 {
					panic("ended up with a hashed name that's too short: " + name)
				}
				if len(name) > 12 {
					panic("ended up with a hashed name that's too long: " + name)
				}
				count[len(name)]++
				if count[len(name)] >= 14 {
					for _, name := range hashedNames {
						println(name)
					}
					panic("fourteen or more hashed names with the same length")
				}
			}
		}
	}
}
// A workaround to fool garble's reflect detection,
// because we want it to show us the obfuscated NamedType.
var NamedTypeValue any = NamedType{}
type NamedType struct {
	NamedField int
}

func NamedFunc() string {
	return imported.CallerFuncName()
}
var hashedNames = []string{
	hashed_00(), hashed_01(), hashed_02(), hashed_03(),
	hashed_04(), hashed_05(), hashed_06(), hashed_07(),
	hashed_08(), hashed_09(), hashed_10(), hashed_11(),
	hashed_12(), hashed_13(), hashed_14(), hashed_15(),
}
func hashed_00() string { return imported.CallerFuncName() }
func hashed_01() string { return imported.CallerFuncName() }
func hashed_02() string { return imported.CallerFuncName() }
func hashed_03() string { return imported.CallerFuncName() }
func hashed_04() string { return imported.CallerFuncName() }
func hashed_05() string { return imported.CallerFuncName() }
func hashed_06() string { return imported.CallerFuncName() }
func hashed_07() string { return imported.CallerFuncName() }
func hashed_08() string { return imported.CallerFuncName() }
func hashed_09() string { return imported.CallerFuncName() }
func hashed_10() string { return imported.CallerFuncName() }
func hashed_11() string { return imported.CallerFuncName() }
func hashed_12() string { return imported.CallerFuncName() }
func hashed_13() string { return imported.CallerFuncName() }
func hashed_14() string { return imported.CallerFuncName() }
func hashed_15() string { return imported.CallerFuncName() }
-- imported/imported.go --
package imported
import (
	"reflect"
	"runtime"
)

var ImportedVar = "imported var value"

type NamedType struct {
	NamedField int
}

func NamedFunc() string {
	return CallerFuncName()
}

func PrintNames(v any, fn func() string) {
	typ := reflect.TypeOf(v)
	println("path:", typ.PkgPath())
	println("type:", typ.Name())
	println("field:", typ.Field(0).Name)
	println("func: ", fn())
}

func CallerFuncName() string {
	pc, _, _, _ := runtime.Caller(1)
	fn := runtime.FuncForPC(pc)
	return fn.Name()
}
-- main.stderr --
teststring
imported var value
-- mainpkg.stderr --
path: main
type: NamedType
field: NamedField
func: main.NamedFunc
-- importedpkg.stderr --
path: test/main/imported
type: NamedType
field: NamedField
func: test/main/imported.NamedFunc