You cannot select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.
garble/scripts/gen-go-std-tables.sh

61 lines
1.7 KiB
Bash

#!/bin/bash
# We can rewrite this bash script in Go if a dependency on bash and coreutils
# is a problem during development.
avoid breaking intrinsics when obfuscating names We obfuscate import paths as well as their declared names. The compiler treats some packages and APIs in special ways, and the way it detects those is by looking at import paths and names. In the past, we have avoided obfuscating some names like embed.FS or reflect.Value.MethodByName for this reason. Otherwise, go:embed or the linker's deadcode elimination might be broken. This matching by path and name also happens with compiler intrinsics. Intrinsics allow the compiler to rewrite some standard library calls with small and efficient assembly, depending on the target GOARCH. For example, math/bits.TrailingZeros32 gets replaced with ssa.OpCtz32, which on amd64 may result in using the TZCNTL instruction. We never noticed that we were breaking many of these intrinsics. The intrinsics for funcs declared in the runtime and its dependencies still worked properly, as we do not obfuscate those packages yet. However, for other packages like math/bits and sync/atomic, the intrinsics were being entirely disabled due to obfuscated names. Skipping intrinsics is particularly bad for performance, and it also leads to slightly larger binaries: │ old │ new │ │ bin-B │ bin-B vs base │ Build-16 5.450Mi ± ∞ ¹ 5.333Mi ± ∞ ¹ -2.15% (p=0.029 n=4) Finally, the main reason we noticed that intrinsics were broken is that apparently GOARCH=mips fails to link without them, as some symbols end up being not defined at all. This patch fixes builds for the MIPS family of architectures. Rather than building and linking all of std for every GOARCH, test that intrinsics work by asking the compiler to print which intrinsics are being applied, and checking that math/bits gets them. This fix is relatively unfortunate, as it means we stop obfuscating about 120 function names and a handful of package paths. However, fixing builds and intrinsics is much more important. We can figure out better ways to deal with intrinsics in the future. Fixes #646.
1 year ago
goroot=$(go env GOROOT)
go_version=$(go env GOVERSION) # not "go version", to exclude GOOS/GOARCH
runtime_and_deps=$(go list -deps runtime)
# All packages that the runtime linknames to, except runtime and its dependencies.
# This resulting list is what we need to "go list" when obfuscating the runtime,
# as they are the packages that we may be missing.
runtime_linknamed=$(comm -23 <(
sed -rn 's@//go:linkname .* ([^.]*)\.[^.]*@\1@p' "${goroot}"/src/runtime/*.go | grep -vE '^main|^runtime\.|_test$' | sort -u
) <(
# Note that we assume this is constant across platforms.
go list -deps runtime | sort -u
))
avoid breaking intrinsics when obfuscating names We obfuscate import paths as well as their declared names. The compiler treats some packages and APIs in special ways, and the way it detects those is by looking at import paths and names. In the past, we have avoided obfuscating some names like embed.FS or reflect.Value.MethodByName for this reason. Otherwise, go:embed or the linker's deadcode elimination might be broken. This matching by path and name also happens with compiler intrinsics. Intrinsics allow the compiler to rewrite some standard library calls with small and efficient assembly, depending on the target GOARCH. For example, math/bits.TrailingZeros32 gets replaced with ssa.OpCtz32, which on amd64 may result in using the TZCNTL instruction. We never noticed that we were breaking many of these intrinsics. The intrinsics for funcs declared in the runtime and its dependencies still worked properly, as we do not obfuscate those packages yet. However, for other packages like math/bits and sync/atomic, the intrinsics were being entirely disabled due to obfuscated names. Skipping intrinsics is particularly bad for performance, and it also leads to slightly larger binaries: │ old │ new │ │ bin-B │ bin-B vs base │ Build-16 5.450Mi ± ∞ ¹ 5.333Mi ± ∞ ¹ -2.15% (p=0.029 n=4) Finally, the main reason we noticed that intrinsics were broken is that apparently GOARCH=mips fails to link without them, as some symbols end up being not defined at all. This patch fixes builds for the MIPS family of architectures. Rather than building and linking all of std for every GOARCH, test that intrinsics work by asking the compiler to print which intrinsics are being applied, and checking that math/bits gets them. This fix is relatively unfortunate, as it means we stop obfuscating about 120 function names and a handful of package paths. However, fixing builds and intrinsics is much more important. We can figure out better ways to deal with intrinsics in the future. Fixes #646.
1 year ago
compiler_intrinsics_table="$(sed -rn 's@.*\b(addF|alias)\("([^"]*)", "([^"]*)",.*@\2 \3@p' "${goroot}"/src/cmd/compile/internal/ssagen/ssa.go | sort -u)"
compiler_intrinsics_paths="$(while read path name; do
echo ${path}
done <<<"${compiler_intrinsics_table}" | sort -u)"
gofmt >go_std_tables.go <<EOF
// Code generated by scripts/gen-go-std-tables.sh; DO NOT EDIT.
// Generated from Go version ${go_version}.
package main
var runtimeAndDeps = map[string]bool{
$(for path in ${runtime_and_deps}; do
echo "\"${path}\": true,"
done)
}
var runtimeLinknamed = []string{
$(for path in ${runtime_linknamed}; do
echo "\"${path}\"",
done)
}
avoid breaking intrinsics when obfuscating names We obfuscate import paths as well as their declared names. The compiler treats some packages and APIs in special ways, and the way it detects those is by looking at import paths and names. In the past, we have avoided obfuscating some names like embed.FS or reflect.Value.MethodByName for this reason. Otherwise, go:embed or the linker's deadcode elimination might be broken. This matching by path and name also happens with compiler intrinsics. Intrinsics allow the compiler to rewrite some standard library calls with small and efficient assembly, depending on the target GOARCH. For example, math/bits.TrailingZeros32 gets replaced with ssa.OpCtz32, which on amd64 may result in using the TZCNTL instruction. We never noticed that we were breaking many of these intrinsics. The intrinsics for funcs declared in the runtime and its dependencies still worked properly, as we do not obfuscate those packages yet. However, for other packages like math/bits and sync/atomic, the intrinsics were being entirely disabled due to obfuscated names. Skipping intrinsics is particularly bad for performance, and it also leads to slightly larger binaries: │ old │ new │ │ bin-B │ bin-B vs base │ Build-16 5.450Mi ± ∞ ¹ 5.333Mi ± ∞ ¹ -2.15% (p=0.029 n=4) Finally, the main reason we noticed that intrinsics were broken is that apparently GOARCH=mips fails to link without them, as some symbols end up being not defined at all. This patch fixes builds for the MIPS family of architectures. Rather than building and linking all of std for every GOARCH, test that intrinsics work by asking the compiler to print which intrinsics are being applied, and checking that math/bits gets them. This fix is relatively unfortunate, as it means we stop obfuscating about 120 function names and a handful of package paths. However, fixing builds and intrinsics is much more important. We can figure out better ways to deal with intrinsics in the future. Fixes #646.
1 year ago
var compilerIntrinsicsPkgs = map[string]bool{
$(for path in ${compiler_intrinsics_paths}; do
echo "\"${path}\": true,"
done)
}
var compilerIntrinsicsFuncs = map[string]bool{
$(while read path name; do
echo "\"${path}.${name}\": true,"
done <<<"${compiler_intrinsics_table}")
}
var reflectSkipPkg = map[string]bool{
"fmt": true,
}
EOF