|
|
|
#!/bin/bash
|
|
|
|
|
|
|
|
# We can rewrite this bash script in Go if a dependency on bash and coreutils
|
|
|
|
# is a problem during development.
|
|
|
|
|
avoid breaking intrinsics when obfuscating names
We obfuscate import paths as well as their declared names.
The compiler treats some packages and APIs in special ways,
and the way it detects those is by looking at import paths and names.
In the past, we have avoided obfuscating some names like embed.FS or
reflect.Value.MethodByName for this reason. Otherwise,
go:embed or the linker's deadcode elimination might be broken.
This matching by path and name also happens with compiler intrinsics.
Intrinsics allow the compiler to rewrite some standard library calls
with small and efficient assembly, depending on the target GOARCH.
For example, math/bits.TrailingZeros32 gets replaced with ssa.OpCtz32,
which on amd64 may result in using the TZCNTL instruction.
We never noticed that we were breaking many of these intrinsics.
The intrinsics for funcs declared in the runtime and its dependencies
still worked properly, as we do not obfuscate those packages yet.
However, for other packages like math/bits and sync/atomic,
the intrinsics were being entirely disabled due to obfuscated names.
Skipping intrinsics is particularly bad for performance,
and it also leads to slightly larger binaries:
│ old │ new │
│ bin-B │ bin-B vs base │
Build-16 5.450Mi ± ∞ ¹ 5.333Mi ± ∞ ¹ -2.15% (p=0.029 n=4)
Finally, the main reason we noticed that intrinsics were broken
is that apparently GOARCH=mips fails to link without them,
as some symbols end up being not defined at all.
This patch fixes builds for the MIPS family of architectures.
Rather than building and linking all of std for every GOARCH,
test that intrinsics work by asking the compiler to print which
intrinsics are being applied, and checking that math/bits gets them.
This fix is relatively unfortunate, as it means we stop obfuscating
about 120 function names and a handful of package paths.
However, fixing builds and intrinsics is much more important.
We can figure out better ways to deal with intrinsics in the future.
Fixes #646.
2 years ago
|
|
|
goroot=$(go env GOROOT)
|
|
|
|
go_version=$(go env GOVERSION) # not "go version", to exclude GOOS/GOARCH
|
|
|
|
|
|
|
|
runtime_and_deps=$(go list -deps runtime)
|
|
|
|
|
|
|
|
# All packages that the runtime linknames to, except runtime and its dependencies.
|
|
|
|
# This resulting list is what we need to "go list" when obfuscating the runtime,
|
|
|
|
# as they are the packages that we may be missing.
|
|
|
|
runtime_linknamed=$(comm -23 <(
|
simplify our handling of "go list" errors
First, teach scripts/gen-go-std-tables.sh to omit test packages,
since runtime/metrics_test would always result in an error.
Instead, make transformLinkname explicitly skip that package,
leaving a comment about a potential improvement if needed.
Second, the only remaining "not found" error we had was "maps" on 1.20,
so rewrite that check based on ImportPath and GoVersionSemver.
Third, detect packages with the "exclude all Go files" error
by looking at CompiledGoFiles and IgnoredGoFiles, which is less brittle.
This means that we are no longer doing any filtering on pkg.Error.Err,
which means we are less likely to break with Go error message changes.
Fourth, the check on pkg.Incomplete is now obsolete given the above,
meaning that the CompiledGoFiles length check is plenty.
Finally, stop trying to be clever about how we print errors.
Now that we're no longer skipping packages based on pkg.Error values,
printing pkg.DepsErrors was causing duplicate messages in the output.
Simply print pkg.Error values with only minimal tweaks:
including the position if there is any, and avoiding double newlines.
Overall, this makes our logic a lot less complicated,
and garble still works the way we want it to.
2 years ago
|
|
|
sed -rn 's@//go:linkname .* ([^.]*)\.[^.]*@\1@p' "${goroot}"/src/runtime/*.go | grep -vE '^main|^runtime\.|_test$' | sort -u
|
|
|
|
) <(
|
|
|
|
# Note that we assume this is constant across platforms.
|
|
|
|
go list -deps runtime | sort -u
|
|
|
|
))
|
|
|
|
|
avoid breaking intrinsics when obfuscating names
We obfuscate import paths as well as their declared names.
The compiler treats some packages and APIs in special ways,
and the way it detects those is by looking at import paths and names.
In the past, we have avoided obfuscating some names like embed.FS or
reflect.Value.MethodByName for this reason. Otherwise,
go:embed or the linker's deadcode elimination might be broken.
This matching by path and name also happens with compiler intrinsics.
Intrinsics allow the compiler to rewrite some standard library calls
with small and efficient assembly, depending on the target GOARCH.
For example, math/bits.TrailingZeros32 gets replaced with ssa.OpCtz32,
which on amd64 may result in using the TZCNTL instruction.
We never noticed that we were breaking many of these intrinsics.
The intrinsics for funcs declared in the runtime and its dependencies
still worked properly, as we do not obfuscate those packages yet.
However, for other packages like math/bits and sync/atomic,
the intrinsics were being entirely disabled due to obfuscated names.
Skipping intrinsics is particularly bad for performance,
and it also leads to slightly larger binaries:
│ old │ new │
│ bin-B │ bin-B vs base │
Build-16 5.450Mi ± ∞ ¹ 5.333Mi ± ∞ ¹ -2.15% (p=0.029 n=4)
Finally, the main reason we noticed that intrinsics were broken
is that apparently GOARCH=mips fails to link without them,
as some symbols end up being not defined at all.
This patch fixes builds for the MIPS family of architectures.
Rather than building and linking all of std for every GOARCH,
test that intrinsics work by asking the compiler to print which
intrinsics are being applied, and checking that math/bits gets them.
This fix is relatively unfortunate, as it means we stop obfuscating
about 120 function names and a handful of package paths.
However, fixing builds and intrinsics is much more important.
We can figure out better ways to deal with intrinsics in the future.
Fixes #646.
2 years ago
|
|
|
compiler_intrinsics_table="$(sed -rn 's@.*\b(addF|alias)\("([^"]*)", "([^"]*)",.*@\2 \3@p' "${goroot}"/src/cmd/compile/internal/ssagen/ssa.go | sort -u)"
|
|
|
|
compiler_intrinsics_paths="$(while read path name; do
|
|
|
|
echo ${path}
|
|
|
|
done <<<"${compiler_intrinsics_table}" | sort -u)"
|
|
|
|
|
|
|
|
gofmt >go_std_tables.go <<EOF
|
|
|
|
// Code generated by scripts/gen-go-std-tables.sh; DO NOT EDIT.
|
|
|
|
|
|
|
|
// Generated from Go version ${go_version}.
|
|
|
|
|
|
|
|
package main
|
|
|
|
|
|
|
|
var runtimeAndDeps = map[string]bool{
|
|
|
|
$(for path in ${runtime_and_deps}; do
|
|
|
|
echo "\"${path}\": true,"
|
|
|
|
done)
|
|
|
|
// Not a runtime dependency, but still uses tricks allowed by import path.
|
|
|
|
// Not a big deal either way, given that it's only imported in test packages.
|
|
|
|
"runtime/internal/startlinetest": true,
|
|
|
|
}
|
|
|
|
|
|
|
|
var runtimeLinknamed = []string{
|
|
|
|
$(for path in ${runtime_linknamed}; do
|
|
|
|
echo "\"${path}\"",
|
|
|
|
done)
|
|
|
|
}
|
avoid breaking intrinsics when obfuscating names
We obfuscate import paths as well as their declared names.
The compiler treats some packages and APIs in special ways,
and the way it detects those is by looking at import paths and names.
In the past, we have avoided obfuscating some names like embed.FS or
reflect.Value.MethodByName for this reason. Otherwise,
go:embed or the linker's deadcode elimination might be broken.
This matching by path and name also happens with compiler intrinsics.
Intrinsics allow the compiler to rewrite some standard library calls
with small and efficient assembly, depending on the target GOARCH.
For example, math/bits.TrailingZeros32 gets replaced with ssa.OpCtz32,
which on amd64 may result in using the TZCNTL instruction.
We never noticed that we were breaking many of these intrinsics.
The intrinsics for funcs declared in the runtime and its dependencies
still worked properly, as we do not obfuscate those packages yet.
However, for other packages like math/bits and sync/atomic,
the intrinsics were being entirely disabled due to obfuscated names.
Skipping intrinsics is particularly bad for performance,
and it also leads to slightly larger binaries:
│ old │ new │
│ bin-B │ bin-B vs base │
Build-16 5.450Mi ± ∞ ¹ 5.333Mi ± ∞ ¹ -2.15% (p=0.029 n=4)
Finally, the main reason we noticed that intrinsics were broken
is that apparently GOARCH=mips fails to link without them,
as some symbols end up being not defined at all.
This patch fixes builds for the MIPS family of architectures.
Rather than building and linking all of std for every GOARCH,
test that intrinsics work by asking the compiler to print which
intrinsics are being applied, and checking that math/bits gets them.
This fix is relatively unfortunate, as it means we stop obfuscating
about 120 function names and a handful of package paths.
However, fixing builds and intrinsics is much more important.
We can figure out better ways to deal with intrinsics in the future.
Fixes #646.
2 years ago
|
|
|
|
|
|
|
var compilerIntrinsicsPkgs = map[string]bool{
|
|
|
|
$(for path in ${compiler_intrinsics_paths}; do
|
|
|
|
echo "\"${path}\": true,"
|
|
|
|
done)
|
|
|
|
}
|
|
|
|
|
|
|
|
var compilerIntrinsicsFuncs = map[string]bool{
|
|
|
|
"runtime.mulUintptr": true, // Existed in Go 1.21; removed in Go 1.22.
|
avoid breaking intrinsics when obfuscating names
We obfuscate import paths as well as their declared names.
The compiler treats some packages and APIs in special ways,
and the way it detects those is by looking at import paths and names.
In the past, we have avoided obfuscating some names like embed.FS or
reflect.Value.MethodByName for this reason. Otherwise,
go:embed or the linker's deadcode elimination might be broken.
This matching by path and name also happens with compiler intrinsics.
Intrinsics allow the compiler to rewrite some standard library calls
with small and efficient assembly, depending on the target GOARCH.
For example, math/bits.TrailingZeros32 gets replaced with ssa.OpCtz32,
which on amd64 may result in using the TZCNTL instruction.
We never noticed that we were breaking many of these intrinsics.
The intrinsics for funcs declared in the runtime and its dependencies
still worked properly, as we do not obfuscate those packages yet.
However, for other packages like math/bits and sync/atomic,
the intrinsics were being entirely disabled due to obfuscated names.
Skipping intrinsics is particularly bad for performance,
and it also leads to slightly larger binaries:
│ old │ new │
│ bin-B │ bin-B vs base │
Build-16 5.450Mi ± ∞ ¹ 5.333Mi ± ∞ ¹ -2.15% (p=0.029 n=4)
Finally, the main reason we noticed that intrinsics were broken
is that apparently GOARCH=mips fails to link without them,
as some symbols end up being not defined at all.
This patch fixes builds for the MIPS family of architectures.
Rather than building and linking all of std for every GOARCH,
test that intrinsics work by asking the compiler to print which
intrinsics are being applied, and checking that math/bits gets them.
This fix is relatively unfortunate, as it means we stop obfuscating
about 120 function names and a handful of package paths.
However, fixing builds and intrinsics is much more important.
We can figure out better ways to deal with intrinsics in the future.
Fixes #646.
2 years ago
|
|
|
$(while read path name; do
|
|
|
|
echo "\"${path}.${name}\": true,"
|
|
|
|
done <<<"${compiler_intrinsics_table}")
|
|
|
|
}
|
|
|
|
|
|
|
|
var reflectSkipPkg = map[string]bool{
|
|
|
|
"fmt": true,
|
|
|
|
}
|
|
|
|
EOF
|