Now that we only use the list to create a replacer at init time,
we no longer need to spend extra effort sorting by length first.
The benchmark shows no measurable difference in performance.
Rather than using a quadratic double-loop algorithm
to search for each name to replace in a string,
we can use the reasonably efficient generic replacer,
which is built on tries.
Copying some code from the strings package is not ideal,
but it beats having to re-implement such an algorithm ourselves.
Not only is the algorithm much faster, as we are no longer quadratic,
but the replacer also appends into a buffer to avoid repeated string
copies, which means we allocate fewer bytes per operation.
                       │      old      │                   new                    │
                       │    sec/op     │     sec/op      vs base                  │
    AbiOriginalNames-8  135708.0n ± 0%      391.1n ± 5%  -99.71% (p=0.001 n=7)

                       │      old      │                   new                    │
                       │      B/s      │      B/s        vs base                  │
    AbiOriginalNames-8    2.565Mi ± 0%   890.112Mi ± 4%  +34597.03% (p=0.001 n=7)

                       │      old      │                   new                    │
                       │     B/op      │      B/op       vs base                  │
    AbiOriginalNames-8     5464.0 ± 0%       848.0 ± 0%  -84.48% (p=0.001 n=7)

                       │      old      │                   new                    │
                       │   allocs/op   │   allocs/op     vs base                  │
    AbiOriginalNames-8      18.00 ± 0%       16.00 ± 0%  -11.11% (p=0.001 n=7)
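As a rough illustration of the replacer approach, the sketch below builds a replacer once at init time and reuses it for every lookup. It uses the standard strings.NewReplacer directly rather than the copied code, and the name pairs and the originalNames function are hypothetical placeholders, not the actual generated code.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical obfuscated/original name pairs, flattened as
// old1, new1, old2, new2, ... which is what strings.NewReplacer expects.
var namePairs = []string{
	"xY12", "Name",
	"aB", "ID",
}

// The replacer is built once at init time and reused for every call.
var replacer = strings.NewReplacer(namePairs...)

// originalNames replaces every obfuscated name in s in a single pass.
func originalNames(s string) string {
	return replacer.Replace(s)
}

func main() {
	fmt.Println(originalNames("*pkg.xY12.aB"))
}
```

Since the replacer scans the input once, no ordering of the pairs by length is needed before constructing it.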
Use the term "original name" rather than "real name" in the code,
as it is clearer that we mean the unobfuscated name.
Update the comments at the top of the file to be clearer with the
explanation of what kinds of inputs we can expect.
While here, use fmt.Appendf to simplify the generation code a bit.
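For context, fmt.Appendf (available since Go 1.19) formats directly into a byte slice, which avoids a separate Sprintf plus append step. The following is only a sketch of what such generation code might look like; the generatePairs function and the _originalNamePairs variable name are assumptions made for illustration.

```go
package main

import "fmt"

// generatePairs renders Go source for a slice literal holding
// obfuscated/original name pairs. The variable name in the output
// is a hypothetical placeholder.
func generatePairs(pairs [][2]string) []byte {
	buf := []byte("var _originalNamePairs = []string{\n")
	for _, p := range pairs {
		// Appendf formats straight into buf and returns the extended slice.
		buf = fmt.Appendf(buf, "\t%q, %q,\n", p[0], p[1])
	}
	buf = append(buf, "}\n"...)
	return buf
}

func main() {
	fmt.Print(string(generatePairs([][2]string{{"xY12", "Name"}, {"aB", "ID"}})))
}
```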
This lets us only check names up to the remaining input string length.
                  │      old      │                   new                    │
                  │    sec/op     │     sec/op      vs base                  │
    AbiRealName-8     196.7µ ± 1%      172.3µ ± 1%  -12.41% (p=0.001 n=7)

                  │      old      │                   new                    │
                  │      B/s      │      B/s        vs base                  │
    AbiRealName-8    1.774Mi ± 1%     2.022Mi ± 0%  +13.98% (p=0.001 n=7)

                  │      old      │                   new                    │
                  │     B/op      │      B/op       vs base                  │
    AbiRealName-8    5.359Ki ± 0%     5.336Ki ± 0%  -0.44% (p=0.001 n=7)

                  │      old      │                   new                    │
                  │   allocs/op   │   allocs/op     vs base                  │
    AbiRealName-8      19.00 ± 0%       18.00 ± 0%  -5.26% (p=0.001 n=7)
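The length-bounded search can be sketched as follows, assuming the pairs are sorted by obfuscated name length so that the inner loop can stop as soon as a name is longer than the remaining input. The namePair type and both functions are hypothetical names for illustration, not the actual implementation.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// namePair is a hypothetical type pairing an obfuscated name with its original.
type namePair struct{ obfuscated, original string }

// sortByLength orders pairs from shortest to longest obfuscated name.
func sortByLength(pairs []namePair) {
	sort.SliceStable(pairs, func(i, j int) bool {
		return len(pairs[i].obfuscated) < len(pairs[j].obfuscated)
	})
}

// originalNames assumes pairs is already sorted by sortByLength.
func originalNames(pairs []namePair, s string) string {
	var b strings.Builder
	for i := 0; i < len(s); {
		rest := s[i:]
		matched := false
		for _, p := range pairs {
			if len(p.obfuscated) > len(rest) {
				break // every later name is at least as long, so none can match
			}
			if p.obfuscated != "" && strings.HasPrefix(rest, p.obfuscated) {
				b.WriteString(p.original)
				i += len(p.obfuscated)
				matched = true
				break
			}
		}
		if !matched {
			b.WriteByte(s[i])
			i++
		}
	}
	return b.String()
}

func main() {
	pairs := []namePair{{"xY12", "Name"}, {"aB", "ID"}}
	sortByLength(pairs)
	fmt.Println(originalNames(pairs, "*pkg.xY12.aB"))
}
```

The early break only trims the tail of the inner loop; the search as a whole remains a double loop.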
Iterating over a map is much more expensive than iterating over a slice,
given how it needs to work out which keys are present in each bucket
and then randomize the order in which to navigate the keys.
None of this work needs to happen when iterating over a slice.
A map would be nice if we were to actually do map lookups, but we don't.
                  │      old      │                   new                    │
                  │    sec/op     │     sec/op      vs base                  │
    AbiRealName-8     707.1µ ± 1%      196.7µ ± 1%  -72.17% (p=0.001 n=7)

                  │      old      │                   new                    │
                  │      B/s      │      B/s        vs base                  │
    AbiRealName-8    517.6Ki ± 2%    1816.4Ki ± 1%  +250.94% (p=0.001 n=7)

                  │      old      │                   new                    │
                  │     B/op      │      B/op       vs base                  │
    AbiRealName-8    5.362Ki ± 0%     5.359Ki ± 0%  -0.05% (p=0.001 n=7)

                  │      old      │                   new                    │
                  │   allocs/op   │   allocs/op     vs base                  │
    AbiRealName-8      19.00 ± 0%       19.00 ± 0%  ~ (p=1.000 n=7) ¹
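To make the map versus slice change concrete, here is a simplified sketch that stores the same pairs both ways and walks each of them in full. The types, variables, and the strings.ReplaceAll loop are assumptions chosen for brevity and do not mirror the real search loop.

```go
package main

import (
	"fmt"
	"strings"
)

// namePair is a hypothetical type pairing an obfuscated name with its original.
type namePair struct{ obfuscated, original string }

// The same data stored as a map and as a slice.
var nameMap = map[string]string{"xY12": "Name", "aB": "ID"}
var nameSlice = []namePair{{"xY12", "Name"}, {"aB", "ID"}}

// originalNamesMap ranges over the map; no key lookups are performed,
// so the map's main advantage goes unused.
func originalNamesMap(s string) string {
	for obfuscated, original := range nameMap {
		s = strings.ReplaceAll(s, obfuscated, original)
	}
	return s
}

// originalNamesSlice does the same work over a slice, which is a plain
// sequential walk in a fixed order.
func originalNamesSlice(s string) string {
	for _, p := range nameSlice {
		s = strings.ReplaceAll(s, p.obfuscated, p.original)
	}
	return s
}

func main() {
	fmt.Println(originalNamesMap("*pkg.xY12.aB"))
	fmt.Println(originalNamesSlice("*pkg.xY12.aB"))
}
```

Both functions produce the same result for these non-overlapping names; only the cost of the iteration differs.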
Go code can retrieve and use field and method names via the `reflect` package.
For that reason, historically we did not obfuscate names of fields and methods
underneath types that we detected as used for reflection, via e.g. `reflect.TypeOf`.
However, that caused a number of issues. Since we obfuscate and build one package
at a time, we could only detect when types were used for reflection in their own package
or in upstream packages. Use of reflection in downstream packages would be detected
too late, causing one package to obfuscate the names and the other not to, leading to a build failure.
A different approach is implemented here. All names are obfuscated now, but we collect
those types used for reflection, and at the end of a build in `package main`,
we inject a function into the runtime's `internal/abi` package to reverse the obfuscation
for those names which can be used for reflection.
This does mean that the obfuscation for these names is very weak, as the binary
contains a one-to-one mapping to their original names, but they cannot be obfuscated
without breaking too many Go packages out in the wild. There is also some amount
of overhead in `internal/abi` due to this, but we aim to make the overhead insignificant.
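As a reminder of why these names matter at run time, the sketch below reads struct field names through `reflect.TypeOf`. The `Config` type and the `fieldNames` helper are hypothetical; any code like this (for example an encoder keyed on field names) would see different names if they were obfuscated without being reversed.

```go
package main

import (
	"fmt"
	"reflect"
)

// Config is a hypothetical type whose field names are read via reflection.
type Config struct {
	Host string
	Port int
}

// fieldNames returns the names of the fields of a struct value.
// It assumes v is a struct; reflect panics on NumField otherwise.
func fieldNames(v any) []string {
	t := reflect.TypeOf(v)
	names := make([]string, 0, t.NumField())
	for i := 0; i < t.NumField(); i++ {
		names = append(names, t.Field(i).Name)
	}
	return names
}

func main() {
	fmt.Println(fieldNames(Config{}))
}
```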
Fixes #884, #799, #817, #881, #858, #843, #842
Closes #406