It's common for asset bundling code generators to produce huge literals,
for example in strings. Our literal obfuscators are meant for relatively
small string-like literals that a human would write, such as URLs, file
paths, and English text.
I ran some quick experiments, and "garble build -literals" appears to
hang when obfuscating literals starting at around 5-20KiB. It's not
really hung; it's just doing a lot of busy work obfuscating those
literals. The code it produces is also far from ideal, so it takes some
time to finally compile.
The generated code also led to crashes. For example, using "garble build
-literals -tiny" on a package containing literals of over a megabyte,
our use of asthelper to remove comments and shuffle line numbers could
run out of stack memory.
This all points in one direction: we never designed "-literals" to deal
with large literals. Set a source-code-size limit of 2KiB; any literal
above it is left unobfuscated.
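A minimal sketch of what such a size check could look like. The names
maxSizeBytes and withinSizeLimit are hypothetical, and the sketch assumes
the limit is measured on the literal's source text (for a go/ast literal,
that would be something like ast.BasicLit.Value); it is not the actual
garble code.

```go
package main

import (
	"fmt"
	"strings"
)

// maxSizeBytes is a hypothetical name for the 2KiB source-size limit.
const maxSizeBytes = 2 << 10

// withinSizeLimit reports whether a literal's source text, quotes
// included, is small enough to be worth obfuscating.
func withinSizeLimit(litSource string) bool {
	return len(litSource) <= maxSizeBytes
}

func main() {
	small := `"https://example.com/path"`
	large := `"` + strings.Repeat("x", 128<<10) + `"`

	// The small literal is within the limit; the 128KiB one is skipped.
	fmt.Println(withinSizeLimit(small), withinSizeLimit(large))
}
```

Checking the source size rather than the decoded value keeps the check
cheap, since nothing needs to be unquoted or evaluated first.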
We alter the literals.txt test as well, to include a few 128KiB string
literals. Before this fix, "go test" would seemingly hang on that test
for over a minute (I did not wait any longer). With the fix, those large
literals are not obfuscated, so the test ends in its usual 1-3s.
As said in the const comment, I don't believe any of this is a big
problem. Come Go 1.16, most developers should stop using asset-bundling
code generators and use go:embed instead. If we wanted to somehow
obfuscate those, it would be an entirely separate feature.
And, if someone wants to work on obfuscating truly large literals for
any reason, we need good tests and benchmarks to ensure garble does not
consume CPU for minutes or run out of memory.
I also simplified the generate-literals test command. The only argument
that matters to the script is the filename, since it's used later on.
Fixes #178.
Many files were missing copyright notices, so also add a short script to
add the missing lines with the current year, and run it.
The AUTHORS file is also self-explanatory. Contributors can add
themselves there, or we can simply update it from time to time via
git-shortlog.
Since we have two scripts now, set up a directory for them.
Rework the features section in the README, leaving optional features at
the end of the list. Simplify the caveats list, too; the build cache and
exported field/method bits only need one point each. Overall, the
section was far too wordy for little reason.
Also redo the help text a bit. There's now a line to briefly introduce
the tool, as well as a link to the README with all the details. Finally,
the flags have shorter and more consistent help strings.
While at it, remove two unused global vars as spotted by staticcheck.
Error strings should never be capitalized.
A binsubstr line in one of the tests was duplicated and thus useless.
Remove duplicate or trailing spaces in test scripts.
Finally, add a TODO for an optimization I just spotted.
First, unindent some of the AST code.
Second, genRandInt is unused; delete it.
Third, genRandIntn is really just mathrand.Intn. Just use it directly.
Fourth, don't use inline comments if they result in super long lines.
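To illustrate the third point, here is a small sketch of calling
mathrand.Intn directly rather than through a thin wrapper. The helper
randomBytes is hypothetical and only exists to show the call in context;
"mathrand" is assumed to be an import alias for math/rand.

```go
package main

import (
	"fmt"
	mathrand "math/rand"
)

// randomBytes is a hypothetical helper that fills a slice of the given
// size with pseudo-random bytes, calling mathrand.Intn directly.
func randomBytes(size int) []byte {
	buf := make([]byte, size)
	for i := range buf {
		// Intn(256) returns an int in the half-open range [0, 256).
		buf[i] = byte(mathrand.Intn(256))
	}
	return buf
}

func main() {
	fmt.Println(len(randomBytes(8)))
}
```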
Implement a literal obfuscator interface to allow the easy addition of
new encodings.
Add literal obfuscation for byte literals.
Choose a random obfuscator whenever a literal is obfuscated, which is
useful once multiple obfuscators are implemented.
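The design described above could be sketched roughly as follows. This is
a simplified illustration, not the real implementation: the obfuscator
interface, both implementations, and randObfuscator are hypothetical
names, and the sketch transforms bytes at run time, whereas an actual
obfuscator would presumably generate Go code that decodes the literal.

```go
package main

import (
	"fmt"
	mathrand "math/rand"
)

// obfuscator is a hypothetical interface; each encoding must be
// reversible given the same key.
type obfuscator interface {
	name() string
	encode(data []byte, key byte) []byte
	decode(data []byte, key byte) []byte
}

// mapBytes applies f to every byte of data and returns a new slice.
func mapBytes(data []byte, f func(byte) byte) []byte {
	out := make([]byte, len(data))
	for i, b := range data {
		out[i] = f(b)
	}
	return out
}

// xorObfuscator XORs every byte with the key; XOR is its own inverse.
type xorObfuscator struct{}

func (xorObfuscator) name() string { return "xor" }
func (xorObfuscator) encode(data []byte, key byte) []byte {
	return mapBytes(data, func(b byte) byte { return b ^ key })
}
func (o xorObfuscator) decode(data []byte, key byte) []byte {
	return o.encode(data, key)
}

// addObfuscator adds the key to every byte, wrapping around at 256.
type addObfuscator struct{}

func (addObfuscator) name() string { return "add" }
func (addObfuscator) encode(data []byte, key byte) []byte {
	return mapBytes(data, func(b byte) byte { return b + key })
}
func (addObfuscator) decode(data []byte, key byte) []byte {
	return mapBytes(data, func(b byte) byte { return b - key })
}

// obfuscators lists every available encoding; adding a new one only
// requires implementing the interface and appending it here.
var obfuscators = []obfuscator{xorObfuscator{}, addObfuscator{}}

// randObfuscator picks one of the registered encodings at random.
func randObfuscator() obfuscator {
	return obfuscators[mathrand.Intn(len(obfuscators))]
}

func main() {
	o := randObfuscator()
	key := byte(mathrand.Intn(256))
	enc := o.encode([]byte("hello"), key)
	fmt.Printf("%s round trip: %q\n", o.name(), o.decode(enc, key))
}
```

Keeping the encodings behind one interface means the selection logic
never needs to change when a new encoding is added.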
Fixes #62.