~mcf/b3sum

Merge branch 'upstream'
Import blake3 0.3.6
Merge branch 'upstream'
Import blake3 0.3.5
Make .ctors section writable

This is needed for PIE.
Add mechanism to control use of assembly implementations
Rewrite CPU feature detection

Use assembly to make it run as a constructor.
Move definitions from blake3_impl.h to their corresponding source files
Import blake3 0.3.4
Remove BLAKE3_TESTING guards

To disable asserts, just build with -D NDEBUG.
Add Makefile and b3sum utility
Remove unnecessary immintrin.h include
Inline counter_low and counter_high
Move CPU feature detection to assembly source