xxHash 0.8.2
Extremely fast non-cryptographic hash function

Data Structures  
struct  XXH128_hash_t 
The return value from 128-bit hashes.
struct  XXH128_canonical_t 
Macros  
#define  XXH3_SECRET_SIZE_MIN 136 
Typedefs  
typedef struct XXH3_state_s  XXH3_state_t 
The state struct for the XXH3 streaming API.  
Functions  
XXH64_hash_t  XXH3_64bits (XXH_NOESCAPE const void *input, size_t length) 
64-bit unseeded variant of XXH3.
XXH64_hash_t  XXH3_64bits_withSeed (XXH_NOESCAPE const void *input, size_t length, XXH64_hash_t seed) 
64-bit seeded variant of XXH3.
XXH64_hash_t  XXH3_64bits_withSecret (XXH_NOESCAPE const void *data, size_t len, XXH_NOESCAPE const void *secret, size_t secretSize) 
64-bit variant of XXH3 with a custom "secret".
XXH3_state_t *  XXH3_createState (void) 
Allocate an XXH3_state_t.  
XXH_errorcode  XXH3_freeState (XXH3_state_t *statePtr) 
Frees an XXH3_state_t.  
void  XXH3_copyState (XXH_NOESCAPE XXH3_state_t *dst_state, XXH_NOESCAPE const XXH3_state_t *src_state) 
Copies one XXH3_state_t to another.  
XXH_errorcode  XXH3_64bits_reset (XXH_NOESCAPE XXH3_state_t *statePtr) 
Resets an XXH3_state_t to begin a new hash.  
XXH_errorcode  XXH3_64bits_reset_withSeed (XXH_NOESCAPE XXH3_state_t *statePtr, XXH64_hash_t seed) 
Resets an XXH3_state_t with a 64-bit seed to begin a new hash.
XXH_errorcode  XXH3_64bits_reset_withSecret (XXH_NOESCAPE XXH3_state_t *statePtr, XXH_NOESCAPE const void *secret, size_t secretSize) 
XXH_errorcode  XXH3_64bits_update (XXH_NOESCAPE XXH3_state_t *statePtr, XXH_NOESCAPE const void *input, size_t length) 
Consumes a block of input to an XXH3_state_t.  
XXH64_hash_t  XXH3_64bits_digest (XXH_NOESCAPE const XXH3_state_t *statePtr) 
Returns the calculated XXH3 64-bit hash value from an XXH3_state_t.
XXH128_hash_t  XXH3_128bits (XXH_NOESCAPE const void *data, size_t len) 
Unseeded 128-bit variant of XXH3.
XXH128_hash_t  XXH3_128bits_withSeed (XXH_NOESCAPE const void *data, size_t len, XXH64_hash_t seed) 
Seeded 128-bit variant of XXH3.
XXH128_hash_t  XXH3_128bits_withSecret (XXH_NOESCAPE const void *data, size_t len, XXH_NOESCAPE const void *secret, size_t secretSize) 
Custom secret 128-bit variant of XXH3.
XXH_errorcode  XXH3_128bits_reset (XXH_NOESCAPE XXH3_state_t *statePtr) 
Resets an XXH3_state_t to begin a new hash.  
XXH_errorcode  XXH3_128bits_reset_withSeed (XXH_NOESCAPE XXH3_state_t *statePtr, XXH64_hash_t seed) 
Resets an XXH3_state_t with a 64-bit seed to begin a new hash.
XXH_errorcode  XXH3_128bits_reset_withSecret (XXH_NOESCAPE XXH3_state_t *statePtr, XXH_NOESCAPE const void *secret, size_t secretSize) 
Custom secret 128-bit variant of XXH3.
XXH_errorcode  XXH3_128bits_update (XXH_NOESCAPE XXH3_state_t *statePtr, XXH_NOESCAPE const void *input, size_t length) 
Consumes a block of input to an XXH3_state_t.  
XXH128_hash_t  XXH3_128bits_digest (XXH_NOESCAPE const XXH3_state_t *statePtr) 
Returns the calculated XXH3 128-bit hash value from an XXH3_state_t.
int  XXH128_isEqual (XXH128_hash_t h1, XXH128_hash_t h2) 
int  XXH128_cmp (XXH_NOESCAPE const void *h128_1, XXH_NOESCAPE const void *h128_2) 
Compares two XXH128_hash_t. This comparator is compatible with stdlib's qsort()/bsearch().
void  XXH128_canonicalFromHash (XXH_NOESCAPE XXH128_canonical_t *dst, XXH128_hash_t hash) 
Converts an XXH128_hash_t to a big endian XXH128_canonical_t.  
XXH128_hash_t  XXH128_hashFromCanonical (XXH_NOESCAPE const XXH128_canonical_t *src) 
Converts an XXH128_canonical_t to a native XXH128_hash_t.  
XXH_errorcode  XXH3_64bits_reset_withSecretandSeed (XXH_NOESCAPE XXH3_state_t *statePtr, XXH_NOESCAPE const void *secret, size_t secretSize, XXH64_hash_t seed64) 
XXH128_hash_t  XXH3_128bits_withSecretandSeed (XXH_NOESCAPE const void *input, size_t length, XXH_NOESCAPE const void *secret, size_t secretSize, XXH64_hash_t seed64) 
XXH128_hash_t  XXH128 (XXH_NOESCAPE const void *data, size_t len, XXH64_hash_t seed) 
XXH_errorcode  XXH3_128bits_reset_withSecretandSeed (XXH_NOESCAPE XXH3_state_t *statePtr, XXH_NOESCAPE const void *secret, size_t secretSize, XXH64_hash_t seed64) 
XXH_errorcode  XXH3_generateSecret (XXH_NOESCAPE void *secretBuffer, size_t secretSize, XXH_NOESCAPE const void *customSeed, size_t customSeedSize) 
void  XXH3_generateSecret_fromSeed (XXH_NOESCAPE void *secretBuffer, XXH64_hash_t seed) 
Generate the same secret as the _withSeed() variants.  
XXH3 is a more recent hash algorithm featuring:
Speed analysis methodology is explained here:
https://fastcompression.blogspot.com/2019/03/presenting-xxh3.html
Compared to XXH64, expect XXH3 to run approximately 2x faster on large inputs and more than 3x faster on small ones; exact differences vary depending on platform.
XXH3's speed benefits greatly from SIMD and 64-bit arithmetic, but does not require them. Most 32-bit and 64-bit targets that can run XXH32 smoothly can run XXH3 at competitive speeds, even without vector support. Further details are explained in the implementation.
XXH3 has a fast scalar implementation, but it also includes accelerated SIMD implementations for many common platforms.
The XXH3 implementation is portable: it has a generic C90 formulation that can be compiled on any platform, and all implementations generate exactly the same hash value on all platforms. Starting from v0.8.0, it's also labelled "stable", meaning that any future version will also generate the same hash value.
XXH3 offers 2 variants, _64bits and _128bits.
When only 64 bits are needed, prefer invoking the _64bits variant, as it reduces the amount of mixing, resulting in faster speed on small inputs. It's also generally simpler to manipulate a scalar return type than a struct.
The API supports one-shot hashing, streaming mode, and custom secrets.
#define XXH3_SECRET_SIZE_MIN 136 
The bare minimum size for a custom secret.
typedef struct XXH3_state_s XXH3_state_t 
The state struct for the XXH3 streaming API.
XXH64_hash_t XXH3_64bits  (  XXH_NOESCAPE const void *  input, 
size_t  length  
) 
64-bit unseeded variant of XXH3.
This is equivalent to XXH3_64bits_withSeed() with a seed of 0. However, it may have slightly better performance due to constant propagation of the defaults.
XXH64_hash_t XXH3_64bits_withSeed  (  XXH_NOESCAPE const void *  input, 
size_t  length,  
XXH64_hash_t  seed  
) 
64-bit seeded variant of XXH3.
This variant generates a custom secret on the fly, based on the default secret altered using the seed value.
While this operation is decently fast, note that it's not completely free.
input  The data to hash. 
length  The length of input, in bytes. 
seed  The 64-bit seed to alter the state. 
XXH64_hash_t XXH3_64bits_withSecret  (  XXH_NOESCAPE const void *  data, 
size_t  len,  
XXH_NOESCAPE const void *  secret,  
size_t  secretSize  
) 
64-bit variant of XXH3 with a custom "secret".
It's possible to provide any blob of bytes as a "secret" to generate the hash. This makes it more difficult for an external actor to prepare an intentional collision. The main condition is that secretSize must be large enough (>= XXH3_SECRET_SIZE_MIN).
However, the quality of the secret impacts the dispersion of the hash algorithm. Therefore, the secret must look like a bunch of random bytes. Avoid "trivial" or structured data such as repeated sequences or a text document. Whenever in doubt about the "randomness" of the blob of bytes, consider employing XXH3_generateSecret() instead (see below). It will generate a proper high-entropy secret derived from the blob of bytes.
Another advantage of using XXH3_generateSecret() is that it guarantees that all bits within the initial blob of bytes will impact every bit of the output. This is not necessarily the case when using the blob of bytes directly because, when hashing small inputs, only a portion of the secret is employed.
XXH3_state_t * XXH3_createState  (  void  ) 
Allocate an XXH3_state_t.
Must be freed with XXH3_freeState().
Returns NULL on failure.
XXH_errorcode XXH3_freeState  (  XXH3_state_t *  statePtr  ) 
Frees an XXH3_state_t.
Must be allocated with XXH3_createState().
statePtr  A pointer to an XXH3_state_t allocated with XXH3_createState(). 
void XXH3_copyState  (  XXH_NOESCAPE XXH3_state_t *  dst_state, 
XXH_NOESCAPE const XXH3_state_t *  src_state  
) 
Copies one XXH3_state_t to another.
dst_state  The state to copy to. 
src_state  The state to copy from. 
dst_state and src_state must not be NULL and must not overlap.
XXH_errorcode XXH3_64bits_reset  (  XXH_NOESCAPE XXH3_state_t *  statePtr  ) 
Resets an XXH3_state_t to begin a new hash.
This function resets statePtr and generates a secret with default parameters. Call it before XXH3_64bits_update(). The digest will be equivalent to XXH3_64bits().
statePtr  The state struct to reset. 
statePtr must not be NULL.
XXH_errorcode XXH3_64bits_reset_withSeed  (  XXH_NOESCAPE XXH3_state_t *  statePtr, 
XXH64_hash_t  seed  
) 
Resets an XXH3_state_t with a 64-bit seed to begin a new hash.
This function resets statePtr and generates a secret from seed. Call it before XXH3_64bits_update(). The digest will be equivalent to XXH3_64bits_withSeed().
statePtr  The state struct to reset. 
seed  The 64-bit seed to alter the state. 
statePtr must not be NULL.
XXH_errorcode XXH3_64bits_reset_withSecret  (  XXH_NOESCAPE XXH3_state_t *  statePtr, 
XXH_NOESCAPE const void *  secret,  
size_t  secretSize  
) 
secret is referenced, so it must outlive the hash streaming session. As with the one-shot API, secretSize must be >= XXH3_SECRET_SIZE_MIN, and the quality of produced hash values depends on the secret's entropy (the secret's content should look like a bunch of random bytes). When in doubt about the randomness of a candidate secret, consider employing XXH3_generateSecret() instead (see below).
XXH_errorcode XXH3_64bits_update  (  XXH_NOESCAPE XXH3_state_t *  statePtr, 
XXH_NOESCAPE const void *  input,  
size_t  length  
) 
Consumes a block of input to an XXH3_state_t.
Call this to incrementally consume blocks of data.
statePtr  The state struct to update. 
input  The block of data to be hashed, at least length bytes in size. 
length  The length of input, in bytes. 
statePtr must not be NULL. The memory range from input to input + length must be valid, readable, contiguous memory. However, if length is 0, input may be NULL. In C++, this also must be TriviallyCopyable.
XXH64_hash_t XXH3_64bits_digest  (  XXH_NOESCAPE const XXH3_state_t *  statePtr  ) 
Returns the calculated XXH3 64-bit hash value from an XXH3_state_t.
Calling XXH3_64bits_digest() does not affect statePtr, so you can update, digest, and update again.
statePtr  The state struct to calculate the hash from. 
statePtr must not be NULL.
XXH128_hash_t XXH3_128bits  (  XXH_NOESCAPE const void *  data, 
size_t  len  
) 
Unseeded 128-bit variant of XXH3.
The 128-bit variant of XXH3 has more strength, but it has a bit of overhead for shorter inputs.
This is equivalent to XXH3_128bits_withSeed() with a seed of 0. However, it may have slightly better performance due to constant propagation of the defaults.
XXH128_hash_t XXH3_128bits_withSeed  (  XXH_NOESCAPE const void *  data, 
size_t  len,  
XXH64_hash_t  seed  
) 
Seeded 128-bit variant of XXH3.
XXH128_hash_t XXH3_128bits_withSecret  (  XXH_NOESCAPE const void *  data, 
size_t  len,  
XXH_NOESCAPE const void *  secret,  
size_t  secretSize  
) 
Custom secret 128-bit variant of XXH3.
XXH_errorcode XXH3_128bits_reset  (  XXH_NOESCAPE XXH3_state_t *  statePtr  ) 
Resets an XXH3_state_t to begin a new hash.
This function resets statePtr and generates a secret with default parameters. Call it before XXH3_128bits_update(). The digest will be equivalent to XXH3_128bits().
statePtr  The state struct to reset. 
statePtr must not be NULL.
XXH_errorcode XXH3_128bits_reset_withSeed  (  XXH_NOESCAPE XXH3_state_t *  statePtr, 
XXH64_hash_t  seed  
) 
Resets an XXH3_state_t with a 64-bit seed to begin a new hash.
This function resets statePtr and generates a secret from seed. Call it before XXH3_128bits_update(). The digest will be equivalent to XXH3_128bits_withSeed().
statePtr  The state struct to reset. 
seed  The 64-bit seed to alter the state. 
statePtr must not be NULL.
XXH_errorcode XXH3_128bits_reset_withSecret  (  XXH_NOESCAPE XXH3_state_t *  statePtr, 
XXH_NOESCAPE const void *  secret,  
size_t  secretSize  
) 
Custom secret 128-bit variant of XXH3.
XXH_errorcode XXH3_128bits_update  (  XXH_NOESCAPE XXH3_state_t *  statePtr, 
XXH_NOESCAPE const void *  input,  
size_t  length  
) 
Consumes a block of input to an XXH3_state_t.
Call this to incrementally consume blocks of data.
statePtr  The state struct to update. 
input  The block of data to be hashed, at least length bytes in size. 
length  The length of input, in bytes. 
statePtr must not be NULL. The memory range from input to input + length must be valid, readable, contiguous memory. However, if length is 0, input may be NULL. In C++, this also must be TriviallyCopyable.
XXH128_hash_t XXH3_128bits_digest  (  XXH_NOESCAPE const XXH3_state_t *  statePtr  ) 
Returns the calculated XXH3 128-bit hash value from an XXH3_state_t.
Calling XXH3_128bits_digest() does not affect statePtr, so you can update, digest, and update again.
statePtr  The state struct to calculate the hash from. 
statePtr must not be NULL.
int XXH128_isEqual  (  XXH128_hash_t  h1, 
XXH128_hash_t  h2  
) 
Returns 1 if h1 and h2 are equal, 0 if they are not.
int XXH128_cmp  (  XXH_NOESCAPE const void *  h128_1, 
XXH_NOESCAPE const void *  h128_2  
) 
Compares two XXH128_hash_t. This comparator is compatible with stdlib's qsort()/bsearch().
void XXH128_canonicalFromHash  (  XXH_NOESCAPE XXH128_canonical_t *  dst, 
XXH128_hash_t  hash  
) 
Converts an XXH128_hash_t to a big endian XXH128_canonical_t.
dst  The XXH128_canonical_t pointer to be stored to. 
hash  The XXH128_hash_t to be converted. 
dst must not be NULL.
XXH128_hash_t XXH128_hashFromCanonical  (  XXH_NOESCAPE const XXH128_canonical_t *  src  ) 
Converts an XXH128_canonical_t to a native XXH128_hash_t.
src  The XXH128_canonical_t to convert. 
src must not be NULL.
XXH_errorcode XXH3_64bits_reset_withSecretandSeed  (  XXH_NOESCAPE XXH3_state_t *  statePtr, 
XXH_NOESCAPE const void *  secret,  
size_t  secretSize,  
XXH64_hash_t  seed64  
) 
These variants generate hash values using either seed for "short" keys (< XXH3_MIDSIZE_MAX = 240 bytes) or secret for "large" keys (>= XXH3_MIDSIZE_MAX).
This generally benefits speed, compared to _withSeed() or _withSecret(). _withSeed() has to generate the secret on the fly for "large" keys. It's fast, but can be perceptible for "not so large" keys (< 1 KB). _withSecret() has to generate the masks on the fly for "small" keys, which requires more instructions than the _withSeed() variants. Therefore, the _withSecretandSeed variant combines the best of both worlds.
When secret has been generated by XXH3_generateSecret_fromSeed(), this variant produces exactly the same results as the _withSeed() variant, hence offering only a pure speed benefit on "large" input, by skipping the need to regenerate the secret for every large input.
Another usage scenario is to hash the secret to a 64-bit hash value, for example with XXH3_64bits(), which then becomes the seed, and then employ both the seed and the secret in _withSecretandSeed(). On top of speed, an added benefit is that each bit in the secret has a 50% chance to swap each bit in the output, via its impact on the seed.
This is not guaranteed when using the secret directly in "small data" scenarios, because only portions of the secret are employed for small data.
XXH128_hash_t XXH3_128bits_withSecretandSeed  (  XXH_NOESCAPE const void *  input, 
size_t  length,  
XXH_NOESCAPE const void *  secret,  
size_t  secretSize,  
XXH64_hash_t  seed64  
) 
One-shot 128-bit counterpart: it uses seed for "short" keys (< XXH3_MIDSIZE_MAX = 240 bytes) and secret for "large" keys (>= XXH3_MIDSIZE_MAX). The full description given for XXH3_64bits_reset_withSecretandSeed() above applies unchanged.
XXH128_hash_t XXH128  (  XXH_NOESCAPE const void *  data, 
size_t  len,  
XXH64_hash_t  seed  
) 
Simple alias to the pre-selected XXH3_128bits variant.
XXH_errorcode XXH3_128bits_reset_withSecretandSeed  (  XXH_NOESCAPE XXH3_state_t *  statePtr, 
XXH_NOESCAPE const void *  secret,  
size_t  secretSize,  
XXH64_hash_t  seed64  
) 
Streaming 128-bit counterpart: it uses seed for "short" keys (< XXH3_MIDSIZE_MAX = 240 bytes) and secret for "large" keys (>= XXH3_MIDSIZE_MAX). The full description given for XXH3_64bits_reset_withSecretandSeed() above applies unchanged.
XXH_errorcode XXH3_generateSecret  (  XXH_NOESCAPE void *  secretBuffer, 
size_t  secretSize,  
XXH_NOESCAPE const void *  customSeed,  
size_t  customSeedSize  
) 
Derive a high-entropy secret from any user-defined content, named customSeed. The generated secret can be used in combination with the *_withSecret() functions. The _withSecret() variants are useful to provide a higher level of protection than a 64-bit seed, as it becomes much more difficult for an external actor to guess how to impact the calculation logic.
The function accepts as input a custom seed of any length and any content, and derives from it a high-entropy secret of length secretSize into an already allocated buffer secretBuffer.
The generated secret can then be used with any *_withSecret() variant. The functions XXH3_128bits_withSecret(), XXH3_64bits_withSecret(), XXH3_128bits_reset_withSecret() and XXH3_64bits_reset_withSecret() are part of this list. They all accept a secret parameter which must be large enough for implementation reasons (>= XXH3_SECRET_SIZE_MIN) and feature very high entropy (consist of random-looking bytes). These conditions can be a high bar to meet, so XXH3_generateSecret() can be employed to ensure proper quality.
customSeed can be anything. It can have any size, even small ones, and its content can be anything, even "poor entropy" sources such as a bunch of zeroes. The resulting secret will nonetheless provide all required qualities.
secretSize must be >= XXH3_SECRET_SIZE_MIN.
When customSeedSize > 0, supplying NULL as customSeed is undefined behavior.
void XXH3_generateSecret_fromSeed  (  XXH_NOESCAPE void *  secretBuffer, 
XXH64_hash_t  seed  
) 
Generate the same secret as the _withSeed() variants.
The generated secret can be used in combination with the *_withSecret() and _withSecretandSeed() variants.
secretBuffer  A writable buffer of XXH3_SECRET_DEFAULT_SIZE (192) bytes 
seed  The seed to seed the state. 