xxHash is an extremely fast non-cryptographic hash algorithm, working at RAM speed limits.
It is proposed in four flavors, in three families:
- XXH32 family
- Classic 32-bit hash function. Simple, compact, and runs on almost all 32-bit and 64-bit systems.
- XXH64 family
- Classic 64-bit adaptation of XXH32. Just as simple, and runs well on most 64-bit systems (but not 32-bit systems).
- XXH3 family
- Modern 64-bit and 128-bit hash function family which features improved strength and performance across the board, especially on smaller data. It benefits greatly from SIMD and 64-bit without requiring it.
Benchmarks
The reference system uses an Intel i7-9700K CPU, and runs Ubuntu x64 20.04. The open source benchmark program is compiled with clang v10.0 using -O3 flag.
Hash Name | ISA ext | Width | Large Data Speed | Small Data Velocity |
XXH3_64bits() | AVX2 | 64 | 59.4 GB/s | 133.1 |
MeowHash | AES-NI | 128 | 58.2 GB/s | 52.5 |
XXH3_128bits() | AVX2 | 128 | 57.9 GB/s | 118.1 |
CLHash | PCLMUL | 64 | 37.1 GB/s | 58.1 |
XXH3_64bits() | SSE2 | 64 | 31.5 GB/s | 133.1 |
XXH3_128bits() | SSE2 | 128 | 29.6 GB/s | 118.1 |
RAM sequential read | | N/A | 28.0 GB/s | N/A |
ahash | AES-NI | 64 | 22.5 GB/s | 107.2 |
City64 | | 64 | 22.0 GB/s | 76.6 |
T1ha2 | | 64 | 22.0 GB/s | 99.0 |
City128 | | 128 | 21.7 GB/s | 57.7 |
FarmHash | AES-NI | 64 | 21.3 GB/s | 71.9 |
XXH64() | | 64 | 19.4 GB/s | 71.0 |
SpookyHash | | 64 | 19.3 GB/s | 53.2 |
Mum | | 64 | 18.0 GB/s | 67.0 |
CRC32C | SSE4.2 | 32 | 13.0 GB/s | 57.9 |
XXH32() | | 32 | 9.7 GB/s | 71.9 |
City32 | | 32 | 9.1 GB/s | 66.0 |
Blake3* | AVX2 | 256 | 4.4 GB/s | 8.1 |
Murmur3 | | 32 | 3.9 GB/s | 56.1 |
SipHash* | | 64 | 3.0 GB/s | 43.2 |
Blake3* | SSE2 | 256 | 2.4 GB/s | 8.1 |
HighwayHash | | 64 | 1.4 GB/s | 6.0 |
FNV64 | | 64 | 1.2 GB/s | 62.7 |
Blake2* | | 256 | 1.1 GB/s | 5.1 |
SHA1* | | 160 | 0.8 GB/s | 5.6 |
MD5* | | 128 | 0.6 GB/s | 7.8 |
- Note
- Hashes which require a specific ISA extension are noted. SSE2 is also noted, even though it is mandatory on x64.
- Hashes with an asterisk are cryptographic. Note that MD5 is non-cryptographic by modern standards.
- Small data velocity is a rough average of algorithm's efficiency for small data. For more accurate information, see the wiki.
- More benchmarks and strength tests are found on the wiki: https://github.com/Cyan4973/xxHash/wiki
Usage
All xxHash variants use a similar API. Changing the algorithm is a trivial substitution.
- Precondition
- For functions which take an input and length parameter, the following requirements are assumed:
- The range from [
input
, input + length
) is valid, readable memory.
- The only exception is if the
length
is 0
, input
may be NULL
.
- For C++, the objects must have the TriviallyCopyable property, as the functions access bytes directly as if it was an array of
unsigned char
.
Single Shot
These functions are stateless functions which hash a contiguous block of memory, immediately returning the result. They are the easiest and usually the fastest option.
XXH32(), XXH64(), XXH3_64bits(), XXH3_128bits()
#include <string.h>
{
size_t length = (string == NULL) ? 0 : strlen(string);
return XXH32(
string, length, seed);
}
XXH32_hash_t XXH32(const void *input, size_t length, XXH32_hash_t seed)
Calculates the 32-bit hash of input using xxHash32.
Definition xxhash.h:2792
uint32_t XXH32_hash_t
An unsigned 32-bit integer.
Definition xxhash.h:515
Streaming
These groups of functions allow incremental hashing of unknown size, even more than what would fit in a size_t.
XXH32_reset(), XXH64_reset(), XXH3_64bits_reset(), XXH3_128bits_reset()
#include <stdio.h>
#include <assert.h>
{
assert(state != NULL && "Out of memory!");
char buffer[4096];
size_t count;
while ((count = fread(buffer, 1, sizeof(buffer), f)) != 0) {
}
return result;
}
XXH64_hash_t XXH3_64bits_digest(XXH_NOESCAPE const XXH3_state_t *statePtr)
Returns the calculated XXH3 64-bit hash value from an XXH3_state_t.
Definition xxhash.h:6108
XXH3_state_t * XXH3_createState(void)
Allocate an XXH3_state_t.
Definition xxhash.h:5821
XXH_errorcode XXH3_64bits_reset(XXH_NOESCAPE XXH3_state_t *statePtr)
Resets an XXH3_state_t to begin a new hash.
Definition xxhash.h:5879
XXH_errorcode XXH3_64bits_update(XXH_NOESCAPE XXH3_state_t *statePtr, XXH_NOESCAPE const void *input, size_t length)
Consumes a block of input to an XXH3_state_t.
Definition xxhash.h:6063
XXH_errorcode XXH3_freeState(XXH3_state_t *statePtr)
Frees an XXH3_state_t.
Definition xxhash.h:5837
uint64_t XXH64_hash_t
An unsigned 64-bit integer.
Definition xxhash.h:820