Tuesday, July 22, 2014

Python hash calculation algorithms

I decided to find out how Python calculates hash values for built-in objects. Fortunately, Python's source code is open, so I analyzed the C code of the hash functions for the built-in data types and rewrote them as equivalent functions in Python.
Here is my code for the hash functions of the built-in data types (str, int, float, complex, bool, object, tuple and frozenset). I tested it on both 32-bit and 64-bit builds of Python 2.7, and also on Python 3.2 and 3.5 (all CPython).
As you can see from the source code, Python 3 uses updated algorithms for calculating int and float hashes. Moreover, starting with version 3.4, SipHash replaced the Fowler-Noll-Vo (FNV) algorithm as the default hash algorithm for strings and bytes.
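To give an idea of what the updated int algorithm looks like, here is a minimal sketch, assuming CPython 3 (it is not the code from my rewrite, and the name int_hash is made up for this example). The hash of an int is its absolute value reduced modulo a fixed prime, with the sign restored afterwards. The prime is exposed as sys.hash_info.modulus.

```python
import sys

# Prime modulus used by CPython 3 for numeric hashes
# (2**61 - 1 on 64-bit builds, 2**31 - 1 on 32-bit builds).
MODULUS = sys.hash_info.modulus


def int_hash(n):
    """Sketch of the CPython 3 int hash (hypothetical helper name)."""
    sign = -1 if n < 0 else 1
    h = sign * (abs(n) % MODULUS)
    # -1 signals an error inside the C implementation, so it is remapped to -2.
    return -2 if h == -1 else h


# Compare the sketch with the built-in hash() on a few values.
for n in (0, 1, -1, MODULUS - 1, MODULUS, MODULUS + 1, -(2 ** 64), 10 ** 30):
    assert int_hash(n) == hash(n)
```

Since bool is a subclass of int, the same sketch should also cover True and False.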
N.B.: None is an object, so use the "object_hash" function to calculate its hash.
P.S.: If you find test data for which the built-in hash functions and my rewritten functions give different results, please let me know.
