Python Dictionary String Key Performance

This Python article times the dictionary get method with string keys of varying lengths. A short key is faster.
Dictionary, string keys. We often assume lookup time is constant, but many factors influence a dictionary's speed. To hash a string, every character must be processed. So a short key can be hashed faster than a long one.
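To make the per-character cost concrete, here is a minimal sketch of a string hash. The toy_hash function is a hypothetical helper written for this illustration (an FNV-1a style loop); it is not meant to reproduce Python's built-in hash, only to show that the work grows with the key length.

```python
def toy_hash(key):
    """Illustrative FNV-1a style hash (an assumption for this sketch,
    not Python's built-in hash).

    Returns the 32-bit hash value and the number of bytes processed.
    """
    h = 2166136261  # FNV-1a 32-bit offset basis
    steps = 0
    for byte in key.encode("utf-8"):
        h ^= byte
        h = (h * 16777619) & 0xFFFFFFFF  # multiply by FNV prime, keep 32 bits
        steps += 1
    return h, steps


for key in ("cat", "anextremelylongstringkey"):
    value, steps = toy_hash(key)
    print(f"{key!r}: {steps} bytes hashed")
```

The loop body runs once per byte, so the longer key needs several times as many steps as the three-letter key.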
Example. Performance tests help us learn how collections really work. In this example, I create a dictionary with two keys. One is short, with just three letters (cat). The second is longer, with 24 characters.

Version 1: In this loop we do a get() for the first key. The get method internally computes a hash, but only needs to scan a short string.

Version 2: Here we call get() on the longer string. In each iteration, a hash code must be computed for the entire string.

Python program that times short, long string key lookups

    import time

    lookup = {"cat": 1, "anextremelylongstringkey": 2}

    print(time.time())

    # Version 1: short string key.
    for i in range(0, 100000000):
        v = lookup.get("cat")

    print(time.time())

    # Version 2: long string key.
    for i in range(0, 100000000):
        v = lookup.get("anextremelylongstringkey")

    print(time.time())

Output: PyPy 100 million iterations

    1412466713.551
    1412466715.161    Get short key = 1.61 s
    1412466716.911    Get long key  = 1.75 s

Output: Python3 10 million iterations

    1412467011.994969
    1412467014.838859    Short = 2.84 s
    1412467017.745185    Long  = 2.91 s
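As an alternative, the same comparison can be sketched with the standard timeit module, which handles the timing loop and makes it easy to repeat a measurement. The time_get helper and the iteration counts below are assumptions chosen for this sketch, not part of the original test. The lambda adds call overhead to both versions, so absolute numbers will differ from the program above.

```python
import timeit

lookup = {"cat": 1, "anextremelylongstringkey": 2}


def time_get(key, number=200_000, repeat=3):
    """Return the best of `repeat` timings, each covering `number` get() calls."""
    timer = timeit.Timer(lambda: lookup.get(key))
    return min(timer.repeat(repeat=repeat, number=number))


short_s = time_get("cat")
long_s = time_get("anextremelylongstringkey")
print(f"short key: {short_s:.4f} s")
print(f"long key:  {long_s:.4f} s")
```

Taking the minimum of several runs reduces noise from other processes. Results will vary by machine and interpreter, and the gap between the two keys may be too small to see.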
Results. The short key is faster to look up in the dictionary, but the margin is small: the long key is about 9 percent slower on PyPy and about 2 percent slower on Python 3. The hashing method in Python 3 is not free: it too requires some time. In large collections, or dictionaries with collisions, this time is less significant than in this test.

Thus: Using the shortest unique keys possible in a dictionary can slightly improve lookup performance.

Analysis: This is not a primary consideration, but performance tips like this one can help us develop better programs over time.
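The elapsed times in the output can be checked with simple arithmetic on the printed timestamps. This sketch copies the timestamps from the two runs above; the elapsed helper is a hypothetical name used only here.

```python
# Timestamps printed by the test program, copied from the output above.
pypy = [1412466713.551, 1412466715.161, 1412466716.911]
python3 = [1412467011.994969, 1412467014.838859, 1412467017.745185]


def elapsed(stamps):
    """Return (short, long) durations in seconds from three timestamps."""
    start, middle, end = stamps
    return middle - start, end - middle


for name, stamps in (("PyPy", pypy), ("Python3", python3)):
    short_s, long_s = elapsed(stamps)
    extra = (long_s - short_s) / short_s * 100
    print(f"{name}: short = {short_s:.2f} s, long = {long_s:.2f} s, "
          f"long key is {extra:.1f}% slower")
```

The relative difference is modest in both runs, which fits the point that key length is not a primary consideration.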

Summary. Dictionaries are one of the most important types in computer languages. Much of our information technology relies on hash codes and dictionaries. And with shorter string keys, lookup performance is likely to improve slightly.
© 2007-2020 Sam Allen.
Dot Net Perls