HomeSearch

Python Dictionary Order Benchmark

This Python example page tests the performance of dictionaries. It determines that keys added first are faster to look up.

Dictionary order.

We add many keys to a dictionary. Some are accessed all the time in our Python program. But some of them are rarely accessed.

With an optimization,

we can place keys that are used most often first. We add them first. In this benchmark, we determine the performance results of this optimization.

Example programs.

Here are the example programs. We benchmark each approach individually. Each program adds 150001 string keys to a dictionary.Dictionary

Program 1: This version adds the key we access in the loop ("cat") last, after all the other keys were added.

In

Program 2: This adds the key we access repeatedly ("cat") first, before any other keys are added.

Result: The programs are executed many times and the timings are displayed in the "results" section.

Python program that benchmarks dictionary order, last key import time data = {} # Add 150000 keys to the dictionary. for i in range(50000): data["cat" + str(i)] = i data["ca" + str(i)] = i data["c" + str(i)] = i # Add tested key last. data["cat"] = 100 print(time.time()) # Loop up the key that was added last to the dictionary. for i in range(10000000): if "cat" not in data: break print(time.time()) Python program that benchmarks dictionary order, first key import time data = {} # Add tested key first. data["cat"] = 100 # Add 150000 keys to the dictionary. for i in range(50000): data["cat" + str(i)] = i data["ca" + str(i)] = i data["c" + str(i)] = i print(time.time()) # Loop up the key that was added first to the dictionary. for i in range(10000000): if "cat" not in data: break print(time.time()) Output ::: LAST ::: 1478551029.5272546 1478551030.2166858 0.69 s 1478551038.066783 1478551038.7393672 0.67 s 1478551239.0341876 1478551239.7217193 0.69 s [Average = 0.68 s] ::: FIRST ::: 1478551063.879777 1478551064.5186193 0.64 s 1478551074.2162373 1478551074.82664 0.61 s 1478551220.501552 1478551221.1421218 0.64 s [Average = 0.63 s]

Benchmark results.

Let us consider the results of this benchmark. When we access the "cat" string that was added first, we finish the program in 0.63 seconds.

But: When we access the "cat" string added last, we consistently require more time, about 0.68 seconds.

Benchmark analysis.

Why does dictionary add order matter? In dictionary implementations, two strings with the same hash code are stored together.

Then: A linear search through entries that have the same hash code must be done. The position of the match here makes a difference.

Analysis, continued.

When we add a key first, it is faster to locate it when hashes collide. To get optimal performance for a certain key, we can add it first (or earlier).

A summary.

This optimization will not make a big difference for many programs. But it is useful to learn and understand all aspects of dictionary performance.
Home
Dot Net Perls
© 2007-2019 Sam Allen. All rights reserved. Written by Sam Allen, info@dotnetperls.com.