Everything you have to know when using Dictionaries in Python
Practical and Advanced Tips on Python's __dict__
Dictionaries are considered a keystone of Python and every programming language. Python’s Standard Library is especially helpful, providing ready-to-use mapping that can be very effective when used the correct way. So in this article, we will discuss what makes Python's Dictionaries special.
Dict Comprehension
The syntax of list comprehension has been adapted to dictionaries, a dictComp builds a dict instance by taking the pairs {key: value} from any iterable.
dial_codes = [
(55, 'Brazil'),
(86, 'China'),
(91, 'India'),
(62, 'Indonesia'),
(81, 'Japan'),
(7, 'Russia'),
(1, 'United States'),
]
country_dial = {country: code for code, country in dial_codes}
print(country_dial)
# {'Brazil': 55, 'China': 86, 'India': 91, 'Indonesia': 62, 'Japan': 81, 'Russia': 7, 'United States': 1}
code_less_than_70 = {code: country.upper() for country, code in sorted(country_dial.items()) if code < 70}
print(code_less_than_70)
# {55: 'BRAZIL', 62: 'INDONESIA', 7: 'RUSSIA', 1: 'UNITED STATES'}
Unpacking Mappings
Unpacking dictionaries is similar to unpacking a list only you will be unpacking a pair of {key: value} and it’s done with ** rather than *
There are a few key points to retain:
We can apply the Operator ** to more than one argument in a function call**.**
Duplicate keyword arguments are forbidden in function calls, so make sure your unpacked dictionary keys are unique.
The Operator ** can be used inside a dict literal also multiple times.
def dump(**kwargs):
return kwargs
print(dump(**{'x': 1}, y=2, **{'z': 3}))
# {'x': 1, 'y': 2, 'z': 3}
print(dump(**{'x': 1}, y=2, **{'z': 3, 'x': 6}))
# TypeError: dump() got multiple values for keyword argument 'x'
print({'a': 0, **{'x': 1}, 'y': 2, **{'z': 3, 'x': 4}}
# Duplicate keys are allowed here and they get overwritten by the last occurence
#{'a': 0, 'x': 4, 'y': 2, 'z': 3}
Merging dictionaries
Dictionaries support the Set Union Operators
| Operator creates a new mapping when merging**.**
|= Operator updates the first operand when merging.
d1 = {'a': 1, 'b': 3}
d2 = {'a': 2, 'b': 4, 'c': 6}
print(d1 | d2)
# overwriting the last occurence {'a': 2, 'b': 4, 'c': 6}
d1 |= d2
print(d1)
# {'a': 2, 'b': 4, 'c': 6}
Automatic Handling of Missing Keys
Imagine a scenario where a program tries to access a value in a dict by its key, only to not find it. There are two ways to handle this.
Use a defaultdict
A defaultdict instance creates items with a default value on demand whenever a missing key is searched using the d[k] syntax.
from collections import defaultdict
# Function to return a default values for keys that is not present
def def_value():
return "This guest did not check in yet"
# Defining the dict
d = defaultdict(def_value)
d["guest_1"] = "Adam"
d["guest_2"] = "Eve"
print(d["guest_1"]) # Adam
print(d["guest_2"]) # Eve
print(d["guest_3"]) # This guest did not check in yet
Implement the __missing__ method
We can also implement the **missing** method in dict or any other mapping type.
class GuestDict(dict):
def __missing__(self, key):
return 'This guest did not check in yet'
x = {"guest_1": "Adam", "guest_2": "Eve"}
guest_dict = GuestDict(x)
# Try accessing missing key:
print(guest_dict['guest_3'])
# This guest did not check in yet
Dictionary Views
The dict instance methods .keys(), .values(), and .items() return instances of classes called dict_keys, dict_values, and dict_items that are read-only but are dynamic proxies of the real dict*.*
If the source dict is updated, you can immediately see the changes through an existing view.
d = {'a': 5, 'b': 2, 'c': 0, 'd': 7}
v = d.values()
print('values before before updating : ', v)
# values before before updating : dict_values([5, 2, 0, 7])
d['e'] = 5
print('Dictionary after adding new value', d)
# Dictionary after adding new value {'a': 5, 'b': 2, 'c': 0, 'd': 7, 'e': 5}
# dynamic view object is updated automatically
print('value of the updated dictionary: ', v)
# value of the updated dictionary: dict_values([5, 2, 0, 7, 5])
Different flavors of dict
There are different variations of dict that are ready to use and we can leverage them to gain time.
collections.OrderedDict
These are regular dictionaries but have some extra capabilities relating to ordering operations, you can find more about them here.
collections.ChainMap
A ChainMap instance holds a list of mappings that can be searched as one.
The lookup is performed on each input mapping in the order it appears in the constructor call, and succeeds as soon as the key is found in one of those mappings.
baseline = {'music': 'bach', 'art': 'rembrandt'}
adjustments = {'art': 'van gogh', 'opera': 'carmen'}
print(list(ChainMap(adjustments, baseline)))
# ['music', 'art', 'opera']
collections.Counter
A mapping that holds an integer count for each key. Updating an existing key adds to its count.
ct = collections.Counter('abracadabra')
print(ct)
# Counter({'a': 5, 'b': 2, 'r': 2, 'c': 1, 'd': 1})
ct.update('aaaaazzz')
print(ct)
# Counter({'a': 10, 'z': 3, 'b': 2, 'r': 2, 'c': 1, 'd': 1})
print(ct.most_common(3))
# [('a', 10), ('z', 3), ('b', 2)]
shelve.Shelf
The shelve module(Naming is inspired by the fact that Pickle jars are stored on shelves) in the standard library provides persistent storage for a mapping of string keys to Python objects serialized in the Pickle binary format.
UserDict
The UserDict has some methods that are ready to use rather than having to implement or reinvent the wheel by extending the dict class, find more about it here.
MappingProxyType
this is a wrapper class that given a mapping type returns a mappingproxy instance that is read-only but mirrors the changes from the original mapping.
from types import MappingProxyType
d = {1: 'A'}
d_proxy = MappingProxyType(d)
print(d_proxy)
# mappingproxy({1: 'A'})
print(d_proxy[1])
# "A"
d[2] = 'B'
print(d_proxy)
# mappingproxy({1: 'A', 2: 'B'})
print(d_proxy[2])
# "B"
The mechanics of a Python Dictionary
Python’s dict implementation is very efficient as it’s based on a hash table but there are a few things to keep in mind.
Keys must be hashable objects. They must implement proper **hash** and eq* * methods.
Item access by key is very fast. A dict may have millions of keys, but Python can locate a key directly by computing the hash code of the key and deriving an index offset into the hash table.
It's better to avoid creating dict instance attributes outside of the __init__ method since it saves memory.
Dicts inevitably have a significant memory overhead.
Further Reading
The full article is on my Blog.
The Python Standard Library Documentation, “collections — Container datatypes”, includes examples and practical recipes with several mapping types.
Python's Standard Library Documentation on Shelves.