Wednesday, July 20, 2022

How to use Python dictionaries - InfoWorld - Dictionary

Programming languages all come with a variety of data structures, each suited to specific kinds of jobs. Among the data structures built into Python, the dictionary, or Python dict, stands out. A Python dictionary is a fast, versatile way to store and retrieve data by way of a name or even a more complex object type, rather than just an index number.

A Python dictionary consists of keys, each of which is an object like a string or an integer. Each key is associated with a value, which can be any Python object. You use a key to obtain its related value, and the lookup time for a key is effectively constant on average, no matter how large the dictionary grows. In other languages, this type of data structure is sometimes called a hash map or associative array.

In this article, we'll walk through the basics of Python dictionaries, including how to use them, the scenarios where they make sense, and some common issues and pitfalls to be aware of.

Working with Python dictionaries

Let's begin with a simple example of a Python dictionary:

movie_years = {
    "2001: a space odyssey": 1968,
    "Blade Runner": 1982
}

In this dictionary, the movie names are the keys, and the release years are the values. The structure {key: value, key: value ... } can be repeated indefinitely.

The example we see here is called a dictionary literal—a dictionary structure that is hard-coded into the program's source. It's also possible to create or modify dictionaries programmatically, as you'll see later on.

Keys in dictionaries

A Python dictionary key can be nearly any Python object. I say "nearly" because the object in question must be hashable, meaning that it must have a hash value (the output of its __hash__() method) that does not change over its lifetime, and which can be compared to other objects.

Python's built-in mutable containers aren't hashable, because their contents (and so any hash computed from them) can change over their lifetime, so they can't be used as keys. For instance, a list can't be a key, because elements can be added to or removed from a list. Likewise, a dictionary itself can't be a key for the same reason. But a tuple can be a key, as long as everything inside it is also hashable, because a tuple is immutable and so has a consistent hash across its lifetime.
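As a minimal sketch of the hashability rule, the snippet below uses a tuple as a key and then tries a list in the same role. The names coordinates, label, and list_key_rejected are made up for illustration.

```python
# Tuples of hashable items can serve as dictionary keys.
coordinates = {(0, 0): "origin", (1, 0): "east"}
label = coordinates[(1, 0)]

# A list is mutable and unhashable, so using one as a key raises TypeError.
try:
    invalid = {[0, 0]: "origin"}
    list_key_rejected = False
except TypeError:
    list_key_rejected = True

print(label)              # east
print(list_key_rejected)  # True
```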

Strings, numbers (integers and floats alike), tuples, and built-in singleton objects (True, False, and None) are all common types to use as keys.

A given key is unique to a given dictionary. Multiples of the same key aren't possible. If you want to have a key that points to multiple values, you'd use a structure like a list, a tuple, or even another dictionary as the value. (More about this shortly.)
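To illustrate key uniqueness, here is a small sketch with hypothetical data (watch_counts and movie_genres are invented names). Assigning to an existing key replaces its value rather than adding a second entry, and a list value is one way to hang several items off a single key.

```python
# Assigning to an existing key overwrites the old value.
watch_counts = {"Blade Runner": 1}
watch_counts["Blade Runner"] = 2
print(len(watch_counts))             # 1
print(watch_counts["Blade Runner"])  # 2

# To keep several values under one key, store a container as the value.
movie_genres = {"Blade Runner": ["science fiction"]}
movie_genres["Blade Runner"].append("noir")
print(movie_genres["Blade Runner"])  # ['science fiction', 'noir']
```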

Values in dictionaries

Values in dictionaries can be any Python object at all. Here are some examples of values:

example_values = {
    "integer": 32,
    "float": 5.5,
    "string": "hello world",
    "variable": some_var,
    "object": some_obj,
    "function_output": some_func(),
    "some_list": [1,2,3],
    "another_dict": {
        "Blade Runner": 1982
    }
}

Again, to store multiple values in a key, simply use a container type (a list, dictionary, or tuple) as the value. In the above example, the keys "some_list" and "another_dict" hold a list and a dictionary, respectively. This way, you can create nested structures of any depth needed.

Creating new dictionaries

You can create a new, empty dictionary by simply declaring:

new_dict = {}

You can also use the dict() built-in to create a new dictionary from a sequence of pairs:


new_dict = dict(
    (
        ("integer", 32), ("float", 5.5),
    )
)
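Two other common ways to call dict() are sketched below: pairing up two parallel sequences with zip(), and passing keyword arguments when the keys are valid Python identifiers. The variable names are illustrative only.

```python
names = ["integer", "float"]
values = [32, 5.5]

# zip() pairs each name with the value at the same position.
from_zip = dict(zip(names, values))

# Keyword arguments become string keys.
from_keywords = dict(integer=32, float=5.5)

print(from_zip)                   # {'integer': 32, 'float': 5.5}
print(from_zip == from_keywords)  # True
```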

Another way to build a dictionary is with a dictionary comprehension, where you specify keys and values from a sequence:


new_dict = {x:x+1 for x in range(3)}
# {0: 1, 1: 2, 2: 3}

Getting and setting dictionary keys and values

To retrieve a value from a dictionary, you use Python's indexing syntax:


example_values["integer"] # yields 32

# Get the year Blade Runner was released
blade_runner_year = movie_years["Blade Runner"]

If you have a container as a value, and you want to retrieve a nested value (that is, something from within the container), you can either access it directly by chaining index operations (if the value supports indexing), or by first assigning the container to an intermediate variable:


example_values["another_dict"]["Blade Runner"] # yields 1982
# or ...
another_dict = example_values["another_dict"]
another_dict["Blade Runner"]

# to access a property of an object stored in a dictionary:
example_values["object"].property

Setting a value in a dictionary is simple enough:


# Set a new movie and year
movie_years["Blade Runner 2049"] = 2017

Using .get() to safely retrieve dictionary values

If you try to retrieve a value using a key that doesn't exist in a given dictionary, you'll raise a KeyError exception. A common way to handle this sort of retrieval is to use a try/except block. A more elegant way to look for a key that might not be there is the .get() method.

The .get() method on a dictionary attempts to find a value associated with a given key. If no such value exists, it returns None or a default that you specify. In some situations you'll want to explicitly raise an error, but much of the time you'll just want to supply a sane default.


my_dict = {"a":1}

my_dict["b"] # raises a KeyError exception
my_dict.get("a") # returns 1
my_dict.get("b") # returns None
my_dict.get("b", 0) # returns 0, the supplied default
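For comparison, here is a rough sketch of the try/except approach mentioned above next to the .get() equivalent, plus a membership test with the in operator. The fallback value of 0 is an arbitrary choice for this example.

```python
my_dict = {"a": 1}

# The try/except approach: catch the KeyError and fall back to a default.
try:
    value = my_dict["b"]
except KeyError:
    value = 0

# The same result in one line with .get().
same_value = my_dict.get("b", 0)

# The "in" operator checks for a key without retrieving anything.
has_b = "b" in my_dict

print(value, same_value, has_b)  # 0 0 False
```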

When to use a Python dictionary

Using Python dictionaries makes the most sense under the following conditions:

  • You want to store objects and data using names, not just positions or index numbers. If you want to store elements so that you can retrieve them by their index number, use a list. Note that you can use integers as index keys, but this isn't quite the same as storing data in a list structure, which is optimized for actions like adding to the end of the list. (Dictionaries, as you'll see, have no "end" or "beginning" element as such.)
  • You need to find data and objects quickly by name. Dictionaries are optimized so that lookups for keys are almost always in constant time, regardless of the dictionary size. You can find an element in a list by its position in constant time, too, but you can't hunt for a specific element quickly—you have to iterate through a list to find a specific thing if you don't know its position.
  • The order of elements isn't as important as their presence. Again, if the ordering of the elements matters more than whether or not a given element exists in the collection, use a list. Also, as you'll note below, while dictionaries do preserve the order in which these elements are inserted, that's not the same as being able to jump directly to the nth element by position.
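The difference between hunting through a list and looking up a key can be sketched as follows. The helper year_from_list is hypothetical and exists only to show the linear scan a list requires.

```python
movie_list = [("2001: a space odyssey", 1968), ("Blade Runner", 1982)]
movie_years = dict(movie_list)

def year_from_list(title):
    """Scan the list pair by pair until the title matches."""
    for name, year in movie_list:
        if name == title:
            return year
    return None

# Both give the same answer, but the dictionary goes straight to the key
# instead of walking the whole collection.
print(year_from_list("Blade Runner"))   # 1982
print(movie_years.get("Blade Runner"))  # 1982
```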

Gotchas for values in dictionaries

There are a few idiosyncrasies worth noting about how values work in dictionaries.

First, if you use a variable name as a value, what's stored under that key is the value contained in the variable at the time the dictionary value was defined. Here's an example:


some_var = 128
example_values = {
    "variable": some_var,
    "function_output": some_func()
}

In this case, we set some_var to the integer 128 before defining the dictionary. The key "variable" would contain the value 128. But if we reassigned some_var after the dictionary was defined, the contents of the "variable" key would not change, because the dictionary stores the object that some_var referred to at that moment, not the name itself. (This rule also applies to Python lists and other container types in Python.)
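A short sketch of this behavior follows. It also shows one related case worth knowing: if the stored value is a mutable object such as a list, changes made to that same object are visible through the dictionary, since both names point at one object. The names shared_list and holder are invented for the example.

```python
some_var = 128
example_values = {"variable": some_var}

# Reassigning the name does not touch what the dictionary already holds.
some_var = 256
print(example_values["variable"])  # 128

# Mutating a shared object is different from reassigning a name.
shared_list = [1, 2]
holder = {"numbers": shared_list}
shared_list.append(3)
print(holder["numbers"])  # [1, 2, 3]
```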

A similar rule applies to how function outputs work as dictionary values. For the key "function_output", we have some_func(). This means when the dictionary is defined, some_func() is executed, and the returned value is used as the value for "function_output". But some_func() is not re-executed each time we access the key "function_output". That value will remain what it was when it was first created.

If we want to re-run some_func() every time we access that key, we need to take a different approach—one that also has other uses.

Calling function objects in dictionaries

Function objects can be stored in a dictionary as values. This lets us use dictionaries to execute one of a choice of functions based on some key—a common way to emulate the switch/case functionality found in other languages.

First, we store the function object in the dictionary, then we retrieve and execute it:


def run_func(a1, a2):
    ...
def reset_func(a1, a2):
    ...

my_dict = {
    "run": run_func,
    "reset": reset_func
}

command = "run"
# execute run_func
my_dict[command](x, y)
# or ...
cmd = my_dict[command]
cmd(x, y)

Note that we need to define the functions first, then list them in the dictionary.
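Here is one way the pattern might look as a complete, runnable sketch. The function bodies and the dispatch() helper are assumptions made for illustration; the point is that the function runs each time it is called through the dictionary, and .get() provides a fallback for unknown commands.

```python
def run_func(a1, a2):
    # Placeholder body, invented for this example.
    return f"running with {a1} and {a2}"

def reset_func(a1, a2):
    # Placeholder body, invented for this example.
    return f"resetting with {a1} and {a2}"

commands = {
    "run": run_func,
    "reset": reset_func,
}

def dispatch(command, a1, a2):
    """Look up the handler for a command and call it, if one exists."""
    handler = commands.get(command)
    if handler is None:
        return f"unknown command: {command}"
    return handler(a1, a2)

print(dispatch("run", 1, 2))    # running with 1 and 2
print(dispatch("pause", 1, 2))  # unknown command: pause
```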

Also, Python as of version 3.10 has a feature called structural pattern matching that resembles conventional switch/case statements. But in Python, it's meant to be used for matching against structures or combinations of types, not just single values. If you want to use a value to execute an action or just return another value, use a dictionary.

Iterating through dictionaries

If you need to iterate through a dictionary to inspect all of its keys or values, there are a few different ways to do it. The most common is to use a for loop on the dictionary—e.g., for item in the_dict. This yields up the keys in the dictionary, which can then be used to retrieve values if needed:


movie_years = {
    "2001: a space odyssey": 1968,
    "Blade Runner": 1982
}
for movie in movie_years:
    print(movie)

This loop would print "2001: a space odyssey", then "Blade Runner".

If we instead used the following:


for movie in movie_years:
    print(movie_years[movie])

we'd get 1968 and 1982. In this case, we're using the keys to obtain the values.

If we just want the values, we can iterate with the .values() method available on dictionaries:


for value in movie_years.values():
    print(value)

Finally, we can obtain both keys and values together by way of the .items() method:


for key, value in movie_years.items():
    print(key, value)

Ordering in Python dictionaries

Something you might notice when iterating through dictionaries is that the keys are generally returned in the order in which they are inserted.

This wasn't always the case. Before Python 3.6, items in a dictionary wouldn't be returned in any particular order if you iterated through them. Version 3.6 introduced a new and more efficient dictionary implementation in CPython, which retained insertion order for keys as a convenient side effect. As of Python 3.7, insertion order is a guaranteed part of the language.

Previously, Python offered the type collections.OrderedDict as a way to construct dictionaries that preserved insertion order. collections.OrderedDict is still available in the standard library, mainly because a lot of existing software uses it, and also because it supports methods that are still not available with regular dicts. For instance, it offers move_to_end() to reposition an existing key at either end of the dictionary, which regular dictionaries don't do. (Regular dictionaries have supported reversed() since Python 3.8.)
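As a brief sketch of what collections.OrderedDict still adds, the example below uses move_to_end() and popitem(last=False), and shows that equality between two OrderedDict objects takes order into account while equality between regular dictionaries does not. The variable names are illustrative.

```python
from collections import OrderedDict

recent = OrderedDict([("a", 1), ("b", 2), ("c", 3)])

recent.move_to_end("a")              # order is now b, c, a
oldest = recent.popitem(last=False)  # removes and returns the first pair
remaining_keys = list(recent)

print(oldest)          # ('b', 2)
print(remaining_keys)  # ['c', 'a']

# OrderedDict comparisons are order-sensitive; plain dict comparisons are not.
ordered_equal = OrderedDict([("a", 1), ("b", 2)]) == OrderedDict([("b", 2), ("a", 1)])
plain_equal = {"a": 1, "b": 2} == {"b": 2, "a": 1}
print(ordered_equal, plain_equal)  # False True
```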

Removing items from dictionaries

Sometimes you need to remove a key/value pair completely from a dictionary. For this, use the del statement:


del movie_years["Blade Runner"]

This removes the key/value pair {"Blade Runner": 1982} from our example at the beginning of the article.

Note that setting a key's value to None is not the same as removing the key from the dictionary. For instance, the command movie_years["Blade Runner"] = None would just set the value of that key to None; it wouldn't remove the key altogether.
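The difference can be checked with the in operator, as in this small sketch. It also shows the dictionary's .pop() method, which removes a key and returns its value, and which accepts a default so that a missing key doesn't raise KeyError.

```python
movie_years = {
    "2001: a space odyssey": 1968,
    "Blade Runner": 1982,
}

# Setting the value to None leaves the key in place.
movie_years["Blade Runner"] = None
present_after_none = "Blade Runner" in movie_years

# del removes the key and its value entirely.
del movie_years["Blade Runner"]
present_after_del = "Blade Runner" in movie_years

# .pop() with a default is safe even when the key is already gone.
removed = movie_years.pop("Blade Runner", None)

print(present_after_none, present_after_del, removed)  # True False None
```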

Finding keys by way of values

A common question with dictionaries is whether it's possible to find a key by looking up a value. The short answer is no—at least, not without iterating through the key/value pairs to find the right value (and thus the right key to go with it).

If you find yourself in a situation where you need to find keys by way of their values, as well as values by way of their keys, consider keeping two dictionaries, where one of them has the keys and values inverted. However, you can't do this if the values you're storing aren't hashable. In a case like that, you'll have to resort to iterating through the dictionary—or, better yet, finding a more graceful solution to the problem you're actually trying to solve.
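Both approaches might be sketched like this. The inverted dictionary assumes every value is hashable and unique (a repeated value would keep only the last key that used it), and titles_for_year is a hypothetical helper showing the iterate-and-compare fallback.

```python
movie_years = {
    "2001: a space odyssey": 1968,
    "Blade Runner": 1982,
}

# Build a second dictionary with keys and values swapped.
years_to_movies = {year: title for title, year in movie_years.items()}
print(years_to_movies[1982])  # Blade Runner

def titles_for_year(year):
    """Fallback: walk every key/value pair and collect the matches."""
    return [title for title, y in movie_years.items() if y == year]

print(titles_for_year(1968))  # ['2001: a space odyssey']
```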

Dictionaries vs. sets

Finally, Python has another data structure, the set, which superficially resembles a dictionary. Think of it as a dictionary with only keys, but no values. Its syntax is also similar to a dictionary:


movie_titles = {
    "2001: a space odyssey",
    "Blade Runner",
    "Blade Runner 2049"
}

However, sets are not for storing information associated with a given key. They're used mainly for storing hashable values in a way that can be quickly tested for their presence or absence. Also, sets don't preserve insertion order, because sets are implemented differently from dictionaries.
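A quick sketch of typical set usage follows: membership tests, plus difference and intersection against a second set. The seen set is invented for the example. Note also that empty braces create a dictionary, so an empty set has to be made with set().

```python
movie_titles = {
    "2001: a space odyssey",
    "Blade Runner",
    "Blade Runner 2049",
}
seen = {"Blade Runner", "Alien"}

has_blade_runner = "Blade Runner" in movie_titles
unseen = movie_titles - seen  # titles in movie_titles but not in seen
in_both = movie_titles & seen  # titles present in both sets

print(has_blade_runner)  # True
print(in_both)           # {'Blade Runner'}

# {} is an empty dictionary; use set() for an empty set.
print(type({}).__name__, type(set()).__name__)  # dict set
```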
