Pandas is a popular Python data library whose powerful API lets developers analyze and manipulate data.
One of the most common tasks when working with Python and Pandas is converting a dictionary into a DataFrame. This can be extremely useful when you’d like to quickly analyze or visualize data that is currently stored in a dictionary data structure.
3 Ways to Convert a Dictionary to DataFrame
pandas.DataFrame.from_dict method: This allows you to create a DataFrame from a dict of array-like or dict objects when the dictionary keys correspond to the DataFrame columns.
orient='index' option: When calling from_dict, this option ensures each key-value pair is parsed as a DataFrame row.
orient='tight' option: This is most useful for creating MultiIndex DataFrames. It assumes that the input dictionary has the following keys: 'index', 'columns', 'data', 'index_names' and 'column_names'.
In this article, we’ll explore different ways to convert a Python dictionary into a Pandas DataFrame based on how the data is structured and stored originally in a dict.
Convert a Dictionary Into a DataFrame
To convert a Python dictionary to a Pandas DataFrame, we can use the pandas.DataFrame.from_dict method, which constructs a DataFrame from a dict of array-like or dict objects.
Let’s create an example Python dictionary with some dummy values that we’ll continue using in the next few sections. This will help us demonstrate some interesting ways for converting it into a Pandas DataFrame.
users = {
    'first_name': ['John', 'Andrew', 'Maria', 'Helen'],
    'last_name': ['Brown', 'Purple', 'White', 'Blue'],
    'is_enabled': [True, False, False, True],
    'age': [25, 48, 76, 19]
}
In this example dictionary, the keys correspond to DataFrame columns, while each element in a list is the value of one row for that particular column. Because orient='columns' is the default, specifying it is optional.
import pandas as pd
users = {
    'first_name': ['John', 'Andrew', 'Maria', 'Helen'],
    'last_name': ['Brown', 'Purple', 'White', 'Blue'],
    'is_enabled': [True, False, False, True],
    'age': [25, 48, 76, 19]
}
df = pd.DataFrame.from_dict(users)
We’ve just created a Pandas DataFrame using a Python dictionary.
print(df)
  first_name last_name  is_enabled  age
0       John     Brown        True   25
1     Andrew    Purple       False   48
2      Maria     White       False   76
3      Helen      Blue        True   19
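As a quick check of the default behavior, the sketch below builds the same DataFrame three ways: with from_dict and no orient, with orient='columns' spelled out, and with the plain pd.DataFrame constructor. The variable names (df_default, df_explicit, df_constructor) are only illustrative.

```python
import pandas as pd

# Column-oriented dictionary: each key is a column, each list holds the row values.
users = {
    'first_name': ['John', 'Andrew', 'Maria', 'Helen'],
    'last_name': ['Brown', 'Purple', 'White', 'Blue'],
    'is_enabled': [True, False, False, True],
    'age': [25, 48, 76, 19],
}

# orient defaults to 'columns', so these two calls are equivalent.
df_default = pd.DataFrame.from_dict(users)
df_explicit = pd.DataFrame.from_dict(users, orient='columns')

# For this layout, the regular constructor produces the same result.
df_constructor = pd.DataFrame(users)

print(df_default.equals(df_explicit))     # True
print(df_default.equals(df_constructor))  # True
```

For column-oriented dictionaries, either spelling works; from_dict becomes the more convenient choice once a different orient is needed.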
This approach applies only when the dictionary is structured so that every key corresponds to a DataFrame column. But what happens if we have a different structure?
Converting Dictionary to DataFrame With Orient=‘Index’
Now, let’s assume that we have a dictionary whose keys correspond to the rows of the DataFrame we’d like to create.
users = {
    'row_1': ['John', 'Brown', True, 25],
    'row_2': ['Andrew', 'Purple', False, 48],
    'row_3': ['Maria', 'White', False, 76],
    'row_4': ['Helen', 'Blue', True, 19],
}
We’ll have to use the orient='index' option so that every key-value pair in our dictionary is parsed as a DataFrame row. When using orient='index' with list values, we should also pass the column names through the columns argument of the from_dict() method; otherwise, the columns are labeled with default integers:
import pandas as pd
users = {
    'row_1': ['John', 'Brown', True, 25],
    'row_2': ['Andrew', 'Purple', False, 48],
    'row_3': ['Maria', 'White', False, 76],
    'row_4': ['Helen', 'Blue', True, 19],
}
cols = ['first_name', 'last_name', 'is_enabled', 'age']
df = pd.DataFrame.from_dict(users, orient='index', columns=cols)
And once again, we managed to construct a Pandas DataFrame out of a Python dictionary, this time by parsing every key-value pair as a DataFrame row:
print(df)
      first_name last_name  is_enabled  age
row_1       John     Brown        True   25
row_2     Andrew    Purple       False   48
row_3      Maria     White       False   76
row_4      Helen      Blue        True   19
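A side note on where column names come from under orient='index'. The sketch below, which uses a shortened two-row version of the data, shows two cases: list values without the columns argument, and nested dictionaries whose inner keys supply the column names. The names users_lists, users_nested, df_no_cols and df_nested are illustrative only.

```python
import pandas as pd

# Row-oriented dictionary with list values and no column names supplied.
users_lists = {
    'row_1': ['John', 'Brown', True, 25],
    'row_2': ['Andrew', 'Purple', False, 48],
}
df_no_cols = pd.DataFrame.from_dict(users_lists, orient='index')
# Without the columns argument, columns get default integer labels.
print(df_no_cols.columns.tolist())  # [0, 1, 2, 3]

# Row-oriented dictionary of dictionaries: inner keys become the column names.
users_nested = {
    'row_1': {'first_name': 'John', 'age': 25},
    'row_2': {'first_name': 'Andrew', 'age': 48},
}
df_nested = pd.DataFrame.from_dict(users_nested, orient='index')
print(df_nested.columns.tolist())  # ['first_name', 'age']
print(df_nested.index.tolist())    # ['row_1', 'row_2']
```

When each row is already a dictionary, the columns argument is therefore unnecessary, since the inner keys carry the names.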
As you may have noticed, the dictionary keys also became the index of the newly created DataFrame. If you wish to get rid of it, you can do so by running the following command:
df.reset_index(drop=True, inplace=True)
And the index should now be reset:
print(df)
  first_name last_name  is_enabled  age
0       John     Brown        True   25
1     Andrew    Purple       False   48
2      Maria     White       False   76
3      Helen      Blue        True   19
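Dropping the index discards the original dictionary keys. If they carry meaning, one option is to keep them as a regular column instead. The sketch below is a minimal example of that, using a two-row version of the data; the column name row_id is a hypothetical choice, not something from the original example.

```python
import pandas as pd

users = {
    'row_1': ['John', 'Brown', True, 25],
    'row_2': ['Andrew', 'Purple', False, 48],
}
cols = ['first_name', 'last_name', 'is_enabled', 'age']
df = pd.DataFrame.from_dict(users, orient='index', columns=cols)

# reset_index() without drop=True moves the old index into a column named 'index'.
# Renaming it to 'row_id' (a hypothetical name) makes its meaning explicit.
df_with_key = df.reset_index().rename(columns={'index': 'row_id'})

print(df_with_key.columns.tolist())
# ['row_id', 'first_name', 'last_name', 'is_enabled', 'age']
```

This version returns a new DataFrame rather than modifying df in place, so the original row labels remain available on df.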
Convert Dictionary to DataFrame Using Orient=‘Tight’
As of Pandas v1.4.0, you can also use the orient='tight' option to construct Pandas DataFrames from Python dictionaries. This option assumes that the input dictionary has the following keys: 'index', 'columns', 'data', 'index_names' and 'column_names'.
For example, the following dictionary matches this requirement:
import pandas as pd
data = {
    'index': [('a', 'b'), ('a', 'c')],
    'columns': [('x', 1), ('y', 2)],
    'data': [[1, 3], [2, 4]],
    'index_names': ['n1', 'n2'],
    'column_names': ['z1', 'z2']
}
df = pd.DataFrame.from_dict(data, orient='tight')
print(df)
z1     x  y
z2     1  2
n1 n2
a  b   1  3
   c   2  4
This final approach is typically useful for constructing MultiIndex DataFrames.
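The 'tight' layout is also what DataFrame.to_dict(orient='tight') produces, so it can serve as a round-trip format for MultiIndex DataFrames. The sketch below assumes Pandas 1.4.0 or later; the names tight and rebuilt are illustrative.

```python
import pandas as pd

data = {
    'index': [('a', 'b'), ('a', 'c')],
    'columns': [('x', 1), ('y', 2)],
    'data': [[1, 3], [2, 4]],
    'index_names': ['n1', 'n2'],
    'column_names': ['z1', 'z2'],
}
df = pd.DataFrame.from_dict(data, orient='tight')

# Both the row index and the column index are MultiIndex objects.
print(type(df.index).__name__, type(df.columns).__name__)

# Export the DataFrame back to the same "tight" dictionary layout.
tight = df.to_dict(orient='tight')
print(sorted(tight.keys()))

# Rebuild a DataFrame from the exported dictionary and compare with the original.
rebuilt = pd.DataFrame.from_dict(tight, orient='tight')
print(rebuilt.equals(df))
```

Because the index and column level names travel inside the dictionary, the rebuilt DataFrame keeps the same MultiIndex structure as the original.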
Common Methods to Convert Dictionary to DataFrame
Converting a Python dictionary into a Pandas DataFrame is a simple and straightforward process. By using the pd.DataFrame.from_dict method along with the correct orient option according to the way your original dictionary is structured, you can easily transform your data into a DataFrame to perform data analysis or transformation using the Pandas API.