Monday, July 17, 2023

How to Convert a Dictionary Into a Pandas DataFrame - Built In - Dictionary

Pandas is a popular Python data library that provides a powerful API for analyzing and manipulating data.

One of the most common tasks when working with Python and Pandas is converting a dictionary into a DataFrame. This is useful when you’d like to run a quick analysis or build a visualization from data that is currently stored in a dictionary.

3 Ways to Convert a Dictionary to DataFrame

  1. pandas.DataFrame.from_dict Method: This creates a DataFrame from a dictionary whose values are array-likes or dictionaries. By default, the dictionary keys become the DataFrame columns.
  2. orient='index' Option: When calling from_dict, this option ensures every key-value pair is parsed as a DataFrame row.
  3. orient='tight' Option: This is most useful for creating MultiIndex DataFrames. It assumes that the input dictionary has the following keys: 'index', 'columns', 'data', 'index_names' and 'column_names'.

In this article, we’ll explore different ways to convert a Python dictionary into a Pandas DataFrame based on how the data is structured and stored originally in a dict.

Convert a Dictionary Into a DataFrame

To convert a Python dictionary to a Pandas DataFrame, we can use the pandas.DataFrame.from_dict method, which constructs a DataFrame from a dictionary whose values are array-likes or dictionaries.
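The dictionary-of-dictionaries input mentioned above may be less familiar than the dictionary-of-lists case. As a minimal sketch (the names and values here are made up for illustration), the outer keys become columns and the inner keys become row labels:

```python
import pandas as pd

# Hypothetical nested dictionary: outer keys are columns, inner keys are row labels
nested = {
  'age': {'john': 25, 'maria': 76},
  'is_enabled': {'john': True, 'maria': False},
}

df_nested = pd.DataFrame.from_dict(nested)

print(df_nested.index.tolist())    # ['john', 'maria']
print(df_nested.columns.tolist())  # ['age', 'is_enabled']
```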

Let’s create an example Python dictionary with some dummy values that we’ll continue using in the next few sections. This will help us demonstrate some interesting ways for converting it into a Pandas DataFrame.

users = {
  'first_name': ['John', 'Andrew', 'Maria', 'Helen'],
  'last_name': ['Brown', 'Purple', 'White', 'Blue'],
  'is_enabled': [True, False, False, True],
  'age': [25, 48, 76, 19]
}

In this example dictionary, the keys correspond to DataFrame columns, while every element in each list is the value of one row for that column. This matches orient='columns', which is the default, so specifying it is optional.

import pandas as pd 


users = {
  'first_name': ['John', 'Andrew', 'Maria', 'Helen'],
  'last_name': ['Brown', 'Purple', 'White', 'Blue'],
  'is_enabled': [True, False, False, True],
  'age': [25, 48, 76, 19]
}


df = pd.DataFrame.from_dict(users)

We’ve just created a Pandas DataFrame using a Python dictionary.

print(df)

  first_name last_name  is_enabled  age
0       John     Brown        True   25
1     Andrew    Purple       False   48
2      Maria     White       False   76
3      Helen      Blue        True   19
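Since 'columns' is the default orientation, passing it explicitly should give the same result as leaving it out. The short sketch below checks this, and also compares against the plain pd.DataFrame constructor, which accepts the same column-oriented dictionary. The variable names are illustrative only.

```python
import pandas as pd

users = {
  'first_name': ['John', 'Andrew', 'Maria', 'Helen'],
  'last_name': ['Brown', 'Purple', 'White', 'Blue'],
  'is_enabled': [True, False, False, True],
  'age': [25, 48, 76, 19]
}

# Default orientation (keys become columns)
df_default = pd.DataFrame.from_dict(users)

# The same call with the orientation spelled out
df_explicit = pd.DataFrame.from_dict(users, orient='columns')

# The regular constructor also treats dictionary keys as columns
df_constructor = pd.DataFrame(users)

print(df_default.equals(df_explicit))     # True
print(df_default.equals(df_constructor))  # True
```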

This approach only applies when the dictionary is structured so that every key corresponds to a DataFrame column. But what happens if we have a different structure?

Converting Dictionary to DataFrame With Orient=‘Index’

Now, let’s assume that we have a dictionary whose keys correspond to the rows of the DataFrame we’d like to create.

users = {
  'row_1': ['John', 'Brown', True, 25],
  'row_2': ['Andrew', 'Purple', False, 48],
  'row_3': ['Maria', 'White', False, 76],
  'row_4': ['Helen', 'Blue', True, 19],
}

We’ll have to use the orient='index' option so that every key-value pair in our dictionary is parsed as a DataFrame row. Because the keys now label the rows, the column names have to be supplied separately through the columns argument of from_dict(); without it, the columns are labeled with integers:

import pandas as pd

users = {
  'row_1': ['John', 'Brown', True, 25],
  'row_2': ['Andrew', 'Purple', False, 48],
  'row_3': ['Maria', 'White', False, 76],
  'row_4': ['Helen', 'Blue', True, 19],
}

cols = ['first_name', 'last_name', 'is_enabled', 'age']
df = pd.DataFrame.from_dict(users, orient='index', columns=cols)

And once again, we managed to construct a Pandas DataFrame out of a Python dictionary, this time by parsing every key-value pair as a DataFrame row:

print(df)

      first_name last_name  is_enabled  age
row_1       John     Brown        True   25
row_2     Andrew    Purple       False   48
row_3      Maria     White       False   76
row_4      Helen      Blue        True   19
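To see why naming the columns matters, here is a small sketch of the same call without the columns argument. It uses a shortened version of the dictionary above, and the variable name df_unnamed is only for illustration. The columns fall back to integer positions, which can then be renamed afterwards if needed.

```python
import pandas as pd

users = {
  'row_1': ['John', 'Brown', True, 25],
  'row_2': ['Andrew', 'Purple', False, 48],
}

# No columns argument: the columns are labeled by position
df_unnamed = pd.DataFrame.from_dict(users, orient='index')
print(df_unnamed.columns.tolist())  # [0, 1, 2, 3]

# Column names can still be assigned after construction
df_unnamed.columns = ['first_name', 'last_name', 'is_enabled', 'age']
print(df_unnamed.columns.tolist())
# ['first_name', 'last_name', 'is_enabled', 'age']
```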

As you may have noticed, every key also became an index label in the newly populated DataFrame. If you wish to get rid of these labels, you can do so by running the following command:

df.reset_index(drop=True, inplace=True)

And the index should now be reset:

print(df)


  first_name last_name  is_enabled  age
0       John     Brown        True   25
1     Andrew    Purple       False   48
2      Maria     White       False   76
3      Helen      Blue        True   19
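If the dictionary keys carry information worth keeping, one alternative is to call reset_index() without drop=True, which moves the old index into a regular column. The sketch below assumes that is what you want; the column name row_id is a made-up label, not something Pandas assigns.

```python
import pandas as pd

users = {
  'row_1': ['John', 'Brown', True, 25],
  'row_2': ['Andrew', 'Purple', False, 48],
}

cols = ['first_name', 'last_name', 'is_enabled', 'age']
df = pd.DataFrame.from_dict(users, orient='index', columns=cols)

# Without drop=True the old index becomes a column named 'index';
# rename it to something more descriptive (hypothetical name)
df_with_key = df.reset_index().rename(columns={'index': 'row_id'})

print(df_with_key.columns.tolist())
# ['row_id', 'first_name', 'last_name', 'is_enabled', 'age']
print(df_with_key['row_id'].tolist())
# ['row_1', 'row_2']
```

Note that this version returns a new DataFrame instead of modifying df in place.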

Convert Dictionary to DataFrame Using Orient=‘Tight’ 

As of Pandas v1.4.0, you can also use the orient='tight' option to construct Pandas DataFrames from Python dictionaries. This option assumes that the input dictionary has the following keys: 'index', 'columns', 'data', 'index_names' and 'column_names'.

For example, the following dictionary matches this requirement:

data = {
  'index': [('a', 'b'), ('a', 'c')],
  'columns': [('x', 1), ('y', 2)],
  'data': [[1, 3], [2, 4]],
  'index_names': ['n1', 'n2'],
  'column_names': ['z1', 'z2']
}

df = pd.DataFrame.from_dict(data, orient='tight')


print(df)

z1     x  y
z2     1  2
n1 n2      
a  b   1  3
   c   2  4

This final approach is typically useful for constructing MultiIndex DataFrames.
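The 'tight' layout is also what DataFrame.to_dict(orient='tight') produces, so the two calls can be paired to serialize a MultiIndex DataFrame and rebuild it. The following is a minimal round-trip sketch, assuming Pandas 1.4.0 or later; the variable names exported and restored are illustrative.

```python
import pandas as pd

data = {
  'index': [('a', 'b'), ('a', 'c')],
  'columns': [('x', 1), ('y', 2)],
  'data': [[1, 3], [2, 4]],
  'index_names': ['n1', 'n2'],
  'column_names': ['z1', 'z2']
}

df = pd.DataFrame.from_dict(data, orient='tight')

# Export back to the 'tight' layout, then rebuild a DataFrame from it
exported = df.to_dict(orient='tight')
restored = pd.DataFrame.from_dict(exported, orient='tight')

print(list(restored.index.names))    # ['n1', 'n2']
print(list(restored.columns.names))  # ['z1', 'z2']
print(restored.values.tolist())      # [[1, 3], [2, 4]]
```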


Common Methods to Convert Dictionary to DataFrame 

Converting a Python dictionary into a Pandas DataFrame is a simple and straightforward process. By using the pd.DataFrame.from_dict method along with the correct orient option according to the way your original dictionary is structured, you can easily transform your data into a DataFrame to perform data analysis or transformation using the Pandas API.
