
Sort Pandas DataFrame: Examples and Tips

Pandas DataFrame is a powerful tool for data analysis in Python. It allows you to store and manipulate large datasets with ease. Sorting data is a common operation that is useful for exploring and visualizing data. In this tutorial, we will cover how to sort data in a Pandas DataFrame, including sorting by column, multiple columns, index, and more.

Want to quickly create Data Visualizations in Python?

PyGWalker is an open-source Python project that can speed up data analysis and visualization workflows directly within Jupyter Notebook-based environments.

PyGWalker turns your Pandas DataFrame (or Polars DataFrame) into a visual UI where you can drag and drop variables to create graphs with ease. Simply use the following code:

# Install first from a terminal: pip install pygwalker
import pygwalker as pyg
gwalker = pyg.walk(df)

You can run PyGWalker right now with these online notebooks:

And, don't forget to give us a ⭐️ on GitHub!

Run PyGWalker in Kaggle Notebook
Run PyGWalker in Google Colab
Give PyGWalker a ⭐️ on GitHub

What is a Pandas DataFrame?

A Pandas DataFrame is a two-dimensional table-like data structure that contains rows and columns. It can hold a variety of data types such as numbers, strings, and dates. You can think of it as a spreadsheet or a SQL table. It is a convenient way to store and manipulate data with Python.

How to Install Pandas in Python?

Before we dive into sorting a Pandas DataFrame, you need to make sure that you have Pandas installed on your system. You can do this by running the following command in your terminal or command prompt:

pip install pandas

This installs the latest version of Pandas available for your Python environment.

How to Create a Pandas DataFrame?

There are many ways to create a Pandas DataFrame. One of the most common ways is to create it from a dictionary of lists. Here's an example:

import pandas as pd
 
data = {'Name': ['John', 'Jane', 'Bob', 'Lisa'],
        'Age': [25, 30, 45, 23],
        'Salary': [50000, 60000, 80000, 40000]}
 
df = pd.DataFrame(data)
 
print(df)

Output:

   Name  Age  Salary
0  John   25   50000
1  Jane   30   60000
2   Bob   45   80000
3  Lisa   23   40000

In this example, we created a dictionary of three lists. Each key becomes a column name and each list supplies that column's values. We then passed the dictionary to the pd.DataFrame() constructor to create the DataFrame.

What is the Difference Between Sorting in Ascending and Descending Order?

Before we start sorting a Pandas DataFrame, it's important to understand the difference between sorting in ascending and descending order. Ascending order sorts values from lowest to highest; descending order sorts them from highest to lowest. Both sort_values() and sort_index() sort in ascending order by default.
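As a quick illustration of the two directions, here is a minimal sketch using a standalone Series of ages (the variable names are chosen for this example only):

```python
import pandas as pd

ages = pd.Series([25, 30, 45, 23], name="Age")

# ascending=True (the default) orders from lowest to highest
ascending = ages.sort_values(ascending=True)

# ascending=False orders from highest to lowest
descending = ages.sort_values(ascending=False)

print(ascending.tolist())   # [23, 25, 30, 45]
print(descending.tolist())  # [45, 30, 25, 23]
```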

How to Sort a Pandas DataFrame by Column?

Sorting a Pandas DataFrame by column is a common operation. You can use the sort_values() method to sort a DataFrame by a single column. Here's an example:

import pandas as pd
 
data = {'Name': ['John', 'Jane', 'Bob', 'Lisa'],
        'Age': [25, 30, 45, 23],
        'Salary': [50000, 60000, 80000, 40000]}
 
df = pd.DataFrame(data)
 
# sort by Age column in ascending order
df.sort_values('Age', ascending=True, inplace=True)
 
print(df)

Output:

   Name  Age  Salary
3  Lisa   23   40000
0  John   25   50000
1  Jane   30   60000
2   Bob   45   80000

In this example, we sorted the DataFrame by the "Age" column in ascending order using the sort_values() method. Passing ascending=True is optional here because ascending order is the default. Setting inplace=True modifies the original DataFrame instead of returning a sorted copy. Note that the original index labels (3, 0, 1, 2) travel with their rows.
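If you would rather keep the original DataFrame untouched, one alternative is to omit inplace and assign the result to a new variable. The sketch below also passes ignore_index=True (available in pandas 1.0 and later) to renumber the rows from 0; the name by_age_desc is just an illustrative choice:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Jane", "Bob", "Lisa"],
    "Age": [25, 30, 45, 23],
    "Salary": [50000, 60000, 80000, 40000],
})

# Without inplace=True, sort_values() returns a new DataFrame
# and leaves df unchanged. ignore_index=True resets the index to 0..n-1.
by_age_desc = df.sort_values("Age", ascending=False, ignore_index=True)

print(by_age_desc["Name"].tolist())  # ['Bob', 'Jane', 'John', 'Lisa']
print(df["Name"].tolist())           # ['John', 'Jane', 'Bob', 'Lisa']
```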

Can I Sort a Pandas DataFrame by Multiple Columns?

Yes, you can sort a Pandas DataFrame by multiple columns. You need to pass a list of column names to the sort_values() method. Here's an example:

import pandas as pd
 
data = {'Name': ['John', 'Jane', 'Bob', 'Lisa'],
        'Age': [25, 30, 45, 23],
        'Salary': [50000, 60000, 80000, 40000]}
 
df = pd.DataFrame(data)
 
# sort by Age column in ascending order, then by Salary column in descending order
df.sort_values(['Age', 'Salary'], ascending=[True, False], inplace=True)
 
print(df)

Output:

   Name  Age  Salary
3  Lisa   23   40000
0  John   25   50000
1  Jane   30   60000
2   Bob   45   80000

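In this example, we sorted the DataFrame by the "Age" column in ascending order, then by the "Salary" column in descending order. We passed a list of column names to the sort_values() method and a list of boolean values to the ascending parameter to specify the sorting direction for each column. The second column is only used to order rows that share the same value in the first column. Because every "Age" value in this data is unique, "Salary" never acts as a tie-breaker, and the output matches the single-column sort by "Age".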
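To see the secondary column take effect, here is a small sketch in which the ages are changed so that two pairs of people share the same age. The data is invented for illustration and is not the dataset used above:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Jane", "Bob", "Lisa"],
    "Age": [30, 25, 30, 25],  # deliberately repeated ages
    "Salary": [50000, 60000, 80000, 40000],
})

# Age ascending first; within each age group, Salary descending
result = df.sort_values(["Age", "Salary"], ascending=[True, False])

print(result["Name"].tolist())  # ['Jane', 'Lisa', 'Bob', 'John']
```

Within the 25-year-olds, Jane (60000) comes before Lisa (40000); within the 30-year-olds, Bob (80000) comes before John (50000).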

How to Sort a Pandas DataFrame by Index?

You can also sort a Pandas DataFrame by its index using the sort_index() method. Here's an example:

import pandas as pd
 
data = {'Name': ['John', 'Jane', 'Bob', 'Lisa'],
        'Age': [25, 30, 45, 23],
        'Salary': [50000, 60000, 80000, 40000]}
 
df = pd.DataFrame(data)
 
# sort by index in descending order
df.sort_index(ascending=False, inplace=True)
 
print(df)

Output:

   Name  Age  Salary
3  Lisa   23   40000
2   Bob   45   80000
1  Jane   30   60000
0  John   25   50000

In this example, we sorted the DataFrame by its index in descending order using the sort_index() method. The ascending parameter is set to False to sort in descending order.
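sort_index() is not limited to the default integer index. As a minimal sketch, assume a DataFrame whose index holds the names instead (a layout chosen only for this example); sorting the index then orders the rows by label:

```python
import pandas as pd

df = pd.DataFrame(
    {"Age": [25, 30, 45, 23]},
    index=["John", "Jane", "Bob", "Lisa"],
)

# String labels are compared alphabetically; ascending is the default
by_label = df.sort_index()

print(by_label.index.tolist())   # ['Bob', 'Jane', 'John', 'Lisa']
print(by_label["Age"].tolist())  # [45, 30, 25, 23]
```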

How to Sort a Pandas DataFrame by Date?

Sorting a Pandas DataFrame by date is a common operation in time series analysis. Convert the column to the datetime data type first, then sort it with sort_values() so the rows are ordered chronologically rather than as text. Here's an example:

import pandas as pd
 
data = {'Date': ['2022-01-01', '2022-01-02', '2022-01-03', '2022-01-04'],
        'Sales': [100, 200, 150, 300]}
 
df = pd.DataFrame(data)
 
# convert Date column to datetime data type
df['Date'] = pd.to_datetime(df['Date'])
 
# sort by Date column in ascending order
df.sort_values('Date', ascending=True, inplace=True)
 
print(df)

Output:

        Date  Sales
0 2022-01-01    100
1 2022-01-02    200
2 2022-01-03    150
3 2022-01-04    300

In this example, we created a DataFrame with a "Date" column and a "Sales" column. We used the pd.to_datetime() function to convert the "Date" column to the datetime data type. We then used the sort_values() method to sort the DataFrame by the "Date" column in ascending order. Because the sample rows were already in chronological order, the output order matches the input.
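The conversion step matters most when dates are stored as text in a format that does not sort chronologically. The sketch below uses invented month/day/year strings to compare the two behaviors; the format string passed to pd.to_datetime() is an assumption about how this sample data is written:

```python
import pandas as pd

raw = pd.DataFrame({
    "Date": ["1/2/2022", "12/31/2021", "1/10/2022"],
    "Sales": [200, 100, 300],
})

# Sorting the raw strings compares them character by character
as_text = raw.sort_values("Date")

# Converting to datetime first gives a true chronological order
parsed = raw.assign(Date=pd.to_datetime(raw["Date"], format="%m/%d/%Y"))
as_dates = parsed.sort_values("Date")

print(as_text["Sales"].tolist())   # [300, 200, 100]
print(as_dates["Sales"].tolist())  # [100, 200, 300]
```

As text, "1/10/2022" sorts before "1/2/2022", and the 2021 date lands last; as datetimes, the rows come out oldest first.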

Pandas DataFrame Sort Values

The sort_values() method is the primary method for sorting a Pandas DataFrame by its values. It can sort by a single column or by multiple columns, and it works on datetime columns as well as numeric and string columns. To sort by the index, use the sort_index() method instead.
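sort_values() also accepts a few optional parameters not covered above. As one hedged example, the na_position parameter controls where missing values are placed; the sketch below uses invented data with one missing age:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["John", "Jane", "Bob", "Lisa"],
    "Age": [25, None, 45, 23],  # Jane's age is missing (stored as NaN)
})

# By default, rows with missing values are placed last
default_order = df.sort_values("Age")

# na_position="first" moves them to the top instead
nan_first = df.sort_values("Age", na_position="first")

print(default_order["Name"].tolist())  # ['Lisa', 'John', 'Bob', 'Jane']
print(nan_first["Name"].tolist())      # ['Jane', 'Lisa', 'John', 'Bob']
```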

Conclusion

Sorting data in a Pandas DataFrame is an essential operation for data analysis and visualization. In this tutorial, we covered how to sort a Pandas DataFrame by column, multiple columns, index, and date. We also discussed the difference between sorting in ascending and descending order. By mastering these techniques, you will be able to manipulate data like a pro.