The Comprehensive Guide to Using Pandas DataFrame Append in Python

The Pandas DataFrame is an essential tool for data manipulation and analysis in Python, and you will often need to add new data to an existing DataFrame. The DataFrame append() method has long been the usual way to do this, although it was removed in pandas 2.0 in favor of pd.concat(). In this tutorial, we cover appending data to a Pandas DataFrame, including importing data, appending values, appending in a loop, and faster alternatives for larger workloads.

Want to quickly create data visualizations from a Pandas DataFrame with no code?

PyGWalker is a Python library for exploratory data analysis with visualization. PyGWalker can simplify your Jupyter Notebook data analysis and data visualization workflow by turning your pandas DataFrame (or polars DataFrame) into a Tableau-style user interface for visual exploration.

Importing Data to Pandas DataFrame

Before we start appending data to a Pandas DataFrame, we need some data to work with. There are several ways to load data into a Pandas DataFrame: from CSV files, Excel files, SQL databases, and more. Here, we build a DataFrame from a dictionary using the pd.DataFrame() constructor.

import pandas as pd
 
# create a dictionary of data
data_dict = {"name": ["Alice", "Bob", "Charlie"], "age": [25, 30, 35], "sex": ["female", "male", "male"]}
 
# create a pandas DataFrame
df = pd.DataFrame(data_dict)
 
# print the DataFrame
print(df)

Output:

      name  age     sex
0    Alice   25  female
1      Bob   30    male
2  Charlie   35    male
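
File-based sources follow the same pattern. The sketch below reads CSV text with pd.read_csv(); to stay self-contained it uses an in-memory io.StringIO buffer in place of a real file, and the sample contents are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical CSV contents; in practice you would pass a file path to read_csv
csv_text = "name,age,sex\nAlice,25,female\nBob,30,male\nCharlie,35,male\n"

# read_csv accepts a path or any file-like object
df_csv = pd.read_csv(io.StringIO(csv_text))

print(df_csv)
```

The resulting DataFrame has the same columns and rows as the one built from the dictionary above.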

Appending Values to Pandas DataFrame

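The Pandas DataFrame append() method adds new rows to an existing DataFrame. It returns a new DataFrame and does not modify the original in place, which is why the result is assigned back to df. The examples in this section call append() and therefore need a pandas version earlier than 2.0 (the method was deprecated in 1.4). You can append a single row or multiple rows to a DataFrame. Here is an example of appending a single row to the DataFrame we created earlier.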

# create a new row
new_row = {"name": "Dave", "age": 40, "sex": "male"}
 
# append the new row to the DataFrame (append() requires pandas < 2.0)
df = df.append(new_row, ignore_index=True)
 
# print the updated DataFrame
print(df)

Output:

      name  age     sex
0    Alice   25  female
1      Bob   30    male
2  Charlie   35    male
3     Dave   40    male

In the above example, we create a new row as a dictionary and append it to the existing DataFrame using the append() method. Setting ignore_index=True is required when appending a dictionary (append() raises a TypeError otherwise), and it renumbers the index of the result starting from 0.
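
On pandas 2.0 and later, where append() is no longer available, the same row can be added with pd.concat(). The following is a minimal sketch of that equivalent: it wraps the dictionary in a one-row DataFrame and concatenates it. The starting DataFrame is rebuilt here so the snippet runs on its own.

```python
import pandas as pd

# Rebuild the starting DataFrame so this snippet is self-contained
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie"],
    "age": [25, 30, 35],
    "sex": ["female", "male", "male"],
})

new_row = {"name": "Dave", "age": 40, "sex": "male"}

# Wrap the dict in a list to get a one-row DataFrame, then concatenate
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)

print(df)
```

As with append(), ignore_index=True renumbers the index so the new row gets index 3 instead of repeating 0.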

Appending multiple rows to a DataFrame can be done using the same method. Let's consider the following example.

# create a list of new rows
new_rows = [
    {"name": "Eve", "age": 45, "sex": "female"},
    {"name": "Frank", "age": 50, "sex": "male"}
]
 
# append the list of new rows to the DataFrame (append() requires pandas < 2.0)
df = df.append(new_rows, ignore_index=True)
 
# print the updated DataFrame
print(df)

Output:

      name  age     sex
0    Alice   25  female
1      Bob   30    male
2  Charlie   35    male
3     Dave   40    male
4      Eve   45  female
5    Frank   50    male
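
For several rows at once, a sketch of the pd.concat() equivalent builds one DataFrame from the whole list of dictionaries and concatenates a single time. The starting data is rebuilt here, assuming the four rows from the previous step, so the snippet runs on its own.

```python
import pandas as pd

# Starting data: assumed to be the four rows produced by the previous step
df = pd.DataFrame({
    "name": ["Alice", "Bob", "Charlie", "Dave"],
    "age": [25, 30, 35, 40],
    "sex": ["female", "male", "male", "male"],
})

new_rows = [
    {"name": "Eve", "age": 45, "sex": "female"},
    {"name": "Frank", "age": 50, "sex": "male"},
]

# Convert the list of dicts to a DataFrame, then concatenate once
df = pd.concat([df, pd.DataFrame(new_rows)], ignore_index=True)

print(df)
```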

Using DataFrame Append in a Loop

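In some cases, the data you want to append is not available all at once. It may arrive in pieces, for example one file or one chunk at a time. In such cases, you can use the append() method in a loop. Bear in mind that each call returns a new DataFrame containing a copy of all rows gathered so far, so this pattern does not reduce memory use and gets slower as the DataFrame grows.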

Let's consider an example where you want to append data from multiple CSV files to a single Pandas DataFrame.

# import required libraries
import pandas as pd
import glob
 
# create an empty DataFrame to hold the data from CSV files
df_combined = pd.DataFrame()
 
# loop through all the CSV files in the directory
for file in glob.glob("*.csv"):
    # read the CSV file
    df_temp = pd.read_csv(file)
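    # append the data to the combined DataFrame (append() requires pandas < 2.0);
    # ignore_index=True avoids repeating each file's 0..n-1 index
    df_combined = df_combined.append(df_temp, ignore_index=True)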
 
# print the combined DataFrame
print(df_combined)

In the above example, we create an empty DataFrame df_combined to hold the data from multiple CSV files. We then iterate over all the CSV files in the current directory with a for loop. Inside the loop, we read each CSV file using the pd.read_csv() function and append the data to the df_combined DataFrame using the append() method.
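
Each call to append() inside a loop copies all rows gathered so far, so the total work grows roughly quadratically with the number of files. A common alternative, sketched below, collects the per-file DataFrames in a list and calls pd.concat() once at the end. The temporary directory and the file names part_0.csv through part_2.csv are made up so the example runs anywhere.

```python
import glob
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp_dir:
    # Write three small hypothetical CSV files to read back
    for i in range(3):
        part = pd.DataFrame({"id": [2 * i, 2 * i + 1], "value": [i, i]})
        part.to_csv(os.path.join(tmp_dir, f"part_{i}.csv"), index=False)

    # Read every file into a list; sorted() makes the order deterministic
    paths = sorted(glob.glob(os.path.join(tmp_dir, "*.csv")))
    frames = [pd.read_csv(path) for path in paths]

# Concatenate once; ignore_index=True gives a clean 0..n-1 index
df_combined = pd.concat(frames, ignore_index=True)

print(df_combined)
```

This version also works on pandas 2.0 and later, since it does not call append() at all.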

Optimizing the Pandas DataFrame append() Method

Appending data to a Pandas DataFrame using the append() method is slow when it is called repeatedly, because every call builds a new DataFrame and copies all existing rows. The cost is most noticeable with large datasets. Here are some tips to avoid that overhead and improve performance.

Use pd.concat() instead of DataFrame.append()

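The Pandas pd.concat() function is the recommended replacement for the DataFrame.append() method, and the only option on pandas 2.0 and later. It accepts a list of DataFrames and concatenates them along a specified axis in a single operation, which is much faster than appending them one at a time.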

# import libraries
import pandas as pd
 
# create two DataFrames with the same columns
df1 = pd.DataFrame({'a': [1, 2], 'b': [3, 4]})
df2 = pd.DataFrame({'a': [3, 4], 'b': [6, 8]})
 
# concatenate the dataframes using pd.concat
df = pd.concat([df1, df2], axis=0, ignore_index=True)
 
# print the concatenated dataframe
print(df)

In the above example, we concatenate two DataFrames using the pd.concat() function. We pass axis=0 to stack them vertically (row-wise) and ignore_index=True to renumber the index of the result from 0 to 3.
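
The performance difference comes mainly from how often concatenation happens, not from any single call. The sketch below, using made-up chunk counts and sizes, builds the same result two ways: growing a DataFrame inside a loop, and concatenating a list of chunks once. Timings are printed, not asserted, because they vary by machine and pandas version.

```python
import timeit

import pandas as pd

# 200 hypothetical chunks of 50 rows each
chunks = [pd.DataFrame({"a": range(50), "b": range(50)}) for _ in range(200)]


def grow_in_loop():
    # Re-copies all accumulated rows on every iteration
    result = chunks[0]
    for chunk in chunks[1:]:
        result = pd.concat([result, chunk], ignore_index=True)
    return result


def concat_once():
    # Copies each chunk a single time
    return pd.concat(chunks, ignore_index=True)


# Both approaches produce identical data
same = grow_in_loop().equals(concat_once())
print("identical results:", same)

# Total seconds for three runs of each approach
print("loop:  ", timeit.timeit(grow_in_loop, number=3))
print("single:", timeit.timeit(concat_once, number=3))
```

The gap between the two timings typically widens as the number of chunks grows, since the loop version repeats work that the single call does only once.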

Use lists of dictionaries instead of append()

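When rows are produced one at a time, appending dictionaries to a plain Python list is faster than calling DataFrame.append() for each row, because the list grows in place while append() copies the whole DataFrame every time. Once all rows are collected, convert the list to a DataFrame with the pd.DataFrame() constructor. Here is an example.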

# create an empty list
data_list = []
 
# append data to the list
data_list.append({"name": "Alice", "age": 25, "sex": "female"})
data_list.append({"name": "Bob", "age": 30, "sex": "male"})
data_list.append({"name": "Charlie", "age": 35, "sex": "male"})
 
# create a dataframe from the list
df = pd.DataFrame(data_list)
 
# print the dataframe
print(df)

Use the right data structure for appending data

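Using the correct intermediate data structure when accumulating data can improve performance considerably. A Python list grows in place, whereas every append() on a DataFrame allocates a new object and copies the existing rows. Collect rows (as dictionaries) or whole DataFrames in a list first, and then build or concatenate the final DataFrame once.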

Conclusion

In this tutorial, we have covered how to append data to a Pandas DataFrame using the append() method. We have demonstrated how to import data, append a single row or multiple rows, append in a loop, and replace repeated append() calls with pd.concat() or a list of dictionaries for better performance. By following these techniques, you can efficiently append data to a Pandas DataFrame and improve your data operations. We hope you find this tutorial helpful in your data manipulation and analysis endeavors.