Streamlit DataFrame: Displaying, Styling, and Optimizing Pandas DataFrames (Updated for 2025)
In the realm of data science, the ability to visualize and interact with your data is paramount. Streamlit, a Python library, has revolutionized the way we create data-rich web applications with only a few lines of code. One of Streamlit's most powerful features is its seamless integration with Pandas DataFrames. In this article, we'll delve into the world of Streamlit DataFrames, exploring how to display, style, and optimize your Pandas DataFrames in a modern Streamlit app. We'll maintain an easy-to-follow tutorial style — from basic usage to advanced tips — updated to reflect Streamlit's latest capabilities as of 2025.
What is a Streamlit DataFrame?
A Streamlit DataFrame refers to the display of a Pandas DataFrame (or similar tabular data) within a Streamlit app. It's like taking a static table from a Jupyter notebook and bringing it to life as an interactive element in a web app. Under the hood, Streamlit leverages Pandas (and other data structures like PyArrow or Polars) to manage data, but it wraps these in a web-friendly interface. Pandas DataFrames are two-dimensional, labeled data structures that are ubiquitous in data science. Streamlit builds on Pandas by providing a platform where DataFrames can be interactively displayed and manipulated by users in real time. Instead of just viewing raw tables, you can sort columns, filter data, highlight important values, and even allow users to edit data — all through Streamlit's intuitive components.
Streamlit DataFrame Tutorial
Have you heard of PyGWalker, a data analysis and data visualization tool that can give your Streamlit app a Tableau-style interface?
PyGWalker is a Python library that embeds a Tableau-alternative UI in your own Streamlit app. The video "How to explore data with PyGWalker" walks through the detailed steps for adding it to a Streamlit app.
Get Started with Streamlit DataFrames
First, make sure Streamlit is installed. You can install it via pip:
pip install streamlit
Import Streamlit in your Python script, along with Pandas for data handling:
import streamlit as st
import pandas as pd
Next, create a simple DataFrame to display. For example, let's use a small dataset of fruits:
data = {
    "Fruit": ["Apple", "Banana", "Cherry", "Date", "Elderberry"],
    "Quantity": [10, 15, 20, 25, 30],
    "Price": [0.5, 0.25, 0.75, 1.0, 2.0]
}
df = pd.DataFrame(data)
Now, to display this DataFrame in a Streamlit app, use the st.dataframe() function:
st.dataframe(df)
When you run your Streamlit app (streamlit run your_script.py), you'll see your DataFrame rendered as an interactive table. You can sort the table by clicking on column headers, and you can resize the table by dragging its bottom-right corner. In modern Streamlit versions, the table comes with a toolbar that allows searching within the data, copying data to clipboard, and even downloading the data as a CSV. By default, st.dataframe will adjust its height to show up to 10 rows and will let you scroll within the table if there are more.
This basic example shows how easy it is to get started. Next, we'll explore how you can customize the display and handle larger datasets.
Displaying DataFrames in Streamlit
How to Display a DataFrame as an Interactive Table
As shown above, displaying a DataFrame in Streamlit is as simple as calling st.dataframe(df). However, there's more to it than just showing the table: Streamlit lets you customize the appearance and interactivity of the DataFrame.
Customizing size: You can set the size of the DataFrame component to suit your app's layout. For example, to limit the height (in pixels) of the table display:
st.dataframe(df, height=300)
In the above code, the table occupies 300 pixels of vertical space, somewhat less than the default height that fits ten rows. If the DataFrame has more rows than fit, a scrollbar appears within the table so the user can scroll through the data.
Similarly, you can control the width. In recent versions, st.dataframe accepts a width parameter, which can be a pixel value or width="stretch" to make the table expand to the container's full width. For instance:
st.dataframe(df, width="stretch", height=300)
This will stretch the table to the full width of the app column/container while fixing the height. (Note: The older use_container_width=True parameter is now deprecated in favor of using width="stretch" in the latest Streamlit API.)
Interactive features: The DataFrame display in Streamlit is not static; it's powered by an interactive data grid. Users can perform the following actions directly on the table UI:
- Sort columns: Click on a column header to sort ascending/descending.
- Resize and reorder columns: Drag column borders to resize, or drag headers to reorder or pin columns.
- Hide columns: Use the column menu (usually an overflow "⋮" menu on the header) to hide/show specific columns.
- Search: Use the search box in the table's toolbar (or press Ctrl+F/Cmd+F) to find entries across the entire DataFrame.
- Copy & Download: Select cells and press Ctrl+C/Cmd+C to copy, or use the download button in the toolbar to download the data as a CSV file.
All these features are available by default when using st.dataframe, making it a powerful tool for exploring data.
Highlighting Data and Conditional Formatting
Often you may want to draw attention to certain values in the DataFrame. A convenient way to do this is using Pandas Styler to apply conditional formatting before displaying the DataFrame. Streamlit supports rendering Pandas Styler objects, meaning you can use methods like highlight_max, highlight_min, background_gradient, etc., and then pass the styled DataFrame to Streamlit. For example, to highlight the maximum value in each column:
st.dataframe(df.style.highlight_max(axis=0))
In this example, the largest value in each column will be highlighted (with a default highlight style). You can customize the styling further or use different Styler methods. Another example: to apply a color gradient based on the values in each column:
st.dataframe(df.style.background_gradient(cmap="Blues"))
This will color the background of the cells from light to dark blue depending on their magnitude, which can help visualize distribution at a glance. Streamlit will display these styled DataFrames, although note that some advanced styling features from Pandas (like bar charts or tooltips in cells) may not be fully supported in the Streamlit table. The most common styling for colors, text, and basic formatting, however, will work.
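Beyond the built-in helpers, Styler.apply accepts a function of your own. The sketch below is one possible way to tint low quantities in the fruit table from earlier. The names flag_low_stock and style_inventory, the threshold of 15, and the color are illustrative assumptions, not part of Pandas or Streamlit.

```python
import pandas as pd

df = pd.DataFrame({
    "Fruit": ["Apple", "Banana", "Cherry", "Date", "Elderberry"],
    "Quantity": [10, 15, 20, 25, 30],
    "Price": [0.5, 0.25, 0.75, 1.0, 2.0],
})


def flag_low_stock(quantities: pd.Series, threshold: int = 15) -> list[str]:
    """Return one CSS string per cell, tinting values below the threshold."""
    return ["background-color: #ffe0e0" if q < threshold else "" for q in quantities]


def style_inventory(frame: pd.DataFrame):
    """Build a Styler that applies flag_low_stock to the Quantity column only."""
    # Styler.apply calls the function once per selected column (axis=0 is the default)
    return frame.style.apply(flag_low_stock, subset=["Quantity"])


# In the app (hypothetical usage): st.dataframe(style_inventory(df))
```

Keeping the rule in a plain function makes it easy to check on its own before wiring it into the table.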
Handling Large DataFrames in Streamlit
Working with large DataFrames (thousands or even millions of rows) can be challenging for any web app. Streamlit's table component is designed for performance and can handle very large datasets by using efficient virtualization (drawing only what's visible) and an HTML canvas under the hood. However, there are still practical considerations and limits when dealing with huge data:
- Browser and network limits: The entire data typically needs to be sent from the Streamlit server to the browser. Extremely large datasets may hit WebSocket message size limits or exhaust browser memory. For instance, if you try to send a million-row DataFrame, the app might handle it, but it could be slow to transmit and render on the client side.
- Automatic optimizations: Streamlit will automatically disable certain features for large tables to keep things responsive. For example, if your dataset has more than ~150,000 rows, Streamlit turns off column sorting to speed up rendering. Very large tables may not support all interactive features to avoid performance issues.
- Best practices for large data:
- Show subsets of data: Rather than dumping an entire huge DataFrame into st.dataframe at once, consider showing a filtered or sampled subset. For example, you might let the user choose a subset of columns or a date range to view, or simply display the first N rows with an option to page through the data.
- Implement simple pagination: You can manually create a pagination mechanism. One approach is to use a slider or number input for page index, and slice the DataFrame accordingly:
page_size = 100  # rows per page
total_rows = len(df)
# At least one page, so the number input stays valid for an empty DataFrame
total_pages = max(1, (total_rows - 1) // page_size + 1)
page = st.number_input("Page", min_value=1, max_value=total_pages, value=1)
start_idx = (page - 1) * page_size
end_idx = min(start_idx + page_size, total_rows)
# Report 1-based row numbers; iloc is 0-based and excludes end_idx
st.write(f"Showing rows {start_idx + 1} to {end_idx} of {total_rows}")
st.dataframe(df.iloc[start_idx:end_idx])
In this example, the user can select a page number to view a chunk of 100 rows at a time. This prevents the app from trying to render the entire DataFrame at once, improving responsiveness.
- Leverage lazy dataframes: Streamlit can display data from sources like PySpark or Snowflake (Snowpark) DataFrames. These data structures only pull data as needed, which means if you apply filters or limits, the processing can happen on the backend (e.g., in a database or Spark engine) and only the limited results are sent to Streamlit. If your dataset is extremely large and resides in a database or big data platform, consider querying it in chunks or using such lazy evaluation methods rather than loading everything into a Pandas DataFrame in memory.
- Use caching for data loading: (We will cover caching in detail in a later section, but in short, cache the data retrieval step so that you're not repeatedly reading a large dataset on each app rerun.)
By being mindful of these strategies, you can handle large datasets more smoothly in Streamlit. The key is to avoid overwhelming the frontend with too much data at once and to take advantage of Streamlit's performance features.
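The "show subsets" advice can be sketched as a small helper that caps how many rows ever reach the table. This is a minimal illustration, not a Streamlit feature: limit_rows, its parameters, and the synthetic big DataFrame are all assumptions made for the example.

```python
import pandas as pd


def limit_rows(df: pd.DataFrame, max_rows: int = 1000,
               sample: bool = False, seed: int = 0) -> pd.DataFrame:
    """Return at most max_rows rows: the first rows, or a reproducible random sample."""
    if len(df) <= max_rows:
        return df
    if sample:
        # sort_index keeps the sampled rows in their original order
        return df.sample(n=max_rows, random_state=seed).sort_index()
    return df.head(max_rows)


# Synthetic data standing in for a large dataset
big = pd.DataFrame({"id": range(10_000), "value": [i % 7 for i in range(10_000)]})
preview = limit_rows(big, max_rows=500)
sampled = limit_rows(big, max_rows=500, sample=True)

# In the app (hypothetical usage): st.dataframe(preview)
```

A random sample gives a more representative view than the first rows when the data is sorted, at the cost of hiding the natural ordering.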
Styling DataFrames in Streamlit
Displaying data is one thing — making it look good and readable is another. Streamlit provides multiple ways to style your DataFrame, from built-in formatting options to custom CSS (with some caveats). Let's explore how to enhance the appearance of your tables.
Can I Style a DataFrame Using CSS in Streamlit?
You might wonder if it's possible to apply custom CSS styles (like you would in a web page) to the DataFrame in Streamlit. The short answer is yes, but with caution. Streamlit allows you to inject HTML/CSS into your app using st.markdown with the unsafe_allow_html=True flag. This means you can attempt to target the elements of the table with CSS. For example, to change the background color of the table:
st.markdown(
    """
    <style>
    table { background-color: #f0f0f0; }
    </style>
    """,
    unsafe_allow_html=True
)
st.dataframe(df)
In this snippet, we insert a <style> block that sets all HTML <table> backgrounds to a light gray before calling st.dataframe(df). This might affect the styling of the DataFrame if the underlying rendering uses standard HTML table elements. However, keep in mind:
- This approach is not officially supported and can break if Streamlit's internal implementation changes. In fact, modern Streamlit st.dataframe is built on an HTML canvas and doesn't use a simple HTML table for the data cells, so some CSS selectors might not apply as expected.
- Using unsafe_allow_html=True is generally discouraged except for quick hacks, because it can potentially introduce security or stability issues (if you accidentally style something globally, for instance).
In summary, while you can use CSS for minor tweaks (like setting a background color or font style), it's better to use Streamlit's built-in styling features when possible.
Streamlit DataFrame Styling with Pandas and Column Config
A more robust way to style DataFrames in Streamlit is to use Pandas Styler (as shown in the previous section with highlight_max and background_gradient) or Streamlit's column configuration options for formatting. Pandas Styler: You can apply many styling functions provided by Pandas. For example:
- df.style.format(format_dict): to format numbers or dates in each column (e.g., display a float as a percentage or currency).
- df.style.map(func): to apply a styling function elementwise (e.g., color negative numbers red). This method was called applymap before pandas 2.1, where the old name was deprecated.
- df.style.set_properties(**props): to set CSS properties on certain cells (though not all will carry over to Streamlit).
- df.style.hide(axis="index"): to hide the index if it's not meaningful for your display.
After styling the DataFrame with Pandas, you pass the Styler object to st.dataframe() just like we did with highlight_max.
Column configuration: Streamlit (as of v1.23+) introduced a column_config parameter for st.dataframe and st.data_editor that lets you customize how columns are displayed. This is a Pythonic way to specify things like:
- Column labels (renaming columns for display without changing the DataFrame itself).
- Hiding specific columns.
- Setting data type display (e.g., marking a column as an Image column, Link column, Checkbox, Datetime, etc., which can change how values are rendered).
- Formatting numbers or dates (similar to Styler but done via Streamlit API).
For example, suppose your DataFrame has a column of prices and you want to show them as US currency and maybe rename the column header:
import streamlit as st
import pandas as pd

df = pd.DataFrame({
    "item": ["A", "B", "C"],
    "price": [1.5, 2.0, 3.25]
})
st.dataframe(
    df,
    column_config={
        "price": st.column_config.NumberColumn(
            "Price (USD)",
            format="$%.2f"
        )
    }
)
In this snippet:
- We renamed the price column to Price (USD) for display.
- We formatted the numbers in that column to have a dollar sign and two decimal places.
This approach yields a nicely formatted table without needing custom CSS, and it works with Streamlit's interactive grid. We could also hide columns by setting a column's config to None, or use other config types for different data (like images or booleans). In summary, use Pandas Styler or Streamlit's column configurations for styling when possible, as they are more stable and expressive for common tasks than injecting raw CSS.
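For the simplest cases, renaming and hiding, a column_config entry does not need a column type object: the st.dataframe documentation also allows a plain string (used as the display label) or None (hide the column). The sketch below builds such a dictionary from column names. The helper make_column_config and the example column names are hypothetical.

```python
def make_column_config(columns, hidden=()):
    """Map each column name to a display label, or to None to hide it."""
    config = {}
    for name in columns:
        if name in hidden:
            config[name] = None  # None hides the column in the rendered table
        else:
            # "unit_price" becomes "Unit Price"
            config[name] = name.replace("_", " ").title()
    return config


config = make_column_config(
    ["item_name", "unit_price", "internal_id"],
    hidden=["internal_id"],
)

# In the app (hypothetical usage): st.dataframe(df, column_config=config)
```

Because the result is an ordinary dictionary, individual entries can still be replaced with richer types such as st.column_config.NumberColumn where formatting is needed.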
Optimizing DataFrames in Streamlit
As you build more complex apps or work with bigger datasets, performance becomes important. This section covers how to optimize the use of DataFrames in Streamlit for speed and efficiency, focusing on caching and other best practices.
How to Optimize a Pandas DataFrame in Streamlit
Optimization in this context means both optimizing the performance of your app (load times, responsiveness) and optimizing resource usage (like memory). Here are some key strategies:
Use Streamlit's caching for data loading and computation: One of the simplest ways to speed up your app is to avoid repeating expensive operations. If you have a large dataset stored in a CSV or a database, loading it fresh on every app run can be slow. Streamlit provides a caching mechanism to help with this. In older versions, you might have seen @st.cache used for this purpose. In current Streamlit, you should use @st.cache_data to cache functions that return data (like DataFrames). For example:
import pandas as pd
import streamlit as st

@st.cache_data
def load_data():
    # Imagine this is an expensive operation, e.g., reading a large CSV
    df = pd.read_csv("large_dataset.csv")
    # (You could do additional processing here if needed)
    return df

# Use the cached function to load data
df_large = load_data()
st.dataframe(df_large.head(100))  # Display just first 100 rows as an example
By using @st.cache_data, the first run will load the CSV and store the resulting DataFrame in cache. On subsequent runs (or when users re-run the app), as long as the function's inputs haven't changed, Streamlit will skip running load_data() again and retrieve the DataFrame from cache. This can significantly speed up apps that repeatedly need the same data.
Optimize DataFrame size and types: Large DataFrames can sometimes be optimized by using appropriate data types. For instance, if you have categorical text data, converting it to a Pandas Categorical type can save memory. If you have columns that only need 0/1 or True/False, use boolean type instead of integers. Floating-point numbers that don't need high precision can be downcast to float32. These pandas optimizations reduce memory usage, which can indirectly improve performance in Streamlit (especially important if deploying on limited-resource servers).
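The dtype advice above can be sketched with plain Pandas. The helper below is one possible approach under stated assumptions: the name shrink_dataframe and the max_unique_ratio cutoff are made up for this example, and the sample data simply repeats the fruit table to make the saving visible.

```python
import pandas as pd


def shrink_dataframe(df: pd.DataFrame, max_unique_ratio: float = 0.5) -> pd.DataFrame:
    """Return a copy with smaller numeric dtypes and categorical text columns."""
    out = df.copy()
    for name in out.columns:
        col = out[name]
        if pd.api.types.is_integer_dtype(col):
            # Picks the smallest integer type that can hold every value
            out[name] = pd.to_numeric(col, downcast="integer")
        elif pd.api.types.is_float_dtype(col):
            # Downcasts to float32 when the values survive the conversion
            out[name] = pd.to_numeric(col, downcast="float")
        elif pd.api.types.is_string_dtype(col):
            # Categoricals only pay off when values repeat often
            if col.nunique() / max(len(col), 1) <= max_unique_ratio:
                out[name] = col.astype("category")
    return out


base = pd.DataFrame({
    "Fruit": ["Apple", "Banana", "Cherry", "Date", "Elderberry"],
    "Quantity": [10, 15, 20, 25, 30],
    "Price": [0.5, 0.25, 0.75, 1.0, 2.0],
})
df = pd.concat([base] * 1000, ignore_index=True)
small = shrink_dataframe(df)

before = df.memory_usage(deep=True).sum()
after = small.memory_usage(deep=True).sum()
print(f"{before} bytes -> {after} bytes")
```

Downcasting floats trades precision for memory, so it is worth checking that float32 is accurate enough for the columns involved before applying it broadly.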
Use efficient data formats: If you control the source of the data, using binary formats like Parquet or Arrow can make loading data faster than CSV. st.dataframe accepts PyArrow tables directly and handles them efficiently. This also ties into caching: for example, you might cache the result of reading a Parquet file, which is faster to load in the first place.
Now let's look more closely at caching and performance tips, since caching is such an important part of optimization.
Streamlit DataFrame Caching and Performance Tips
Caching is a powerful tool in Streamlit, but it's important to use it correctly. Here are some tips and best practices for caching and performance:
Choose the Right Cache Decorator: Use @st.cache_data for caching data computations or queries (functions that return data like DataFrames, lists, dictionaries, etc.). Use @st.cache_resource for caching singleton resources (like a database connection, ML model, or other object that should be initialized once and reused). Replacing @st.cache with the appropriate new decorator will avoid deprecation warnings and give better performance tailored to the use-case.
Function Input Matters: Cached functions are invalidated based on their input arguments. Any time you call a cached function with a new argument value, it will run again and cache the new result. This can be useful for data updates. For example:
@st.cache_data
def load_data(filename):
    return pd.read_csv(filename)

file_choice = st.selectbox("Choose a data file", ["data1.csv", "data2.csv"])
df = load_data(file_choice)
st.dataframe(df.head())
In this scenario, if the user switches from "data1.csv" to "data2.csv", load_data will execute again for the new filename and cache that result separately. Switching back to "data1.csv" will retrieve from cache. This behavior ensures your app can handle multiple datasets efficiently without recomputing unnecessarily.
Avoid Mutating Cached Data: One common pitfall is altering a cached object. For instance, if you cache a DataFrame and then modify it in place, those changes will persist in the cached object across runs, which can lead to unexpected results. With st.cache_data, Streamlit helps avoid this by returning a fresh copy of the data each time you call the function (from cache) to prevent mutation issues. You generally won't need to use the old allow_output_mutation=True (which was an option in st.cache) because the new system handles it differently. If you do have a use-case where you need to cache an object that must be mutated, consider using st.cache_resource for that, but do so cautiously and document that behavior.
Clear cache when needed: If your data updates externally and you need to refresh the cached data, you can add a button for users to manually clear the cache (st.cache_data.clear() for example) or simply incorporate a cache ttl (time-to-live) or hash of an external data version. For instance, if you know data updates daily, you might include the current date as part of the cache key or set @st.cache_data(ttl=86400) to expire after a day. This ensures users aren't stuck with stale data.
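One way to realize the "hash of an external data version" idea is to derive a token from the file itself and pass it to the cached loader as an extra argument, so the cache key changes whenever the file does. The sketch below assumes the data lives in a local file; data_version is a hypothetical helper, and the commented loader shows the intended wiring rather than tested app code.

```python
import os


def data_version(path: str) -> str:
    """Return a token that changes when the file's modification time or size changes."""
    info = os.stat(path)
    return f"{info.st_mtime_ns}-{info.st_size}"


# In the app (sketch, assuming the file name from the earlier example):
#
# @st.cache_data
# def load_data(path, version):
#     # version is unused in the body; it only varies the cache key
#     return pd.read_csv(path)
#
# df = load_data("large_dataset.csv", data_version("large_dataset.csv"))
```

Compared with a fixed ttl, this reloads only when the file actually changes, but it costs one os.stat call per rerun and does not help when the data sits behind a database or API.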
Limit DataFrame Rendering Size: Even with caching, rendering a huge DataFrame can be slow in the browser. It's often wise to limit how much of a DataFrame is displayed at once. We discussed using the height parameter or manual pagination above. Another simple tactic is to only display summary information or sample of a large dataset, and provide download links or optional full view on demand. The Streamlit app should focus on what's relevant to the user's analysis at a given time, rather than overwhelming them (and the browser) with a massive table. If you have to show a lot of data, the users can always use the search and scroll features, but make sure the app remains responsive.
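As an example of showing summary information instead of the full table, the sketch below builds a compact per-column overview with plain Pandas. The function name summarize_columns and the three chosen statistics are assumptions for illustration.

```python
import pandas as pd


def summarize_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per column with its dtype, non-null count, and distinct count."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "non_null": df.notna().sum(),
        "unique": df.nunique(),
    })


df = pd.DataFrame({
    "Fruit": ["Apple", "Banana", None],
    "Quantity": [10, 10, 20],
})
summary = summarize_columns(df)

# In the app (hypothetical usage): st.dataframe(summary)
```

The summary has as many rows as the original has columns, so it stays small however many rows the underlying dataset holds.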
By following these caching and performance tips, you'll ensure that your Streamlit app remains snappy and efficient, even as your data grows.
Streamlit DataFrame: Advanced Use Cases
In this section, let's explore a couple of advanced (yet common) scenarios where Streamlit DataFrames play a crucial role: interactive filtering of data and integrating DataFrame usage into a machine learning workflow.
Streamlit DataFrame Filtering
Filtering data is a core part of data exploration. Streamlit's widgets make it easy to add interactive filters to your DataFrame display. Instead of predefining static subsets, you can let the user choose how to filter the DataFrame. For instance, suppose you want to allow the user to filter the df DataFrame by selecting a range of values in one of the numeric columns. You can use a slider for the range and a selectbox to pick the column to filter:
# Assume df is already loaded
column = st.selectbox("Select column to filter", df.columns)
if pd.api.types.is_numeric_dtype(df[column]):
    # Use float bounds: int() would truncate and could cut off the true maximum
    min_val, max_val = float(df[column].min()), float(df[column].max())
    if min_val < max_val:
        # Slider to pick a range within [min, max]
        range_values = st.slider(f"Filter {column} between:", min_val, max_val, (min_val, max_val))
        # Filter the dataframe based on slider
        filtered_df = df[(df[column] >= range_values[0]) & (df[column] <= range_values[1])]
    else:
        # A slider needs min < max; a constant column leaves nothing to filter
        filtered_df = df
else:
    # If non-numeric, maybe use multiselect for categories
    options = st.multiselect(f"Filter values for {column}:", df[column].unique(), default=list(df[column].unique()))
    filtered_df = df[df[column].isin(options)]
st.dataframe(filtered_df)
In this example:
- We first let the user choose which column to filter on.
- If the chosen column is numeric, we display a range slider from the column's min to max, and the user's selection gives us a (low, high) tuple. We then filter df to that range.
- If the column is non-numeric (say strings/categories), we use a multiselect widget to let the user pick which values of that column to include, and then filter accordingly.
- Finally, we display the filtered_df.
This pattern can be adapted to many scenarios: you could have multiple filters simultaneously (just add more widgets and conditions), or different types of widgets for different columns (date pickers for date columns, text inputs for string contains filters, etc.). The result is an app where users can slice and dice the DataFrame on the fly, and immediately see the table update to show only the data that meets their criteria.
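Combining several filters is easier when the filtering logic lives in one function that takes the widget values as plain arguments. The sketch below is one way to do that. apply_filters and its ranges/choices parameters are hypothetical names, and in an app the dictionaries would be filled from st.slider and st.multiselect results.

```python
import pandas as pd


def apply_filters(df: pd.DataFrame, ranges=None, choices=None) -> pd.DataFrame:
    """Keep rows that satisfy every numeric range and every allowed-values filter."""
    mask = pd.Series(True, index=df.index)
    # ranges maps a column name to an inclusive (low, high) pair
    for column, (low, high) in (ranges or {}).items():
        mask &= df[column].between(low, high)
    # choices maps a column name to the collection of values to keep
    for column, allowed in (choices or {}).items():
        mask &= df[column].isin(allowed)
    return df[mask]


df = pd.DataFrame({
    "Fruit": ["Apple", "Banana", "Cherry", "Date", "Elderberry"],
    "Quantity": [10, 15, 20, 25, 30],
    "Price": [0.5, 0.25, 0.75, 1.0, 2.0],
})
filtered = apply_filters(
    df,
    ranges={"Quantity": (15, 25)},
    choices={"Fruit": ["Apple", "Banana", "Date"]},
)

# In the app (hypothetical usage): st.dataframe(filtered)
```

Separating the widgets from the filtering also makes the logic testable without running the app.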
Streamlit DataFrames in Machine Learning Apps
Streamlit isn't just for static data display — it's great for building interactive machine learning demos and dashboards. DataFrames often appear in ML apps: for example, to show a preview of the training data, or to show feature importance scores, or to let the user upload new data for predictions. Let's consider a simple example: you have a dataset and you want to allow the user to train a model (say a classifier) on it with the click of a button, and then display the results. You can use a DataFrame to show the data and Streamlit widgets to manage the interaction:
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Display the DataFrame (e.g., training dataset preview)
st.dataframe(df)
# Let user trigger model training
if st.button("Train Model"):
    # Assume 'target' is the label column in df
    if 'target' not in df.columns:
        st.error("No target column found in data!")
    else:
        # Split the data
        X = df.drop('target', axis=1)
        y = df['target']
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
        # Train a simple RandomForest model
        model = RandomForestClassifier(n_estimators=100, random_state=42)
        model.fit(X_train, y_train)
        # Evaluate the model
        preds = model.predict(X_test)
        accuracy = accuracy_score(y_test, preds)
        # Display accuracy result
        st.write(f"**Model Accuracy:** {accuracy*100:.2f}%")
Here's what's happening in this ML app snippet:
- We display the DataFrame df so the user can inspect the raw data.
- We include a button labeled "Train Model". When the user clicks it, the code under the if st.button block executes.
- We do a quick sanity check to ensure there's a target column to predict (this is our label).
- We split the DataFrame into features (X) and target (y), then into training and test sets.
- We initialize a RandomForestClassifier (from scikit-learn) and train it on the training data.
- We make predictions on the test set and compute the accuracy.
- Finally, we use st.write to display the accuracy in the app.
This simple example illustrates how you can integrate DataFrame display with interactive controls to create a mini machine learning pipeline in a Streamlit app. The user could be allowed to tweak hyperparameters (e.g., number of trees in the forest, test split ratio via sliders), or choose different models, and see the outcomes quickly. The DataFrame is central here as a way to present the data being used for training or the results (you could also show a DataFrame of predictions or misclassified examples, etc.). With Streamlit's interactive DataFrame and widgets, you turn a script into a real app where users (or yourself) can experiment in real time.
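To show misclassified examples as a table, the predictions can be joined back onto the test features with plain Pandas. The sketch below uses small made-up values in place of real model output. misclassified_rows and the actual/predicted column names are assumptions for illustration.

```python
import pandas as pd


def misclassified_rows(features: pd.DataFrame, y_true, y_pred) -> pd.DataFrame:
    """Return the feature rows whose predicted label differs from the actual label."""
    report = features.copy()
    # list() attaches values by position, whatever index y_true or y_pred carry
    report["actual"] = list(y_true)
    report["predicted"] = list(y_pred)
    return report[report["actual"] != report["predicted"]]


# Made-up stand-ins for X_test, y_test, and model predictions
X_demo = pd.DataFrame({"x": [1, 2, 3, 4]}, index=[10, 11, 12, 13])
errors = misclassified_rows(X_demo, y_true=[0, 1, 1, 0], y_pred=[0, 0, 1, 1])

# In the app (hypothetical usage): st.dataframe(misclassified_rows(X_test, y_test, preds))
```

The original row index is preserved, so each error can be traced back to its row in the source DataFrame.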
Conclusion
Streamlit has made it easier than ever to build interactive data apps, and its integration with Pandas DataFrames brings a lot of power to your fingertips. In this article, we covered how to display DataFrames in Streamlit, from basic usage to advanced customization. We saw how to style data for better readability, how to handle large datasets efficiently, and even how DataFrames fit into interactive filtering and machine learning use-cases.
With the latest updates in Streamlit (as of 2025), st.dataframe is more capable and performant than before, offering built-in sorting, search, download, and a snappy grid rendering that can handle substantial data volumes. We also mentioned st.data_editor for those scenarios where you need users to edit or contribute data via the app.
Whether you're a seasoned data scientist or a beginner, Streamlit provides a friendly and powerful platform to share data insights. A DataFrame that would otherwise be static in a notebook can become an interactive exploration tool in a Streamlit app. As you continue your journey, remember to leverage caching for performance, keep the user experience in mind (show the most relevant slice of data, not everything at once), and use the rich styling options to make your data tell a clear story.
So go ahead and try these techniques in your own Streamlit app. Turn your DataFrames into interactive tables, build a dashboard around them, or create the next awesome data science web app. The possibilities are endless when you combine Pandas and Streamlit!
Frequently Asked Questions
How can I style a DataFrame in Streamlit? – You can style a DataFrame using both Pandas styling and Streamlit's display options. For example, use Pandas Styler methods like highlight_max or background_gradient to add color highlights. You can also apply custom CSS via st.markdown (with unsafe_allow_html=True) to tweak simple styles, though this is unsupported and limited because the table is drawn on a canvas. Additionally, take advantage of column_config in st.dataframe to format columns (e.g., number formatting, display labels, hiding columns), and pass hide_index=True to hide the index.
How can I filter a DataFrame in Streamlit? – Streamlit provides interactive widgets that make filtering easy. You can use dropdowns (st.selectbox or st.multiselect) for categorical filters, sliders (st.slider) for numeric ranges, text inputs for text search, etc. In your code, use these widget values to subset your DataFrame (for instance, df[df[column] == value] or using pandas boolean indexing for ranges). The app will update in real time when the user adjusts the widgets, showing the filtered data.
Can I display images inside a DataFrame in Streamlit? – Yes, within limits. As noted in the column configuration section, a column can be marked as an Image column: store image URLs (or base64 data URLs) in that column and set its config to st.column_config.ImageColumn, and st.dataframe renders small previews in the cells. For larger images or custom layouts, loop through the DataFrame and use st.image() for each image (or use st.columns to lay them out in a grid). Either way, the image data needs to be accessible to the app, as a URL or an uploaded file, for Streamlit to display it.
