ChatGPT Prompts for Data Visualization: Turn Pandas Dataframes into Beautiful Charts

Data visualization is a vital part of data analysis: it helps you understand data, identify patterns and trends, and communicate insights effectively. Python is one of the most popular languages for data analysis, and Pandas is a widely used library for manipulating and analyzing data in Python. In this article, we explore how ChatGPT prompts can be used to turn Pandas dataframes into beautiful charts. We also look at PyGWalker, an open-source Python library for data visualization.

Introduction to ChatGPT prompts

ChatGPT is a large language model trained by OpenAI that can generate human-like text. Given a well-structured prompt, it can also generate code for data analysis tasks, such as plotting charts. The prompt templates in this article are designed to be easy to adapt and can save data scientists time and effort.

Using ChatGPT prompts to plot charts

ChatGPT Prompt to Visualize Data with Matplotlib

We can use ChatGPT prompts to plot charts using the Matplotlib library. Matplotlib is a popular visualization library in Python that provides a wide range of chart types and customization options. Here's an example of using ChatGPT prompts to plot a line chart:

Suppose we have a dataset called "sales_data.csv" with columns "date" and "sales". We want to plot a line chart that shows the trend of sales over time. We can use the following ChatGPT prompt:

Prompt: I want you to act as a coder in python. I have a dataset ["sales_data.csv"] with columns ["date" and "sales"]. [Plot a line chart that shows the trend of sales over time]

You can change the text inside the square brackets to fit your own scenario. ChatGPT generates code similar to the following:

# Import necessary libraries
import pandas as pd
import matplotlib.pyplot as plt
 
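# Read the data, parsing the "date" column as datetimes so the x-axis is chronological
df = pd.read_csv('sales_data.csv', parse_dates=['date'])
df = df.sort_values('date')
 
# Plot a line chart
plt.plot(df['date'], df['sales'])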
plt.title('Sales Trend')
plt.xlabel('Date')
plt.ylabel('Sales')
plt.show()

This code generates a line chart that shows the trend of sales over time. We can use ChatGPT prompts to generate code for other chart types as well, such as bar charts, scatter plots, and histograms.
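As a rough sketch of what such follow-up prompts might return, the snippet below draws a bar chart and a histogram side by side. The DataFrame is synthetic stand-in data (the monthly values are invented for illustration), and the function name `plot_sales_overview` is hypothetical.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Synthetic stand-in for sales_data.csv (values invented for illustration)
df = pd.DataFrame({
    "date": pd.date_range("2023-01-01", periods=6, freq="MS"),
    "sales": [120, 135, 128, 150, 170, 165],
})

def plot_sales_overview(data):
    """Draw a bar chart of sales per month and a histogram of sales values."""
    fig, (ax_bar, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))

    # Bar chart: one bar per row, labelled by month
    ax_bar.bar(data["date"].dt.strftime("%Y-%m"), data["sales"])
    ax_bar.set_title("Sales by Month")
    ax_bar.set_xlabel("Month")
    ax_bar.set_ylabel("Sales")
    ax_bar.tick_params(axis="x", rotation=45)

    # Histogram: distribution of the sales values
    ax_hist.hist(data["sales"], bins=3)
    ax_hist.set_title("Distribution of Sales")
    ax_hist.set_xlabel("Sales")
    ax_hist.set_ylabel("Frequency")

    fig.tight_layout()
    return fig

fig = plot_sales_overview(df)
```

In an interactive session, calling `plt.show()` afterwards displays the figure. Returning the figure instead of showing it inside the function keeps the helper usable in scripts and notebooks alike.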

ChatGPT Prompt to Display Images with Matplotlib

In addition to plotting charts from dataframes, we can also use Matplotlib to display images. Suppose we have a folder of images named "image_folder". Each file name starts with the image's label, for example "cat_01.jpg" or "dog_02.jpg". We want to display a grid of images with their labels. Use the following sample prompt:

Prompt: I want you to act as a coder. I have a folder of images. [Describe how files are organized in the directory] [Describe how you want images to be printed]
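If you reuse a template like this often, the bracketed placeholders can be filled in programmatically. The helper below is a minimal sketch; the function name `build_image_prompt` and its parameter names are hypothetical, and the filled-in text simply restates the scenario described above.

```python
def build_image_prompt(organization, display_request):
    """Fill the two bracketed placeholders of the image prompt template."""
    return (
        "I want you to act as a coder. I have a folder of images. "
        f"{organization} {display_request}"
    )

prompt = build_image_prompt(
    organization=(
        'The folder is named "image_folder" and each file name starts '
        'with its label, such as "cat_01.jpg" or "dog_02.jpg".'
    ),
    display_request="Display a grid of images with their labels.",
)
print(prompt)
```

The resulting string can be pasted into ChatGPT as-is.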

ChatGPT generates the following Python code:

# Import necessary libraries
import matplotlib.pyplot as plt
import os
from PIL import Image
 
# Set the directory path
directory = "image_folder"
 
# Create a sorted list of image filenames (skipping non-image files) and their labels
filenames = sorted(
    f for f in os.listdir(directory)
    if f.lower().endswith((".jpg", ".jpeg", ".png"))
)
labels = [filename.split("_")[0] for filename in filenames]
 
# Create a grid of images
fig, axs = plt.subplots(nrows=3, ncols=3, figsize=(10,10))
 
for i, ax in enumerate(axs.flatten()):
    ax.axis('off')
    # Leave the remaining cells empty if there are fewer than nine images
    if i >= len(filenames):
        continue
    
    # Load the image and label
    img = Image.open(os.path.join(directory, filenames[i]))
    label = labels[i]
    
    # Display the image and label
    ax.imshow(img)
    ax.set_title(label)
 
plt.show()

This code generates a grid of images with their corresponding labels. We can use ChatGPT prompts to generate code for other image display styles, such as individual images with captions or side-by-side image comparisons.
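For instance, a side-by-side comparison might look roughly like the sketch below. To stay self-contained it uses two synthetic NumPy arrays in place of real image files; the function name `show_side_by_side` and the captions are hypothetical.

```python
import numpy as np
import matplotlib.pyplot as plt

def show_side_by_side(left, right, left_title="Original", right_title="Modified"):
    """Display two images next to each other, each with a caption."""
    fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 4))
    for ax, img, title in ((ax_left, left, left_title), (ax_right, right, right_title)):
        ax.imshow(img, cmap="gray", vmin=0, vmax=255)
        ax.set_title(title)
        ax.axis("off")
    fig.tight_layout()
    return fig

# Synthetic stand-in images: a horizontal gradient and its mirror image
original = np.tile(np.arange(256, dtype=np.uint8), (64, 1))
mirrored = original[:, ::-1]

fig = show_side_by_side(original, mirrored, "Original", "Mirrored")
```

With real files, the arrays could be replaced by images loaded with `Image.open`, as in the grid example.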

ChatGPT Prompt to Explain Model with LIME

Sometimes we want to explain the output of a machine learning model. LIME (Local Interpretable Model-agnostic Explanations) is a popular library for this. Suppose we have trained a model using the scikit-learn library and want to explain its output using LIME. Use the following prompt:

Prompt: I want you to act as a data scientist and explain the model's results. I have trained a [library name] model and I would like to explain the output using LIME. Please write the code.

For a scikit-learn model, ChatGPT generates code similar to the following:

import lime
import lime.lime_tabular
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
 
# Load the iris dataset
iris = load_iris()
 
# Train a random forest classifier
clf = RandomForestClassifier(n_estimators=500)
clf.fit(iris.data, iris.target)
 
# Define the explainer
explainer = lime.lime_tabular.LimeTabularExplainer(
    iris.data, 
    feature_names=iris.feature_names, 
    class_names=iris.target_names,
    discretize_continuous=True
)
 
# Explain the first data point
exp = explainer.explain_instance(iris.data[0], clf.predict_proba, num_features=4)
exp.show_in_notebook(show_table=True)

This code generates an explanation for the first data point in the iris dataset. Each data point has four features: sepal length, sepal width, petal length, and petal width. The explanation shows which features contribute most to the classifier's prediction for that data point.
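To give a feel for the idea behind such an explanation, here is a simplified sketch of the local-surrogate approach that LIME is built on: perturb the data point, query the model on the perturbed samples, weight those samples by how close they are to the original point, and fit a weighted linear model. It uses only NumPy and is not LIME's actual implementation; the function name, the toy model, and all parameter values are hypothetical.

```python
import numpy as np

def local_linear_explanation(predict_fn, instance, scale,
                             num_samples=2000, kernel_width=0.75, seed=0):
    """Fit a proximity-weighted linear model around `instance`.

    Returns one coefficient per feature (per unit of scaled perturbation);
    larger magnitudes suggest the feature matters more near this instance.
    """
    rng = np.random.default_rng(seed)
    n_features = instance.shape[0]

    # 1. Perturb the instance with Gaussian noise scaled per feature
    noise = rng.normal(size=(num_samples, n_features))
    samples = instance + noise * scale

    # 2. Query the black-box model on the perturbed samples
    targets = predict_fn(samples)

    # 3. Weight samples by proximity to the instance (exponential kernel)
    distances = np.linalg.norm(noise, axis=1)
    width = kernel_width * np.sqrt(n_features)
    weights = np.exp(-(distances ** 2) / (width ** 2))

    # 4. Weighted least squares: scale rows by sqrt(weight), then solve
    design = np.hstack([np.ones((num_samples, 1)), noise])  # intercept + features
    sqrt_w = np.sqrt(weights)
    coefs, *_ = np.linalg.lstsq(design * sqrt_w[:, None], targets * sqrt_w, rcond=None)
    return coefs[1:]  # drop the intercept

# Hypothetical black-box model: depends on features 0 and 1, ignores feature 2
def black_box(X):
    return 1.0 / (1.0 + np.exp(-(2.0 * X[:, 0] - 1.0 * X[:, 1])))

coefs = local_linear_explanation(black_box, instance=np.zeros(3), scale=np.ones(3))
print(coefs)
```

In this toy setup, the first coefficient should come out positive, the second negative and smaller in magnitude, and the third close to zero, mirroring how the toy model uses its inputs.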

ChatGPT Prompt to Explain Model with SHAP

SHAP (SHapley Additive exPlanations) is another popular library for explaining machine learning models. It attributes each prediction to the contributions of individual features, and aggregating those contributions across the entire dataset gives a global view of feature importance.

Use the following prompt for ChatGPT:

Prompt: I want you to act as a data scientist and explain the model's results. I have trained an XGBoost model and I would like to explain the output using a series of plots with SHAP. Please write the code.

And this is the output from ChatGPT:

import shap
from sklearn.datasets import load_iris
from xgboost import XGBClassifier
 
# Load the iris dataset
iris = load_iris()
 
# Train an XGBoost classifier
clf = XGBClassifier(n_estimators=500)
clf.fit(iris.data, iris.target)
 
# Define the explainer
explainer = shap.Explainer(clf)
 
# Explain the entire dataset
shap_values = explainer(iris.data)
 
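# Iris has three classes, so the SHAP values have shape
# (n_samples, n_features, n_classes); pick one class to plot
class_idx = 0
 
# Generate a summary plot for the selected class
shap.summary_plot(shap_values.values[:, :, class_idx], iris.data, feature_names=iris.feature_names)
 
# Generate a force plot for the first data point and the selected class
shap.initjs()
shap.force_plot(shap_values.base_values[0, class_idx], shap_values.values[0, :, class_idx], iris.data[0], feature_names=iris.feature_names)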

This code generates a summary plot that shows which features matter most to the model's output across the entire dataset. It also generates a force plot for the first data point, which shows how each feature contributes to the model's prediction for that data point.
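The per-feature contributions in these plots are based on Shapley values. As an illustration of the definition only, the sketch below computes exact Shapley values for a tiny toy model by averaging each feature's marginal contribution over all feature orderings. The SHAP library uses much faster, model-specific algorithms; the toy model, the baseline, and the function names here are hypothetical.

```python
from itertools import permutations

def shapley_values(value_fn, n_features):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = [0.0] * n_features
    orderings = list(permutations(range(n_features)))
    for order in orderings:
        present = set()
        previous = value_fn(present)
        for feature in order:
            present.add(feature)
            current = value_fn(present)
            totals[feature] += current - previous
            previous = current
    return [t / len(orderings) for t in totals]

# Hypothetical toy model: linear terms plus an interaction between features 0 and 1
x = [2.0, 1.0, 3.0]
baseline = [0.0, 0.0, 0.0]

def model(v):
    return 1.0 * v[0] + 2.0 * v[1] - 1.0 * v[2] + 0.5 * v[0] * v[1]

def value_fn(present):
    # Features not in `present` are replaced by their baseline value
    mixed = [x[i] if i in present else baseline[i] for i in range(len(x))]
    return model(mixed)

phi = shapley_values(value_fn, n_features=3)
print(phi)  # → [2.5, 2.5, -3.0]
```

The contributions add up to the difference between the model's output for `x` and for the baseline (here 2.0), and the interaction term is split evenly between the two features involved. This additivity is what lets a force plot show a prediction as a sum of per-feature pushes.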
