Import pygwalker and pandas into your Jupyter Notebook to get started.

    import pandas as pd
    import pygwalker as pyg
Load your data as a dataframe, then pass it to pygwalker.

    df = pd.read_csv('./<your_csv_file_path>.csv')
    walker = pyg.walk(df)

pygwalker accepts not only pandas dataframes but also modin dataframes, and even a data connection such as Snowflake.
Sometimes your dataframe is large enough to slow pygwalker down. There is a simple way to boost its performance with one extra parameter.
Setting use_kernel_calc=True enables the new computation engine in pygwalker, which is powered by DuckDB.
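As a minimal sketch of the parameter described above, the snippet below builds a synthetic dataframe and passes it to pygwalker with use_kernel_calc=True. The helper names `make_demo_frame` and `explore`, the column names, and the row count are assumptions made for illustration; only `pyg.walk` and `use_kernel_calc` come from the text.

```python
import pandas as pd


def make_demo_frame(n_rows: int = 200_000) -> pd.DataFrame:
    """Build a synthetic frame (hypothetical data) large enough to feel slow."""
    regions = ("north", "south", "east", "west")
    return pd.DataFrame({
        "order_id": range(n_rows),
        "region": [regions[i % 4] for i in range(n_rows)],
        "amount": [(i * 37) % 1000 / 10 for i in range(n_rows)],
    })


def explore(df: pd.DataFrame):
    """Open the pygwalker UI with kernel-side computation enabled."""
    # Imported lazily so the data helper above can be used without the UI.
    import pygwalker as pyg

    return pyg.walk(df, use_kernel_calc=True)


df = make_demo_frame()
# In a notebook cell, run: walker = explore(df)
```

Whether the extra parameter pays off depends on the size of the data; for small dataframes the default call without it may be just as responsive.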
Sometimes your data is extremely large, and you don't want to load it into local memory. PyGWalker lets you push all its computations to a remote OLAP service, such as Snowflake.

    pip install --upgrade --pre pygwalker
    pip install --upgrade --pre "pygwalker[snowflake]"

Here is a code example of using pygwalker with Snowflake.

    import pygwalker as pyg
    from pygwalker.data_parsers.database_parser import Connector

    conn = Connector(
        "snowflake://user_name:password@account_identifier/database/schema",
        """
            SELECT
                *
            FROM
                SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.ORDERS
        """
    )

    walker = pyg.walk(conn)

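In practice you may not want credentials hard-coded in the connection string. The sketch below assembles the same URL shape from environment variables and percent-encodes the user name and password so that characters such as `@` or `/` do not break the URL. The helper names and the environment variable names (`SNOWFLAKE_USER` and so on) are hypothetical; the `Connector(url, sql)` call mirrors the example above.

```python
import os
from urllib.parse import quote_plus


def build_snowflake_url(user: str, password: str, account: str,
                        database: str, schema: str) -> str:
    """Return a snowflake:// URL with the credentials percent-encoded."""
    return (
        f"snowflake://{quote_plus(user)}:{quote_plus(password)}"
        f"@{account}/{database}/{schema}"
    )


def walk_orders():
    """Open pygwalker on the sample ORDERS table (needs pygwalker[snowflake])."""
    # Imported lazily so the URL helper can be used and tested on its own.
    import pygwalker as pyg
    from pygwalker.data_parsers.database_parser import Connector

    # Hypothetical environment variable names; adapt them to your setup.
    url = build_snowflake_url(
        os.environ["SNOWFLAKE_USER"],
        os.environ["SNOWFLAKE_PASSWORD"],
        os.environ["SNOWFLAKE_ACCOUNT"],
        os.environ["SNOWFLAKE_DATABASE"],
        os.environ["SNOWFLAKE_SCHEMA"],
    )
    conn = Connector(url, "SELECT * FROM SNOWFLAKE_SAMPLE_DATA.TPCH_SF1.ORDERS")
    return pyg.walk(conn)
```

With the variables set, calling `walk_orders()` in a notebook cell would open the same view as the hard-coded example.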
PyGWalker is powerful for local data exploration, and it is even more useful when it runs inside a web app. There are several ways to implement this:
- Use PyGWalker with Django or Flask.
- Use the core SDK of PyGWalker: [Graphic Walker] to integrate it into any web app.
- Use Streamlit to build a web app.
Streamlit is a great tool to build data apps with Python, especially for data scientists who are not familiar with web development. Here is a quick example of using PyGWalker with Streamlit.

    import pandas as pd
    import streamlit.components.v1 as components
    import streamlit as st
    from pygwalker.api.streamlit import init_streamlit_comm, get_streamlit_html

    st.set_page_config(
        page_title="Use Pygwalker with Streamlit",
        layout="wide"
    )

    st.title("PyGWalker with Streamlit")

    # Initialize pygwalker communication
    init_streamlit_comm()

    # When using `use_kernel_calc=True`, cache the pygwalker HTML
    # so that memory usage does not explode.
    @st.cache_resource
    def get_pyg_html(df: pd.DataFrame) -> str:
        # When you publish your application, set `debug=False`
        # to prevent other users from writing to your config file.
        # If you want to use the chart-config saving feature, set `debug=True`.
        html = get_streamlit_html(df, spec="./gw0.json", use_kernel_calc=True, debug=False)
        return html

    @st.cache_data
    def get_df() -> pd.DataFrame:
        return pd.read_csv("/bike_sharing_dc.csv")

    df = get_df()

    components.html(get_pyg_html(df), width=1300, height=1000, scrolling=True)

Check out this community article to learn more about using PyGWalker with Streamlit: An Introduction to PyGWalker: Supercharge Your Streamlit Visualizations