Pandas Rolling Window: Rolling, Expanding, and EWM
Moving averages and smoothed signals are core to time-series analysis, yet many teams struggle with window alignment, missing values, and slow Python loops.
- Problem: Calculating rolling stats by hand is error-prone and often misaligned with timestamps.
- Agitate: Loops and ad hoc shifts produce off-by-one errors, gaps for early periods, and sluggish notebooks.
- Solution: Use `rolling`, `expanding`, and `ewm` with the right window definitions (`min_periods`, time-based windows, `center`, `adjust`) to get correct, fast, vectorized results.
Quick Reference
| Window type | Best for | Key parameters |
|---|---|---|
| `rolling` | Moving averages, volatility, custom window functions | `window=3` (or `"7D"`), `min_periods`, `center`, `win_type`, `on` |
| `expanding` | Cumulative stats from the start | `min_periods` |
| `ewm` | Exponential decay smoothing or weighted metrics | `span`, `alpha`, `halflife`, `adjust`, `times` |
Sample Data
import pandas as pd
dates = pd.date_range("2024-01-01", periods=8, freq="D")
sales = pd.DataFrame({"date": dates, "revenue": [10, 12, 9, 14, 15, 13, 11, 16]})
sales = sales.set_index("date")

Rolling Windows (Fixed and Time-Based)
Fixed-size windows
sales["rev_ma3"] = (
    sales["revenue"]
    .rolling(window=3, min_periods=2)
    .mean()
)
- `min_periods` controls when results start; early rows stay `NaN` until the minimum count is met.
- `center=True` aligns the statistic to the middle of the window (handy for plots).
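To make the effect of `min_periods` and `center` concrete, the sketch below computes the same 3-row mean three ways on the sample revenue values. The column names (`ma3_default`, `ma3_min2`, `ma3_center`) are illustrative choices, not part of the original example.

```python
import pandas as pd

# Same toy revenue values as the sample data.
dates = pd.date_range("2024-01-01", periods=8, freq="D")
revenue = pd.Series([10, 12, 9, 14, 15, 13, 11, 16], index=dates, name="revenue")

compare = pd.DataFrame({
    "revenue": revenue,
    # Default min_periods equals the window size: the first two rows are NaN.
    "ma3_default": revenue.rolling(window=3).mean(),
    # min_periods=2: only the first row is NaN.
    "ma3_min2": revenue.rolling(window=3, min_periods=2).mean(),
    # center=True: the label sits on the middle row, so the first and last rows are NaN.
    "ma3_center": revenue.rolling(window=3, center=True).mean(),
})
print(compare)
```

Note that a centered window reads one row ahead of its label, which is fine for plotting but not for features that must only see the past.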
Time-based windows on datetime index or on= column
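sales_reset = sales.reset_index()
sales_reset["rev_7d_mean"] = (
    sales_reset.rolling("7D", on="date")["revenue"].mean()
)
- Use duration strings (`"7D"`, `"48h"`) for irregular sampling; pandas selects the rows that fall within the lookback horizon rather than a fixed count. Write hours with a lowercase `h`, since the uppercase `"H"` alias is deprecated in recent pandas releases.
- To control whether the window's edges are included, set `closed` to `"left"`, `"right"`, `"both"`, or `"neither"`. Time-based windows default to `"right"`, which excludes the oldest edge and includes the current row.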
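As a small illustration of how a time-based window behaves on gappy data, the sketch below uses a hypothetical `readings` frame with a five-day hole and compares the default `closed="right"` with `closed="left"`. The data and variable names are made up for the example.

```python
import pandas as pd

# Hypothetical irregular readings: note the gap between Jan 3 and Jan 8.
readings = pd.DataFrame({
    "date": pd.to_datetime(
        ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-08", "2024-01-09"]
    ),
    "revenue": [10.0, 12.0, 9.0, 14.0, 15.0],
})

# Default closed="right": the window is (t - 3 days, t], so the current row counts.
by_time = readings.rolling("3D", on="date")["revenue"]
sum_3d = by_time.sum()
n_3d = by_time.count()  # how many rows actually fell inside each window

# closed="left": the window is [t - 3 days, t), so the current row is excluded.
sum_3d_prior = readings.rolling("3D", on="date", closed="left")["revenue"].sum()

print(readings.assign(sum_3d=sum_3d, n_3d=n_3d, sum_3d_prior=sum_3d_prior))
```

The row count per window varies with the sampling density, and the `closed="left"` column is empty wherever no earlier row falls inside the lookback horizon.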
Custom window functions
sales["rev_range"] = (
    sales["revenue"].rolling(4).apply(lambda x: x.max() - x.min(), raw=True)
)
Set `raw=True` to work with NumPy arrays inside `apply` for speed.
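The sketch below shows what `raw=True` hands to the function and, for this particular statistic, an equivalent built from the built-in `max` and `min` aggregations. The helper name `value_range` is hypothetical.

```python
import numpy as np
import pandas as pd

revenue = pd.Series([10, 12, 9, 14, 15, 13, 11, 16], dtype="float64")

def value_range(window: np.ndarray) -> float:
    # With raw=True, `window` arrives as a plain NumPy array, not a Series.
    return window.max() - window.min()

range_apply = revenue.rolling(4).apply(value_range, raw=True)

# Same statistic from built-in aggregations, with no Python call per window.
roll = revenue.rolling(4)
range_builtin = roll.max() - roll.min()

print(pd.DataFrame({"apply": range_apply, "builtin": range_builtin}))
```

When a statistic can be expressed with built-in aggregations like this, that form is usually the faster choice; `apply` remains useful for logic that has no built-in equivalent.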
Expanding Windows (Cumulative)
sales["rev_cum_mean"] = sales["revenue"].expanding(min_periods=2).mean()
- Use `expanding` when every observation should see everything up to that point (running average, cumulative ratios).
- Combine with `shift()` to compare the newest value against the historical average.
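One way to read the `shift()` suggestion is sketched here: shift the series by one row, then expand, so each row is compared against the mean of strictly earlier rows. The column names `hist_mean` and `vs_history` are illustrative.

```python
import pandas as pd

dates = pd.date_range("2024-01-01", periods=8, freq="D")
sales = pd.DataFrame({"revenue": [10, 12, 9, 14, 15, 13, 11, 16]}, index=dates)

# Mean of all rows strictly before the current one: shift first, then expand.
sales["hist_mean"] = sales["revenue"].shift(1).expanding(min_periods=2).mean()

# Positive values mean the current revenue beat its own history.
sales["vs_history"] = sales["revenue"] - sales["hist_mean"]

print(sales)
```

Because of the shift and `min_periods=2`, the first two rows have no historical average and stay `NaN`.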
Exponential Weighted Windows
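sales["rev_ewm_span4"] = sales["revenue"].ewm(span=4, adjust=False).mean()
- `adjust=False` uses the recursive formula `y[t] = (1 - alpha) * y[t-1] + alpha * x[t]` with `alpha = 2 / (span + 1)`, the form of exponential smoothing commonly seen in analytics dashboards.
- `halflife` offers intuitive decay: `ewm(halflife=3)` halves the weight every 3 periods.
- For irregular timestamps, pass the timestamps themselves via `times` (for example `times=sales.index`) together with a duration `halflife` such as `"3 days"` to weight by actual time deltas instead of row counts. Recent pandas versions no longer accept a column name string for `times`, and `times` is not supported with `adjust=False`.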
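The following sketch checks the `adjust=False` recursion by hand and then shows a time-aware decay. It assumes a recent pandas release, where `times` takes the timestamps themselves (here a `DatetimeIndex`) and `halflife` is given as a duration; the irregular series is hypothetical.

```python
import pandas as pd

revenue = pd.Series([10, 12, 9, 14, 15, 13, 11, 16], dtype="float64")

span = 4
alpha = 2 / (span + 1)  # span-to-alpha conversion

# Hand-rolled recursion: y[0] = x[0], y[t] = (1 - alpha) * y[t-1] + alpha * x[t]
manual = [revenue.iloc[0]]
for x in revenue.iloc[1:]:
    manual.append((1 - alpha) * manual[-1] + alpha * x)

ewm_recursive = revenue.ewm(span=span, adjust=False).mean()

# Time-aware decay on hypothetical irregular timestamps (five-day gap before the last row).
stamps = pd.DatetimeIndex(["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-08"])
irregular = pd.Series([10.0, 12.0, 9.0, 14.0], index=stamps)
ewm_by_time = irregular.ewm(halflife="2 days", times=stamps).mean()

print(pd.DataFrame({"manual": manual, "pandas": ewm_recursive}))
print(ewm_by_time)
```

With `times`, older rows lose weight according to elapsed time, so the value after the gap leans more heavily on the newest observation than a row-count `halflife` would.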
Choosing the Right Window (Cheat Sheet)
| Goal | Recommended method | Notes |
|---|---|---|
| Smooth short-term noise | `rolling` with small window and `center=True` | Works on numeric columns; lower `min_periods` (for example to 1) for early visibility |
| Running totals or averages from start | `expanding` | No fixed window; great for KPIs that accumulate |
| Decay older observations | `ewm(span=...)` | Better than large rolling windows for momentum-like signals |
| Irregular timestamps | Time-based `rolling("7D", on="date")` or `ewm(halflife=..., times=...)` with the timestamps passed as `times` | Avoids bias from denser sampling days |
| Feature generation | `rolling().agg(["mean","std","min","max"])` | Multi-aggregation builds tidy feature sets quickly |
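The feature-generation row can be sketched as follows: one rolling object, several aggregations, and prefixed column names. The `rev_w3_` prefix is an arbitrary naming choice for the example.

```python
import pandas as pd

dates = pd.date_range("2024-01-01", periods=8, freq="D")
revenue = pd.Series(
    [10, 12, 9, 14, 15, 13, 11, 16], index=dates, name="revenue", dtype="float64"
)

# One 3-row window, several statistics per row; the result is a DataFrame.
features = revenue.rolling(window=3, min_periods=3).agg(["mean", "std", "min", "max"])

# Prefix the columns so they stay identifiable after joining onto other data.
features = features.add_prefix("rev_w3_")

print(features)
```

If these columns feed a model, shifting the input series first keeps each row from seeing its own target value.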
Performance and Correctness Tips
- Keep datetime columns as `datetime64[ns]` and set a sorted datetime index when working heavily with time-based windows, which require monotonic timestamps.
- Prefer built-in aggregations (`mean`, `std`, `sum`, `count`) over Python `apply` for speed.
- Avoid forward-looking bias: call `shift()` before `rolling` when building supervised learning features, so each feature row sees only observations that precede its target.
- Combine with `resample` to normalize frequency before rolling if the source data is irregular.
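The last two tips can be combined as in the sketch below, which uses a hypothetical irregular `log` of sales. Treating days with no rows as `NaN` rather than 0 (via `min_count=1`) is a modeling assumption made for the example, not a rule.

```python
import pandas as pd

# Hypothetical irregular log: two sales on Jan 1, nothing on Jan 3.
log = pd.DataFrame({
    "ts": pd.to_datetime([
        "2024-01-01 09:00", "2024-01-01 17:00", "2024-01-02 10:00",
        "2024-01-04 12:00", "2024-01-05 08:00",
    ]),
    "revenue": [4.0, 6.0, 12.0, 14.0, 15.0],
}).set_index("ts")

# 1) Normalize to one row per calendar day; min_count=1 leaves empty days as NaN.
daily = log["revenue"].resample("D").sum(min_count=1)

# 2) The feature for day t uses only days before t: shift first, then roll.
feature = daily.shift(1).rolling(window=2, min_periods=1).mean()

dataset = pd.DataFrame({"target": daily, "prev_2d_mean": feature})
print(dataset)
```

After resampling, a 2-row window always spans two calendar days, and the shift ensures no row's feature includes its own target.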
Rolling, expanding, and exponential windows cover most smoothing and feature-engineering needs without loops. Pair them with `pd.to_datetime` and `resample` for clean time axes, and you will get fast, reliable metrics ready for charts or models.