Pandas Rolling Window: Rolling, Expanding, and EWM

Moving averages and smoothed signals are core to time-series analysis, yet many teams struggle with window alignment, missing values, and slow Python loops.

  • Problem: Calculating rolling stats by hand is error-prone and often misaligned with timestamps.
  • Impact: Loops and ad hoc shifts produce off-by-one errors, gaps for early periods, and sluggish notebooks.
  • Solution: Use rolling, expanding, and ewm with the right window definitions (min_periods, time-based windows, center, adjust) to get correct, fast, vectorized results.


Quick Reference

| Window type | Best for | Key parameters |
| --- | --- | --- |
| rolling | Moving averages, volatility, custom window functions | window=3 (or "7D"), min_periods, center, win_type, on |
| expanding | Cumulative stats from the start | min_periods |
| ewm | Exponential decay smoothing or weighted metrics | span, alpha, halflife, adjust, times |

Sample Data

import pandas as pd
 
dates = pd.date_range("2024-01-01", periods=8, freq="D")
sales = pd.DataFrame({"date": dates, "revenue": [10, 12, 9, 14, 15, 13, 11, 16]})
sales = sales.set_index("date")

Rolling Windows (Fixed and Time-Based)

Fixed-size windows

sales["rev_ma3"] = (
    sales["revenue"]
    .rolling(window=3, min_periods=2)
    .mean()
)
  • min_periods controls when results start; early rows stay NaN until the minimum count is met.
  • center=True aligns the statistic to the middle of the window (handy for plots).
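
To make the two bullets above concrete, here is a minimal sketch that compares a trailing window with a centered one on the same revenue figures. The `trailing`, `centered`, and `comparison` names are illustrative, not part of the original example.

```python
import pandas as pd

dates = pd.date_range("2024-01-01", periods=8, freq="D")
revenue = pd.Series([10, 12, 9, 14, 15, 13, 11, 16], index=dates, name="revenue")

# Trailing window: the first row stays NaN because only one observation
# is available and min_periods=2 requires at least two.
trailing = revenue.rolling(window=3, min_periods=2).mean()

# Centered window: each value averages the previous, current, and next row.
# With the default min_periods (equal to the window size), the first and
# last rows are NaN because one neighbour is missing.
centered = revenue.rolling(window=3, center=True).mean()

comparison = pd.DataFrame(
    {"revenue": revenue, "trailing": trailing, "centered": centered}
)
print(comparison)
```

Note that a centered window reads the next row, so it suits plots and retrospective analysis better than features meant for forecasting.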

Time-based windows on datetime index or on= column

sales_reset = sales.reset_index()
sales_reset["rev_7d_mean"] = (
    sales_reset.rolling("7D", on="date")["revenue"].mean()
)
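  • Use duration strings ("7D", "48h") for irregular sampling; pandas selects the rows that fall within the lookback horizon rather than a fixed count. The timestamps must be sorted (monotonic).
  • closed controls which window endpoints are included. Time-based windows default to closed="right", which covers (t - 7 days, t]; closed="left" covers [t - 7 days, t) and so excludes the current row.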
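
The effect of the lookback horizon and of closed is easier to see on unevenly spaced data. The sketch below uses a small hypothetical `events` frame (not the article's sales data) with gaps between dates and a 3-day window.

```python
import pandas as pd

# Hypothetical, irregularly spaced observations.
events = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2024-01-01", "2024-01-02", "2024-01-05", "2024-01-06", "2024-01-10"]
        ),
        "revenue": [10.0, 12.0, 9.0, 14.0, 15.0],
    }
)

# Default closed="right": the window is (t - 3 days, t], so the current row
# is included and a row exactly 3 days old is not.
events["mean_3d"] = events.rolling("3D", on="date")["revenue"].mean()

# closed="left": the window is [t - 3 days, t), so the current row is
# excluded. Rows with no earlier observation in range come out as NaN.
events["mean_3d_prior"] = (
    events.rolling("3D", on="date", closed="left")["revenue"].mean()
)

print(events)
```

The number of rows in each window varies with the spacing of the dates, which is the main difference from a fixed-size window.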

Custom window functions

sales["rev_range"] = (
    sales["revenue"].rolling(4).apply(lambda x: x.max() - x.min(), raw=True)
)

Set raw=True to work with NumPy arrays inside apply for speed.
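
When a custom function can be expressed with built-in aggregations, the built-in form is usually faster. As a sketch, the range computed with apply above can also be written as a difference of rolling max and min; the variable names here are illustrative.

```python
import pandas as pd

revenue = pd.Series([10, 12, 9, 14, 15, 13, 11, 16], dtype="float64")

# Custom function: with raw=True the lambda receives a NumPy array per window.
range_apply = revenue.rolling(4).apply(lambda x: x.max() - x.min(), raw=True)

# Same statistic from built-in aggregations, with no Python call per window.
window = revenue.rolling(4)
range_builtin = window.max() - window.min()

# Both series are NaN for the first three rows and agree afterwards.
pd.testing.assert_series_equal(range_apply, range_builtin)
print(range_builtin)
```

Reserve apply for statistics that have no built-in equivalent.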


Expanding Windows (Cumulative)

sales["rev_cum_mean"] = sales["revenue"].expanding(min_periods=2).mean()
  • Use expanding when every observation should see everything up to that point (running average, cumulative ratios).
  • Combine with shift() to compare the newest value against the historical average.
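
The shift() pattern from the last bullet can be sketched as follows. The `hist_mean` and `vs_history` names are illustrative.

```python
import pandas as pd

dates = pd.date_range("2024-01-01", periods=8, freq="D")
revenue = pd.Series([10, 12, 9, 14, 15, 13, 11, 16], index=dates, name="revenue")

# Expanding mean of everything up to and including each row, then shifted
# by one row so that each date only sees strictly earlier observations.
hist_mean = revenue.expanding(min_periods=2).mean().shift(1)

# Ratio of the newest value to the historical average that preceded it.
vs_history = revenue / hist_mean

print(pd.DataFrame({"revenue": revenue, "hist_mean": hist_mean, "vs_history": vs_history}))
```

The first two rows are NaN: one from min_periods=2 and one from the shift.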

Exponential Weighted Windows

sales["rev_ewm_span4"] = sales["revenue"].ewm(span=4, adjust=False).mean()
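  • adjust=False uses the recursive formula y_t = (1 - alpha) * y_(t-1) + alpha * x_t, with alpha = 2 / (span + 1), which mirrors the exponential smoothing typically used in analytics dashboards.
  • halflife offers intuitive decay: ewm(halflife=3) halves the weight every 3 periods.
  • For irregular timestamps, pass a datetime64 Series or array to times together with a timedelta halflife, for example ewm(halflife="3 days", times=df["date"]), to weight by actual time deltas instead of row counts. Recent pandas versions no longer accept a column name string for times, and times is not supported with adjust=False.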
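
As a minimal sketch of what these options do, the block below first reproduces the adjust=False result with a hand-written recursion, then applies time-aware decay to a small hypothetical series. It assumes a recent pandas version in which times takes a datetime64 Series (not a column name) and requires a timedelta-like halflife; `stamps` and `values` are made-up data.

```python
import pandas as pd

revenue = pd.Series([10, 12, 9, 14, 15, 13, 11, 16], dtype="float64")

span = 4
alpha = 2 / (span + 1)  # span=4 corresponds to alpha=0.4
ewm_pandas = revenue.ewm(span=span, adjust=False).mean()

# Hand-written recursion for comparison:
# y_0 = x_0, then y_t = alpha * x_t + (1 - alpha) * y_(t-1).
smoothed = [revenue.iloc[0]]
for x in revenue.iloc[1:]:
    smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
ewm_manual = pd.Series(smoothed, index=revenue.index)

# Time-aware decay on hypothetical irregular timestamps: an observation
# that is 2 days old carries half the weight of the current one.
stamps = pd.Series(
    pd.to_datetime(["2024-01-01", "2024-01-03", "2024-01-04", "2024-01-09"])
)
values = pd.Series([10.0, 12.0, 9.0, 14.0])
ewm_time = values.ewm(halflife="2 days", times=stamps).mean()

print(pd.DataFrame({"pandas": ewm_pandas, "manual": ewm_manual}))
print(ewm_time)
```

The loop is only there to show the formula; in real code the vectorized ewm call is the one to use.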

Choosing the Right Window (Cheat Sheet)

| Goal | Recommended method | Notes |
| --- | --- | --- |
| Smooth short-term noise | rolling with small window and center=True | Works on numeric columns; keep min_periods ≥ 1 for early visibility |
| Running totals or averages from start | expanding | No fixed window; great for KPIs that accumulate |
| Decay older observations | ewm(span=...) | Better than large rolling windows for momentum-like signals |
| Irregular timestamps | Time-based rolling("7D", on="date") or ewm(halflife="3 days", times=df["date"]) | Avoids bias from denser sampling days |
| Feature generation | rolling().agg(["mean","std","min","max"]) | Multi-aggregation builds tidy feature sets quickly |
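
The feature-generation row can be sketched like this. The `rev_3d_` prefix is an illustrative naming choice, not a pandas convention.

```python
import pandas as pd

dates = pd.date_range("2024-01-01", periods=8, freq="D")
sales = pd.DataFrame({"revenue": [10, 12, 9, 14, 15, 13, 11, 16]}, index=dates)

# One pass over a 3-row window yields one column per aggregation.
features = (
    sales["revenue"]
    .rolling(3, min_periods=1)
    .agg(["mean", "std", "min", "max"])
    .add_prefix("rev_3d_")
)

# Join the features back onto the original frame by index.
sales_features = sales.join(features)
print(sales_features)
```

With min_periods=1 the mean, min, and max are available from the first row, while std is still NaN there because a sample standard deviation needs at least two observations.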

Performance and Correctness Tips

  • Keep datetime columns as datetime64[ns] and set a sorted datetime index when working heavily with time-based windows, since those windows require monotonic timestamps.
  • Prefer built-in aggregations (mean, std, sum, count) over Python apply for speed.
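  • Avoid forward-looking bias: when building supervised-learning features, apply shift(1) before rolling so each window ends at the previous row and never includes the value being predicted. Windows with center=True also read future rows.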
  • Combine with resample to normalize frequency before rolling if the source data is irregular.
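
The last two tips can be combined in one short sketch: resample irregular records to a daily grid, then shift before rolling so each feature uses only past days. The `raw` frame is hypothetical, and filling empty days with 0 via sum() assumes that a day without records means no revenue.

```python
import pandas as pd

# Hypothetical irregular records: two on Jan 1, none on Jan 3.
raw = pd.DataFrame(
    {"revenue": [10.0, 12.0, 9.0, 14.0, 15.0]},
    index=pd.to_datetime(
        [
            "2024-01-01 09:00",
            "2024-01-01 17:00",
            "2024-01-02 10:00",
            "2024-01-04 11:00",
            "2024-01-05 08:00",
        ]
    ),
)

# Normalize to one row per calendar day; days with no records sum to 0.
daily = raw["revenue"].resample("D").sum()

# Leaky feature: the window includes the current day's value.
leaky = daily.rolling(2).mean()

# Safe feature: shift first, so the window covers the two previous days only.
safe = daily.shift(1).rolling(2).mean()

print(pd.DataFrame({"daily": daily, "leaky": leaky, "safe": safe}))
```

The safe column is the leaky column delayed by one day, which is exactly the information that would have been available before each day's revenue was known.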

Rolling, expanding, and exponential windows cover most smoothing and feature-engineering needs without loops. Pair them with pd.to_datetime and resample for clean time axes, and you will get fast, reliable metrics ready for charts or models.