Model performance

We publish our forecast errors.

Walk-forward backtest, refreshed monthly. For every model version we hold out the trailing twelve months of observable data, refit on the prior history, predict each holdout period, and compare to two naive baselines (persistence and trailing-mean). Both expanding and rolling-window variants are run so you can see the stricter stability test alongside the headline number.
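The loop above can be sketched in a few lines. This is illustrative, not the production code: the toy series, the function names, and the 12-step holdout default are assumptions standing in for the real refit-and-predict machinery.

```python
def persistence(history):
    # Naive baseline: predict the last observed value.
    return history[-1]

def trailing_mean(history, k=12):
    # Naive baseline: predict the mean of the last k observations.
    window = history[-k:]
    return sum(window) / len(window)

def walk_forward(series, holdout=12):
    """Yield (actual, persistence_pred, trailing_mean_pred) for each
    holdout step, using only data strictly before that step."""
    start = len(series) - holdout
    for t in range(start, len(series)):
        history = series[:t]  # everything prior to the holdout step
        yield series[t], persistence(history), trailing_mean(history)

series = [100, 102, 101, 105, 107, 110, 108, 112, 115, 113, 118, 120,
          122, 121, 125]
rows = list(walk_forward(series, holdout=3))
```

In the real pipeline the baselines are replaced by a refit of the named model at each step; the point is that model and baselines see identical histories and are scored on identical periods.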

Warming up

First backtest run pending.

The backtest cron runs on day 28 of each month. Real numbers land here within 30 days of the first production deployment.

What we're testing

Each row on this page is the average MAPE (mean absolute percentage error) across every fittable (stage, metro) pair for the named model over the holdout window. Baselines are computed on the identical periods, so the comparison is apples-to-apples.
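A minimal sketch of that aggregation, assuming per-pair actuals and predictions are already in hand (the pair keys and numbers below are made up):

```python
def mape(actuals, preds):
    # Mean absolute percentage error over one holdout window, in percent.
    errors = [abs(a - p) / abs(a) for a, p in zip(actuals, preds)]
    return 100 * sum(errors) / len(errors)

# One entry per fittable (stage, metro) pair: {pair: (actuals, preds)}.
pairs = {
    ("seed", "austin"):  ([10.0, 12.0], [11.0, 12.0]),
    ("series_a", "nyc"): ([20.0, 25.0], [22.0, 24.0]),
}

# The headline number: unweighted average MAPE across every pair.
headline = sum(mape(a, p) for a, p in pairs.values()) / len(pairs)
```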

Why both windows

Expanding-window uses all prior history at each holdout step. Rolling uses the last 36 months only. A model that wins on expanding but loses on rolling is overfitting to regimes that aren't coming back — we'd rather show you that up front.
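The two window variants differ only in which slice of history feeds each refit. A sketch, with the 36-month rolling span taken from the text and everything else illustrative:

```python
def training_window(series, t, mode, span=36):
    """Return the training slice for holdout step t.
    'expanding' uses all prior history; 'rolling' keeps only the
    last `span` months before t."""
    if mode == "expanding":
        return series[:t]
    return series[max(0, t - span):t]

history = list(range(48))  # 48 months of toy data
exp = training_window(history, 40, "expanding")   # 40 months
roll = training_window(history, 40, "rolling")    # last 36 months
```

A model whose rolling-window error is much worse than its expanding-window error is leaning on old regimes the rolling fit never sees, which is exactly the overfitting signal described above.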

Raw data

Everything above is derived from /api/v1/performance. Pull it, verify it, graph it yourself. No auth required.
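A sketch of pulling it with only the standard library. The path is the one documented above; the base URL is a placeholder you substitute, and the response is assumed to be a JSON body (the page does not specify its schema):

```python
import json
from urllib.request import urlopen

BASE = "https://example.com"  # placeholder: substitute the real host

def performance_url(base=BASE):
    # No auth required, per the page; only the path is documented.
    return base + "/api/v1/performance"

def fetch_performance(base=BASE):
    # Pull the raw backtest rows and parse the JSON body.
    with urlopen(performance_url(base)) as resp:
        return json.load(resp)
```

From there it is a plain Python object you can recompute the averages from, or feed straight into your plotting tool of choice.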