Transparency as method — 2026

Most prediction services publish their best days. Not their worst.

We publish both. As of today, every bet we have placed since April 21 on Polymarket's daily maximum-temperature markets is public, wins and losses alike. Fifty cities, fourteen weather models, three types of bets, and every daily result verified against the actual observation.

Why? Because a probabilistic forecast without a verifiable track record isn't science. It's marketing.

Why we publish everything

The first two posts in this series defended a simple idea: weather is probabilistic. At a 24-hour horizon, the natural uncertainty of a forecast is 1 to 2°C — often wider than a Polymarket bucket. At 3 days, you're closer to 2 to 3°C. That's not a technical defect: it's the atmosphere itself.

Direct consequence: a probabilistic system must be wrong regularly. Mathematically. If we say "85% chance this NO holds," then 15% of the time, about one time in seven, it doesn't. That is exactly what 85% means.
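To make that concrete, here is a minimal sketch of a calibration check. The function name and the sample outcomes are ours, invented for illustration; they are not real track-record data. For a group of bets all stated at the same probability, it compares the observed hit rate with the stated one.

```python
def calibration_gap(stated_probability, outcomes):
    """Observed hit rate minus stated probability.

    outcomes is a list of booleans: True if the bet held, False if it failed.
    A gap near zero is what a calibrated forecast should produce over many bets.
    """
    hit_rate = sum(outcomes) / len(outcomes)
    return hit_rate - stated_probability


# Hypothetical sample: 20 bets stated at 85%, of which 17 held and 3 failed.
outcomes = [True] * 17 + [False] * 3
gap = calibration_gap(0.85, outcomes)
print(f"calibration gap: {gap:+.3f}")
```

Three misses out of twenty is not a failure here. It is the miss rate an 85% forecast should show. A gap far above or far below zero over many bets is what would signal a problem.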

Hiding the days when it doesn't is lying about the very nature of the tool. A track record page that only shows green bars is a marketing argument — not a proof.

What you actually see

Here's what you'll find on demfi.io/track-record:

Three families of bets, each tracked separately. Long Shots (YES positions on extreme buckets, taken only when the market pays under $0.05). Edge Bets (the zone near the mode, where forecasting requires 1°C precision; still in an experimental phase). Safe Bets (NO positions on buckets the model rules out with high confidence).
Four Markov confidence tiers per city: HIGH, MEDIUM, LOW, NONE. They reflect the maturity of our calibration on that specific city — not an opinion score.
Every line traceable: by day, by city, by bucket. You can click any point on the chart and see the underlying bets, their entry price, their resolution.

One point worth clarifying, because many people will read it wrong: NONE does not mean "low confidence." It means the city hasn't accumulated enough resolved days yet for us to assign a real score. New cities sit in NONE until they have enough history, then migrate to a real tier. Confusing the two leads to wrong conclusions.

What works, what works less

The numbers speak for themselves — go look. But here's the honest summary as we see it.

Safe Bets at HIGH confidence are our most mature signal. When our model says "the daily high will not land in this bucket, with ≥ 97% probability," and the market still sells the NO for 86 to 94 cents, the edge is real and reproducible.
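The arithmetic behind that edge can be sketched in a few lines. We assume the standard binary-market payoff: a NO share costs its price and pays $1 if the bucket misses. Here p_no is the model's probability that the bucket misses. Fees and slippage are ignored, and the function name is ours.

```python
def no_edge_per_share(p_no, no_price):
    """Expected profit per NO share, in dollars.

    Win (1 - no_price) with probability p_no, lose no_price otherwise.
    Algebraically this simplifies to p_no - no_price.
    """
    return p_no * (1.0 - no_price) - (1.0 - p_no) * no_price


# The two ends of the price range quoted above, at the 97% threshold.
for price in (0.86, 0.94):
    edge = no_edge_per_share(0.97, price)
    print(f"NO at ${price:.2f}: expected edge {edge:+.3f} per share")
```

At a 97% model probability, the expected edge runs from about 3 cents per share (NO at $0.94) to about 11 cents per share (NO at $0.86), before any execution cost.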

Edge Bets are the most technically difficult. You need to forecast within a degree, getting right both the winning bucket and the two or three losing buckets next to it. That's why they remain experimental: we keep refining them.

Long Shots live off a few big hits. They pay rarely. When they do pay, returns can be twenty to a hundred times the stake — but Polymarket slippage and the limited liquidity on those buckets force us into stakes of $1, occasionally a few dollars. No more.
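The "twenty to a hundred times" figure follows directly from the entry price. A rough sketch, again assuming a YES share pays $1 on a win and ignoring fees and slippage (function names are ours):

```python
def payout_multiple(yes_price):
    """Gross payout per dollar staked on a YES share that pays $1 if it wins."""
    return 1.0 / yes_price


def break_even_hit_rate(yes_price):
    """Win frequency at which expected profit is zero: it equals the price."""
    return yes_price


for price in (0.05, 0.01):
    multiple = payout_multiple(price)
    rate = break_even_hit_rate(price)
    print(f"YES at ${price:.2f}: pays {multiple:.0f}x, break-even hit rate {rate:.0%}")
```

A YES bought at $0.05 pays 20 times the stake and one bought at $0.01 pays 100 times. The other side of the coin: at $0.05 the bucket has to hit more than one time in twenty just to break even, which is why these bets pay rarely and depend on a few big hits.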

Nobody wins every day. Not us, not anyone. The track record says so plainly.

How to read this correctly

The trap, looking at these numbers, is to hunt for the right combination of the day. There is no such thing.

Winning on Polymarket isn't picking the winning bucket. It's holding, every day and for every city, a coherent combination of YES and NO positions across the distribution. The result comes from diversity across dozens of independent markets, not from the lone bet that stands out. That's what we tried to capture in the performance matrix: to understand what works, look at the (type × confidence) blocks, not at individual rows.
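For readers who export the rows and want to rebuild that block view themselves, here is one possible sketch. The field names (type, tier, pnl) and the sample rows are hypothetical, made up for illustration; they are not the site's actual schema or real results.

```python
from collections import defaultdict


def block_summary(bets):
    """Aggregate resolved bets into (bet type, confidence tier) blocks."""
    blocks = defaultdict(lambda: {"n": 0, "wins": 0, "pnl": 0.0})
    for bet in bets:
        block = blocks[(bet["type"], bet["tier"])]
        block["n"] += 1
        block["wins"] += 1 if bet["pnl"] > 0 else 0
        block["pnl"] += bet["pnl"]
    # Add the win rate once each block is complete.
    return {key: {**b, "win_rate": b["wins"] / b["n"]} for key, b in blocks.items()}


# Made-up rows, for illustration only.
sample = [
    {"type": "safe", "tier": "HIGH", "pnl": 0.10},
    {"type": "safe", "tier": "HIGH", "pnl": 0.08},
    {"type": "safe", "tier": "HIGH", "pnl": -0.90},
    {"type": "long_shot", "tier": "HIGH", "pnl": -1.00},
]
summary = block_summary(sample)
for key, stats in sorted(summary.items()):
    print(key, stats)
```

Note what the made-up Safe Bet block shows: two wins out of three, and still a negative total, because one lost NO costs far more than one won NO earns. That is the kind of thing a block makes visible and a single row hides.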

Constant or proportional sizing. Never all-in on a single call. It really is that simple.

What we do not (yet) account for

Three caveats, because transparency runs both ways:

The figures shown do not account for slippage or limited liquidity on certain Polymarket buckets — particularly true for Long Shots. A theoretical edge that doesn't materialize at execution is what we call a dead edge.
Past results do not guarantee future performance. Three months of track record guarantee nothing. Neither do six.
Nothing you read here is financial advice. These are decision-support tools for people who take their own risks.
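To illustrate the first caveat, here is a small sketch of how slippage can kill an edge. It assumes slippage simply adds to the quoted price to give the fill price; the numbers and the function name are hypothetical.

```python
def net_edge(p_model, quoted_price, slippage):
    """Expected profit per $1-payout share once slippage is added to the entry price."""
    fill_price = quoted_price + slippage
    return p_model - fill_price


# Hypothetical: model at 97%, 4 cents of slippage, two different quotes.
for quote in (0.86, 0.94):
    edge = net_edge(0.97, quote, 0.04)
    status = "alive" if edge > 0 else "dead"
    print(f"quote ${quote:.2f}: net edge {edge:+.3f} ({status})")
```

With the same model probability, the edge survives at a quote of $0.86 and turns negative at $0.94. On thin buckets, the theoretical edge and the executable edge are two different numbers.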

The track record lives at demfi.io/track-record. Connect any wallet — a 7-day Premium trial is automatically granted on first connection.

Happy analyzing,

— JP