Methodology

What this is

PokeFuture uses machine learning to estimate the price of a sealed Pokémon product 3 years from now. Forecasts are educational, not financial advice. Treat the numbers as ranges, not targets.

The model

We use XGBoost, an open-source machine-learning library that trains gradient-boosted decision trees. Our model takes a sealed product's current price, its set's history, the strength of singles inside the set, and live marketplace signals, and outputs a projected price 3 years from now.

Training data: 892 historical price observations across 25 sealed products. We've validated 1-year and 3-year horizons against real outcomes (916 captured 1-year outcomes and 378 captured 3-year outcomes). The 3-year model is the one published on every product page. Longer horizons aren't published because we don't yet have enough multi-year outcomes in the training panel to score them honestly.

A real limitation: the universe of English sealed Pokémon products is small, roughly 300 actively tracked. Many products in our training data are also products we forecast. That overlap creates a risk of overfitting. The model may look more accurate on backtests than it really is. We're honest about this and keep collecting fresh outcomes each month to widen the training panel.

The signals

The model uses 11 signals from three sources: eBay (live listings, prices, supply), PriceCharting (sealed price history and set metadata), and community engagement (Reddit activity, Google Trends).

On every product page these signals are summarized as five dimensions, the dots you see in the "Why this rating" card:

  • Outlook. Long-term price direction (history, momentum, drawdown).
  • Cards inside. Singles backing: how much of the box's value sits in the cards.
  • Community. Collector engagement (Reddit, Google Trends).
  • Resale ease. Liquidity (active eBay listings, sales velocity).
  • Price reliability. Source agreement (PriceCharting vs eBay).

Each dimension also feeds a confidence indicator that tells you how much to trust the headline forecast. When data is thin or sources disagree, confidence drops and Buy or Sell signals are automatically demoted to Hold.

The process

  1. Collect data

    Daily syncs from eBay and PriceCharting. Monthly refresh of community signals.

  2. Normalize

    Standardize across sources, fill missing values, compute derived features like momentum, volatility, and singles ratios.

  3. Train and validate

    Retrain XGBoost monthly. Check accuracy against held-out historical outcomes. Only deploy approved models.

  4. Forecast and guard

    Run the trained model per product, apply scenario adjustments and guardrails (ROI cap, reprint risk, weak-evidence demotion), publish.
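
The derived features named in step 2 (momentum, volatility, singles ratios) can be sketched roughly as follows. The exact definitions are not given here, so the formulas, function names, and the lookback window are assumptions for illustration, not the production feature code.

```python
from statistics import pstdev
from typing import Sequence


def momentum(prices: Sequence[float], lookback: int = 3) -> float:
    """Percent change between the latest price and the price `lookback` steps earlier.

    The lookback of 3 observations is a hypothetical default.
    """
    if len(prices) <= lookback:
        raise ValueError("need more price observations than the lookback")
    return prices[-1] / prices[-1 - lookback] - 1


def volatility(prices: Sequence[float]) -> float:
    """Population standard deviation of step-over-step returns (assumed definition)."""
    if len(prices) < 2:
        raise ValueError("need at least two price observations")
    returns = [later / earlier - 1 for earlier, later in zip(prices, prices[1:])]
    return pstdev(returns)


def singles_ratio(singles_value: float, sealed_price: float) -> float:
    """Share of the sealed price backed by the value of the singles inside (assumed definition)."""
    if sealed_price <= 0:
        raise ValueError("sealed price must be positive")
    return singles_value / sealed_price


# Made-up price series that grows 10% per step.
history = [100.0, 110.0, 121.0, 133.1]
print(momentum(history), volatility(history), singles_ratio(150.0, 300.0))
```

A steadily rising series like the one above has high momentum and near-zero volatility, which is why the two are worth computing separately.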

FAQ

How accurate is the forecast?

Across 916 past 1-year predictions, our model's forecasts missed the actual price by 27.4% on average. The 3-year model has similar accuracy on its 378 validated outcomes. Treat each projection as a range, not a target. A "$500 in 3 years" prediction would typically land between roughly $365 and $635.
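
To see how an average error turns into a range, the sketch below applies the 27.4% figure symmetrically around a projection. The function name and the symmetric band are assumptions for illustration; real misses are not necessarily symmetric.

```python
from typing import Tuple


def error_band(projected_price: float, avg_error: float = 0.274) -> Tuple[float, float]:
    """Return (low, high) by applying the average error on both sides of the projection."""
    if projected_price < 0:
        raise ValueError("projected price must be non-negative")
    low = projected_price * (1 - avg_error)
    high = projected_price * (1 + avg_error)
    return round(low, 2), round(high, 2)


low, high = error_band(500.0)
print(low, high)
```

At exactly 27.4% the band for a $500 projection works out to about $363 to $637, close to the rounded figures quoted above.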

Why is my product rated Hold when I expected Buy?

A Buy requires both a projected return above the S&P 500 baseline (10.5% per year, compounded over 3 years, or about 34.9% in total) and confidence above a threshold. If the data is sparse or sources disagree, confidence drops and the rating is automatically demoted to Hold.
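
A minimal sketch of that rule, assuming confidence is a score between 0 and 1. The threshold value of 0.6 and the function names are hypothetical (the actual threshold is not stated here), and Sell logic is left out because it is not described.

```python
SP500_ANNUAL_RETURN = 0.105   # baseline stated in the text
HORIZON_YEARS = 3             # forecast horizon stated in the text
CONFIDENCE_THRESHOLD = 0.6    # hypothetical value, for illustration only


def baseline_return(annual: float = SP500_ANNUAL_RETURN, years: int = HORIZON_YEARS) -> float:
    """Total return from compounding the annual baseline over the horizon."""
    return (1 + annual) ** years - 1


def rate(current_price: float, projected_price: float, confidence: float) -> str:
    """Return "Buy" only when both conditions hold; otherwise "Hold"."""
    if current_price <= 0:
        raise ValueError("current price must be positive")
    projected_return = projected_price / current_price - 1
    if confidence <= CONFIDENCE_THRESHOLD:
        return "Hold"  # weak evidence demotes the rating regardless of return
    if projected_return > baseline_return():
        return "Buy"
    return "Hold"


print(round(baseline_return(), 4))   # compounded 3-year hurdle
print(rate(100.0, 140.0, 0.9))       # strong return, strong confidence
print(rate(100.0, 140.0, 0.3))       # strong return, weak confidence
```

Under these assumptions a product has to be projected to gain more than roughly 35% over 3 years before it can be rated Buy, which is why a forecast that looks positive can still show as Hold.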

Why a 3-year projection and not longer?

Our 3-year XGBoost model is trained directly on 378 real 3-year price outcomes and validated with 5-fold time-series cross-validation. Longer horizons need feature snapshots that reach further back than our 4-year panel does, and farther-out projections compound error. 3 years is the sweet spot: long enough to be useful for collectors, short enough to be backed by a directly measured model.
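
Time-series cross-validation differs from ordinary k-fold in that each fold trains only on observations that come before the ones it is tested on. The sketch below shows one common way to build such folds, an expanding training window similar in spirit to scikit-learn's TimeSeriesSplit. Whether the production pipeline splits its data exactly this way is an assumption.

```python
from typing import Iterator, Tuple


def time_series_splits(n_samples: int, n_splits: int = 5) -> Iterator[Tuple[range, range]]:
    """Yield (train_indices, test_indices) with an expanding training window.

    Samples are assumed to be sorted oldest to newest, so every training
    index is earlier in time than every test index in the same fold.
    """
    test_size = n_samples // (n_splits + 1)
    if test_size == 0:
        raise ValueError("not enough samples for the requested number of splits")
    first_test_start = n_samples - n_splits * test_size
    for start in range(first_test_start, n_samples, test_size):
        yield range(0, start), range(start, start + test_size)


# 378 is the number of 3-year outcomes mentioned in the text.
for fold, (train, test) in enumerate(time_series_splits(378), start=1):
    print(fold, len(train), len(test))
```

With 378 outcomes and 5 folds, each test block holds 63 outcomes and the training window grows from 63 to 315, so the model is never scored on data older than what it learned from.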

Why does the eBay median differ from the Current price?

Current price comes from PriceCharting, a reference price source. The eBay median is the median asking price across live eBay listings right now. The two typically differ by 10 to 30% depending on market activity. When the gap exceeds 15% we surface it on the detail page.
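
The gap check can be sketched as below. Measuring the gap relative to the reference price is an assumption, as are the function name and the sample prices; only the 15% threshold comes from the text.

```python
from statistics import median
from typing import Sequence, Tuple

GAP_THRESHOLD = 0.15  # gap above which the difference is surfaced, per the text


def price_gap(reference_price: float, ebay_asking_prices: Sequence[float]) -> Tuple[float, float, bool]:
    """Return (ebay_median, gap, should_surface).

    The gap is the absolute difference divided by the reference price
    (assumed convention).
    """
    if reference_price <= 0:
        raise ValueError("reference price must be positive")
    if not ebay_asking_prices:
        raise ValueError("need at least one eBay listing")
    ebay_median = median(ebay_asking_prices)
    gap = abs(ebay_median - reference_price) / reference_price
    return ebay_median, gap, gap > GAP_THRESHOLD


# Made-up listings for illustration.
print(price_gap(200.0, [230.0, 240.0, 250.0]))  # 20% gap, surfaced
print(price_gap(200.0, [205.0, 210.0, 215.0]))  # 5% gap, not surfaced
```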

How fresh is the data?

Detail pages and live eBay listings each refresh once a day. The monthly buying list rebuilds on the 1st of each month, around 2 AM UTC.

What products won't have a forecast?

Brand-new sets (less than 1 year old, no history to learn from), products with too many estimated features, and products outside our core type list (Booster Boxes, Elite Trainer Boxes, Booster Bundles, Collection Boxes, Booster Packs).
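
Those three exclusions can be expressed as a simple eligibility check. The sketch below is illustrative only: the `Product` fields, the singular type labels, and the cap on estimated features are hypothetical, since the real cutoff for "too many estimated features" is not stated here.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

CORE_TYPES = {
    "Booster Box",
    "Elite Trainer Box",
    "Booster Bundle",
    "Collection Box",
    "Booster Pack",
}
MIN_AGE_DAYS = 365            # "less than 1 year old" from the text
MAX_ESTIMATED_FEATURES = 3    # hypothetical cap, for illustration only


@dataclass
class Product:
    name: str
    product_type: str
    release_date: date
    estimated_features: int   # count of features filled in by estimation


def exclusion_reason(product: Product, today: date) -> Optional[str]:
    """Return why a product gets no forecast, or None if it is eligible."""
    if product.product_type not in CORE_TYPES:
        return "outside core type list"
    if (today - product.release_date).days < MIN_AGE_DAYS:
        return "set less than 1 year old"
    if product.estimated_features > MAX_ESTIMATED_FEATURES:
        return "too many estimated features"
    return None


today = date(2025, 1, 1)
print(exclusion_reason(Product("Example Box", "Booster Box", date(2022, 1, 1), 0), today))
print(exclusion_reason(Product("Example Tin", "Tin", date(2022, 1, 1), 0), today))
```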

Contribute

This is a side project, not a hedge fund. If you spot a wrong price, a missing product, a forecast that doesn't match your read of the market, or a methodology gap, we want to hear about it. Send us a note through the contact form. Every bug report makes the next forecast better.