Crene
© 2026 Crene, Inc.
Part of: Does realized enterprise economic adoption of generative AI materially lag AI infrastructure investment expansion by December 31, 2026?
Event · UNIT ECONOMICS & PRICING

Will model context window pricing (cost per million tokens for 1M+ context) decline more than 60% YoY in 2026?

Resolves Dec 31, 2026
Probability: 61% (4-model average)

Confidence: LOW (building category history)

Stability: not yet available

Models: Diverging (45pt spread)

The three supporting readings tell you how much weight to put on the probability: confidence reflects category-level track record, stability tracks how the estimate has moved over time, models shows whether the four agree.

Per-model probabilities

GPT-4o
30% (-31 vs avg)

Current trends in AI model development indicate a strong emphasis on efficiency and cost reduction, yet historical price declines in technology suggest that large drops in pricing are uncommon. For instance, while the price for cloud computing services has generally decreased, the overall cost per million tokens has fallen more slowly, with estimates indicating a cumulative decline of around 40% over the last five years. The combination of rising demand for AI services and potential increases in infrastructure costs may counteract steep reductions.

Key uncertainty: The impact of breakthrough technologies or regulatory changes on the cost structures of AI model training and deployment could significantly alter pricing dynamics.

Gemini
75% (+14 vs avg)

Given the rapid advancements and increasing competition in large language model development, alongside the drive for greater efficiency in training and inference, a significant YoY decline in context window pricing is highly probable. Companies like Google and OpenAI have consistently pushed for longer context windows at lower costs, and this trend is expected to accelerate with further technological breakthroughs and economies of scale, making a >60% drop by 2026 a strong possibility.

Key uncertainty: The rate of hardware innovation (e.g., specialized AI chips, memory bandwidth) will be a critical factor in determining the feasibility and cost-effectiveness of processing increasingly larger context windows.

Claude
72% (+11 vs avg)

Large context window pricing has declined approximately 80-90% YoY from 2023-2025 (Claude 3.5 Sonnet: $3/$15 per MTok in 2024 → $0.80/$2.40 in 2025; GPT-4 Turbo context pricing dropped similarly). A 60% YoY decline in 2026 represents a deceleration from recent trends but remains plausible given: (1) continued architectural improvements and training efficiency gains, (2) increased competition from Claude, GPT, Gemini, and open-source models like Llama driving commoditization, and (3) the structural economics of serving longer contexts becoming routine. However, the 60% threshold is notably lower than 2024-2025 declines, suggesting we're on the flatter part of a logarithmic cost curve, making the prediction moderately likely but not highly probable.

Key uncertainty: Whether a major capability breakthrough (e.g., new architectures dramatically reducing inference costs) occurs in late 2025/early 2026 or, conversely, whether pricing stabilizes around gross margin targets (40-50%) that providers defend. Either scenario would significantly alter YoY decline rates.

Grok
68% (+7 vs avg)

Context pricing for 1M+ windows fell ~75% YoY in 2024-2025 (GPT-4o 128k at $2.50/$10 per M tokens to Gemini 1.5 Pro 1M at $0.35/$1.05), driven by 3-4x efficiency gains in attention mechanisms and 2x growth in HBM supply from Samsung and SK Hynix. NVIDIA B200 and AMD MI350 ramps in 2025-26 add 60-70% more effective FLOPs per dollar, while Anthropic, OpenAI, and Google have all signaled 2M-10M context products on 2026 roadmaps. Structural precedent from 2023-24 shows each new hardware generation produced 55-80% price drops within 12 months of volume availability.

Key uncertainty: Whether memory bandwidth constraints from HBM4 delays limit the effective cost reduction below 60% even if FLOPs-per-dollar improves.

Key disagreement: Gemini (75%) vs GPT-4o (30%); different weighting of factors
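
The headline figures can be reproduced from the four per-model probabilities listed above. The sketch below assumes the "4-model average" is a simple unweighted mean and the "spread" is the gap between the highest and lowest model; the page does not document the exact method, and the function name `summarize` is hypothetical.

```python
from statistics import mean

# Per-model probabilities (in percent) as listed on this page.
model_probs = {"GPT-4o": 30, "Gemini": 75, "Claude": 72, "Grok": 68}


def summarize(probs):
    """Return (rounded average, max-min spread, per-model deviation from average).

    Assumption: unweighted mean and max-min spread, which happen to match
    the figures shown on the page.
    """
    values = list(probs.values())
    avg = mean(values)
    spread = max(values) - min(values)
    deviations = {name: round(p - avg) for name, p in probs.items()}
    return round(avg), spread, deviations


headline, spread, deviations = summarize(model_probs)
print(f"average: {headline}%  spread: {spread}pt")
for name, delta in deviations.items():
    print(f"{name}: {delta:+d} vs avg")
```

Under these assumptions the output matches the card: a 61% average, a 45pt spread, and deviations of -31, +14, +11, and +7.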

Resolution criteria

Source: Frontier provider pricing pages 2026
CRENE-AIER-C087-20261231 · Generated Jun 28, 2026
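
As a rough illustration of how the 60% threshold could be checked, the sketch below computes a year-over-year decline from two prices. It assumes the comparison is between a single start-of-period and end-of-period price per million tokens; the page does not specify which providers, tiers, or dates are compared, and the prices and the function name `yoy_decline` are hypothetical.

```python
def yoy_decline(price_start: float, price_end: float) -> float:
    """Fractional year-over-year price decline (0.65 means a 65% drop)."""
    if price_start <= 0:
        raise ValueError("price_start must be positive")
    return (price_start - price_end) / price_start


THRESHOLD = 0.60  # the question asks for a decline of MORE than 60%

# Hypothetical prices per million tokens, for illustration only.
example_decline = yoy_decline(10.0, 3.5)
resolves_yes = example_decline > THRESHOLD

# A decline of exactly 60% would not meet a strict "more than 60%" test.
boundary_resolves_yes = yoy_decline(10.0, 4.0) > THRESHOLD

print(f"decline: {example_decline:.0%}, resolves yes: {resolves_yes}")
```

The strict inequality matters at the boundary: under this reading, a drop of exactly 60% would resolve "no".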