4-model average
45pt spread
The three supporting readings tell you how much weight to put on the probability: confidence reflects the category-level track record, stability tracks how the estimate has moved over time, and models shows whether the four models agree.
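As a rough illustration of how a multi-model average and a point spread could be derived, here is a minimal sketch. It assumes the spread is the gap between the highest and lowest model probability in percentage points; the page does not state its exact definition. The function name `summarize` and the four example values are hypothetical and are not the page's actual model outputs.

```python
from statistics import mean


def summarize(probabilities):
    """Return (average, spread) for model probabilities given in percent.

    Assumption: spread = highest reading minus lowest reading, in points.
    """
    if not probabilities:
        raise ValueError("need at least one model probability")
    average = mean(probabilities)
    spread = max(probabilities) - min(probabilities)
    return average, spread


# Hypothetical readings from four models, in percent (illustrative only).
example = [30, 45, 60, 75]
average, spread = summarize(example)
print(f"{len(example)}-model average: {average:.1f}%, spread: {spread}pt")
```

A wide spread relative to the average suggests the models disagree, so the averaged probability deserves less weight than it would if the readings were tightly clustered.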
Current trends in AI model development indicate a strong emphasis on efficiency and cost reduction, yet historical price declines in technology suggest that large drops in pricing are uncommon. For instance, while the price for cloud computing services has generally decreased, the overall cost per million tokens has fallen more slowly, with estimates indicating a cumulative decline of around 40% over the last five years. The combination of rising demand for AI services and potential increases in infrastructure costs may counteract steep reductions.
Given the rapid advancements and increasing competition in large language model development, alongside the drive for greater efficiency in training and inference, a significant YoY decline in context window pricing is highly probable. Companies like Google and OpenAI have consistently pushed for longer context windows at lower costs, and this trend is expected to accelerate with further technological breakthroughs and economies of scale, making a >60% drop by 2026 a strong possibility.
Large context window pricing has declined approximately 80-90% YoY from 2023-2025 (Claude 3.5 Sonnet: $3/$15 per MTok in 2024 → $0.80/$2.40 in 2025; GPT-4 Turbo context pricing dropped similarly). A 60% YoY decline in 2026 represents a deceleration from recent trends but remains plausible given: (1) continued architectural improvements and training efficiency gains, (2) increased competition from Claude, GPT, Gemini, and open-source models like Llama driving commoditization, and (3) the structural economics of serving longer contexts becoming routine. However, the 60% threshold is notably lower than 2024-2025 declines, suggesting we're on the flatter part of a logarithmic cost curve, making the prediction moderately likely but not highly probable.
Context pricing for 1M+ windows fell ~75% YoY in 2024-2025 (GPT-4o 128k at $2.50/$10 per M tokens to Gemini 1.5 Pro 1M at $0.35/$1.05), driven by 3-4x efficiency gains in attention mechanisms and 2x growth in HBM supply from Samsung and SK Hynix. NVIDIA B200 and AMD MI350 ramps in 2025-26 add 60-70% more effective FLOPs per dollar, while Anthropic, OpenAI, and Google have all signaled 2M-10M context products on 2026 roadmaps. Structural precedent from 2023-24 shows each new hardware generation produced 55-80% price drops within 12 months of volume availability.
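The rationales above mix cumulative figures (a decline of around 40% over five years) with year-over-year figures (a 60% threshold), which are not directly comparable. A minimal sketch for putting them on the same footing follows, assuming a constant compound rate of decline each year. The helper names are hypothetical, and the inputs are the figures quoted in the rationales, not independently verified prices.

```python
def annualized_decline(cumulative_decline, years):
    """Convert a cumulative fractional decline into an equivalent yearly rate.

    Assumption: the price falls by the same fraction every year.
    """
    if years <= 0:
        raise ValueError("years must be positive")
    if not 0.0 <= cumulative_decline <= 1.0:
        raise ValueError("cumulative_decline must be between 0 and 1")
    remaining = 1.0 - cumulative_decline
    return 1.0 - remaining ** (1.0 / years)


def cumulative_from_annual(annual_decline, years):
    """Convert a constant yearly fractional decline into a cumulative decline."""
    if years <= 0:
        raise ValueError("years must be positive")
    if not 0.0 <= annual_decline <= 1.0:
        raise ValueError("annual_decline must be between 0 and 1")
    return 1.0 - (1.0 - annual_decline) ** years


# A 40% cumulative decline over five years, expressed per year.
print(f"{annualized_decline(0.40, 5):.1%} per year")

# A 60% decline every year, compounded over five years.
print(f"{cumulative_from_annual(0.60, 5):.1%} over five years")
```

Under this assumption, a 40% cumulative decline over five years works out to a bit under 10% per year, far below a 60% year-over-year threshold, which is why the rationales that cite each figure reach such different conclusions.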