LinkedIn Post
Don't buy a 10-ton truck to deliver 4 tons of bricks
by Chris Hornby
14 January 2026
AI cost reduction is a sign of a maturing ecosystem. Like most serious AI users, we benchmark our repeatable work so we can assess the latest LLM releases. Increasingly, this is as much about cost reduction as about unlocking new use cases.
During 2025, we usually ran benchmarks to see whether models had become capable of work that was previously too difficult. As we move into 2026, we have started asking: "can a cheaper model do this work at the same quality?"
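That question reduces to a simple threshold check: score the cheaper candidate on the same graded benchmark as the incumbent and see if it clears the quality bar. A minimal sketch (the scores, model labels, and bar below are illustrative placeholders, not our actual benchmark data):

```python
# Minimal sketch of a "cheaper model, same quality" check.
# In practice, per-document scores would come from graded benchmark
# runs; everything below is hypothetical for illustration.

from statistics import mean

QUALITY_BAR = 0.85  # minimum acceptable mean quality score (assumed)

def meets_bar(scores, bar=QUALITY_BAR):
    """True if the model's mean benchmark score clears the quality bar."""
    return mean(scores) >= bar

# Hypothetical per-document quality scores (0..1) from a grading rubric
incumbent_scores = [0.91, 0.88, 0.90, 0.87]   # larger, pricier model
candidate_scores = [0.90, 0.89, 0.88, 0.91]   # smaller, cheaper model

if meets_bar(candidate_scores):
    print("candidate clears the bar: switch and bank the savings")
else:
    print("candidate falls short: keep the incumbent")
```

The point of the sketch is that the incumbent's score only matters for setting the bar; once the bar is fixed, any model that clears it is good enough.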
Mini case study: Venato.ai processes thousands of regulatory documents a day. Our latest benchmarks show that Gemini 3 Flash produces summaries of equivalent or higher quality than Gemini 2.5 Pro. This means Venato can meet the quality bar at less than half the cost, allowing us to broaden jurisdiction coverage on the same budget.
Simplistic analogy: as long as your quality bar is met, "better" has no real meaning. If you need to deliver 4 tons of bricks at a time, a 3-ton truck won't cut it. But there is no added value in buying a 5- or 10-ton truck: your customers get no extra value, and delivery can actually be slower and more expensive.
Key takeaway: it is a real sign of maturing LLM capability when we accept that LLMs can do real work and start asking whether we can get it done cheaper.
There are benefits beyond cost, not least environmental: when a smaller model meets the quality bar, processing and compute times drop, meaning less energy and water usage.