The title might sound dramatic, and no, I didn't actually storm into a boardroom shouting about millions saved.
But what started as a casual read on a clever efficiency hack turned into a personal experiment in one of my projects — and the results were eye-opening. A simple adjustment in how data is formatted led to significant reductions in token usage, and at scale, that translates to real cost savings.
Here’s what I discovered and how you can apply it too.
Why Tokens Add Up
If you’ve used Gen AI APIs like OpenAI or Gemini, you know that every token counts — input and output alike.
For a handful of calls, the difference may seem trivial. But when you scale to thousands or millions of requests, even a small inefficiency quietly stacks into noticeable costs. This experiment began with a simple question: Can how data is formatted actually impact token usage?
Turns out, yes — and it’s more impactful than you’d think.
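Before getting into formats, it helps to be able to measure. Here's a minimal sketch, assuming the tiktoken npm package (a JavaScript port of OpenAI's tokenizer, not part of my original pipeline); exact counts vary by model:
JavaScript
const { encoding_for_model } = require("tiktoken");

// Count how many tokens a given payload costs for a given model
function countTokens(text, model = "gpt-4") {
  const enc = encoding_for_model(model);
  const count = enc.encode(text).length;
  enc.free(); // the encoder is WASM-backed and must be freed explicitly
  return count;
}

console.log(countTokens(`{"invoice_id": "INV-342", "vendor": "CodeTerra"}`));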
Experimenting With Data Formats
The theory suggested that using YAML instead of JSON could reduce token usage. I wanted to see it in practice.
JSON (Standard)
{
"invoice_id": "INV-342",
"vendor": "CodeTerra",
"amount": 4500,
"currency": "USD"
}
YAML (Alternative)
YAML
invoice_id: INV-342
vendor: CodeTerra
amount: 4500
currency: USD
Both contain identical information. But the alternative format eliminates extra punctuation — braces, commas, quotes — which reduces the number of tokens when processed by the AI model.
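And the two really are the same data. A quick check using the js-yaml library (the same one used in the pipeline below) confirms the round-trip and shows the size gap; feed the serialized strings into the countTokens helper above to see the token difference:
JavaScript
const assert = require("assert");
const yaml = require("js-yaml");

const invoice = {
  invoice_id: "INV-342",
  vendor: "CodeTerra",
  amount: 4500,
  currency: "USD",
};

const asJson = JSON.stringify(invoice, null, 2);
const asYaml = yaml.dump(invoice);

// Round-trip both serializations back to objects and compare
assert.deepStrictEqual(JSON.parse(asJson), yaml.load(asYaml));

console.log("JSON chars:", asJson.length); // more punctuation, more characters
console.log("YAML chars:", asYaml.length); // leaner, fewer characters and tokens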
Results From My Project
In my specific test case, I saw:
- 20% drop in output tokens.
- 30% drop in character count.
It’s the kind of optimization that, at scale, feels like “millions saved” — even if it’s metaphorical for your project.
NOTE
Results vary with the structure of the data; some datasets see even larger savings, but it's worth testing in any high-volume application.
Implementing It Without Breaking Your Workflow
Most systems still expect JSON, so I set up a simple conversion pipeline to translate the AI’s YAML output back into JSON for my backend:
JavaScript
const yaml = require("js-yaml");
// Example AI output in YAML
const yamlData = `
invoice_id: INV-342
vendor: CodeTerra
amount: 4500
currency: USD
`;
// Convert YAML → JSON
const json = yaml.load(yamlData);
console.log(json);
Fast, lightweight, and it preserves efficiency while keeping backend systems happy.
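The other half of the pipeline is getting the model to emit YAML in the first place. Here's a minimal sketch using the official openai Node SDK (the model name, prompt wording, and extractInvoice helper are placeholders for illustration, not my exact setup); note the try/catch, since yaml.load throws on malformed output:
JavaScript
const OpenAI = require("openai");
const yaml = require("js-yaml");

const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function extractInvoice(text) {
  const response = await client.chat.completions.create({
    model: "gpt-4o-mini", // placeholder; use whatever model you call today
    messages: [
      {
        role: "system",
        content:
          "Extract invoice fields and reply with plain YAML only: " +
          "invoice_id, vendor, amount, currency. No code fences, no prose.",
      },
      { role: "user", content: text },
    ],
  });

  const yamlOut = response.choices[0].message.content;
  try {
    return yaml.load(yamlOut); // plain JS object, ready for the backend
  } catch (err) {
    // Malformed YAML from the model: fail loudly rather than pass bad data on
    throw new Error(`Could not parse model output as YAML: ${err.message}`);
  }
}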
Why This Actually Works
- Fewer symbols = fewer tokens.
- Compact structure = higher semantic density.
- Cleaner data = easier for the model to process.
Key Takeaways
IMPORTANT
Test ideas yourself: Reading about a hack is one thing; validating it in your workflow is another.
Serialization matters: It affects both readability and cost.
Small tweaks multiply: Token savings compound quickly at scale.
Practical conversion is easy: Integrate efficiency without changing your existing backend.
Final Thoughts
This experiment showed me that optimization doesn’t always require rewriting models or infrastructure. Sometimes, it’s about noticing tiny inefficiencies that quietly eat resources.
The “millions saved” might be metaphorical here, but at scale, the idea is real: small changes in AI workflows can have outsized impacts on costs and performance.
