Most developers have a short mental list of data formats: JSON for APIs, CSV for spreadsheets, YAML for config files, maybe XML if you're working with legacy systems. But there's a newer format quietly making its way into the AI tooling space that's worth knowing about — TOON, or Token-Oriented Object Notation. It was designed from the ground up for a specific problem: passing structured data to large language models without burning through your token budget.
What Is TOON?
TOON is a compact data serialization format purpose-built to minimise the number of tokens consumed when structured data is embedded in LLM prompts or responses. Think of it as JSON with the verbosity stripped out: no repeated key names in arrays and no redundant quotes, just the data with as little syntactic noise as possible.
The npm package is @toon-format/toon, and it gives you a straightforward encode / decode API that works in any Node.js project or modern bundler.
TOON files use the .toon extension.
Why Not Just Use JSON?
JSON is excellent for machine-to-machine communication where bandwidth is cheap and parsing is handled by the runtime. But when you're sending data as part of a prompt to the OpenAI API or the Anthropic API, every character counts — literally. Both APIs charge by the token, and tokens map roughly to 4 characters of English text.
Consider a table of 100 user records. In JSON, you'd repeat the keys ("id", "name", "role", "email") once per record. That's 100 copies of the same structural information. TOON's tabular syntax defines those keys once and then lists values row by row, the same way a CSV does, but with the object structure preserved. The token savings on real-world datasets can be 40–70% compared to compact JSON.
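To make the repeated-key argument concrete, here is a rough back-of-the-envelope sketch. It builds 100 made-up user records, serialises them once as compact JSON and once as a hand-built tabular layout, and compares sizes using the approximate 4-characters-per-token rule mentioned above. The sample data and the estimateTokens helper are assumptions for illustration; a real tokenizer will give different numbers.

```javascript
// Build 100 same-shaped user records (made-up sample data).
const records = Array.from({ length: 100 }, (_, i) => ({
  id: i + 1,
  name: `User${i + 1}`,
  role: i % 2 === 0 ? 'admin' : 'user',
  email: `user${i + 1}@example.com`,
}));

// Compact JSON repeats every key once per record.
const json = JSON.stringify(records);

// Tabular layout: keys declared once in a header, then one row of values per record.
const keys = Object.keys(records[0]);
const header = `users[${records.length}]{${keys.join(',')}}:`;
const rows = records.map((r) => '  ' + keys.map((k) => String(r[k])).join(','));
const tabular = [header, ...rows].join('\n');

// Rough estimate only: about 4 characters per token, not a real tokenizer.
const estimateTokens = (text) => Math.ceil(text.length / 4);

const jsonTokens = estimateTokens(json);
const tabularTokens = estimateTokens(tabular);
const savedPercent = Math.round((1 - tabularTokens / jsonTokens) * 100);

console.log({ jsonTokens, tabularTokens, savedPercent });
```

The exact percentage depends on how long the keys are relative to the values, so treat the result as an order-of-magnitude check rather than a benchmark.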
TOON Syntax at a Glance
TOON supports four core data shapes: scalars (strings, numbers, booleans), arrays, objects, and tabular data. Scalars look the same as in JSON, objects and arrays borrow their layout from YAML, and the tabular format is where TOON really differentiates itself.
A simple object drops the braces, omits unnecessary quotes, and puts one key: value pair per line:
name: Alice
age: 30
role: admin
A simple array of scalars declares its length in brackets after the key and lists the values inline:
numbers[3]: 1,2,3
Now here's where it gets interesting: tabular data. This is the syntax that makes TOON compelling for LLM use cases. Instead of repeating keys for every object in an array, you declare the schema once in the header and list values row by row:
users[3]{id,name,role}:
  1,Alice,admin
  2,Bob,user
  3,Charlie,editor
That block represents an array of 3 user objects stored under the key users. The array itself is equivalent to the JSON below, but takes noticeably fewer tokens. The header users[3]{id,name,role} tells a parser: "this is a key named users, it has 3 rows, and each row maps to the fields id, name, and role". The rows are pure values, with no key repetition.
[
  { "id": 1, "name": "Alice", "role": "admin" },
  { "id": 2, "name": "Bob", "role": "user" },
  { "id": 3, "name": "Charlie", "role": "editor" }
]
Installing and Using the @toon-format/toon Package
The official npm package handles both encoding JavaScript values to TOON strings and decoding TOON strings back into JavaScript objects and arrays. Install it from the npm registry:
npm install @toon-format/toon
The two functions you need are encode and decode:
import { encode, decode } from '@toon-format/toon';
// Decode a TOON string → JavaScript value
const toonString = `users[3]{id,name,role}:
  1,Alice,admin
  2,Bob,user
  3,Charlie,editor`;
// The document has one top-level key, users, which holds the array
const { users } = decode(toonString);
console.log(users[0].name); // "Alice"
console.log(users[1].role); // "user"
// Encode a JavaScript value → TOON string
const config = {
  model: 'gpt-4o',
  temperature: 0.7,
  maxTokens: 1024,
  stream: true
};
const toon = encode(config, { indent: 2 });
console.log(toon);
// model: gpt-4o
// temperature: 0.7
// maxTokens: 1024
// stream: true
The indent option in encode() sets how many spaces are used for each level of nesting. A flat object like this one has no nested levels, so its output looks the same at any setting; the option starts to matter for nested objects and table rows. For human-readable .toon files, { indent: 2 } is a sensible choice.
A Real-World Use Case: Sending Records to an LLM
Imagine you're building a product analytics feature that needs to summarise user activity. You fetch recent session records from your database and want to send them to an LLM for natural-language summarisation. Here's what that looks like with TOON:
import { encode } from '@toon-format/toon';
const sessions = [
  { userId: 101, action: 'login', duration: 0, page: '/dashboard' },
  { userId: 101, action: 'view_report', duration: 142, page: '/reports/q1' },
  { userId: 101, action: 'export_csv', duration: 8, page: '/reports/q1' },
  { userId: 102, action: 'login', duration: 0, page: '/dashboard' },
  { userId: 102, action: 'edit_profile', duration: 37, page: '/settings' }
];
// Wrap the array in an object so the header carries the key name;
// the encoder cannot see the name of a local variable
const toonPayload = encode({ sessions });
// sessions[5]{userId,action,duration,page}:
//   101,login,0,/dashboard
//   101,view_report,142,/reports/q1
//   101,export_csv,8,/reports/q1
//   102,login,0,/dashboard
//   102,edit_profile,37,/settings
const prompt = `Summarise the following user activity. Data is in TOON format.\n\n${toonPayload}`;
The TOON representation of those 5 session records is significantly shorter than the equivalent JSON, which would repeat "userId", "action", "duration", and "page" five times each. Over hundreds of records the savings are substantial, and they translate directly into lower API costs and often faster responses, since the model has fewer input tokens to process.
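To show why the tabular layout stays short, here is a minimal hand-rolled sketch of the header-plus-rows idea. It is not how @toon-format/toon is implemented; toTabular is a hypothetical helper written for this article, and it assumes every value is a plain scalar that needs no quoting.

```javascript
// Hypothetical helper for illustration only, not the library's implementation.
// Assumes a non-empty array of same-shaped objects whose values need no quoting.
function toTabular(name, records) {
  if (records.length === 0) {
    throw new Error('this sketch needs at least one record');
  }
  const keys = Object.keys(records[0]);
  const rows = records.map((record) => {
    const sameShape =
      Object.keys(record).length === keys.length &&
      keys.every((k) => Object.prototype.hasOwnProperty.call(record, k));
    if (!sameShape) {
      throw new Error('all records must have the same keys');
    }
    // One row per record: values only, in header order, indented two spaces.
    return '  ' + keys.map((k) => String(record[k])).join(',');
  });
  // Keys appear exactly once, in the header.
  const header = `${name}[${records.length}]{${keys.join(',')}}:`;
  return [header, ...rows].join('\n');
}

const sample = [
  { userId: 101, action: 'login', duration: 0, page: '/dashboard' },
  { userId: 102, action: 'edit_profile', duration: 37, page: '/settings' },
];
console.log(toTabular('sessions', sample));
// sessions[2]{userId,action,duration,page}:
//   101,login,0,/dashboard
//   102,edit_profile,37,/settings
```

In real code, prefer the package's encode function, which also handles quoting, nesting, and records that do not share a shape.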
Key Data Types in TOON
TOON maps cleanly onto the primitives you already use in JavaScript and most other languages:
- Strings: Unquoted when they contain no special characters. Quoted with double quotes when they include a comma, a colon, or leading or trailing whitespace, or when they would otherwise read as a number, boolean, or null.
- Numbers: Integers and floats written as-is, for example 42, 3.14, -7.
- Booleans: true and false, same as JSON.
- Null: Written as null for absent or undefined values.
- Arrays: Inline syntax key[n]: val1,val2,val3 for short lists of scalars; tabular header syntax for arrays of objects.
- Objects: One key: value pair per line, with nested objects indented beneath their key.
- Tabular data: The star of the show. A name[n]{col1,col2,...}: header followed by comma-separated rows, ideal for any collection of same-shaped records.
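The tabular header and the scalar types above can be read back with very little code. The sketch below is a simplified illustration, not the library's parser: parseScalar and parseTabular are hypothetical names, and the sketch deliberately ignores quoted strings, nesting, and alternative delimiters.

```javascript
// Hypothetical helpers for illustration; use decode() from the package in real code.
// Turns an unquoted cell into a boolean, null, number, or string.
function parseScalar(text) {
  if (text === 'true') return true;
  if (text === 'false') return false;
  if (text === 'null') return null;
  if (/^-?\d+(\.\d+)?$/.test(text)) return Number(text);
  return text;
}

// Parses one tabular block: a name[n]{col1,col2,...}: header plus n rows.
function parseTabular(text) {
  const [headerLine, ...rowLines] = text.split('\n');
  const match = /^(\w+)\[(\d+)\]\{([^}]*)\}:$/.exec(headerLine.trim());
  if (!match) {
    throw new Error('not a tabular header');
  }
  const [, name, count, fieldList] = match;
  const fields = fieldList.split(',');
  const rows = rowLines
    .filter((line) => line.trim() !== '')
    .map((line) => {
      const values = line.trim().split(',');
      if (values.length !== fields.length) {
        throw new Error('row width does not match header');
      }
      return Object.fromEntries(
        fields.map((field, i) => [field, parseScalar(values[i])])
      );
    });
  // The declared length lets a reader detect truncated data.
  if (rows.length !== Number(count)) {
    throw new Error('row count does not match header');
  }
  return { [name]: rows };
}

const parsed = parseTabular('users[2]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user');
console.log(parsed.users[1].name); // "Bob"
```

The length in the header is what allows the row-count check at the end, which is useful when the text comes back from a model and may have been cut short.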
When to Reach for TOON vs Other Formats
TOON is not trying to replace JSON as a general-purpose interchange format. It fills a specific niche. Here's a quick decision guide:
- Use JSON when you're building REST APIs, storing documents in a database, or passing data between services. JSON is universally supported, well-tooled, and human-readable.
- Use TOON when structured data is part of a prompt or LLM context window and token count matters. The tabular format shines whenever you're working with rows of records — user lists, log entries, product catalogues, analytics events.
- Use CSV when you only need flat tabular data and the consumer expects CSV (spreadsheets, BI tools). CSV has no object nesting, so TOON is more expressive.
- Use YAML for human-edited config files where readability and comments matter more than compactness.
- Use TOON for LLM tool outputs too: if an LLM calls your tool and the tool returns structured results, encoding those results in TOON cuts the tokens they take up when they are passed back to the model.
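The "rows of records" condition in this guide can be checked mechanically before choosing a format. The following is a small sketch under that assumption; isUniformRecords is a hypothetical helper, not part of the npm package, and it treats a value as table-friendly only when it is a non-empty array of objects that share the same keys and hold only scalar values.

```javascript
// Hypothetical helper for illustration; not part of @toon-format/toon.
function isScalar(value) {
  return value === null || ['string', 'number', 'boolean'].includes(typeof value);
}

function isPlainObject(value) {
  return value !== null && typeof value === 'object' && !Array.isArray(value);
}

// True when the value is a non-empty array of same-shaped, flat records.
function isUniformRecords(value) {
  if (!Array.isArray(value) || value.length === 0) return false;
  if (!isPlainObject(value[0])) return false;
  const keys = Object.keys(value[0]);
  return value.every(
    (item) =>
      isPlainObject(item) &&
      Object.keys(item).length === keys.length &&
      keys.every(
        (k) => Object.prototype.hasOwnProperty.call(item, k) && isScalar(item[k])
      )
  );
}

console.log(isUniformRecords([{ id: 1, name: 'Alice' }, { id: 2, name: 'Bob' }])); // true
console.log(isUniformRecords([{ id: 1 }, { name: 'Bob' }])); // false
```

Data that fails this check, such as deeply nested or irregular objects, is where the tabular savings shrink and plain JSON may be the simpler choice.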
Wrapping Up
TOON — Token-Oriented Object Notation — is a compact serialization format built for the LLM era. It keeps the familiar structure of JSON (objects, arrays, scalars) while introducing a tabular syntax that eliminates redundant key repetition across records. The result is a format that can be 40–70% smaller than equivalent JSON, which translates directly into lower token costs when working with the OpenAI API, Anthropic API, or any other token-billed LLM service.
If you want to explore TOON hands-on, we have a full suite of tools right here: use the TOON Formatter to pretty-print and inspect TOON documents, the TOON Validator to catch syntax errors, the JSON to TOON converter to migrate existing payloads, and the TOON to JSON converter if you need to go the other way.
The @toon-format/toon npm package gives you encode and decode to integrate TOON directly into your Node.js or browser-side code in minutes.