TOON Syntax Guide — From Scalars to Tabular Data

Most data formats were designed for one audience: humans or machines. JSON leans human. Binary formats like MessagePack lean machine. TOON is explicitly designed for a third audience that didn't exist when those formats were invented: large language models. Every syntax decision in TOON prioritises token efficiency — fitting maximum structured data into minimum tokens so you can pass more context to an AI without blowing your budget. This guide covers every TOON syntax feature, from simple scalars to the tabular notation that makes it genuinely different.

What Makes TOON Different

TOON stands for Token-Optimised Object Notation. The core idea is simple: data serialisation formats like JSON were designed long before per-token API pricing was a thing. JSON is fine for machine-to-machine communication, but when your payload is a 200-row dataset headed into a gpt-4o prompt, you're paying for every quote mark, every curly brace, and every repeated key name. TOON eliminates that waste. It installs as a single npm package — @toon-format/toon — and files use the .toon extension.

Scalar Values — Strings, Numbers, Booleans

TOON scalars look very similar to their JSON counterparts, with one important difference: string values do not require quotes unless they contain special characters like commas, colons, or brackets. This alone cuts a meaningful chunk of tokens out of text-heavy datasets.

text

// Numbers — exactly like JSON
42
3.14
-7

// Booleans — lowercase, same as JSON
true
false

// Strings — no quotes needed for simple values
hello
Alice
Widget Pro

When a string contains a comma, colon, or bracket, wrap it in double quotes just like JSON. So "New York, NY" needs quotes, but London does not. Simple rule, big savings.

Objects — Key:Value Pairs Without the Noise

TOON objects use curly braces with a key:value syntax. Keys are never quoted — that alone saves two characters per key compared to JSON. Pairs are comma-separated. No trailing commas, no colons after the final pair.

text

{name:Alice,age:31,city:London,active:true}

Compare that to the equivalent JSON:

json

{"name": "Alice", "age": 31, "city": "London", "active": true}

Same data, fewer characters, and the token savings compound dramatically when you're working with arrays of objects. Speaking of which...

Arrays — Ordered Lists of Values

Arrays in TOON use square brackets with comma-separated values — exactly like JSON, just without quotes on bare string values.

text

// Array of strings
[Alice,Bob,Carol]

// Array of numbers
[10,20,30,40,50]

// Mixed types (same rules as JSON)
[Widget Pro,29.99,true,101]

// Array of objects
[{id:1,name:Alice},{id:2,name:Bob}]

Arrays of objects work, but this is where TOON's killer feature comes in. If you have more than two or three rows of structured data, the tabular notation is dramatically more efficient.

Tabular Notation — The Feature That Changes Everything

Tabular notation is TOON's headline feature. It's designed for the scenario you hit constantly in real work: a list of similar objects — products, users, transactions, log entries — where repeating the keys on every row is pure waste. The syntax is:

text

name[count]{col1,col2,col3,...}:
  row1val1,row1val2,row1val3
  row2val1,row2val2,row2val3

Breaking it down: name is the dataset's label, [count] is the number of data rows (required — it tells the parser exactly how many lines to read), {col1,col2,...} is the header row, the colon : ends the header, and each subsequent indented line is one row of values. Here's a real products example:

text

products[5]{id,name,price,inStock,category}:
  101,Widget Pro,29.99,true,Tools
  102,Gadget Plus,49.99,true,Electronics
  103,Thing Basic,9.99,false,Misc
  104,Super Doohickey,74.99,true,Electronics
  105,Budget Widget,14.99,true,Tools

Now picture that same data as an array of JSON objects. You'd write "id", "name", "price", "inStock", and "category" five times each — plus all the surrounding braces, brackets, and quotes. The tabular TOON representation is roughly 60% smaller in token count for this shape of data.

A user dataset follows the same pattern:

text

users[4]{id,username,email,role,active}:
  1,alice_dev,[email protected],admin,true
  2,bob_writer,[email protected],editor,true
  3,carol_ops,[email protected],viewer,false
  4,dan_qa,[email protected],editor,true

The count is mandatory. Unlike column-header CSV where you just count newlines, TOON's [count] is part of the syntax and must match the actual number of data rows. The parser uses it to know when the table ends — especially useful when TOON is embedded inside a larger structure. Use the TOON Validator to catch count mismatches instantly.

Nesting — Objects, Arrays, and Tabular Together

TOON supports nesting in the places you'd expect. An object can contain an array value. A tabular row can contain an object. This lets you represent real-world data that doesn't fit perfectly into flat rows.

Object containing an array:

text

{name:Alice,roles:[admin,editor],active:true}

Tabular data with an embedded object in one column (useful for address data, metadata, etc.):

text

orders[3]{orderId,customer,total,address}:
  1001,alice_dev,89.97,{city:London,country:UK}
  1002,bob_writer,49.99,{city:Berlin,country:DE}
  1003,carol_ops,124.50,{city:Paris,country:FR}

Keep nesting shallow when possible. Two or three levels deep is where TOON still wins on token count. Deeply recursive structures are better served by JSON, which has richer tooling for schema validation — see the MDN JSON reference if you need that.

Working with the npm Package

Install with npm or any compatible package manager. TOON works in Node.js and modern browsers.

bash

npm install @toon-format/toon

The package exports two functions: encode and decode. That's the entire public API — intentionally minimal.

import { encode, decode } from '@toon-format/toon';

// decode: TOON string → JS value
const toonString = `products[3]{id,name,price,inStock}:
  101,Widget Pro,29.99,true
  102,Gadget Plus,49.99,true
  103,Thing Basic,9.99,false`;

const data = decode(toonString);
console.log(data);
// [
//   { id: 101, name: "Widget Pro", price: 29.99, inStock: true },
//   { id: 102, name: "Gadget Plus", price: 49.99, inStock: true },
//   { id: 103, name: "Thing Basic", price: 9.99, inStock: false }
// ]

// encode: JS value → TOON string
const users = [
  { id: 1, username: 'alice_dev', active: true },
  { id: 2, username: 'bob_writer', active: false }
];

const toon = encode(users, { indent: 2 });
console.log(toon);
// users[2]{id,username,active}:
//   1,alice_dev,true
//   2,bob_writer,false

The { indent: 2 } option controls indentation of row values. You can also use encode on plain objects and primitive values, not just arrays — it picks the most compact TOON representation automatically. Paste the output into the TOON Formatter to inspect and pretty-print it.

Common Mistakes

These are the errors that bite developers new to TOON, usually within the first hour:

Wrong row count in tabular notation. Writing products[3] but then including 4 data rows will cause a parse error or silently drop the last row depending on the parser version. Count your rows and keep the number up to date. The TOON Validator catches this immediately.
Quoting object keys. {"name":Alice} is invalid TOON — keys are never quoted. Drop the quotes: {name:Alice}.
Forgetting the colon after the header. products[2]{id,name} followed by a newline will fail. You need the trailing colon: products[2]{id,name}:.
Using spaces around the colon in key:value pairs. {name : Alice} is not valid. No spaces: {name:Alice}.
Embedding commas in unquoted string values. If a product name is "Bolts, Nuts & More", you must quote it in tabular rows: 101,"Bolts, Nuts & More",4.99,true.
Expecting JSON output from decode on scalars. decode("42") returns the JavaScript number 42, not a JSON object. TOON decodes to native JS types.

Converting Between TOON and JSON

TOON and JSON are fully interconvertible — TOON is a superset of the same logical data model that JSON serialises. Use JSON to TOON to shrink an existing dataset before sending it to an LLM, and TOON to JSON to convert back when you need to feed the result into a JSON-only downstream system. The encode and decode functions handle this programmatically if you're automating the workflow in Node.js. The underlying data structure spec is closely related to concepts from the JSON global in JavaScript.

Wrapping Up

TOON syntax is intentionally compact. No quoted keys, no repeated key names in tables, no obligatory quotes on simple string values. Once you've written a tabular block by hand you'll immediately see why — a dataset that takes 40 lines of JSON fits in 8 lines of TOON. The npm package at npmjs.com/@toon-format/toon keeps the API surface tiny: encode and decode, nothing else to learn. Use the TOON Formatter to pretty-print, TOON Validator to catch syntax errors, JSON to TOON to convert your existing data, and TOON to JSON when you need the result back in standard form.

← All TOON articles Browse all categories →