Codemotion Magazine

Orli Dun | December 17, 2025 | 4 min read

The Hidden Inefficiency of JSON vs the TOON Revolution

Machine Learning

In traditional software development, JSON (JavaScript Object Notation) is the undisputed king. It’s readable, structured, and universal. But here’s the uncomfortable truth nobody told you when you started building with AI: LLMs don’t care about the “purity” of your data structure; they care about tokens.

If you’re still sending and receiving large JSON blocks to models like GPT-4, Claude 3.5, or Llama 3, you’re wasting semantic bandwidth, increasing latency, and literally burning your budget.

Recently, TOON (Token-Oriented Object Notation) was launched as a compact, human-readable encoding that minimizes tokens while making structure easy for models to follow. The industry hasn’t stopped talking about it. TOON replaces plain JSON in prompts and payloads for LLMs, reducing tokens, accelerating inference, and cutting costs. The specification and SDK are publicly available and show practical improvements for production.

TOON vs JSON in LLMs: Why Does It Matter?

TOON is a compact, schema-aware notation designed specifically to feed language models with structured data more efficiently than traditional JSON. It’s not just a syntax shortcut: it’s a representation designed to minimize redundant tokens while maintaining human readability and compatibility with schemas and validations.

TOON has arrived to eliminate the fat and leave only the muscle. Designed specifically for LLM inference, TOON prioritizes semantic density over syntactic rigidity.

The Problem: JSON’s “Syntax Tax”

To understand why JSON is inefficient for AI, we must first understand how models “read”: Tokenization.

LLMs use tokenizers such as BPE (Byte Pair Encoding). Every character counts, but frequent character sequences are merged into single tokens, so common words are cheap. JSON, however, is full of characters that are “poison” for token efficiency:

  • Double quotes (""): repeated thousands of times for keys and values
  • Braces and brackets ({}, []): rigid structure
  • Whitespace and line breaks: visual formatting that consumes context

In an average RAG (Retrieval-Augmented Generation) prompt, JSON syntax can occupy between 15% and 25% of your total tokens. That’s up to a quarter of your context window you’re not using for reasoning or data.
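
To get a feel for the syntax tax on your own payloads, here is a minimal sketch that measures how many characters of a JSON string are pure punctuation or whitespace. It is a character-level proxy only: real token counts depend on the tokenizer of the model you use, and the helper name `syntax_share` is made up for this example.

```python
import json

# Punctuation that exists only to express JSON structure.
STRUCTURAL = set('{}[]":,')


def syntax_share(payload: str) -> float:
    """Fraction of characters that are JSON punctuation or whitespace.

    Crude on purpose: spaces and commas inside string values are counted too.
    """
    if not payload:
        return 0.0
    overhead = sum(1 for ch in payload if ch in STRUCTURAL or ch.isspace())
    return overhead / len(payload)


records = [
    {"id": 101, "name": "Ana García", "role": "admin", "active": True},
    {"id": 102, "name": "Beto Pérez", "role": "user", "active": False},
]

pretty = json.dumps(records, indent=2, ensure_ascii=False)
minified = json.dumps(records, separators=(",", ":"), ensure_ascii=False)

print(f"pretty-printed: {syntax_share(pretty):.0%} structural characters")
print(f"minified:       {syntax_share(minified):.0%} structural characters")
```

Minifying already removes the whitespace share; the quotes, braces, and repeated keys remain, and those are what a denser format targets.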

What Makes TOON Different?

TOON combines YAML’s indentation-based structure for nested objects with a CSV-style tabular layout for uniform arrays. Here’s what sets it apart:

  • Redundancy elimination: TOON drops the quotes around keys and uses delimiters that are usually single tokens in most model vocabularies (commas by default, with | as an alternative, plus minimal indentation)
  • Array compression: instead of repeating the keys for every item, it declares the field names once in a header that the LLM naturally understands
  • Token-friendly: designed to align with how BPE tokenizers group words

Practical Example: TOON vs JSON in LLMs

Let’s look at a simple user object example.

The JSON approach:

[
  {
    "id": 101,
    "name": "Ana García",
    "role": "admin",
    "active": true
  },
  {
    "id": 102,
    "name": "Beto Pérez",
    "role": "user",
    "active": false
  }
]

Approximate cost: ~55 tokens (depending on tokenizer).

The TOON approach:

users[2]{id,name,role,active}:
  101,Ana García,admin,true
  102,Beto Pérez,user,false

Approximate cost: ~25 tokens.

Result: A 54% reduction in token usage for the exact same information.
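
The percentage follows directly from the two approximate counts. A small sketch of the arithmetic, using the article's rough figures rather than measured tokenizer output:

```python
def percent_reduction(before: int, after: int) -> float:
    """Relative saving when a payload shrinks from `before` to `after` tokens."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


json_tokens = 55  # approximate count quoted for the JSON version
toon_tokens = 25  # approximate count quoted for the TOON version

print(f"{percent_reduction(json_tokens, toon_tokens):.1f}% fewer tokens")  # 54.5% fewer tokens
```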

Impact of TOON vs JSON in LLMs on Tokens and Costs

In initial tests and technical reports, TOON reaches 74% accuracy (versus JSON’s 70%) while using approximately 40% fewer tokens in mixed-structure benchmarks across 4 models. For systems at scale, those percentages translate directly into substantial operational savings and higher throughput per inference instance.

The benefits are immediate and tangible:

Lower latency: LLMs generate text token by token, so generation time grows roughly in proportion to output length. If you cut the necessary output in half (by eliminating JSON syntax), the full response reaches the user roughly twice as fast. In voice or real-time chat applications, this is the difference between a smooth experience and a frustrating one.

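Budget savings: If you pay per million tokens (input and output), and TOON reduces your payload by an average of 30-40%, you cut the cost of the structured-data portion of every request by roughly a third, simply by changing the serialization format. The effect on your total bill depends on how much of each request is structured data.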

Expanded memory (context window): By freeing tokens from junk syntax, you have more space in the context window for what really matters: chat history, reference documents, and few-shot prompting.
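
How much of this you actually see depends on your workload. The sketch below is a back-of-the-envelope model, not a benchmark: every number in it (request size, share of structured data, price, decoding speed) is a made-up placeholder you should replace with your own measurements.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    tokens_per_request: int        # prompt + completion tokens
    structured_share: float        # fraction of those tokens that is serialized data
    requests_per_month: int
    usd_per_million_tokens: float  # hypothetical blended price


def tokens_saved_per_request(w: Workload, reduction: float) -> float:
    """Only the structured part of a request shrinks; the prose around it does not."""
    return w.tokens_per_request * w.structured_share * reduction


def monthly_cost(w: Workload, reduction: float = 0.0) -> float:
    tokens = (w.tokens_per_request - tokens_saved_per_request(w, reduction)) * w.requests_per_month
    return tokens / 1_000_000 * w.usd_per_million_tokens


def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Decoding time, assuming a constant generation speed."""
    return output_tokens / tokens_per_second


w = Workload(tokens_per_request=4000, structured_share=0.5,
             requests_per_month=1_000_000, usd_per_million_tokens=2.0)
baseline = monthly_cost(w)
denser = monthly_cost(w, reduction=0.40)

print(f"monthly cost: ${baseline:,.0f} -> ${denser:,.0f}")
print(f"overall saving: {1 - denser / baseline:.0%}")
print(f"context freed per request: {tokens_saved_per_request(w, 0.40):.0f} tokens")
print(f"200 vs 100 output tokens at 50 tok/s: "
      f"{generation_seconds(200, 50):.1f}s vs {generation_seconds(100, 50):.1f}s")
```

With half of each request being structured data, a 40% denser encoding trims the total by about 20%, which is why measuring the structured share first is worth the effort.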

How to Get Started with TOON

Implementation is surprisingly straightforward, as modern LLMs are smart enough to understand the format with a simple system instruction.

Suggested System Prompt:

“From now on, don’t respond in JSON. Use TOON format to maximize token density. For uniform arrays, write a header of the form key[N]{field1,field2}: followed by N indented rows of comma-separated values.”

Libraries for Node.js (@toon-format/toon) and Python are already available on GitHub, allowing you to transform LLM output back into usable objects in backend code.
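
If you want to see what that parsing step involves before pulling in a library, here is a minimal sketch of a decoder for a single tabular block in a model reply. It is an illustration under assumptions, not the official parser: it only understands one `key[N]{fields}:` header followed by rows, tolerates an optional delimiter marker after the count, and ignores quoting, nesting, and every other part of the format. The names `decode_table` and `_parse_scalar` are hypothetical.

```python
import re

# key[N]{f1,f2}: with an optional delimiter marker after N (for example [2|] for pipes)
HEADER = re.compile(
    r"^(?P<key>\w+)\[(?P<count>\d+)(?P<delim>[,|\t])?\]\{(?P<fields>[^}]*)\}:$"
)


def _parse_scalar(text: str):
    """Map a cell back to bool, None, int, or float; anything else stays a string."""
    literals = {"true": True, "false": False, "null": None}
    if text in literals:
        return literals[text]
    if re.fullmatch(r"-?\d+", text):
        return int(text)
    if re.fullmatch(r"-?\d+\.\d+", text):
        return float(text)
    return text


def decode_table(block: str) -> dict:
    """Parse one tabular block and check it against the row count in its header."""
    lines = [line.strip() for line in block.strip().splitlines() if line.strip()]
    match = HEADER.match(lines[0]) if lines else None
    if match is None:
        raise ValueError("expected a header such as users[2]{id,name}:")
    delim = match["delim"] or ","
    fields = match["fields"].split(delim)
    rows = []
    for line in lines[1:]:
        cells = line.split(delim)
        if len(cells) != len(fields):
            raise ValueError(f"expected {len(fields)} cells, got {len(cells)}: {line!r}")
        rows.append(dict(zip(fields, map(_parse_scalar, cells))))
    if len(rows) != int(match["count"]):
        raise ValueError(f"header declares {match['count']} rows, found {len(rows)}")
    return {match["key"]: rows}


reply = """users[2]{id,name,role,active}:
  101,Ana García,admin,true
  102,Beto Pérez,user,false"""
print(decode_table(reply))
```

The declared row count doubles as a cheap integrity check: a truncated or padded model reply fails loudly instead of silently producing fewer objects.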

How to Integrate It into a Pipeline (Practical Steps)

  1. Map your JSON schema to a TOON version: prioritize fields that repeat most (IDs, keys, large arrays)
  2. Use the official SDK to serialize/parse and validate against schemas before sending to the LLM; this prevents format errors in production
  3. A/B Benchmark: compare tokens per request, latency, and cost per 1,000 requests; also measure impact on response quality
  4. Gradual rollout: start with example prompts and detailed logs to detect semantic degradations
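
The A/B benchmark in step 3 can start as something this small. The sketch compares serializers on the same sample payloads and reports mean tokens and cost per 1,000 requests. Everything here is a placeholder to swap out: `rough_token_count` is a crude stand-in for your model's real tokenizer, `to_table` is a hypothetical stand-in for the SDK encoder, and the price is invented. Latency and answer quality need live calls and are not covered.

```python
import json
import re
from statistics import mean


def rough_token_count(text: str) -> int:
    """Stand-in for a real tokenizer: each word and each punctuation mark is one token."""
    return len(re.findall(r"\w+|[^\w\s]", text))


def to_table(payload: dict) -> str:
    """Hypothetical stand-in for the SDK encoder: one key holding uniform flat rows."""
    (key, rows), = payload.items()
    fields = list(rows[0])
    columns = ",".join(fields)
    lines = [f"{key}[{len(rows)}]{{{columns}}}:"]
    lines += ["  " + ",".join(str(row[f]) for f in fields) for row in rows]
    return "\n".join(lines)


def compare_formats(samples, encoders, count_tokens=rough_token_count, usd_per_million=1.0):
    """Return mean tokens per request and cost per 1,000 requests for each encoder."""
    report = {}
    for name, encode in encoders.items():
        tokens = mean(count_tokens(encode(sample)) for sample in samples)
        report[name] = {
            "mean_tokens": tokens,
            "usd_per_1000_requests": tokens * 1000 / 1_000_000 * usd_per_million,
        }
    return report


samples = [
    {"users": [{"id": 1, "name": "Alice", "role": "admin"},
               {"id": 2, "name": "Bob", "role": "user"}]},
    {"users": [{"id": 3, "name": "Carol", "role": "user"}]},
]
encoders = {
    "json": lambda payload: json.dumps(payload, separators=(",", ":")),
    "toon_like": to_table,
}

for name, stats in compare_formats(samples, encoders).items():
    print(name, stats)
```

Passing the token counter in as a parameter keeps the harness honest: once you plug in the tokenizer your provider actually bills by, the same report reflects real costs.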

Example Implementation

import { encode, decode } from "@toon-format/toon";

const data = {
  users: [
    { id: 1, name: "Alice", role: "admin" },
    { id: 2, name: "Bob", role: "user" }
  ]
};

// Serialize the object to TOON before placing it in a prompt
const toonData = encode(data);
console.log(toonData);
// Output:
// users[2]{id,name,role}:
//   1,Alice,admin
//   2,Bob,user

// Parse TOON text (for example, a model response) back into a plain object
console.log(decode(toonData));

The same encoding in Python:

# Import path as given in the original example; it may differ depending on
# which Python package you install, so check its README.
from toon import encode

data = {
    "users": [
        {"id": 1, "name": "Alice", "role": "admin"},
        {"id": 2, "name": "Bob", "role": "user"}
    ]
}

print(encode(data))
# Output:
# users[2]{id,name,role}:
#   1,Alice,admin
#   2,Bob,user

Risks and Limitations

Compatibility: Some parsers and tools expect JSON; you need a conversion layer in the backend. However, the official TOON implementations provide this out of the box.

Serialization errors: Syntactic compression can hide bugs; validate with schemas and unit tests.
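
One cheap safeguard is to check decoded rows against the types you expect before they reach business logic. A minimal sketch, assuming rows have already been parsed into dictionaries; the schema and the function name `validate_rows` are hypothetical:

```python
def validate_rows(rows: list[dict], schema: dict[str, type]) -> list[str]:
    """Return human-readable problems; an empty list means every row matches the schema."""
    problems = []
    for index, row in enumerate(rows):
        if set(row) != set(schema):
            problems.append(f"row {index}: fields {sorted(row)} != {sorted(schema)}")
            continue
        for field, expected in schema.items():
            value = row[field]
            # bool is a subclass of int in Python, so an int field must reject True/False
            wrong_bool = expected is int and isinstance(value, bool)
            if wrong_bool or not isinstance(value, expected):
                problems.append(f"row {index}: {field}={value!r} is not {expected.__name__}")
    return problems


USER_SCHEMA = {"id": int, "name": str, "role": str, "active": bool}  # hypothetical schema

good = [{"id": 101, "name": "Ana García", "role": "admin", "active": True}]
# A shifted column: the role landed in "active", which a positional format will not flag by itself
bad = [{"id": 102, "name": "Beto Pérez", "role": "user", "active": "user"}]

print(validate_rows(good, USER_SCHEMA))  # []
print(validate_rows(bad, USER_SCHEMA))   # ["row 0: active='user' is not bool"]
```

Because tabular rows are positional, a single stray delimiter shifts every following value; type checks like this catch most of those shifts early.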

Quality vs compression: In cases with deeply nested or non-uniform structures, JSON may be more efficient. TOON excels with uniform arrays of objects but isn’t always optimal for every data shape.

When NOT to use TOON:

  • Deeply nested or irregular data structures
  • Non-AI use cases where JSON tooling is essential
  • Semi-uniform arrays (40-60% tabular eligibility) where token savings diminish
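
A rough way to apply that last rule is to estimate how much of an array would actually fit a table before choosing a format. The heuristic below is my own approximation of “tabular eligibility”, not the definition used in the TOON benchmarks, and the 0.6 threshold is simply the upper end of the range quoted above.

```python
from collections import Counter

PRIMITIVES = (str, int, float, bool, type(None))


def tabular_eligibility(items: list) -> float:
    """Share of items that are flat objects with the most common set of keys."""
    if not items:
        return 0.0
    shapes = Counter(
        tuple(sorted(item))
        for item in items
        if isinstance(item, dict)
        and item
        and all(isinstance(value, PRIMITIVES) for value in item.values())
    )
    if not shapes:
        return 0.0
    return shapes.most_common(1)[0][1] / len(items)


def suggest_format(items: list, threshold: float = 0.6) -> str:
    """Hypothetical rule of thumb: prefer the tabular format only above the threshold."""
    return "toon" if tabular_eligibility(items) > threshold else "json"


uniform = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
mixed = [
    {"id": 1, "name": "a"},
    {"id": 2, "name": "b"},
    {"id": 3, "tags": ["x"]},  # nested value, not a flat row
    {"id": 4},                 # different shape
]

print(tabular_eligibility(uniform), suggest_format(uniform))  # 1.0 toon
print(tabular_eligibility(mixed), suggest_format(mixed))      # 0.5 json
```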

The Verdict

Continuing to use JSON for LLMs in 2025 is like trying to send a fax using an iPhone. It works, but you’re wasting all the potential of the technology.

TOON isn’t just a format; it’s a statement of principles about efficiency in AI. TOON’s sweet spot is uniform arrays of objects (multiple fields per row, same structure across items), achieving CSV-like compactness while adding explicit structure that helps LLMs parse and validate data reliably.

If you’re serious about your application’s speed and cost optimization, it’s time to let go of the curly braces {} and embrace density.

Are you ready to reclaim your tokens?


Resources

  • Official TOON Specification
  • TypeScript/JavaScript SDK
  • Python Implementation
  • TOON Format Website
  • Benchmarks & Conformance Tests


Tagged as: AI

Orli Dun
From finance to the digital revolution! Software Developer - Cloud & AI - OCI Certified - Tech Content Creator #foramillionfriends

© Copyright Codemotion srl Via Marsala, 29/H, 00185 Roma P.IVA 12392791005 | Privacy policy | Terms and conditions