JSON Schema Explained
JSON Schema is a vocabulary for annotating and validating JSON documents. It describes the expected structure — required fields, allowed types, value constraints, and nested object shapes — in a machine-readable format. A single schema validates thousands of API payloads, pipeline records, or config files consistently, replacing ad-hoc type checks scattered across application code.
For a broader overview of JSON workflows including validation, flattening, and conversion, see the JSON Data Processing Workflows guide.
TL;DR
- JSON Schema is written in JSON itself — a schema is just a JSON object with keywords like type, properties, and required
- The current stable draft is Draft 2020-12; most validators also support Draft 7 and Draft 2019-09
- Validate at system boundaries — API entry points, pipeline ingestion, config load — not deep inside business logic
- Popular validators: Ajv (Node.js, among the fastest), jsonschema (Python), gojsonschema (Go)
- JSON Schema validates structure, not syntax — use the JSON Formatter to fix syntax errors before schema validation
1. What JSON Schema Is
JSON Schema is a specification (maintained by the JSON Schema Organisation) that defines a vocabulary for describing the structure and constraints of JSON documents. A schema is itself a JSON document — it describes what valid JSON should look like, using keywords like type, properties, required, and enum.
Unlike JSON itself (which only specifies the grammar of the syntax), JSON Schema operates one level above: it describes the semantics — what fields exist, what types they should hold, which are mandatory, and what values are permissible. A validator takes a JSON document and a JSON Schema, then reports whether the document conforms.
JSON Schema is not:
- A JSON syntax checker — use a JSON formatter/parser for that
- A data transformation tool — it validates, not transforms
- A query language — see JSONPath or jq for querying
- Tied to any programming language — validators exist for every major language
2. Why JSON Schema Matters
Without schema validation, data quality errors surface late — after writes to a database, after a batch job runs, or after a bug report from production. JSON Schema moves validation to the earliest possible point: the moment data enters your system.
- Validate request bodies at the API gateway or server layer before business logic runs
- Return structured validation errors (which fields failed, why) rather than generic 400 responses
- Document the API contract in a machine-readable format — tools like OpenAPI embed JSON Schema
- Prevent type coercion bugs (a string "123" accepted where a number 123 was expected)
- Validate records at ingestion before they reach a database or warehouse
- Catch schema drift early — when an upstream API adds, removes, or renames fields
- Gate pipeline stages: reject malformed records, route to a dead-letter queue, alert on anomalies
- Enforce consistent null handling, date formats, and enum values across data sources
- Validate application config at startup — fail fast before serving traffic on misconfiguration
- Provide editor autocomplete and inline validation in VS Code with a $schema reference
- Catch typos in config keys (a schema with additionalProperties: false rejects unknown keys)
- Document what each config field does using the description keyword
3. Core JSON Schema Concepts
type
Restricts the JSON value to one or more primitive types. Valid types are: "string", "number", "integer", "boolean", "null", "object", "array". You can allow multiple types with an array: ["string", "null"] means the value is a string or null.
{ "type": "string" }

properties
Defines the expected fields of a JSON object and their schemas. Each key in "properties" maps to a schema that applies to that field. Having a field in "properties" does not make it required — use the "required" keyword for that. Unknown fields are allowed by default unless "additionalProperties" is set to false.
{ "type": "object", "properties": { "name": { "type": "string" }, "age": { "type": "integer" } } }

required
Lists which properties must be present in the object. "required" takes an array of property names. If any listed field is absent from the object, validation fails. Note: "required" only checks presence — not the value. A required field can still be null or an empty string unless further constrained.
{ "type": "object", "required": ["id", "email"] }

enum
Restricts the value to a fixed set of allowed values. "enum" accepts an array of any JSON values. The instance must exactly equal one of the listed values. Common uses: status fields, event types, unit codes. Enum values can be of any type — mixing strings and numbers in one enum is valid.
{ "type": "string", "enum": ["active", "inactive", "pending"] }

Arrays
"items" defines the schema that each element of an array must satisfy. Use "type": "array" with "items" to describe typed arrays. "minItems" and "maxItems" constrain array length. "uniqueItems": true rejects arrays with duplicates. For tuples (fixed-length arrays with different types per position), use "prefixItems" in Draft 2020-12.
{ "type": "array", "items": { "type": "string" }, "minItems": 1 }

Nested structures
Schemas nest recursively — a property can itself be an object with its own properties. There is no depth limit. An "address" property can have its own "type": "object" with "properties" for "street", "city", "country". JSON Schema handles arbitrary nesting the same way JSON does. For recursive structures (tree nodes, linked lists), use "$ref" to reference the schema from within itself.
{ "type": "object", "properties": { "address": { "type": "object", "properties": { "city": { "type": "string" } } } } }

Other useful keywords
- minimum / maximum — numeric range constraints
- minLength / maxLength — string length constraints
- pattern — regular expression constraint on string values
- format — semantic format hints: "email", "date", "uri", "uuid"
- additionalProperties — false rejects unknown keys; a schema applies that schema to extra keys
- $ref — reference another schema by URI; enables reuse and recursion
- allOf / anyOf / oneOf — combine multiple schemas with AND / OR / exactly-one logic
- if / then / else — conditional validation based on a sub-schema match

4. A Simple Schema Example
Consider a user registration payload that an API endpoint receives. The payload must have a name (non-empty string), an email address, an optional age (non-negative integer), and a role drawn from a fixed set.
JSON Schema
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"type": "object",
"required": ["name", "email", "role"],
"additionalProperties": false,
"properties": {
"name": {
"type": "string",
"minLength": 1,
"maxLength": 100
},
"email": {
"type": "string",
"format": "email"
},
"age": {
"type": "integer",
"minimum": 0,
"maximum": 150
},
"role": {
"type": "string",
"enum": ["admin", "editor", "viewer"]
}
}
}

✓ Valid payload
{
"name": "Alice",
"email": "alice@example.com",
"age": 28,
"role": "editor"
}

✗ Invalid — four violations
{
"name": "", ← minLength violation
"email": "not-email", ← format violation
"role": "superuser", ← enum violation
"extra": "field" ← additionalProperties
}

The schema above captures several important aspects:
- required — name, email, and role must be present; age is optional
- additionalProperties: false — the "extra" field is rejected; useful for config files to catch typos
- format: "email" — hints that a validator should check email format; not all validators enforce format by default
- enum — role is locked to three allowed values; a validator rejects anything else
- integer with bounds — age must be a whole number between 0 and 150
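These checks can be traced in a few dozen lines of code. The sketch below is a hand-rolled subset validator — an illustration only, covering just the keywords this schema uses (a real project should use a library such as Python's jsonschema) — run against the valid and invalid payloads from above:

```python
# Illustrative subset of JSON Schema validation — NOT a real implementation.
# Covers only: type, required, properties, additionalProperties, enum,
# minLength/maxLength, minimum/maximum. Use a real validator in production.

TYPE_MAP = {"string": str, "integer": int, "object": dict, "array": list}

def validate(instance, schema):
    """Return a list of violation messages (empty list = valid)."""
    errors = []
    expected = schema.get("type")
    # Types outside TYPE_MAP (e.g. "number") are not checked in this sketch
    if expected and not isinstance(instance, TYPE_MAP.get(expected, object)):
        return [f"expected {expected}, got {type(instance).__name__}"]
    if isinstance(instance, str):
        if "minLength" in schema and len(instance) < schema["minLength"]:
            errors.append("minLength violation")
        if "maxLength" in schema and len(instance) > schema["maxLength"]:
            errors.append("maxLength violation")
    if isinstance(instance, int) and not isinstance(instance, bool):
        if "minimum" in schema and instance < schema["minimum"]:
            errors.append("minimum violation")
        if "maximum" in schema and instance > schema["maximum"]:
            errors.append("maximum violation")
    if "enum" in schema and instance not in schema["enum"]:
        errors.append("enum violation")
    if isinstance(instance, dict):
        for field in schema.get("required", []):
            if field not in instance:
                errors.append(f"missing required field: {field}")
        props = schema.get("properties", {})
        for key, value in instance.items():
            if key in props:
                errors.extend(f"{key}: {e}" for e in validate(value, props[key]))
            elif schema.get("additionalProperties") is False:
                errors.append(f"unknown field: {key}")
    return errors

schema = {
    "type": "object",
    "required": ["name", "email", "role"],
    "additionalProperties": False,
    "properties": {
        "name": {"type": "string", "minLength": 1, "maxLength": 100},
        "email": {"type": "string"},  # format checks omitted in this sketch
        "age": {"type": "integer", "minimum": 0, "maximum": 150},
        "role": {"type": "string", "enum": ["admin", "editor", "viewer"]},
    },
}

valid = {"name": "Alice", "email": "alice@example.com", "age": 28, "role": "editor"}
invalid = {"name": "", "role": "superuser", "extra": "field"}

print(validate(valid, schema))    # []
print(validate(invalid, schema))  # name minLength, missing email, role enum, unknown key
```

Note how the invalid payload fails in four independent ways, and each failure produces its own message — this is the property that makes schema errors actionable, compared with a bare "invalid input".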
Note on format validation
The format keyword is advisory in the JSON Schema spec — validators are not required to enforce it. Ajv (the most popular Node.js validator) ships without format validation in recent versions; you must add the ajv-formats plugin. Always verify your validator's behaviour for format keywords.
5. Typical JSON Validation Workflow
JSON Schema fits naturally into a four-step data processing workflow:
Step 1 — Ingest JSON
Receive data from an API call, webhook, file upload, or pipeline source. At this stage the data is untrusted.
Step 2 — Validate
Run the JSON document through a JSON Schema validator. Reject or quarantine invalid documents immediately — before any processing.
Step 3 — Transform
Apply transformations to the validated, trusted data — flatten nested structures, convert types, enrich with lookups. No defensive null-checks needed.
Step 4 — Export / Load
Write to a database, send to a downstream API, export as CSV, or persist to a file. The downstream system receives clean, validated data.
In the validation step, the JSON Formatter is useful for fixing syntax errors before schema validation — a validator will reject syntactically invalid JSON with a parse error, not a schema error. Use the JSON Formatter to confirm the document is valid JSON, then run schema validation to check structure. After transformation, use the JSON Diff tool to verify that a transformation produced the expected changes without unintended side effects.
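The four steps can be sketched end-to-end. Below is a minimal version assuming newline-delimited JSON input, with a trivial required-keys check standing in for a full schema validator — note that a parse failure and a schema failure are handled as distinct cases:

```python
import json

REQUIRED = {"id", "email"}  # stand-in for a real JSON Schema check

def is_valid(record):
    """Trivial structural check — a real pipeline would call a schema validator here."""
    return (isinstance(record, dict)
            and REQUIRED <= record.keys()
            and isinstance(record["email"], str))

def process(raw_lines):
    """Ingest → validate → transform → load, quarantining bad records."""
    loaded, dead_letter = [], []
    for raw in raw_lines:
        # Step 1 — Ingest: a syntax error is a parse failure, not a schema failure
        try:
            record = json.loads(raw)
        except json.JSONDecodeError as exc:
            dead_letter.append({"raw": raw, "reason": f"parse error: {exc.msg}"})
            continue
        # Step 2 — Validate before any processing
        if not is_valid(record):
            dead_letter.append({"raw": raw, "reason": "schema violation"})
            continue
        # Step 3 — Transform trusted data: no defensive null-checks needed
        record["email"] = record["email"].lower()
        # Step 4 — Load: collected here; in practice, written to a database or API
        loaded.append(record)
    return loaded, dead_letter

lines = [
    '{"id": 1, "email": "A@Example.com"}',  # valid
    '{"id": 2}',                            # missing email: dead letter
    '{not json}',                           # parse error: dead letter
]
ok, bad = process(lines)
print(len(ok), len(bad))  # 1 2
```

Invalid records are never silently dropped — each dead-letter entry keeps the raw input and a reason, so failures can be monitored and replayed.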
Handling validation failures
- APIs — return HTTP 422 (Unprocessable Entity) with a structured error body listing each failed validation and the JSON Pointer to the offending field
- Pipelines — route invalid records to a dead-letter queue or error table; log the schema violations for monitoring; never silently drop records
- Config files — fail immediately at startup with a human-readable error message; do not attempt to run with an invalid config
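For the API case, a structured error body might be built as in the sketch below. The overall shape ("error" / "violations" keys) is a project convention rather than anything mandated by JSON Schema; only the JSON Pointer syntax (RFC 6901) is standardised:

```python
import json

def error_body(violations):
    """Build a 422-style response body from (path_segments, message) pairs.

    Paths are rendered as JSON Pointers (RFC 6901): "" is the whole document,
    "/address/city" a nested field. The body's overall shape is a convention —
    JSON Schema itself does not mandate an error format.
    """
    def pointer(segments):
        # Per RFC 6901, "~" must be escaped to "~0" and "/" to "~1",
        # and "~" must be escaped first
        return "".join("/" + str(s).replace("~", "~0").replace("/", "~1")
                       for s in segments)
    return {
        "error": "validation_failed",
        "violations": [{"pointer": pointer(p), "message": m} for p, m in violations],
    }

body = error_body([
    (["name"], "must not be empty"),
    (["address", "city"], "is required"),
])
print(json.dumps(body, indent=2))
```

Returning one entry per violation (rather than stopping at the first failure) lets clients fix all problems in a single round trip.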
6. Common Mistakes
✗ Overly strict schemas
Setting additionalProperties: false on every schema breaks backwards compatibility when upstream services add new optional fields. Reserve this for config files where unknown keys are likely typos — not for API responses from external services you do not control.
✗ Not marking optional fields explicitly
"required" lists only mandatory fields. Developers often assume that any field listed in "properties" is required — it is not. Readers of your schema will not know which fields are optional unless you document it clearly with "description" or by explicitly listing required fields.
✗ Inconsistent nesting depth
Schemas that represent the same concept with different nesting in different parts of the system are hard to maintain. Define shared sub-schemas (address, money, timestamp) using "$defs" (or "definitions" in Draft 7) and reference them with "$ref" rather than repeating the structure.
✗ Confusing type coercion with validation
Some validators (particularly in loosely typed environments) attempt to coerce values before validating — accepting the string "123" where a number is required. Ajv has a coerceTypes option that does this. Coercion can mask data quality problems. Validate strictly; fix the data source instead.
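The difference is easy to demonstrate with a hand-rolled integer check (illustrative only — this is not how Ajv implements coerceTypes):

```python
def is_integer(value, coerce=False):
    """Strict by default; with coerce=True, numeric strings slip through."""
    if isinstance(value, bool):   # in Python, bool is a subclass of int,
        return False              # but JSON treats booleans as a distinct type
    if isinstance(value, int):
        return True
    if coerce and isinstance(value, str):
        try:
            int(value)
            return True           # "123" passes — the data-quality problem is hidden
        except ValueError:
            return False
    return False

print(is_integer(123))                 # True
print(is_integer("123"))               # False — strict mode surfaces the bad source
print(is_integer("123", coerce=True))  # True — coercion masks it
```

The strict check fails loudly at the boundary, pointing at the upstream producer that emits strings; the coercive check quietly papers over it until the value reaches something that cannot coerce.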
✗ Validating only the happy path
Schemas often describe only what valid data looks like, without testing that the schema actually rejects known-invalid inputs. Write negative test cases: a schema that accepts everything is not useful. Exercise both valid and invalid examples — for instance with ajv-cli or table-driven unit tests in your language of choice.
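A negative test suite can be as simple as a table of payloads with expected verdicts. In the sketch below, conforms is a deliberately tiny stand-in for a real validator call — the point is the table, which exercises the enum, the type, and the missing-field case, not just the happy path:

```python
def conforms(payload):
    """Stand-in for a real validator call: status must be one of three values."""
    return (isinstance(payload, dict)
            and payload.get("status") in {"active", "inactive", "pending"})

# Table-driven tests: invalid inputs matter as much as valid ones
cases = [
    ({"status": "active"}, True),
    ({"status": "deleted"}, False),  # not in the enum
    ({"status": 1}, False),          # wrong type
    ({}, False),                     # missing field
]
for payload, expected in cases:
    assert conforms(payload) == expected, payload
print("all cases pass")
```

If loosening the schema (say, deleting the enum) makes one of the False rows start passing, the table catches the regression immediately.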
✗ Using JSON Schema for business logic
JSON Schema can check that a field is a positive integer, but it cannot check that an order total matches the sum of line items — that is business logic. Keep schemas focused on structural and type constraints; put business rules in application code where they can be unit-tested with context.
7. When JSON Schema Is Useful vs Unnecessary
- ✓Validating API request/response bodies at system boundaries
- ✓Ingesting data from external sources with unpredictable quality
- ✓Documenting a data contract (OpenAPI uses JSON Schema internally)
- ✓Validating application config files at startup
- ✓Ensuring consistency across multiple consumers of the same data feed
- ✓Testing that a transformation produces the expected output shape
- —Your language already provides static types (TypeScript, Go, Rust) — use those instead of duplicating with a runtime schema
- —The data is small, internal, and only ever written by code you control
- —You need business-rule validation (cross-field dependencies, database lookups)
- —The JSON structure is genuinely free-form and cannot be described by a schema
- —A simpler inline check (typeof, Array.isArray) is sufficient for a one-off case
JSON Schema and TypeScript
TypeScript types are erased at runtime. JSON Schema validates at runtime, against untrusted external data that TypeScript types cannot protect you from. The two are complementary: TypeScript for internal code safety, JSON Schema for runtime boundary validation. Tools like TypeBox and Zod derive both a static TypeScript type and a runtime validator from a single definition, eliminating the duplication.
Try the Tools
Validate, inspect, and transform JSON data entirely in your browser. No data is uploaded. Browse all data tools →
Related Reading
JSON Data Processing Workflows: Parsing, Transforming, and Loading JSON
Practical guide to validating, flattening, diffing, and converting JSON data in real workflows.
Data Formatting & Processing Basics: CSV, JSON, XML, Excel Explained
Full guide covering all four formats, typical workflows, and when to use each.
JSON vs XML: What's the Difference and Which Should You Use?
Side-by-side comparison of JSON and XML, including schema and validation approaches.
Data Engineering & Processing Tools
Browse all browser-based CSV, JSON, XML, Excel, and SQL tools.