JSON Schema vs. OpenAPI
JSON Schema and OpenAPI can seem similar but have different use cases.
To begin, how do JSON Schema and OpenAPI differ? In contrast to JSON Schema, an OpenAPI document is a definition for an entire API, not just data models. One might compare JSON Schema with the OpenAPI data model.
Why the need to validate JSON?
There is a plethora of use cases, but let me explain why I use it:
Enter the world of Kubernetes and you’ll find yourself surrounded by object manifests, which are defined as either YAML or JSON. But having to maintain thousands of such manifests can be a nightmare if your code is repeated. Languages like jsonnet, a lazy data templating language by Google, let you DRY up (Don’t Repeat Yourself) the configuration code. jsonnet emits JSON, offers template reuse, and evaluates code only when it is required. So not only does it give you a performance boost, it also takes away a big share of the maintenance.
Good, now we have a lot of code written in jsonnet which generates JSON-based manifests. Going forward, as custom JSON objects grow, we need some way to validate inputs and auto-document. This is when you would use OpenAPI.
JSON Schema
JSON Schema, as defined at json-schema.org, is a powerful tool for validating the structure of JSON data.
At its heart, JSON is built on the following data structures:
- object: for example { "key1": "value1", "key2": "value2" }
- array: for example [ "first", "second", "third" ]
- numbers: for example 42, 3.1415926
- string: for example "This is a string"
- boolean: for example true and false
- null: for example null
These types have analogs in most programming languages, though they may go by different names.
In JSON Schema, an empty object, {}, is a completely valid schema that will accept any valid JSON (any object, number, string, etc). You can also use true in place of the empty object to represent a schema that matches anything, or false for a schema that matches nothing.
The most common thing to do in a JSON schema is to restrict to a specific type. The type keyword is used for that. For example,
{ "type": "string" }
The type keyword may be either a string or an array (in which case the JSON snippet is valid if it matches any of the given types).
{ "type": ["number", "string"] }
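To make the two forms of the type keyword concrete, here is a minimal hand-rolled sketch in Python. It is not a conforming validator, and the names JSON_TYPES and matches_type are made up for this illustration. It only covers the six types listed above.

```python
# Hand-rolled sketch of the "type" keyword; not a conforming validator.
JSON_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "boolean": bool,
    "null": type(None),
}


def _matches_one(value, name):
    if name == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    return isinstance(value, JSON_TYPES[name])


def matches_type(value, type_keyword):
    """Return True if value satisfies a "type" keyword (a string or a list of strings)."""
    names = [type_keyword] if isinstance(type_keyword, str) else type_keyword
    # With a list, matching any one of the given types is enough.
    return any(_matches_one(value, name) for name in names)


print(matches_type("hello", {"type": "string"}["type"]))
print(matches_type(42, {"type": ["number", "string"]}["type"]))
```

The explicit bool check matters in Python because True would otherwise pass as a number.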
Since JSON Schema is itself JSON, it’s not always easy to tell when something is JSON Schema or just an arbitrary chunk of JSON. The $schema keyword is used to declare that something is a JSON Schema. It’s generally a good practice to include it, though it is not required.
{ "$schema": "http://json-schema.org/schema#" }
It is also best practice to include an $id property as a unique identifier for each schema. You can just set it to a URL at a domain you control, for example,
{ "$id": "http://yourdomain.com/schemas/myschema.json" }
Keywords specific to the object data type:
- The properties (key-value pairs) on an object are defined using the properties keyword. The value of properties is an object, where each key is the name of a property and each value is a JSON schema used to validate that property.
- The additionalProperties keyword is used to control the handling of extra stuff, that is, properties whose names are not listed in the properties keyword. By default, any additional properties are allowed. The additionalProperties keyword may be either a boolean or an object. If additionalProperties is a boolean and set to false, no additional properties will be allowed. If additionalProperties is an object, that object is a schema that will be used to validate any additional properties not listed in properties.
- By default, the properties defined by the properties keyword are not required. However, one can provide a list of required properties using the required keyword. The required keyword takes an array of zero or more strings. Each of these strings must be unique.
- The names of properties can be validated against a schema using propertyNames, irrespective of their values. This can be useful if you don’t want to enforce specific properties, but you want to make sure that the names of those properties follow a specific convention.
- The number of properties on an object can be restricted using the minProperties and maxProperties keywords. Each of these must be a non-negative integer.
- The dependencies keyword allows the schema of the object to change based on the presence of certain special properties. There are two forms of dependencies in JSON Schema:
  - Property dependencies declare that certain other properties must be present if a given property is present. The value of the dependencies keyword is an object. Each entry in the object maps from the name of a property, p, to an array of strings listing properties that are required whenever p is present.
  - Schema dependencies declare that the schema changes when a given property is present. Schema dependencies work like property dependencies, but instead of just specifying other required properties, they can extend the schema to have other constraints.
- As we saw before, additionalProperties can restrict the object so that it either has no additional properties that weren’t explicitly listed, or it can specify a schema for any additional properties on the object. Sometimes this isn’t enough, and you may want to restrict the names of extra properties, or you may want to say that, given a particular kind of name, the value should match a particular schema. That’s where patternProperties comes in: it maps from regular expressions to schemas. If an additional property matches a given regular expression, it must also validate against the corresponding schema.
- The patternProperties keyword can be used in conjunction with additionalProperties. In that case, additionalProperties will refer to any properties that are not explicitly listed in properties and don’t match any of the patternProperties.
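The way required, additionalProperties, patternProperties and property dependencies interact is easier to see in code. The following is a rough Python sketch under some loud assumptions: it is hand-rolled rather than a real validator, the function names and the example schema are invented for illustration, property values are not validated against their sub-schemas, and Python's re module stands in for the regular expression dialect that JSON Schema actually expects.

```python
import re


def is_additional(name, schema):
    """True if name is neither listed in properties nor matched by patternProperties."""
    if name in schema.get("properties", {}):
        return False
    return not any(re.search(p, name) for p in schema.get("patternProperties", {}))


def check_object(instance, schema):
    """Return error messages for a handful of draft-07 style object keywords."""
    errors = []
    # required: every listed name must be present.
    for name in schema.get("required", []):
        if name not in instance:
            errors.append(f"missing required property: {name}")
    # additionalProperties: false rejects names that nothing else accounts for.
    if schema.get("additionalProperties", True) is False:
        for name in instance:
            if is_additional(name, schema):
                errors.append(f"unexpected property: {name}")
    # Property dependencies: when p is present, every listed name must be too.
    for p, needed in schema.get("dependencies", {}).items():
        if p in instance and isinstance(needed, list):
            for name in needed:
                if name not in instance:
                    errors.append(f"{p} requires {name}")
    return errors


# Hypothetical schema, used only for this example.
customer_schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "credit_card": {"type": "number"},
        "billing_address": {"type": "string"},
    },
    "patternProperties": {"^x-": {}},
    "required": ["name"],
    "additionalProperties": False,
    "dependencies": {"credit_card": ["billing_address"]},
}

print(check_object({"name": "Ada", "x-team": "infra"}, customer_schema))
print(check_object({"name": "Ada", "nickname": "A"}, customer_schema))
```

Note that the x-team key is accepted even though additionalProperties is false, because it matches a patternProperties expression.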
Keywords specific to the string, number, boolean and null data types:
- The length of a string can be constrained using the minLength and maxLength keywords.
- The pattern keyword is used to restrict a string to a particular regular expression.
- The format keyword allows for basic semantic validation on certain kinds of string values that are commonly used. See the built-in formats in the JSON Schema documentation.
- Ranges of numbers are specified using a combination of the minimum and maximum keywords (or exclusiveMinimum and exclusiveMaximum for expressing an exclusive range).
- The boolean type matches only two special values: true and false. Note that values that evaluate to true or false, such as 1 and 0, are not accepted by the schema.
- The null type is generally used to represent a missing value. When a schema specifies a type of null, it has only one acceptable value: null.
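Here is a small Python sketch of the string and number constraints above. Treat it as an approximation: check_string and check_number are hypothetical helpers, Python's re module is only close to the ECMA 262 dialect that JSON Schema recommends, and exclusiveMinimum and exclusiveMaximum are assumed to be numbers, as in draft-06 and later.

```python
import re


def check_string(value, schema):
    """Check minLength, maxLength and pattern for a string value."""
    if not isinstance(value, str):
        return False
    if len(value) < schema.get("minLength", 0):
        return False
    if "maxLength" in schema and len(value) > schema["maxLength"]:
        return False
    # pattern is not implicitly anchored, so search rather than fullmatch.
    if "pattern" in schema and re.search(schema["pattern"], value) is None:
        return False
    return True


def check_number(value, schema):
    """Check inclusive and exclusive bounds for a numeric value."""
    # Python's True and False are ints, but JSON booleans are not numbers.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    if "minimum" in schema and value < schema["minimum"]:
        return False
    if "maximum" in schema and value > schema["maximum"]:
        return False
    if "exclusiveMinimum" in schema and value <= schema["exclusiveMinimum"]:
        return False
    if "exclusiveMaximum" in schema and value >= schema["exclusiveMaximum"]:
        return False
    return True


phone = {"type": "string", "pattern": "^[0-9]{3}-[0-9]{4}$", "minLength": 8, "maxLength": 8}
percent = {"type": "number", "minimum": 0, "exclusiveMaximum": 100}

print(check_string("555-1212", phone))
print(check_number(100, percent))
```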
Keywords specific to the array data type:
The items keyword:
- Set the items keyword to a single schema that will be used to validate all of the items in the array.
- Set the items keyword to an array, where each item is a schema that corresponds to each index of the document’s array. That is, an array where the first element validates the first element of the input array, the second element validates the second element of the input array, etc.
- While the items schema must be valid for every item in the array, the contains schema only needs to validate against one or more items in the array.
- The additionalItems keyword controls whether it’s valid to have additional items in the array beyond what is defined in items. Setting it to false has the effect of disallowing extra items in the array. It can also be a schema to validate against every additional item in the array.
- The length of the array can be specified using the minItems and maxItems keywords. The value of each keyword must be a non-negative integer.
- A schema can ensure that each of the items in an array is unique. Simply set the uniqueItems keyword to true.
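The difference between the two forms of items is the part that trips people up, so here is a hedged Python sketch of the array keywords as described above (draft-07 style, where items may be an array). The helpers item_ok and check_array are invented for this example, and item_ok understands only the type keyword with three types.

```python
TYPES = {"string": str, "number": (int, float), "boolean": bool}


def item_ok(value, schema):
    """Check one item against a sub-schema; only "type" is understood here."""
    if "type" not in schema:
        return True
    if schema["type"] == "number" and isinstance(value, bool):
        return False
    return isinstance(value, TYPES[schema["type"]])


def check_array(instance, schema):
    items = schema.get("items", {})
    if isinstance(items, list):
        # Array form: the schema at index i validates the element at index i.
        if not all(item_ok(v, s) for v, s in zip(instance, items)):
            return False
        extra = instance[len(items):]
        additional = schema.get("additionalItems", {})
        if additional is False and extra:
            return False
        if isinstance(additional, dict) and not all(item_ok(v, additional) for v in extra):
            return False
    elif not all(item_ok(v, items) for v in instance):
        # Single-schema form: every element must match the same schema.
        return False
    if "contains" in schema and not any(item_ok(v, schema["contains"]) for v in instance):
        return False
    if len(instance) < schema.get("minItems", 0):
        return False
    if "maxItems" in schema and len(instance) > schema["maxItems"]:
        return False
    # Caveat: Python equality treats 1 and True as equal, JSON Schema does not.
    if schema.get("uniqueItems") and any(v in instance[:i] for i, v in enumerate(instance)):
        return False
    return True


address = {"items": [{"type": "number"}, {"type": "string"}], "additionalItems": False}
print(check_array([1600, "Pennsylvania"], address))
print(check_array([1600, "Pennsylvania", "Avenue"], address))
```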
Generic Keywords:
- The title and description keywords must be strings. A title will preferably be short, whereas a description will provide a more lengthy explanation about the purpose of the data described by the schema.
- The default keyword specifies a default value for an item. JSON processing tools may use this information to provide a default value for a missing key/value pair, though many JSON schema validators simply ignore the default keyword. It should validate against the schema in which it resides, but that isn’t required.
- The enum keyword is used to restrict a value to a fixed set of values. It must be an array with at least one element, where each element is unique. You can use enum even without a type, to accept values of different types.
- The const keyword is used to restrict a value to a single value.
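As a quick illustration of enum, const and default, here is a small Python sketch. The helper names are hypothetical, the equality check is shallow (it does not look inside nested arrays or objects), and apply_defaults is something a processing tool might choose to do, not something validators are required to do.

```python
def json_equal(a, b):
    # Python treats True == 1, but JSON true and 1 are different values.
    return isinstance(a, bool) == isinstance(b, bool) and a == b


def check_enum_const(value, schema):
    """Check a value against the enum and const keywords, if present."""
    if "enum" in schema and not any(json_equal(value, e) for e in schema["enum"]):
        return False
    if "const" in schema and not json_equal(value, schema["const"]):
        return False
    return True


def apply_defaults(instance, schema):
    """Fill in missing top-level keys from each property's "default" value."""
    filled = dict(instance)
    for name, sub in schema.get("properties", {}).items():
        if name not in filled and "default" in sub:
            filled[name] = sub["default"]
    return filled


# enum without a type accepts values of different types.
color = {"enum": ["red", "amber", "green", None, 42]}
print(check_enum_const("red", color))
print(apply_defaults({"name": "svc"}, {"properties": {"replicas": {"default": 1}}}))
```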
Reusing Schemas using $ref:
- We can refer to a schema snippet from elsewhere using the $ref keyword. The easiest way to describe $ref is that it gets logically replaced with the thing that it points to.
- You will always use $ref as the only key in an object: in draft-07 and earlier, any other keys you put there will be ignored by the validator.
- The value of $ref is a URI-reference, and the part after the # sign (the “fragment” or “named anchor”) is in a format called JSON Pointer.
- If you’re using a definition from the same document, the $ref value begins with the pound symbol, #. Following that, the slash-separated items traverse the keys in the objects in the document.
- The $ref elements may be used to create recursive schemas that refer to themselves.
- The $id property is a URI-reference that serves two purposes:
  - It declares a unique identifier for the schema.
  - It declares a base URI against which $ref URI-references are resolved.
- It is best practice that every top-level schema should set $id to an absolute-URI (not a relative reference), with a domain that you control.
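To show what "traversing the keys" means for a same-document $ref, here is a minimal Python sketch of JSON Pointer resolution. The function name resolve_local_ref and the example schema are assumptions made for illustration. References to other documents and $id-based anchors are deliberately out of scope.

```python
from urllib.parse import unquote


def resolve_local_ref(ref, root):
    """Resolve a same-document $ref such as "#/definitions/address" against root."""
    if not ref.startswith("#"):
        raise ValueError("only same-document references are handled in this sketch")
    # The fragment may be percent-encoded because it is part of a URI-reference.
    pointer = unquote(ref[1:])
    if pointer == "":
        return root  # "#" on its own points at the whole document
    if not pointer.startswith("/"):
        raise ValueError("fragment is not a JSON Pointer")
    node = root
    for token in pointer.split("/")[1:]:
        # JSON Pointer escapes: ~1 stands for "/", ~0 stands for "~".
        token = token.replace("~1", "/").replace("~0", "~")
        node = node[int(token)] if isinstance(node, list) else node[token]
    return node


# Hypothetical schema that reuses one definition twice.
order_schema = {
    "definitions": {
        "address": {"type": "object", "properties": {"city": {"type": "string"}}}
    },
    "properties": {
        "billing": {"$ref": "#/definitions/address"},
        "shipping": {"$ref": "#/definitions/address"},
    },
}

print(resolve_local_ref(order_schema["properties"]["billing"]["$ref"], order_schema))
```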
Combining Schemas:
- To validate against allOf, the given data must be valid against all of the given sub-schemas (provided as elements of an array).
- To validate against anyOf, the given data must be valid against any (one or more) of the given sub-schemas.
- To validate against oneOf, the given data must be valid against exactly one of the given sub-schemas.
- The not keyword declares that an instance validates if it doesn’t validate against the given sub-schema.
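The four combinators map neatly onto all, any, a count and a negation. Below is a Python sketch under the assumption of a toy valid function that understands only type (string or number), maxLength and minimum besides the combinators themselves. It is meant to show the logic, not to replace a real validator.

```python
def is_number(value):
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def valid(value, schema):
    """Toy validator: type, maxLength, minimum, plus allOf/anyOf/oneOf/not."""
    if schema.get("type") == "string" and not isinstance(value, str):
        return False
    if schema.get("type") == "number" and not is_number(value):
        return False
    if "maxLength" in schema and isinstance(value, str) and len(value) > schema["maxLength"]:
        return False
    if "minimum" in schema and is_number(value) and value < schema["minimum"]:
        return False
    # allOf: every sub-schema must accept the value.
    if "allOf" in schema and not all(valid(value, s) for s in schema["allOf"]):
        return False
    # anyOf: at least one sub-schema must accept the value.
    if "anyOf" in schema and not any(valid(value, s) for s in schema["anyOf"]):
        return False
    # oneOf: exactly one sub-schema must accept the value.
    if "oneOf" in schema and sum(valid(value, s) for s in schema["oneOf"]) != 1:
        return False
    # not: the sub-schema must reject the value.
    if "not" in schema and valid(value, schema["not"]):
        return False
    return True


short_string = {"allOf": [{"type": "string"}, {"maxLength": 5}]}
one_range = {"oneOf": [{"minimum": 0}, {"minimum": 10}]}

print(valid("short", short_string))
print(valid(15, one_range))
```

The second call is rejected because 15 satisfies both sub-schemas, and oneOf insists on exactly one match.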
Applying sub-schemas conditionally:
- The if, then, and else keywords allow the application of a sub-schema based on the outcome of another schema. If if is valid, then must also be valid (and else is ignored). If if is invalid, else must also be valid (and then is ignored).
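A conditional schema is easiest to follow with a concrete case. The Python sketch below assumes a toy valid function that understands only const, pattern and properties in addition to if, then and else. The country and postal_code schema is an illustrative example, and Python's re module again stands in for the regular expression dialect JSON Schema expects.

```python
import re


def valid(value, schema):
    """Toy validator: const, pattern, properties, plus if/then/else."""
    if "const" in schema and value != schema["const"]:
        return False
    if "pattern" in schema and isinstance(value, str) and not re.search(schema["pattern"], value):
        return False
    if isinstance(value, dict):
        # properties only constrains keys that are actually present.
        for name, sub in schema.get("properties", {}).items():
            if name in value and not valid(value[name], sub):
                return False
    if "if" in schema:
        # Pick the branch from the outcome of "if"; the other branch is ignored.
        branch = "then" if valid(value, schema["if"]) else "else"
        if branch in schema and not valid(value, schema[branch]):
            return False
    return True


postal_schema = {
    "if": {"properties": {"country": {"const": "US"}}},
    "then": {"properties": {"postal_code": {"pattern": "^[0-9]{5}(-[0-9]{4})?$"}}},
    "else": {"properties": {"postal_code": {"pattern": "^[A-Z][0-9][A-Z] [0-9][A-Z][0-9]$"}}},
}

print(valid({"country": "US", "postal_code": "20500"}, postal_schema))
print(valid({"country": "Canada", "postal_code": "10000"}, postal_schema))
```

One subtlety worth knowing: an object with no country key at all still satisfies the if schema here, because properties says nothing about absent keys, so the then branch applies.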
OpenAPI
The OpenAPI Specification (formerly the Swagger Specification) is an API description format for REST APIs. An OpenAPI file allows you to describe your entire API, including:
- Available endpoints (/users) and operations on each endpoint (GET /users, POST /users)
- Operation parameters, and the input and output for each operation
- Authentication methods
- Contact information, license, terms of use and other information
The complete OpenAPI Specification can be found on GitHub: OpenAPI-Specification.
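To give a feel for the shape of such a file, here is a deliberately tiny OpenAPI 3.0 document written as a Python dictionary, with a hypothetical helper that lists its operations. The "Users API" title, the version numbers and the response descriptions are placeholders invented for this sketch. A real document would normally live in a YAML or JSON file and carry much more detail.

```python
# Operation keys allowed on a path item in OpenAPI 3.0.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}

# Minimal, made-up OpenAPI document: version, info and paths.
spec = {
    "openapi": "3.0.3",
    "info": {"title": "Users API", "version": "1.0.0"},
    "paths": {
        "/users": {
            "get": {"responses": {"200": {"description": "A list of users"}}},
            "post": {"responses": {"201": {"description": "User created"}}},
        }
    },
}


def list_operations(document):
    """Return "METHOD /path" strings for every operation in the document."""
    operations = []
    for path, item in document.get("paths", {}).items():
        for key in item:
            # Path items can also hold non-operation keys such as "parameters".
            if key in HTTP_METHODS:
                operations.append(f"{key.upper()} {path}")
    return operations


print(list_operations(spec))
```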
Conclusion
This article has explored which tool to use and when. If you only need to define schemas for data models, JSON Schema is a good option. But if you want to describe your entire API, it’s better to go with OpenAPI. I hope you have found this article helpful. Thank you for reading!