Miles to go ...

Sort order for JSON

Sort order for JSON data types.

This post discusses in detail the sort order for JSON data types. Specifically,

A reference implementation of this proposal is provided by the Rust library jsondata.

Sort order for types

JSON defines four primitive types (null, boolean, number and string) and two composite types (array and object).

Within the Boolean type, false sorts before true.
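As a small illustration, here is a sketch in Python that classifies a decoded value into one of the six JSON types and compares two booleans. The names `json_type` and `compare_bool` are made up for this post and are not part of jsondata. The sketch deliberately says nothing about how values of different types sort relative to each other.

```python
def json_type(value):
    """Classify a decoded JSON value into one of JSON's six types."""
    if value is None:
        return "null"
    # bool must be tested before number: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, (int, float)):
        return "number"
    if isinstance(value, str):
        return "string"
    if isinstance(value, list):
        return "array"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"not a JSON value: {value!r}")


def compare_bool(a, b):
    """Return -1, 0 or 1 for two booleans; false sorts before true."""
    return (a > b) - (a < b)
```

With this, `compare_bool(False, True)` gives -1, which matches the rule that false sorts before true.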

Sort order for numerical values

The JSON specification does not define a lower or upper limit for numbers, although it encourages programmers to stay within the range [-(2^53)+1, (2^53)-1] for the sake of interoperability.

For the purpose of numerical sorting, integers and floating-point numbers can be treated uniformly: when comparing an integer value with a floating-point value, the latter is converted to an integer and then compared. This means for 64-bit floating point,
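A minimal sketch of this rule in Python, for finite numbers only. The function name `compare_numbers` is hypothetical, and the conversion used here (truncation toward zero via `int()`) is my assumption; the rule above only says the floating-point value is converted to an integer.

```python
import math


def compare_numbers(a, b):
    """Compare two finite JSON numbers (int or float); return -1, 0 or 1."""
    for n in (a, b):
        if isinstance(n, float) and not math.isfinite(n):
            raise ValueError("this sketch handles finite numbers only")

    if isinstance(a, float) and isinstance(b, float):
        # Two floats need no conversion; compare them directly.
        return (a > b) - (a < b)

    # Mixed (or int/int) comparison: convert any float operand to an integer.
    # Assumption: int() truncates toward zero. Python integers are
    # arbitrary-precision, so the comparison itself is exact.
    x = int(a) if isinstance(a, float) else a
    y = int(b) if isinstance(b, float) else b
    return (x > y) - (x < y)
```

Under the truncation assumption, `compare_numbers(1, 1.5)` returns 0, because 1.5 becomes 1 before the comparison. Whether such ties need a further tie-breaker is a design choice this sketch leaves open.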

There is also the total-ordering issue for floating-point numbers: the JSON specification does not define the -Infinity, +Infinity and NaN values that are part of the floating-point specification. These tokens are defined in the JSON5 specification. Their sort order shall be defined, for the sake of total ordering, as
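To see why NaN in particular breaks total ordering, here is a short Python check. The helper name `comparable` is made up for illustration. It does not define any sort order; it only tests whether the built-in floating-point comparison puts two values in exactly one of the relations less-than, equal or greater-than.

```python
def comparable(a, b):
    """True if exactly one of a < b, a == b, a > b holds."""
    return [a < b, a == b, a > b].count(True) == 1


nan = float("nan")
inf = float("inf")

# Infinities behave like ordinary, very large values.
ordered_infinities = comparable(-inf, inf) and comparable(inf, 1.0)

# NaN is neither less than, equal to, nor greater than anything, itself included.
nan_is_ordered = comparable(nan, 1.0) or comparable(nan, nan)
```

Here `ordered_infinities` ends up True and `nan_is_ordered` ends up False, which is why a sorted set of JSON5 numbers has to assign NaN a position explicitly.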

Sort order for string value

Sorting string values is complex because of Unicode and UTF-8 encoding. In some languages, the natural order of the characters does not match the order of their Unicode code points. UTF-8 encoding introduces further complexities. For simplicity's sake, we can do a binary comparison of string values after the escape encoding is removed. There are also some notable collation standards, Unicode collation and ICU collation, that are already available for string collation.
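The simple approach can be sketched in Python as follows. The function names are hypothetical, and comparing the UTF-8 bytes is my reading of "binary comparison"; a real implementation might compare a different byte representation.

```python
import json


def decode_json_string(token):
    """Remove the escape-encoding from a JSON string token (quotes included)."""
    value = json.loads(token)
    if not isinstance(value, str):
        raise TypeError("expected a JSON string token")
    return value


def compare_strings(a, b):
    """Binary comparison of two decoded strings; return -1, 0 or 1."""
    # Assumption: compare the UTF-8 encoded bytes. "surrogatepass" keeps the
    # sketch from failing on lone surrogates that JSON escapes can produce.
    x = a.encode("utf-8", errors="surrogatepass")
    y = b.encode("utf-8", errors="surrogatepass")
    return (x > y) - (x < y)
```

Note that binary order is not dictionary order: `compare_strings("Z", "a")` returns -1 because byte 0x5A is smaller than byte 0x61. That is the trade-off against a proper collation library.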

Sort order for array values

When sorting array values, array items shall be compared recursively, applying the comparison logic to each array item and its counterpart. For example:

When the comparison runs out of items, the array with fewer items shall sort before the array with more items.
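The two rules for arrays can be sketched in Python like this. The function names are hypothetical, and to keep the sketch short the items are restricted to numbers and nested arrays; comparing items of different types is out of scope here.

```python
def compare_values(a, b):
    """Compare two values made of numbers and nested arrays; return -1, 0 or 1."""
    if isinstance(a, list) and isinstance(b, list):
        return compare_arrays(a, b)
    if isinstance(a, list) or isinstance(b, list):
        raise TypeError("cross-type ordering is out of scope for this sketch")
    return (a > b) - (a < b)


def compare_arrays(a, b):
    """Compare item by item; on a tie, the array with fewer items sorts first."""
    for x, y in zip(a, b):
        order = compare_values(x, y)  # recurse into nested arrays
        if order != 0:
            return order
    # The comparison ran out of items: one array is a prefix of the other.
    return (len(a) > len(b)) - (len(a) < len(b))
```

So `compare_arrays([1, 2], [1, 3])` is decided by the second item, while `compare_arrays([1, 2], [1, 2, 0])` is decided by length alone.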

Sort order for object values

Range operation

The purpose of maintaining a sorted set of JSON values is to let users perform range operations on that set. When such operations need to be performed over the wire, especially when using HTTP/JSON, we may have to deal with unbounded values,

Using HTTP/JSON means we may have to serialize these values over the wire, and once again the JSON specification does not define anything related to this. Hence, as a proposal to extend JSON, we can include 2 reserved tokens -

These can be added in a future specification.

Note that when implementing a range API as part of a native library, we don't have to make unbounded values part of JSON. Minbound and Maxbound are needed only for serializing such queries over the wire in JSON format.
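For the native-library case, a range API can get by with in-process sentinels. The following Python sketch assumes a list of numbers that is already sorted and treats both bounds as inclusive. The names `MINBOUND`, `MAXBOUND`, `in_range` and `range_query` are made up for illustration, and no wire format is implied.

```python
# Sentinels that stand for "no lower bound" and "no upper bound".
# They never leave the process, so they need no JSON representation.
MINBOUND = object()
MAXBOUND = object()


def in_range(value, low, high):
    """True if low <= value <= high, where either bound may be unbounded."""
    above_low = low is MINBOUND or low <= value
    below_high = high is MAXBOUND or value <= high
    return above_low and below_high


def range_query(sorted_values, low=MINBOUND, high=MAXBOUND):
    """Return the items of sorted_values that fall within [low, high]."""
    return [v for v in sorted_values if in_range(v, low, high)]
```

For example, `range_query([1, 3, 5, 7], 3, MAXBOUND)` returns `[3, 5, 7]`. A linear scan keeps the sketch short; an index over the sorted set would use a binary search instead.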