A JSON parser as a NIF. This is a complete rewrite of the work I did in EEP0018 that was based on Yajl. This new version is a hand-crafted state machine that does its best to be as quick and efficient as possible while not placing any constraints on the parsed JSON.
Jiffy has a simple API. The only thing that might catch you off guard is that the return type of jiffy:encode/1 is an iolist even though it returns a binary most of the time.
A quick note on unicode. Jiffy only understands UTF-8 in binaries. End of story.
Errors are raised as error exceptions.
Eshell V5.8.2 (abort with ^G)
1> jiffy:decode(<<"{\"foo\": \"bar\"}">>).
{[{<<"foo">>,<<"bar">>}]}
2> Doc = {[{foo, [<<"bing">>, 2.3, true]}]}.
{[{foo,[<<"bing">>,2.3,true]}]}
3> jiffy:encode(Doc).
<<"{\"foo\":[\"bing\",2.3,true]}">>
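Two of the points above, the iolist return type and errors being raised as error exceptions, can be handled with small wrappers. The following is a minimal sketch, assuming Jiffy is compiled and on the code path; the module name jiffy_usage_sketch and both function names are hypothetical and not part of Jiffy.

```erlang
-module(jiffy_usage_sketch).
-export([encode_to_binary/1, safe_decode/1]).

%% jiffy:encode/1 returns iodata, which may be a binary or an iolist.
%% Flatten it only when a single binary is really required; ports,
%% sockets and files accept iodata directly.
encode_to_binary(EJSON) ->
    iolist_to_binary(jiffy:encode(EJSON)).

%% Decode failures are raised with class `error`, so that is the class
%% caught here. The shape of Reason is deliberately not assumed.
safe_decode(Json) ->
    try jiffy:decode(Json) of
        Term -> {ok, Term}
    catch
        error:Reason -> {error, Reason}
    end.
```

Calling iolist_to_binary/1 on a value that is already a binary returns it unchanged, so the wrapper is safe in both cases.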
jiffy:decode/1,2
jiffy:decode(IoData)
jiffy:decode(IoData, Options)
The options for decode are:
- return_maps - Tell Jiffy to return objects using the maps data type on VMs that support it. This raises an error on VMs that don't support maps.
- {null_term, Term} - Returns the specified Term instead of null when decoding JSON. This is for people that wish to use undefined instead of null.
- use_nil - Returns the atom nil instead of null when decoding JSON. This is shorthand for {null_term, nil}.
- return_trailer - If any non-whitespace is found after the first JSON term is decoded, the return value of decode/2 becomes {has_trailer, FirstTerm, RestData::iodata()}. This is useful to decode multiple terms in a single binary.
- dedupe_keys - If a key is repeated in a JSON object, this flag will ensure that the parsed object only contains a single entry containing the last value seen. This mirrors the parsing behavior of virtually every other JSON parser.
- copy_strings - Normally, when strings are decoded, they are created as sub-binaries of the input data. With some workloads, this leads to an undesirable bloating of memory: strings in the decode result keep a reference to the full JSON document alive. Setting this option will instead allocate new binaries for each string, so the original JSON document can be garbage collected even though the decode result is still in use.
- {max_levels, N} where N >= 0 - This controls how deep decoding goes: after N levels are decoded, the rest is returned as a Resource::reference(). Resources have some limitations; see the partial JSONs section.
- {bytes_per_red, N} where N >= 0 - This controls the number of bytes that Jiffy will process as an equivalent to a reduction. Every 20 reductions consume 1% of the time slice allocated to the current process. Jiffy returns from the NIF when the Erlang VM indicates that it needs to.
- {bytes_per_iter, N} where N >= 0 - Backwards compatible option that is converted into the bytes_per_red value.

jiffy:encode/1,2
jiffy:encode(EJSON)
jiffy:encode(EJSON, Options)
where EJSON is a valid representation of JSON in Erlang according to the table below.
The options for encode are:
- uescape - Escapes UTF-8 sequences to produce a 7-bit clean output.
- pretty - Produce JSON using two-space indentation.
- force_utf8 - Force strings to encode as UTF-8 by fixing broken surrogate pairs and/or using the replacement character to remove broken UTF-8 sequences in data.
- use_nil - Encodes the atom nil as null.
- escape_forward_slashes - Escapes the / character, which can be useful when encoding URLs in some cases.
- partial - Instead of returning an iodata(), returns a Resource::reference() which holds the verified raw JSON. This resource can be used as a block to build more complex JSON documents, without the need to encode these blocks again. Resources have some limitations; see the partial JSONs section.
- {bytes_per_red, N} - Refer to the decode options.
- {bytes_per_iter, N} - Refer to the decode options.

jiffy:validate/1,2
jiffy:validate(IoData)
jiffy:validate(IoData, Options)
Performs a fast decode to check that IoData is valid JSON. It accepts the same Options as jiffy:decode/2 (although some may make no sense) and returns a boolean instead of EJSON.
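To make a few of the decode options listed earlier concrete, here is a minimal sketch, assuming Jiffy is compiled and on the code path. The module name jiffy_options_sketch and the functions decode_all/1 and null_handling/0 are hypothetical helpers, not part of Jiffy. decode_all/1 uses return_trailer to pull every JSON term out of a single binary, and null_handling/0 shows the three ways null can come back.

```erlang
-module(jiffy_options_sketch).
-export([decode_all/1, null_handling/0]).

%% Decode every JSON term found in Data, in order.
decode_all(Data) ->
    decode_all(Data, []).

decode_all(Data, Acc) ->
    case jiffy:decode(Data, [return_trailer]) of
        {has_trailer, Term, Rest} ->
            %% Non-whitespace remains after the first term: keep going.
            decode_all(Rest, [Term | Acc]);
        Term ->
            %% Nothing but whitespace was left, so this was the last term.
            lists:reverse([Term | Acc])
    end.

%% The same JSON null decoded with the default and two alternatives.
null_handling() ->
    Default = jiffy:decode(<<"null">>),
    Undefined = jiffy:decode(<<"null">>, [{null_term, undefined}]),
    Nil = jiffy:decode(<<"null">>, [use_nil]),
    {Default, Undefined, Nil}.
```

Matching on the has_trailer tuple first is safe because no JSON value decodes to a three-element tuple. Empty input is not handled here: it would raise an error from jiffy:decode/2.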
Erlang                        JSON              Erlang
==========================================================================
null                       -> null           -> null
true                       -> true           -> true
false                      -> false          -> false
"hi"                       -> [104, 105]     -> [104, 105]
<<"hi">>                   -> "hi"           -> <<"hi">>
hi                         -> "hi"           -> <<"hi">>
1                          -> 1              -> 1
1.25                       -> 1.25           -> 1.25
[]                         -> []             -> []
[true, 1.0]                -> [true, 1.0]    -> [true, 1.0]
{[]}                       -> {}             -> {[]}
{[{foo, bar}]}             -> {"foo": "bar"} -> {[{<<"foo">>, <<"bar">>}]}
{[{<<"foo">>, <<"bar">>}]} -> {"foo": "bar"} -> {[{<<"foo">>, <<"bar">>}]}
#{<<"foo">> => <<"bar">>}  -> {"foo": "bar"} -> #{<<"foo">> => <<"bar">>}
N.B. The last entry in this table is only valid for VMs that support the maps data type (i.e., 17.0 and newer) and client code must pass the return_maps option to jiffy:decode/2.
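The table implies that an encode followed by a decode does not always give back the original term: atoms and atom keys come back as binaries, and maps come back as maps only with return_maps. A small sketch of that round trip, assuming Jiffy is on the code path and the VM supports maps; the module and function names are hypothetical.

```erlang
-module(jiffy_roundtrip_sketch).
-export([roundtrip/1, roundtrip_maps/1]).

%% Encode to JSON and decode again with default options.
%% jiffy:decode/1 accepts iodata, so the encoder output is passed as-is.
roundtrip(EJSON) ->
    jiffy:decode(jiffy:encode(EJSON)).

%% Same round trip, but objects are returned as maps.
roundtrip_maps(EJSON) ->
    jiffy:decode(jiffy:encode(EJSON), [return_maps]).
```

For example, roundtrip({[{foo, bar}]}) yields {[{<<"foo">>, <<"bar">>}]}, matching the corresponding row of the table.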
Jiffy should be in all ways an improvement over EEP0018. It no longer imposes limits on the nesting depth. It is capable of encoding and decoding large numbers and it does quite a bit more validation of UTF-8 in strings.
jiffy:encode/2 with option partial returns a Resource::reference().

jiffy:decode/2 with option max_levels may place a Resource::reference() instead of some json_value().

These resources hold a binary() with the verified JSON data and can be used directly, or as a part of a larger EJSON in jiffy:encode/1,2. These binaries won't be re-encoded; instead, they will be placed directly in the result.

However, using resources has some limitations. The resource is only valid in the node where it was created. If a resource is serialized and deserialized, or if it is sent to another node and back, it remains valid only if the resource has not been garbage collected in the meantime.
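As an illustration of the partial workflow described above, the sketch below encodes one term once and embeds the resulting resource in two larger documents. It is based only on the description in this section and assumes a Jiffy build that supports the partial option; the module name, function name and the "type"/"payload" keys are hypothetical.

```erlang
-module(jiffy_partial_sketch).
-export([wrap_twice/1]).

%% Encode Inner a single time as a partial resource, then reuse that
%% resource inside two different envelopes. According to the text above,
%% the verified JSON held by the resource is placed into each result
%% without being encoded again.
wrap_twice(Inner) ->
    Partial = jiffy:encode(Inner, [partial]),
    A = jiffy:encode({[{<<"type">>, <<"a">>}, {<<"payload">>, Partial}]}),
    B = jiffy:encode({[{<<"type">>, <<"b">>}, {<<"payload">>, Partial}]}),
    {iolist_to_binary(A), iolist_to_binary(B)}.
```

Because of the limitations listed above, Partial is kept inside one function on one node here and is never serialized or sent elsewhere.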