Jiffy - JSON NIFs for Erlang
============================

A JSON parser as a NIF. This is a complete rewrite of the work I did
in EEP0018 that was based on Yajl. This new version is a hand crafted
state machine that does its best to be as quick and efficient as
possible while not placing any constraints on the parsed JSON.

[![Build Status](https://travis-ci.org/davisp/jiffy.svg?branch=master)](https://travis-ci.org/davisp/jiffy)

Usage
-----

Jiffy is a simple API. The only thing that might catch you off guard
is that the return type of `jiffy:encode/1` is an iolist even though
it returns a binary most of the time.

A quick note on unicode. Jiffy only understands UTF-8 in binaries. End
of story.

Errors are raised as error exceptions.

    Eshell V5.8.2  (abort with ^G)
    1> jiffy:decode(<<"{\"foo\": \"bar\"}">>).
    {[{<<"foo">>,<<"bar">>}]}
    2> Doc = {[{foo, [<<"bing">>, 2.3, true]}]}.
    {[{foo,[<<"bing">>,2.3,true]}]}
    3> jiffy:encode(Doc).
    <<"{\"foo\":[\"bing\",2.3,true]}">>

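
The two caveats above (the iolist return type and errors raised as
exceptions) can be handled with a pair of small wrappers. This is a
minimal sketch, assuming `jiffy` is on the code path; the module name
`jiffy_usage_sketch` and both function names are hypothetical and not
part of Jiffy.

```erlang
-module(jiffy_usage_sketch).
-export([encode_to_binary/1, safe_decode/1]).

%% jiffy:encode/1 may return an iolist, so flatten it before comparing,
%% measuring, or pattern matching on the result.
encode_to_binary(EJSON) ->
    iolist_to_binary(jiffy:encode(EJSON)).

%% Decode failures are raised with class `error`. Turn them into a
%% tagged tuple so the caller can decide what to do.
safe_decode(Data) ->
    try
        {ok, jiffy:decode(Data)}
    catch
        error:Reason ->
            {error, Reason}
    end.
```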
`jiffy:decode/1,2`
------------------

* `jiffy:decode(IoData)`
* `jiffy:decode(IoData, Options)`

The options for decode are:

* `return_maps` - Tell Jiffy to return objects using the maps data type
  on VMs that support it. This raises an error on VMs that don't support
  maps.
* `{null_term, Term}` - Returns the specified `Term` instead of `null`
  when decoding JSON. This is for people that wish to use `undefined`
  instead of `null`.
* `use_nil` - Returns the atom `nil` instead of `null` when decoding
  JSON. This is shorthand for `{null_term, nil}`.
* `return_trailer` - If any non-whitespace is found after the first
  JSON term is decoded, the return value of `jiffy:decode/2` becomes
  `{has_trailer, FirstTerm, RestData::iodata()}`. This is useful to
  decode multiple terms in a single binary.
* `dedupe_keys` - If a key is repeated in a JSON object this flag
  will ensure that the parsed object only contains a single entry
  containing the last value seen. This mirrors the parsing behavior
  of virtually every other JSON parser.
* `copy_strings` - Normally, when strings are decoded, they are
  created as sub-binaries of the input data. With some workloads, this
  leads to an undesirable bloating of memory: strings in the decode
  result keep a reference to the full JSON document alive. Setting
  this option will instead allocate new binaries for each string, so
  the original JSON document can be garbage collected even though
  the decode result is still in use.
* `{max_levels, N}` where N >= 0 - This limits decoding by depth:
  after N levels are decoded, the rest is returned as a
  `Resource::reference()`. Resources have some limitations, see the
  [Partial JSONs section](#partial-jsons).
* `{bytes_per_red, N}` where N >= 0 - This controls the number of
  bytes that Jiffy will process as an equivalent to a reduction. Each
  20 reductions we consume 1% of our allocated time slice for the current
  process. When the Erlang VM indicates that the time slice is used up,
  Jiffy returns from the NIF.
* `{bytes_per_iter, N}` where N >= 0 - Backwards compatible option
  that is converted into the `bytes_per_red` value.

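
A few of the decode options above are easier to see in use. The
following is a minimal sketch, assuming `jiffy` is on the code path and
a VM with maps support; the module name `jiffy_decode_sketch` and its
function names are hypothetical.

```erlang
-module(jiffy_decode_sketch).
-export([decode_as_map/1, decode_last_wins/1, decode_all/1]).

%% Objects come back as maps and JSON null comes back as `undefined`.
decode_as_map(Data) ->
    jiffy:decode(Data, [return_maps, {null_term, undefined}]).

%% A repeated key keeps only the last value seen.
decode_last_wins(Data) ->
    jiffy:decode(Data, [dedupe_keys]).

%% Decode every JSON term in one chunk of iodata by following the
%% `has_trailer` tuple until no non-whitespace data is left over.
decode_all(Data) ->
    case jiffy:decode(Data, [return_trailer]) of
        {has_trailer, Term, Rest} ->
            [Term | decode_all(Rest)];
        Term ->
            [Term]
    end.
```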
`jiffy:encode/1,2`
------------------

* `jiffy:encode(EJSON)`
* `jiffy:encode(EJSON, Options)`

where EJSON is a valid representation of JSON in Erlang according to
the table below.

The options for encode are:

* `uescape` - Escapes UTF-8 sequences to produce a 7-bit clean output
* `pretty` - Produce JSON using two-space indentation
* `force_utf8` - Force strings to encode as UTF-8 by fixing broken
  surrogate pairs and/or using the replacement character to remove
  broken UTF-8 sequences in data.
* `use_nil` - Encodes the atom `nil` as `null`.
* `escape_forward_slashes` - Escapes the `/` character which can be
  useful when encoding URLs in some cases.
* `partial` - Instead of returning an `iodata()`, returns a
  `Resource::reference()` which holds the verified raw JSON. This
  resource can be used as a building block for larger JSON documents
  without the need to encode these blocks again. Resources have some
  limitations, see the [Partial JSONs section](#partial-jsons).
* `{bytes_per_red, N}` - Refer to the decode options
* `{bytes_per_iter, N}` - Refer to the decode options

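
As an illustration of the encode options above, here is a minimal
sketch, assuming `jiffy` is on the code path. The module name
`jiffy_encode_sketch` and its function names are hypothetical; each
wrapper flattens the result with `iolist_to_binary/1` because encode
may return an iolist.

```erlang
-module(jiffy_encode_sketch).
-export([pretty/1, ascii_only/1, with_nil/1, escaped_slashes/1]).

%% Human-readable output with two-space indentation.
pretty(EJSON) ->
    iolist_to_binary(jiffy:encode(EJSON, [pretty])).

%% Non-ASCII characters are written as \u escapes, so every byte of
%% the output is below 128.
ascii_only(EJSON) ->
    iolist_to_binary(jiffy:encode(EJSON, [uescape])).

%% The atom `nil` is written as JSON null.
with_nil(EJSON) ->
    iolist_to_binary(jiffy:encode(EJSON, [use_nil])).

%% Every `/` in a string is written as `\/`.
escaped_slashes(EJSON) ->
    iolist_to_binary(jiffy:encode(EJSON, [escape_forward_slashes])).
```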
`jiffy:validate/1,2`
--------------------

* `jiffy:validate(IoData)`
* `jiffy:validate(IoData, Options)`

Performs a fast decode to check that IoData is valid JSON. It accepts the
same Options as `jiffy:decode/2`, although some of them may not be
meaningful for validation.

Returns a boolean instead of an EJSON term.

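
One possible use of validation is filtering a batch of payloads before
any of them are decoded. This is a sketch based only on the description
above: it assumes `jiffy:validate/1` returns `false` for invalid input
rather than raising, which has not been verified here. The module name
`jiffy_validate_sketch` and the function `valid_only/1` are
hypothetical.

```erlang
-module(jiffy_validate_sketch).
-export([valid_only/1]).

%% Keep only the payloads that validate as JSON. The explicit
%% comparison with `true` guards against a non-boolean return value.
valid_only(Payloads) ->
    [P || P <- Payloads, jiffy:validate(P) =:= true].
```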
Data Format
-----------

    Erlang                          JSON              Erlang
    ==========================================================================
    null                       -> null           -> null
    true                       -> true           -> true
    false                      -> false          -> false
    "hi"                       -> [104, 105]     -> [104, 105]
    <<"hi">>                   -> "hi"           -> <<"hi">>
    hi                         -> "hi"           -> <<"hi">>
    1                          -> 1              -> 1
    1.25                       -> 1.25           -> 1.25
    []                         -> []             -> []
    [true, 1.0]                -> [true, 1.0]    -> [true, 1.0]
    {[]}                       -> {}             -> {[]}
    {[{foo, bar}]}             -> {"foo": "bar"} -> {[{<<"foo">>, <<"bar">>}]}
    {[{<<"foo">>, <<"bar">>}]} -> {"foo": "bar"} -> {[{<<"foo">>, <<"bar">>}]}
    #{<<"foo">> => <<"bar">>}  -> {"foo": "bar"} -> #{<<"foo">> => <<"bar">>}

N.B. The last entry in this table is only valid for VMs that support
the `maps` data type (i.e., 17.0 and newer) and client code must pass
the `return_maps` option to `jiffy:decode/2`.

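
The table reads left to right as an encode followed by a decode, so the
term you get back is not always the term you put in: atoms come back as
binaries and Erlang strings come back as lists of integers. A minimal
sketch of that round trip, assuming `jiffy` is on the code path and a
VM with maps support (the module name `jiffy_roundtrip_sketch` and its
function names are hypothetical):

```erlang
-module(jiffy_roundtrip_sketch).
-export([roundtrip/1, roundtrip_maps/1]).

%% Encode a term and decode the result with default options.
%% jiffy:decode/1 accepts iodata, so the iolist needs no flattening.
roundtrip(EJSON) ->
    jiffy:decode(jiffy:encode(EJSON)).

%% Same round trip, but objects are returned as maps.
roundtrip_maps(EJSON) ->
    jiffy:decode(jiffy:encode(EJSON), [return_maps]).
```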
Improvements over EEP0018
-------------------------

Jiffy should be in all ways an improvement over EEP0018. It no longer
imposes limits on the nesting depth. It is capable of encoding and
decoding large numbers and it does quite a bit more validation of UTF-8
in strings.

Partial JSONs
-------------

`jiffy:encode/2` with option `partial` returns a `Resource::reference()`.

`jiffy:decode/2` with option `max_levels` may place a `Resource::reference()`
instead of some `json_value()`.

These resources hold a `binary()` with the verified JSON data and can be used
directly, or as a part of a larger EJSON in `jiffy:encode/1,2`. These binaries
are not re-encoded; they are placed directly in the result.

However, using resources has some limitations: a resource is only valid on
the node where it was created. If a resource is serialized and deserialized, or
if it is sent to another node and back, it is still valid only if it has not
been garbage collected in the meantime.

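
Both ways of obtaining a resource might be used as sketched below. This
follows only the description in this section and has not been verified
against the implementation; in particular, the exact meaning of a
"level" for `max_levels` is an assumption. The module name
`jiffy_partial_sketch` and its function names are hypothetical.

```erlang
-module(jiffy_partial_sketch).
-export([wrap_items/1, reencode_shallow/1]).

%% Encode each item once into a resource, then embed the resources in
%% a larger document. The items should not be encoded a second time.
wrap_items(Items) ->
    Parts = [jiffy:encode(Item, [partial]) || Item <- Items],
    iolist_to_binary(jiffy:encode({[{<<"items">>, Parts}]})).

%% Decode only the outermost level and encode the result again.
%% Deeper values are expected to stay as resources and to be copied
%% into the output unchanged.
reencode_shallow(Data) ->
    Shallow = jiffy:decode(Data, [{max_levels, 1}]),
    iolist_to_binary(jiffy:encode(Shallow)).
```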