erlAarango binary serialization library

  1. VelocyPack (VPack)
  2. ==================
  3. Version 1
  4. VelocyPack (VPack) is a fast and compact serialization format.
  5. ## Generalities
  6. VPack is (unsigned) byte oriented, so VPack values are simply sequences
  7. of bytes and are platform independent. Values are not necessarily
  8. aligned, so all access to larger subvalues must be properly organised to
  9. avoid alignment assumptions of the CPU.
  10. ## Value types
  11. We describe a single VPack value, which is recursive in nature, but
  12. resides (with two exceptions, see below) in a single contiguous block of
  13. memory. Assume that the value starts at address A; the first byte V
  14. indicates the type (and often the length) of the VPack value at hand.
  15. We first give an overview with a brief but accurate description for
  16. reference; for arrays and objects, see the details further below:
  17. - `0x00` : none - this indicates absence of any type and value, this is not allowed in VPack values
  18. - `0x01` : empty array
  19. - `0x02` : array without index table (all subitems have the same byte length), 1-byte byte length
  20. - `0x03` : array without index table (all subitems have the same byte length), 2-byte byte length
  21. - `0x04` : array without index table (all subitems have the same byte length), 4-byte byte length
  22. - `0x05` : array without index table (all subitems have the same byte length), 8-byte byte length
  23. - `0x06` : array with 1-byte index table offsets, bytelen and # subvals
  24. - `0x07` : array with 2-byte index table offsets, bytelen and # subvals
  25. - `0x08` : array with 4-byte index table offsets, bytelen and # subvals
  26. - `0x09` : array with 8-byte index table offsets, bytelen and # subvals
  27. - `0x0a` : empty object
  28. - `0x0b` : object with 1-byte index table offsets, sorted by attribute name, 1-byte bytelen and # subvals
  29. - `0x0c` : object with 2-byte index table offsets, sorted by attribute name, 2-byte bytelen and # subvals
  30. - `0x0d` : object with 4-byte index table offsets, sorted by attribute name, 4-byte bytelen and # subvals
  31. - `0x0e` : object with 8-byte index table offsets, sorted by attribute name, 8-byte bytelen and # subvals
  32. - `0x0f` : object with 1-byte index table offsets, not sorted by attribute name, 1-byte bytelen and # subvals
  33. - `0x10` : object with 2-byte index table offsets, not sorted by attribute name, 2-byte bytelen and # subvals
  34. - `0x11` : object with 4-byte index table offsets, not sorted by attribute name, 4-byte bytelen and # subvals
  35. - `0x12` : object with 8-byte index table offsets, not sorted by attribute name, 8-byte bytelen and # subvals
  36. - `0x13` : compact array, no index table
  37. - `0x14` : compact object, no index table
  38. - `0x15`-`0x16` : reserved
  39. - `0x17` : illegal - this type can be used to indicate a value that is illegal in the embedding application
  40. - `0x18` : null
  41. - `0x19` : false
  42. - `0x1a` : true
  43. - `0x1b` : double IEEE-754, 8 bytes follow, stored as little endian uint64 equivalent
  44. - `0x1c` : UTC-date in milliseconds since the epoch, stored as 8 byte signed int, little endian, two's complement
  45. - `0x1d` : external (only in memory): a char* pointing to the actual place in memory, where another VPack item
  46. resides, not allowed in VPack values on disk or on the network
  47. - `0x1e` : minKey, nonsensical value that compares < than all other values
  48. - `0x1f` : maxKey, nonsensical value that compares > than all other values
  49. - `0x20`-`0x27` : signed int, little endian, 1 to 8 bytes, number of bytes is V - `0x1f`, two's complement
  50. - `0x28`-`0x2f` : uint, little endian, 1 to 8 bytes, number of bytes is V - `0x27`
  51. - `0x30`-`0x39` : small integers 0, 1, ... 9
  52. - `0x3a`-`0x3f` : small negative integers -6, -5, ..., -1
  53. - `0x40`-`0xbe` : UTF-8-string, using V - `0x40` bytes (not Unicode characters!), length 0 is possible, so `0x40` is the
  54. empty string, maximal length is 126, note that strings here are not zero-terminated and may contain NUL bytes
  55. - `0xbf` : long UTF-8-string, next 8 bytes are length of string in bytes (not Unicode characters) as little
  56. endian unsigned integer, note that long strings are not zero-terminated and may contain NUL bytes
  57. - `0xc0`-`0xc7` : binary blob, next V - `0xbf` bytes are the length of blob in bytes, note that binary blobs are not
  58. zero-terminated
  59. - `0xc8`-`0xcf` : positive long packed BCD-encoded float, V - `0xc7` bytes follow that encode in a little endian way the
  60. length of the mantissa in bytes. Directly after that follow 4 bytes encoding the (power of 10) exponent, by which the
  61. mantissa is to be multiplied, stored as little endian two's complement signed 32-bit integer. After that, as many
  62. bytes follow as the length information at the beginning has specified, each byte encodes two digits in big-endian
  63. packed BCD. Example: 12345 decimal can be encoded as
  64. `c8 03 00 00 00 00 01 23 45` or
  65. `c8 03 ff ff ff ff 12 34 50`
  66. - `0xd0`-`0xd7` : negative long packed BCD-encoded float, V - `0xcf` bytes follow that encode in a little endian way the
  67. length of the mantissa in bytes. After that, same as positive long packed BCD-encoded float above.
  68. - `0xd8`-`0xed` : reserved
  69. - `0xee`-`0xef` : value tagging for logical types
  70. - `0xf0`-`0xff` : custom types
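
The ranges in the overview list can be turned into a small lookup table, which is handy when reading a hex dump. The following Python sketch is illustrative only: `TYPE_RANGES` and `classify` are hypothetical names, and the category labels are shortened from the list above.

```python
# Type-byte ranges copied from the overview list; labels are abbreviated.
TYPE_RANGES = [
    (0x00, 0x00, "none"),
    (0x01, 0x09, "array"),
    (0x0a, 0x12, "object"),
    (0x13, 0x13, "compact array"),
    (0x14, 0x14, "compact object"),
    (0x15, 0x16, "reserved"),
    (0x17, 0x17, "illegal"),
    (0x18, 0x18, "null"),
    (0x19, 0x1a, "boolean"),
    (0x1b, 0x1b, "double"),
    (0x1c, 0x1c, "utc-date"),
    (0x1d, 0x1d, "external"),
    (0x1e, 0x1e, "minKey"),
    (0x1f, 0x1f, "maxKey"),
    (0x20, 0x27, "signed int"),
    (0x28, 0x2f, "unsigned int"),
    (0x30, 0x3f, "small int"),
    (0x40, 0xbe, "short string"),
    (0xbf, 0xbf, "long string"),
    (0xc0, 0xc7, "binary"),
    (0xc8, 0xd7, "bcd"),
    (0xd8, 0xed, "reserved"),
    (0xee, 0xef, "tagged"),
    (0xf0, 0xff, "custom"),
]


def classify(type_byte: int) -> str:
    """Return the category label for a VPack type byte (0..255)."""
    for low, high, name in TYPE_RANGES:
        if low <= type_byte <= high:
            return name
    raise ValueError("type byte must be in the range 0..255")
```
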
  71. ## Arrays
  72. Empty arrays are simply a single byte `0x01`.
  73. We next describe the type cases `0x02` to `0x09`, see below for the special compact type `0x13`.
  74. Non-empty arrays look like one of the following:
  75. one of 0x02 to 0x05
  76. BYTELENGTH
  77. OPTIONAL UNUSED: padding
  78. sub VPack values
  79. or
  80. 0x06
  81. BYTELENGTH in 1 byte
  82. NRITEMS in 1 byte
  83. OPTIONAL UNUSED: 6 bytes of padding
  84. sub VPack values
  85. INDEXTABLE with 1 byte per entry
  86. or
  87. 0x07
  88. BYTELENGTH in 2 bytes
  89. NRITEMS in 2 bytes
  90. OPTIONAL UNUSED: 4 bytes of padding
  91. sub VPack values
  92. INDEXTABLE with 2 bytes per entry
  93. or
  94. 0x08
  95. BYTELENGTH in 4 bytes
  96. NRITEMS in 4 bytes
  97. sub VPack values
  98. INDEXTABLE with 4 byte per entry
  99. or
  100. 0x09
  101. BYTELENGTH in 8 bytes
  102. sub VPack values
  103. INDEXTABLE with 8 byte per entry
  104. NRITEMS in 8 bytes
  105. If optional padding is allowed for a type, the padding must be exactly long enough that the length of the padding,
  106. the length of BYTELENGTH and the length of NRITEMS (if present) sum to 8. If the length of BYTELENGTH
  107. is already 8, no padding is allowed. The entire padding must consist of zero bytes (ASCII NUL).
  108. Numbers (for byte length, number of subvalues and offsets in the INDEXTABLE) are little endian unsigned integers, using
  109. 1 byte for types `0x02` and `0x06`, 2 bytes for types `0x03` and `0x07`, 4 bytes for types
  110. `0x04` and `0x08`, and 8 bytes for types `0x05` and `0x09`.
  111. NRITEMS is a single number as described above.
  112. The INDEXTABLE consists of:
  113. - for types `0x06`-`0x09`, an array of offsets (unaligned, in the number format described above); earlier offsets
  114. reside at lower addresses. Offsets are measured from the start of the VPack value.
  115. Non-empty arrays of types `0x06` to `0x09` have a small header including their byte length, the number of subvalues,
  116. then all the subvalues and finally an index table containing offsets to the subvalues. To find the index table, find the
  117. number of subvalues, then the end, and from that the base of the index table, considering how wide its entries are.
  118. For types `0x02` to `0x05` there is no offset table and no number of items. The first item begins at address A+2, A+3,
  119. A+5 or respectively A+9, depending on the type and thus the width of the byte length field. Note the following special
  120. rule: The actual position of the first subvalue is allowed to be further back, after some run of padding zero bytes.
  121. For example, if 2 bytes are used for the byte length (BYTELENGTH), then an optional padding of 6 zero bytes is
  122. allowed to follow, and the actual VPack subvalues can start at A+9. This is to give a program that builds a VPack value
  123. the opportunity to reserve 8 bytes in the beginning and only later find out that fewer bytes suffice to write the byte
  124. length. One can determine the number of subvalues by finding the first subvalue, its byte length, and dividing the
  125. amount of available space by it.
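
The counting rule for types `0x02` to `0x05` can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: `first_item_offset`, `count_items` and the `item_length` callback are hypothetical names. The sketch assumes that a zero byte directly after BYTELENGTH signals padding (since `0x00` is not a valid type byte inside a VPack value) and that the caller can supply the byte length of the subvalue at a given position.

```python
def first_item_offset(buf: bytes) -> int:
    """Offset of the first subvalue in an array of type 0x02..0x05."""
    width = 1 << (buf[0] - 0x02)      # BYTELENGTH uses 1, 2, 4 or 8 bytes
    offset = 1 + width
    # A zero byte here cannot start a subvalue, so it must be padding,
    # which always fills the header up to 9 bytes (first item at A+9).
    if width < 8 and buf[offset] == 0x00:
        offset = 9
    return offset


def count_items(buf: bytes, item_length) -> int:
    """Number of subvalues; item_length(buf, pos) is supplied by the caller."""
    width = 1 << (buf[0] - 0x02)
    byte_length = int.from_bytes(buf[1:1 + width], "little")
    start = first_item_offset(buf)
    size = item_length(buf, start)    # all subvalues have the same length
    return (byte_length - start) // size
```

For an array of small integers every subvalue is one byte long, so `count_items(bytes.fromhex("02 05 31 32 33"), lambda b, p: 1)` counts three items.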
  126. For types `0x06` to `0x09` the offset table describes where the subvalues reside. It is not necessary for the subvalues
  127. to start immediately after the number of subvalues field.
  128. As above, it is allowed to include optional padding. Again, any padding must consist of a run of consecutive zero
  129. bytes (ASCII NUL) and must be exactly long enough to fill up the length of BYTELENGTH and the length of NRITEMS to 8.
  130. For example, if both BYTELENGTH and NRITEMS can be expressed using 2 bytes each, the sum of their lengths is 4. It is
  131. therefore allowed to add 4 bytes of padding here, so that the first subvalue could be at address A+9.
  132. There is one exception for the 8-byte numbers case (type `0x09`):
  133. In this case the number of elements is moved behind the index table. This is to get away without moving memory when one
  134. has reserved 8 bytes in the beginning and later noticed that all 8 bytes are needed for the byte length. For this case
  135. it is not allowed to include any padding.
  136. All offsets are measured from base A.
  137. *Example*:
  138. `[1,2,3]` has the hex dump
  139. 02 05 31 32 33
  140. in the most compact representation, but the following are equally
  141. possible, though not necessarily advised to use:
  142. *Examples*:
  143. 03 06 00 31 32 33
  144. 04 08 00 00 00 31 32 33
  145. 05 0c 00 00 00 00 00 00 00 31 32 33
  146. 06 09 03 31 32 33 03 04 05
  147. 07 0e 00 03 00 31 32 33 05 00 06 00 07 00
  148. 08 18 00 00 00 03 00 00 00 31 32 33 09 00 00 00 0a 00 00 00 0b 00 00 00
  149. 09
  150. 2c 00 00 00 00 00 00 00
  151. 31 32 33
  152. 09 00 00 00 00 00 00 00
  153. 0a 00 00 00 00 00 00 00
  154. 0b 00 00 00 00 00 00 00
  155. 03 00 00 00 00 00 00 00
  156. Note that it is not recommended to encode short arrays in too long a format.
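
To make the layout of types `0x06` to `0x09` concrete, here is a short Python sketch that locates the index table and returns the offsets stored in it. The function name `array_offsets` is hypothetical, and the sketch performs no validation of the input.

```python
def array_offsets(buf: bytes) -> list:
    """Return the index table of an array of type 0x06..0x09."""
    width = 1 << (buf[0] - 0x06)      # 1, 2, 4 or 8 bytes per number
    byte_length = int.from_bytes(buf[1:1 + width], "little")
    if width < 8:
        # NRITEMS directly follows BYTELENGTH in the header.
        n = int.from_bytes(buf[1 + width:1 + 2 * width], "little")
        table = byte_length - n * width
    else:
        # 8-byte case: NRITEMS is stored behind the index table.
        n = int.from_bytes(buf[byte_length - 8:byte_length], "little")
        table = byte_length - 8 - n * 8
    return [
        int.from_bytes(buf[table + i * width:table + (i + 1) * width], "little")
        for i in range(n)
    ]
```

Applied to the dump `06 09 03 31 32 33 03 04 05` from the examples, this yields the offsets 3, 4 and 5, each pointing at one of the small-integer subvalues.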
  157. We now describe the special type `0x13`, which is useful for a particularly compact array representation. Note that to
  158. some extent this goes against the principles of the VelocyPack format, since quick access to subvalues is no longer
  159. possible, all items in the array must be scanned to find a particular one. However, there are certain use cases for
  160. VelocyPack which only require sequential access (for example JSON dumping) and have a particular need for compactness.
  161. The overall format of this array type is
  162. 0x13 as type byte
  163. BYTELENGTH
  164. sub VPack values
  165. NRITEMS
  166. There is no index table at all, although the sub VelocyPack values can
  167. have different byte sizes. The BYTELENGTH and NRITEMS are encoded in a
  168. special format, which we describe now.
  169. The BYTELENGTH consists of 1 to 8 bytes, of which all but the last one
  170. have their high bit set. Thus, the high bits determine how many bytes
  171. are actually used. The lower 7 bits of all these bytes together comprise
  172. the actual byte length in a little endian fashion. That is, the byte at
  173. address A+1 contains the least significant 7 bits (0 to 6) of the byte length,
  174. the following byte at address A+2 contains the bits 7 to 13, and so on.
  175. Since the total number of bytes is limited to 8, this encodes unsigned
  176. integers of up to 56 bits, which is the overall limit for the size of
  177. such a compact array representation.
  178. The NRITEMS entry is encoded essentially the same, except that it is
  179. laid out in reverse order in memory. That is, one has to use the
  180. BYTELENGTH to find the end of the array value and go back byte by byte until
  181. one finds a byte with high bit reset. The last byte (at the highest
  182. memory address) contains the least significant 7 bits of the NRITEMS
  183. value, the second to last one bits 7 to 13 and so on.
  184. Here is an example: the array `[1, 16]` can be encoded as follows:
  185. 13 06
  186. 31 28 10
  187. 02
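
The variable-length number format used for BYTELENGTH and NRITEMS can be sketched in Python as follows. The names `encode_varlen` and `decode_varlen` are hypothetical, and the sketch does not enforce the 8-byte limit. With `reverse=True` the bytes are laid out as described for NRITEMS, and decoding walks backwards from the given position.

```python
def encode_varlen(value: int, reverse: bool = False) -> bytes:
    """Encode an unsigned integer in 7-bit groups, least significant first."""
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)   # more groups follow: set the high bit
        else:
            out.append(low7)          # last group: high bit reset
            break
    return bytes(reversed(out)) if reverse else bytes(out)


def decode_varlen(buf: bytes, pos: int, reverse: bool = False):
    """Decode starting at pos; return (value, number of bytes used)."""
    step = -1 if reverse else 1
    value = shift = used = 0
    while True:
        byte = buf[pos]
        value |= (byte & 0x7F) << shift
        shift += 7
        used += 1
        pos += step
        if not byte & 0x80:
            return value, used
```

For the dump `13 06 31 28 10 02`, decoding forwards at offset 1 gives the byte length 6, and decoding backwards from the last byte gives the item count 2.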
  188. ## Objects
  189. Empty objects are simply a single byte `0x0a`.
  190. We next describe the type cases `0x0b` to `0x12`, see below for the special compact type `0x14`.
  191. Non-empty objects look like this:
  192. one of 0x0b - 0x12
  193. BYTELENGTH
  194. optional NRITEMS
  195. sub VPack values as pairs of attribute and value
  196. optional INDEXTABLE
  197. NRITEMS for the 8-byte case
  198. Numbers (for byte length, number of subvalues and offsets in the INDEXTABLE) are little endian unsigned integers, using
  199. 1 byte for types `0x0b` and `0x0f`, 2 bytes for types `0x0c` and `0x10`, 4 bytes for types
  200. `0x0d` and `0x11`, and 8 bytes for types `0x0e` and `0x12`.
  201. NRITEMS is a single number as described above.
  202. The INDEXTABLE consists of:
  203. - an array of offsets (unaligned, in the number format described
  204. above) earlier offsets reside at lower addresses.
  205. Offsets are measured from the beginning of the VPack value.
  206. Non-empty objects have a small header including their byte length, the number of subvalues, then all the subvalues and
  207. finally an index table containing offsets to the subvalues. To find the index table, find number of subvalues, then the
  208. end, and from that the base of the index table, considering how wide its entries are.
  209. For all types the offset table describes where the subvalues reside. It is not necessary for the subvalues to start
  210. immediately after the number of subvalues field. For performance reasons when building the value, it could be desirable
  211. to reserve 8 bytes for the byte length and the number of subvalues and not fill the gap, even though it turns out later
  212. that offsets and thus the byte length only uses 2 bytes, say.
  213. There is one special case: the empty object is simply stored as the single byte `0x0a`.
  214. There is another exception: For 8-byte numbers (`0x0e` and `0x12`) the number of subvalues is stored behind the INDEXTABLE. This is
  215. to get away without moving memory when one has reserved 8 bytes in the beginning and later noticed that all 8 bytes are
  216. needed for the byte length.
  217. All offsets are measured from base A.
  218. Each entry consists of two parts, the key and the value. Both are encoded as normal VPack values as above. The first is
  219. always a short or long UTF-8 string starting with a byte `0x40`-`0xbf` as described below. The second is any other VPack
  220. value.
  221. There is one extension: For the key it is possible to use the positive small integer values `0x30`-`0x39` or an unsigned
  222. integer starting with a type byte of `0x28`-`0x2f`. Any such integer value is an index into an outside-given table of
  223. attribute names. These are convenient when only very few attribute names occur or some are repeated very often. The
  224. standard way to encode such an attribute name table is as a VPack array of strings as specified here.
  225. Objects of types `0x0b` to `0x0e` are stored with sorted key/value pairs, sorted by bytewise comparison of the keys on each nesting level.
  226. Sorting has some overhead but will allow looking up keys in logarithmic time later. Note that only the index table needs
  227. to be sorted, it is not required that the offsets in these tables are increasing. Since the index table resides after
  228. the actual subvalues, one can build up a complex VPack value by writing linearly.
  229. Example: the object `{"a": 12, "b": true, "c": "xyz"}` can have the hexdump:
  230. 0b
  231. 13 03
  232. 41 62 1a
  233. 41 61 28 0c
  234. 41 63 43 78 79 7a
  235. 06 03 0a
  236. The same object could have been done with an index table with longer
  237. entries, as in this example:
  238. 0d
  239. 22 00 00 00
  240. 03 00 00 00
  241. 41 62 1a
  242. 41 61 28 0c
  243. 41 63 43 78 79 7a
  244. 0c 00 00 00 09 00 00 00 10 00 00 00
  245. Similarly with type `0x0c` and 2-byte offsets, byte length and number of subvalues, or with type `0x0e` and 8-byte
  246. numbers.
  247. Note that it is not recommended to encode short objects with too long index tables.
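
The logarithmic-time lookup enabled by the sorted index table can be sketched in Python as follows. This is a simplified illustration under stated assumptions: `object_lookup` is a hypothetical name, only the sorted types `0x0b` to `0x0d` are handled (the 8-byte case stores the number of subvalues elsewhere), and keys are assumed to be short UTF-8 strings rather than long strings or integer references into an attribute table.

```python
def object_lookup(buf: bytes, key: bytes):
    """Return the offset of the value stored under key, or None."""
    width = 1 << (buf[0] - 0x0B)      # 1, 2 or 4 bytes per number
    byte_length = int.from_bytes(buf[1:1 + width], "little")
    n = int.from_bytes(buf[1 + width:1 + 2 * width], "little")
    table = byte_length - n * width   # index table sits at the end

    def key_at(i):
        start = table + i * width
        off = int.from_bytes(buf[start:start + width], "little")
        klen = buf[off] - 0x40        # short string: length is in the type byte
        return buf[off + 1:off + 1 + klen], off + 1 + klen

    lo, hi = 0, n - 1
    while lo <= hi:                   # binary search over the sorted table
        mid = (lo + hi) // 2
        candidate, value_offset = key_at(mid)
        if candidate == key:
            return value_offset
        if candidate < key:           # bytewise comparison
            lo = mid + 1
        else:
            hi = mid - 1
    return None
```

In the first hex dump of the example, looking up `b"a"` returns offset 8, where the value `28 0c` (the integer 12) begins, even though the pair for `"b"` is stored first.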
  248. ### Special compact objects
  249. We now describe the special type `0x14`, which is useful for a particularly compact object representation. Note that to
  250. some extent this goes against the principles of the VelocyPack format, since quick access to subvalues is no longer
  251. possible, all key/value pairs in the object must be scanned to find a particular one. However, there are certain use
  252. cases for VelocyPack which only require sequential access
  253. (for example JSON dumping) and have a particular need for compactness.
  254. The overall format of this object type is
  255. 0x14 as type byte
  256. BYTELENGTH
  257. sub VPack key/value pairs
  258. NRPAIRS
  259. There is no index table at all, although the sub VelocyPack values can have different byte sizes. The BYTELENGTH and
  260. NRPAIRS are encoded in a special format, which we describe now. It is the same as for the special compact array
  261. type `0x13`, which we repeat here for the sake of completeness.
  262. The BYTELENGTH consists of 1 to 8 bytes, of which all but the last one have their high bit set. Thus, the high bits
  263. determine how many bytes are actually used. The lower 7 bits of all these bytes together comprise the actual byte length
  264. in a little endian fashion. That is, the byte at address A+1 contains the least significant 7 bits (0 to 6) of the byte
  265. length, the following byte at address A+2 contains the bits 7 to 13, and so on. Since the total number of bytes is
  266. limited to 8, this encodes unsigned integers of up to 56 bits, which is the overall limit for the size of such a compact
  267. object representation.
  268. The NRPAIRS entry is encoded essentially the same, except that it is laid out in reverse order in memory. That is, one
  269. has to use the BYTELENGTH to find the end of the object value and go back byte by byte until one finds a byte with high bit
  270. reset. The last byte (at the highest memory address) contains the least significant 7 bits of the NRPAIRS value, the
  271. second to last one bits 7 to 13 and so on.
  272. Here is an example: the object `{"a":1, "b":16}` can be encoded as follows:
  273. 14 0a
  274. 41 61 31 41 62 28 10
  275. 02
  276. ## Doubles
  277. Type `0x1b` indicates a double IEEE-754 value using the 8 bytes following the type byte. To guarantee
  278. platform independence, the details of the byte order are as follows. Encoding is done by using memcpy to copy the
  279. internal double value to a uint64\_t. This 64-bit unsigned integer is then stored as little endian 8 byte integer in
  280. the VPack value. Decoding works in the opposite direction. This should sort out the undetermined byte order in IEEE-754
  281. in practice.
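
The memcpy-then-little-endian procedure can be mimicked in Python with the `struct` module. This is only a sketch: `encode_double` and `decode_double` are hypothetical names, and the native-order `struct` round trip stands in for the memcpy between a double and a `uint64_t`.

```python
import struct


def encode_double(x: float) -> bytes:
    """Type byte 0x1b followed by the double's bits as little endian uint64."""
    # Reinterpret the double as a 64-bit unsigned integer (memcpy equivalent).
    as_uint64 = struct.unpack("=Q", struct.pack("=d", x))[0]
    return b"\x1b" + as_uint64.to_bytes(8, "little")


def decode_double(buf: bytes) -> float:
    """Inverse of encode_double; buf starts at the type byte."""
    as_uint64 = int.from_bytes(buf[1:9], "little")
    return struct.unpack("=d", struct.pack("=Q", as_uint64))[0]
```
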
  282. ## Dates
  283. Type `0x1c` indicates a signed 64-bit integer stored in 8 bytes little endian two's complement notation directly after
  284. the type. The value means a universal UTC-time measured in milliseconds since the epoch, which is 00:00 on 1 January
  285. 1970 UTC.
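
As an illustration, a timezone-aware Python `datetime` can be converted to and from this representation as sketched here. The names `EPOCH`, `encode_utc_date` and `decode_utc_date` are hypothetical, and sub-millisecond precision is dropped by flooring.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def encode_utc_date(dt: datetime) -> bytes:
    """Type byte 0x1c followed by signed milliseconds since the epoch."""
    millis = (dt - EPOCH) // timedelta(milliseconds=1)
    return b"\x1c" + millis.to_bytes(8, "little", signed=True)


def decode_utc_date(buf: bytes) -> datetime:
    """Inverse of encode_utc_date; buf starts at the type byte."""
    millis = int.from_bytes(buf[1:9], "little", signed=True)
    return EPOCH + timedelta(milliseconds=millis)
```
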
  286. ## External VPack values
  287. This type is only for use within memory, not for data exchange over disk
  288. or network. Therefore, we only need to specify that the following k
  289. bytes, where k is the size of a char* on the current architecture, are the memcpy of such a char*. That char*
  290. points to the actual VPack value elsewhere in memory.
  291. ## Artificial minimal and maximal keys
  292. These values of types `0x1e` and `0x1f` have no meaning other than comparing smaller or greater respectively than any
  293. other VPack value. The idea is that these can be used in systems that define a total order on all VPack values to
  294. specify left or right ends of infinite intervals.
  295. ## Integer types
  296. There are different ways to specify integers. For small values from -6 to 9 inclusive, there are specific type
  297. bytes in the range `0x30` to `0x3f` to allow for storage in a single byte. Beyond that, there are signed and
  298. unsigned integer types that encode in the type byte the number of bytes used (ranges `0x20`-`0x27` for signed
  299. and `0x28`-`0x2f` for unsigned).
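
A possible way to pick the shortest integer representation is sketched below in Python. The names `encode_int` and `decode_int` are hypothetical, and the policy of using the unsigned types for all positive values above 9 is an assumption of this sketch, not a requirement of the format.

```python
def encode_int(n: int) -> bytes:
    """Encode n using the shortest of the integer representations."""
    if 0 <= n <= 9:
        return bytes([0x30 + n])      # small integers 0..9
    if -6 <= n <= -1:
        return bytes([0x40 + n])      # -6 -> 0x3a, ..., -1 -> 0x3f
    if n > 0:
        size = (n.bit_length() + 7) // 8
        type_base, signed = 0x27, False
    else:
        # Two's complement needs one extra bit for the sign.
        size = ((~n).bit_length() + 8) // 8
        type_base, signed = 0x1F, True
    if size > 8:
        raise OverflowError("value does not fit into 8 bytes")
    return bytes([type_base + size]) + n.to_bytes(size, "little", signed=signed)


def decode_int(buf: bytes) -> int:
    """Decode any of the integer types; buf starts at the type byte."""
    v = buf[0]
    if 0x30 <= v <= 0x39:
        return v - 0x30
    if 0x3A <= v <= 0x3F:
        return v - 0x40
    if 0x20 <= v <= 0x27:
        return int.from_bytes(buf[1:1 + v - 0x1F], "little", signed=True)
    if 0x28 <= v <= 0x2F:
        return int.from_bytes(buf[1:1 + v - 0x27], "little")
    raise ValueError("not an integer type byte")
```

With this policy, 12 becomes `28 0c` and 16 becomes `28 10`, matching the encodings used in the array and object examples.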
  300. ## Null and boolean values
  301. These three values use a single byte to store the corresponding JSON values.
  302. ## Strings
  303. Strings are stored as UTF-8 encoded byte sequences. There are two variants, a short one and a long one. In the short
  304. one, the byte length
  305. (not the number of UTF-8 characters) is directly encoded in the type, and this works up to and including byte length
  306. 126. Types `0x40` to `0xbe`
  307. are used for this and the byte length is V - `0x40`, if V is the type byte. For strings longer than 126 bytes, the
  308. type byte is `0xbf` and the byte length of the string is stored in the first 8 bytes after the type byte, using a
  309. little endian unsigned integer representation. The actual string follows after these 8 bytes. There is no
  310. terminating zero byte in either case and the string may contain zero bytes.
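
The two string variants can be sketched in Python as follows. The names `encode_string` and `decode_string` are hypothetical. Note that the length is taken from the UTF-8 encoded bytes, not from the number of characters, and that the type byte of a short string is `0x40` plus that byte length.

```python
def encode_string(text: str) -> bytes:
    """Encode text as a short (up to 126 bytes) or long UTF-8 string."""
    data = text.encode("utf-8")
    if len(data) <= 126:
        return bytes([0x40 + len(data)]) + data
    return b"\xbf" + len(data).to_bytes(8, "little") + data


def decode_string(buf: bytes) -> str:
    """Decode a short or long string; buf starts at the type byte."""
    v = buf[0]
    if 0x40 <= v <= 0xBE:
        return buf[1:1 + v - 0x40].decode("utf-8")
    if v == 0xBF:
        length = int.from_bytes(buf[1:9], "little")
        return buf[9:9 + length].decode("utf-8")
    raise ValueError("not a string type byte")
```
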
  311. ## Binary data
  312. The type bytes `0xc0` to `0xc7` allow storing arbitrary binary byte sequences as a VPack value. The format is as
  313. follows: If V is the type byte, then V - `0xbf` bytes follow it to make a little endian unsigned integer representing
  314. the length of the binary data, which directly follows these length bytes. No alignment is guaranteed. The content is
  315. entirely up to the user.
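
A minimal Python sketch of this layout follows. The names `encode_binary` and `decode_binary` are hypothetical, and choosing the smallest possible number of length bytes is an assumption of the sketch.

```python
def encode_binary(data: bytes) -> bytes:
    """Encode a blob using the fewest length bytes that can hold its size."""
    width = max(1, (len(data).bit_length() + 7) // 8)
    return bytes([0xBF + width]) + len(data).to_bytes(width, "little") + data


def decode_binary(buf: bytes) -> bytes:
    """Return the blob payload; buf starts at the type byte."""
    width = buf[0] - 0xBF             # number of length bytes, 1 to 8
    length = int.from_bytes(buf[1:1 + width], "little")
    start = 1 + width
    return buf[start:start + length]
```
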
  316. ## Packed BCD long floats
  317. These types are used to represent arbitrary precision decimal numbers.
  318. There are different types for positive and negative numbers. The overall
  319. format of these values is:
  320. one of 0xc8 - 0xcf (positive) or of 0xd0 - 0xd7 (negative)
  321. LENGTH OF MANTISSA in bytes
  322. EXPONENT (as 4-byte little endian signed two's complement integer)
  323. MANTISSA (as packed BCD-encoded integer, big-endian)
  324. The type byte describes the sign of the number as well as the number of bytes used to specify the byte length of the
  325. mantissa. As usual, if V is the type byte, then V - `0xc7` (in the positive case) or V - `0xcf` (in the negative case)
  326. bytes are used for the length of the mantissa, stored as little endian unsigned integer directly after the type byte.
  327. After this follow exactly 4 bytes (little endian signed two's complement integer) to specify the exponent. After the
  328. exponent, the actual mantissa bytes follow.
  329. Packed BCD is used, so that each byte stores exactly 2 decimal digits as in `0x34` for the decimal digits 34. Therefore,
  330. the mantissa always has an even number of decimal digits. Note that the mantissa is stored in big endian form, to make
  331. parsing and dumping efficient. This leads to the
  332. "unholy nibble problem": When a JSON parser sees the beginning of a longish number, it does not know whether an even or
  333. odd number of digits follow. However, for efficiency reasons it wants to start writing bytes to the output as it reads
  334. the input. This is where the exponent comes to the rescue, as illustrated by the following example. 12345 decimal
  335. can be encoded as:
  336. c8 03 00 00 00 00 01 23 45
  337. c8 03 ff ff ff ff 12 34 50
  338. The former encoding puts a leading 0 in the first byte and uses exponent 0, the latter encoding directly starts putting
  339. two decimal digits in one byte and then in the end has to "erase" the trailing 0 by using exponent -1, encoded by the 4
  340. byte sequence `ff ff ff ff`.
  341. Therefore, the unholy nibble problem is solved and parsing (and indeed dumping) can be efficient.
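
Decoding such a value can be sketched in Python with the standard `decimal` module, which keeps the result exact. The name `decode_bcd` is hypothetical. The sketch relies on the fact that the hex representation of packed BCD bytes is the sequence of decimal digits.

```python
from decimal import Decimal


def decode_bcd(buf: bytes) -> Decimal:
    """Decode a packed BCD long float; buf starts at the type byte."""
    v = buf[0]
    if 0xC8 <= v <= 0xCF:
        sign, width = "", v - 0xC7
    elif 0xD0 <= v <= 0xD7:
        sign, width = "-", v - 0xCF
    else:
        raise ValueError("not a BCD type byte")
    length = int.from_bytes(buf[1:1 + width], "little")
    pos = 1 + width
    exponent = int.from_bytes(buf[pos:pos + 4], "little", signed=True)
    pos += 4
    # Each mantissa byte holds two digits, most significant byte first.
    digits = buf[pos:pos + length].hex() or "0"
    return Decimal(f"{sign}{digits}E{exponent}")
```

Both encodings of 12345 shown above decode to the same numeric value, the second one via mantissa 123450 and exponent -1.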
  342. ## Tagging
  343. Types `0xee`-`0xef` are used for tagging of values to implement logical types.
  344. For example, if type `0x1c` did not exist, the database driver could serialize a timestamp object (Date in JavaScript,
  345. Instant in Java, etc)
  346. into a Unix timestamp, a 64-bit integer. Assuming the lack of schema, upon deserialization it would not be possible to
  347. tell an integer from a timestamp and deserialize the value accordingly.
  348. Type tagging resolves this by attaching an integer tag to values that can then be read when deserializing the value,
  349. e.g. that tag=1 is a timestamp and the relevant timestamp class should be used.
  350. The tag values are specified separately and applications can also specify their own to have the database driver
  351. deserialize their specific data types into the appropriate classes (including models).
  352. Essentially this is object-relational mapping for parts of documents.
  353. The format of the type is:
  354. 0xee
  355. TAG number in 1 byte
  356. sub VPack value
  357. or
  358. 0xef
  359. TAG number in 8 bytes, little-endian encoding
  360. sub VPack value
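
Wrapping and unwrapping a tagged value can be sketched in Python as follows. The names `encode_tagged` and `decode_tagged` are hypothetical, and using the 1-byte form whenever the tag fits into one byte is an assumption of the sketch.

```python
def encode_tagged(tag: int, value: bytes) -> bytes:
    """Prefix an already encoded VPack value with a tag."""
    if 0 <= tag < 0x100:
        return b"\xee" + bytes([tag]) + value
    return b"\xef" + tag.to_bytes(8, "little") + value


def decode_tagged(buf: bytes):
    """Return (tag, encoded sub value); buf starts at the type byte."""
    if buf[0] == 0xEE:
        return buf[1], buf[2:]
    if buf[0] == 0xEF:
        return int.from_bytes(buf[1:9], "little"), buf[9:]
    raise ValueError("not a tagged value")
```
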
  361. ## Custom types
  362. Note that custom types should usually not be used for data exchange but
  363. only internally in systems. Nevertheless, the design of this part of
  364. the specification is made such that it is possible by generic methods
  365. to derive the byte length of each custom data type.
  366. The following user-defined types exist:
  367. - `0xf0` : 1 byte payload, directly following the type byte
  368. - `0xf1` : 2 bytes payload, directly following the type byte
  369. - `0xf2` : 4 bytes payload, directly following the type byte
  370. - `0xf3` : 8 bytes payload, directly following the type byte
  371. - `0xf4`-`0xf6` : length of the payload is described by a single further unsigned byte directly following the type byte,
  372. the payload of that many bytes follows
  373. - `0xf7`-`0xf9` : length of the payload is described by two bytes (little endian unsigned integer) directly following
  374. the type byte, the payload of that many bytes follows
  375. - `0xfa`-`0xfc` : length of the payload is described by four bytes (little endian unsigned integer) directly following
  376. the type byte, the payload of that many bytes follows
  377. - `0xfd`-`0xff` : length of the payload is described by eight bytes (little endian unsigned integer) directly following
  378. the type byte, the payload of that many bytes follows
  379. Note: In types `0xf4` to `0xff` the "payload" refers to the actual data not including the length specification.
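
The generic derivation of the byte length mentioned above can be sketched in Python as follows. The function name `custom_byte_length` is hypothetical. The result is the total size of the value, including the type byte and any length specification.

```python
def custom_byte_length(buf: bytes) -> int:
    """Total byte length of a custom-type value starting at buf[0]."""
    v = buf[0]
    if 0xF0 <= v <= 0xF3:
        # Fixed payload of 1, 2, 4 or 8 bytes.
        return 1 + (1 << (v - 0xF0))
    if 0xF4 <= v <= 0xFF:
        # Groups of three type bytes share a length width of 1, 2, 4 or 8.
        width = 1 << ((v - 0xF4) // 3)
        payload = int.from_bytes(buf[1:1 + width], "little")
        return 1 + width + payload
    raise ValueError("not a custom type byte")
```
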
  380. ## Portability
  381. Serialized booleans, integers, strings, arrays, objects etc. all have a defined endianness and length, which is
  382. platform-independent. These types are fully portable in serialized VelocyPack.
  383. There are still a few caveats when it comes to portability:
  384. It is possible to build up very large values on a 64 bit system, but it may not be possible to read them back on a 32
  385. bit system. This is because the maximum memory allocation size on a 32 bit system may be severely limited compared to a
  386. 64 bit system, i.e. a 32 bit OS may simply not allow allocating buffers larger than 4 GB. This is not a limitation of
  387. VelocyPack, but a limitation of 32 bit architectures. If all VelocyPack values are kept small enough so that they are
  388. well below the 32 bit length boundaries, this should not matter though.
  389. The VelocyPack type *External* contains just a raw pointer to memory, which should only be used during the buildup of
  390. VelocyPack values in memory. The *External* type is not supposed to be used in VelocyPack values that are serialized and
  391. stored persistently, and then later read back from persistence. Doing it anyway is not portable and will also pose a
  392. security risk. Not using the *External* type for any data that is serialized will avoid this problem entirely.
  393. The VelocyPack type *Custom* is completely user-defined, and there is no default implementation for them. So it is up to
  394. the embedder to make these custom type bindings portable if portability of them is a concern.
  395. VelocyPack *Double* values are serialized as integer equivalents in a specific way, and deserialized back into integers
  396. that overlay an IEEE-754 double-precision floating point value in memory. We found this to be sufficiently portable for
  397. our needs, although at least in theory there may be portability issues with some systems.
  398. The [following](https://en.wikipedia.org/wiki/Endianness#Floating_point) was used as a backing for our "reasonably
  399. portable in the real world" assumptions:
  400. > It may therefore appear strange that the widespread IEEE 754 floating-point standard does not specify endianness. Theoretically, this means that even standard IEEE floating-point data written by one machine might not be readable by another. However, on modern standard computers (i.e., implementing IEEE 754), one may in practice safely assume that the endianness is the same for floating-point numbers as for integers, making the conversion straightforward regardless of data type.