author | Luke Shumaker <lukeshu@lukeshu.com> | 2023-01-25 21:05:17 -0700 |
---|---|---|
committer | Luke Shumaker <lukeshu@lukeshu.com> | 2023-01-26 00:45:27 -0700 |
commit | ffee5c8516f3f55f82ed5bb8f0a4f340d485fa92 (patch) | |
tree | 0c10526b1ea57b043230402e9378b341c6966965 /README.md | |
parent | 4148776399cb7ea5e10c74dc465e4e1e682cb399 (diff) |
Write documentation (tag: v0.2.0)
Diffstat (limited to 'README.md')
-rw-r--r-- | README.md | 170 |
1 files changed, 170 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..c8e05ab
--- /dev/null
+++ b/README.md
@@ -0,0 +1,170 @@
+<!--
+Copyright (C) 2023 Luke Shumaker <lukeshu@lukeshu.com>
+
+SPDX-License-Identifier: GPL-2.0-or-later
+-->
+
+# lowmemjson
+
+`lowmemjson` is a mostly-compatible alternative to the standard
+library's [`encoding/json`][] that has dramatically lower memory
+requirements for large data structures.
+
+`lowmemjson` is not targeting extremely resource-constrained
+environments, but rather targets being able to efficiently stream
+gigabytes of JSON without requiring gigabytes of memory overhead.
+
+## Compatibility
+
+`encoding/json`'s APIs are designed around the idea that it can buffer
+the entire JSON document as a `[]byte`, and as intermediate steps it
+may have a fragment buffered multiple times while encoding; encoding a
+gigabyte of data may consume several gigabytes of memory. In
+contrast, `lowmemjson`'s APIs are designed around streaming
+(`io.Writer` and `io.RuneScanner`), trying to have the memory overhead
+of encode and decode operations be as close to O(1) as possible.
+
+`lowmemjson` offers a high level of compatibility with the
+`encoding/json` APIs, but for best memory usage (avoiding storing
+large byte arrays inherent in `encoding/json`'s API), it is
+recommended to migrate to `lowmemjson`'s own APIs.
+
+### Callee API (objects to be encoded-to/decoded-from JSON)
+
+`lowmemjson` supports `encoding/json`'s `json:` struct field tags, as
+well as the `encoding/json.Marshaler` and `encoding/json.Unmarshaler`
+interfaces; you do not need to adjust your types to successfully
+migrate from `encoding/json` to `lowmemjson`.
+
+That is: Given types that decode as desired with `encoding/json`,
+those types should decode identically with `lowmemjson`. Given types
+that encode as desired with `encoding/json`, those types should encode
+identically with `lowmemjson` (assuming an appropriately configured
+`ReEncoder` to match the whitespace-handling and special-character
+escaping; a `ReEncoder` with `Compact=true` and all other settings
+left as zero will match the behavior of `json.Marshal`).
+
+For better memory usage:
+ - Instead of implementing [`json.Marshaler`][], consider implementing
+   [`lowmemjson.Encodable`][] (or implementing both).
+ - Instead of implementing [`json.Unmarshaler`][], consider
+   implementing [`lowmemjson.Decodable`][] (or implementing both).
+
+### Caller API
+
+`lowmemjson` offers a [`lowmemjson/compat/json`][] package that is a
+(mostly) drop-in replacement for `encoding/json` (see the package's
+documentation for the small incompatibilities).
+
+For better memory usage, avoid using `lowmemjson/compat/json` and
+instead use `lowmemjson` directly:
+ - Instead of using <code>[json.Marshal][`json.Marshal`](val)</code>,
+   consider using
+   <code>[lowmemjson.NewEncoder][`lowmemjson.NewEncoder`](w).[Encode][`lowmemjson.Encoder.Encode`](val)</code>.
+ - Instead of using
+   <code>[json.Unmarshal][`json.Unmarshal`](dat, &val)</code>, consider
+   using
+   <code>[lowmemjson.NewDecoder][`lowmemjson.NewDecoder`](r).[DecodeThenEOF][`lowmemjson.Decoder.DecodeThenEOF`](&val)</code>.
+ - Instead of using [`json.Compact`][], [`json.HTMLEscape`][], or
+   [`json.Indent`][]; consider using a [`lowmemjson.ReEncoder`][].
+ - Instead of using [`json.Valid`][], consider using a
+   [`lowmemjson.ReEncoder`][] with `io.Discard` as the output.
+
+The error types returned from `lowmemjson` are different from the
+error types returned by `encoding/json`, but `lowmemjson/compat/json`
+translates them back to the types returned by `encoding/json`.
+
+## Overview
+
+### Caller API
+
+There are 3 main types that make up the caller API for producing and
+handling streams of JSON, and each of those types has some associated
+types that go with it:
+
+ 1. `type Decoder`
+    + `type DecodeArgumentError`
+    + `type DecodeError`
+      * `type DecodeReadError`
+      * `type DecodeSyntaxError`
+      * `type DecodeTypeError`
+
+ 2. `type Encoder`
+    + `type EncodeTypeError`
+    + `type EncodeValueError`
+    + `type EncodeMethodError`
+
+ 3. `type ReEncoder`
+    + `type ReEncodeSyntaxError`
+    + `type BackslashEscaper`
+      * `type BackslashEscapeMode`
+
+A `*Decoder` handles decoding a JSON stream into Go values; the most
+common use of it will be
+`lowmemjson.NewDecoder(r).DecodeThenEOF(&val)` or
+`lowmemjson.NewDecoder(bufio.NewReader(r)).DecodeThenEOF(&val)`.
+
+A `*ReEncoder` handles transforming a JSON stream; this is useful for
+prettifying, minifying, sanitizing, and/or validating JSON. A
+`*ReEncoder` wraps an `io.Writer`, itself implementing `io.Writer`.
+The most common use of it will be something along the lines of
+
+```go
+out = &ReEncoder{
+	Out: out,
+	// settings here
+}
+```
+
+An `*Encoder` handles encoding Go values into a JSON stream.
+`*Encoder` doesn't take much care in to making its output nice; so it
+is usually desirable to have the output stream of an `*Encoder` be a `*ReEncoder`; the most
+common use of it will be
+
+```go
+lowmemjson.NewEncoder(&lowmemjson.ReEncoder{
+	Out: out,
+	// settings here
+}).Encode(val)
+```
+
+### Callee API
+
+For defining Go types with custom JSON representations, `lowmemjson`
+respects all of the `json:` struct field tags of `encoding/json`, as
+well as respecting the same "marshaler" and "unmarshaler" interfaces
+as `encoding/json`. In addition to those interfaces, `lowmemjson`
+adds two of its own interfaces, and some helper functions to help with
+implementing those interfaces:
+
+ 1. `type Decodable`
+    + `func DecodeArray`
+    + `func DecodeObject`
+ 2. `type Encodable`
+
+These are streaming variants of the standard `json.Unmarshaler` and
+`json.Marshaler` interfaces.
+
+<!-- packages -->
+[`lowmemjson`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson
+[`lowmemjson/compat/json`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson/compat/json
+[`encoding/json`]: https://pkg.go.dev/encoding/json@go1.18
+
+<!-- encoding/json symbols -->
+[`json.Marshaler`]: https://pkg.go.dev/encoding/json@go1.18#Marshaler
+[`json.Unmarshaler`]: https://pkg.go.dev/encoding/json@go1.18#Unmarshaler
+[`json.Marshal`]: https://pkg.go.dev/encoding/json@go1.18#Marshal
+[`json.Unmarshal`]: https://pkg.go.dev/encoding/json@go1.18#Unmarshal
+[`json.Compact`]: https://pkg.go.dev/encoding/json@go1.18#Compact
+[`json.HTMLEscape`]: https://pkg.go.dev/encoding/json@go1.18#HTMLEscape
+[`json.Indent`]: https://pkg.go.dev/encoding/json@go1.18#Indent
+[`json.Valid`]: https://pkg.go.dev/encoding/json@go1.18#Valid
+
+<!-- lowmemjson symbols -->
+[`lowmemjson.Encodable`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#Encodable
+[`lowmemjson.Decodable`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#Decodable
+[`lowmemjson.NewEncoder`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#NewEncoder
+[`lowmemjson.Encoder.Encode`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#Encoder.Encode
+[`lowmemjson.NewDecoder`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#NewDecoder
+[`lowmemjson.Decoder.DecodeThenEOF`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#Decoder.DecodeThenEOF
+[`lowmemjson.ReEncoder`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#ReEncoder
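Reviewer's note on the README's callee-API claim: it says that types already written for `encoding/json` (`json:` struct tags plus the `json.Marshaler` and `json.Unmarshaler` interfaces) need no changes to migrate. The following is a minimal sketch of such a type, exercised only with the standard library so that it runs without the `lowmemjson` module. `Event`, `Level`, and `roundTrip` are hypothetical names made up for illustration and are not part of either package.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Level is a hypothetical type with a custom JSON form (a string).
type Level int

const (
	Low Level = iota
	High
)

// MarshalJSON implements json.Marshaler.
func (l Level) MarshalJSON() ([]byte, error) {
	switch l {
	case Low:
		return []byte(`"low"`), nil
	case High:
		return []byte(`"high"`), nil
	}
	return nil, fmt.Errorf("unknown level %d", int(l))
}

// UnmarshalJSON implements json.Unmarshaler.
func (l *Level) UnmarshalJSON(dat []byte) error {
	var s string
	if err := json.Unmarshal(dat, &s); err != nil {
		return err
	}
	switch s {
	case "low":
		*l = Low
	case "high":
		*l = High
	default:
		return fmt.Errorf("unknown level %q", s)
	}
	return nil
}

// Event is a hypothetical struct that relies only on `json:` tags.
type Event struct {
	Name  string   `json:"name"`
	Level Level    `json:"level"`
	Tags  []string `json:"tags,omitempty"`
}

// roundTrip encodes and then decodes an Event with encoding/json.
// Per the README, the streaming equivalents would be
// lowmemjson.NewEncoder(w).Encode(val) and
// lowmemjson.NewDecoder(r).DecodeThenEOF(&val); they are not called
// here so that the sketch stays standard-library only.
func roundTrip(in Event) (string, Event, error) {
	dat, err := json.Marshal(in)
	if err != nil {
		return "", Event{}, err
	}
	var out Event
	if err := json.Unmarshal(dat, &out); err != nil {
		return "", Event{}, err
	}
	return string(dat), out, nil
}

func main() {
	enc, out, err := roundTrip(Event{Name: "deploy", Level: High})
	if err != nil {
		panic(err)
	}
	fmt.Println(enc) // {"name":"deploy","level":"high"}
	fmt.Println(out.Name, out.Level == High)
}
```

If the README's claim holds, swapping the two `encoding/json` calls in `roundTrip` for the `lowmemjson` calls named in the comment should not require touching `Event` or `Level`. The sketch does not verify that claim.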