author Luke Shumaker <lukeshu@lukeshu.com> 2023-01-25 21:05:17 -0700
committer Luke Shumaker <lukeshu@lukeshu.com> 2023-01-26 00:45:27 -0700
commit ffee5c8516f3f55f82ed5bb8f0a4f340d485fa92
tree 0c10526b1ea57b043230402e9378b341c6966965 /README.md
parent 4148776399cb7ea5e10c74dc465e4e1e682cb399
Write documentation (v0.2.0)
Diffstat (limited to 'README.md')
-rw-r--r-- README.md 170
1 file changed, 170 insertions, 0 deletions
diff --git a/README.md b/README.md
new file mode 100644
index 0000000..c8e05ab
--- /dev/null
+++ b/README.md
@@ -0,0 +1,170 @@
+<!--
+Copyright (C) 2023 Luke Shumaker <lukeshu@lukeshu.com>
+
+SPDX-License-Identifier: GPL-2.0-or-later
+-->
+
+# lowmemjson
+
+`lowmemjson` is a mostly-compatible alternative to the standard
+library's [`encoding/json`][] that has dramatically lower memory
+requirements for large data structures.
+
+`lowmemjson` does not target extremely resource-constrained
+environments; rather, it aims to stream gigabytes of JSON efficiently
+without requiring gigabytes of memory overhead.
+
+## Compatibility
+
+`encoding/json`'s APIs are designed around the idea that it can buffer
+the entire JSON document as a `[]byte`, and as intermediate steps it
+may have a fragment buffered multiple times while encoding; encoding a
+gigabyte of data may consume several gigabytes of memory. In
+contrast, `lowmemjson`'s APIs are designed around streaming
+(`io.Writer` and `io.RuneScanner`), trying to have the memory overhead
+of encode and decode operations be as close to O(1) as possible.
+
+`lowmemjson` offers a high level of compatibility with the
+`encoding/json` APIs, but for best memory usage (avoiding storing
+large byte arrays inherent in `encoding/json`'s API), it is
+recommended to migrate to `lowmemjson`'s own APIs.
+
+### Callee API (objects to be encoded-to/decoded-from JSON)
+
+`lowmemjson` supports `encoding/json`'s `json:` struct field tags, as
+well as the `encoding/json.Marshaler` and `encoding/json.Unmarshaler`
+interfaces; you do not need to adjust your types to successfully
+migrate from `encoding/json` to `lowmemjson`.
+
+That is: Given types that decode as desired with `encoding/json`,
+those types should decode identically with `lowmemjson`. Given types
+that encode as desired with `encoding/json`, those types should encode
+identically with `lowmemjson` (assuming an appropriately configured
+`ReEncoder` to match the whitespace-handling and special-character
+escaping; a `ReEncoder` with `Compact=true` and all other settings
+left as zero will match the behavior of `json.Marshal`).
+
+For better memory usage:
+ - Instead of implementing [`json.Marshaler`][], consider implementing
+ [`lowmemjson.Encodable`][] (or implementing both).
+ - Instead of implementing [`json.Unmarshaler`][], consider
+ implementing [`lowmemjson.Decodable`][] (or implementing both).
+
+### Caller API
+
+`lowmemjson` offers a [`lowmemjson/compat/json`][] package that is a
+(mostly) drop-in replacement for `encoding/json` (see the package's
+documentation for the small incompatibilities).
+
+For better memory usage, avoid using `lowmemjson/compat/json` and
+instead use `lowmemjson` directly:
+ - Instead of using <code>[json.Marshal][`json.Marshal`](val)</code>,
+ consider using
+ <code>[lowmemjson.NewEncoder][`lowmemjson.NewEncoder`](w).[Encode][`lowmemjson.Encoder.Encode`](val)</code>.
+ - Instead of using
+ <code>[json.Unmarshal][`json.Unmarshal`](dat, &val)</code>, consider
+ using
+ <code>[lowmemjson.NewDecoder][`lowmemjson.NewDecoder`](r).[DecodeThenEOF][`lowmemjson.Decoder.DecodeThenEOF`](&val)</code>.
+ - Instead of using [`json.Compact`][], [`json.HTMLEscape`][], or
+ [`json.Indent`][], consider using a [`lowmemjson.ReEncoder`][].
+ - Instead of using [`json.Valid`][], consider using a
+ [`lowmemjson.ReEncoder`][] with `io.Discard` as the output.
+
+The error types returned from `lowmemjson` are different from the
+error types returned by `encoding/json`, but `lowmemjson/compat/json`
+translates them back to the types returned by `encoding/json`.
+
+## Overview
+
+### Caller API
+
+There are 3 main types that make up the caller API for producing and
+handling streams of JSON, and each of them has a few associated
+types:
+
+ 1. `type Decoder`
+     + `type DecodeArgumentError`
+     + `type DecodeError`
+       * `type DecodeReadError`
+       * `type DecodeSyntaxError`
+       * `type DecodeTypeError`
+
+ 2. `type Encoder`
+     + `type EncodeTypeError`
+     + `type EncodeValueError`
+     + `type EncodeMethodError`
+
+ 3. `type ReEncoder`
+     + `type ReEncodeSyntaxError`
+     + `type BackslashEscaper`
+       * `type BackslashEscapeMode`
+
+A `*Decoder` handles decoding a JSON stream into Go values; the most
+common use of it will be
+`lowmemjson.NewDecoder(r).DecodeThenEOF(&val)` or
+`lowmemjson.NewDecoder(bufio.NewReader(r)).DecodeThenEOF(&val)`.
+
+A `*ReEncoder` handles transforming a JSON stream; this is useful for
+prettifying, minifying, sanitizing, and/or validating JSON. A
+`*ReEncoder` wraps an `io.Writer`, itself implementing `io.Writer`.
+The most common use of it will be something along the lines of
+
+```go
+out = &lowmemjson.ReEncoder{
+	Out: out,
+	// settings here
+}
+```
+
+An `*Encoder` handles encoding Go values into a JSON stream.
+`*Encoder` doesn't take much care to make its output nice, so it is
+usually desirable to have the output stream of an `*Encoder` be a
+`*ReEncoder`; the most common use of it will be
+
+```go
+lowmemjson.NewEncoder(&lowmemjson.ReEncoder{
+	Out: out,
+	// settings here
+}).Encode(val)
+```
+
+### Callee API
+
+For defining Go types with custom JSON representations, `lowmemjson`
+respects all of the `json:` struct field tags of `encoding/json`, as
+well as the same "marshaler" and "unmarshaler" interfaces as
+`encoding/json`. In addition to those interfaces, `lowmemjson` adds
+two interfaces of its own, along with some helper functions for
+implementing them:
+
+ 1. `type Decodable`
+     + `func DecodeArray`
+     + `func DecodeObject`
+ 2. `type Encodable`
+
+These are streaming variants of the standard `json.Unmarshaler` and
+`json.Marshaler` interfaces.
+
+<!-- packages -->
+[`lowmemjson`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson
+[`lowmemjson/compat/json`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson/compat/json
+[`encoding/json`]: https://pkg.go.dev/encoding/json@go1.18
+
+<!-- encoding/json symbols -->
+[`json.Marshaler`]: https://pkg.go.dev/encoding/json@go1.18#Marshaler
+[`json.Unmarshaler`]: https://pkg.go.dev/encoding/json@go1.18#Unmarshaler
+[`json.Marshal`]: https://pkg.go.dev/encoding/json@go1.18#Marshal
+[`json.Unmarshal`]: https://pkg.go.dev/encoding/json@go1.18#Unmarshal
+[`json.Compact`]: https://pkg.go.dev/encoding/json@go1.18#Compact
+[`json.HTMLEscape`]: https://pkg.go.dev/encoding/json@go1.18#HTMLEscape
+[`json.Indent`]: https://pkg.go.dev/encoding/json@go1.18#Indent
+[`json.Valid`]: https://pkg.go.dev/encoding/json@go1.18#Valid
+
+<!-- lowmemjson symbols -->
+[`lowmemjson.Encodable`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#Encodable
+[`lowmemjson.Decodable`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#Decodable
+[`lowmemjson.NewEncoder`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#NewEncoder
+[`lowmemjson.Encoder.Encode`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#Encoder.Encode
+[`lowmemjson.NewDecoder`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#NewDecoder
+[`lowmemjson.Decoder.DecodeThenEOF`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#Decoder.DecodeThenEOF
+[`lowmemjson.ReEncoder`]: https://pkg.go.dev/git.lukeshu.com/go/lowmemjson#ReEncoder
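
Note on the "Callee API" compatibility claim in the README above: the sketch below shows the kind of type that claim covers, one that relies on `json:` struct field tags plus the `json.Marshaler` and `json.Unmarshaler` interfaces. It uses only the standard library's `encoding/json`; `Record` and `Level` are hypothetical names made up for illustration and do not come from `lowmemjson`. According to the README, a type like this should encode and decode identically under `lowmemjson` without modification, but that is not exercised here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Level is a hypothetical enum, stored as an int but written as a JSON string.
type Level int

const (
	Low Level = iota
	High
)

// MarshalJSON implements json.Marshaler. Note that the interface forces the
// whole encoded value to be returned as one []byte.
func (l Level) MarshalJSON() ([]byte, error) {
	switch l {
	case Low:
		return []byte(`"low"`), nil
	case High:
		return []byte(`"high"`), nil
	}
	return nil, fmt.Errorf("unknown level %d", int(l))
}

// UnmarshalJSON implements json.Unmarshaler. The whole encoded value arrives
// as one []byte.
func (l *Level) UnmarshalJSON(dat []byte) error {
	var s string
	if err := json.Unmarshal(dat, &s); err != nil {
		return err
	}
	switch s {
	case "low":
		*l = Low
	case "high":
		*l = High
	default:
		return fmt.Errorf("unknown level %q", s)
	}
	return nil
}

// Record is a hypothetical type that relies on `json:` struct field tags.
type Record struct {
	ID    int    `json:"id"`
	Name  string `json:"name,omitempty"`
	Level Level  `json:"level"`
}

// encode returns the JSON text for rec, as produced by json.Marshal.
func encode(rec Record) (string, error) {
	dat, err := json.Marshal(rec)
	return string(dat), err
}

// decode parses JSON text into a Record using json.Unmarshal.
func decode(text string) (Record, error) {
	var rec Record
	err := json.Unmarshal([]byte(text), &rec)
	return rec, err
}

func main() {
	text, err := encode(Record{ID: 7, Level: High})
	if err != nil {
		panic(err)
	}
	fmt.Println(text)

	rec, err := decode(text)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", rec)
}
```

Both `MarshalJSON` and `UnmarshalJSON` pass a complete value around as a `[]byte`. That is harmless for a small enum like this one, and it is the buffering that the README suggests avoiding for large values by implementing `lowmemjson.Encodable` and `lowmemjson.Decodable` instead.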