# Sonic A blazingly fast JSON serializing & deserializing library, accelerated by JIT (just-in-time compiling) and SIMD (single-instruction-multiple-data). ## Requirement - Go 1.15/1.16/1.17 - Linux/darwin OS - Amd64 CPU with AVX instruction set ## Features - Runtime object binding without code generation - Complete APIs for JSON value manipulation - Fast, fast, fast! ## Benchmarks For **all sizes** of json and **all scenarios** of usage, **Sonic performs best**. - [Medium](https://github.com/bytedance/sonic/blob/main/decoder/testdata_test.go#L19) (13KB, 300+ key, 6 layers) ```powershell goversion: 1.17.1 goos: darwin goarch: amd64 cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz BenchmarkEncoder_Generic_Sonic 25181 ns/op 517.65 MB/s 13035 B/op 4 allocs/op BenchmarkEncoder_Generic_JsonIter 43765 ns/op 297.84 MB/s 13433 B/op 77 allocs/op BenchmarkEncoder_Generic_StdLib 108776 ns/op 119.83 MB/s 49137 B/op 827 allocs/op BenchmarkEncoder_Binding_Sonic 6282 ns/op 2075.01 MB/s 13765 B/op 4 allocs/op BenchmarkEncoder_Binding_JsonIter 20740 ns/op 628.51 MB/s 9487 B/op 2 allocs/op BenchmarkEncoder_Binding_StdLib 16661 ns/op 782.34 MB/s 9479 B/op 1 allocs/op BenchmarkEncoder_Parallel_Generic_Sonic-16 4072 ns/op 3200.89 MB/s 11052 B/op 4 allocs/op BenchmarkEncoder_Parallel_Generic_JsonIter-16 11379 ns/op 1145.52 MB/s 13458 B/op 77 allocs/op BenchmarkEncoder_Parallel_Generic_StdLib-16 50635 ns/op 257.43 MB/s 49183 B/op 827 allocs/op BenchmarkEncoder_Parallel_Binding_Sonic-16 1304 ns/op 9994.64 MB/s 10925 B/op 4 allocs/op BenchmarkEncoder_Parallel_Binding_JsonIter-16 6072 ns/op 2146.76 MB/s 9505 B/op 2 allocs/op BenchmarkEncoder_Parallel_Binding_StdLib-16 3510 ns/op 3713.89 MB/s 9481 B/op 1 allocs/op BenchmarkDecoder_Generic_Sonic 53843 ns/op 242.09 MB/s 49779 B/op 313 allocs/op BenchmarkDecoder_Generic_StdLib 130402 ns/op 99.96 MB/s 50868 B/op 772 allocs/op BenchmarkDecoder_Generic_JsonIter 92810 ns/op 140.45 MB/s 55788 B/op 1068 allocs/op BenchmarkDecoder_Binding_Sonic 29793 ns/op 437.52 MB/s 24778 B/op 34 allocs/op BenchmarkDecoder_Binding_StdLib 121206 ns/op 107.54 MB/s 10576 B/op 208 allocs/op BenchmarkDecoder_Binding_JsonIter 36099 ns/op 361.09 MB/s 14674 B/op 385 allocs/op BenchmarkDecoder_Parallel_Generic_Sonic-16 10319 ns/op 1263.21 MB/s 49423 B/op 313 allocs/op BenchmarkDecoder_Parallel_Generic_StdLib-16 58526 ns/op 222.72 MB/s 50875 B/op 772 allocs/op BenchmarkDecoder_Parallel_Generic_JsonIter-16 60156 ns/op 216.69 MB/s 55812 B/op 1068 allocs/op BenchmarkDecoder_Parallel_Binding_Sonic-16 7265 ns/op 1794.18 MB/s 24952 B/op 34 allocs/op BenchmarkDecoder_Parallel_Binding_StdLib-16 44000 ns/op 296.25 MB/s 10575 B/op 208 allocs/op BenchmarkDecoder_Parallel_Binding_JsonIter-16 21029 ns/op 619.86 MB/s 14678 B/op 385 allocs/op BenchmarkGetOne_Sonic 17070 ns/op 762.94 MB/s 29 B/op 1 allocs/op BenchmarkGetOne_Gjson 19714 ns/op 660.59 MB/s 0 B/op 0 allocs/op BenchmarkGetOne_Jsoniter 99281 ns/op 131.17 MB/s 27936 B/op 647 allocs/op BenchmarkSetOne_Sonic 23730 ns/op 548.80 MB/s 1883 B/op 17 allocs/op BenchmarkSetOne_Sjson 57680 ns/op 225.78 MB/s 52180 B/op 9 allocs/op BenchmarkSetOne_Jsoniter 104018 ns/op 125.20 MB/s 45859 B/op 964 allocs/op BenchmarkGetOne_Parallel_Sonic-16 2010 ns/op 6479.41 MB/s 114 B/op 1 allocs/op BenchmarkGetOne_Parallel_Gjson-16 1815 ns/op 7176.39 MB/s 0 B/op 0 allocs/op BenchmarkGetOne_Parallel_Jsoniter-16 23261 ns/op 559.86 MB/s 27942 B/op 647 allocs/op BenchmarkSetOne_Parallel_Sonic-16 2007 ns/op 6487.78 MB/s 2202 B/op 17 allocs/op BenchmarkSetOne_Parallel_Sjson-16 12422 ns/op 1048.40 MB/s 52180 B/op 9 allocs/op BenchmarkSetOne_Parallel_Jsoniter-16 39204 ns/op 332.18 MB/s 45889 B/op 964 allocs/op ``` - [Small](https://github.com/bytedance/sonic/blob/main/testdata/small.go) (400B, 11 keys, 3 layers) ![small benchmarks](bench-small.jpg) - [Large](https://github.com/bytedance/sonic/blob/main/testdata/twitter.json) (635KB, 10000+ key, 6 layers) ![large benchmarks](bench-large.jpg) See [bench.sh](https://github.com/bytedance/sonic/blob/main/bench.sh) for benchmark codes. ## How it works See [INTRODUCTION.md](INTRODUCTION.md). ## Usage ### Marshal/Unmarshal Default behaviors are mostly consistent with `encoding/json`, except HTML escaping form (see [Escape HTML](https://github.com/bytedance/sonic/blob/main/README.md#escape-html)) and `SortKeys` feature (optional support see [Sort Keys](https://github.com/bytedance/sonic/blob/main/README.md#sort-keys)) that is **NOT** in conformity to [RFC8259](https://datatracker.ietf.org/doc/html/rfc8259). ```go import "github.com/bytedance/sonic" var data YourSchema // Marshal output, err := sonic.Marshal(&data) // Unmarshal err := sonic.Unmarshal(output, &data) ``` ### Use Number/Use Int64 ```go import "github.com/bytedance/sonic/decoder" var input = `1` var data interface{} // default float64 dc := decoder.NewDecoder(input) dc.Decode(&data) // data == float64(1) // use json.Number dc = decoder.NewDecoder(input) dc.UseNumber() dc.Decode(&data) // data == json.Number("1") // use int64 dc = decoder.NewDecoder(input) dc.UseInt64() dc.Decode(&data) // data == int64(1) root, err := sonic.GetFromString(input) // Get json.Number jn := root.Number() jm := root.InterfaceUseNumber().(json.Number) // jn == jm // Get float64 fn := root.Float64() fm := root.Interface().(float64) // jn == jm ``` ### Sort Keys On account of the performance loss from sorting (roughly 10%), sonic doesn't enable this feature by default. If your component depends on it to work (like [zstd](https://github.com/facebook/zstd)), Use it like this: ```go import "github.com/bytedance/sonic" import "github.com/bytedance/sonic/encoder" // Binding map only m := map[string]interface{}{} v, err := encoder.Encode(m, encoder.SortMapKeys) // Or ast.Node.SortKeys() before marshal var root := sonic.Get(JSON) err := root.SortKeys() ``` ### Escape HTML On account of the performance loss (roughly 15%), sonic doesn't enable this feature by default. You can use `encoder.EscapeHTML` option to open this feature (align with `encoding/json.HTMLEscape`). ```go import "github.com/bytedance/sonic" v := map[string]string{"&&":{"<>"}} ret, err := Encode(v, EscapeHTML) // ret == `{"\u0026\u0026":{"X":"\u003c\u003e"}}` ``` ### Print Syntax Error ```go import "github.com/bytedance/sonic" import "github.com/bytedance/sonic/decoder" var data interface{} err := sonic.Unmarshal("[[[}]]", &data) if err != nil { /*one line by default*/ println(e.Error())) // "Syntax error at index 3: invalid char\n\n\t[[[}]]\n\t...^..\n" /*pretty print*/ if e, ok := err.(decoder.SyntaxError); ok { /*Syntax error at index 3: invalid char [[[}]] ...^.. */ print(e.Description()) } } ``` ### Ast.Node Sonic/ast.Node is a completely self-contained AST for JSON. It implements serialization and deserialization both, and provides robust APIs for obtaining and modification of generic data. #### Get/Index Search partial JSON by given paths, which must be non-negative integer or string or nil ```go import "github.com/bytedance/sonic" input := []byte(`{"key1":[{},{"key2":{"key3":[1,2,3]}}]}`) // no path, returns entire json root, err := sonic.Get(input) raw := root.Raw() // == string(input) // multiple pathes root, err := sonic.Get(input, "key1", 1, "key2") sub := root.Get("key3").Index(2).Int64() // == 3 ``` **Tip**: since `Index()` uses offset to locate data, which is faster much than scanning like `Get()`, we suggest you use it as much as possible. And sonic also provides another API `IndexOrGet()` to underlying use offset as well as ensuring the key is matched. #### Set/Unset Modify the json content by Set()/Unset() ```go import "github.com/bytedance/sonic" // Set exist, err := root.Set("key4", NewBool(true)) // exist == false alias1 := root.Get("key4") println(alias1.Valid()) // true alias2 := root.Index(1) println(alias1 == alias2) // true // Unset exist, err := root.UnsetByIndex(1) // exist == true println(root.Get("key4").Check()) // "value not exist" ``` #### Serialize To encode `ast.Node` as json, use `MarshalJson()` or `json.Marshal()` (MUST pass the node's pointer) ```go import ( "encoding/json" "github.com/bytedance/sonic" ) buf, err := root.MarshalJson() println(string(buf)) // {"key1":[{},{"key2":{"key3":[1,2,3]}}]} exp, err := json.Marshal(&root) // WARN: use pointer println(string(buf) == string(exp)) // true ``` #### APIs - validation: `Check()`, `Error()`, `Valid()`, `Exist()` - searching: `Index()`, `Get()`, `IndexPair()`, `IndexOrGet()`, `GetByPath()` - go-type casting: `Int64()`, `Float64()`, `String()`, `Number()`, `Bool()`, `Map[UseNumber|UseNode]()`, `Array[UseNumber|UseNode]()`, `Interface[UseNumber|UseNode]()` - go-type packing: `NewRaw()`, `NewNumber()`, `NewNull()`, `NewBool()`, `NewString()`, `NewObject()`, `NewArray()` - iteration: `Values()`, `Properties()`, `ForEach()`, `SortKeys()` - modification: `Set()`, `SetByIndex()`, `Add()` ## Tips ### Pretouch Since Sonic uses [golang-asm](https://github.com/twitchyliquid64/golang-asm) as a JIT assembler, which is NOT very suitable for runtime compiling, first-hit running of a huge schema may cause request-timeout or even process-OOM. For better stability, we advise to **use `Pretouch()` for huge-schema or compact-memory application** before `Marshal()/Unmarshal()`. ```go import ( "reflect" "github.com/bytedance/sonic" "github.com/bytedance/sonic/option" ) func init() { var v HugeStruct // For most large types (nesting depth <= 5) err := sonic.Pretouch(reflect.TypeOf(v)) // If the type is too deep nesting (nesting depth > 5), // you can set compile recursive depth in Pretouch for better stability in JIT. err := sonic.Pretouch(reflect.TypeOf(v), option.WithCompileRecursiveDepth(depth)) ``` ### Accelerate `json.RawMessage\json.Marshaler\encoding.TextMarshaler` To ensure data security, sonic.Encoder validates and escapes JSON values from these interfaces by default, which may degrades performance much if the most of your data is in form of them. We provide two options `encoder.NoCompactMarshaler` (for `json.RawMessage\json.Marshaler`) and `encoder.NoQuoteTextMarshaler` (for `encoding.TextMarshaler`) to avoid validating and escaping operations, which means you **MUST** ensure the validity of JSON values from these interfaces by your own. ### Pass string or []byte? For alignment to `encoding/json`, we provide API to pass `[]byte` as an argument, but the string-to-bytes copy is conducted at the same time considering safety, which may lose performance when origin JSON is huge. Therefore, you can use `UnmarshalString` and `GetFromString` to pass a string, as long as your origin data is a string or **nocopy-cast** is safe for your []byte. ### Better performance for generic data In **fully-parsed** scenario, `Unmarshal()` performs better than `Get()`+`Node.Interface()`. But if you only have a part of schema for specific json, you can combine `Get()` and `Unmarshal()` together: ```go import "github.com/bytedance/sonic" node, err := sonic.GetFromString(_TwitterJson, "statuses", 3, "user") var user User // your partial schema... err = sonic.UnmarshalString(node.Raw(), &user) ``` Even if you don't have any schema, use `ast.Node` as the container of generic values instead of `map` or `interface`: ```go import "github.com/bytedance/sonic" root, err := sonic.GetFromString(_TwitterJson) user := root.GetByPath("statuses", 3, "user") // === root.Get("status").Index(3).Get("user") err = user.Check() // err = user.LoadAll() // only call this when you want to use 'user' concurrently... go someFunc(user) ``` Why? Because `ast.Node` stores its children using `array`: - `Array`'s performance is **much better** than `Map` when Inserting (Deserialize) and Scanning (Serialize) data; - **Hashing** (`map[x]`) is not as efficient as **Indexing** (`array[x]`), which `ast.Node` can conduct on **both array and object**. - Using `Interface()`/`Map()` means Sonic must parse all the underlying values, while `ast.Node` can parse them **on demand**. **CAUTION:** `ast.Node` **DOESN'T** ensure concurrent security directly, due to its **lazy-load** design. However, your can call `Node.Load()`/`Node.LoadAll()` to achieve that, which may bring performance reduction while it still works faster than converting to `map` or `interface{}`