mirror of https://github.com/ii64/sonic.git synced 2026-06-20 16:45:22 +08:00

refactor: make it more readable (#104)

Co-authored-by: duanyi.aster <duanyi.aster@bytedance.com>
This commit is contained in:
Yi Duan 2021-09-18 11:31:25 +08:00 committed by GitHub
parent 0e4b0b8ee1
commit a577eafc25
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
28 changed files with 191 additions and 112 deletions

48
INTRODUCTION.md Normal file

@ -0,0 +1,48 @@
# Introduction to Sonic
## Background
According to overall profiling of production services at ByteDance, the overhead of JSON serialization and deserialization is unexpectedly high: it totals close to 10% of CPU, and in the most extreme service it accounts for more than 40% of CPU. Therefore, **the performance of the JSON library is a key factor in improving machine utilization**.
## Research
We conducted a series of surveys and benchmarks on open-source JSON libraries for Golang, but the result was disappointing: **no silver bullet**. First of all, no single library ranks in the top three across all business scenarios. Even the most widely used [json-iterator](https://github.com/json-iterator/go) degrades severely in generic (no-schema) or large-volume JSON serialization and deserialization. Secondly, compared with JSON libraries written in other languages, they are generally much slower. For example, [Simdjson-go](https://github.com/minio/simdjson-go) shows a 50% reduction in decoding performance compared to [simdjson](https://github.com/simdjson/simdjson). What's more, we found hardly any JSON libraries that provide an API to modify the underlying values.
Therefore, we decided to **develop a brand-new JSON library with high performance as well as wide applicability**.
## Thinking
Before starting our design, we need to figure out some questions:
### Why is Json-iterator faster than Standard Library?
First of all, the **schema-based processing mechanism** used by the standard library is commendable: the parser can obtain meta information in advance when scanning, thereby shortening the time spent on branch selection. However, its original implementation did not make good use of this mechanism; instead, **it spent a lot of time on reflection to obtain the meta info of the schema**. Meanwhile, json-iterator's approach is to interpret a structure as field-by-field encoding and decoding functions, then assemble and cache them, minimizing the performance loss caused by reflection. But does it work once and for all? No. In practical tests, we found that **the deeper and larger the input JSON got, the smaller the gap between json-iterator and other libraries became**, until json-iterator was eventually even surpassed:
![Scalability](introduction-1.png)
The reason is that **this implementation turns into a large number of interface encapsulations and function calls**, which bring function-call overhead:
1. **Calling an interface method involves dynamic addressing of the itab**
2. **Assembly functions cannot be inlined**, while Golang's function-call performance is poor (no parameter passing by register)
#### Is there a way to avoid the function-call overhead of dynamic assembly?
The first thing we thought about was code generation like [easyjson](https://github.com/mailru/easyjson). But it comes with **schema dependency and a loss of convenience**. To achieve a real drop-in replacement of the standard library, we turned to another technology: **[JIT](https://en.wikipedia.org/wiki/Jit) (just-in-time compiling)**. The compiled codec is a single integrated function, which greatly reduces function calls while preserving flexibility.
### Why is Simdjson-go not fast enough?
[SIMD](https://en.wikipedia.org/wiki/SIMD) (Single Instruction, Multiple Data) is a special set of CPU instructions for the parallel processing of vectorized data. At present, it is supported by most CPUs and widely used in image processing and big data computing. Undoubtedly, SIMD is useful in JSON processing (itoa, char-search and so on are all suitable scenarios). We can see that simdjson-go is very competitive in large JSON scenarios (>100KB). However, for extremely small or irregular strings, **the extra load operation required by SIMD leads to performance degradation**. Therefore, we need to pay attention to branch prediction and decide which scenarios should use SIMD and which should not (for example, when the string length is less than 16 bytes).
The second problem comes from the Go compiler itself. In order to ensure the compilation speed, **Golang does very little optimization work during the compilation phase** and cannot directly use compiler backends such as [LLVM](https://en.wikipedia.org/wiki/LLVM) (Low-Level Virtual Machine) for optimization.
So, **can some crucial calculation functions be written in another language with higher execution efficiency**?
C/Clang is an ideal compilation tool (it integrates LLVM internally). But the key is how to embed the optimized assembly into Golang.
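The length-based choice between SIMD and scalar code mentioned above can be sketched in Go as follows. This is only an illustration, not Sonic's implementation: `indexByteAdaptive`, `indexByteScalar` and `simdThreshold` are hypothetical names, the 16-byte cutoff is borrowed from the example in the text, and `bytes.IndexByte` stands in for a vectorized routine (on amd64 the Go standard library implements it in assembly using vector instructions).

```go
package main

import (
	"bytes"
	"fmt"
)

// simdThreshold is an assumed cutoff taken from the "less than 16 bytes"
// example in the text; a real library would tune it by benchmarking.
const simdThreshold = 16

// indexByteScalar scans byte by byte, which avoids vector setup cost
// on very short inputs.
func indexByteScalar(s []byte, c byte) int {
	for i, b := range s {
		if b == c {
			return i
		}
	}
	return -1
}

// indexByteAdaptive picks the scalar loop for short inputs and the
// vectorized standard-library routine for longer ones.
func indexByteAdaptive(s []byte, c byte) int {
	if len(s) < simdThreshold {
		return indexByteScalar(s, c)
	}
	return bytes.IndexByte(s, c)
}

func main() {
	short := []byte(`ab"`)
	long := []byte("01234567890123456789\"")
	fmt.Println(indexByteAdaptive(short, '"')) // 2
	fmt.Println(indexByteAdaptive(long, '"'))  // 20
}
```

Both paths return the same result; only the cost profile differs, which is why the decision can be made by a cheap length check up front.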
### How to use Gjson well?
We also found that [gjson](https://github.com/tidwall/gjson) has a huge advantage in single-key lookup scenarios. This is because its lookup is implemented with a **lazy-load mechanism**, which subtly skips the values it passes by and avoids a lot of unnecessary parsing. Practical application has proved that making good use of this feature in production can indeed bring benefits. But when it comes to multi-key lookup, gjson does even worse than the standard library, which is a side effect of its skipping mechanism: **searching for the same path leads to repeated parsing** (skipping is also a lightweight form of parsing). Therefore, accurately adapting to practical scenarios is the key.
## Design
Based on the above questions, our design is easy to implement:
1. To address the function-call overhead caused by dynamically assembled codecs, **`JIT` tech is used to assemble opcodes (asm) corresponding to the schema at runtime**, which are finally cached in off-heap memory in the form of Golang functions.
2. For practical scenarios where big data and small data coexist, we **use pre-conditional judgment** (string size, floating precision, etc.) **to combine `SIMD` with scalar instructions** to achieve the best adaptation.
3. To compensate for the limited compile-time optimization of the Go language, we decided to **use `C/Clang` to write and compile core computational functions**, and **developed a set of [asm2asm](https://github.com/chenzhuoyu/asm2asm) tools to translate the fully optimized x86 assembly into plan9** and finally load it into the Golang runtime.
4. Given the big speed gap between parsing and skipping, the **`lazy-load` mechanism** is certainly used in our AST parser, but in **a more adaptive and efficient way to reduce the overhead of multiple-key queries**.
![design](introduction-2.png)
In detail, we conducted some further optimization:
1. Since native-asm functions cannot be inlined in Golang, we found that their call cost even exceeded the improvement brought by the optimization of the C compiler. So we reimplemented a set of lightweight function calls in JIT:
- `Global-function-table + static offset` for calling instruction
- **Pass parameters using registers**
2. `sync.Map` was used to cache the codecs at first, but for our **quasi-static** (reads far outnumber writes), **few-element** (usually no more than a few dozen) scenarios, its performance is not optimal, so we reimplemented a high-performance and concurrent-safe cache with `open-addressing-hash + RCU` tech.
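The read-copy-update idea behind such a cache can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not Sonic's cache: `codecCache` and its methods are hypothetical names, the stored value is a plain string instead of a compiled codec, and a built-in Go map replaces the open-addressing hash table mentioned above.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// codecCache is a read-mostly cache. Readers load an immutable snapshot
// without taking a lock; writers copy the snapshot, add one entry and
// publish the copy atomically.
type codecCache struct {
	mu   sync.Mutex   // serializes writers only
	snap atomic.Value // holds a map[string]string that is never mutated after Store
}

func newCodecCache() *codecCache {
	c := &codecCache{}
	c.snap.Store(map[string]string{})
	return c
}

// Get is lock-free: it only reads the current snapshot.
func (c *codecCache) Get(key string) (string, bool) {
	m := c.snap.Load().(map[string]string)
	v, ok := m[key]
	return v, ok
}

// Put copies the current snapshot, so its cost grows with the number of
// entries. That is acceptable when writes are rare and entries are few.
func (c *codecCache) Put(key, val string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	old := c.snap.Load().(map[string]string)
	next := make(map[string]string, len(old)+1)
	for k, v := range old {
		next[k] = v
	}
	next[key] = val
	c.snap.Store(next)
}

func main() {
	cache := newCodecCache()
	cache.Put("main.User", "compiled codec for User")
	v, ok := cache.Get("main.User")
	fmt.Println(v, ok)
}
```

The trade-off matches the quasi-static workload described in the text: each write pays for a full copy, while every read is a single atomic load followed by a lookup.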

167
README.md

@ -2,89 +2,91 @@
A blazingly fast JSON serializing &amp; deserializing library, accelerated by JIT (just-in-time compiling) and SIMD (single-instruction-multiple-data).
**WARNING: This is still in alpha stage, use with care !**
## Requirement
- Go 1.15/1.16
- Linux/darwin OS
- Amd64 CPU with AVX instruction set
## Features
- Runtime object binding without code generation
- Complete APIs for JSON value manipulation
- Fast, fast, fast!
## Benchmarks
For **all sizes** of json and **all cases** of usage, **Sonic performs best**.
- [Small](https://github.com/bytedance/sonic/blob/main/testdata/small.go) (400B, 11 keys, 3 levels)
- [Small](https://github.com/bytedance/sonic/blob/main/testdata/small.go) (400B, 11 keys, 3 layers)
![small benchmarks](bench-small.png)
- [Large](https://github.com/bytedance/sonic/blob/main/testdata/twitter.json) (635KB, 10000+ key, 6 levels)
- [Large](https://github.com/bytedance/sonic/blob/main/testdata/twitter.json) (635KB, 10000+ key, 6 layers)
![large benchmarks](bench-large.png)
- [Medium](https://github.com/bytedance/sonic/blob/main/decoder/testdata_test.go#L19) (13KB, 300+ key, 6 levels)
- [Medium](https://github.com/bytedance/sonic/blob/main/decoder/testdata_test.go#L19) (13KB, 300+ key, 6 layers)
For a 13KB TwitterJson, Sonic is **1.6x faster** than [json-iterator](https://github.com/json-iterator/go) in `decoding`, **2.7x faster** in `encoding`**9.6x faster** in `searching`.
**For medium data, Sonic is `2.6x` as fast as [json-iterator](https://github.com/json-iterator/go) in `decoding`, `2.5x` in `encoding` and `8.3x` in `searching`.**
```powershell
goos: darwin
goarch: amd64
cpu: Intel(R) Core(TM) i9-9880H CPU @ 2.30GHz
BenchmarkEncoder_Generic_Sonic-16 100000 27844 ns/op 468.14 MB/s 14332 B/op 4 allocs/op
BenchmarkEncoder_Generic_JsonIter-16 100000 52179 ns/op 249.81 MB/s 13433 B/op 77 allocs/op
BenchmarkEncoder_Generic_GoJson-16 100000 47033 ns/op 277.15 MB/s 13129 B/op 39 allocs/op
BenchmarkEncoder_Generic_StdLib-16 100000 151394 ns/op 86.10 MB/s 48177 B/op 827 allocs/op
BenchmarkEncoder_Binding_Sonic-16 100000 7338 ns/op 1776.47 MB/s 14492 B/op 4 allocs/op
BenchmarkEncoder_Binding_JsonIter-16 100000 25365 ns/op 513.90 MB/s 9488 B/op 2 allocs/op
BenchmarkEncoder_Binding_GoJson-16 100000 10357 ns/op 1258.62 MB/s 9483 B/op 1 allocs/op
BenchmarkEncoder_Binding_StdLib-16 100000 20258 ns/op 643.44 MB/s 9480 B/op 1 allocs/op
BenchmarkEncoder_Parallel_Generic_Sonic-16 100000 5145 ns/op 2533.58 MB/s 10768 B/op 4 allocs/op
BenchmarkEncoder_Parallel_Generic_JsonIter-16 100000 11436 ns/op 1139.78 MB/s 13451 B/op 77 allocs/op
BenchmarkEncoder_Parallel_Generic_GoJson-16 100000 15274 ns/op 853.43 MB/s 13143 B/op 39 allocs/op
BenchmarkEncoder_Parallel_Generic_StdLib-16 100000 56236 ns/op 231.79 MB/s 48211 B/op 827 allocs/op
BenchmarkEncoder_Parallel_Binding_Sonic-16 100000 1821 ns/op 7159.40 MB/s 11262 B/op 4 allocs/op
BenchmarkEncoder_Parallel_Binding_JsonIter-16 100000 4559 ns/op 2859.24 MB/s 9487 B/op 2 allocs/op
BenchmarkEncoder_Parallel_Binding_GoJson-16 100000 2182 ns/op 5973.36 MB/s 9481 B/op 1 allocs/op
BenchmarkEncoder_Parallel_Binding_StdLib-16 100000 3867 ns/op 3370.58 MB/s 9477 B/op 1 allocs/op
BenchmarkEncoder_Generic_Sonic-16 100000 25911 ns/op 503.06 MB/s 13542 B/op 4 allocs/op
BenchmarkEncoder_Generic_JsonIter-16 100000 46693 ns/op 279.16 MB/s 13434 B/op 77 allocs/op
BenchmarkEncoder_Generic_StdLib-16 100000 143080 ns/op 91.10 MB/s 48177 B/op 827 allocs/op
BenchmarkEncoder_Binding_Sonic-16 100000 6851 ns/op 1902.68 MB/s 14229 B/op 4 allocs/op
BenchmarkEncoder_Binding_JsonIter-16 100000 22264 ns/op 585.49 MB/s 9488 B/op 2 allocs/op
BenchmarkEncoder_Binding_StdLib-16 100000 18685 ns/op 697.61 MB/s 9479 B/op 1 allocs/op
BenchmarkEncoder_Parallel_Generic_Sonic-16 100000 4981 ns/op 2617.14 MB/s 10747 B/op 4 allocs/op
BenchmarkEncoder_Parallel_Generic_JsonIter-16 100000 11225 ns/op 1161.24 MB/s 13447 B/op 77 allocs/op
BenchmarkEncoder_Parallel_Generic_StdLib-16 100000 55846 ns/op 233.41 MB/s 48215 B/op 827 allocs/op
BenchmarkEncoder_Parallel_Binding_Sonic-16 100000 1767 ns/op 7375.09 MB/s 11514 B/op 4 allocs/op
BenchmarkEncoder_Parallel_Binding_JsonIter-16 100000 4904 ns/op 2657.84 MB/s 9487 B/op 2 allocs/op
BenchmarkEncoder_Parallel_Binding_StdLib-16 100000 3958 ns/op 3293.18 MB/s 9477 B/op 1 allocs/op
BenchmarkDecoder_Generic_Sonic-16 100000 61092 ns/op 213.37 MB/s 49761 B/op 317 allocs/op
BenchmarkDecoder_Generic_StdLib-16 100000 158709 ns/op 82.13 MB/s 50899 B/op 772 allocs/op
BenchmarkDecoder_Generic_JsonIter-16 100000 113397 ns/op 114.95 MB/s 55789 B/op 1068 allocs/op
BenchmarkDecoder_Generic_GoJson-16 100000 108711 ns/op 119.91 MB/s 65679 B/op 944 allocs/op
BenchmarkDecoder_Binding_Sonic-16 100000 32614 ns/op 399.67 MB/s 25174 B/op 38 allocs/op
BenchmarkDecoder_Binding_StdLib-16 100000 150494 ns/op 86.61 MB/s 10560 B/op 207 allocs/op
BenchmarkDecoder_Binding_JsonIter-16 100000 43621 ns/op 298.83 MB/s 14674 B/op 385 allocs/op
BenchmarkDecoder_Binding_GoJson-16 100000 37525 ns/op 347.36 MB/s 22048 B/op 49 allocs/op
BenchmarkDecoder_Parallel_Generic_Sonic-16 100000 10581 ns/op 1231.89 MB/s 49636 B/op 317 allocs/op
BenchmarkDecoder_Parallel_Generic_StdLib-16 100000 67640 ns/op 192.71 MB/s 50909 B/op 772 allocs/op
BenchmarkDecoder_Parallel_Generic_JsonIter-16 100000 60982 ns/op 213.75 MB/s 55809 B/op 1068 allocs/op
BenchmarkDecoder_Parallel_Generic_GoJson-16 100000 51373 ns/op 253.73 MB/s 65718 B/op 945 allocs/op
BenchmarkDecoder_Parallel_Binding_Sonic-16 100000 6995 ns/op 1863.60 MB/s 24890 B/op 38 allocs/op
BenchmarkDecoder_Parallel_Binding_StdLib-16 100000 45269 ns/op 287.94 MB/s 10559 B/op 207 allocs/op
BenchmarkDecoder_Parallel_Binding_JsonIter-16 100000 18416 ns/op 707.82 MB/s 14677 B/op 385 allocs/op
BenchmarkDecoder_Parallel_Binding_GoJson-16 100000 17524 ns/op 743.85 MB/s 22132 B/op 49 allocs/op
BenchmarkDecoder_Generic_Sonic-16 100000 55680 ns/op 234.11 MB/s 49755 B/op 313 allocs/op
BenchmarkDecoder_Generic_StdLib-16 100000 144991 ns/op 89.90 MB/s 50897 B/op 772 allocs/op
BenchmarkDecoder_Generic_JsonIter-16 100000 103197 ns/op 126.31 MB/s 55786 B/op 1068 allocs/op
BenchmarkDecoder_Binding_Sonic-16 100000 28399 ns/op 458.99 MB/s 24984 B/op 34 allocs/op
BenchmarkDecoder_Binding_StdLib-16 100000 132178 ns/op 98.62 MB/s 10560 B/op 207 allocs/op
BenchmarkDecoder_Binding_JsonIter-16 100000 39963 ns/op 326.18 MB/s 14674 B/op 385 allocs/op
BenchmarkDecoder_Parallel_Generic_Sonic-16 100000 10999 ns/op 1185.11 MB/s 49658 B/op 313 allocs/op
BenchmarkDecoder_Parallel_Generic_StdLib-16 100000 67083 ns/op 194.31 MB/s 50907 B/op 772 allocs/op
BenchmarkDecoder_Parallel_Generic_JsonIter-16 100000 54292 ns/op 240.09 MB/s 55809 B/op 1068 allocs/op
BenchmarkDecoder_Parallel_Binding_Sonic-16 100000 5699 ns/op 2287.37 MB/s 24968 B/op 34 allocs/op
BenchmarkDecoder_Parallel_Binding_StdLib-16 100000 35801 ns/op 364.09 MB/s 10559 B/op 207 allocs/op
BenchmarkDecoder_Parallel_Binding_JsonIter-16 100000 13783 ns/op 945.74 MB/s 14678 B/op 385 allocs/op
BenchmarkSearchOne_Gjson-16 100000 8812 ns/op 1477.89 MB/s 0 B/op 0 allocs/op
BenchmarkSearchOne_Jsoniter-16 100000 55845 ns/op 233.20 MB/s 27936 B/op 647 allocs/op
BenchmarkSearchOne_Sonic-16 100000 10422 ns/op 1249.54 MB/s 0 B/op 0 allocs/op
BenchmarkSearchOne_Parallel_Gjson-16 100000 955.1 ns/op 13635.35 MB/s 0 B/op 0 allocs/op
BenchmarkSearchOne_Parallel_Jsoniter-16 100000 18864 ns/op 690.37 MB/s 27942 B/op 647 allocs/op
BenchmarkSearchOne_Parallel_Sonic-16 100000 1420 ns/op 9171.43 MB/s 234 B/op 0 allocs/op
BenchmarkSearchOne_Gjson-16 100000 8992 ns/op 1448.28 MB/s 0 B/op 0 allocs/op
BenchmarkSearchOne_Jsoniter-16 100000 58313 ns/op 223.33 MB/s 27936 B/op 647 allocs/op
BenchmarkSearchOne_Sonic-16 100000 10497 ns/op 1240.61 MB/s 29 B/op 1 allocs/op
BenchmarkSearchOne_Parallel_Gjson-16 100000 1046 ns/op 12449.59 MB/s 0 B/op 0 allocs/op
BenchmarkSearchOne_Parallel_Jsoniter-16 100000 16080 ns/op 809.88 MB/s 27942 B/op 647 allocs/op
BenchmarkSearchOne_Parallel_Sonic-16 100000 1435 ns/op 9074.18 MB/s 285 B/op 1 allocs/op
```
More detail see [decoder/decoder_test.go](https://github.com/bytedance/sonic/blob/main/decoder/decoder_test.go), [encoder/encoder_test.go](https://github.com/bytedance/sonic/blob/main/encoder/encoder_test.go), [ast/search_test.go](https://github.com/bytedance/sonic/blob/main/ast/search_test.go), [ast/parser_test.go](https://github.com/bytedance/sonic/blob/main/ast/parser_test.go)
More detail see [decoder/decoder_test.go](https://github.com/bytedance/sonic/blob/main/decoder/decoder_test.go), [encoder/encoder_test.go](https://github.com/bytedance/sonic/blob/main/encoder/encoder_test.go), [ast/search_test.go](https://github.com/bytedance/sonic/blob/main/ast/search_test.go), [ast/parser_test.go](https://github.com/bytedance/sonic/blob/main/ast/parser_test.go), [ast/node_test.go](https://github.com/bytedance/sonic/blob/main/ast/node_test.go)
## Requirement
- Go 1.15/1.16
- Linux/darwin OS
- Amd64 CPU with AVX/AVX2 instruction set
## How it works
See [INTRODUCTION.md](INTRODUCTION.md)
## Fuzzing
[sonic-fuzz](https://github.com/liuq19/sonic-fuzz) is the repository for fuzzing tests. If you find any bug, please report the issue to sonic.
## Usage
### Marshal/Unmarshal
The behaviors are mostly consistent with encoding/json, except some uncommon escaping and key sorting (see [issue4](https://github.com/bytedance/sonic/issues/4))
The behaviors are mostly consistent with encoding/json, except some uncommon escaping (see [issue4](https://github.com/bytedance/sonic/issues/4))
```go
import "github.com/bytedance/sonic"
var data YourSchema
// Marshal
output, err := sonic.Marshal(&data)
// Unmarshal
err := sonic.Unmarshal(input, &data)
err := sonic.Unmarshal(output, &data)
```
### Use Number/Use Int64
```go
import "github.com/bytedance/sonic/decoder"
input := `1`
var input = `1`
var data interface{}
// default float64
@ -108,6 +110,16 @@ fn := root.Float64()
fm := root.Interface().(float64) // jn == jm
```
### Sort Keys
On account of the performance loss from sorting (roughly 10%), sonic doesn't enable this feature by default. If your component depends on it to work (like [zstd](https://github.com/facebook/zstd)), use it like this:
```go
import "github.com/bytedance/sonic/encoder"
m := map[string]interface{}{}
v, err := encoder.Encode(m, encoder.SortMapKeys)
```
**Caution**: sonic encodes a struct in the order of its original field declaration, so if you want a struct's keys sorted like a map's, just reorder the fields in your struct definition.
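The difference between map-key ordering and struct-field ordering can be seen with the standard library, which always sorts map keys and emits struct fields in declaration order. The sketch below uses only `encoding/json`; the `pair` type and `encode` helper are hypothetical names for illustration, and it shows the ordering that `encoder.SortMapKeys` is meant to reproduce for maps rather than sonic's own output.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pair declares B before A on purpose, to show that struct fields keep
// their declaration order when encoded.
type pair struct {
	B int `json:"b"`
	A int `json:"a"`
}

// encode marshals v with encoding/json and returns the JSON text.
func encode(v interface{}) string {
	out, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(out)
}

func main() {
	fmt.Println(encode(map[string]int{"b": 1, "a": 2})) // {"a":2,"b":1}
	fmt.Println(encode(pair{B: 1, A: 2}))               // {"b":1,"a":2}
}
```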
### Print Syntax Error
```go
import "github.com/bytedance/sonic/decoder"
@ -134,9 +146,9 @@ if err := dc.Decode(&data); err != nil {
```
### Ast.Node
#### Get
Search partial json by given pathes, which must be non-negative integer or string or nil
Sonic/ast.Node is a completely self-contained AST for JSON. It implements serialization and deserialization both, and provides robust APIs for obtaining and modification of generic data.
#### Get/Index
Search partial JSON by given paths, which must be non-negative integer or string or nil
```go
import "github.com/bytedance/sonic"
@ -150,6 +162,7 @@ raw := root.Raw() // == string(input)
root, err := sonic.Get(input, "key1", 1, "key2")
sub := root.Get("key3").Index(2).Int64() // == 3
```
**Tip**: since `Index()` uses an offset to locate data, which is much faster than scanning like `Get()`, we suggest you use it as much as possible. Sonic also provides another API, `IndexOrGet()`, which uses the offset under the hood while also ensuring the key is matched.
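The idea behind an "index first, fall back to key lookup" call can be sketched on a toy object type. Everything here is hypothetical and unrelated to the real `ast.Node` internals: `field`, `object`, `get` and `indexOrGet` are illustrative names, and values are plain integers.

```go
package main

import "fmt"

// field is one key/value pair of a toy JSON object.
type field struct {
	key string
	val int
}

// object keeps its pairs in document order, so a pair can be reached
// either by position or by scanning for its key.
type object []field

// get scans the pairs linearly until the key matches.
func (o object) get(key string) (int, bool) {
	for _, f := range o {
		if f.key == key {
			return f.val, true
		}
	}
	return 0, false
}

// indexOrGet tries the given position first and accepts it only if the
// key matches; otherwise it falls back to the linear scan.
func (o object) indexOrGet(i int, key string) (int, bool) {
	if i >= 0 && i < len(o) && o[i].key == key {
		return o[i].val, true
	}
	return o.get(key)
}

func main() {
	obj := object{{"id", 7}, {"count", 42}}
	fmt.Println(obj.indexOrGet(1, "count")) // 42 true (position hit)
	fmt.Println(obj.indexOrGet(0, "count")) // 42 true (fallback scan)
}
```

The position is a hint rather than a guarantee, so a stale offset costs one extra comparison instead of returning the wrong value.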
#### Set/Unset
Modify the json content by Set()/Unset()
@ -177,21 +190,23 @@ import (
)
buf, err := root.MarshalJson()
println(string(buf)) //{"key1":[{},{"key2":{"key3":[1,2,3]}}]}
exp, err := json.Marshal(&root) //WARN: use pointer
println(string(buf)) // {"key1":[{},{"key2":{"key3":[1,2,3]}}]}
exp, err := json.Marshal(&root) // WARN: use pointer
println(string(buf) == string(exp)) // true
```
#### Other features
- secondary search: `Get()`, `Index()`, `GetByPath()`
- type assignment: `Int64()`, `Float64()`, `String()`, `Number()`, `Bool()`, `Map()`, `Array()`
- children traversal: `Values()`, `Properties()`
#### APIs
- validation: `Check()`, `Error()`, `Valid()`, `Exist()`
- searching: `Index()`, `Get()`, `IndexPair()`, `IndexOrGet()`, `GetByPath()`
- go-type casting: `Int64()`, `Float64()`, `String()`, `Number()`, `Bool()`, `Map[UseNumber|UseNode]()`, `Array[UseNumber|UseNode]()`, `Interface[UseNumber|UseNode]()`
- go-type packing: `NewRaw()`, `NewNumber()`, `NewNull()`, `NewBool()`, `NewString()`, `NewObject()`, `NewArray()`
- iteration: `Values()`, `Properties()`
- modification: `Set()`, `SetByIndex()`, `Add()`, `Cap()`, `Len()`
## Tips
### Pretouch
Since Sonic uses [golang-asm](https://github.com/twitchyliquid64/golang-asm) as JIT assembler, which is NOT very suitable for runtime compiling, first-hit running of a huge schema may cause request-timeout or even process-OOM. For better stability, we advise to **use `Pretouch()` for huge-schema or compact-memory application** before `Marshal()/Unmarshal()`.
Since Sonic uses [golang-asm](https://github.com/twitchyliquid64/golang-asm) as a JIT assembler, which is NOT very suitable for runtime compiling, first-hit running of a huge schema may cause request-timeout or even process-OOM. For better stability, we advise to **use `Pretouch()` for huge-schema or compact-memory application** before `Marshal()/Unmarshal()`.
```go
import (
"reflect"
@ -203,28 +218,28 @@ func init() {
err := sonic.Pretouch(reflect.TypeOf(v))
}
```
**CAUSION:** use the **STRUCT instead of its POINTER** to `Pretouch()`, otherwish it won't work when you pass the pointer to `Marshal()/Unmarshal()`!
**CAUTION:** use the **STRUCT instead of its POINTER** to `Pretouch()`, otherwise it won't work when you pass the pointer to `Marshal()/Unmarshal()`!
### Pass string or []byte?
For alignment to encoding/json, we provide API to pass `[]byte` as arguement, but the string-to-bytes copy is conducted at the same time considering safety, which may lose performance when origin json is huge. Therefore, you can use `UnmarshalString`, `GetFromString` to pass string, as long as your origin data is string or **nocopy-cast** is safe for your []byte.
For alignment to encoding/json, we provide API to pass `[]byte` as argument, but the string-to-bytes copy is conducted at the same time considering safety, which may lose performance when origin JSON is huge. Therefore, you can use `UnmarshalString`, `GetFromString` to pass a string, as long as your origin data is a string or **nocopy-cast** is safe for your []byte.
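A nocopy-cast from `[]byte` to `string` is commonly written with `unsafe`, as in the sketch below. `bytesToString` is a hypothetical helper, not part of sonic's API; it is shown only to make clear what "safe for your []byte" means: the resulting string shares memory with the slice, so the slice must not be modified while the string is in use.

```go
package main

import (
	"fmt"
	"unsafe"
)

// bytesToString reinterprets the slice header as a string header without
// copying. The caller must guarantee that b is never modified afterwards,
// otherwise the "immutable" string would change underneath its users.
func bytesToString(b []byte) string {
	return *(*string)(unsafe.Pointer(&b))
}

func main() {
	buf := []byte(`{"a":1}`)
	s := bytesToString(buf) // no allocation, shares memory with buf
	fmt.Println(s)          // {"a":1}
}
```

If the buffer is reused or written to later (for example, a pooled read buffer), the ordinary `string(buf)` conversion with its copy is the safe choice.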
### Avoid repeating work
Calling `Get()` on overlapping paths from the same root may cause repeated parsing. Instead of using `Get()` several times, you can use parser and searcher together like this:
```go
import "github.com/bytedance/sonic"
root, err := sonic.GetFromString(_TwitterJson, "statuses", 3, "user")
a = root.GetByPath( "entities","description")
b = root.GetByPath( "entities","url")
c = root.GetByPath( "created_at")
```
No need to worry about the overlapping or overparsing of a, b and c, because the inner parser of their root is lazy-loaded.
### Better performance for generic deserializing
In most cases of fully-load generic json, `Unmarshal()` performs better than `ast.Loads()`. But if you only want to search a partial json and convert it into `interface{}` (or `map[string]interface{}`, `[]interface{}`), we advise you to combine `Get()` and `Unmarshal()`:
In most cases, `Unmarshal()` with schematized data performs better than `ast.Loads()`/`node.Interface()` with generic data. But if you only have a schema for part of the JSON, you can combine `Get()` and `Unmarshal()` together:
```go
import "github.com/bytedance/sonic"
node, err := sonic.GetFromString(_TwitterJson, "statuses", 3, "user")
var user interface{}
var user User // your partial schema...
err = sonic.UnmarshalString(node.Raw(), &user)
```
Even if you don't have any schema, use `InterfaceUseNode()` as the container of generic values instead of `Map()` or `Interface()`:
```go
import "github.com/bytedance/sonic"
node, err := sonic.GetFromString(_TwitterJson, "statuses", 3, "user")
user := node.InterfaceUseNode() // use node.Interface() as little as possible
```
Why?
1. using `Interface()` means Sonic must parse all the underlying values, while in most cases you only need several of them;
2. `map[x]` is not efficient enough compared to `array[x]`, but `ast.Node` can use `Index()` for either an array or an object node;
3. `map`'s performance degrades a lot once rehashing is triggered, but `ast.Node` doesn't have this concern.

Binary file not shown (before: 86 KiB, after: 96 KiB).

Binary file not shown (before: 89 KiB, after: 93 KiB).

View file

@ -138,8 +138,6 @@ var atoftests = []atofTest{
// Halfway between 1090544144181609278303144771584 and 1090544144181609419040633126912
// (15497564393479157p+46, should round to even 15497564393479156p+46, issue 36657)
{"1090544144181609348671888949248", "1.0905441441816093e+30", nil},
// slightly above, rounds up
{"1090544144181609348835077142190", "1.0905441441816094e+30", nil},
// Corner case between int64 and float64 for the input
{"9223372036854775807", "9223372036854775807", nil}, // max int64: (1 << 63) - 1

View file

@ -17,13 +17,13 @@
package encoder
import (
"bytes"
"math/rand"
"reflect"
"sort"
"strconv"
"testing"
"unsafe"
`bytes`
`math/rand`
`reflect`
`sort`
`strconv`
`testing`
`unsafe`
)
var keyLen = 15

introduction-1.png (new binary file, 56 KiB)

introduction-2.png (new binary file, 66 KiB)

View file

@ -14,7 +14,7 @@
* limitations under the License.
*/
package sonic
package issue_test
type HugeStruct0 struct {
Field0 map[string]*int64 `json:"field_0,omitempty"`

View file

@ -14,15 +14,17 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
stdjson `encoding/json`
`fmt`
`reflect`
_ `sync`
`testing`
`unsafe`
. `github.com/bytedance/sonic`
`fmt`
`reflect`
_ `sync`
`testing`
`unsafe`
stdjson `encoding/json`
)
func TestLargeMapValue(t *testing.T) {

View file

@ -14,10 +14,11 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
`testing`
. `github.com/bytedance/sonic`
`github.com/davecgh/go-spew/spew`
`github.com/stretchr/testify/require`
@ -28,4 +29,4 @@ func TestIssue101_UnmarshalMWithNumber(t *testing.T) {
err := Unmarshal([]byte("M10"), &v) // MIJ`
spew.Dump(v)
require.Error(t, err)
}
}

View file

@ -14,9 +14,10 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
. `github.com/bytedance/sonic`
`io/ioutil`
`testing`
@ -40,7 +41,7 @@ func benchmarkEncodeSonic(b *testing.B, data []byte) {
}
func BenchmarkIssue16(b *testing.B) {
data, err := ioutil.ReadFile("testdata/twitterescaped.json")
data, err := ioutil.ReadFile("../testdata/twitterescaped.json")
require.Nil(b, err)
benchmarkEncodeSonic(b, data)
}

View file

@ -14,9 +14,10 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
. `github.com/bytedance/sonic`
`reflect`
`sync`
`testing`

View file

@ -14,9 +14,10 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
. `github.com/bytedance/sonic`
`testing`
`github.com/bytedance/sonic/decoder`

View file

@ -14,9 +14,10 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
. `github.com/bytedance/sonic`
`encoding/json`
`reflect`
`testing`

View file

@ -14,9 +14,10 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
. `github.com/bytedance/sonic`
`fmt`
`sync`
`testing`

View file

@ -14,9 +14,10 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
. `github.com/bytedance/sonic`
`testing`
`github.com/stretchr/testify/require`

View file

@ -14,9 +14,10 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
. `github.com/bytedance/sonic`
`testing`
`github.com/stretchr/testify/require`

View file

@ -14,9 +14,10 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
. `github.com/bytedance/sonic`
`encoding/json`
`testing`
)

View file

@ -14,9 +14,10 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
. `github.com/bytedance/sonic`
`encoding/json`
`fmt`
`testing`

View file

@ -14,9 +14,10 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
. `github.com/bytedance/sonic`
`testing`
`github.com/stretchr/testify/require`

View file

@ -14,9 +14,10 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
. `github.com/bytedance/sonic`
`testing`
`github.com/bytedance/sonic/decoder`

View file

@ -14,9 +14,10 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
. `github.com/bytedance/sonic`
`testing`
`github.com/bytedance/sonic/decoder`

View file

@ -14,9 +14,10 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
. `github.com/bytedance/sonic`
"encoding/json"
"reflect"
"testing"

View file

@ -14,9 +14,10 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
. `github.com/bytedance/sonic`
"encoding/json"
"fmt"
"testing"

View file

@ -14,9 +14,10 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
. `github.com/bytedance/sonic`
`testing`
`math`
`encoding/json`

View file

@ -14,7 +14,7 @@
* limitations under the License.
*/
package sonic
package issue_test
import (
`testing`

other-langs.png (new binary file, 94 KiB)