Serialization is at the heart of every networked Go application. Whether you’re building gRPC microservices, persisting application state, or exchanging messages between services, your choice of encoding format directly impacts throughput, latency, and schema evolution. Go’s standard library provides encoding/json, encoding/gob, and encoding/xml, but a rich ecosystem of third-party libraries often offers substantially better performance and richer type support.
This guide compares six serialization approaches for Go: encoding/gob (Go-native binary), msgp (MessagePack code generator), protobuf-go (Protocol Buffers), goavro/hamba/avro (Apache Avro), segmentio/encoding (high-performance JSON), and the standard encoding/json. We focus on throughput, schema evolution, and cross-language interoperability.
Performance Comparison
| Library | Format | Schema Required | Cross-Language | Speed (relative) | Allocations |
|---|---|---|---|---|---|
| encoding/gob | Binary | No | ❌ Go-only | 1.0x (baseline) | High |
| msgp | Binary (MessagePack) | Code-gen | ✅ Yes | 3-8x faster | Low |
| protobuf-go | Binary (protobuf) | .proto file | ✅ Yes | 2-6x faster | Low |
| goavro | Binary (Avro) | Schema file | ✅ Yes | 1.5-4x faster | Medium |
| segmentio/encoding | Text (JSON) | No | ✅ Yes | 2-4x faster | Low |
| encoding/json | Text (JSON) | No | ✅ Yes | 1.0x (baseline) | High |
encoding/gob: Go-Native Binary Serialization
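encoding/gob is Go’s built-in binary encoding format. Gob streams are self-describing (no separate schema file is needed), and the encoder works through reflection with most Go types, including structs with exported fields; functions and channels are not supported. Types that need custom encoding can implement the gob.GobEncoder and gob.GobDecoder interfaces. The key advantage: zero configuration.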
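A minimal round trip through gob might look like the sketch below. The Session type and the roundTrip helper are hypothetical names used only for illustration; the gob calls themselves are from the standard library.

```go
package main

import (
	"bytes"
	"encoding/gob"
	"fmt"
)

// Session is a hypothetical type used only for illustration.
// Gob encodes exported fields; no tags or schema are required.
type Session struct {
	UserID int64
	Roles  []string
}

// roundTrip encodes s with gob and decodes it into a fresh value.
func roundTrip(s Session) (Session, error) {
	var buf bytes.Buffer
	if err := gob.NewEncoder(&buf).Encode(s); err != nil {
		return Session{}, err
	}
	var out Session
	err := gob.NewDecoder(&buf).Decode(&out)
	return out, err
}

func main() {
	out, err := roundTrip(Session{UserID: 7, Roles: []string{"admin"}})
	fmt.Println(out, err) // prints "{7 [admin]} <nil>"
}
```

In a real service the bytes.Buffer would typically be replaced by a network connection or a file, since gob encoders and decoders work on any io.Writer and io.Reader.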
Gob’s main limitation is that it’s Go-only. You cannot decode gob-encoded data from Python or JavaScript. It’s ideal for Go-to-Go communication (RPC between services, local caching, session persistence) but unsuitable for APIs consumed by external clients.
msgp: MessagePack Code Generation
msgp uses code generation to produce highly optimized MessagePack serializers for your Go types. Unlike reflection-based approaches, msgp generates explicit read/write methods that avoid allocation overhead.
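Running msgp -file=types.go generates a companion Go file (types_gen.go by default) that adds MarshalMsg, UnmarshalMsg, EncodeMsg, DecodeMsg, and Msgsize methods to each of your types.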
MessagePack is a binary format with implementations in 50+ languages, making msgp-generated Go code interoperable with Python, JavaScript, Rust, and more. The code generation approach delivers the best throughput among all options compared here.
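To give a feel for why explicit, type-specific methods are fast, here is a hand-written sketch in the style of generated code. It is not actual msgp output: the User type is hypothetical, and the method only handles strings of up to 31 bytes and ages below 128 so that the MessagePack fixmap, fixstr, and positive fixint encodings suffice.

```go
package main

import (
	"errors"
	"fmt"
)

// User is a hypothetical type used only for illustration.
type User struct {
	Name string
	Age  uint8
}

// appendFixStr appends a MessagePack fixstr (strings of up to 31 bytes).
func appendFixStr(b []byte, s string) []byte {
	b = append(b, 0xa0|byte(len(s)))
	return append(b, s...)
}

// MarshalMsg appends u to b as a two-entry MessagePack map.
// It mirrors the append-style signature of msgp's generated methods,
// but it is hand-written and deliberately limited in range.
func (u User) MarshalMsg(b []byte) ([]byte, error) {
	if len(u.Name) > 31 || u.Age > 127 {
		return b, errors.New("value outside the range this sketch supports")
	}
	b = append(b, 0x82) // fixmap holding 2 key/value pairs
	b = appendFixStr(b, "Name")
	b = appendFixStr(b, u.Name)
	b = appendFixStr(b, "Age")
	b = append(b, u.Age) // positive fixint
	return b, nil
}

func main() {
	out, err := User{Name: "Ada", Age: 36}.MarshalMsg(nil)
	if err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", out) // prints "82 a4 4e 61 6d 65 a3 41 64 61 a3 41 67 65 24"
}
```

Every step is a direct append with no type inspection at run time, and passing a reused buffer as b avoids allocating a new slice per message.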
protobuf-go: Protocol Buffers for gRPC and Beyond
Protocol Buffers is Google’s language-neutral serialization format. The Go implementation (google.golang.org/protobuf) is the canonical choice for gRPC services, but it works equally well for standalone serialization.
Protobuf’s strongest feature is schema evolution — you can add fields, deprecate old ones, and change types in controlled ways without breaking existing consumers. The .proto file serves as living documentation and source of truth for your data contracts.
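One reason adding fields is safe is that every value on the wire is prefixed with its field number and wire type, so a reader can skip fields it does not recognize. The sketch below hand-rolls that idea with encoding/binary. It is not the protobuf-go API, the names encodeV2 and decodeV1 are hypothetical, and it covers only the varint and length-delimited wire types; production code should use types generated by protoc-gen-go.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

const (
	wireVarint = 0 // protobuf wire type for varints
	wireBytes  = 2 // protobuf wire type for length-delimited values
)

// encodeV2 writes field 1 (id, varint) and field 2 (email, length-delimited),
// as a newer writer would after field 2 was added to the schema.
func encodeV2(id uint64, email string) []byte {
	var b []byte
	b = binary.AppendUvarint(b, 1<<3|wireVarint) // tag = field number << 3 | wire type
	b = binary.AppendUvarint(b, id)
	b = binary.AppendUvarint(b, 2<<3|wireBytes)
	b = binary.AppendUvarint(b, uint64(len(email)))
	return append(b, email...)
}

// decodeV1 is an older reader that only knows field 1 and skips everything else.
func decodeV1(b []byte) (uint64, error) {
	var id uint64
	for len(b) > 0 {
		tag, n := binary.Uvarint(b)
		if n <= 0 {
			return 0, errors.New("bad tag")
		}
		b = b[n:]
		field, wire := tag>>3, tag&7
		switch wire {
		case wireVarint:
			v, n := binary.Uvarint(b)
			if n <= 0 {
				return 0, errors.New("bad varint")
			}
			b = b[n:]
			if field == 1 {
				id = v
			}
		case wireBytes:
			l, n := binary.Uvarint(b)
			if n <= 0 || uint64(len(b)-n) < l {
				return 0, errors.New("bad length")
			}
			b = b[n+int(l):] // skip the payload without interpreting it
		default:
			return 0, fmt.Errorf("unsupported wire type %d", wire)
		}
	}
	return id, nil
}

func main() {
	msg := encodeV2(42, "a@example.com")
	id, err := decodeV1(msg)
	fmt.Println(id, err) // prints "42 <nil>"
}
```

The old reader never learns what field 2 means, yet it still decodes field 1 correctly, which is the property that lets producers and consumers be upgraded independently.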
Avro for Go: Schema-Based Evolution
Apache Avro offers rich schema evolution, and Go has two main libraries for it: linkedin/goavro (1,064 stars, battle-tested at LinkedIn scale) and hamba/avro (510 stars, more idiomatic Go API). Both encode and decode data against an Avro schema, which is what makes reader-schema/writer-schema resolution possible.
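The reason Avro needs a schema on both sides is that its binary encoding carries no field names or tags: a record is just its field values written in schema order. The sketch below hand-encodes a record for an assumed schema {id: long, name: string} using the standard library, relying on the fact that Go's binary.AppendVarint uses the same zig-zag varint scheme as Avro's long. It is an illustration of the wire layout only, not the goavro or hamba/avro API, and the function names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// encodeUser writes a record for the assumed schema {id: long, name: string}.
func encodeUser(id int64, name string) []byte {
	b := binary.AppendVarint(nil, id)            // Avro long: zig-zag varint
	b = binary.AppendVarint(b, int64(len(name))) // string length is also a long
	return append(b, name...)                    // followed by the UTF-8 bytes
}

// decodeUser reads the two fields back in schema order.
// Without knowing the schema, the bytes cannot be interpreted.
func decodeUser(b []byte) (int64, string, error) {
	id, n := binary.Varint(b)
	if n <= 0 {
		return 0, "", errors.New("bad id")
	}
	b = b[n:]
	l, n := binary.Varint(b)
	if n <= 0 || l < 0 || int64(len(b)-n) < l {
		return 0, "", errors.New("bad string length")
	}
	return id, string(b[n : n+int(l)]), nil
}

func main() {
	data := encodeUser(42, "Ada")
	fmt.Printf("% x\n", data) // prints "54 06 41 64 61"
	id, name, err := decodeUser(data)
	fmt.Println(id, name, err) // prints "42 Ada <nil>"
}
```

Five bytes for the whole record shows how compact the format is, and also why a reader holding a different schema needs the writer's schema to resolve the two.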
Avro is particularly popular in the Hadoop/Kafka ecosystem, where Confluent Schema Registry manages schemas and ensures compatibility across producers and consumers. If your Go services consume from or produce to Kafka topics with Avro encoding, hamba/avro or goavro are essential.
segmentio/encoding: High-Performance JSON
segmentio/encoding is a faster alternative to the reflection-heavy standard encoding/json. It requires no code generation and delivers a 2-4x throughput improvement while remaining compatible with standard JSON.
The API is a drop-in replacement for encoding/json — change one import and your code compiles. Under the hood, it uses unsafe pointer arithmetic and custom assembly to reduce allocations and CPU cycles. It’s ideal for REST APIs serving JSON at high throughput.
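As a sketch of what the import swap looks like in practice, the program below is written against the standard library so that it runs without extra dependencies. The assumption, based on the drop-in claim above, is that replacing the import path with github.com/segmentio/encoding/json leaves the rest of the file unchanged. The Event type is hypothetical.

```go
package main

import (
	// Assumption: replace this path with "github.com/segmentio/encoding/json"
	// to switch implementations; Marshal and Unmarshal keep the same signatures.
	"encoding/json"
	"fmt"
)

// Event is a hypothetical type used only for illustration.
type Event struct {
	ID   int      `json:"id"`
	Tags []string `json:"tags,omitempty"`
}

// roundTrip marshals e to JSON and unmarshals it into a fresh value.
func roundTrip(e Event) (Event, []byte, error) {
	data, err := json.Marshal(e)
	if err != nil {
		return Event{}, nil, err
	}
	var out Event
	err = json.Unmarshal(data, &out)
	return out, data, err
}

func main() {
	out, data, err := roundTrip(Event{ID: 7, Tags: []string{"x"}})
	fmt.Println(string(data), out, err) // prints `{"id":7,"tags":["x"]} {7 [x]} <nil>`
}
```

Keeping the package name json on both sides is what makes the swap a one-line change, so it is worth running your existing JSON tests against both imports before switching.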
Decision Guide
Use encoding/gob for Go-to-Go RPC, caching, and internal state persistence where cross-language compatibility isn’t needed. Zero configuration, zero code generation.
Use msgp when you need maximum throughput with cross-language support. The code generation approach produces the fastest serializers, and MessagePack has broad ecosystem support.
Use protobuf-go when building gRPC services or when you need formal schema evolution rules that tooling can check for compatibility. The .proto files serve as the single source of truth for your team’s data contracts.
Use Avro (hamba/avro or goavro) when your data flows through Kafka with Confluent Schema Registry, or when you need reader/writer schema resolution for backward compatibility.
Use segmentio/encoding as a drop-in replacement for encoding/json when JSON is required (REST APIs, webhooks) but you need better performance.
For a broader comparison of serialization frameworks across languages, see our schema serialization frameworks guide. For C++-specific serialization, our C++ serialization comparison covers Cereal, Boost.Serialization, and Bitsery.
Real-World Performance Benchmarks
Results vary with payload size and structure, but the following throughput figures are representative of community benchmarks on Go 1.22 with 1 KB message payloads:
| Library | Encode (MB/s) | Decode (MB/s) | Allocations per op |
|---|---|---|---|
| msgp (code-gen) | 850 | 720 | 4 |
| protobuf-go | 620 | 580 | 12 |
| segmentio/encoding/json | 480 | 440 | 8 |
| hamba/avro | 410 | 380 | 14 |
| goavro | 380 | 350 | 18 |
| encoding/json (stdlib) | 180 | 160 | 42 |
| encoding/gob | 140 | 130 | 55 |
These numbers illustrate why msgp leads on raw encoding speed: its generated code skips run-time reflection entirely. protobuf-go also relies on generated code, paired with an optimized general-purpose runtime. The expensive part of encoding/json and encoding/gob is reflection-based type inspection, which also creates substantial GC pressure through allocations.
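Because published figures depend heavily on payload shape, it can be worth measuring your own messages. The sketch below uses testing.Benchmark from the standard library to time json.Marshal on a hypothetical Payload type; the type and the benchMarshal helper are illustrative names, and the same pattern should apply to any of the other encoders.

```go
package main

import (
	"encoding/json"
	"fmt"
	"testing"
)

// Payload is a hypothetical message type; substitute your own.
type Payload struct {
	ID   int64    `json:"id"`
	Name string   `json:"name"`
	Tags []string `json:"tags"`
}

// benchMarshal measures json.Marshal on a single payload value.
func benchMarshal(p Payload) (testing.BenchmarkResult, error) {
	data, err := json.Marshal(p) // encode once to learn the message size
	if err != nil {
		return testing.BenchmarkResult{}, err
	}
	r := testing.Benchmark(func(b *testing.B) {
		b.SetBytes(int64(len(data))) // lets the result report throughput
		b.ReportAllocs()
		for i := 0; i < b.N; i++ {
			if _, err := json.Marshal(p); err != nil {
				b.Fatal(err)
			}
		}
	})
	return r, nil
}

func main() {
	r, err := benchMarshal(Payload{ID: 1, Name: "example", Tags: []string{"a", "b"}})
	if err != nil {
		panic(err)
	}
	mbPerSec := float64(r.Bytes) * float64(r.N) / r.T.Seconds() / 1e6
	fmt.Printf("%d ops, %.1f MB/s, %d allocs/op\n", r.N, mbPerSec, r.AllocsPerOp())
}
```

The numbers this prints will differ from machine to machine, so compare libraries on the same hardware and with payloads that resemble production traffic.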
For microservices processing millions of messages per second, the difference between 180 MB/s (stdlib JSON) and 850 MB/s (msgp) means roughly 4.7x less CPU time spent on serialization, so a workload dominated by encoding needs correspondingly fewer cores. In cloud environments where compute cost scales roughly linearly with core count, this is a meaningful reduction in operating expense.
When Schema-Free Matters
Not every project can afford the operational overhead of maintaining .proto files, running protoc in CI, and coordinating schema changes across teams. For smaller teams or rapid prototyping, segmentio/encoding offers an excellent middle ground: 2-4x faster than stdlib JSON with zero code generation, zero schema files, and a drop-in API. You get the performance benefits without the workflow friction.
FAQ
When should I use gob instead of protobuf for internal services?
Use gob when both producer and consumer are written in Go and you value simplicity over cross-language compatibility. Gob requires no .proto files, no code generation step, and no dependency on protoc. It’s ideal for Go microservices communicating over gRPC alternatives like net/rpc or NATS.
Is msgp faster than protobuf-go in Go?
Often, yes. In the benchmark table above the gap is roughly 1.2-1.4x for encode and decode, because msgp generates type-specific Go code with very few allocations, while protobuf-go uses a more general-purpose runtime. However, protobuf offers superior schema evolution and broader ecosystem tooling.
Can I use Avro without Kafka?
Absolutely. Avro is a standalone serialization format — you can use it for any data storage or message passing. The hamba/avro library works independently of Kafka, though Avro’s strongest ecosystem integrations are in the Kafka/Hadoop world.
Does segmentio/encoding support all encoding/json features?
It supports the vast majority of encoding/json features, including custom marshalers (json.Marshaler), json.RawMessage, omitempty, and struct tags. However, edge cases like json.Number and certain nested interface patterns may behave slightly differently. Test thoroughly if you have complex JSON structures.
What’s the serialized size difference between these formats?
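Protobuf and Avro typically produce the most compact output because they omit field names from the payload; Avro is comparable to protobuf when schemas are shared out-of-band but larger when the schema is embedded. MessagePack encodes structs as maps with field names by default, so msgp output is usually smaller than JSON but larger than protobuf unless you opt into its tuple (array) encoding. Gob includes type metadata, making it the least compact binary option for small messages. JSON is the largest but most human-readable.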