The `encoding` block defines how Tangent serializes and compresses records before writing them to a sink (e.g., S3, File). It controls the file format, schema handling (for Avro/Parquet), and data compression.
`type`
Output format for serialized records. Defaults to `ndjson`.
Options:
- `ndjson` — newline-delimited JSON; one record per line
- `json` — standard JSON array of records
- `avro` — Apache Avro binary encoding (requires a schema)
- `parquet` — Apache Parquet columnar format (requires a schema)
`schema`
Path to a schema file, required by the `avro` and `parquet` formats. Can point to a local `.avsc` (Avro) or `.json` (Arrow/Parquet) schema file.
`compression`
Configures compression for the encoded data. Defaults to `zstd` at level 3.
Options:
- `none` — no compression
- `gzip` — standard Gzip (`.gz` extension, level 6 default)
- `zstd` — Zstandard (`.zst` extension, level 3 default)
- `snappy` — Snappy block compression (Avro only)
- `deflate` — Deflate stream compression (Avro only)
Notes:
- For Avro and Parquet, compression applies to the data blocks inside the file, not the file as a whole.
- Tangent automatically appends compression extensions when applicable (e.g. `.gz`, `.zst`).
Examples
NDJSON (default)
tangent.yaml
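A minimal sketch, assuming the `encoding` block sits inside a sink definition (the `sink`/`type: s3` wrapper is hypothetical; key paths follow the Defaults Summary below). Because `ndjson` and `zstd` are the defaults, both blocks shown could be omitted entirely:

```yaml
sink:
  type: s3               # hypothetical sink wrapper
  encoding:
    type: ndjson         # default: one JSON record per line
  compression:
    type: zstd           # default: Zstandard, level 3
```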
Parquet with Arrow schema
tangent.yaml
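A sketch under the same assumptions (hypothetical `sink` wrapper and schema path), pointing `encoding.schema` at a local Arrow schema in JSON form:

```yaml
sink:
  type: s3                          # hypothetical sink wrapper
  encoding:
    type: parquet
    schema: ./schemas/events.json   # hypothetical path to an Arrow schema
  compression:
    type: zstd
    level: 3                        # zstd default level
```

For Parquet, the `zstd` compression applies per data block inside the file, so no `.zst` extension is appended to the output file itself.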
Avro with explicit schema
tangent.yaml
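A sketch under the same assumptions, using a local `.avsc` Avro schema and Snappy, which is available for Avro only:

```yaml
sink:
  type: s3                          # hypothetical sink wrapper
  encoding:
    type: avro
    schema: ./schemas/events.avsc   # hypothetical path to an Avro schema
  compression:
    type: snappy                    # Avro-only block compression
```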
Defaults Summary
| Setting | Default | Description |
|---|---|---|
| `encoding.type` | `ndjson` | Newline-delimited JSON |
| `encoding.schema` | — | Required for Avro/Parquet |
| `compression.type` | `zstd` | Data compression method |
| `compression.level` | 3 (zstd) / 6 (gzip) | Compression strength |