Why Use Command-Line Data Format Processors?

Modern DevOps workflows involve juggling YAML, JSON, TOML, CSV, and XML configuration files. Manually editing these formats is error-prone, and writing custom scripts for each transformation is inefficient. Tools like yq and dasel provide jq-like query and transformation capabilities across multiple data formats, making them indispensable for Infrastructure as Code, CI/CD pipelines, and configuration management.

These processors let you extract values, modify configurations, and convert between formats without writing Python or JavaScript scripts. They’re particularly valuable in Kubernetes environments where YAML manipulation is a daily task.

For developers who work with structured data, see our CSV processing guide for tabular data workflows, and our shell script linting tools for ensuring configuration script quality.

Feature Comparison

Feature                  yq (mikefarah/yq)   dasel
Language                 Go                  Go
Stars                    13,600+             7,200+
YAML support             ✅ Native           ✅ Native
JSON support             ✅                  ✅
TOML support             ✅                  ✅
CSV support              ✅ (read/write)     ✅
XML support              ✅                  ✅
jq syntax compatibility  Partial             Own DSL
In-place editing         ✅ (-i flag)        ✅ (put/delete write back to the -f file)
Multiple documents       ✅
Streaming YAML           ✅                  ❌
Binary size              ~10MB               ~7MB
Expression language      jq-like             dasel selector

yq: The Swiss Army Knife for YAML

yq by Mike Farah is the most popular YAML processor, offering a jq-inspired syntax for querying and transforming YAML, JSON, XML, and TOML files. Its expression language will feel familiar to anyone who’s used jq.

# Install yq
wget https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 -O /usr/local/bin/yq
chmod +x /usr/local/bin/yq
# or: brew install yq / snap install yq

# Read a value from YAML
yq '.metadata.name' deployment.yaml

# Update a YAML value in-place
yq -i '.spec.replicas = 3' deployment.yaml

# Convert between formats
yq -p json -o yaml config.json > config.yaml

# Query nested structures
yq '.spec.containers[].image' pod.yaml

# Merge two YAML files
yq eval-all 'select(fileIndex == 0) * select(fileIndex == 1)' base.yaml overlay.yaml > merged.yaml
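
The merge expression above is easier to reason about with concrete input. The following is a minimal sketch, assuming yq v4 (mikefarah/yq) is on the PATH; merge_overlay and merge_demo are hypothetical helper names, and the two YAML files are throwaway examples rather than anything from a real project.

```shell
#!/usr/bin/env bash
# merge_overlay is a hypothetical wrapper around the yq v4 merge expression.
# Maps are merged key by key; on conflict the value from the overlay wins.
merge_overlay() {
  local base=$1 overlay=$2
  yq eval-all 'select(fileIndex == 0) * select(fileIndex == 1)' "$base" "$overlay"
}

# Worked example with two throwaway files in a temporary directory.
merge_demo() {
  local dir status
  dir=$(mktemp -d) || return 1
  printf 'app:\n  name: web\n  replicas: 1\n' >"$dir/base.yaml"
  printf 'app:\n  replicas: 3\n' >"$dir/overlay.yaml"
  merge_overlay "$dir/base.yaml" "$dir/overlay.yaml"
  status=$?
  rm -rf "$dir"
  return "$status"
}
```

Calling merge_demo should print a document in which app.name is still web and app.replicas has been overridden to 3, since only the overlay's keys are replaced and the rest of the base map is kept.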

For Kubernetes workflows, yq is particularly powerful:

# Extract all container images from a deployment
yq '.spec.template.spec.containers[].image' deploy.yaml

# Set resource limits on all containers
yq -i '.spec.template.spec.containers[].resources.limits.memory = "256Mi"' deploy.yaml

# Convert Helm values to JSON for API consumption
yq -o json values.yaml > values.json

dasel: The Multi-Format Selector

dasel (Data Selector) takes a different approach with its own intuitive selector syntax that works identically across YAML, JSON, TOML, CSV, and XML. This consistency makes it easier to learn and reduces context-switching between formats.

# Install dasel
wget -O /usr/local/bin/dasel https://github.com/TomWright/dasel/releases/latest/download/dasel_linux_amd64
chmod +x /usr/local/bin/dasel
# or: brew install dasel / go install github.com/tomwright/dasel/v2/cmd/dasel@latest

# Read a value (same syntax for all formats)
dasel -f config.yaml '.metadata.name'
dasel -f config.json '.metadata.name'
dasel -f config.toml '.metadata.name'

# Update a value (dasel v2 syntax; -t int keeps replicas numeric)
dasel put -f config.yaml -t int -v 5 '.spec.replicas'

# Delete a key
dasel delete -f config.yaml '.metadata.annotations.deprecated'

# Convert between formats
dasel -f input.json -r json -w yaml > output.yaml

# Select with conditions (assumes a top-level containers list)
dasel -f data.yaml '.containers.all().filter(equal(name,nginx)).image'

The consistent selector syntax is dasel’s strongest advantage:

# These work identically regardless of input format
dasel -f data.yaml '.servers.[0].host'
dasel -f data.json '.servers.[0].host'
dasel -f data.toml '.servers.[0].host'

Docker Deployment

Both tools are single static binaries, ideal for containerized environments:

FROM alpine:latest
RUN wget -O /usr/local/bin/yq https://github.com/mikefarah/yq/releases/latest/download/yq_linux_amd64 && \
    chmod +x /usr/local/bin/yq && \
    wget -O /usr/local/bin/dasel https://github.com/TomWright/dasel/releases/latest/download/dasel_linux_amd64 && \
    chmod +x /usr/local/bin/dasel

CI/CD Integration

Both tools shine in CI/CD pipelines for dynamic configuration:

# GitHub Actions: Update deployment config
- name: Bump replicas for production
  run: |
    yq -i '.spec.replicas = 5' k8s/deployment.yaml
    kubectl apply -f k8s/deployment.yaml

# GitLab CI: Extract version from package.json
- script: |
    VERSION=$(dasel -f package.json -r json '.version' | tr -d '"')
    echo "Building version $VERSION"

Choosing the Right Tool

  • yq is ideal for Kubernetes-heavy teams who already use jq and appreciate the familiar expression language. Its streaming YAML support handles large multi-document files efficiently.
  • dasel excels in multi-format environments where you need a single tool for YAML, JSON, TOML, CSV, and XML. Its consistent selector syntax reduces the mental overhead of format-specific query languages.

Historical Context and Evolution

The need for command-line YAML processing grew alongside the adoption of Kubernetes and Docker Compose. yq was created by Mike Farah in 2017 to fill the gap left by jq’s JSON-only limitation. It borrowed jq’s expression syntax while adding native YAML awareness — a design choice that made it immediately accessible to millions of developers already familiar with jq.

dasel emerged in 2020 with a different philosophy: instead of adapting jq’s complex expression language to multiple formats, it designed a simple selector syntax from the ground up that works identically across YAML, JSON, TOML, CSV, and XML. This reduced the learning curve dramatically — where yq requires learning jq expressions, dasel’s dot-notation selectors are intuitive from the first use.

The tools are complementary rather than competitive. Teams that already use jq heavily tend to prefer yq for YAML work, while teams new to data format processing often find dasel’s consistent multi-format approach more approachable. Both are now essential parts of the modern DevOps toolkit.

Real-World Kubernetes Workflows

In production Kubernetes environments, these tools handle critical configuration tasks that would otherwise require error-prone manual YAML editing:

# Extract all image tags for security scanning
yq '.spec.template.spec.containers[].image' deploy.yaml | sort -u > images.txt

# Batch update resource limits across multiple deployments
for f in k8s/deployments/*.yaml; do
  yq -i '.spec.template.spec.containers[].resources.limits.cpu = "500m"' "$f"
done

# Check that a required key exists (a presence check, not schema validation)
dasel -f values.yaml '.replicaCount' && echo "replicaCount is set"

# Generate environment-specific overlays
yq eval-all 'select(fileIndex==0) * select(fileIndex==1)' base.yaml prod.yaml > final.yaml

Error Handling and Validation

Both tools provide meaningful error messages for malformed input, but they differ in validation depth. yq validates YAML syntax strictly and provides detailed line numbers for parse errors. dasel focuses on selector correctness — it will tell you if your selector path doesn’t match the document structure, which is invaluable for catching typos in CI/CD pipeline scripts.
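
One way to turn these error signals into a hard failure in a pipeline is to check the exit status. The sketch below assumes yq v4, whose -e (--exit-status) flag sets a non-zero exit status when an expression matches nothing or evaluates to null or false; require_key is a hypothetical helper name, and the file path in the usage comment is a placeholder.

```shell
#!/usr/bin/env bash
# require_key is a hypothetical helper: it fails when a key is absent.
# It relies on yq v4's -e flag, so a key that is legitimately set to
# null or false is also reported as a failure.
require_key() {
  local file=$1 expr=$2
  if ! yq -e "$expr" "$file" >/dev/null 2>&1; then
    echo "ERROR: $expr is missing, null, or false in $file" >&2
    return 1
  fi
}

# Example use in a pipeline step (path is a placeholder):
#   require_key k8s/deployment.yaml '.spec.replicas' || exit 1
```

Because the helper also fails on unparseable input, a single call covers both malformed YAML and a mistyped path, at the cost of not telling the two apart.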

For related configuration management tools, see our shell script linting guide and our CSV processing tools comparison.

Security Considerations for Configuration Processing

When processing configuration files that may contain secrets (API keys, database passwords, service tokens), consider these security practices:

  1. Never commit processed output containing secrets — both yq and dasel can inadvertently expose secrets in their output. Always use .gitignore patterns for processed files and verify output doesn’t contain sensitive data before sharing.

  2. Use environment variable substitution — rather than hardcoding secrets in YAML files, both tools can work with templated configurations:

# Replace placeholder with environment variable
yq -i '.database.password = strenv(DB_PASS)' config.yaml

# dasel equivalent: the shell expands $DB_PASSWORD before dasel runs
dasel put -f config.yaml -t string -v "$DB_PASSWORD" '.database.password'

  3. Validate before applying: always preview a change before modifying production configurations. yq prints the result to stdout unless -i is given. dasel put and delete write back to the file passed with -f, so run them against a copy, or pipe the file in on stdin so the result goes to stdout.

  4. Use checksums for integrity: after processing configuration files, generate and store checksums to detect unintended modifications in CI/CD pipelines.
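
The preview and checksum steps can be sketched as a pair of small shell helpers. This is an illustration under stated assumptions, not a prescribed workflow: preview_change, record_checksum, and verify_checksum are hypothetical names, the sha256sum and diff utilities are assumed to be available (as on most Linux CI images), and the transformation command is assumed to print its result to stdout.

```shell
#!/usr/bin/env bash
# Show what a transformation would change without touching the file.
# Usage: preview_change FILE COMMAND [ARGS...]
# COMMAND is run with FILE appended as its last argument and must
# write the transformed document to stdout.
preview_change() {
  local file=$1 tmp status
  shift
  tmp=$(mktemp) || return 1
  if ! "$@" "$file" >"$tmp"; then
    rm -f "$tmp"
    return 1
  fi
  # diff exits with 1 when the files differ, which is expected here.
  diff -u "$file" "$tmp"
  status=$?
  rm -f "$tmp"
  [ "$status" -le 1 ]
}

# Store a SHA-256 checksum next to the file, and verify it later.
record_checksum() { sha256sum "$1" >"$1.sha256"; }
verify_checksum() { sha256sum -c "$1.sha256" >/dev/null; }

# Example (yq prints to stdout when -i is omitted):
#   preview_change config.yaml yq '.spec.replicas = 3'
#   record_checksum config.yaml
#   verify_checksum config.yaml || echo "config.yaml changed unexpectedly"
```

Keeping the preview separate from the write means the in-place command only runs after the diff has been reviewed, and the stored checksum gives later pipeline stages a cheap way to notice edits made outside that path.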

These practices are especially important in GitOps workflows where configuration changes are automatically applied to production environments. For more on secure configuration management, see our shell script linting guide which covers validation techniques for configuration-related scripts.

FAQ

How do yq and dasel compare to jq?

jq is JSON-only but has the most powerful expression language. yq extends jq-like syntax to YAML/XML/TOML. dasel uses its own simpler selector DSL that works identically across all formats. For JSON-only work, jq is still the best choice. For multi-format pipelines, yq or dasel are better.

Can I use these for Kubernetes manifest management?

Absolutely. yq is particularly popular in the Kubernetes ecosystem for patching manifests, extracting values, and converting between formats. Both tools support in-place editing, which is essential for CI/CD pipeline automation.

Which has better performance on large files?

yq handles large YAML files more efficiently, especially with streaming YAML support. dasel loads the entire document into memory. For files under 10MB, the difference is negligible.

Do they support anchors and aliases in YAML?

yq has robust support for YAML anchors, aliases, and merge keys. dasel’s YAML support covers these features but with some limitations on complex anchor graphs. Test with your specific YAML structure before committing to a tool.

Can I use these as Go libraries in my own tools?

Yes — both are written in Go and expose their core functionality as importable packages. yq’s pkg/yqlib and dasel’s core module can be embedded in custom Go applications for programmatic data transformation.

