JSON Trouble on the Default-lines
Note: I leaned heavily on Google Gemini for this investigation, blog post, and the necessary blog post illustration.
We've been refining our configuration system using JSON schemas as the source of truth. The goal: let users provide a sparse config.yaml and have our application fill in all the defaults automatically. What we discovered about how different languages handle JSON Schema defaults was illuminating.
The Expectation
The main library we're using is the json_schemer gem with its insert_property_defaults: true option. We considered the more popular json-schema gem (500+ million downloads vs json_schemer's 80 million), but it's stuck on JSON Schema draft-05 and no longer actively maintained. Multiple GitHub issues confirm draft-06+ support remains incomplete despite years of requests.
We expected json_schemer to take our JSON schema (complete with default keywords) and a minimal user config, then produce a fully populated configuration hash. For example, if our schema defined a site section with host defaulting to "localhost:3000", even an empty user config should result in site: { host: "localhost:3000", ... }.
The Problem
Simple top-level defaults worked. Defaults within existing, valid config sections worked. But the deeper "scaffolding" – creating missing nested objects and filling their defaults – didn't happen comprehensively.
The issue was the interaction between default: {} (indicating an object should be created if missing) and the required keyword. If an object was created (e.g., site: {}) but immediately failed validation due to missing required properties, json_schemer stopped filling defaults for the properties within that invalid object.
The Experiment
To determine if this was json_schemer-specific, we tested across three environments:
- Ruby with json_schemer
- Node.js with ajv
- Python with jsonschema
Test schema structure: nested objects with default: {} and required fields at multiple levels.
{
  "type": "object",
  "properties": {
    "config_section": {
      "type": "object",
      "default": {},
      "properties": {
        "setting1_with_default": { "type": "string", "default": "default_for_setting1" },
        "setting2_required_no_default": { "type": "boolean" },
        "nested_object": {
          "type": "object",
          "default": {},
          "properties": {
            "deep_setting_with_default": { "type": "integer", "default": 42 },
            "deep_setting_required_no_default": { "type": "string" }
          },
          "required": ["deep_setting_required_no_default"]
        }
      },
      "required": ["setting2_required_no_default", "nested_object"]
    },
    "top_level_prop_with_default": {
      "type": "string",
      "default": "default_for_top_level"
    }
  }
}
Ruby Results
With empty input:
Data AFTER validation:
{"config_section" => {}, "top_level_prop_with_default" => "default_for_top_level"}
Validation FAILED. Errors:
1. Path: /config_section, Error: required, Details: {"missing_keys" => ["setting2_required_no_default", "nested_object"]}
config_section was created as {}, but setting1_with_default and deep_setting_with_default were not applied. The required failure stopped the cascade.
Node.js Results
ajv with useDefaults: true behaved differently:
Data AFTER validation:
{
  config_section: {
    setting1_with_default: 'default_for_setting1', // Applied!
    nested_object: { deep_setting_with_default: 42 } // Applied!
  },
  top_level_prop_with_default: 'default_for_top_level'
}
Validation FAILED. Errors: [required field errors]
ajv scaffolded the nested structure with defaults first, then reported validation errors.
Python Results
With a custom validator extension (the jsonschema library does not apply defaults out of the box), Python mirrored ajv's behavior: defaults were applied before required errors were flagged.
The Difference
json_schemer: Validates required constraints before filling defaults within invalid objects.
ajv / Python: Apply defaults globally first, then validate.
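Both approaches are valid per the JSON Schema specification, which treats default as an annotation and does not require validators to apply it. But json_schemer's behavior meant our strategy of relying on the validator to scaffold the whole configuration wouldn't work.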
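To make the two orderings concrete, here is a toy model in plain Ruby. It is only a sketch: required_first and defaults_first are hypothetical helpers written for this post, not the actual implementation of json_schemer, ajv, or jsonschema, and they handle only the default, properties, and required keywords.

```ruby
# Toy model: neither method is json_schemer's or ajv's actual implementation.
TEST_SCHEMA = {
  "type" => "object",
  "properties" => {
    "config_section" => {
      "type" => "object", "default" => {},
      "properties" => {
        "setting1_with_default" => { "type" => "string", "default" => "default_for_setting1" },
        "setting2_required_no_default" => { "type" => "boolean" },
        "nested_object" => {
          "type" => "object", "default" => {},
          "properties" => {
            "deep_setting_with_default" => { "type" => "integer", "default" => 42 },
            "deep_setting_required_no_default" => { "type" => "string" }
          },
          "required" => ["deep_setting_required_no_default"]
        }
      },
      "required" => ["setting2_required_no_default", "nested_object"]
    },
    "top_level_prop_with_default" => { "type" => "string", "default" => "default_for_top_level" }
  }
}.freeze

# Insert a deep copy of the property's default when the key is absent.
def insert_default(sub_schema, data, key)
  return if data.key?(key) || !sub_schema.key?("default")
  data[key] = Marshal.load(Marshal.dump(sub_schema["default"]))
end

def missing_required(schema, data)
  schema.fetch("required", []).reject { |key| data.key?(key) }
end

# Ordering A: check `required` on an object before filling defaults inside it.
def required_first(schema, data, path = "", errors = [])
  missing = missing_required(schema, data)
  unless missing.empty?
    errors << { path: path, missing: missing }
    return errors # stop here: nothing inside this object gets defaulted
  end
  schema.fetch("properties", {}).each do |key, sub|
    insert_default(sub, data, key)
    required_first(sub, data[key], "#{path}/#{key}", errors) if data[key].is_a?(Hash)
  end
  errors
end

# Ordering B: fill defaults all the way down, then check `required`.
def defaults_first(schema, data, path = "", errors = [])
  schema.fetch("properties", {}).each do |key, sub|
    insert_default(sub, data, key)
    defaults_first(sub, data[key], "#{path}/#{key}", errors) if data[key].is_a?(Hash)
  end
  missing = missing_required(schema, data)
  errors << { path: path, missing: missing } unless missing.empty?
  errors
end

required_first_data = {}
required_first_errors = required_first(TEST_SCHEMA, required_first_data)
p required_first_data, required_first_errors

defaults_first_data = {}
defaults_first_errors = defaults_first(TEST_SCHEMA, defaults_first_data)
p defaults_first_data, defaults_first_errors
```

Run against an empty input, the first ordering leaves config_section as an empty hash with a single required error, while the second fills in setting1_with_default and nested_object before reporting what is still missing.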
Our Solution
We're adopting a "Deep Merge" strategy:
- Generate Full Defaults: Traverse our schema to build a complete Ruby hash with all defaults applied
- Load User Config: Read the sparse config.yaml
- Deep Merge: User settings override defaults, but all defaults are present
- Validate: Pass the complete hash to json_schemer for type/format/constraint validation
This ensures predictable, fully defaulted configuration regardless of how sparse the user's input is.
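The first three steps can be sketched in plain Ruby roughly as follows. This is a minimal illustration under stated assumptions, not our production code: schema_defaults and deep_merge are hypothetical helper names, the schema is a trimmed version of the test schema, the YAML is inlined instead of read from config.yaml, and the walker only understands properties and default (no $ref, arrays, or combinators). The final validation step with json_schemer is left out.

```ruby
require "yaml"

# Hypothetical schema, trimmed from the test schema used in the experiment.
CONFIG_SCHEMA = {
  "type" => "object",
  "properties" => {
    "config_section" => {
      "type" => "object", "default" => {},
      "properties" => {
        "setting1_with_default" => { "type" => "string", "default" => "default_for_setting1" },
        "setting2_required_no_default" => { "type" => "boolean" },
        "nested_object" => {
          "type" => "object", "default" => {},
          "properties" => {
            "deep_setting_with_default" => { "type" => "integer", "default" => 42 },
            "deep_setting_required_no_default" => { "type" => "string" }
          }
        }
      }
    },
    "top_level_prop_with_default" => { "type" => "string", "default" => "default_for_top_level" }
  }
}.freeze

# Step 1: walk the schema and collect every declared default into a hash.
# Objects are rebuilt from their children; a leaf without a default is skipped
# (so is a leaf whose default is null, a simplification of this sketch).
def schema_defaults(schema)
  props = schema["properties"]
  return schema["default"] if props.nil?

  props.each_with_object({}) do |(key, sub), out|
    value = schema_defaults(sub)
    out[key] = value unless value.nil?
  end
end

# Step 3: recursive merge in which the user's value wins on conflicts.
def deep_merge(base, override)
  base.merge(override) do |_key, old, new|
    old.is_a?(Hash) && new.is_a?(Hash) ? deep_merge(old, new) : new
  end
end

# Step 2: a sparse user config (inlined here; normally read from config.yaml).
user_yaml = <<~CONFIG
  config_section:
    setting2_required_no_default: true
    nested_object:
      deep_setting_required_no_default: "from user"
CONFIG
user_config = YAML.safe_load(user_yaml) || {} # an empty file parses to nil

config = deep_merge(schema_defaults(CONFIG_SCHEMA), user_config)
p config
# Step 4 (not shown): pass `config` to json_schemer for validation.
```

Because the defaults hash is built independently of validation, a missing required property can no longer block defaults elsewhere in the same object; it simply surfaces as an error in the final validation pass.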
Appendix: Test Scripts
For those interested in replicating these tests, here are the test scripts.
Ruby (json_schemer_test.rb):
Node.js (ajv_test.js):
Python (python_jsonschema_test.py):
