JSON Trouble on the Default-lines
Note: I leaned heavily on Google Gemini for this investigation, blog post, and the necessary blog post illustration.
We've been refining our configuration system using JSON schemas as the source of truth. The goal: let users provide a sparse config.yaml and have our application fill in all the defaults automatically. What we discovered about how libraries in different languages handle JSON Schema defaults was illuminating.
The Expectation
The main library we're using is the json_schemer gem with its insert_property_defaults: true option. We considered the more popular json-schema gem (500+ million downloads vs json_schemer's 80 million), but it's stuck on JSON Schema draft-05 and no longer actively maintained. Multiple GitHub issues confirm draft-06+ support remains incomplete despite years of requests.
We expected json_schemer to take our JSON schema (complete with default keywords) and a minimal user config, then produce a fully populated configuration hash. For example, if our schema defined a site section with host defaulting to "localhost:3000", even an empty user config should result in site: { host: "localhost:3000", ... }.
The Problem
Simple top-level defaults worked. Defaults within existing, valid config sections worked. But the deeper "scaffolding" – creating missing nested objects and filling their defaults – didn't happen comprehensively.
The issue is the interaction between default: {} (indicating an object should be created if missing) and the required keyword. If an object was created (e.g., site: {}) but immediately failed validation because required properties were missing, json_schemer stopped filling defaults for properties within that invalid object.
The Experiment
To determine if this was json_schemer-specific, we tested across three environments:
- Ruby with json_schemer
- Node.js with ajv
- Python with jsonschema
Test schema structure: nested objects with default: {} and required fields at multiple levels.
{
  "type": "object",
  "properties": {
    "config_section": {
      "type": "object",
      "default": {},
      "properties": {
        "setting1_with_default": { "type": "string", "default": "default_for_setting1" },
        "setting2_required_no_default": { "type": "boolean" },
        "nested_object": {
          "type": "object",
          "default": {},
          "properties": {
            "deep_setting_with_default": { "type": "integer", "default": 42 },
            "deep_setting_required_no_default": { "type": "string" }
          },
          "required": ["deep_setting_required_no_default"]
        }
      },
      "required": ["setting2_required_no_default", "nested_object"]
    },
    "top_level_prop_with_default": {
      "type": "string",
      "default": "default_for_top_level"
    }
  }
}
Ruby Results
With empty input:
Data AFTER validation:
{"config_section" => {}, "top_level_prop_with_default" => "default_for_top_level"}
Validation FAILED. Errors:
1. Path: /config_section, Error: required, Details: {"missing_keys" => ["setting2_required_no_default", "nested_object"]}
config_section was created as {}, but setting1_with_default and deep_setting_with_default were not applied. The required failure stopped the cascade.
Node.js Results
ajv with useDefaults: true behaved differently:
Data AFTER validation:
{
  config_section: {
    setting1_with_default: 'default_for_setting1', // Applied!
    nested_object: { deep_setting_with_default: 42 } // Applied!
  },
  top_level_prop_with_default: 'default_for_top_level'
}
Validation FAILED. Errors: [required field errors]
ajv scaffolded the nested structure with defaults first, then reported validation errors.
Python Results
With a custom validator extension, Python's jsonschema mirrored ajv's behavior: defaults were applied before required errors were flagged.
The Difference
json_schemer: Validates required constraints before filling defaults within invalid objects.
ajv / Python: Apply defaults globally first, then validate.
Both approaches are valid per the JSON Schema specification, but json_schemer's behavior meant our strategy wouldn't work.
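To make the "defaults first, then validate" ordering concrete, here is a toy Ruby sketch run against the test schema above. It is not how ajv, jsonschema, or json_schemer are implemented; fill_defaults and required_errors are hypothetical helpers written only for illustration, and they handle nothing beyond properties, default, and required.

```ruby
require "json"

# The test schema from this post, parsed into a Ruby hash.
TEST_SCHEMA = JSON.parse(<<~SCHEMA)
  {
    "type": "object",
    "properties": {
      "config_section": {
        "type": "object",
        "default": {},
        "properties": {
          "setting1_with_default": { "type": "string", "default": "default_for_setting1" },
          "setting2_required_no_default": { "type": "boolean" },
          "nested_object": {
            "type": "object",
            "default": {},
            "properties": {
              "deep_setting_with_default": { "type": "integer", "default": 42 },
              "deep_setting_required_no_default": { "type": "string" }
            },
            "required": ["deep_setting_required_no_default"]
          }
        },
        "required": ["setting2_required_no_default", "nested_object"]
      },
      "top_level_prop_with_default": { "type": "string", "default": "default_for_top_level" }
    }
  }
SCHEMA

# Hypothetical helper: insert every missing default, descending into
# objects (including ones just created from "default": {}), and never
# looking at `required` while doing so.
def fill_defaults(schema, data)
  return data unless data.is_a?(Hash)

  schema.fetch("properties", {}).each do |key, sub|
    if !data.key?(key) && sub.key?("default")
      # Deep-copy so the schema's default object is never mutated.
      data[key] = Marshal.load(Marshal.dump(sub["default"]))
    end
    fill_defaults(sub, data[key]) if data.key?(key)
  end
  data
end

# Hypothetical helper: a separate pass that collects [path, missing_keys]
# for every object whose `required` list is not satisfied.
def required_errors(schema, data, path = "")
  return [] unless data.is_a?(Hash)

  errors = []
  missing = schema.fetch("required", []).reject { |key| data.key?(key) }
  errors << [path, missing] unless missing.empty?
  schema.fetch("properties", {}).each do |key, sub|
    errors.concat(required_errors(sub, data[key], "#{path}/#{key}")) if data.key?(key)
  end
  errors
end

filled = fill_defaults(TEST_SCHEMA, {})
puts JSON.pretty_generate(filled)
required_errors(TEST_SCHEMA, filled).each do |path, keys|
  puts "required failed at #{path}: #{keys.join(', ')}"
end
```

With empty input, the filled hash has the same shape as the ajv result above: the defaults are all present, and the missing required fields are only reported afterwards.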
Our Solution
We're adopting a "Deep Merge" strategy:
- Generate Full Defaults: Traverse our schema to build a complete Ruby hash with all defaults applied
- Load User Config: Read the sparse config.yaml
- Deep Merge: User settings override defaults, but all defaults are present
- Validate: Pass the complete hash to json_schemer for type/format/constraint validation
This ensures predictable, fully defaulted configuration regardless of how sparse the user's input is.
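The first three steps can be sketched in plain Ruby as follows. This is a minimal sketch, not our production code: defaults_from_schema and deep_merge are hypothetical helper names, the title property is invented for the example, and the sketch assumes every object with properties should be scaffolded even when it has no default: {} of its own.

```ruby
require "yaml"

# Example schema based on the site/host case from earlier in the post.
# The "title" property is hypothetical, added only for illustration.
EXAMPLE_SCHEMA = {
  "type" => "object",
  "properties" => {
    "site" => {
      "type" => "object",
      "properties" => {
        "host"  => { "type" => "string", "default" => "localhost:3000" },
        "title" => { "type" => "string", "default" => "My Site" }
      }
    }
  }
}.freeze

# Step 1 (hypothetical helper): walk the schema and collect every default.
# Objects with "properties" always become a hash; leaves contribute their
# "default" if they have one and are skipped otherwise.
def defaults_from_schema(schema)
  props = schema["properties"]
  return schema["default"] unless props

  props.each_with_object({}) do |(key, sub), result|
    value = defaults_from_schema(sub)
    result[key] = value unless value.nil?
  end
end

# Step 3 (hypothetical helper): recursive merge in which user values win,
# and nested hashes are merged key by key instead of being replaced.
def deep_merge(defaults, overrides)
  defaults.merge(overrides) do |_key, default_value, user_value|
    if default_value.is_a?(Hash) && user_value.is_a?(Hash)
      deep_merge(default_value, user_value)
    else
      user_value
    end
  end
end

# Step 2: load the sparse user config. An inline string stands in for
# config.yaml here; an empty file parses to nil or false, hence the `|| {}`.
user_config = YAML.safe_load("site:\n  title: Docs\n") || {}

config = deep_merge(defaults_from_schema(EXAMPLE_SCHEMA), user_config)
p config

# Step 4 (not run here): hand `config` to json_schemer for validation only,
# with no default insertion needed at that point.
```

Because the defaults are built before any validation runs, a missing required field can no longer prevent sibling defaults from being filled in; it simply shows up as a validation error in the last step.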
Appendix: Test Scripts
For those interested in replicating these tests, here are the test scripts.
Ruby (json_schemer_test.rb):
Node.js (ajv_test.js):
Python (python_jsonschema_test.py):