The Perfect Configuration Format? Try Typescript
A lot of ink has been spilled about configuration file formats. Popular formats like JSON, TOML, YAML, and XML each have their advantages and drawbacks. There's an alternative that should be receiving more attention: Typescript!
Before getting into why Typescript is a compelling choice for config files, let's review the common file formats used in a typical web application today:
- XML: Back in the late 90s to mid 2000s this was king both as a messaging protocol and flat file format. It gets a bad rap, but there is a lot to like about XML as a config file format, including great tooling support and a robust schema definition language in XSD.
- JSON: Extremely common as a data interchange format in REST APIs, and also very popular as a config file format. For a lot of developers this strikes the right balance of feature-set and simplicity.
- YAML: This format is packed full of features that I've never needed to use. Due to its focus on terseness, I find its features to be less discoverable versus something more verbose like XML. For me, it's the Perl of configuration formats.
- TOML: Short for Tom's Obvious, Minimal Language - this format doesn't go overboard on features while still being more terse than JSON or XML. However, the hitchdev blog makes a compelling argument about difficulties with using TOML for large config files.
Here's where I think Typescript shines:
My biggest criticism of JSON is that it doesn't support comments.
In 2012, Douglas Crockford, the creator of JSON, had an interesting reason for not supporting comments in his original spec:
I removed comments from JSON because I saw people were using them to hold parsing directives, a practice which would have destroyed interoperability. I know that the lack of comments makes some people sad, but it shouldn't.
Suppose you are using JSON to keep configuration files, which you would like to annotate. Go ahead and insert all the comments you like. Then pipe it through JSMin before handing it to your JSON parser.
It's ten years later, and unsurprisingly, generating JSON config files from some intermediate “annotated” file never caught on. What is surprising is that support for comments never made it into subsequent revisions of the JSON spec.
Since the release of Crockford's original spec, JSON was codified in RFC 4627 and revised in RFC 7159 and RFC 8259. The latest RFC was published in 2017, and its most notable change was requiring that JSON exchanged between systems be encoded in UTF-8, so it seems unlikely that comment support will ever be added.
Keeping the JSON spec simple has a lot of benefits, including backwards compatibility and helping avoid the various security issues that come along with a more powerful, but complex, spec. However, think of the number of applications that would have benefited from the ability to add comments. Wouldn't the benefits in readability outweigh any concerns around overloading comments for unintended use-cases?
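For comparison, a Typescript config file accepts both line comments and block comments with no preprocessing step. Here's a minimal sketch; the property names and values are hypothetical and chosen only for illustration:

```typescript
// Hypothetical config values, for illustration only.
const commentedConfig = {
  // Line comments can explain why a value was chosen.
  requestTimeoutMs: 30000,

  /*
   * Block comments work too, which is handy for longer notes
   * about a setting.
   */
  retryCount: 3,

  // Settings can also be disabled temporarily without deleting them:
  // verboseLogging: true,
}
```

The comments are discarded when the file is compiled, so they cost nothing at runtime.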
I love strongly-typed languages because they help me avoid mistakes. This is also why I like using configuration files that have a pre-defined schema. Instead of waiting until runtime to know that my config file is invalid, config files that ship with a schema allow me to see my mistakes inline within my code editor.
JSON supports schemas in the form of JSON Schema. JSON Schema is a bit of a misnomer, since it can also be used to validate other file formats, including TOML and YAML. While JSON Schema doesn't feel nearly as bloated as XML's schema equivalents of DTD and XSD, it's not something that I find easy to add to a config file in my own applications.
Here's an example JSON Schema for a config file with four properties:
{
  "$schema": "https://json-schema.org/draft/2019-09/schema",
  "title": "MyAppConfig",
  "type": "object",
  "properties": {
    "locale": {
      "type": "string"
    },
    "timezone": {
      "type": "string"
    },
    "logLevel": {
      "type": "string",
      "enum": ["trace", "debug", "info", "warn", "error", "fatal"]
    },
    "environment": {
      "type": "string",
      "enum": ["local", "dev", "staging", "production"]
    }
  },
  "required": ["locale", "timezone", "logLevel", "environment"]
}
Typescript gives you this same ability to write a schema for your configuration file in a language that's more concise, easier to read, and that a lot of developers are already familiar with.
Here's that same “schema” in Typescript:
interface MyAppConfig {
  locale: string,
  timezone: string,
  logLevel: 'trace' | 'debug' | 'info' | 'warn' | 'error' | 'fatal',
  environment: 'local' | 'dev' | 'staging' | 'production',
}
And here's a valid configuration using this "schema":
export const config: MyAppConfig = {
  locale: 'en-US',
  timezone: 'America/New_York',
  logLevel: 'info',
  environment: 'local',
}
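To show the kind of inline feedback this gives you, here's a sketch of an invalid config. The '@ts-expect-error' directive tells the compiler that the next line is expected to fail type checking; the value 'verbose' is a made-up typo used only for illustration:

```typescript
interface MyAppConfig {
  locale: string,
  timezone: string,
  logLevel: 'trace' | 'debug' | 'info' | 'warn' | 'error' | 'fatal',
  environment: 'local' | 'dev' | 'staging' | 'production',
}

const invalidConfig: MyAppConfig = {
  locale: 'en-US',
  timezone: 'America/New_York',
  // @ts-expect-error: 'verbose' is not one of the allowed logLevel values
  logLevel: 'verbose',
  environment: 'local',
}
```

Without the directive, the editor underlines 'logLevel' and the build fails. Keep in mind that types are erased at runtime, so this check only helps when the config actually goes through the compiler.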
It's difficult for a configuration file format to include features that keep the config DRY without also making the config spec complex. Typescript has great facilities for splitting configs into reusable segments.
Consider the common scenario of having separate config files for different environments in your application (e.g. a separate configuration file for your local environment vs staging vs production). Each configuration file will likely have a set of values that are common across all environments, as well as values that change with each environment.
Using Typescript's "Omit" Utility Type, we can define a single base config that's shared across all environments:
export const baseConfig: Omit<MyAppConfig, 'environment' | 'logLevel'> = {
  'locale': 'en-US',
  'timezone': 'America/New_York',
}
...and define per-environment values separately:
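// The Pick annotation keeps the literal types; without it, 'debug' would widen to string.
const localConfig: Pick<MyAppConfig, 'environment' | 'logLevel'> = {
  'logLevel': 'debug',
  'environment': 'local',
}
// Spread both objects so their properties are merged into one config.
export const config: MyAppConfig = { ...baseConfig, ...localConfig }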
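Putting the pieces together, here's a self-contained sketch of the same layering applied to two environments. The 'EnvironmentOverrides' alias and the production values are assumptions added for illustration:

```typescript
interface MyAppConfig {
  locale: string,
  timezone: string,
  logLevel: 'trace' | 'debug' | 'info' | 'warn' | 'error' | 'fatal',
  environment: 'local' | 'dev' | 'staging' | 'production',
}

// Values shared by every environment.
const baseConfig: Omit<MyAppConfig, 'environment' | 'logLevel'> = {
  locale: 'en-US',
  timezone: 'America/New_York',
}

// Hypothetical alias for the properties that vary per environment.
type EnvironmentOverrides = Pick<MyAppConfig, 'environment' | 'logLevel'>

const localConfig: EnvironmentOverrides = {
  logLevel: 'debug',
  environment: 'local',
}

// Hypothetical production values, chosen only for this example.
const productionConfig: EnvironmentOverrides = {
  logLevel: 'warn',
  environment: 'production',
}

// Later spreads win, so environment values are layered on top of the base.
const localAppConfig: MyAppConfig = { ...baseConfig, ...localConfig }
const productionAppConfig: MyAppConfig = { ...baseConfig, ...productionConfig }
```

If a property is missing from both halves, the assignment to 'MyAppConfig' fails to compile, so an incomplete environment can't slip through.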
Eventually the list of omitted properties would become unwieldy, so for larger config files, we can define an interface explicitly for the base configuration and use Intersection Types to construct the final config schema:
interface BaseConfig {
  locale: string,
  timezone: string
}
interface EnvironmentSpecificConfig {
  logLevel: 'trace' | 'debug' | 'info' | 'warn' | 'error' | 'fatal',
  environment: 'local' | 'dev' | 'staging' | 'production',
}
type MyAppConfig = BaseConfig & EnvironmentSpecificConfig
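As a rough sketch of how these types might be used, the example below builds a final config from the two halves. The 'buildConfig' helper and the staging values are hypothetical and not part of any library:

```typescript
interface BaseConfig {
  locale: string,
  timezone: string
}

interface EnvironmentSpecificConfig {
  logLevel: 'trace' | 'debug' | 'info' | 'warn' | 'error' | 'fatal',
  environment: 'local' | 'dev' | 'staging' | 'production',
}

type MyAppConfig = BaseConfig & EnvironmentSpecificConfig

const baseConfig: BaseConfig = {
  locale: 'en-US',
  timezone: 'America/New_York',
}

// Hypothetical helper: merges the shared base with one environment's values.
function buildConfig(environmentConfig: EnvironmentSpecificConfig): MyAppConfig {
  return { ...baseConfig, ...environmentConfig }
}

const stagingConfig = buildConfig({ logLevel: 'info', environment: 'staging' })
```

Each interface can now grow on its own, and the intersection type picks up new properties without a list of omitted keys to maintain.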
One clear benefit of JSON is the simplicity of its spec.
We know from XML that benign behavior defined in the spec can be a landmine for parser implementers. The Billion Laughs attack exploited XML's support for nested entity definitions (declared in a document's DTD). In an XML file of fewer than 20 lines, a vulnerable XML parser would end up expanding roughly a billion entity references and run out of memory.
We're talking about config files here and not data interchange formats where you need to defend against a malicious sender. Nevertheless, preventing footguns is important.
In Typescript, we can make the config immutable by using the 'as const' construct. Appending this to our config definition marks all of the properties within the config as 'readonly', so the compiler rejects any code that tries to change a configuration value. One caveat: an explicit ': MyAppConfig' annotation on the variable would override the inferred readonly type, so the example below checks the shape with the 'satisfies' operator (Typescript 4.9 or later) instead:
export const config = {
  'locale': 'en-US',
  'timezone': 'America/New_York',
  'logLevel': 'info',
  'environment': 'local',
} as const satisfies MyAppConfig
// Compile error: Cannot assign to 'environment' because it is a read-only property.
config.environment = 'production'
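Because 'as const' is enforced only by the compiler, code that bypasses the type system can still mutate the object at runtime. If that matters for your application, one option is to pair the readonly type with Object.freeze. This is a minimal sketch of that idea rather than a required part of the approach:

```typescript
interface MyAppConfig {
  locale: string,
  timezone: string,
  logLevel: 'trace' | 'debug' | 'info' | 'warn' | 'error' | 'fatal',
  environment: 'local' | 'dev' | 'staging' | 'production',
}

const mutableConfig: MyAppConfig = {
  locale: 'en-US',
  timezone: 'America/New_York',
  logLevel: 'info',
  environment: 'local',
}

// Object.freeze returns Readonly<MyAppConfig>, so writes are rejected by the
// compiler and by the runtime.
const frozenConfig: Readonly<MyAppConfig> = Object.freeze(mutableConfig)

// Casting away Readonly gets past the compiler but not the runtime: the write
// throws a TypeError in strict mode and is silently ignored otherwise.
try {
  (frozenConfig as MyAppConfig).environment = 'production'
} catch (error) {
  // Expected in strict mode; the value is unchanged either way.
}
```

Note that Object.freeze is shallow, so nested objects inside a larger config would each need to be frozen as well.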
To use this approach today requires that your project also be written in Typescript. It also means you'll need to forgo a separate flat-file for your configuration. Given that Typescript configs can be made immutable to avoid footguns, and given the other benefits I outlined here, I think the trade-off vs standalone flat-files is worth it.
In the future, I think a successor to JSON which uses a subset of Typescript for schema definitions would be pretty amazing. This could stand alone as its own format that's parsed directly, or the Typescript compiler could parse Typescript-based configs and emit valid JSON.
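The second option can be approximated today with a small build step: let the compiler type-check the config, then serialize the resulting object. The sketch below shows only the serialization; writing the text out to a file is left to whatever build tooling the project already uses:

```typescript
interface MyAppConfig {
  locale: string,
  timezone: string,
  logLevel: 'trace' | 'debug' | 'info' | 'warn' | 'error' | 'fatal',
  environment: 'local' | 'dev' | 'staging' | 'production',
}

// Type-checked at compile time, comments and all.
const config: MyAppConfig = {
  locale: 'en-US',
  timezone: 'America/New_York',
  logLevel: 'info', // comments like this one never reach the JSON output
  environment: 'local',
}

// Plain JSON text that any JSON parser can consume.
const configJson = JSON.stringify(config, null, 2)
```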
The latter approach would end up looking a lot like Douglas Crockford's proposal back in 2012, but instead of using JSMin we'd be using tsc. Maybe he was right all along.