Something I miss after emigrating from the UK to the USA is using the name of a yeast-extract-based spread as an adjective. To describe something as “marmite” is to indicate that people either “love it” or “hate it”, and if you ask anyone from the UK, Australia or New Zealand how they feel about Marmite, Vegemite or its cousins, you’ll see visceral reactions on both sides of the marmite spectrum.
YAML is the “marmite” of infrastructure as code. If you ask a software engineer or DevOps practitioner what they think of YAML, they may tell you they write their entire production infrastructure in thousands of lines of YAML, or they could claw out their eyes and run screaming from the room. There doesn’t seem to be much of a middle ground. If you’ve read this blog before, even as recently as this week, you’ll be aware I fall firmly on the “hate it” side of what I’m going to call “the marmite spectrum”.
The biggest problem I have with YAML is not the language itself, but the way it’s shoehorned into situations it has no reasonable right to be involved in. One of those situations is complex infrastructure as code definitions.
There are a multitude of infrastructure as code tools out there that will allow you to use YAML or other configuration formats to describe infrastructure as code, so Pulumi adding support for YAML as a language came as a surprise to many.
When our CTO, Luke Hoban, told us all we were adding YAML to the roadmap, I had my own doubts.
So why am I now writing a blog post talking about me learning to love YAML? Let’s talk about it.
Let’s talk about the YAML
Pulumi has long been the refuge for people not wanting to use YAML in their infrastructure definitions. Our marketing content was focused entirely on the idea that you could use “familiar” or general-purpose, expressive languages to define your infrastructure. I’ve talked with hundreds of users who repeatedly told me that not having YAML support was enlightening.
To understand why YAML is now a supported language, we first need to look at the problems we’re trying to solve, and those problems invariably come from our users or potential users.
The two main talking points we’re faced with during the Pulumi adoption or sales cycle, and in the infrastructure as code community, are related to the use of general-purpose languages. The first is that general-purpose languages aren’t right for infrastructure; the second is that general-purpose languages are too complex for the problem at hand.
The abstraction argument
The abstraction argument goes a little bit like this:
Software developers know nothing about infrastructure, and when they write infrastructure as code in the same language they’re writing their applications in, they make it really complex. I then have to fix it, and that’s really, really hard.
Let’s put aside for this post my intense frustration with the ivory tower, “I’m better than you because I understand the magical incantation of IAM roles” bullshit this is and focus more on the argument itself.
This line of thinking often continues with the idea that configuration languages are the perfect antidote to this monstrous complexity. I agree that configuration languages provide guard rails against complexity, but this entire world view ignores one truth.
Whether we like it or not, infrastructure is complex.
We all know one of those people who’ll tell you that the answer to all infrastructure problems is a bunch of EC2 instances, a golden AMI and a load balancer. Those people might be right, but if you take a look at your infrastructure right now, could you solve the problems in your organization by going back to building AMIs and sticking them behind a load balancer? Even if you can, do you really want to? No, I thought not.
If you’re using a configuration language to define your infrastructure, you’ve no doubt already run into this. We can see this fait accompli by watching the evolution of HCL as a language. HCL started as a simple mechanism to express JSON files, and now you can define abstractions (modules), use conditionals (sort of, although if you want optional parts of your infrastructure, you’ll need to abuse the count option) and leverage loops. It’s further apparent in Helm, which uses Go templates to allow you to express the complexity that inherently exists in Kubernetes deployments.
Writing Terraform or Helm charts can leave you in a weird twilight zone where you feel like you’re writing software but you’re just not quite there. Don’t believe me? Take a look at the AWS Transit Gateway Module for Terraform. It has code like this in there:
locals {
  # List of maps with key and route values
  vpc_attachments_with_routes = chunklist(flatten([
    for k, v in var.vpc_attachments : setproduct([{ key = k }], v.tgw_routes) if var.create_tgw && can(v.tgw_routes)
  ]), 2)

  tgw_default_route_table_tags_merged = merge(
    var.tags,
    { Name = var.name },
    var.tgw_default_route_table_tags,
  )

  vpc_route_table_destination_cidr = flatten([
    for k, v in var.vpc_attachments : [
      for rtb_id in try(v.vpc_route_table_ids, []) : {
        rtb_id = rtb_id
        cidr   = v.tgw_destination_cidr
      }
    ]
  ])
}
Or perhaps this Helm chart defining a Prometheus Node Exporter for Kubernetes is more your style:
containers:
  - name: {{ template "prometheus.name" . }}-{{ .Values.nodeExporter.name }}
    image: "{{ .Values.nodeExporter.image.repository }}:{{ .Values.nodeExporter.image.tag }}"
    imagePullPolicy: "{{ .Values.nodeExporter.image.pullPolicy }}"
    args:
      - --path.procfs=/host/proc
      - --path.sysfs=/host/sys
    {{- if .Values.nodeExporter.hostRootfs }}
      - --path.rootfs=/host/root
    {{- end }}
    {{- if .Values.nodeExporter.hostNetwork }}
      - --web.listen-address=:{{ .Values.nodeExporter.service.hostPort }}
    {{- end }}
    {{- range $key, $value := .Values.nodeExporter.extraArgs }}
    {{- if $value }}
      - --{{ $key }}={{ $value }}
    {{- else }}
      - --{{ $key }}
    {{- end }}
    {{- end }}
    ports:
      - name: metrics
        {{- if .Values.nodeExporter.hostNetwork }}
        containerPort: {{ .Values.nodeExporter.service.hostPort }}
        {{- else }}
        containerPort: 9100
        {{- end }}
        hostPort: {{ .Values.nodeExporter.service.hostPort }}
You can argue (and many do) that these mechanisms are the perfect balance of flexibility and control. It is my opinion that they are just bolt-ons to configuration languages, added to bring them closer to programming languages and meet users where their needs are.
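For comparison, here is a rough sketch of what the args and ports logic in the Helm template above might look like as ordinary functions in a general-purpose language. This is not taken from the chart or from any Pulumi program; the function names (node_exporter_args, container_port) and their parameters are hypothetical, chosen only to mirror the values the template reads.

```python
from typing import Dict, List, Optional


def node_exporter_args(
    host_rootfs: bool,
    host_network: bool,
    host_port: int,
    extra_args: Dict[str, Optional[str]],
) -> List[str]:
    """Build the container args the way the template's if/range blocks do.

    Hypothetical helper: the parameters stand in for the
    .Values.nodeExporter.* settings referenced by the template.
    """
    args = ["--path.procfs=/host/proc", "--path.sysfs=/host/sys"]
    if host_rootfs:
        args.append("--path.rootfs=/host/root")
    if host_network:
        args.append(f"--web.listen-address=:{host_port}")
    # Go templates visit map keys in sorted order, so sort here to match.
    for key, value in sorted(extra_args.items()):
        # A non-empty value renders as --key=value, an empty one as a bare flag.
        args.append(f"--{key}={value}" if value else f"--{key}")
    return args


def container_port(host_network: bool, host_port: int) -> int:
    """Use the host port when host networking is on, otherwise 9100."""
    return host_port if host_network else 9100
```

Whether this reads as simpler or more complex than the template is, of course, exactly where people land on different sides of the marmite spectrum.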
The complexity argument
I’ve been very open about the fact that I don’t consider myself a talented software engineer. I’ve said before that I didn’t truly understand programming constructs like object orientation until I joined Pulumi. What I mean to say here is that I get this argument.
What I don’t understand about this argument is that people seem unwilling to admit that the world is fundamentally changing. It’s my opinion that the people who truly loathe Pulumi don’t want to admit they don’t understand the languages it supports very well. They’re worried that adopting Pulumi is going to put them out of a job. I could never prove this of course, but I believe it because I also believed it.
Here are the truths nobody wants to admit.
There are more “software engineers” than there are “infrastructure engineers” (or DevOps engineers, S