Most formatting languages display data in a non-human readable format. Even JSON, the most popular data format in use, has poor code readability.
YAML is an alternative to JSON that formats data in a natural, easy-to-read, and concise manner.
This article will introduce you to the YAML markup language. We cover the basic concepts behind this markup language, explain its key features, and show what YAML offers to DevOps teams.
What is YAML?
YAML is a data serialization language. Back when it came out in 2001, YAML stood for “Yet Another Markup Language.” The acronym was later changed to “YAML Ain’t Markup Language” to emphasize that the language is intended for data and not documents.
It is not a programming language in the true sense of the word. YAML files store information, so they do not include actions and decisions.
Unlike XML or JSON, YAML presents data in a way that makes it easy for a human to read. The simple syntax does not impact the language’s capabilities. Any data or structure added to an XML or JSON file can also be stored in YAML.
Besides human-readable code, YAML also features:
- Cross-language data portability
- A consistent data model
- One-pass processing
- Ease of implementation and use
Users can write code for reading and generating YAML in any programming language. The extensions in YAML are .yaml and .yml. Both extensions stand for the same file type.
Yaml Features
YAML has several features that make it an excellent option for data formatting.
Multi-Document Support
Users can add multiple documents to a single YAML file. Separate different documents with three dashes (---
), like this:
--- time: 19:04:12 player: playerOne action: strike (miss) --- time: 20:03:47 player: playerTwo action: strike (hit) ...
Three dots (“…“) mark the end of a document without starting a new one.
Built-In Comments
YAML allows users to add comments to their code. YAML comments start with the #
symbol and do not have to be on a separate line:
key: #This is a single line comment - value line 10 #This is a #multi-line comment - value line 20
Clean Syntax
Like Python, YAML relies on indentations to show the levels and structure in the data. There are no usual format symbols, such as braces, square brackets, closing tags, or quote marks. The syntax is clean and easy to scan through.
The clean syntax is why several popular tools rely on YAML, such as Ansible, Kubernetes, and OpenStack.
No Tabs
YAML does not allow tabs. Spaces are the only way to achieve indentation.
It is good practice to display whitespace characters in your text editor to prevent accidental uses of tabs.
Precise Feedback
YAML feedback refers to specific lines in the file. You can quickly find and fix errors when you know where to look.
Support for Complex Structures
YAML provides the ability to reference other data objects. With referencing, you can write recursive data in the YAML file and build advanced data structures.
Explicit Data Types with Tags
YAML auto-detects the type of data, but users are free to specify the type they need. To specify the type of data, you include a “!!” symbol:
# The value should be an int: is-an-int: !!int 5.6 # Turn any value to a string: is-a-str: !!str 90.88 # The next value should be a boolean: is-a-bool: !!bool yes
No Executable Commands
YAML is a data-representation format. There are no executable commands, which makes the language highly secure when exchanging files with third parties.
If a user wishes to add an executable command, YAML must be integrated with other languages. Add Perl parsers, for example, to enable Perl code execution.
How YAML Works
YAML matches native data structures of agile methodology and its languages, such as Perl, Python, PHP, Ruby, and JavaScript. It also derives features from other languages:
- Scalars, lists, and arrays come from Perl.
- The three-dash separator comes from MIME.
- Whitespace wrapping comes from HTML.
- Escape sequences come from C.
YAML supports all essential data types, including nulls, numbers, strings, arrays, and maps. It recognizes some language-specific data types, such as dates, timestamps, and special numerical values.
Colon and a single space define a scalar (or a variable):
string: "17" integer: 17 float: 17.0 boolean: No
A |
character denotes a string that preserves newlines and a >
character denotes a string that folds newlines:
data: | Every one Of these Newlines Will be Broken up. data: > This text is wrapped and will be formed into a single paragraph.
Basics aside, there are two vital types of structures you need to know about in YAML:
- YAML lists
- YAML maps
Use these two structures for formatting in YAML.
YAML Maps (With Examples)
Maps associate name-value pairs, a vital aspect of setting up data. A YAML configuration file can start like this:
--- apiVersion: v3 kind: Pod
Here is the JSON equivalent of the same file opening:
{ "apiVersion": "v3", "kind": "Pod" }
Both codes have two values, v3
and Pod
, mapped to two keys, apiVersion
and kind
. In YAML, the quotation marks are optional, and there are no brackets.
This markup language allows you to specify more complex structures by creating a key that maps to another map rather than a string. See the YAML example below:
--- apiVersion: v3 kind: Pod metadata: name: rss-site labels: app: web
We have a key (metadata
) with two other keys as its value name
and labels
. The labels
key has another map as its value. YAML allows you to nest maps as far as you need to.
The number of spaces does not matter, but it must be consistent throughout the file. In our example, we used two spaces for readability. Name
and labels
have the same indentation level, so the processor knows both are part of the same map.
The same mapping would look like this in JSON:
{ "apiVersion": "v3", "kind": "Pod", "metadata": { "name": "rss-site", "labels": { "app": "web" } } }
YAML Lists (With Examples)
A YAML list is a sequence of items. For example:
args: - shutdown - "1000" - msg - "Restart the system"
A list may contain any number of items. An item starts with a dash, while indentation separates it from the parent. You can also store maps within a list:
--- apiVersion: v3 kind: Pod metadata: name: rss-site labels: app: web spec: containers: - name: front-end image: nginx ports: - containerPort: 80 - name: rss-reader image: nickchase/rss-php-nginx:v1 ports: - containerPort: 88
We have a list of containers (objects). Each consists of a name, an image, and a list of ports. Each item under ports is a map that lists the containerPort
and its value.
Our example would look like this in JSON:
{ “apiVersion”: “v3”, “kind”: “Pod”, “metadata”: { “name”: “rss-site”, “labels”: { “app”: “web” } }, “spec”: { “containers”: [{ “name”: “front-end”, “image”: “nginx”, “ports”: [{ “containerPort”: “80” }] }, { “name”: “rss-reader”, “image”: “nickchase/rss-php-nginx:v1”, “ports”: [{ “containerPort”: “88” }] }] } }
What is the difference between YAML and JSON?
JSON and YAML are used interchangeably, and they serve the same purpose. However, there are significant differences between the two:
YAML | JSON |
---|---|
Easy for a human to read | Hard for a human to read |
Allows comments | No comments |
Space characters determine hierarchy | Brackets and braces denote arrays and objects |
String quotes support single and double quotes | Strings must be in double quotes |
The root node can be any of the valid data types | The root node is either an object or an array |
The main difference between YAML and JSON is code readability. The best example of this is the official YAML homepage. That website is itself valid YAML, yet it is easy for a human to read.
YAML is a superset of JSON. If you paste JSON directly into a YAML file, it resolves the same through YAML parsers. Users can also convert most documents between the two formats. It is possible to convert JSON files into YAML either online or use a tool like Syck or XS.
YAML in IaC
YAML is a common option when writing configuration files for Infrastructure as Code. These files store parameters and settings for the desired cloud environment.
Red Hat’s Ansible, one of the most popular IaC tools, uses YAML for file management. Ansible users create so-called playbooks written in YAML code that automate manual tasks of provisioning and deploying a cloud environment.
In the example below, we define an Ansible playbook verify-apache.yml:
--- - hosts: webservers vars: http_port: 90 max_clients: 250 remote_user: root tasks: - name: ensure apache is at the latest version yum: name: httpd state: latest - name: write the apache config file template: src: /srv/httpd.j2 dest: /etc/httpd.conf notify: - restart apache - name: ensure apache is running service: name: httpd state: started handlers: - name: restart apache service: name: httpd state: restarted
There are three tasks in this YAML playbook:
- We update Apache to the latest version using the
yum
command. - We use a template to copy the Apache configuration file. The playbook then restarts the Apache service.
- We start the Apache service.
Once set, a playbook is run from the command line. While the path varies based on the setup, the following command runs the playbook:
ansible-playbook -i hosts/groups verify_apache.yml
The Use of YAML in DevOps
Many DevOps teams define their development pipelines using YAML. YAML allows users to approach pipeline features like a markup file and manage them as any source file. Pipelines are versioned with the code, so teams can identify issues and roll back changes quickly.
To add a YAML build definition, a developer adds a source file to the root of the repository.
Thanks to YAML, DevOps separate logic from the configuration. That way, the configuration code follows best practices, such as:
- No hard-coded strings.
- Methods that perform one function and one function only.
- Testable code.
Several tools with a prominent role in DevOps rely on YAML:
- Azure DevOps provides a YAML designer to simplify defining build and release tasks.
- Kubernetes uses YAML to create storage and lightweight Linux virtual machines.
- Docker features YAML files called Dockerfiles. Dockerfiles are blueprints for everything you need to run software, including codes, runtime, tools, settings, and libraries.
An Efficient, User-Friendly Data Formatting Language
YAML offers levels of code readability other data-formatting languages cannot deliver. YAML also allows users to perform more operations with less code, making it an ideal option for DevOps teams that wish to speed up their delivery cycles.