What is YAML?

TABLE OF CONTENTS

  • Why use YAML with Python?

  • Installing and importing PyYAML

  • Reading and parsing a YAML file with Python

  • Parsing YAML strings with Python

  • Parsing files with multiple YAML documents

  • Writing (or dumping) YAML to a file

YAML is a human-friendly data serialization language for all programming languages. YAML is most often used for configuration files, but it’s also used for data exchange.

YAML is easy for humans to read and write, even for non-programmers. At the same time, it’s easy to parse, especially with Python and the PyYAML library. This human-friendliness and readability are YAML’s biggest advantages over other formats, like JSON and XML.

These are the most prominent features of YAML:

  • You can use comments in YAML files

  • You can store multiple documents in one YAML file with the --- separator, a feature often used in Kubernetes definitions

  • It’s easy to read for humans

  • It’s easy to parse for computers

Why use YAML with Python?

If you ask me, YAML is perfect for configuration files. That’s exactly how I, and many other developers, use it the most. Others seem to agree, as many large projects, like Docker and Kubernetes, use YAML to define deployments. It has a richer syntax than the often used alternative, .ini files, but is still nice on the eyes and simple to write and parse.

There are some downsides to using YAML with Python, too:

  • YAML is not part of the standard Python library, while XML and JSON are

  • Its dependence on indentation is frustrating sometimes (however, Python developers are used to that, right?)

  • It’s perhaps too versatile for simple use cases, like data exchange of simple objects

If you’re looking for a suitable data format for data exchange and storage, I recommend JSON, XML, or other more efficient formats like protocol buffers and Avro.

Installing and importing PyYAML

Multiple Python packages can parse YAML data, but PyYAML is the most widely used one. PyYAML is not part of the standard Python library, meaning you must install it with pip. Use the following command to install PyYAML, preferably in a virtual environment:

    pip install pyyaml

On some systems, you need to use pip3:

    pip3 install pyyaml

To use PyYAML in your scripts, import the module as follows. Note that you don’t import pyyaml, but simply yaml:

    import yaml

Reading and parsing a YAML file with Python

Once we have the YAML parser imported, we can load a YAML file and parse it. YAML files usually carry the extension .yaml or .yml. Let’s work with the following example YAML file called config.yaml:

    rest:
      url: "https://example.org/primenumbers/v1"
      port: 8443

    prime_numbers: [2, 3, 5, 7, 11, 13, 17, 19]

Loading, parsing, and using this configuration file is similar to loading JSON with the Python JSON library. First, we open the file. Next, we parse it with the yaml.safe_load() function. Please note that I changed the output a little to make it more readable for you:

    >>> import yaml
    >>> with open('config.yaml', 'r') as file:
    ...     prime_service = yaml.safe_load(file)

    >>> prime_service
    {'rest':
      { 'url': 'https://example.org/primenumbers/v1',
        'port': 8443
      },
      'prime_numbers': [2, 3, 5, 7, 11, 13, 17, 19]}

    >>> prime_service['rest']['url']
    'https://example.org/primenumbers/v1'

The YAML parser returns a regular Python object that best fits the data. In this case, it’s a Python dictionary. This means all the regular dictionary features can be used, like using get() with a default value.
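
As a minimal sketch of that, the snippet below parses the same configuration from an inline string (so it runs without a file on disk) and reads values with get(). The timeout key and both fallback values are hypothetical, picked only to show what happens when a key is missing.

```python
import yaml

# Same content as config.yaml, inlined so the example is self-contained.
CONFIG_TEXT = """
rest:
  url: "https://example.org/primenumbers/v1"
  port: 8443

prime_numbers: [2, 3, 5, 7, 11, 13, 17, 19]
"""

prime_service = yaml.safe_load(CONFIG_TEXT)

# get() returns the stored value when the key exists...
rest = prime_service.get("rest", {})
port = rest.get("port", 443)

# ...and the default when it does not ("timeout" is a hypothetical key).
timeout = rest.get("timeout", 30)

print(port)     # 8443
print(timeout)  # 30
```

Using get() with a default keeps a script working when an optional setting is left out of the configuration file, instead of raising a KeyError.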

Parsing YAML strings with Python

You can use yaml.safe_load() to parse all kinds of valid YAML strings. Here’s an example that parses a simple list of items into a Python list:
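
A small sketch of such a call could look like this. The names in the list are placeholder data made up for the illustration.

```python
import yaml

# A YAML sequence: every "- " entry becomes one list item.
# The names are placeholder values for this illustration.
names_yaml = """
- 'eric'
- 'justin'
- 'mary-kate'
"""

names = yaml.safe_load(names_yaml)

print(names)        # ['eric', 'justin', 'mary-kate']
print(type(names))  # <class 'list'>
```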

Parsing files with multiple YAML documents

YAML allows you to define multiple documents in one file, separating them with a triple dash (---). PyYAML will happily parse such files with the yaml.safe_load_all() function. Rather than a list, this function returns a generator that yields the documents one by one.

Note that the file must stay open for as long as you’re reading documents from it, so you must process the documents within the with block. Here’s an example that demonstrates this function:
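
The following is a minimal sketch under a few assumptions: the file name multi_doc.yaml and the contents of the two documents are invented for the illustration, and the file is first written to a temporary directory so the snippet can run on its own.

```python
import tempfile
from pathlib import Path

import yaml

# Two YAML documents separated by '---' (contents are illustrative).
MULTI_DOC = """\
document: 1
name: 'erik'
---
document: 2
name: 'config'
"""

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, created here only for the demonstration.
    path = Path(tmp_dir) / "multi_doc.yaml"
    path.write_text(MULTI_DOC, encoding="utf-8")

    with open(path, "r", encoding="utf-8") as file:
        docs = yaml.safe_load_all(file)  # a generator, nothing is parsed yet
        # Consume the generator while the file is still open.
        loaded = list(docs)

for doc in loaded:
    print(doc)
# {'document': 1, 'name': 'erik'}
# {'document': 2, 'name': 'config'}
```

Collecting the documents into a list inside the with block is one way to keep using them after the file is closed. For large files, looping over the generator directly inside the block avoids holding every document in memory at once.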

Writing (or dumping) YAML to a file

Although most people only read YAML as a configuration file, it can be very handy to write YAML as well. Example use cases could be:

  • Creating an initial configuration file with the current settings for your user

  • Saving the state of your program in an easy-to-read file (instead of using something like pickle)

In the following example, we’ll:

  • Create a list with names as we did before

  • Save the names to a YAML formatted file with yaml.dump

  • Read and print the file, as proof that everything worked as expected

Here you go:
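
What follows is a sketch of those three steps rather than a definitive listing. The names and the file name names.yaml are placeholders, and the file is written to a temporary directory so that nothing is left behind after the run.

```python
import tempfile
from pathlib import Path

import yaml

# Step 1: create a list with names (placeholder data).
names = ["eric", "justin", "mary-kate"]

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, used only for this demonstration.
    path = Path(tmp_dir) / "names.yaml"

    # Step 2: save the names to a YAML-formatted file with yaml.dump.
    with open(path, "w", encoding="utf-8") as file:
        yaml.dump(names, file)

    # Step 3: read the file back and print it as proof that it worked.
    written = path.read_text(encoding="utf-8")
    print(written)

    # Parsing the written text gives back the original list.
    restored = yaml.safe_load(written)

print(restored == names)  # True
```

The exact layout of the printed YAML can differ between PyYAML versions and dump options, which is why the sketch checks the round trip instead of relying on a specific output format.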