Import configuration from Hiera or a Git repository with YAML files into Terraform

April 5, 2022

De-duplicating configuration data is key when managing large environments that use different types of automation (Terraform, Jenkins, Ansible, scripts executed as systemd timers, Puppet, …). Many tools can hold configuration data (an RDBMS, Consul, …), but one of the easiest to use is Hiera, or simply a plain Git repository with YAML files organized hierarchically (which is, in essence, what Hiera is).

The YAML configuration hierarchy could be defined as the following file structure:

common.yaml: Default settings that apply regardless of role or host. Any level below can override these settings.
my_environment01.yaml: Environment-specific configuration (example: development, staging, production, amsterdam, az01, az04, …). Any level below can override these settings.
common/roles/some_server_role.yaml: A server role, or type definition, which contains role-specific configuration parameters. The roles can implement an extra hierarchy, for instance:
- debian::databases::postgres
- debian::databases::postgres::timescale
- debian::databases::postgres::timescale::prometheus
- debian::loadbalancer::internal
- debian::application::request_processor
  
  The hierarchy steps are separated by :: in the above example and are inherited in order, each step with its own YAML file.
  Any level below can override these settings.
my_environment01/roles/some_server_role.yaml: Overrides role configuration parameters per environment. These can in turn be overridden at host level below.
my_environment01/hosts/my_hostname01.yaml: Sets host-specific configuration parameters. This file is always required and should contain at least the IP address of the node and the server role string.

Let’s take the following example: the host vmazdbprm01 has the role debian::databases::postgres::timescale::prometheus and is deployed in the environment my_cool_location01. The configuration management should search for parameters in the following file locations, in this order (after first verifying that each file path exists):

common.yaml
my_cool_location01.yaml
common/roles/debian.yaml
common/roles/debian::databases.yaml
common/roles/debian::databases::postgres.yaml
common/roles/debian::databases::postgres::timescale.yaml
common/roles/debian::databases::postgres::timescale::prometheus.yaml
my_cool_location01/roles/debian.yaml
my_cool_location01/roles/debian::databases.yaml
my_cool_location01/roles/debian::databases::postgres.yaml
my_cool_location01/roles/debian::databases::postgres::timescale.yaml
my_cool_location01/roles/debian::databases::postgres::timescale::prometheus.yaml
my_cool_location01/hosts/vmazdbprm01.yaml

This means that any code that implements the above configuration management needs to check, from top to bottom, whether each of the above 13 files exists and, if so, load that YAML file.

Terraform is an open-source infrastructure as code software tool that provides a consistent CLI workflow to manage hundreds of cloud services.

The above YAML hierarchy could be implemented in Terraform as follows:

locals {
  host_cfg             = yamldecode(fileexists("cfgmgmt/${var.environment}/hosts/${var.node}.yaml") ? file("cfgmgmt/${var.environment}/hosts/${var.node}.yaml") : "{server_role: debian}")

  roles_list           = split("::", local.host_cfg.server_role)
  all_roles_list       = [for index in range(length(local.roles_list)) : join("::", slice(local.roles_list, 0, index + 1))]

  common_cfg           = yamldecode(fileexists("cfgmgmt/common.yaml") ? file("cfgmgmt/common.yaml") : "{}")
  common_role_cfg_list = [for role in local.all_roles_list :
    yamldecode(fileexists("cfgmgmt/common/roles/${role}.yaml") ? file("cfgmgmt/common/roles/${role}.yaml") : "{}")]

  env_cfg              = yamldecode(fileexists("cfgmgmt/${var.environment}.yaml") ? file("cfgmgmt/${var.environment}.yaml") : "{}")
  env_role_cfg_list    = [for role in local.all_roles_list :
    yamldecode(fileexists("cfgmgmt/${var.environment}/roles/${role}.yaml") ? file("cfgmgmt/${var.environment}/roles/${role}.yaml") : "{}")]

  common_role_cfg_map  = merge(local.common_role_cfg_list...)
  env_role_cfg_map     = merge(local.env_role_cfg_list...)

  cfg                  = merge(local.common_cfg, local.env_cfg, local.common_role_cfg_map, local.env_role_cfg_map, local.host_cfg)
}

Let’s have a look at what actually happens in the above code.

All YAML files are stored in a Git/Hiera repository, which is accessible in the sub-directory cfgmgmt.

The code declares local values inside a “locals” block (a block, not a resource), which starts on line 1.

Line 2 checks whether a file called cfgmgmt/${var.environment}/hosts/${var.node}.yaml exists and, if so, loads its YAML content as a map into the local variable host_cfg. If the file doesn’t exist, a default YAML document ({server_role: debian}) is loaded instead. In theory, each node/host must have its own file, as it should define at least configuration data such as:

unique node host name
IP address
server role
(optionally) VLAN/ subnet configuration
…
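Because line 2 silently falls back to a default when the host file is missing, it can help to make missing host data visible. The following is only a sketch, assuming it lives in the same module as the locals above: the names prefixed with demo_ and the key name ip_address are made up for this illustration, and only server_role appears in the original code.

```hcl
locals {
  # Keys every host file is expected to define.
  # "server_role" is used by the code above; "ip_address" is a hypothetical key name.
  demo_required_host_keys = ["server_role", "ip_address"]

  # Collect the expected keys that are absent from the loaded host configuration.
  demo_missing_host_keys = [
    for key in local.demo_required_host_keys : key
    if !contains(keys(local.host_cfg), key)
  ]
}

output "demo_missing_host_keys" {
  # An empty list means the host file defines all expected keys.
  value = local.demo_missing_host_keys
}
```

When the host file does not exist and the default {server_role: debian} is used, this output would contain ip_address, which is a quick hint that the host file still has to be created.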

Line 4 splits the server role string, stored in local.host_cfg.server_role, into a list, to build the server role hierarchy further below.

Line 5 creates a list containing the server role itself plus all of its parent roles, since their configuration needs to be imported too. Example: if the server role was set to debian::databases::postgres::timescale::prometheus, the list all_roles_list will contain the following elements:

debian
debian::databases
debian::databases::postgres
debian::databases::postgres::timescale
debian::databases::postgres::timescale::prometheus
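To try this expansion in isolation, the following minimal sketch applies the same expression to another role from the list earlier in this post. The names prefixed with demo_ are made up for this illustration and are not part of the code above.

```hcl
locals {
  # Example role string; in the real code this comes from local.host_cfg.server_role.
  demo_server_role = "debian::loadbalancer::internal"

  # Split the role string into its individual hierarchy steps.
  demo_roles_list = split("::", local.demo_server_role)

  # For every position in the list, join all steps up to and including it.
  demo_all_roles_list = [
    for index in range(length(local.demo_roles_list)) :
    join("::", slice(local.demo_roles_list, 0, index + 1))
  ]
}

output "demo_all_roles_list" {
  # ["debian", "debian::loadbalancer", "debian::loadbalancer::internal"]
  value = local.demo_all_roles_list
}
```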

Line 7 loads the YAML content of common.yaml, if it exists.

Line 8 loops over the all_roles_list elements, created on line 5, and will load the YAML content of the server roles (if the file exists) into a list element. The result is a list called common_role_cfg_list.

Line 11 loads the general environment configuration YAML content (if it exists) into the local variable env_cfg.

Line 12 does the same thing as line 8, but for environment-specific roles (for instance, when certain server roles have environment-specific configuration parameters).
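When debugging, it can be useful to see which of the candidate files were actually found. The following sketch is not part of the original code. It assumes it is placed in the same module, so that local.all_roles_list, var.environment and var.node are available, and the names cfg_candidate_paths and cfg_existing_paths are made up for this example.

```hcl
locals {
  # Candidate files in lookup order, relative to the cfgmgmt directory.
  cfg_candidate_paths = concat(
    ["common.yaml", "${var.environment}.yaml"],
    [for role in local.all_roles_list : "common/roles/${role}.yaml"],
    [for role in local.all_roles_list : "${var.environment}/roles/${role}.yaml"],
    ["${var.environment}/hosts/${var.node}.yaml"]
  )

  # Keep only the candidates that are present on disk.
  cfg_existing_paths = [
    for candidate in local.cfg_candidate_paths : candidate
    if fileexists("cfgmgmt/${candidate}")
  ]
}

output "cfg_existing_paths" {
  value = local.cfg_existing_paths
}
```

The order of cfg_candidate_paths follows the file list from the example earlier in this post, from least specific to most specific.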

Lines 15 and 16 merge the elements of the server role lists (each element is a map of YAML data) into one big map, in list order. Because later arguments to merge take precedence, keys set by a more specific role override the same keys set by a less specific one. The expansion ... notation is explained at https://www.terraform.io/language/expressions/function-calls#expanding-function-arguments.
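The override behaviour can be illustrated with a small sketch that uses inline YAML instead of files. All names and keys here (the demo_ locals, pkg, port, extensions) are invented for the example. One thing worth knowing is that merge is shallow: a nested map from a later element replaces the earlier nested map as a whole, instead of being combined with it.

```hcl
locals {
  # Stand-ins for the decoded role files, ordered from least to most specific.
  demo_role_cfg_list = [
    yamldecode("{pkg: postgresql, port: 5432}"),
    yamldecode("{port: 5433, extensions: {timescaledb: true}}"),
    yamldecode("{extensions: {prometheus: true}}")
  ]

  # The ... suffix expands the list into separate arguments for merge().
  demo_role_cfg_map = merge(local.demo_role_cfg_list...)
}

output "demo_role_cfg_map" {
  # Result:
  #   pkg        = "postgresql"         (only set by the first element)
  #   port       = 5433                 (overridden by the second element)
  #   extensions = { prometheus = true } (replaced as a whole by the third element)
  value = local.demo_role_cfg_map
}
```

If nested structures need to be combined across levels, they would have to be merged key by key, or the YAML files kept flat.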

Finally, on line 18, a local variable called cfg is created, which merges the following maps in this order, so that later ones override earlier ones:

local.common_cfg
local.env_cfg
local.common_role_cfg_map
local.env_role_cfg_map
local.host_cfg

By providing the environment name and the host/node name to the above code (as var.environment and var.node), all required configuration parameters can be loaded per node in Terraform. And because the data lives in a plain Git repository, any other automation tool can load the same information, provided that each tool implements the same hierarchy logic.
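As a rough sketch of how this could be wired up, the two input variables can be declared as shown below and values read from local.cfg. The variable declarations are not shown in the original code, so their exact form is an assumption, and the key ip_address and the output name are hypothetical.

```hcl
# Assumed declarations for the two inputs used by the locals block.
# Skip these if the module already declares them.
variable "environment" {
  type        = string
  description = "Environment name, matching a directory below cfgmgmt (for example my_cool_location01)."
}

variable "node" {
  type        = string
  description = "Host name, matching a file in cfgmgmt/<environment>/hosts/."
}

output "node_ip_address" {
  # try() returns null instead of failing when the key is not defined at any level.
  value = try(local.cfg.ip_address, null)
}
```

The values could then be passed on the command line, for instance with terraform plan -var environment=my_cool_location01 -var node=vmazdbprm01.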

Johnny Morano Author

1 comment

raff says:
January 15, 2023 at 11:04

Hey, have you thought about using the Terraform Hiera provider? It adds Hiera as a data source and can perform proper interpolation using Hiera (i.e. using the hierarchy you have defined in your hiera.yaml). This means that it can perform lookups on a key with context. For example, I have my hierarchy:

- common/*
- environment/%{env}.yaml
- region/%{region}.yaml

So I can define my values at the most appropriate level, and have default values in common overridden by more specific values in environment or region.

To make this work, I typically encode the Hiera provider config (i.e. the env and region) into the workspace name, so a single codebase can be used to deploy into multiple regions and environments. All you have to do is run `terraform workspace new prod.us-east-1`, then plan and apply. Your Terraform code needs to use a local to parse `${terraform.workspace}` into env and region before passing them to the Hiera provider.