Declaratively deploy Leaf and Spine fabric

This playbook deploys a leaf and spine fabric and its related services in a declarative manner. You only have to define a few key values such as the naming convention, number of devices and address ranges; the playbook is smart enough to do the rest for you.

This came from my project for the IPSpace Building Network Automation Solutions course and was used in part when we were deploying Cisco 9k leaf and spine fabrics in our Data Centers. The playbook is structured so that it should hopefully not be too difficult to add templates to deploy leaf and spine fabrics for other vendors. My plan was to add Arista and Juniper, but that is unlikely to happen.

I am now done with building DCs (bring on the ☁️) and, with this being on the edge of the limit of my programming knowledge, I don't envisage making any future changes. If any of it is useful to you please do take it and mold it to your own needs.

This README is intended to give enough information to understand the playbook's structure and run it. The variable files hold examples of a deployment with more information on what each variable does. For more detailed information about the playbook have a look at the series of posts I did about it on my blog.


The playbook deployment is structured into the following 5 roles with the option to deploy part or all of the fabric.

  • base: Non-fabric specific core configuration such as hostname, address ranges, aaa, users, acls, ntp, syslog, etc
  • fabric: Fabric specific core elements such as fabric size, interfaces (spine-to->leaf/border), routing protocols (OSPF, BGP) and MLAG
  • services: Services provided by the fabric (not the fabric core) are split into three sub-roles:
    • tenant: VRFs, SVIs, VLANs and VXLANs on the fabric and their associated VNIs
    • interface: Access ports connecting to compute or other non-fabric core network devices
    • routing: BGP (address-families), OSPF (additional non-fabric process) and static routes

If you wish to have a more custom build the majority of the settings in the variable files (unless specifically stated) can be changed as none of the scripting or templating logic uses the actual contents (dictionary values) to make decisions.

This deployment will scale up to a maximum of 4 spines, 4 borders and 10 leafs. The topology diagram (net_top) shows how it is deployed with the default values.

The default ports used for inter-switch links are in the table below. These can be changed within fabric.yml (fbc.adv.bse_intf).

| Connection | Start Port | End Port |
|---|---|---|
| SPINE-to-LEAF | Eth1/1 | Eth1/10 |
| SPINE-to-BORDER | Eth1/11 | Eth1/14 |
| LEAF-to-SPINE | Eth1/1 | Eth1/4 |
| BORDER-to-SPINE | Eth1/1 | Eth1/4 |
| MLAG Peer-link | Eth1/5 | Eth1/6 |
| MLAG keepalive | mgmt | n/a |

This playbook is based on 1U Nexus devices, so a single linecard module is used for all the connections. I have not tested how it will work with multiple modules; the intf_cleanup role is unlikely to work. That role ensures interface configuration is declarative by defaulting unused interfaces, so it can be excluded without breaking the playbook.

As Python is a lot more flexible than Ansible the dynamic inventory_plugin and filter_plugins (within the roles) do the manipulating of the data in the variable files to create the data models that are used by the templates. This helps to abstract a lot of the complexity out of the jinja templates making it easier to create new templates for different vendors as you only have to deal with the device configuration rather than data manipulation.

Fabric Core Variable Elements

These core elements are the minimum requirements to create the declarative fabric. They are used for the dynamic inventory creation as well as by the majority of the Jinja2 templates. All variables are preceded by ans, bse or fbc to make it easier to identify within the playbook, roles and templates which variable file the variable came from. From the contents of these var_files a dynamic inventory is built containing host_vars of the fabric interfaces and IP addresses.

ansible.yml (ans)

dir_path: Base directory location on the Ansible host that stores all the validation and configuration snippets
device_os: Operating system of each device type (spine, leaf and border)
creds_all: hostname (taken from the inventory), username and password

base.yml (bse)

The settings required to onboard and manage devices such as hostname format, IP address ranges, aaa, syslog, etc.

device_name: Naming format to which the automatically generated 'Node ID' (two-digit decimal format) is appended and from which the group name is created (in lowercase). The name must contain a hyphen (-) and the characters after the final hyphen must be letters, digits or underscores as that is what the group name is created from. For example, using DC1-N9K-SPINE would mean that the device is DC1-N9K-SPINE01 and the group is spine

| Key | Value | Information |
|---|---|---|
| spine | xx-xx | Spine switch device and group naming format |
| border | xx-xx | Border switch device and group naming format |
| leaf | xx-xx | Leaf switch device and group naming format |
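
As a rough illustration of the naming rule above, the sketch below derives a hostname and group from a naming format. It is not the code from inv_from_vars.py; the function name `device_and_group` is hypothetical, and taking the group from the text after the final hyphen is an assumption based on the DC1-N9K-SPINE example.

```python
import re


def device_and_group(name_format: str, node_id: int) -> tuple[str, str]:
    """Return (hostname, group) for a naming format such as 'DC1-N9K-SPINE'."""
    # Assumption: the group is the text after the final hyphen, lowercased.
    _, sep, suffix = name_format.rpartition("-")
    if not sep or not re.fullmatch(r"\w+", suffix):
        raise ValueError(
            f"{name_format!r} needs a hyphen followed by letters, digits or underscores"
        )
    # The Node ID is appended as a two-digit number (01, 02, ...).
    return f"{name_format}{node_id:02d}", suffix.lower()


print(device_and_group("DC1-N9K-SPINE", 1))
```

With the example format this gives the device DC1-N9K-SPINE01 in the group spine.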

addr: Subnets from which the device specific IP addresses are generated based on the device-type increment and the Node ID. The majority of subnets need to be at least /27 to cover a maximum network size of 4 spines, 10 leafs and 4 borders (18 addresses)

| Key | Value | Min size | Information |
|---|---|---|---|
| lp_net | x.x.x.x/26 | /26 | The range routing (OSPF/BGP), VTEP and vPC loopbacks are from (mask will be /32) |
| mgmt_net | x.x.x.x/27 | /27 | Management network, by default will use .11 to .30 |
| mlag_peer_net | x.x.x.x/26 | /26 or /27 | Range for OSPF peering between MLAG pairs, is split into /30 per-switch pair. Must be /26 if using same range for keepalive |
| mlag_kalive_net | x.x.x.x/27 | /27 | Optional keepalive address range (split into /30). If not set uses mlag_peer_net range |
| mgmt_gw | x.x.x.x | n/a | Management interface default gateway |

mlag_kalive_net is only needed if not using the management interface for the keepalive or you want separate ranges for the peer-link and keepalive interfaces. The keepalive link is created in its own VRF so it can use duplicate IPs or be kept unique by offsetting it with the fbc.adv.addr_incre.mlag_kalive_incre fabric variable.

There are a lot of other system wide settings in base.yml such as AAA, NTP, DNS, usernames and management ACLs. Anything under bse.services is optional (DNS, logging, NTP, AAA, SNMP, SYSLOG) and will use the management interface and VRF as the source unless specifically set. More detailed information can be found in the variable file.

fabric.yml (fbc)

Variables used to determine how the fabric will be built, the network size, interfaces, routing protocols and address increments. At a bare minimum you only need to declare the size of fabric, total number of switch ports and the routing options.

network_size: How many of each device type make up the fabric. Can range from 1 spine and 2 leafs up to a maximum of 4 spines, 4 borders and 10 leafs. The border and leaf switches are MLAG pairs so must be in increments of 2.

| Key | Value | Information |
|---|---|---|
| num_spines | 2 | Number of spine switches in increments of 1 up to a maximum of 4 |
| num_borders | 2 | Number of border switches in increments of 2 up to a maximum of 4 |
| num_leafs | 4 | Number of leaf switches in increments of 2 up to a maximum of 10 |
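
The size limits above could be checked before a run with something like the following sketch. The function `validate_network_size` is hypothetical and not part of the playbook; allowing zero borders is an assumption drawn from the stated minimum of 1 spine and 2 leafs.

```python
def validate_network_size(num_spines: int, num_borders: int, num_leafs: int) -> list[str]:
    """Return a list of problems with the fabric size (empty if valid)."""
    errors = []
    if not 1 <= num_spines <= 4:
        errors.append("num_spines must be between 1 and 4")
    # Borders and leafs are MLAG pairs, so they must be even numbers.
    # Assumption: a fabric with no borders (0) is allowed.
    if num_borders % 2 or not 0 <= num_borders <= 4:
        errors.append("num_borders must be 0, 2 or 4")
    if num_leafs % 2 or not 2 <= num_leafs <= 10:
        errors.append("num_leafs must be an even number between 2 and 10")
    return errors


print(validate_network_size(2, 2, 4))
print(validate_network_size(5, 3, 1))
```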

num_intf: The total number of interfaces per-device-type is required to make the interface assignment declarative by ensuring that non-defined interfaces are reset to their default values

| Key | Value | Information |
|---|---|---|
| spine | 1,64 | The first and last interface for a spine switch |
| border | 1,64 | The first and last interface for a border switch |
| leaf | 1,64 | The first and last interface for a leaf switch |

adv.bse_intf: Interface naming formats and the 'seed' interface numbers used to build the fabric

| Key | Value | Information |
|---|---|---|
| intf_fmt | Ethernet1/ | Interface naming format |
| intf_short | Eth1/ | Short interface name used in interface descriptions |
| mlag_fmt | port-channel | MLAG interface naming format |
| mlag_short | Po | Short MLAG interface name used in MLAG interface descriptions |
| lp_fmt | loopback | Loopback interface naming format |
| sp_to_lf | 1 | First interface used for SPINE to LEAF links (1 to 10) |
| sp_to_bdr | 11 | First interface used for SPINE to BORDER links (11 to 14) |
| lf_to_sp | 1 | First interface used for LEAF to SPINE links (1 to 4) |
| bdr_to_sp | 1 | First interface used for BORDER to SPINE links (1 to 4) |
| mlag_peer | 5-6 | Interfaces used for the MLAG peer-link |
| mlag_kalive | mgmt | Interface for the keepalive. If it is not an integer the management interface is used |
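
To show how the seed interface numbers could turn into per-device fabric interfaces, here is a minimal sketch for a leaf switch. It is an assumption about the logic rather than the inventory plugin's actual code: the function `leaf_uplinks` is hypothetical, and the description format is copied from the leaf host_vars example in this README.

```python
def leaf_uplinks(
    leaf_id: int,
    num_spines: int,
    spine_name: str = "DC1-N9K-SPINE",
    intf_fmt: str = "Ethernet1/",
    intf_short: str = "Eth1/",
    lf_to_sp: int = 1,
    sp_to_lf: int = 1,
) -> dict[str, str]:
    """Build an intf_fbc style dictionary {interface: description} for one leaf."""
    intf_fbc = {}
    for spine_id in range(1, num_spines + 1):
        # Local port: seed plus the spine's position (SPINE01 -> first port).
        local = f"{intf_fmt}{lf_to_sp + spine_id - 1}"
        # Remote port on the spine: seed plus this leaf's position.
        remote = f"{intf_short}{sp_to_lf + leaf_id - 1}"
        intf_fbc[local] = f"UPLINK > {spine_name}{spine_id:02d} - {remote}"
    return intf_fbc


print(leaf_uplinks(leaf_id=1, num_spines=2))
```

For LEAF01 with two spines this reproduces the intf_fbc entries shown in the host_vars example.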

adv.address_incre: Increments added to the 'Node ID' and subnet to generate unique device IP addresses. Uniqueness is enforced by using different increments for different device-types and functions

| Key | Value | Information |
|---|---|---|
| spine_ip | 11 | Spine mgmt and routing loopback addresses (default .11 to .14) |
| border_ip | 16 | Border mgmt and routing loopback addresses (default .16 to .19) |
| leaf_ip | 21 | Leaf mgmt and routing loopback addresses (default .21 to .30) |
| border_vtep_lp | 36 | Border VTEP (PIP) loopback addresses (default .36 to .39) |
| leaf_vtep_lp | 41 | Leaf VTEP (PIP) loopback addresses (default .41 to .50) |
| border_mlag_lp | 56 | Shared MLAG anycast (VIP) loopback addresses for each pair of borders (default .56 to .57) |
| leaf_mlag_lp | 51 | Shared MLAG anycast (VIP) loopback addresses for each pair of leafs (default .51 to .55) |
| border_bgw_lp | 58 | Shared BGW MS anycast loopback addresses for each pair of borders (default .58 to .59) |
| mlag_leaf_ip | 1 | Start IP for leaf OSPF peering over peer-link (default LEAF01 is .1, LEAF02 is .2, LEAF03 is .5, etc) |
| mlag_border_ip | 21 | Start IP for border OSPF peering over peer-link (default BORDER01 is .21, BORDER03 is .25, etc) |
| mlag_kalive_incre | 28 | Increment added to leaf/border increment (mlag_leaf_ip/mlag_border_ip) for keepalive addresses |
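
The way an increment and Node ID combine with a subnet can be sketched as below. This is an illustration under assumptions, not the plugin's code: the functions `node_ip` and `mlag_lp_ip` are hypothetical, the formula (network address + increment + Node ID - 1) is inferred from the default ranges in the table, and the /24 subnets are made up for the example.

```python
import ipaddress


def node_ip(subnet: str, increment: int, node_id: int) -> str:
    """Per-device address: network address + increment + (Node ID - 1)."""
    network = ipaddress.ip_network(subnet)
    return str(network.network_address + increment + node_id - 1)


def mlag_lp_ip(subnet: str, increment: int, node_id: int) -> str:
    """Shared MLAG (VIP) loopback: one address per pair of switches."""
    network = ipaddress.ip_network(subnet)
    pair_index = (node_id - 1) // 2  # nodes 1-2 -> 0, nodes 3-4 -> 1, ...
    return str(network.network_address + increment + pair_index)


# Assumed example subnets, LEAF01 with the default increments
print(node_ip("10.10.108.0/24", 21, 1))       # mgmt address (leaf_ip)
print(node_ip("192.168.101.0/24", 41, 1))     # VTEP loopback (leaf_vtep_lp)
print(mlag_lp_ip("192.168.101.0/24", 51, 1))  # MLAG VIP loopback (leaf_mlag_lp)
```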

If the management interface is not being used for the keepalive link either specify a separate network range (bse.addr.mlag_kalive_net) or use the peer-link range and define an increment (mlag_kalive_incre) that is added to the peer-link increment (mlag_leaf_ip or mlag_border_ip) to generate unique addresses.
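
The peer-link and keepalive addressing can be pictured with the sketch below, which carves one /30 per MLAG pair out of the range. The function `mlag_ip` is hypothetical and the arithmetic is an assumption inferred from the defaults quoted above (LEAF01 .1, LEAF02 .2, LEAF03 .5, BORDER03 .25); the subnets are example values.

```python
import ipaddress


def mlag_ip(subnet: str, start_ip: int, node_id: int, kalive_incre: int = 0) -> str:
    """Peer-link (or keepalive) address for a switch, one /30 per MLAG pair."""
    network = ipaddress.ip_network(subnet)
    pair_offset = (node_id - 1) // 2 * 4  # each pair consumes a /30 (4 addresses)
    member_offset = (node_id - 1) % 2     # odd switch gets first host, even the second
    host = network.network_address + start_ip + pair_offset + member_offset + kalive_incre
    return f"{host}/30"


print(mlag_ip("192.168.202.0/26", 1, 1))               # LEAF01 peer-link
print(mlag_ip("192.168.202.0/26", 1, 3))               # LEAF03 peer-link
print(mlag_ip("10.10.10.0/27", 1, 1, kalive_incre=28)) # LEAF01 keepalive
```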

route: Settings related to the fabric routing protocols (OSPF and BGP). BFD is not supported on unnumbered interfaces so the routing protocol timers have been shortened (OSPF 2/8, BGP 3/9). These are set under the advanced settings of the variable file (adv.route)

| Key | Value | Mandatory | Information |
|---|---|---|---|
| ospf.pro | string or integer | Yes | Can be numbered or named |
| ospf.area | x.x.x.x | Yes | Area this group of interfaces are in, must be in dotted decimal format |
| bgp.as_num | integer | Yes | Local BGP Autonomous System number |
| authentication | string | No | Applies to both BGP and OSPF. Hash out if you don't want to set authentication |

acast_gw_mac: The distributed gateway anycast MAC address for all leaf and border switches in the format xxxx.xxxx.xxxx

Dynamic Inventory

The ansible, base and fabric variables are passed through the inv_from_vars.py inventory_plugin to create the dynamic inventory and host_vars of all the fabric interfaces and IP addresses. By doing this in the inventory the complexity is abstracted from the base and fabric role templates making it easier to expand the playbook to other vendors in the future.

With the exception of intf_mlag and mlag_peer_ip (not on the spines) the following host_vars are created for every host.

  • ansible_host: Device's management address
  • ansible_network_os: Taken from the ansible var_file and used by the napalm device driver
  • intf_fbc: Dictionary of fabric interfaces with interface the keys and description the values
  • intf_lp: List of dictionaries with keys of name, ip and description
  • intf_mlag: Dictionary of MLAG peer-link interfaces with interface the key and description the value
  • mlag_peer_ip: IP of the SVI (default VLAN2) used for the OSPF peering over the MLAG peer-link
  • num_intf: Number of the first and last physical interface on the switch
  • intf_mlag_kalive: Dictionary of MLAG keepalive link interface with interface the key and description the value (only created if defined)
  • mlag_kalive_ip: IP of the keepalive link (only created if defined)

The devices (host-vars) and groups (group-vars) created by the inventory plugin can be checked using the graph flag. It is the inventory config file (.yml) not the inventory plugin (.py) that is referenced when using the dynamic inventory.

ansible-inventory --playbook-dir=$(pwd) -i inv_from_vars_cfg.yml --graph
@all:
  |--@border:
  |  |--DC1-N9K-BORDER01
  |  |--DC1-N9K-BORDER02
  |--@leaf:
  |  |--DC1-N9K-LEAF01
  |  |--DC1-N9K-LEAF02
  |  |--DC1-N9K-LEAF03
  |  |--DC1-N9K-LEAF04
  |--@spine:
  |  |--DC1-N9K-SPINE01
  |  |--DC1-N9K-SPINE02
  |--@ungrouped:

The --host flag shows the host-vars for that specific host, whereas --list shows everything (all host-vars and group-vars).

ansible-inventory --playbook-dir=$(pwd) -i inv_from_vars_cfg.yml --host DC1-N9K-LEAF01
ansible-inventory --playbook-dir=$(pwd) -i inv_from_vars_cfg.yml --list

An example of the host_vars created for a leaf switch.

{
    "ansible_host": "10.10.108.21",
    "ansible_network_os": "nxos",
    "intf_fbc": {
        "Ethernet1/1": "UPLINK > DC1-N9K-SPINE01 - Eth1/1",
        "Ethernet1/2": "UPLINK > DC1-N9K-SPINE02 - Eth1/1"
    },
    "intf_lp": [
        {
            "descr": "LP > Routing protocol RID and peerings",
            "ip": "192.168.101.21/32",
            "name": "loopback1"
        },
        {
            "descr": "LP > VTEP Tunnels (PIP) and MLAG (VIP)",
            "ip": "192.168.101.41/32",
            "mlag_lp_addr": "192.168.101.51/32",
            "name": "loopback2"
        }
    ],
    "intf_mlag_kalive": {
        "Ethernet1/7": "UPLINK > DC1-N9K-LEAF02 - Eth1/7 < MLAG Keepalive"
    },
    "intf_mlag_peer": {
        "Ethernet1/5": "UPLINK > DC1-N9K-LEAF02 - Eth1/5 < Peer-link",
        "Ethernet1/6": "UPLINK > DC1-N9K-LEAF02 - Eth1/6 < Peer-link",
        "port-channel1": "UPLINK > DC1-N9K-LEAF02 - Po1 < MLAG Peer-link"
    },
    "mlag_kalive_ip": "10.10.10.29/30",
    "mlag_peer_ip": "192.168.202.1/30",
    "num_intf": "1,64"
}

To use the inventory plugin in a playbook reference the inventory config file in place of the normal hosts inventory file (-i).

ansible-playbook PB_build_fabric.yml -i inv_from_vars_cfg.yml

Services - Tenant (svc_tnt)

Tenants, SVIs, VLANs and VXLANs are created based on the variables stored in the service_tenant.yml file (svc_tnt.tnt).

tnt: A list of tenants, each containing a list of VLANs (Layer2 and/or Layer3)

  • Tenants (VRFs) will only be created on a leaf or border if a VLAN within that tenant is to be created on that device
  • Even if a tenant is not a layer3 tenant a VRF will still be created and the L3VNI and tenant VLAN number reserved
  • If the tenant is a layer3 tenant the route-map for redistribution is always created and attached to the BGP peer

| Key | Value | Mandatory | Information |
|---|---|---|---|
| tenant_name | string | Yes | Name of the VRF |
| l3_tenant | True or False | Yes | Does it need SVIs or is routing done off the fabric (i.e. external router) |
| bgp_redist_tag | integer | No | Tag used to redistribute SVIs into BGP, by default uses tenant SVI number |
| vlans | list | Yes | List of VLANs within this tenant (see the below table) |

vlans: A list of VLANs within a tenant which at a minimum need the layer2 values of name and num. VLANs and SVIs can only be created on all leafs and/or all borders; you can't select which individual leaf or border switches to create them on

  • Unless an IP address is assigned to a VLAN (ip_addr) it will only be an L2 VLAN
  • L3 VLANs are automatically redistributed into BGP. This can be disabled (ipv4_bgp_redist: False) on a per-vlan basis
  • By default VLANs will only be created on the leaf switches (create_on_leaf). This can be changed on a per-vlan basis to create only on borders (create_on_border) or on both leafs and borders
  • To add a non-VXLAN SVI (without anycast address) create the VLAN as normal but with the extra VXLAN: False dictionary. The SVI is defined in service_interface.yml as type: svi
  • Optional settings will implicitly use the default value, they only need defining if not using the default value

| Key | Value | Mand | Information |
|---|---|---|---|
| num | integer | Yes | The VLAN number |
| name | string | Yes | The VLAN name |
| ip_addr | x.x.x.x/x | No | Adding an IP address automatically makes the VLAN L3 (not set by default) |
| ipv4_bgp_redist | True or False | No | Dictates whether the SVI is redistributed into BGP VRF address family (default True) |
| create_on_leaf | True or False | No | Dictates whether this VLAN is created on the leafs (default True) |
| create_on_border | True or False | No | Dictates whether this VLAN is created on the borders (default False) |
| vxlan | True or False | No | Whether VXLAN or normal VLAN. Only needed if you don't want it to be a VXLAN |
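
The implicit defaults described above amount to merging each VLAN dictionary over a set of default values. The sketch below is a hypothetical illustration (`VLAN_DEFAULTS` and `apply_vlan_defaults` are not names from format_dm.py) using the defaults listed in the table.

```python
# Defaults taken from the table above; ip_addr has no default (L2 VLAN unless set)
VLAN_DEFAULTS = {
    "ipv4_bgp_redist": True,
    "create_on_leaf": True,
    "create_on_border": False,
    "vxlan": True,
}


def apply_vlan_defaults(vlan: dict) -> dict:
    """Return the VLAN with any undefined optional settings filled in."""
    for key in ("num", "name"):
        if key not in vlan:
            raise ValueError(f"VLAN is missing the mandatory '{key}' setting")
    # Values defined by the user win over the defaults.
    return {**VLAN_DEFAULTS, **vlan}


print(apply_vlan_defaults({"num": 99, "name": "red_inet_vl99", "create_on_border": True}))
```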

The redistribution route-map name can be changed in the advanced (adv) section of services-tenant.yml or services-routing.yml. If defined in both places the setting in services-routing.yml takes precedence.

L2VNI and L3VNI numbers

The L2VNI and L3VNI values are automatically derived and incremented on a per-tenant basis based on the start and increment seed values defined in the advanced section (svc_tnt.adv) of services_tenant.yml.

adv.bse_vni: Starting VNI numbers

| Key | Value | Information |
|---|---|---|
| tnt_vlan | 3001 | Starting VLAN number for the transit L3VNI |
| l3vni | 10003001 | Starting L3VNI number |
| l2vni | 10000 | Starting L2VNI number, the VLAN number will be added to this |

adv.vni_incre: Number by which VNIs are incremented for each tenant

| Key | Value | Information |
|---|---|---|
| tnt_vlan | 1 | Value by which the transit L3VNI VLAN number is increased for each tenant |
| l3vni | 1 | Value by which the transit L3VNI VNI number is increased for each tenant |
| l2vni | 10000 | Value by which the L2VNI range (range + vlan) is increased for each tenant |

For example, a two tenant fabric each with a VLAN 20 using the above values would have L3 tenant SVIs of 3001 and 3002, L3VNIs of 10003001 and 10003002, and L2VNIs of 10020 and 20020.
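
The arithmetic behind that example can be written out as a short sketch. The function `tenant_vnis` is hypothetical and only mirrors the seed and increment values from the tables above; it is not the create_svc_tnt_dm implementation.

```python
BSE_VNI = {"tnt_vlan": 3001, "l3vni": 10003001, "l2vni": 10000}
VNI_INCRE = {"tnt_vlan": 1, "l3vni": 1, "l2vni": 10000}


def tenant_vnis(tenant_index: int, vlan_nums: list[int]) -> dict:
    """VLAN/VNI numbers for the Nth tenant (0 for the first tenant)."""
    l2_base = BSE_VNI["l2vni"] + tenant_index * VNI_INCRE["l2vni"]
    return {
        "tnt_vlan": BSE_VNI["tnt_vlan"] + tenant_index * VNI_INCRE["tnt_vlan"],
        "l3vni": BSE_VNI["l3vni"] + tenant_index * VNI_INCRE["l3vni"],
        # L2VNI is the tenant's base range plus the VLAN number
        "l2vni": {num: l2_base + num for num in vlan_nums},
    }


for index in range(2):
    print(tenant_vnis(index, [20]))
```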

A new data-model is created from the services_tenant.yml variables by passing them through the format_dm.py filter_plugin method create_svc_tnt_dm along with the BGP route-map name (if exists) and ASN (from fabric.yml). The result is a per-device-type (leaf and border) list of tenants, SVIs and VLANs which are used to render the svc_tnt_tmpl.j2 template and create the config snippet.

Below is an example of the data model format for a tenant and its VLANs.

{
    "bgp_redist_tag": 99,
    "l3_tnt": true,
    "l3vni": 100003004,
    "rm_name": "RM_CONN->BGP65001_RED",
    "tnt_name": "RED",
    "tnt_redist": true,
    "tnt_vlan": 3004,
    "vlans": [
        {
            "create_on_border": true,
            "create_on_leaf": false,
            "ip_addr": "10.99.99.1/24",
            "ipv4_bgp_redist": true,
            "name": "red_inet_vl99",
            "num": 99,
            "vni": 40099
        },
        {
            "ip_addr": "l3_vni",
            "ipv4_bgp_redist": false,
            "name": "RED_L3VNI",
            "num": 3004,
            "vni": 100003004
        }
    ]
}

Services - Interface (svc_intf)

The service_interface.yml variables define single or dual-homed interfaces (including port-channel) either statically or dynamically.

  • By default all interfaces are dual-homed LACP 'active'. The VPC number cannot be changed; it is always the port-channel number
  • Interfaces and port-channels can be assigned dynamically from a pre-defined pool (under svc_intf.adv) or specified manually
  • If the tenant (VRF) is not defined for a layer3, SVI or loopback interface it will be created in the global routing table
  • If the interface config is the same across multiple switches (like an access port) define one interface with a list of switches
  • Only specify the odd numbered switch for dual-homed interfaces, the config for the MLAG neighbor is automatically generated

There are 7 pre-defined interface types that can be deployed:

  • access: A single VLAN layer2 access port with STP set to 'edge'
  • stp_trunk: A trunk going to a device that supports Bridge Assurance. STP is set to 'network'
  • stp_trunk_non_ba: Same as stp_trunk except STP is set to 'normal' as it is for devices that don't support BA
  • non_stp_trunk: A trunk port going to a device that doesn't support BPDU. STP is set to 'edge' and BPDU Guard enabled
  • layer3: A layer3 interface with an IP address. Must be single-homed as MLAG not supported for L3 interfaces
  • loopback: A loopback interface with an IP address (must be single-homed)
  • svi: To define a SVI the VLAN must exist in service_tenant.yml and not be a VXLAN (must be single-homed)

The intf.single_homed and intf.dual-homed dictionaries hold a list of all single-homed or dual-homed interfaces using any of the attributes in the table below. If there are no single-homed or dual-homed interfaces on the fabric hash out the relevant dictionary.

| Key | Value | Mand | Information |
|---|---|---|---|
| descr | string | Yes | Interface or port-channel description |
| type | intf_type | Yes | Either access, stp_trunk, stp_trunk_non_ba, non_stp_trunk, layer3, loopback or svi |
| ip_vlan | vlan or ip | Yes | Depends on the type, either ip/prefix, vlan or multiple vlans separated by , and/or - |
| switch | list | Yes | List of switches created on. If dual-homed needs to be odd numbered switch from MLAG pair |
| tenant | string | No | Layer3, svi and loopbacks only. If not defined the default VRF is used (global routing table) |
| po_mbr_descr | list | No | PO member interface description, [odd_switch, even_switch]. If undefined uses PO descr |
| po_mode | string | No | Set the port-channel mode, 'on', 'passive' or 'active' (default is 'active') |
| intf_num | integer | No | Only specify the number, the name and module are taken from fbc.adv.bse_intf.intf_fmt |
| po_num | integer | No | Only specify the number, the name is taken from fbc.adv.bse_intf.mlag_fmt |
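
For trunk types, ip_vlan can hold several VLANs separated by commas and/or hyphens. If you need to work with the individual VLAN numbers, a range string like this could be expanded with a helper along these lines (`expand_vlans` is a hypothetical name, not a function from the playbook).

```python
def expand_vlans(ip_vlan) -> list[int]:
    """Expand '10,20-22,30' (or a single VLAN number) into a sorted list of VLANs."""
    vlans = set()
    for part in str(ip_vlan).split(","):
        part = part.strip()
        if "-" in part:
            # A hyphen marks an inclusive range of VLANs
            first, last = (int(x) for x in part.split("-"))
            vlans.update(range(first, last + 1))
        else:
            vlans.add(int(part))
    return sorted(vlans)


print(expand_vlans("10,20-22,30"))
```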

The playbook has the logic to recognize if statically defined interface numbers overlap with the dynamic interface range and exclude them from dynamic interface assignment. For simplicity it is probably best to use separate ranges for the dynamic and static assignments.

adv.single_homed: Reserved range of interfaces to be used for dynamic single-homed and loopback assignment

| Key | Value | Information |
|---|---|---|
| first_intf | integer | First single-homed interface to be dynamically assigned |
| last_intf | integer | Last single-homed interface to be dynamically assigned |
| first_lp | integer | First loopback number to be dynamically used |
| last_lp | integer | Last loopback number to be dynamically used |

adv.dual-homed: Reserved range of interfaces to be used for dynamic dual-homed and port-channel assignment

| Key | Value | Information |
|---|---|---|
| first_intf | integer | First dual-homed interface to be dynamically assigned |
| last_intf | integer | Last dual-homed interface to be dynamically assigned |
| first_po | integer | First port-channel number to be dynamically used |
| last_po | integer | Last port-channel number to be dynamically used |
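
The idea of dynamic assignment skipping any statically defined numbers can be sketched as follows. This is a simplified illustration rather than the create_svc_intf_dm logic; the function `assign_interfaces` is hypothetical and only handles interface numbers on a single switch.

```python
def assign_interfaces(interfaces: list[dict], first_intf: int, last_intf: int) -> list[dict]:
    """Give every interface an intf_num, drawing from the pool where none is set."""
    # Statically defined numbers are removed from the dynamic pool
    static = {intf["intf_num"] for intf in interfaces if "intf_num" in intf}
    pool = (num for num in range(first_intf, last_intf + 1) if num not in static)

    assigned = []
    for intf in interfaces:
        num = intf.get("intf_num")
        if num is None:
            try:
                num = next(pool)
            except StopIteration:
                raise ValueError("Not enough interfaces in the dynamic range") from None
        assigned.append({**intf, "intf_num": num})
    return assigned


example = [{"descr": "srv-a"}, {"descr": "srv-b", "intf_num": 12}, {"descr": "srv-c"}]
print(assign_interfaces(example, first_intf=11, last_intf=15))
```

Here srv-b keeps its static port 12, so the dynamic interfaces get 11 and 13.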

The format_dm.py filter_plugin method create_svc_intf_dm is run for each inventory host to produce a list of all interfaces to be created on that device. In addition to the services_interface.yml variables it also passes in the interface naming format (fbc.adv.bse_intf) to create the full interface name and hostname to find the interfaces relevant to that device. This is saved to the fact flt_svc_intf which is used to render the svc_intf_tmpl.j2 template and create the config snippet.

Below is an example of the data model format for a single-homed and dual-homed interface.

{
    "descr": "UPLINK > DC1-BIP-LB01 - Eth1.1",
    "dual_homed": false,
    "intf_num": "Ethernet1/9",
    "ip_vlan": 30,
    "stp": "edge",
    "type": "access"
},
{
    "descr": "UPLINK > DC1-SWI-BLU01 - Gi0/0",
    "dual_homed": true,
    "intf_num": "Ethernet1/18",
    "ip_vlan": "10,20,30",
    "po_mode": "on",
    "po_num": 18,
    "stp": "network",
    "type": "stp_trunk"
},
{
    "descr": "UPLINK > DC1-SWI-BLU01 - Po18",
    "intf_num": "port-channel18",
    "ip_vlan": "10,20,30",
    "stp": "network",
    "type": "stp_trunk",
    "vpc_num": 18
}

Interface Cleanup - Defaulting Interfaces

The interface cleanup role is required to make sure any interfaces not assigned by the fabric or the services (svc_intf) role have a default configuration. Without this, if an interface were to be changed (for example a server moved to a different interface), the old interface would not have its configuration put back to the default values.

This role goes through the interfaces assigned by the fabric (from the inventory) and service_interface role (from the svc_intf_dm method) producing a list of used physical interfaces which are then subtracted from the list of all the switch's physical interfaces (fbc.num_intf). It has to be run after the fabric or service_interface role as it needs to know which interfaces have been assigned, so it uses tags to ensure it is run any time either of these roles is run.
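
The subtraction described above is essentially a set difference. A minimal sketch, assuming num_intf is the "first,last" string seen in the host_vars and using a hypothetical function name `unused_interfaces`:

```python
def unused_interfaces(num_intf: str, used: list[str], intf_fmt: str = "Ethernet1/") -> list[str]:
    """Return the physical interfaces that are not in use and should be defaulted."""
    first, last = (int(x) for x in num_intf.split(","))
    all_intf = [f"{intf_fmt}{num}" for num in range(first, last + 1)]
    used_set = set(used)
    # Non-physical interfaces in 'used' (port-channels, loopbacks) simply never match
    return [intf for intf in all_intf if intf not in used_set]


print(unused_interfaces("1,4", ["Ethernet1/1", "Ethernet1/3", "port-channel1"]))
```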

Services - Route (svc_rte)

BGP peerings, non-backbone OSPF processes, static routes and redistribution (connected, static, bgp, ospf) are configured based on the variables specified in the service_route.yml file. The naming convention of the route-maps and prefix-lists used by OSPF and BGP can be changed under the advanced section (adv) of the variable file.

I am undecided about this role as it goes against the simplistic principles used by the other roles. By its very nature routing is very configurable which leads to complexity due to the number of options and inheritance. In theory all these features should work but due to the number of options and combinations available I have not tested all the possible variations of configuration.

Static routes (svc_rte.static_route)

Routes are added per-tenant with the tenant being the top-level dictionary that routes are created under.

  • tenant, switch and prefix are lists to make it easy to apply the same routes across multiple devices and tenants
  • Routes with the same attributes (like next-hop) can be grouped as a list within the one prefix dictionary value

| Parent dict | Key | Value | Mand | Information |
|---|---|---|---|---|
| n/a | tenant | list | Yes | List of tenants to create the routes in. Use 'global' for the global routing table |
| n/a | switch | list | Yes | List of switches to create all routes on (alternatively can be set per-route) |
| route | prefix | list | Yes | List of routes that all have same settings (gateway, interface, switch, etc) |
| route | gateway | x.x.x.x | Yes | Next hop gateway address |
| route | interface | string | No | Next hop interface, use interface full name (Ethernet), Vlan or Null0 |
| route | ad | integer | No | Set the admin distance for this group of routes (1 - 255) |
| route | next_hop_vrf | string | No | Set the VRF for next-hop if it is in a different VRF (route leaking between VRFs) |
| route | switch | list | Yes | Switches to create this group of routes on (overrides static_route.switch) |

OSPF (svc_rte.ospf)

An OSPF process can be configured for any of the tenants or the global routing table.

  • Each OSPF process is enabled on a per-interface basis with summarization and redistribution defined on a per-switch basis
  • The mandatory process.switch list defines the switches the OSPF process is configured on
  • Non-mandatory settings only need to be defined if changing the default behavior, otherwise there is no need to add the dictionary

| Key | Value | Mand | Information |
|---|---|---|---|
| process | integer or string | Yes | The process can be a number or word |
| switch | list | Yes | List of switches to create the OSPF process on |
| tenant | string | No | The VRF OSPF is enabled in. If not defined uses the global routing table |
| rid | list | No | List of RIDs, must match number of switches (if undefined uses highest loopback) |
| bfd | True | No | Enable BFD globally for all interfaces (disabled by default) |
| default_orig | True, always | No | Conditionally (True) or always advertise default route (disabled by default) |
Interface, summary and redistribution are child dictionaries of lists under the ospf parent dictionary. They inherit process.switch unless switch is specifically defined under that child dictionary.
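
The inheritance of process.switch by the child dictionaries can be pictured with the sketch below. It is an assumption about the behavior described, not the filter plugin's code; `inherit_switch` is a hypothetical name and the example values are made up.

```python
def inherit_switch(process: dict) -> dict:
    """Fill in 'switch' on each interface/summary/redist entry from the process."""
    resolved = dict(process)
    for child in ("interface", "summary", "redist"):
        resolved[child] = [
            # An explicitly defined switch list wins over the inherited one
            {**item, "switch": item.get("switch", process["switch"])}
            for item in process.get(child, [])
        ]
    return resolved


ospf = {
    "process": 99,
    "switch": ["DC1-N9K-BORDER01", "DC1-N9K-BORDER02"],
    "interface": [
        {"name": ["Vlan110"], "area": "0.0.0.0"},
        {"name": ["Ethernet1/20"], "area": "0.0.0.0", "switch": ["DC1-N9K-BORDER01"]},
    ],
}
print(inherit_switch(ospf)["interface"])
```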

ospf.interface: Each list element is a group of interfaces with the same set of attributes (area number, interface type, auth, etc)

| Key | Value | Mand | Information |
|---|---|---|---|
| name | list | Yes | List of one or more interfaces. Use interface full name (Ethernet) or Vlan |
| area | x.x.x.x | Yes | Area this group of interfaces are in, must be in dotted decimal format |
| switch | list | No | Which switches to enable OSPF on these interfaces (inherits process.switch if not set) |
| cost | integer | No | Statically set the interfaces OSPF cost, can be 1-65535 |
| authentication | string | No | Enable authentication for the area and a password (Cisco type 7) for this interface |
| area_type | string | No | By default is normal. Can be set to stub, nssa, stub/nssa no-summary, nssa default-information-originate or nssa no-redistribution |
| passive | True | No | Make the interface passive. By default all configured interfaces are non-passive |
| hello | integer | No | Interface hello interval (deadtime is x4), automatically disables BFD for this interface |
| type | point-to-point | No | By default all interfaces are broadcast, can be changed to point-to-point |

ospf.summary: All summaries with the same attributes (switch, filter, area) can be grouped in a list within the one prefix dictionary value

| Key | Value | Mandatory | Information |
|---|---|---|---|
| prefix | list | Yes | List of summaries to apply on all the specified switches |
| switch | list | No | What switches to summarize on, inherits process.switch if not set |
| area | x.x.x.x | No | By default it is LSA5. For LSA3 add an area to summarize from that area |
| filter | not-advertise | No | Stops advertisement of the summary and subordinate subnets (is basically filtering) |

ospf.redist: Each list element is the redistribution type (ospf_xx, bgp_xx, static or connected). Redistributed prefixes can be filtered (allow) or weighted (metric) with the route-map order being metric and then allow. If the allow list is not set it will allow any (empty route-map)

| Key | Value | Mand | Information |
|---|---|---|---|
| type | string | Yes | Redistribute either OSPF process, BGP AS, static or connected |
| switch | list | No | What switches to redistribute on, inherits process.switch if not set |
| metric | dict | No | Add metric to redistributed prefixes. Keys are metric value and values a list of prefixes or keyword ('any' or 'default'). Can't use metric with a type of connected |
| allow | list, any, default | No | List of prefixes (connected is list of interfaces) or keyword ('any' or 'default') to redistribute |

BGP

BGP uses the concept of groups and peers, with the majority of the settings able to be configured in either.

  • group holds the global settings for all peers within it. Groups are automatically created on any switches on which peers within them are created
  • peer is a list of peers within the group. If the setting is configured in the group and peer the peer setting will take precedence
  • The group.name and peer.name are used in the construction of route-map and prefix-list names (formatting is in advanced)
  • If the tenant is not specified (dictionary not defined) the group or peer will be added to the default global routing table
  • Non-mandatory settings only need to be defined if changing the default behavior, otherwise there is no need to add the dictionary

| Set in | Key | Value | Mand | Information |
|---|---|---|---|---|
| group | name | string | Yes | Name of the group, no whitespaces or duplicate names (group or peer) |
| peer | name | string | Yes | Name of the peer, no whitespaces or duplicate names (group or peer) |
| peer | peer_ip | x.x.x.x | Yes | IP address of the peer |
| peer | descr | string | Yes | Description of the peer |
| both | switch | list | Yes | List of switches (even if is only 1) to create the group and peers on |
| both | tenant | list | No | List of tenants (even if is only 1) to create the peers under |
| both | remote_as | integer | Yes | Remote AS of this peer or if group all peers within that group |
| both | timers | [kl,ht] | No | List of [keepalive, holdtime], if not defined uses [3, 9] seconds |
| both | bfd | True | No | Enable BFD for an individual peer or all peers in group (disabled by default) |
| both | password | string | No | Plain-text password to authenticate a peer or all peers in group (default none) |
| both | default | True | No | Advertise default route to a peer or all peers in the group (default False) |
| both | update_source | string | No | Set the source interface used for peerings (default not set) |
| both | ebgp_multihop | integer | No | Increase the number of hops for eBGP peerings (2 to 255) |
| both | next_hop_self | True | No | Set the next-hop to itself for any advertised prefixes (default not set) |

inbound or outbound: Optionally set under the group or peer to filter BGP advertisements and/or manipulate BGP attributes

  • The naming of the route-maps and prefix-lists is dependent on where they are applied (group or peer)
  • All attribute settings are dictionaries with the key being the attribute and the value the prefixes it is applied to

| Key | Value | Direction | Information |
|-----|-------|-----------|-------------|
| weight | dict | inbound | Keys are the weight and the value a list of prefixes or keyword ('any' or 'default') |
| pref | dict | inbound | Keys are the local preference and the value a list of prefixes or keyword |
| med | dict | outbound | Keys are the MED value and the values a list of prefixes or keyword |
| as_prepend | dict | outbound | Keys are the number of times to add the ASN and values a list of prefixes or keyword |
| allow | list, any, default | both | Can be a list of prefixes or a keyword to advertise just the default route or anything |
| deny | list, any, default | both | Can be a list of prefixes or a keyword to not advertise the default route or anything |

bgp.tnt_advertise: Optionally advertise prefixes on a per-tenant basis (list of VRFs) using network, summary and redistribution. The switch can be set globally for all network/summary/redist in a VRF and be overridden on an individual per-prefix basis

  • network: List of prefixes to be advertised on a per-switch basis (network cmd). If a device is covered by 2 different network.prefix statements it will get a combination of them both (merged), so network statements for all prefixes
  • summary: Group summaries (aggregate-address) with the same attributes (switch and summary_only) within the same list element
  • redist: Each list element is the redistribution type (ospf process, static or connected) with the redistributed prefixes weighted (metric) and/or filtered (allow). If the allow list is not set it is allow any (empty route-map). There can only be one each of types connected and static per-switch, the first occurrence is used. The switch set under the redistribution type is preferred over that set in process.switch, there is no merging

| Set in | Key | Value | Mand | Information |
|--------|-----|-------|------|-------------|
| tnt_advertise | name | string | Yes | A single VRF that is being advertised into (use 'global' for global routing table) |
| all | switch | list | Yes | What switches to redistribute on, inherits process.switch if not set |
| network/summary | prefix | list | Yes | List of prefixes to advertise |
| summary | filter | summary-only | No | Only advertise the summary, suppress all prefixes within it (disabled by default) |
| redist | type | string | Yes | Redistribute ospf_process (whitespace before process), static or connected |
| redist | metric | dict | No | Add metric to redistributed prefixes. Keys are the MED value and values a list of prefixes or keyword ('any' or 'default'). Can't use metric with connected |
| redist | allow | list, any, default | No | List of prefixes (can use 'ge' and/or 'le'), interfaces (for connected) or keyword ('any' or 'default') to redistribute |

Advanced settings (svc_rte.adv) allow the changing of the default routing protocol timers and naming format of the route-maps and prefix-lists used for advertisement and redistribution.

The filter_plugin method create_svc_rte_dm is run for each inventory host to produce a data model of the routing configuration for that device. The outcome is a list of seven per-device data models that are used by the svc_rte_tmpl.j2 template.

  • all_pfx_lst: List of all prefix-lists with each element in the format [name, seq, permission, prefix]
  • all_rm: List of all route-maps with each element in the format [name, seq, permission, prefix, [attribute, value]]. If no BGP attributes are set in the RM the last entry in the list will be [null, null]
  • stc_rte: Per-VRF dictionaries (VRF is the key) of lists of static routes with interface and/or gateway, optional AD and destination VRF
  • group: Dictionaries of BGP groups (group is the key) that have peers on this device. The value is dictionaries of any group settings
  • peer: Dictionaries of tenants (VRFs) containing the following nested dictionaries:
    • peers: Dictionary of peers (key is the peer) with the value being dictionaries of the peer's settings
    • network: List of networks to be advertised by BGP
    • summary: Dictionary of summaries with the key being the prefix and value either null (doesn't suppress) or summary-only
    • redist: Two dictionaries of the route-map name (rm_name) and redistribution type (connected, static, etc)
  • ospf_proc: Dictionary of VRFs (key) and the OSPF process settings for each VRF (settings configured under the process)
  • ospf_intf: Dictionary of interfaces (key) that have OSPF enabled, the values are the interface specific OSPF settings
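
To make the summary data model a little more concrete, the sketch below turns a dictionary in the shape described above (prefix as key, value either null or summary-only) into aggregate-address lines. It is only an illustration: the function name render_summaries is hypothetical and the real rendering is done by the svc_rte_tmpl.j2 template, whose exact output may differ.

```python
def render_summaries(summary):
    """Hypothetical helper: build aggregate-address lines from the summary dict.

    summary maps each prefix to None (do not suppress) or 'summary-only'.
    """
    lines = []
    for prefix, option in summary.items():
        if option:
            # Suppress the more specific prefixes covered by the summary
            lines.append(f"aggregate-address {prefix} {option}")
        else:
            lines.append(f"aggregate-address {prefix}")
    return lines


if __name__ == "__main__":
    example = {"10.0.0.0/8": None, "172.16.0.0/12": "summary-only"}
    for line in render_summaries(example):
        print(line)
```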

Passwords

There are four main types of passwords used within the playbooks.

  • BGP/OSPF: Stored as plain text in the variable file but encrypted in the device running configuration
  • Users: Has to be in encrypted format (type-5) in the variable file
  • TACACS: Has to be in the encrypted format (type-7) in the variable file. Could use type-6 but would also need to generate a master key
  • device: The password used by Napalm to log into devices defined under ans.creds_all. Can be plain-text or use vault

Input validation

Pre-task input validation checks are run on the variable files, with the goal being to highlight any problems with variables before any of the fabric build tasks are started. Fail fast based on logic rather than failing halfway through a build. Pre-validation checks for things such as missing mandatory variables, variables being of the correct type (str, int, list, dict), IP addresses being valid, duplicate entries, dependencies (VLANs assigned but not created), etc. It won't catch everything but will eliminate a lot of the needless errors that would break a fabric build.

A combination of Python assert within a filter plugin (to identify any issues) and Ansible assert within the playbook (to return user-friendly information) is used to achieve the validation. All the error messages returned by input validation start with the nested location of the variable to make it easier to find.
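
As a rough illustration of this pattern (not the actual input_validate.py code), the sketch below shows a filter-plugin-style function that uses Python assert to find problems and prefixes each message with the nested variable location. The function name validate_bgp_peer, the checks chosen and the message wording are all assumptions; only the FilterModule class with a filters() method is the standard Ansible filter plugin shape.

```python
import ipaddress


def validate_bgp_peer(peer, location):
    """Hypothetical check of one BGP peer dict, returns a list of error strings."""
    errors = []
    checks = [
        # (condition that must hold, message used if it does not)
        (isinstance(peer.get("name"), str), "name must be a string"),
        (isinstance(peer.get("remote_as"), int), "remote_as must be an integer"),
    ]
    for ok, msg in checks:
        try:
            assert ok, f"{location}.{msg}"
        except AssertionError as err:
            errors.append(str(err))
    try:
        ipaddress.ip_address(peer.get("peer_ip", ""))
    except ValueError:
        errors.append(f"{location}.peer_ip is not a valid IP address")
    return errors


class FilterModule(object):
    """Exposes the function to Ansible as a Jinja filter."""

    def filters(self):
        return {"validate_bgp_peer": validate_bgp_peer}


if __name__ == "__main__":
    bad_peer = {"name": "DC2", "remote_as": "65000", "peer_ip": "10.1.1.300"}
    for error in validate_bgp_peer(bad_peer, "svc_rte.bgp.group.peer"):
        print(error)
```

An Ansible assert task could then check that the returned list is empty and print the collected messages in its fail_msg if it is not.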

It is run using the pre_val tag and will only check variable files that have been defined under var_files. It can be run using the inventory plugin but will fail if any of the values used to create the inventory are wrong, so it is better to use a dummy host file.

ansible-playbook playbook.yml -i hosts --tag pre_val
ansible-playbook playbook.yml -i inv_from_vars_cfg.yml --tag pre_val

A full list of what variables are checked and the expected input can be found in the header notes of the filter plugin input_validate.py.

Playbook Structure

The main playbook (PB_build_fabric.yml) is divided into 3 sections with roles used to do the data manipulation and templating

  • pre_tasks: Pre-validation checks and deletion/creation of file structure (at each playbook run) to store config snippets
  • tasks: Imports tasks from roles which in turn use variables (.yml) and templates (.j2) to create the config snippets
    • base: From base.yml and bse_tmpl.j2 creates the base configuration snippet (aaa, logging, mgmt, ntp, etc)
    • fabric: From fabric.yml and fbc_tmpl.j2 creates the fabric configuration snippet (interfaces, OSPF, BGP)
    • services: Per-service-type tasks, templates and plugins to create the config for services that run on the fabric
      • svc_tnt: From services_tenant.yml and svc_tnt_tmpl.j2 creates the tenant config snippet (VRF, SVI, VXLAN, VLAN)
      • svc_intf: From services_interface.yml and svc_intf_tmpl.j2 creates interface config snippet (routed, access, trunk, loop)
      • svc_rte: From service_route.yml and svc_rte_tmpl.j2 creates the tenant routing config snippet (BGP, OSPF, routes, redist)
    • intf_cleanup: Based on the interfaces used in the fabric creates config snippet to default all the other interfaces
  • task_config: Assembles the config snippets into the one file and applies using Napalm replace_config

The post-validation playbook (PB_post_validate.yml) uses the validation role to do the majority of the work

  • pre_tasks: Creates the file structure to store validation files (desired_state) and the compliance report
  • roles: Imports the services role so that the filter plugins within it can be used to create the service data models for validation
  • tasks: Imports tasks from roles and checks the compliance report result
    • validation: Per-validation engine tasks to create desired_state, gather the actual_state and produce a compliance report
      • nap_val: For elements covered by napalm_getters creates desired_state and compares against actual_state
      • cus_val: For elements not covered by napalm_getters creates desired_state and compares against actual_state
    • compliance_report: Loads validation report (created by nap_val and cus_val) and checks whether it complies (passed)

Directory Structure

The directory structure is created within ~/device_configs to hold the configuration snippets, output (diff) from applied changes, validation desired_state files and compliance reports. The parent directory is deleted and re-added at each playbook run.
The base location for this directory can be changed using the ans.dir_path variable.

~/device_configs/
├── DC1-N9K-BORDER01
│   ├── config
│   │   ├── base.conf
│   │   ├── config.cfg
│   │   ├── dflt_intf.conf
│   │   ├── fabric.conf
│   │   ├── svc_intf.conf
│   │   ├── svc_rte.conf
│   │   └── svc_tnt.conf
│   └── validate
│       ├── napalm_desired_state.yml
│       └── nxos_desired_state.yml
├── diff
│   └── DC1-N9K-BORDER01.txt
└── reports
    └── DC1-N9K-BORDER01_compliance_report.json

Prerequisites

The deployment has been tested on NXOS 9.2(4) and NXOS 9.3(5) (in theory should be fine with 9.3(6) & 9.3(7)) using Ansible 2.10.6 and Python 3.6.9. See the Caveats section for the few nuances when running the different versions of code.

git clone https://github.com/sjhloco/build_fabric.git
mkdir -p ~/venv/venv_ansible2.10
python3 -m venv ~/venv/venv_ansible2.10
source ~/venv/venv_ansible2.10/bin/activate
pip install -r build_fabric/requirements.txt

Once the environment has been set up with all the packages installed, run napalm-ansible to get the location of the napalm-ansible paths and add them to ansible.cfg under [defaults].

Before any configuration can be deployed using Ansible a few things need to be manually configured on all N9K devices:

  • Management IP address and default route
  • The features nxapi and scp-server are required for Napalm replace_config
  • Image validation can take a while on NXOS so it is best done beforehand
interface mgmt0
  ip address 10.10.108.11/24
vrf context management
  ip route 0.0.0.0/0 10.10.108.1
feature nxapi
feature scp-server
boot nxos bootflash:/nxos.9.3.5.bin sup-1
  • Leaf and border switches also need the TCAM allocation changed to allow for arp-suppression. This can differ dependent on device model, any changes made need correcting in /roles/base/templates/nxos/bse_tmpl.j2 to keep it idempotent
hardware access-list tcam region racl 512
hardware access-list tcam region arp-ether 256 double-wide
copy run start
reload

The default username/password for all devices is admin/ansible and is stored in the variable bse.users.password. Swap this out for the encrypted type5 password got from the running config. The username and password used by Napalm to connect to devices is stored in ans.creds_all and will also need changing to match (is plain-text or use vault).

Before the playbook can be run the devices' SSH keys need adding on the Ansible host. ssh_key_playbook.yml (in ssh_keys directory) can be run to add these automatically, you just need to populate the devices' management IPs in the ssh_hosts file.

sudo apt install openssh-client    # provides the ssh-keyscan utility
ansible-playbook ssh_keys/ssh_key_add.yml -i ssh_keys/ssh_hosts

Running playbook

The device configuration is applied using Napalm with the differences always saved to ~/device_configs/diff/device_name.txt and optionally printed to screen. Napalm commit_changes is set to True meaning that Ansible check-mode is used for dry-runs. It can take up to 6 minutes to deploy the full configuration when including the service roles so the Napalm default timeout has been increased to 360 seconds. If it takes longer (N9Kv running 9.2(4) is very slow) Ansible will report the build as failed, but it is likely the process is still running on the device. Give it a minute and run the playbook again, it should pass with no changes needed.

Due to the declarative nature of the playbook and inheritance between roles there are only a certain number of combinations that the roles can be deployed in.

| Ansible tag | Playbook action |
|-------------|-----------------|
| pre_val | Checks that the var_file contents are of a valid format |
| bse_fbc | Generates, joins and applies the base, fabric and intf_cleanup config snippets |
| bse_fbc_tnt | Generates, joins and applies the base, fabric, intf_cleanup and tenant config snippets |
| bse_fbc_intf | Generates, joins and applies the base, fabric, tenant, interface and intf_cleanup config snippets |
| full | Generates, joins and applies the base, fabric, tenant, interface, intf_cleanup and route config snippets |
| rb | Reverses the last applied change by deploying the rollback configuration (rollback_config.txt) |
| diff | Prints the differences between the current_config (on the device) and desired_config (applied by Napalm) to screen |

  • diff tag can be used with bse_fbc_tnt, bse_fbc_intf, full or rb to print the configuration changes to screen
  • Changes are always saved to file no matter whether diff is used or not
  • -C or --check will do everything except actually apply the configuration

pre-validation: Validates the contents of variable files defined under var_files. Best to use dummy host file instead of dynamic inventory
ansible-playbook PB_build_fabric.yml -i hosts --tag pre_val

Generate the complete config: Creates config snippets, assembles them in config.cfg, compares against device config and prints the diff
ansible-playbook PB_build_fabric.yml -i inv_from_vars_cfg.yml --tag 'full, diff' -C

Apply the config: Replaces current config on the device with changes made automatically saved to ~/device_configs/diff/device_name.txt
ansible-playbook PB_build_fabric.yml -i inv_from_vars_cfg.yml --tag full

All roles can be deployed individually just to create the config snippet files, no connections are made to devices or changes applied. The merge tag can be used in conjunction with any combination of these role tags to non-declaratively merge the config snippets with the current device config rather than replacing it. As the L3VNIs and interfaces are generated automatically, at a bare minimum the variable files will still need the current tenants and interfaces as well as the advanced variable sections.

| Ansible tag | Playbook action |
|-------------|-----------------|
| bse | Generates the base configuration snippet saved to device_name/config/base.conf |
| fbc | Generates the fabric and intf_cleanup configuration snippets saved to fabric.conf and dflt_intf.conf |
| tnt | Generates the tenant configuration snippet saved to device_name/config/svc_tnt.conf |
| intf | Generates the interface configuration snippet saved to device_name/config/svc_intf.conf |
| rte | Generates the route configuration snippet saved to device_name/config/svc_rte.conf |
| merge | Non-declaratively merges the new and current config, can be run with any combination of role tags |

Generate the fabric config: Creates the fabric and interface cleanup config snippets and saves them to fabric.conf and dflt_intf.conf
ansible-playbook PB_build_fabric.yml -i inv_from_vars_cfg.yml --tag fbc

Apply tenants and routing non-declaratively: Adds additional tenant and routing objects by merging their config snippets with the device's config. The diffs for merges are simply the lines in the merge candidate config so won't be as accurate as the diffs from declarative deployments
ansible-playbook PB_build_fabric.yml -i inv_from_vars_cfg.yml --tag tnt,rte,merge,diff

Post Validation checks

A declaration of how the fabric should be built (desired_state) is created from the values of the variable files and validated against the actual_state. napalm_validate can only perform a compliance check against anything it has a getter for; for anything not covered by this the custom_validate filter plugin is used. This plugin uses the same napalm_validate framework but the actual state is supplied through a static input file (got using napalm_cli) rather than a getter. Both validation engines are within the same validate role with separate template and task files.

The results of the napalm_validate (nap_val.yml) and custom_validate (cus_val.yml) tasks are joined together to create the one combined compliance report. Each getter or command has a complies dictionary (True or False) to report its state, which feeds into the compliance report's overall complies dictionary. It is based on this value that a task in the post-validation playbook will raise an exception.
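
The way the per-getter and per-command results feed the overall result could be sketched roughly as below. This is a simplified assumption of the logic rather than the code used by the validate role; the function names combine_reports and check_compliance are hypothetical, and the report layout (a top-level complies flag plus one dictionary per getter or command) follows the napalm_validate report format.

```python
def combine_reports(nap_report, cus_report):
    """Merge two validation reports and recompute the overall 'complies' flag."""
    combined = {}
    for report in (nap_report, cus_report):
        for name, result in report.items():
            # Keep only per-getter/per-command results, skipping the
            # top-level 'complies' flag and 'skipped' list of each report
            if isinstance(result, dict) and "complies" in result:
                combined[name] = result
    combined["complies"] = all(r["complies"] for r in combined.values())
    return combined


def check_compliance(report):
    """Raise an exception naming the failed checks if the report does not comply."""
    if not report["complies"]:
        failed = sorted(
            name for name, result in report.items()
            if isinstance(result, dict) and not result["complies"]
        )
        raise RuntimeError("Non-compliant: " + ", ".join(failed))


if __name__ == "__main__":
    nap = {"complies": True, "skipped": [], "get_bgp_neighbors": {"complies": True}}
    cus = {"complies": False, "show vpc": {"complies": False}}
    print(combine_reports(nap, cus)["complies"])
```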

napalm_validate

As Napalm is vendor agnostic the jinja template file used to create the validation file is the same for all vendors. The following elements are validated by napalm_validate with the roles being validated in brackets.

  • hostname (fbc): Automatically created device names are correct
  • lldp_neighbors (fbc): Device's physical fabric and MLAG connections are correct
  • bgp_neighbors (fbc, tnt): Overlay neighbors are all up (strict). fbc doesn't check for sent/rcv prefixes, this is done by tnt

An example of the desired and actual state file formats.

- get_bgp_neighbors:
    global:
      router_id: 192.168.101.16
      peers:
        _mode: strict
        192.168.101.11:
          is_enabled: true
          is_up: true

custom_validate

custom_validate requires a per-OS type template file and per-OS type method within the custom_validate.py filter_plugin. The command output is collected in JSON format using napalm_cli, passed through the nxos_dm method to create a new actual_state data model and, along with the desired_state, is fed into napalm_validate using the compliance_report method.

The following elements are validated by custom_validate with the roles being validated in brackets.

  • show ip ospf neighbors detail (fbc): Underlay neighbors are all up (strict)
  • show port-channel summary (fbc, intf): Port-channel state and members (strict) are up
  • show vpc (fbc, tnt, intf): MLAG peer-link, keep-alive state, vpc status and active VLANs
  • show interfaces trunk (fbc, tnt, intf): Allowed vlans and STP forwarding vlans
  • show ip int brief include-secondary vrf all (fbc, tnt, intf): Layer3 interfaces in fabric and tenants
  • show nve peers (tnt): All VTEP tunnels are up
  • show nve vni (tnt): All VNIs are up, have correct VNI number and VLAN mapping
  • show interface status (intf): State and port type
  • show ip ospf interface brief vrf all (rte): Tenant OSPF interfaces are in correct process, area and are up
  • show bgp vrf all ipv4 unicast (rte): Prefixes advertised by network and summary are in the BGP table
  • show ip route vrf all (rte): Static routes are in the routing table with correct gateway and AD

An example of the desired and actual state file formats

cmds:
  - show ip ospf neighbors detail:
      192.168.101.11:
        state: FULL
      192.168.101.12:
        state: FULL
      192.168.101.22:
        state: FULL

To aid with creating new validations the custom_val_builder directory is a stripped down version of custom_validate to use when building new validations. The README has more detail on how to run it, the idea being to walk through each stage of creating the desired and actual state ready to add to the validate roles.

Running Post-validation

Post-validation is hierarchical as the addition of elements in the later roles affects the validation outputs of the earlier roles. For example, extra VLANs added in tenant_service will affect the bse_fbc post-validate output of show vpc (peer-link_vlans). For this reason post-validation must be run for the current role and all applied roles before it. This is done automatically by Jinja template inheritance, as calling a template with the extends statement will also render the templates it inherits from.

| Ansible tag | Playbook action |
|-------------|-----------------|
| bse_fbc | Validates the configuration applied by the base and fabric roles |
| bse_fbc_tnt | Validates the configuration applied by the base, fabric and tenant roles |
| bse_fbc_tnt_intf | Validates the configuration applied by the base, fabric, tenant and interfaces roles |
| full | Validates the configuration applied by the base, fabric, tenant, interfaces and route roles |
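
The cumulative nature of these tags can be pictured with the short sketch below. It only illustrates which roles each tag covers; the playbook itself achieves this through Jinja template inheritance, and the ROLE_ORDER and TAG_TO_LAST_ROLE names are hypothetical.

```python
# Roles in the order they are applied (hypothetical names for illustration)
ROLE_ORDER = ["base", "fabric", "tenant", "interface", "route"]

# The last role validated by each post-validation tag
TAG_TO_LAST_ROLE = {
    "bse_fbc": "fabric",
    "bse_fbc_tnt": "tenant",
    "bse_fbc_tnt_intf": "interface",
    "full": "route",
}


def roles_to_validate(tag):
    """Return the tagged role plus every role applied before it."""
    last_role = TAG_TO_LAST_ROLE[tag]
    return ROLE_ORDER[: ROLE_ORDER.index(last_role) + 1]


if __name__ == "__main__":
    for tag in TAG_TO_LAST_ROLE:
        print(tag, roles_to_validate(tag))
```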

Run fabric validation: Runs validation against the desired state got from all the variable files. There is no differentiation between napalm_validate and custom_validate, both are run as part of the validation tasks
ansible-playbook PB_post_validate.yml -i inv_from_vars_cfg.yml --tag full

Viewing compliance report: When viewing the validation report piping it through json.tool makes it more human readable
cat ~/device_configs/reports/DC1-N9K-SPINE01_compliance_report.json | python -m json.tool

Caveats

When starting this project I used N9Kv on EVE-NG and later moved onto physical devices when we were deploying the data centers. vPC fabric peering does not work on the virtual devices so this was never added as an option in the playbook.

As deployments are declarative and there are differences with physical devices you will need a few minor tweaks to the bse_tmpl.j2 template, as different hardware can have slightly different hidden base commands. An example is the command system nve infra-vlans, it is required on physical devices (the command doesn't exist on N9Kv) in order to use an SVI as an underlay interface (one that forwards/originates VXLAN-encapsulated traffic). Therefore on physical devices uncomment this line in bse_tmpl.j2, it is used for the OSPF peering over the vPC link (VLAN2).

{# system nve infra-vlans {{ fbc.adv.mlag.peer_vlan }} #}

The same applies for NXOS versions, it is only the base commands that will change (feature commands stay the same across versions) so if statements are used in bse_tmpl.j2 based on the bse.adv.image variable.

Although N9Kv works on EVE-NG it is not a perfect platform for running it. I originally started on nxos.9.2.4 and although it is fairly stable in terms of features and uptime, the API can be very slow at times, taking up to 10 minutes to deploy a device config. Sometimes after a deployment the API would stop responding (couldn't telnet on 443) but NXOS CLI said it was listening. To fix this you have to disable and re-enable the nxapi feature. Removing the command nxapi use-vrf management seems to have helped to make the API more stable.

I moved on to nxos.9.3.5 and although the API is faster and more stable, there is a different issue around the interface module. When the N9Kv went to 9.3 the interfaces were moved to a separate module.

Mod Ports             Module-Type                      Model           Status
--- ----- ------------------------------------- --------------------- ---------
1    64   Nexus 9000v 64 port Ethernet Module   N9K-X9364v            ok
27   0    Virtual Supervisor Module             N9K-vSUP              active *

With 9.3(5), 9.3(6) and 9.3(7) on EVE-NG it is fine with up to 5 or 6 N9Ks, however when you add any more N9Ks (other device types are fine) things start to become unstable. New devices take an age to boot up and when they do their interface linecards normally fail and go into the pwr-cycld state.

Mod Ports             Module-Type                      Model           Status
--- ----- ------------------------------------- --------------------- ---------
1    64   Nexus 9000v 64 port Ethernet Module                         pwr-cycld
27   0    Virtual Supervisor Module             N9K-vSUP              active *

Mod  Power-Status  Reason
---  ------------  ---------------------------
1    pwr-cycld      Unknown. Issue show system reset mod ...

This in turn makes other N9Ks unstable, some freezing and others randomly having the same linecard issue. Rebooting sometimes fixes it but due to the load times it is unworkable. I have not been able to find a reason for this, it doesn't seem to be related to resources for either the virtual device or the EVE-NG box.

On N9Kv 9.2(4) there is a bug whereby you can't have '>' in the name of the prefix-list in the route-map match statement. This name is set in the service_route.yml variables svc_rte.adv.pl_name and svc_rte.adv.pl_metric_name. The problem has been fixed in 9.3.

DC1-N9K-BGW01(config-route-map)# match ip address prefix-list PL_OSPF_BLU100->BGP_BLU
Error: CLI DN creation failed substituting values. Path sys/rpm/rtmap-[RM_OSPF_BLU100-BGP_BLU]/ent-10/mrtdst/rsrtDstAtt-[sys/rpm/pfxlistv4-[PL_OSPF_BLU100->BGP_BLU]]

If you are running these playbooks on macOS you may get the following error when running post-validations:

objc[29159]: +[__NSPlaceholderDictionary initialize] may have been in progress in another thread when fork() was called.
objc[29159]: +[__NSPlaceholderDictionary initialize] may have been in progress in another thread when fork() was called. We cannot safely call it or ignore it in the fork() child process. Crashing instead. Set a breakpoint on objc_initializeAfterForkError to debug.

This is the same behaviour as this older Ansible bug. Adding export OBJC_DISABLE_INITIALIZE_FORK_SAFETY=YES before running the post-validation playbook solved it for me.