
Azure bicep template behavior with virtual_network_resource_group and virtual_network_name #4300

@mcgov

Description


There's a bug that has been blocking DPDK l3fwd testing in some environments, specifically those where you can't use a public IP in Azure. The workaround was adding the option to use an existing RG and virtual network, selectable with the runbook option virtual_network_resource_group.

This option silently changes the subnet layout for LISA environments. Without it, an environment with several VMs, each with three NICs attached, places the NICs on separate subnets: 10.0.0.0/24, 10.0.1.0/24, and 10.0.2.0/24. When the option is enabled, all NICs are placed in the same address space in the existing vnet, 10.0.0.0/16.
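To make the default layout concrete, here's a small Python sketch of the per-NIC subnet carving using the stdlib `ipaddress` module (the `default_subnets` helper is hypothetical, just mirroring the layout described above):

```python
import ipaddress

def default_subnets(nic_count: int) -> list[str]:
    # Hypothetical illustration: NIC i lands on 10.0.<i>.0/24,
    # carved out of the environment's 10.0.0.0/16 space.
    base = ipaddress.ip_network("10.0.0.0/16")
    return [str(s) for s in list(base.subnets(new_prefix=24))[:nic_count]]

print(default_subnets(3))  # ['10.0.0.0/24', '10.0.1.0/24', '10.0.2.0/24']
```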

This behavior wasn't intentional as far as I can tell, so I'd like to propose a fix that preserves the ability to put NICs on different subnets while still avoiding public IP allocations. The problem to solve is that you cannot have two Azure vnets in the same RG, and you cannot have one vnet span two RGs... sort of. You can, however, peer two vnets, and you can peer individual subnets from different vnets in different RGs. There are some caveats:

  • There is a restriction that their address spaces cannot overlap.
  • There is another, somewhat arbitrary restriction: you cannot have more than ~500 peered subnets in a vnet.
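The no-overlap caveat is easy to check up front. A minimal sketch with Python's `ipaddress` module, using the address ranges from the proposal below as example prefixes:

```python
import ipaddress

# The orchestrator's default-route space and the per-environment test space
# (prefixes taken from the proposed layout; any non-overlapping pair works).
orchestrator = ipaddress.ip_network("10.0.0.0/8")
test_space = ipaddress.ip_network("192.168.0.0/16")

print(orchestrator.overlaps(test_space))  # False: safe to peer
# A prefix inside 10.0.0.0/8 would violate the restriction:
print(orchestrator.overlaps(ipaddress.ip_network("10.1.0.0/24")))  # True
```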

Large runs like the smoke tests will probably want to keep allocating all their VMs in a single /16 address space to ensure they don't run out of addresses. This should be easy to special-case, since the ARM template already has a subnet_count parameter.

The naive solution would be to just extend the 10.0.$subnet_id.0/24 pattern, creating a subnet for each environment: something like 10.$environment_id.$subnet_id.0/24. That runs out of addresses fairly quickly, however, and cleaning up resources after multiple deployments into the same RG and vnet gets messy unless you track every resource ID you've created. We don't do that currently; the template does all the heavy lifting. We keep a resource group name and virtual network name for the environment and have to query the Azure SDK for any Azure specifics after the initial deployment. That's unhelpful when there is only one RG and vnet for the entire run.
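A quick back-of-the-envelope makes the exhaustion concrete; the numbers follow directly from the octet widths in the 10.$environment_id.$subnet_id.0/24 pattern, nothing LISA-specific:

```python
# With 10.<env_id>.<subnet_id>.0/24, each identifier occupies one octet,
# so the naive scheme caps out at 256 environments per run.
max_environments = 2 ** 8      # env_id is a single octet
max_subnets_per_env = 2 ** 8   # subnet_id is a single octet
print(max_environments, max_subnets_per_env)  # 256 256
```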

So! Here's my idea for a vnet layout (using subnet peering) that keeps test-environment and orchestration resources separated. It also lets test environments have multiple subnets again, without needing a public IP for SSH access to the orchestrator.

Bicep pseudocode to illustrate the scheme is below. Let me know if you see any potential issues with my proposal.

// We can leverage partial vnet peering (at the subnet level) to share one network for the default NICs and the orchestrator (i.e. eth0).
// We can pair many small subnets within the default prefix to give the orchestrator VM SSH access to eth0 on the test VMs.
//
// The remaining test NICs (eth1/eth2, etc.) then get a separate, large, non-unique address space for their subnets.
// The easy-to-implement version of this allows for a large number of test environments (~16k),
//   each with a maximum of 256 VMs with 256 NICs per VM.
//
// There is an Azure restriction of ~500 pairings per vnet at a time, however, so if we go over that amount we would need to garbage collect.
// For large runs with simple tests, like a smoke test with more than 500 environments,
// we could just set subnet_count to 0 and special-case it to leave everything in 10.0.0.0/16.
//
// The template for the orchestrator will need to reserve a small subnet;
// it can be anything, as long as we know which address range to avoid in the test RG template.
// Something like this:
resource vnet 'Microsoft.Network/virtualNetworks@2024-01-01' = {
  name: 'orchestrator-vnet'
  location: location
  properties: {
    addressSpace: {
      addressPrefixes: [
        '10.0.0.0/8' // addressPrefix for the default route: the SSH connection between orchestrator and test VMs
      ]
    }
    subnets: [
      {
        name: 'default'
        properties: { addressPrefix: '10.255.255.0/24' }
      }
    ]
    // ...
  }
}


// each test environment (in its own RG) has a vnet that looks like this:

param env_index int
param subnet_count int

// reference to the existing orchestrator RG and vnet
resource orchestrator_vnet 'Microsoft.Network/virtualNetworks@2024-01-01' existing = {
  name: 'orchestrator-vnet'
  scope: resourceGroup('orchestrator_rg')
}


resource vnet 'Microsoft.Network/virtualNetworks@2024-01-01' = {
  name: 'test-vnet-e${env_index}'
  location: location
  properties: {
    addressSpace: {
      addressPrefixes: [
        '10.0.0.0/8' // the addressPrefix for the default route
        '192.168.0.0/16' // the addressPrefix for the test subnets
      ]
    }
    // Declare all the subnets we need; subnet 0 will be the default route.
    // Note that the default-route address range for the environment cannot be shared with other environments.
    // The rest of the subnets are placed within the test NIC address range;
    // these subnets do not need to be unique, so they are indexed up to subnet_count.
    // Note: assert that env_index is less than 0xFFFF.
    subnets: [for i in range(0, subnet_count): {
      name: i == 0 ? 'default' : 'test-subnet-${i}'
      properties: { addressPrefix: i == 0 ? '10.${env_index / 256}.${env_index % 256}.0/24' : '192.168.${i - 1}.0/24' }
    }]
    // Declare the vnet peering with the remote vnet in the orchestrator RG.
    // Note that 'peerCompleteVnets: false' is important.
    virtualNetworkPeerings: [
      {
        name: 'vnet-peering-e${env_index}'
        properties: {
          allowVirtualNetworkAccess: true
          localSubnetNames: ['default']
          peerCompleteVnets: false
          remoteSubnetNames: ['default']
          remoteVirtualNetwork: { id: orchestrator_vnet.id } // reference to the orchestrator vnet
        }
      }
    ]
  }
}
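To sanity-check the octet math in the default-route prefix, here's a tiny Python mirror of the Bicep expression (the `default_route_prefix` helper and the sample indices are illustrative only; Bicep's `/` on integers is integer division, matching `//` here):

```python
def default_route_prefix(env_index: int) -> str:
    # Mirrors '10.${env_index/256}.${env_index%256}.0/24' from the template.
    assert env_index < 0xFFFF, "env_index must fit in two octets"
    return f"10.{env_index // 256}.{env_index % 256}.0/24"

print(default_route_prefix(0))      # 10.0.0.0/24
print(default_route_prefix(257))    # 10.1.1.0/24
print(default_route_prefix(65000))  # 10.253.232.0/24
```

Each environment's default subnet is unique across the 10.0.0.0/8 space, which satisfies the no-overlap rule for peering with the orchestrator's 10.255.255.0/24.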

whaddya think
