Dual Homing
===========

Overview
--------

.. image:: ../images/config-dh.png

The dual-homing feature includes several sub-components:

- **Use of "paired" ToRs**: Each rack of compute nodes has exactly two Top-of-Rack switches (ToRs)
  that are linked to each other via a single link. Such a link is referred to as a **pair link**.
  This pairing should NOT be omitted.
  Currently there is support for only a single link between paired ToRs.
  In future releases, we may include dual pair links.
  Note that the pair link is only used in failure scenarios, and not in normal operation.

- **Dual-homed servers (compute nodes)**: Each server is connected to both ToRs.
  The links to the paired ToRs are (Linux) bonded.

- **Dual-homed upstream routers**: The upstream routers MUST be connected to the two ToRs that are part of a leaf pair.
  You cannot connect them to leaves that are not paired. This feature also requires two Quagga instances.

- **Dual-homed access devices**: This component will be added in the future.

Paired ToRs
-----------
The reasoning behind two ToR (leaf) switches is simple.
If you only have a single ToR switch and you lose it, the entire rack goes down.
Using two ToR switches increases the odds of continued connectivity for dual-homed servers.
The reasoning behind pairing the two ToR switches is more involved, as explained in the Usage of pair link section below.

Configure pair ToRs
^^^^^^^^^^^^^^^^^^^
Configuring paired ToRs involves device configuration. Assume switches of:205 and of:206 are paired ToRs.

.. code-block:: json

   {
     "devices" : {
       "of:0000000000000205" : {
         "segmentrouting" : {
           "name" : "Leaf1-R2",
           "ipv4NodeSid" : 205,
           "ipv4Loopback" : "192.168.0.205",
           "ipv6NodeSid" : 205,
           "ipv6Loopback" : "2000::c0a8:0205",
           "routerMac" : "00:00:02:05:00:01",
           "pairDeviceId" : "of:0000000000000206",
           "pairLocalPort" : 20,
           "isEdgeRouter" : true,
           "adjacencySids" : []
         }
       },
       "of:0000000000000206" : {
         "segmentrouting" : {
           "name" : "Leaf2-R2",
           "ipv4NodeSid" : 206,
           "ipv4Loopback" : "192.168.0.206",
           "ipv6NodeSid" : 206,
           "ipv6Loopback" : "2000::c0a8:0206",
           "routerMac" : "00:00:02:05:00:01",
           "pairDeviceId" : "of:0000000000000205",
           "pairLocalPort" : 30,
           "isEdgeRouter" : true,
           "adjacencySids" : []
         }
       }
     }
   }

There are two new pieces of device configuration.

- Each device in the ToR pair needs to specify the **deviceId of the leaf it is paired to** in the ``pairDeviceId`` field.
  For example, in the of:205 configuration the ``pairDeviceId`` is specified as of:206, and similarly in the of:206 configuration the ``pairDeviceId`` is of:205.
- Each device in the ToR pair needs to specify the **port on the device used for the pair link** in the ``pairLocalPort`` field.
  For example, the config above shows that port 20 on of:205 is connected to port 30 on of:206.

In addition, there is one crucial piece of config that needs to **match for both ToRs**: the ``routerMac`` address.
The paired ToRs MUST have the same ``routerMac``. In the example above, both use ``00:00:02:05:00:01``.

All other fields are the same as before, as explained in the :doc:`Device Configuration <device-config>` section.


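The pairing rules above can be checked mechanically before the config is pushed.
The following is a minimal sketch, not part of Trellis or ONOS: the function names ``check_tor_pair`` and ``example_netcfg`` are hypothetical, and the sample data is a trimmed copy of the device config above that keeps only the fields being checked.

.. code-block:: python

   def check_tor_pair(netcfg, dev_a, dev_b):
       """Return a list of problems with the pairing of dev_a and dev_b (empty if none)."""
       problems = []
       cfg_a = netcfg["devices"][dev_a]["segmentrouting"]
       cfg_b = netcfg["devices"][dev_b]["segmentrouting"]

       for dev, cfg, peer in ((dev_a, cfg_a, dev_b), (dev_b, cfg_b, dev_a)):
           # Each ToR must name the other one in pairDeviceId.
           if cfg.get("pairDeviceId") != peer:
               problems.append(f"{dev}: pairDeviceId should be {peer}")
           # Each ToR must state the local port used for the pair link.
           if not isinstance(cfg.get("pairLocalPort"), int):
               problems.append(f"{dev}: pairLocalPort is missing")

       # The routerMac must be present and identical on both ToRs.
       mac_a, mac_b = cfg_a.get("routerMac"), cfg_b.get("routerMac")
       if not mac_a or not mac_b or mac_a.lower() != mac_b.lower():
           problems.append("routerMac must be set and identical on both ToRs")
       return problems


   def example_netcfg():
       """Trimmed copy of the device config above (hypothetical helper)."""
       def leaf(pair, port):
           return {"segmentrouting": {"routerMac": "00:00:02:05:00:01",
                                      "pairDeviceId": pair,
                                      "pairLocalPort": port}}
       return {"devices": {"of:0000000000000205": leaf("of:0000000000000206", 20),
                           "of:0000000000000206": leaf("of:0000000000000205", 30)}}


   if __name__ == "__main__":
       print(check_tor_pair(example_netcfg(),
                            "of:0000000000000205", "of:0000000000000206"))

An empty list means the three rules hold; each string in a non-empty list names one rule that was violated.
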
Usage of pair link
^^^^^^^^^^^^^^^^^^

.. image:: ../images/config-dh-pair-link.png


Dual-Homed Servers
------------------
There are a number of things to note when connecting dual-homed servers to paired ToRs.

- The switch ports on the two ToRs that connect to a dual-homed server have to be configured the same way.
- The server ports have to be Linux-bonded in a particular mode.

Configure Switch Ports
^^^^^^^^^^^^^^^^^^^^^^
Ports are configured in a way similar to that described in :doc:`Bridging and Unicast <bridging-unicast>`.
However, there are a couple of things to note.

**First**, dual-homed servers should have **identical configuration on each switch port they connect to on the ToR pair**.
The example below shows that the VLAN and ``ips`` configuration is the same on both switch ports ``of:205/12`` and ``of:206/29``.
They are both configured to be access ports in ``VLAN 20``, the subnet ``10.0.2.0/24`` is assigned to these ports, and the gateway IP is ``10.0.2.254``.

.. code-block:: json

   {
     "ports" : {
       "of:0000000000000205/12" : {
         "interfaces" : [{
           "name" : "h3-intf-1",
           "ips" : [ "10.0.2.254/24" ],
           "vlan-untagged": 20
         }]
       },
       "of:0000000000000206/29" : {
         "interfaces" : [{
           "name" : "h3-intf-2",
           "ips" : [ "10.0.2.254/24" ],
           "vlan-untagged": 20
         }]
       }
     }
   }

It is worth noting the meaning behind the configuration above from a routing perspective.
Simply put, by configuring the same subnets on these switch ports, the fabric now believes that the entire subnet ``10.0.2.0/24`` is reachable by BOTH ToR switches ``of:205`` and ``of:206``.

.. caution::
   Configuring different VLANs, or different subnets, or mismatches like ``vlan-untagged`` in one switch port and ``vlan-tagged`` in the corresponding switch port facing the dual-homed server, will result in incorrect behavior.

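A mismatch of the kind described in the caution above is easy to miss by eye, so it can help to compare the two server-facing ports programmatically.
This is a small sketch under stated assumptions: ``port_summary``, ``dual_homed_mismatches`` and ``example_ports`` are hypothetical names, only the first interface of each port is inspected, and only the ``ips``, ``vlan-untagged`` and ``vlan-tagged`` keys shown in this document are compared.

.. code-block:: python

   VLAN_KEYS = ("vlan-untagged", "vlan-tagged")


   def port_summary(netcfg, port):
       """Collect the subnets and VLAN settings of the first interface on a port."""
       intf = netcfg["ports"][port]["interfaces"][0]
       vlans = {key: intf[key] for key in VLAN_KEYS if key in intf}
       return {"ips": sorted(intf.get("ips", [])), "vlans": vlans}


   def dual_homed_mismatches(netcfg, port_a, port_b):
       """Return the names of the fields that differ between the two ports."""
       a, b = port_summary(netcfg, port_a), port_summary(netcfg, port_b)
       return [field for field in ("ips", "vlans") if a[field] != b[field]]


   def example_ports():
       """Copy of the two server-facing ports from the example above."""
       def access_port(name):
           return {"interfaces": [{"name": name,
                                   "ips": ["10.0.2.254/24"],
                                   "vlan-untagged": 20}]}
       return {"ports": {"of:0000000000000205/12": access_port("h3-intf-1"),
                         "of:0000000000000206/29": access_port("h3-intf-2")}}


   if __name__ == "__main__":
       print(dual_homed_mismatches(example_ports(),
                                   "of:0000000000000205/12",
                                   "of:0000000000000206/29"))

The interface ``name`` is deliberately ignored, since the two ports in the example carry different names while still being configured identically.
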
**Second**, we need to configure the **pair link ports on both ToR switches to be trunk (vlan-tagged) ports that contain all dual-homed VLANs and subnets**.
This is an extra piece of configuration, the need for which will be removed in future releases.
In the example above, a dual-homed server connects to the ToR pair on port 12 on of:205 and port 29 on of:206.
Assume that the pair link between the two ToRs is connected to port 5 of both of:205 and of:206.
The config for these switch ports is shown below:

.. code-block:: json

   {
     "ports": {
       "of:0000000000000205/5" : {
         "interfaces" : [{
           "name" : "205-pair-port",
           "ips" : [ "10.0.2.254/24" ],
           "vlan-tagged": [20]
         }]
       },
       "of:0000000000000206/5" : {
         "interfaces" : [{
           "name" : "206-pair-port",
           "ips" : [ "10.0.2.254/24" ],
           "vlan-tagged": [20]
         }]
       }
     }
   }

.. note::
   Even though the ports ``of:205/12`` and ``of:206/29`` facing the dual-homed server are configured as ``vlan-untagged``,
   the same VLAN MUST be configured as ``vlan-tagged`` on the pair ports.
   If additional subnets and VLANs are configured facing other dual-homed servers, they need to be similarly added to the ``ips`` and ``vlan-tagged`` arrays in the pair port config.


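Because the pair port has to carry every dual-homed VLAN and subnet, its config can be derived from the server-facing ports instead of being maintained by hand.
The sketch below illustrates that idea; ``pair_port_interface`` and ``example_ports`` are hypothetical names, and the second server-facing port (``of:205/13`` in VLAN 30 with subnet ``10.0.3.0/24``) is an invented example, not part of the deployment described here.

.. code-block:: python

   def pair_port_interface(netcfg, server_ports, name):
       """Build the pair-port interface entry from the server-facing ports of one ToR."""
       ips, vlans = [], set()
       for port in server_ports:
           for intf in netcfg["ports"][port]["interfaces"]:
               # Keep every subnet once, in the order it is first seen.
               for ip in intf.get("ips", []):
                   if ip not in ips:
                       ips.append(ip)
               # Untagged and tagged VLANs both become tagged on the pair port.
               vlans.update(intf.get("vlan-tagged", []))
               if "vlan-untagged" in intf:
                   vlans.add(intf["vlan-untagged"])
       return {"name": name, "ips": ips, "vlan-tagged": sorted(vlans)}


   def example_ports():
       """Port 12 is from the example above; port 13 is a hypothetical addition."""
       return {"ports": {
           "of:0000000000000205/12": {"interfaces": [{
               "name": "h3-intf-1", "ips": ["10.0.2.254/24"], "vlan-untagged": 20}]},
           "of:0000000000000205/13": {"interfaces": [{
               "name": "h4-intf-1", "ips": ["10.0.3.254/24"], "vlan-untagged": 30}]},
       }}


   if __name__ == "__main__":
       import json
       entry = pair_port_interface(example_ports(),
                                   ["of:0000000000000205/12", "of:0000000000000205/13"],
                                   "205-pair-port")
       print(json.dumps({"interfaces": [entry]}, indent=2))

With only port 12 as input, the result matches the ``205-pair-port`` entry shown above.
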
Configure Servers
^^^^^^^^^^^^^^^^^
Assume the interfaces we are going to use for bonding are ``eth1`` and ``eth2``.

- Bring down interfaces

  .. code-block:: console

     $ sudo ifdown eth1
     $ sudo ifdown eth2

- Modify ``/etc/network/interfaces``

  .. code-block:: text

     auto bond0
     iface bond0 inet dhcp
         bond-mode balance-xor
         bond-xmit_hash_policy layer2+3
         bond-slaves none

     auto eth1
     iface eth1 inet manual
         bond-master bond0

     auto eth2
     iface eth2 inet manual
         bond-master bond0

- Start interfaces

  .. code-block:: console

     $ sudo ifup bond0
     $ sudo ifup eth1
     $ sudo ifup eth2

- Useful command to check bonding status

  .. code-block:: console

     # cat /proc/net/bonding/bond0
     Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

     Bonding Mode: load balancing (xor)
     Transmit Hash Policy: layer2+3 (2)
     MII Status: up
     MII Polling Interval (ms): 0
     Up Delay (ms): 0
     Down Delay (ms): 0

     Slave Interface: eth1
     MII Status: up
     Speed: 1000 Mbps
     Duplex: full
     Link Failure Count: 0
     Permanent HW addr: 00:1c:42:5b:07:6a
     Slave queue ID: 0

     Slave Interface: eth2
     MII Status: up
     Speed: Unknown
     Duplex: Unknown
     Link Failure Count: 0
     Permanent HW addr: 00:1c:42:1c:a1:7c
     Slave queue ID: 0

.. caution::
   **Dual-homed hosts should not be statically configured.**

   Currently in ONOS, configured hosts are not updated when the connectPoint is lost.
   This is not a problem with single-homed hosts because there is no other way to reach them anyway if their connectPoint goes down.
   But in dual-homed scenarios, the controller should take corrective action if one of the connectPoints goes down.
   The trigger for this event does not happen when the dual-homed host's connect points are configured (not discovered).

.. note::
   We also support static routes with a dual-homed next hop.
   The way to configure it is exactly the same as for a regular single-homed next hop, as described in :doc:`External Connectivity <external-connectivity>`.

   ONOS will automatically recognize when the next-hop IP resolves to a dual-homed host and program both switches (that the host connects to) accordingly.

   The failure recovery mechanism for dual-homed hosts also applies to static routes that point to the host as their next hop.


Dual External Routers
---------------------

.. image:: ../images/config-dh-vr.png

.. image:: ../images/config-dh-vr-logical.png
   :width: 200px

In addition to what we describe in :doc:`External Connectivity <external-connectivity>`,
Trellis also supports dual external routers, which view the Trellis fabric as 2 individual routers, as shown above.

As before, the vRouter control plane is implemented as a combination of Quagga,
which peers with the upstream routers, and ONOS, which listens to Quagga (via FPM) and programs the underlying fabric.
**In dual-router scenarios, two instances of Quagga are required**.

As before, the hardware fabric serves as the data plane of vRouter.
In dual-router scenarios, the **external routers MUST be connected to paired ToRs**.


ToR connects to one upstream
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Let's consider the simpler case where the external routers are each connected to a single leaf in a ToR pair.
The figure on the left below shows the logical view. The figure on the right shows the physical connectivity.

.. image:: ../images/config-dh-vr-logical-simple.png
   :width: 200px

.. image:: ../images/config-dh-vr-physical-simple.png
   :width: 400px

One of the upstream routers is connected to ``of:205`` and the other is connected to ``of:206``.
Note that ``of:205`` and ``of:206`` are paired ToRs.

Each ToR is connected via a physical port to a separate Quagga VM or container.
These Quagga instances can be placed in any compute node. They do not need to be in the same server, and are only shown to be co-located for simplicity.

The two Quagga instances do NOT talk to each other.


Switch port configuration
"""""""""""""""""""""""""
The ToRs follow the same rules as the single-router case described in :doc:`External Connectivity <external-connectivity>`.
In the example shown above, the switch port config would look like this:

.. code-block:: json

   {
     "ports": {
       "of:0000000000000205/1" : {
         "interfaces" : [{
           "ips" : [ "10.0.100.3/29", "2000::6403/125" ],
           "vlan-untagged": 100,
           "name" : "internet-router-1"
         }]
       },
       "of:0000000000000205/48" : {
         "interfaces" : [{
           "ips" : [ "10.0.100.3/29", "2000::6403/125" ],
           "vlan-untagged": 100,
           "name" : "quagga-1"
         }]
       },
       "of:0000000000000206/1" : {
         "interfaces" : [{
           "ips" : [ "10.0.200.3/29", "2000::6503/125" ],
           "vlan-untagged": 200,
           "name" : "internet-router-2"
         }]
       },
       "of:0000000000000206/48" : {
         "interfaces" : [{
           "ips" : [ "10.0.200.3/29", "2000::6503/125" ],
           "vlan-untagged": 200,
           "name" : "quagga-2"
         }]
       }
     }
   }

.. note::
   In the example shown above, switch ``of:205`` uses ``VLAN 100`` for bridging the peering session between Quagga1 and ExtRouter1,
   while switch ``of:206`` uses ``VLAN 200`` to do the same for the other peering session.
   But since these VLANs and bridging domains are defined on different switches, the VLAN IDs could have been the same.

   This philosophy is consistent with the fabric use of :doc:`bridging <bridging-unicast>`.


Quagga configuration
""""""""""""""""""""
Configuring Quagga for dual external routers is similar to what we described in :doc:`External Connectivity <external-connectivity>`. However, it is worth noting that:

- The two Zebra instances **should point to two different ONOS instances** for their FPM connections.
  For example, Zebra in Quagga1 could point to one ONOS instance with ``fpm connection ip 10.6.0.1 port 2620``,
  while the other Zebra should point to a different ONOS instance with ``fpm connection ip 10.6.0.2 port 2620``.
  It does not matter which ONOS instances they point to as long as they are different.
- The two Quagga BGP sessions should appear to come from different routers but still use the same AS number,
  i.e. the two Quagga instances belong to the same AS, the one used to represent the entire Trellis infrastructure.
- The two upstream routers can belong to the same or different AS,
  but these AS numbers should be different from the one used to represent the Trellis AS.
- Typically both Quagga instances advertise the same routes to the upstream.
  These prefixes, which belong to various infrastructure nodes in the deployment, should be reachable from either of the leaf switches connected to the upstream routers.
- The upstream routers may or may not advertise the same routes.
  Trellis will ensure that traffic directed to a route reachable via only one upstream router is directed to the appropriate leaf.


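The first three rules in the list above lend themselves to a quick sanity check on a deployment plan.
The following sketch is only illustrative: the dictionary layout (``name``, ``fpm_ip``, ``local_as``, ``upstream_as``) and the function name ``check_dual_quagga`` are hypothetical and do not correspond to any Quagga or ONOS data format, and the AS numbers are placeholders from the private range. Only the two FPM addresses are taken from the text above.

.. code-block:: python

   def check_dual_quagga(instances, trellis_as):
       """Return a list of rule violations for a pair of Quagga instances."""
       problems = []

       # Rule 1: each Zebra must use a different ONOS instance for FPM.
       fpm_targets = [inst["fpm_ip"] for inst in instances]
       if len(set(fpm_targets)) != len(fpm_targets):
           problems.append("Zebra instances must point to different ONOS instances")

       for inst in instances:
           # Rule 2: both Quagga instances belong to the Trellis AS.
           if inst["local_as"] != trellis_as:
               problems.append(f"{inst['name']}: local AS must be the Trellis AS")
           # Rule 3: upstream routers must not reuse the Trellis AS.
           if trellis_as in inst["upstream_as"]:
               problems.append(f"{inst['name']}: upstream AS must differ from the Trellis AS")
       return problems


   # Hypothetical plan: AS 65001 for Trellis, 65002 and 65003 for the upstreams.
   PLAN = [
       {"name": "quagga1", "fpm_ip": "10.6.0.1", "local_as": 65001, "upstream_as": [65002]},
       {"name": "quagga2", "fpm_ip": "10.6.0.2", "local_as": 65001, "upstream_as": [65003]},
   ]

   if __name__ == "__main__":
       print(check_dual_quagga(PLAN, trellis_as=65001))

The rules about advertised routes are not covered here, since they depend on the actual BGP configuration rather than on static parameters.
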
ToR connects to both upstream
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Now let's consider the **more complicated but more fault-tolerant** case of each Quagga instance peering with BOTH external routers.
Again the logical view is shown on the left and the physical view on the right.

.. image:: ../images/config-dh-vr-logical.png
   :width: 200px

.. image:: ../images/config-dh-vr-physical.png
   :width: 500px

First let's talk about the physical connectivity:

- Quagga instance 1 peers with external router R1 via port 1 on switch of:205
- Quagga instance 1 peers with external router R2 via port 2 on switch of:205

Similarly:

- Quagga instance 2 peers with external router R1 via port 2 on switch of:206
- Quagga instance 2 peers with external router R2 via port 1 on switch of:206

To distinguish between the two peering sessions in the same physical switch, say of:205,
the physical ports 1 and 2 need to be configured in **different VLANs and subnets**.
For example, port 1 on of:205 is (untagged) in VLAN 100, while port 2 is in VLAN 101.

Note that peering for **Quagga1 and R1** happens with IPs in the ``10.0.100.0/29`` subnet,
and for **Quagga1 and R2** in the ``10.0.101.0/29`` subnet.

Furthermore, the **port facing Quagga1** (port 48) on of:205 carries both peering sessions to Quagga1.
Thus port 48 should now be configured as a **trunk port (vlan-tagged) with both VLANs and both subnets**.

Finally, the **Quagga interface** on the VM now needs **sub-interface configuration for each VLAN ID**.

Similar configuration concepts apply to IPv6 as well. Here is a look at the switch port config in ONOS for of:205:

.. code-block:: json

   {
     "ports": {
       "of:0000000000000205/1" : {
         "interfaces" : [{
           "ips" : [ "10.0.100.3/29", "2000::6403/125" ],
           "vlan-untagged": 100,
           "name" : "internet-router1"
         }]
       },
       "of:0000000000000205/2" : {
         "interfaces" : [{
           "ips" : [ "10.0.101.3/29", "2000::7403/125" ],
           "vlan-untagged": 101,
           "name" : "internet-router2"
         }]
       },
       "of:0000000000000205/48" : {
         "interfaces" : [{
           "ips" : [ "10.0.100.3/29", "2000::6403/125", "10.0.101.3/29", "2000::7403/125" ],
           "vlan-tagged": [100, 101],
           "name" : "quagga1"
         }]
       }
     }
   }