Dual Homing
===========

Overview
--------

.. image:: ../images/config-dh.png

The dual-homing feature includes several sub-components:

- **Use of "paired" ToRs**: Each rack of compute nodes has exactly two
  Top-of-Rack switches (ToRs) that are linked to each other via a single
  link. Such a link is referred to as a **pair link**. This pairing should
  NOT be omitted.

  Currently there is support for only a single link between paired ToRs. In
  future releases, we may include dual pair links. Note that the pair link is
  only used in failure scenarios, and not in normal operation.

- **Dual-homed servers (compute nodes)**: Each server is connected to both
  ToRs. The links to the paired ToRs are (Linux) bonded.

- **Dual-homed upstream routers**: The upstream routers MUST be connected to
  the two ToRs that are part of a leaf pair. You cannot connect them to leaves
  that are not paired. This feature also requires two Quagga instances.

- **Dual-homed access devices**: This component will be added in the future.

Paired ToRs
-----------

The reasoning behind two ToR (leaf) switches is simple. If you only have a
single ToR switch, and you lose it, the entire rack goes down. Using two ToR
switches increases your odds for continued connectivity for dual-homed servers.
The reasoning behind pairing the two ToR switches is more involved, and is
explained in the Usage section below.

Configure pair ToRs
^^^^^^^^^^^^^^^^^^^

Configuring paired ToRs involves device configuration. Assume switches
``of:205`` and ``of:206`` are paired ToRs.

.. code-block:: json

   {
     "devices" : {
       "of:0000000000000205" : {
         "segmentrouting" : {
           "name" : "Leaf1-R2",
           "ipv4NodeSid" : 205,
           "ipv4Loopback" : "192.168.0.205",
           "ipv6NodeSid" : 205,
           "ipv6Loopback" : "2000::c0a8:0205",
           "routerMac" : "00:00:02:05:00:01",
           "pairDeviceId" : "of:0000000000000206",
           "pairLocalPort" : 20,
           "isEdgeRouter" : true,
           "adjacencySids" : []
         }
       },
       "of:0000000000000206" : {
         "segmentrouting" : {
           "name" : "Leaf2-R2",
           "ipv4NodeSid" : 206,
           "ipv4Loopback" : "192.168.0.206",
           "ipv6NodeSid" : 206,
           "ipv6Loopback" : "2000::c0a8:0206",
           "routerMac" : "00:00:02:05:00:01",
           "pairDeviceId" : "of:0000000000000205",
           "pairLocalPort" : 30,
           "isEdgeRouter" : true,
           "adjacencySids" : []
         }
       }
     }
   }

There are two new pieces of device configuration.

- Each device in the ToR pair needs to specify the **deviceId of the leaf it
  is paired to**, in the ``pairDeviceId`` field. For example, in the
  ``of:205`` configuration the ``pairDeviceId`` is specified as ``of:206``,
  and similarly in the ``of:206`` configuration the ``pairDeviceId`` is
  ``of:205``.

- Each device in the ToR pair needs to specify the **port on the device used
  for the pair link** in the ``pairLocalPort`` field. For example, the config
  above shows that port 20 on ``of:205`` is connected to port 30 on
  ``of:206``.

In addition, there is one crucial piece of config that needs to **match for
both ToRs**: the ``routerMac`` address. The paired ToRs MUST have the same
``routerMac``. In the example above, both use ``00:00:02:05:00:01``.

All other fields are the same as before, as explained in the :doc:`Device
Configuration <device-config>` section.


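The pairing rules in this section (reciprocal ``pairDeviceId``, a
``pairLocalPort`` on each side, and an identical ``routerMac``) can be checked
before the config is pushed. The following is only a minimal sketch:
``check_pairing`` is a hypothetical helper written for this page, not part of
ONOS or Trellis, and it assumes the netcfg JSON has already been loaded into a
Python dict.

.. code-block:: python

   import json

   # Device config mirroring the example above, reduced to the fields the
   # check needs.
   NETCFG = json.loads("""
   {
     "devices": {
       "of:0000000000000205": {"segmentrouting": {
         "routerMac": "00:00:02:05:00:01",
         "pairDeviceId": "of:0000000000000206",
         "pairLocalPort": 20}},
       "of:0000000000000206": {"segmentrouting": {
         "routerMac": "00:00:02:05:00:01",
         "pairDeviceId": "of:0000000000000205",
         "pairLocalPort": 30}}
     }
   }
   """)


   def check_pairing(netcfg):
       """Return a list of problems found in the ToR pairing config.

       Hypothetical helper for illustration only.
       """
       problems = []
       devices = netcfg["devices"]
       for dev_id, dev in devices.items():
           sr = dev["segmentrouting"]
           pair_id = sr.get("pairDeviceId")
           if pair_id is None:
               continue  # not a paired ToR, nothing to check
           if "pairLocalPort" not in sr:
               problems.append(f"{dev_id}: pairLocalPort is missing")
           peer = devices.get(pair_id, {}).get("segmentrouting")
           if peer is None:
               problems.append(f"{dev_id}: pair device {pair_id} is not configured")
               continue
           if peer.get("pairDeviceId") != dev_id:
               problems.append(f"{dev_id}: {pair_id} does not point back to it")
           if peer.get("routerMac") != sr.get("routerMac"):
               problems.append(f"{dev_id}: routerMac differs from {pair_id}")
       return problems


   print(check_pairing(NETCFG) or "pairing config looks consistent")

Each problem is reported once per device, so a ``routerMac`` mismatch shows up
for both members of the pair.
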
Usage of pair link
^^^^^^^^^^^^^^^^^^

.. image:: ../images/config-dh-pair-link.png


Dual-Homed Servers
------------------

There are a number of things to note when connecting dual-homed servers to paired ToRs.

- The switch ports on the two ToRs have to be configured the same way when
  connecting a dual-homed server to the two ToRs.

- The server ports have to be Linux-bonded in a particular mode.

Configure Switch Ports
^^^^^^^^^^^^^^^^^^^^^^

Switch ports are configured in much the same way as described in
:doc:`Bridging and Unicast <bridging-unicast>`. However, there are a couple of
things to note.

**First**, dual-homed servers should have the **identical configuration on each
switch port they connect to on the ToR pair**. The example below shows that
the ``vlans`` and ``ips`` configured are the same on both switch ports
``of:205/12`` and ``of:206/29``. They are both configured to be access ports
in ``VLAN 20``, the subnet ``10.0.2.0/24`` is assigned to these ports, and the
gateway IP is ``10.0.2.254``.

.. code-block:: json

   {
     "ports" : {
       "of:0000000000000205/12" : {
         "interfaces" : [{
           "name" : "h3-intf-1",
           "ips" : [ "10.0.2.254/24"],
           "vlan-untagged": 20
         }]
       },
       "of:0000000000000206/29" : {
         "interfaces" : [{
           "name" : "h3-intf-2",
           "ips" : [ "10.0.2.254/24"],
           "vlan-untagged": 20
         }]
       }
     }
   }

It is worth noting the meaning behind the configuration above from a routing
perspective. Simply put, by configuring the same subnets on these switch
ports, the fabric now believes that the entire subnet ``10.0.2.0/24`` is
reachable by BOTH ToR switches ``of:205`` and ``of:206``.

.. caution::
   Configuring different VLANs, or different subnets, or mismatches like
   ``vlan-untagged`` in one switch port and ``vlan-tagged`` in the
   corresponding switch port facing the dual-homed server, will result in
   incorrect behavior.

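One way to catch the mismatches described in the caution is to compare the
two server-facing ports field by field, ignoring only the interface name. The
sketch below assumes the ``ports`` section is available as a Python dict;
``interface_signature`` and ``ports_match`` are hypothetical names used for
illustration, not ONOS APIs.

.. code-block:: python

   # Port config mirroring the example above.
   PORTS = {
       "of:0000000000000205/12": {"interfaces": [
           {"name": "h3-intf-1", "ips": ["10.0.2.254/24"], "vlan-untagged": 20}]},
       "of:0000000000000206/29": {"interfaces": [
           {"name": "h3-intf-2", "ips": ["10.0.2.254/24"], "vlan-untagged": 20}]},
   }


   def interface_signature(port_cfg):
       """Collect the settings that must be identical on both ToRs.

       The interface name is deliberately left out because it may differ.
       """
       return {
           (
               tuple(sorted(intf.get("ips", []))),
               intf.get("vlan-untagged"),
               tuple(sorted(intf.get("vlan-tagged", []))),
           )
           for intf in port_cfg["interfaces"]
       }


   def ports_match(ports, port_a, port_b):
       """True if both ports carry the same subnets and VLAN settings."""
       return interface_signature(ports[port_a]) == interface_signature(ports[port_b])


   print(ports_match(PORTS, "of:0000000000000205/12", "of:0000000000000206/29"))
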
**Second**, we need to configure the **pair link ports on both ToR switches to
be trunk (vlan-tagged) ports that contain all dual-homed VLANs and subnets**.
This is an extra piece of configuration, the need for which will be removed in
future releases. In the example above, a dual-homed server connects to the ToR
pair on port 12 on ``of:205`` and port 29 on ``of:206``. Assume that the pair
link between the two ToRs is connected to port 5 of both ``of:205`` and
``of:206``. The config for these switch ports is shown below:

.. code-block:: json

   {
     "ports": {
       "of:0000000000000205/5" : {
         "interfaces" : [{
           "name" : "205-pair-port",
           "ips" : [ "10.0.2.254/24"],
           "vlan-tagged": [20]
         }]
       },
       "of:0000000000000206/5" : {
         "interfaces" : [{
           "name" : "206-pair-port",
           "ips" : [ "10.0.2.254/24"],
           "vlan-tagged": [20]
         }]
       }
     }
   }

.. note::
   Even though the ports ``of:205/12`` and ``of:206/29`` facing the dual-homed
   server are configured as ``vlan-untagged``, the same VLAN MUST be
   configured as ``vlan-tagged`` on the pair ports.

   If additional subnets and VLANs are configured facing other dual-homed
   servers, they need to be similarly added to the ``ips`` and ``vlan-tagged``
   arrays in the pair port config.


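Because the pair port has to carry every dual-homed VLAN and subnet, its
interface entry can be derived from the server-facing ports rather than
maintained by hand. The following is a sketch under stated assumptions:
``pair_port_interface`` is a hypothetical helper, and the second server
(``h4-intf-1`` in VLAN 30 with ``10.0.3.254/24``) is invented here purely to
show how additional VLANs accumulate.

.. code-block:: python

   def pair_port_interface(name, server_ports):
       """Build the trunk interface entry for a pair port.

       Every IP and VLAN seen on a server-facing port is added to the
       ``ips`` and ``vlan-tagged`` arrays of the pair port.
       """
       ips, vlans = set(), set()
       for port_cfg in server_ports:
           for intf in port_cfg["interfaces"]:
               ips.update(intf.get("ips", []))
               if "vlan-untagged" in intf:
                   vlans.add(intf["vlan-untagged"])
               vlans.update(intf.get("vlan-tagged", []))
       return {"name": name, "ips": sorted(ips), "vlan-tagged": sorted(vlans)}


   # Port 12 on of:205 from the example above, plus a second, hypothetical
   # dual-homed server that is not part of the original example.
   server_ports_205 = [
       {"interfaces": [
           {"name": "h3-intf-1", "ips": ["10.0.2.254/24"], "vlan-untagged": 20}]},
       {"interfaces": [
           {"name": "h4-intf-1", "ips": ["10.0.3.254/24"], "vlan-untagged": 30}]},
   ]

   print(pair_port_interface("205-pair-port", server_ports_205))

With only the first server in the list, the result matches the
``205-pair-port`` entry shown above.
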
Configure Servers
^^^^^^^^^^^^^^^^^

Assume the interfaces we are going to use for bonding are ``eth1`` and ``eth2``.

- Bring down interfaces

  .. code-block:: console

     $ sudo ifdown eth1
     $ sudo ifdown eth2

- Modify ``/etc/network/interfaces``

  .. code-block:: text

     auto bond0
     iface bond0 inet dhcp
         bond-mode balance-xor
         bond-xmit_hash_policy layer2+3
         bond-slaves none

     auto eth1
     iface eth1 inet manual
         bond-master bond0

     auto eth2
     iface eth2 inet manual
         bond-master bond0


- Start interfaces

  .. code-block:: console

     $ sudo ifup bond0
     $ sudo ifup eth1
     $ sudo ifup eth2

- Useful command to check bonding status

  .. code-block:: console

     # cat /proc/net/bonding/bond0
     Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

     Bonding Mode: load balancing (xor)
     Transmit Hash Policy: layer2+3 (2)
     MII Status: up
     MII Polling Interval (ms): 0
     Up Delay (ms): 0
     Down Delay (ms): 0

     Slave Interface: eth1
     MII Status: up
     Speed: 1000 Mbps
     Duplex: full
     Link Failure Count: 0
     Permanent HW addr: 00:1c:42:5b:07:6a
     Slave queue ID: 0

     Slave Interface: eth2
     MII Status: up
     Speed: Unknown
     Duplex: Unknown
     Link Failure Count: 0
     Permanent HW addr: 00:1c:42:1c:a1:7c
     Slave queue ID: 0

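The bonding status output can also be checked by a script, for example from a
provisioning step. The sketch below parses text in the
``/proc/net/bonding/bond0`` format shown above and confirms the bonding mode
and the link state of each slave. ``parse_bonding_status`` is a hypothetical
helper, and the sample text is an abbreviated copy of the output above rather
than a live reading.

.. code-block:: python

   def parse_bonding_status(text):
       """Split bonding status text into bond-level and per-slave fields."""
       bond, slaves, current = {}, [], None
       for line in text.splitlines():
           key, sep, value = line.partition(":")
           if not sep:
               continue  # blank line between sections
           key, value = key.strip(), value.strip()
           if key == "Slave Interface":
               # Every following field belongs to this slave.
               current = {"name": value}
               slaves.append(current)
           elif current is None:
               bond[key] = value
           else:
               current[key] = value
       return bond, slaves


   # Abbreviated sample in the same format as the output above.
   SAMPLE = """\
   Bonding Mode: load balancing (xor)
   Transmit Hash Policy: layer2+3 (2)
   MII Status: up

   Slave Interface: eth1
   MII Status: up
   Link Failure Count: 0

   Slave Interface: eth2
   MII Status: up
   Link Failure Count: 0
   """

   bond, slaves = parse_bonding_status(SAMPLE)
   healthy = (
       bond["Bonding Mode"] == "load balancing (xor)"
       and len(slaves) == 2
       and all(s["MII Status"] == "up" for s in slaves)
   )
   print("bond healthy:", healthy)

On a server, the sample text would be replaced by the contents of
``/proc/net/bonding/bond0``.
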
.. caution::
   **Dual-homed hosts should not be statically configured.**

   Currently in ONOS, configured hosts are not updated when the connectPoint
   is lost. This is not a problem with single-homed hosts because there is no
   other way to reach them anyway if their connectPoint goes down. But in
   dual-homed scenarios, the controller should take corrective action if one
   of the connectPoints goes down. The trigger for this event does not happen
   when the dual-homed host's connect points are configured (not discovered).

.. note::
   We also support static routes with a dual-homed next hop. The way to
   configure it is exactly the same as for a regular single-homed next hop, as
   described in :doc:`External Connectivity <external-connectivity>`.

   ONOS will automatically recognize when the next-hop IP resolves to a
   dual-homed host and program both switches (that the host connects to)
   accordingly.

   The failure recovery mechanism for dual-homed hosts also applies to static
   routes that point to the host as their next hop.

Dual External Routers
---------------------

.. image:: ../images/config-dh-vr.png

.. image:: ../images/config-dh-vr-logical.png
   :width: 200px

In addition to what we describe in :doc:`External Connectivity
<external-connectivity>`, Trellis also supports dual external routers, which
view the Trellis fabric as 2 individual routers, as shown above.

As before, the vRouter control plane is implemented as a combination of Quagga,
which peers with the upstream routers, and ONOS, which listens to Quagga (via
FPM) and programs the underlying fabric. **In dual-router scenarios, two
instances of Quagga are required**.

As before, the hardware fabric serves as the data plane of vRouter. In
dual-router scenarios, the **external routers MUST be connected to
paired ToRs**.

ToR connects to one upstream
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Let's consider the simpler case where the external routers are each connected
to a single leaf in a ToR pair. The figure on the left below shows the logical
view. The figure on the right shows the physical connectivity.

.. image:: ../images/config-dh-vr-logical-simple.png
   :width: 200px

.. image:: ../images/config-dh-vr-physical-simple.png
   :width: 400px

One of the upstream routers is connected to ``of:205`` and the other is
connected to ``of:206``. Note that ``of:205`` and ``of:206`` are paired ToRs.

The ToRs are connected via a physical port to separate Quagga VMs or
containers. These Quagga instances can be placed in any compute node. They do
not need to be in the same server, and are only shown to be co-located for
simplicity.

The two Quagga instances do NOT talk to each other.

Switch port configuration
"""""""""""""""""""""""""

The ToRs follow the same rules as the single-router case described in
:doc:`External Connectivity <external-connectivity>`. In the example shown
above, the switch port config would look like this:

.. code-block:: json

   {
     "ports": {
       "of:0000000000000205/1" : {
         "interfaces" : [{
           "ips" : [ "10.0.100.3/29", "2000::6403/125" ],
           "vlan-untagged": 100,
           "name" : "internet-router-1"
         }]
       },

       "of:0000000000000205/48" : {
         "interfaces" : [{
           "ips" : [ "10.0.100.3/29", "2000::6403/125" ],
           "vlan-untagged": 100,
           "name" : "quagga-1"
         }]
       },

       "of:0000000000000206/1" : {
         "interfaces" : [{
           "ips" : [ "10.0.200.3/29", "2000::6503/125" ],
           "vlan-untagged": 200,
           "name" : "internet-router-2"
         }]
       },

       "of:0000000000000206/48" : {
         "interfaces" : [{
           "ips" : [ "10.0.200.3/29", "2000::6503/125" ],
           "vlan-untagged": 200,
           "name" : "quagga2"
         }]
       }
     }
   }

.. note::
   In the example shown above, switch ``of:205`` uses ``VLAN 100`` for
   bridging the peering session between Quagga1 and ExtRouter1, while switch
   ``of:206`` uses ``VLAN 200`` to do the same for the other peering session.
   But since these VLANs and bridging domains are defined on different
   switches, the VLAN IDs could have been the same.

   This philosophy is consistent with the fabric's use of :doc:`bridging
   <bridging-unicast>`.


Quagga configuration
""""""""""""""""""""

Configuring Quagga for dual external routers is similar to what we described
in :doc:`External Connectivity <external-connectivity>`. However, it is worth
noting that:

- The two Zebra instances **should point to two different ONOS instances** for
  their FPM connections. For example, Zebra in Quagga1 could point to an ONOS
  instance with ``fpm connection ip 10.6.0.1 port 2620``, while the other
  Zebra should point to a different ONOS instance with
  ``fpm connection ip 10.6.0.2 port 2620``. It does not matter which ONOS
  instances they point to as long as they are different.

- The two Quagga BGP sessions should appear to come from different routers but
  still use the same AS number, i.e. the two Quagga instances belong to the
  same AS, the one used to represent the entire Trellis infrastructure.

- The two upstream routers can belong to the same or different AS, but these AS
  numbers should be different from the one used to represent the Trellis AS.

- Typically both Quagga instances advertise the same routes to the upstream.
  These prefixes, belonging to various infrastructure nodes in the deployment,
  should be reachable from either of the leaf switches connected to the
  upstream routers.

- The upstream routers may or may not advertise the same routes. Trellis will
  ensure that traffic directed to a route reachable via only one upstream
  router is directed to the appropriate leaf.

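The first three rules in the list above lend themselves to a quick sanity
check. The sketch below is illustrative only: the dictionary layout, the
``router_id`` field, and all AS numbers and router IDs are assumptions made
for this example and do not come from a real Quagga or ONOS data model. Only
the two FPM addresses are taken from the text above.

.. code-block:: python

   def check_quagga_pair(instances, trellis_as, upstream_as_numbers):
       """Return a list of violations of the dual-router Quagga rules."""
       problems = []
       fpm_targets = [inst["fpm_ip"] for inst in instances]
       if len(set(fpm_targets)) != len(fpm_targets):
           problems.append("Zebra instances point to the same ONOS instance")
       if len({inst["router_id"] for inst in instances}) != len(instances):
           problems.append("Quagga instances must appear as different routers")
       if any(inst["local_as"] != trellis_as for inst in instances):
           problems.append("Quagga instances must all use the Trellis AS")
       if trellis_as in upstream_as_numbers:
           problems.append("upstream AS must differ from the Trellis AS")
       return problems


   # Hypothetical values: private AS numbers and made-up router IDs.
   TRELLIS_AS = 65001
   QUAGGAS = [
       {"name": "quagga1", "router_id": "10.0.100.1",
        "local_as": 65001, "fpm_ip": "10.6.0.1"},
       {"name": "quagga2", "router_id": "10.0.200.1",
        "local_as": 65001, "fpm_ip": "10.6.0.2"},
   ]
   UPSTREAM_AS = [65002, 65003]

   print(check_quagga_pair(QUAGGAS, TRELLIS_AS, UPSTREAM_AS) or "rules satisfied")
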
ToR connects to both upstream
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Now let's consider the **more complicated but more fault-tolerant** case of
each Quagga instance peering with BOTH external routers. Again the logical
view is shown on the left and the physical view on the right.

.. image:: ../images/config-dh-vr-logical.png
   :width: 200px

.. image:: ../images/config-dh-vr-physical.png
   :width: 500px

First, let's talk about the physical connectivity:

- Quagga instance 1 peers with external router R1 via port 1 on switch ``of:205``
- Quagga instance 1 peers with external router R2 via port 2 on switch ``of:205``

Similarly:

- Quagga instance 2 peers with external router R1 via port 2 on switch ``of:206``
- Quagga instance 2 peers with external router R2 via port 1 on switch ``of:206``

To distinguish between the two peering sessions in the same physical switch,
say ``of:205``, the physical ports 1 and 2 need to be configured in
**different VLANs and subnets**. For example, port 1 on ``of:205`` is
(untagged) in VLAN 100, while port 2 is in VLAN 101.

Note that peering for **Quagga1 and R1** happens with IPs in the
``10.0.100.0/29`` subnet, and for **Quagga1 and R2** in the ``10.0.101.0/29``
subnet.

Furthermore, the **port facing Quagga1** (port 48) on ``of:205`` carries both
peering sessions to Quagga1. Thus port 48 should now be configured as a
**trunk port (vlan-tagged) with both VLANs and both subnets**.

Finally, the **Quagga interface** on the VM now needs **sub-interface
configuration for each VLAN ID**.

Similar configuration concepts apply to IPv6 as well. Here is a look at the
switch port config in ONOS for ``of:205``:

.. code-block:: json

   {
     "ports": {
       "of:0000000000000205/1" : {
         "interfaces" : [{
           "ips" : [ "10.0.100.3/29", "2000::6403/125" ],
           "vlan-untagged": 100,
           "name" : "internet-router1"
         }]
       },
       "of:0000000000000205/2" : {
         "interfaces" : [{
           "ips" : [ "10.0.101.3/29", "2000::7403/125" ],
           "vlan-untagged": 101,
           "name" : "internet-router2"
         }]
       },
       "of:0000000000000205/48" : {
         "interfaces" : [{
           "ips" : [ "10.0.100.3/29", "2000::6403/125", "10.0.101.3/29", "2000::7403/125" ],
           "vlan-tagged": [100, 101],
           "name" : "quagga1"
         }]
       }
     }
   }
487 }