Dual Homing
===========

Overview
--------

.. image:: ../images/config-dh.png

The dual-homing feature includes several sub-components:

- **Use of "paired" ToRs**: Each rack of compute nodes has exactly two Top-of-Rack switches (ToRs)
  that are linked to each other via a single link. Such a link is referred to as a **pair link**.
  This pairing should NOT be omitted.
  Currently there is support for only a single link between paired ToRs.
  In future releases, we may include dual pair links.
  Note that the pair link is only used in failure scenarios, and not in normal operation.

- **Dual-homed servers (compute-nodes)**: Each server is connected to both ToRs.
  The links to the paired ToRs are (Linux) bonded.

- **Dual-homed upstream routers**: The upstream routers MUST be connected to the two ToRs that are part of a leaf-pair.
  You cannot connect them to leafs that are not paired. This feature also requires two Quagga instances.

- **Dual-homed access devices**: This component will be added in the future.

Paired ToRs
-----------
The reasoning behind two ToR (leaf) switches is simple.
If you only have a single ToR switch, and you lose it, the entire rack goes down.
Using two ToR switches increases your odds of continued connectivity for dual-homed servers.
The reasoning behind pairing the two ToR switches is more involved, as is explained in the Usage of pair link section below.

Configure pair ToRs
^^^^^^^^^^^^^^^^^^^
Configuring paired ToRs involves device configuration. Assume switches of:205 and of:206 are paired ToRs.

.. code-block:: json

   {
       "devices" : {
           "of:0000000000000205" : {
               "segmentrouting" : {
                   "name" : "Leaf1-R2",
                   "ipv4NodeSid" : 205,
                   "ipv4Loopback" : "192.168.0.205",
                   "ipv6NodeSid" : 205,
                   "ipv6Loopback" : "2000::c0a8:0205",
                   "routerMac" : "00:00:02:05:00:01",
                   "pairDeviceId" : "of:0000000000000206",
                   "pairLocalPort" : 20,
                   "isEdgeRouter" : true,
                   "adjacencySids" : []
               }
           },
           "of:0000000000000206" : {
               "segmentrouting" : {
                   "name" : "Leaf2-R2",
                   "ipv4NodeSid" : 206,
                   "ipv4Loopback" : "192.168.0.206",
                   "ipv6NodeSid" : 206,
                   "ipv6Loopback" : "2000::c0a8:0206",
                   "routerMac" : "00:00:02:05:00:01",
                   "pairDeviceId" : "of:0000000000000205",
                   "pairLocalPort" : 30,
                   "isEdgeRouter" : true,
                   "adjacencySids" : []
               }
           }
       }
   }

There are two new pieces of device configuration.

- Each device in the ToR pair needs to specify the **deviceId of the leaf it is paired to** in the ``pairDeviceId`` field.
  For example, in the of:205 configuration the ``pairDeviceId`` is specified as of:206, and similarly in the of:206 configuration the ``pairDeviceId`` is of:205.
- Each device in the ToR pair needs to specify the **port on the device used for the pair link** in the ``pairLocalPort`` field.
  For example, the config above shows that port 20 on of:205 is connected to port 30 on of:206.

In addition, there is one crucial piece of config that needs to **match for both ToRs**: the ``routerMac`` address.
The paired ToRs MUST have the same ``routerMac``. In the example above, both have the ``routerMac`` 00:00:02:05:00:01.

All other fields are the same as before, as explained in the :doc:`Device Configuration <device-config>` section.
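
The pairing rules above lend themselves to a quick sanity check before the configuration is pushed.
The following is a minimal sketch in Python, assuming the network config has already been loaded into a dictionary.
``check_pairing`` is a hypothetical helper written for this illustration and is not part of ONOS or Trellis;
the sample data is the example above, trimmed to the fields relevant to pairing.

.. code-block:: python

   def check_pairing(netcfg):
       """Return a list of problems found in the paired-ToR device config."""
       problems = []
       devices = netcfg.get("devices", {})
       for dev_id, dev in devices.items():
           sr = dev.get("segmentrouting", {})
           pair_id = sr.get("pairDeviceId")
           if pair_id is None:
               continue  # this device is not part of a ToR pair
           if "pairLocalPort" not in sr:
               problems.append(f"{dev_id}: pairLocalPort is missing")
           peer = devices.get(pair_id, {}).get("segmentrouting")
           if peer is None:
               problems.append(f"{dev_id}: pair device {pair_id} is not configured")
               continue
           # Each ToR must name the other one as its pair device.
           if peer.get("pairDeviceId") != dev_id:
               problems.append(f"{dev_id}: {pair_id} does not point back to it")
           # Both ToRs in a pair must share the same routerMac.
           if peer.get("routerMac") != sr.get("routerMac"):
               problems.append(f"{dev_id}: routerMac differs from {pair_id}")
       return problems


   # The example above, trimmed to the pairing-related fields.
   EXAMPLE = {
       "devices": {
           "of:0000000000000205": {"segmentrouting": {
               "routerMac": "00:00:02:05:00:01",
               "pairDeviceId": "of:0000000000000206",
               "pairLocalPort": 20,
           }},
           "of:0000000000000206": {"segmentrouting": {
               "routerMac": "00:00:02:05:00:01",
               "pairDeviceId": "of:0000000000000205",
               "pairLocalPort": 30,
           }},
       }
   }

   print(check_pairing(EXAMPLE))

An empty list means that both devices reference each other and share a ``routerMac``.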


Usage of pair link
^^^^^^^^^^^^^^^^^^

.. image:: ../images/config-dh-pair-link.png


Dual-Homed Servers
------------------
There are a number of things to note when connecting dual-homed servers to paired ToRs.

- When connecting a dual-homed server to the two ToRs, the switch ports on both ToRs have to be configured the same way.
- The server ports have to be Linux-bonded in a particular mode.

Configure Switch Ports
^^^^^^^^^^^^^^^^^^^^^^
Ports are configured in much the same way as described in :doc:`Bridging and Unicast <bridging-unicast>`.
However, there are a couple of things to note.

**First**, dual-homed servers should have **identical configuration on each switch port they connect to on the ToR pair**.
The example below shows that the ``vlans`` and ``ips`` configured are the same on both switch ports ``of:205/12`` and ``of:206/29``.
They are both configured to be access ports in ``VLAN 20``, the subnet ``10.0.2.0/24`` is assigned to these ports, and the gateway-IP is ``10.0.2.254/32``.

.. code-block:: json

   {
       "ports" : {
           "of:0000000000000205/12" : {
               "interfaces" : [{
                   "name" : "h3-intf-1",
                   "ips" : [ "10.0.2.254/24"],
                   "vlan-untagged": 20
               }]
           },
           "of:0000000000000206/29" : {
               "interfaces" : [{
                   "name" : "h3-intf-2",
                   "ips" : [ "10.0.2.254/24"],
                   "vlan-untagged": 20
               }]
           }
       }
   }

It is worth noting the meaning behind the configuration above from a routing perspective.
Simply put, by configuring the same subnets on these switch ports, the fabric now believes that the entire subnet ``10.0.2.0/24`` is reachable by BOTH ToR switches ``of:205`` and ``of:206``.

.. caution::
   Configuring different VLANs, or different subnets, or mismatches like ``vlan-untagged`` on one switch port and ``vlan-tagged`` on the corresponding switch port facing the dual-homed server, will result in incorrect behavior.
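
The "identical configuration" requirement can be checked by comparing the two server-facing ports field by field.
The sketch below is one possible way to do that in Python; ``interface_signature`` and ``same_dual_homed_config`` are hypothetical helper names,
and the comparison deliberately ignores the interface ``name``, which differs between the two ports in the example.

.. code-block:: python

   def interface_signature(port_cfg):
       """Reduce a port's interfaces to the fields that must match on both ToRs."""
       signature = []
       for intf in port_cfg.get("interfaces", []):
           signature.append((
               tuple(sorted(intf.get("ips", []))),
               intf.get("vlan-untagged"),
               tuple(sorted(intf.get("vlan-tagged", []))),
           ))
       # Sort by repr so that entries containing None still order consistently.
       return sorted(signature, key=repr)


   def same_dual_homed_config(ports, port_a, port_b):
       """True if both switch ports carry the same IPs and VLAN settings."""
       return interface_signature(ports[port_a]) == interface_signature(ports[port_b])


   PORTS = {
       "of:0000000000000205/12": {"interfaces": [
           {"name": "h3-intf-1", "ips": ["10.0.2.254/24"], "vlan-untagged": 20},
       ]},
       "of:0000000000000206/29": {"interfaces": [
           {"name": "h3-intf-2", "ips": ["10.0.2.254/24"], "vlan-untagged": 20},
       ]},
   }

   print(same_dual_homed_config(PORTS, "of:0000000000000205/12", "of:0000000000000206/29"))

A ``False`` result points at exactly the kind of mismatch the caution above warns about, such as ``vlan-untagged`` on one port and ``vlan-tagged`` on the other.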

**Second**, we need to configure the **pair link ports on both ToR switches to be trunk (vlan-tagged) ports that contain all dual-homed VLANs and subnets**.
This is an extra piece of configuration, the need for which will be removed in future releases.
In the example above, a dual-homed server connects to the ToR pair on port 12 on of:205 and port 29 on of:206.
Assume that the pair link between the two ToRs is connected to port 5 of both of:205 and of:206.
The config for these switch ports is shown below:

.. code-block:: json

   {
       "ports": {
           "of:0000000000000205/5" : {
               "interfaces" : [{
                   "name" : "205-pair-port",
                   "ips" : [ "10.0.2.254/24"],
                   "vlan-tagged": [20]
               }]
           },
           "of:0000000000000206/5" : {
               "interfaces" : [{
                   "name" : "206-pair-port",
                   "ips" : [ "10.0.2.254/24"],
                   "vlan-tagged": [20]
               }]
           }
       }
   }

.. note::
   Even though the ports ``of:205/12`` and ``of:206/`` facing the dual-homed server are configured as ``vlan-untagged``,
   the same VLAN MUST be configured as ``vlan-tagged`` on the pair ports.
   If additional subnets and VLANs are configured facing other dual-homed servers, they need to be similarly added to the ``ips`` and ``vlan-tagged`` arrays in the pair port config.
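
Because the pair port has to be kept in sync by hand, it can help to compute what is missing from it.
The following Python sketch assumes the ``ports`` section is available as a dictionary; ``missing_on_pair_port`` is a hypothetical helper,
not an ONOS API. It collects every VLAN and IP used on the server-facing ports of one ToR and subtracts what the pair port already carries.

.. code-block:: python

   def missing_on_pair_port(ports, server_ports, pair_port):
       """Return (vlans, ips) used on server-facing ports but absent from the pair port."""
       needed_vlans, needed_ips = set(), set()
       for port in server_ports:
           for intf in ports[port]["interfaces"]:
               if "vlan-untagged" in intf:
                   needed_vlans.add(intf["vlan-untagged"])
               needed_vlans.update(intf.get("vlan-tagged", []))
               needed_ips.update(intf.get("ips", []))

       have_vlans, have_ips = set(), set()
       for intf in ports[pair_port]["interfaces"]:
           # Only tagged VLANs count: the pair port must be a trunk port.
           have_vlans.update(intf.get("vlan-tagged", []))
           have_ips.update(intf.get("ips", []))

       return needed_vlans - have_vlans, needed_ips - have_ips


   PORTS = {
       "of:0000000000000205/12": {"interfaces": [
           {"name": "h3-intf-1", "ips": ["10.0.2.254/24"], "vlan-untagged": 20},
       ]},
       "of:0000000000000205/5": {"interfaces": [
           {"name": "205-pair-port", "ips": ["10.0.2.254/24"], "vlan-tagged": [20]},
       ]},
   }

   print(missing_on_pair_port(PORTS, ["of:0000000000000205/12"], "of:0000000000000205/5"))

Two empty sets mean the pair port already covers every dual-homed VLAN and subnet on that switch.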


Configure Servers
^^^^^^^^^^^^^^^^^
Assume the interfaces used for bonding are ``eth1`` and ``eth2``.

- Bring down the interfaces

  .. code-block:: console

     $ sudo ifdown eth1
     $ sudo ifdown eth2

- Modify ``/etc/network/interfaces``

  .. code-block:: text

     auto bond0
     iface bond0 inet dhcp
         bond-mode balance-xor
         bond-xmit_hash_policy layer2+3
         bond-slaves none

     auto eth1
     iface eth1 inet manual
         bond-master bond0

     auto eth2
     iface eth2 inet manual
         bond-master bond0

- Start the interfaces

  .. code-block:: console

     $ sudo ifup bond0
     $ sudo ifup eth1
     $ sudo ifup eth2

- Check the bonding status

  .. code-block:: console

     # cat /proc/net/bonding/bond0
     Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

     Bonding Mode: load balancing (xor)
     Transmit Hash Policy: layer2+3 (2)
     MII Status: up
     MII Polling Interval (ms): 0
     Up Delay (ms): 0
     Down Delay (ms): 0

     Slave Interface: eth1
     MII Status: up
     Speed: 1000 Mbps
     Duplex: full
     Link Failure Count: 0
     Permanent HW addr: 00:1c:42:5b:07:6a
     Slave queue ID: 0

     Slave Interface: eth2
     MII Status: up
     Speed: Unknown
     Duplex: Unknown
     Link Failure Count: 0
     Permanent HW addr: 00:1c:42:1c:a1:7c
     Slave queue ID: 0
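
When many servers need to be checked, the bonding status can be read programmatically instead of by eye.
The sketch below is a small Python parser for text in the ``Key: value`` layout shown above;
``parse_bonding_status`` is a hypothetical helper, and the sample string is an abbreviated copy of the output above rather than a live read of ``/proc/net/bonding/bond0``.

.. code-block:: python

   def parse_bonding_status(text):
       """Split bonding status text into bond-level fields and a list of slaves."""
       bond, slaves, current = {}, [], None
       for line in text.splitlines():
           key, sep, value = line.partition(":")
           if not sep:
               continue  # blank line or a line without a "Key: value" pair
           key, value = key.strip(), value.strip()
           if key == "Slave Interface":
               # Every following field belongs to this slave until the next one.
               current = {"name": value}
               slaves.append(current)
           elif current is None:
               bond[key] = value
           else:
               current[key] = value
       return bond, slaves


   SAMPLE = """\
   Bonding Mode: load balancing (xor)
   Transmit Hash Policy: layer2+3 (2)
   MII Status: up

   Slave Interface: eth1
   MII Status: up
   Permanent HW addr: 00:1c:42:5b:07:6a

   Slave Interface: eth2
   MII Status: up
   Permanent HW addr: 00:1c:42:1c:a1:7c
   """

   bond, slaves = parse_bonding_status(SAMPLE)
   print(bond["Bonding Mode"], [s["name"] for s in slaves])

For a dual-homed server, the result should show the ``xor`` bonding mode and two slaves whose ``MII Status`` is ``up``.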

.. caution::
   **Dual-homed hosts should not be statically configured.**

   Currently in ONOS, configured hosts are not updated when the connectPoint is lost.
   This is not a problem with single-homed hosts because there is no other way to reach them anyway if their connectPoint goes down.
   But in dual-homed scenarios, the controller should take corrective action if one of the connectPoints goes down.
   The trigger for this event does not happen when the dual-homed host's connect points are configured (not discovered).

.. note::
   We also support static routes with a dual-homed next hop.
   They are configured in exactly the same way as a regular single-homed next hop, as described in :doc:`External Connectivity <external-connectivity>`.

   ONOS will automatically recognize when the next-hop IP resolves to a dual-homed host and program both switches the host connects to accordingly.

   The failure recovery mechanism for dual-homed hosts also applies to static routes that point to the host as their next hop.


Dual External Routers
---------------------

.. image:: ../images/config-dh-vr.png

.. image:: ../images/config-dh-vr-logical.png
   :width: 200px

In addition to what we describe in :doc:`External Connectivity <external-connectivity>`,
Trellis also supports dual external routers, which view the Trellis fabric as 2 individual routers, as shown above.

As before, the vRouter control plane is implemented as a combination of Quagga,
which peers with the upstream routers, and ONOS, which listens to Quagga (via FPM) and programs the underlying fabric.
**In dual-router scenarios, two instances of Quagga are required**.

As before, the hardware fabric serves as the data plane of vRouter.
In dual-router scenarios, the **external routers MUST be connected to paired ToRs**.


ToR connects to one upstream
^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Let's consider the simpler case where the external routers are each connected to a single leaf in a ToR pair.
The figure on the left below shows the logical view. The figure on the right shows the physical connectivity.

.. image:: ../images/config-dh-vr-logical-simple.png
   :width: 200px

.. image:: ../images/config-dh-vr-physical-simple.png
   :width: 400px

One of the upstream routers is connected to ``of:205`` and the other is connected to ``of:206``.
Note that ``of:205`` and ``of:206`` are paired ToRs.

The ToRs are connected via a physical port to separate Quagga VMs or containers.
These Quagga instances can be placed in any compute node. They do not need to be in the same server, and are only shown to be co-located for simplicity.

The two Quagga instances do NOT talk to each other.


Switch port configuration
"""""""""""""""""""""""""
The ToRs follow the same rules as the single-router case described in :doc:`External Connectivity <external-connectivity>`.
In the example shown above, the switch port config would look like this:

.. code-block:: json

   {
       "ports": {
           "of:0000000000000205/1" : {
               "interfaces" : [{
                   "ips" : [ "10.0.100.3/29", "2000::6403/125" ],
                   "vlan-untagged": 100,
                   "name" : "internet-router-1"
               }]
           },

           "of:0000000000000205/48" : {
               "interfaces" : [{
                   "ips" : [ "10.0.100.3/29", "2000::6403/125" ],
                   "vlan-untagged": 100,
                   "name" : "quagga-1"
               }]
           },

           "of:0000000000000206/1" : {
               "interfaces" : [{
                   "ips" : [ "10.0.200.3/29", "2000::6503/125" ],
                   "vlan-untagged": 200,
                   "name" : "internet-router-2"
               }]
           },

           "of:0000000000000206/48" : {
               "interfaces" : [{
                   "ips" : [ "10.0.200.3/29", "2000::6503/125" ],
                   "vlan-untagged": 200,
                   "name" : "quagga2"
               }]
           }
       }
   }

.. note::
   In the example shown above, switch ``of:205`` uses ``VLAN 100`` for bridging the peering session between Quagga1 and ExtRouter1,
   while switch ``of:206`` uses ``VLAN 200`` to do the same for the other peering session.
   But since these VLANs and bridging domains are defined on different switches, the VLAN IDs could have been the same.

   This philosophy is consistent with the fabric use of :doc:`bridging <bridging-unicast>`.


Quagga configuration
""""""""""""""""""""
Configuring Quagga for dual external routers is similar to what we described in :doc:`External Connectivity <external-connectivity>`. However, it is worth noting that:

- The two Zebra instances **should point to two different ONOS instances** for their FPM connections.
  For example, Zebra in Quagga1 could point to an ONOS instance with ``fpm connection ip 10.6.0.1 port 2620``,
  while the other Zebra should point to a different ONOS instance with ``fpm connection ip 10.6.0.2 port 2620``.
  It does not matter which ONOS instances they point to as long as they are different.
- The two Quagga BGP sessions should appear to come from different routers but still use the same AS number.
  That is, the two Quagga instances belong to the same AS, the one used to represent the entire Trellis infrastructure.
- The two upstream routers can belong to the same or different AS,
  but these AS numbers should be different from the one used to represent the Trellis AS.
- Typically both Quagga instances advertise the same routes to the upstream.
  These prefixes, belonging to various infrastructure nodes in the deployment, should be reachable from either of the leaf switches connected to the upstream routers.
- The upstream routers may or may not advertise the same routes.
  Trellis will ensure that traffic directed to a route reachable via only one upstream router is directed to the appropriate leaf.


ToR connects to both upstream
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Now let's consider the **more complicated but more fault-tolerant** case of each Quagga instance peering with BOTH external routers.
Again, the logical view is shown on the left and the physical view on the right.

.. image:: ../images/config-dh-vr-logical.png
   :width: 200px

.. image:: ../images/config-dh-vr-physical.png
   :width: 500px

First, let's look at the physical connectivity:

- Quagga instance 1 peers with external router R1 via port 1 on switch of:205
- Quagga instance 1 peers with external router R2 via port 2 on switch of:205

Similarly:

- Quagga instance 2 peers with external router R1 via port 2 on switch of:206
- Quagga instance 2 peers with external router R2 via port 1 on switch of:206

To distinguish between the two peering sessions in the same physical switch, say of:205,
the physical ports 1 and 2 need to be configured in **different VLANs and subnets**.
For example, port 1 on of:205 is (untagged) in VLAN 100, while port 2 is in VLAN 101.

Note that peering for **Quagga1 and R1** happens with IPs in the ``10.0.100.0/29`` subnet,
and for **Quagga1 and R2** in the ``10.0.101.0/29`` subnet.

Furthermore, the **port facing Quagga1** (port 48) on of:205 carries both peering sessions to Quagga1.
Thus port 48 should now be configured as a **trunk port (vlan-tagged) with both VLANs and both subnets**.

Finally, the **Quagga interface** on the VM now needs **sub-interface configuration for each VLAN ID**.

Similar configuration concepts apply to IPv6 as well. Here is a look at the switch port config in ONOS for of:205:

.. code-block:: json

   {
       "ports": {
           "of:0000000000000205/1" : {
               "interfaces" : [{
                   "ips" : [ "10.0.100.3/29", "2000::6403/125" ],
                   "vlan-untagged": 100,
                   "name" : "internet-router1"
               }]
           },
           "of:0000000000000205/2" : {
               "interfaces" : [{
                   "ips" : [ "10.0.101.3/29", "2000::7403/125" ],
                   "vlan-untagged": 101,
                   "name" : "internet-router2"
               }]
           },
           "of:0000000000000205/48" : {
               "interfaces" : [{
                   "ips" : [ "10.0.100.3/29", "2000::6403/125", "10.0.101.3/29", "2000::7403/125" ],
                   "vlan-tagged": [100, 101],
                   "name" : "quagga1"
               }]
           }
       }
   }
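
The entry for port 48 in the config above is simply the union of the VLANs and IPs of the two router-facing ports.
As an illustration, the following Python sketch derives that trunk entry from the access ports; ``trunk_interface`` is a hypothetical helper
written for this example and is not provided by ONOS.

.. code-block:: python

   def trunk_interface(ports, access_ports, name):
       """Build a vlan-tagged interface carrying every VLAN and IP of the given access ports."""
       ips, vlans = [], []
       for port in access_ports:
           for intf in ports[port]["interfaces"]:
               for ip in intf.get("ips", []):
                   if ip not in ips:
                       ips.append(ip)
               vlan = intf.get("vlan-untagged")
               if vlan is not None and vlan not in vlans:
                   vlans.append(vlan)
       return {"ips": ips, "vlan-tagged": vlans, "name": name}


   PORTS = {
       "of:0000000000000205/1": {"interfaces": [{
           "ips": ["10.0.100.3/29", "2000::6403/125"],
           "vlan-untagged": 100,
           "name": "internet-router1",
       }]},
       "of:0000000000000205/2": {"interfaces": [{
           "ips": ["10.0.101.3/29", "2000::7403/125"],
           "vlan-untagged": 101,
           "name": "internet-router2",
       }]},
   }

   print(trunk_interface(
       PORTS, ["of:0000000000000205/1", "of:0000000000000205/2"], "quagga1"))

Generating the trunk entry this way keeps it consistent with the router-facing ports if another peering VLAN is added later.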
425 }