.. SPDX-License-Identifier: GPL-2.0

============
NET_FAILOVER
============

Overview
========

The net_failover driver provides an automated failover mechanism via APIs
to create and destroy a failover master netdev, and manages the primary and
standby slave netdevs that get registered via the generic failover
infrastructure.

The failover netdev acts as a master device and controls two slave devices. The
original paravirtual interface is registered as the 'standby' slave netdev and
a passthru/VF device with the same MAC gets registered as the 'primary' slave
netdev. Both 'standby' and 'failover' netdevs are associated with the same
'pci' device. The user accesses the network interface via the 'failover' netdev.
The 'failover' netdev chooses the 'primary' netdev as the default for transmits
when it is available with link up and running.

This can be used by paravirtual drivers to enable an alternate low-latency
datapath. It also enables hypervisor-controlled live migration of a VM with a
directly attached VF by failing over to the paravirtual datapath when the VF
is unplugged.
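
The transmit-path choice described above can be modelled as a small decision
function. This is only an illustrative shell model of the policy, not driver
code: the function name ``select_xmit_dev`` and the state strings are
hypothetical, and the "none" case (neither slave usable) is an assumption about
what happens when no datapath is available.

```shell
# select_xmit_dev PRIMARY_STATE STANDBY_STATE
# Each argument is "up" when that slave is present with link up and
# running; any other value (e.g. "down", "absent") means it cannot transmit.
# Prints the slave the failover netdev would be expected to transmit on.
select_xmit_dev() {
    if [ "$1" = "up" ]; then
        echo primary
    elif [ "$2" = "up" ]; then
        echo standby
    else
        echo none
    fi
}

select_xmit_dev up up        # VF present: prints "primary"
select_xmit_dev absent up    # VF unplugged: prints "standby"
```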

virtio-net accelerated datapath: STANDBY mode
=============================================

net_failover enables a hypervisor-controlled accelerated datapath to virtio-net
enabled VMs in a transparent manner with no/minimal guest userspace changes.

To support this, the hypervisor needs to enable the VIRTIO_NET_F_STANDBY
feature on the virtio-net interface and assign the same MAC address to both
the virtio-net and VF interfaces.
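
From inside the guest, one way to confirm that the feature was offered and
accepted is to inspect the virtio device's feature bitmap in sysfs. The sketch
below rests on two assumptions that do not come from this document: that the
``features`` attribute is a string of '0'/'1' characters with bit 0 first, and
that VIRTIO_NET_F_STANDBY is bit 62 as defined in the virtio-net UAPI header.
The function name is illustrative only.

```shell
# Bit number of VIRTIO_NET_F_STANDBY (assumed from the virtio-net UAPI header).
VIRTIO_NET_F_STANDBY=62

# has_standby_feature FEATURES
# FEATURES is the content of /sys/bus/virtio/devices/<dev>/features:
# one '0' or '1' character per feature bit, bit 0 first.
# Returns 0 if the STANDBY bit is set, non-zero otherwise.
has_standby_feature() {
    # cut counts characters from 1, so bit N is character N+1.
    bit_char=$(printf '%s' "$1" | cut -c"$((VIRTIO_NET_F_STANDBY + 1))")
    [ "$bit_char" = "1" ]
}
```

A possible use, where ``virtio0`` is a placeholder for the actual device name in
the guest, is
``has_standby_feature "$(cat /sys/bus/virtio/devices/virtio0/features)" && echo standby``.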

Here is an example libvirt XML snippet that shows such a configuration:
::

  <interface type='network'>
    <mac address='52:54:00:00:12:53'/>
    <source network='enp66s0f0_br'/>
    <target dev='tap01'/>
    <model type='virtio'/>
    <driver name='vhost' queues='4'/>
    <link state='down'/>
    <teaming type='persistent'/>
    <alias name='ua-backup0'/>
  </interface>
  <interface type='hostdev' managed='yes'>
    <mac address='52:54:00:00:12:53'/>
    <source>
      <address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
    </source>
    <teaming type='transient' persistent='ua-backup0'/>
  </interface>

In this configuration, the first device definition is for the virtio-net
interface, which acts as the 'persistent' device, indicating that this
interface will always be plugged in. This is specified by the 'teaming' tag with
the required attribute 'type' set to 'persistent'. The link state for the
virtio-net device is set to 'down' to ensure that the 'failover' netdev prefers
the VF passthrough device for normal communication. The virtio-net device will
be brought UP during live migration to allow uninterrupted communication.

The second device definition is for the VF passthrough interface. Here the
'teaming' tag is provided with type 'transient', indicating that this device may
periodically be unplugged. A second attribute, 'persistent', is provided and
points to the alias name declared for the virtio-net device.
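
Because the pairing depends on both definitions carrying the same MAC address,
a quick sanity check on the XML can catch a mismatch before the VM is booted.
The helper below is a minimal sketch with a hypothetical name. It assumes a file
that holds only the two teamed interface definitions, written with single-quoted
attributes and one element per line as in the snippet above; it is not a general
XML parser.

```shell
# same_mac_in_xml FILE
# Succeeds only if FILE contains at least one <mac address='...'/> element
# and all of them carry the same address (compared case-insensitively).
same_mac_in_xml() {
    macs=$(sed -n "s/.*<mac address='\([^']*\)'.*/\1/p" "$1" |
           tr 'A-F' 'a-f' | sort -u)
    [ -n "$macs" ] && [ "$(printf '%s\n' "$macs" | wc -l | tr -d ' ')" -eq 1 ]
}
```

For example, ``same_mac_in_xml teaming.xml || echo "MAC mismatch"`` would flag a
file (here the placeholder ``teaming.xml``) whose two interfaces use different
addresses.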

Booting a VM with the above configuration will result in the following 3
interfaces created in the VM:
::

  4: ens10: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UP group default qlen 1000
      link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
      inet 192.168.12.53/24 brd 192.168.12.255 scope global dynamic ens10
         valid_lft 42482sec preferred_lft 42482sec
      inet6 fe80::97d8:db2:8c10:b6d6/64 scope link
         valid_lft forever preferred_lft forever
  5: ens10nsby: <BROADCAST,MULTICAST> mtu 1500 qdisc fq_codel master ens10 state DOWN group default qlen 1000
      link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff
  7: ens11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc mq master ens10 state UP group default qlen 1000
      link/ether 52:54:00:00:12:53 brd ff:ff:ff:ff:ff:ff

Here, ens10 is the 'failover' master interface, ens10nsby is the slave 'standby'
virtio-net interface, and ens11 is the slave 'primary' VF passthrough interface.

One point to note here is that some user space network configuration daemons
like systemd-networkd, ifupdown, etc., do not understand the 'net_failover'
device; on the first boot, the VM might end up with both the 'failover' device
and the VF acquiring IP addresses (either the same or different) from the DHCP
server. This results in a lack of connectivity to the VM. So some tweaks might
be needed to these network configuration daemons to make sure that an IP is
received only on the 'failover' device.

Below is the patch snippet used with 'cloud-ifupdown-helper' script found on
Debian cloud images::

  @@ -27,6 +27,8 @@ do_setup() {
       local working="$cfgdir/.$INTERFACE"
       local final="$cfgdir/$INTERFACE"

  +    if [ -d "/sys/class/net/${INTERFACE}/master" ]; then exit 0; fi
  +
       if ifup --no-act "$INTERFACE" > /dev/null 2>&1; then
           # interface is already known to ifupdown, no need to generate cfg
           log "Skipping configuration generation for $INTERFACE"
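
The same sysfs ``master`` link that the patch tests can also be used to list
which interfaces are enslaved to a given failover device, for instance when
adapting another configuration daemon. The following is a minimal sketch, not
part of any existing tool: the function name is hypothetical, and the optional
second argument exists only so that the logic can be exercised against a
directory other than ``/sys/class/net``.

```shell
# list_failover_slaves FAILOVER [SYSFS_NET_DIR]
# Prints the name of every interface whose 'master' link points at FAILOVER.
list_failover_slaves() {
    failover=$1
    net_dir=${2:-/sys/class/net}
    for dev in "$net_dir"/*; do
        # Only enslaved interfaces have a 'master' symlink.
        [ -L "$dev/master" ] || continue
        # The last path component of the link target is the master's name.
        master=$(basename "$(readlink "$dev/master")")
        [ "$master" = "$failover" ] && basename "$dev"
    done
    return 0
}
```

On the example VM above, ``list_failover_slaves ens10`` would be expected to
report ens10nsby and ens11, since both show ``master ens10`` in the ``ip``
output.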

Live Migration of a VM with SR-IOV VF & virtio-net in STANDBY mode
==================================================================

net_failover also enables hypervisor controlled live migration to be supported
with VMs that have direct attached SR-IOV VF devices by automatic failover to
the paravirtual datapath when the VF is unplugged.

Here is a sample script that shows the steps to initiate live migration from
the source hypervisor. Note: It is assumed that the VM is connected to a
software bridge 'br0' which has a single VF attached to it along with the vnet
device to the VM. This is not the VF that was passed through to the VM (seen in
the vf.xml file).
::

  # cat vf.xml
  <interface type='hostdev' managed='yes'>
    <mac address='52:54:00:00:12:53'/>
    <source>
      <address type='pci' domain='0x0000' bus='0x42' slot='0x02' function='0x5'/>
    </source>
    <teaming type='transient' persistent='ua-backup0'/>
  </interface>

  # Source Hypervisor migrate.sh
  #!/bin/bash

  DOMAIN=vm-01
  PF=ens6np0
  VF=ens6v1             # VF attached to the bridge.
  VF_NUM=1
  TAP_IF=vmtap01        # virtio-net interface in the VM.
  VF_XML=vf.xml

  MAC=52:54:00:00:12:53
  ZERO_MAC=00:00:00:00:00:00

  # The destination hypervisor must be supplied by the caller's environment.
  : "${REMOTE_HOST:?REMOTE_HOST must be set}"

  # Set the virtio-net interface up.
  virsh domif-setlink $DOMAIN $TAP_IF up

  # Remove the VF that was passed through to the VM.
  virsh detach-device --live --config $DOMAIN $VF_XML

  ip link set $PF vf $VF_NUM mac $ZERO_MAC

  # Add FDB entry for traffic to continue going to the VM via
  # the VF -> br0 -> vnet interface path.
  bridge fdb add $MAC dev $VF
  bridge fdb add $MAC dev $TAP_IF master

  # Migrate the VM
  virsh migrate --live --persistent $DOMAIN qemu+ssh://$REMOTE_HOST/system

  # Clean up FDB entries after migration completes.
  bridge fdb del $MAC dev $VF
  bridge fdb del $MAC dev $TAP_IF master

On the destination hypervisor, a shared bridge 'br0' is created before migration
starts, and a VF from the destination PF is added to the bridge. Similarly, an
appropriate FDB entry is added.
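
The destination-side preparation is described only in prose, so the following is
a hedged sketch of what it could look like rather than a tested procedure. The
bridge name 'br0', the VF name ens36v0 and the MAC address are taken from the
scripts in this section; the function names, the ``DRY_RUN`` switch and the
exact command sequence are assumptions. Only the FDB entry on the VF is shown,
mirroring the source-side script; an entry for the tap device is left out
because that device presumably exists only once the incoming VM has been
created.

```shell
BRIDGE=br0
VF=ens36v0                  # destination VF to attach to the bridge
MAC=52:54:00:00:12:53       # MAC shared by the virtio-net and VF interfaces

# run CMD...: execute CMD, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# Create the shared bridge, enslave the VF and add the FDB entry.
# Must be run as root unless DRY_RUN=1 is set.
prepare_destination() {
    run ip link add name "$BRIDGE" type bridge
    run ip link set "$BRIDGE" up
    run ip link set "$VF" master "$BRIDGE"
    run ip link set "$VF" up
    # Mirror of the source-side entry for the VM's MAC on the bridged VF.
    run bridge fdb add "$MAC" dev "$VF"
}
```

Setting ``DRY_RUN=1`` before calling ``prepare_destination`` prints the commands
for review without touching the host's network configuration.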

The following script is executed on the destination hypervisor once migration
completes, and it reattaches the VF to the VM and brings down the virtio-net
interface::

  # reattach-vf.sh
  #!/bin/bash

  bridge fdb del 52:54:00:00:12:53 dev ens36v0
  bridge fdb del 52:54:00:00:12:53 dev vmtap01 master
  virsh attach-device --config --live vm01 vf.xml
  virsh domif-setlink vm01 vmtap01 down
  145. virsh domif-setlink vm01 vmtap01 down