=================================================================
Intel Omni-Path (OPA) Virtual Network Interface Controller (VNIC)
=================================================================

Intel Omni-Path (OPA) Virtual Network Interface Controller (VNIC) feature
supports Ethernet functionality over Omni-Path fabric by encapsulating
the Ethernet packets between HFI nodes.

Architecture
=============
The pattern of exchange of Omni-Path encapsulated Ethernet packets
involves one or more virtual Ethernet switches overlaid on the Omni-Path
fabric topology. A subset of HFI nodes on the Omni-Path fabric are
permitted to exchange encapsulated Ethernet packets across a particular
virtual Ethernet switch. The virtual Ethernet switches are logical
abstractions achieved by configuring the HFI nodes on the fabric for
header generation and processing. In the simplest configuration, all HFI
nodes across the fabric exchange encapsulated Ethernet packets over a
single virtual Ethernet switch. A virtual Ethernet switch is effectively
an independent Ethernet network. The configuration is performed by an
Ethernet Manager (EM), which is part of the trusted Fabric Manager (FM)
application. HFI nodes can have multiple VNICs, each connected to a
different virtual Ethernet switch. The diagram below presents a case
of two virtual Ethernet switches with two HFI nodes::

                               +-------------------+
                               |      Subnet/      |
                               |     Ethernet      |
                               |      Manager      |
                               +-------------------+
                                  /          /
                                /           /
                              /            /
                            /             /
  +-----------------------------+  +------------------------------+
  |  Virtual Ethernet Switch    |  |  Virtual Ethernet Switch     |
  |  +---------+    +---------+ |  | +---------+    +---------+   |
  |  | VPORT   |    |  VPORT  | |  | |  VPORT  |    |  VPORT  |   |
  +--+---------+----+---------+-+  +-+---------+----+---------+---+
           |                 \        /                 |
           |                   \    /                   |
           |                     \/                     |
           |                    /  \                    |
           |                  /      \                  |
       +-----------+------------+  +-----------+------------+
       |   VNIC    |    VNIC    |  |    VNIC   |    VNIC    |
       +-----------+------------+  +-----------+------------+
       |          HFI           |  |          HFI           |
       +------------------------+  +------------------------+


The Omni-Path encapsulated Ethernet packet format is as described below.

==================== ================================
Bits                 Field
==================== ================================
Quad Word 0:
0-19                 SLID (lower 20 bits)
20-30                Length (in Quad Words)
31                   BECN bit
32-51                DLID (lower 20 bits)
52-56                SC (Service Class)
57-59                RC (Routing Control)
60                   FECN bit
61-62                L2 (=10, 16B format)
63                   LT (=1, Link Transfer Head Flit)

Quad Word 1:
0-7                  L4 type (=0x78 ETHERNET)
8-11                 SLID[23:20]
12-15                DLID[23:20]
16-31                PKEY
32-47                Entropy
48-63                Reserved

Quad Word 2:
0-15                 Reserved
16-31                L4 header
32-63                Ethernet Packet

Quad Words 3 to N-1:
0-63                 Ethernet packet (pad extended)

Quad Word N (last):
0-23                 Ethernet packet (pad extended)
24-55                ICRC
56-61                Tail
62-63                LT (=01, Link Transfer Tail Flit)
==================== ================================

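The bit layout of Quad Words 0 and 1 can be illustrated with a small
user-space sketch. It is not taken from the driver: the structure and
function names (``vnic_hdr_fields``, ``put_bits``, ``get_bits``,
``vnic_build_qw01``) are hypothetical, and it assumes that bit 0 in the
table is the least significant bit of each 64-bit quad word. The BECN,
FECN and reserved bits are simply left at zero.

```c
#include <stdint.h>

/* Hypothetical container for the header fields; not a driver structure. */
struct vnic_hdr_fields {
	uint32_t slid;     /* 24-bit source LID */
	uint32_t dlid;     /* 24-bit destination LID */
	uint16_t len_qw;   /* packet length in quad words (11 bits) */
	uint8_t  sc;       /* service class (5 bits) */
	uint8_t  rc;       /* routing control (3 bits) */
	uint16_t pkey;
	uint16_t entropy;
};

/* Store val in bits [lo, lo + width) of qw; assumes bit 0 is the LSB. */
static uint64_t put_bits(uint64_t qw, unsigned lo, unsigned width, uint64_t val)
{
	uint64_t mask = ((UINT64_C(1) << width) - 1) << lo;

	return (qw & ~mask) | ((val << lo) & mask);
}

/* Read bits [lo, lo + width) of qw. */
static uint64_t get_bits(uint64_t qw, unsigned lo, unsigned width)
{
	return (qw >> lo) & ((UINT64_C(1) << width) - 1);
}

/* Fill Quad Words 0 and 1 following the table above. */
static void vnic_build_qw01(const struct vnic_hdr_fields *f,
			    uint64_t *qw0, uint64_t *qw1)
{
	uint64_t q0 = 0, q1 = 0;

	q0 = put_bits(q0, 0, 20, f->slid);         /* SLID (lower 20 bits) */
	q0 = put_bits(q0, 20, 11, f->len_qw);      /* Length in quad words */
	q0 = put_bits(q0, 32, 20, f->dlid);        /* DLID (lower 20 bits) */
	q0 = put_bits(q0, 52, 5, f->sc);           /* SC */
	q0 = put_bits(q0, 57, 3, f->rc);           /* RC */
	q0 = put_bits(q0, 61, 2, 0x2);             /* L2 = 10b, 16B format */
	q0 = put_bits(q0, 63, 1, 1);               /* LT = 1, head flit */

	q1 = put_bits(q1, 0, 8, 0x78);             /* L4 type: ETHERNET */
	q1 = put_bits(q1, 8, 4, f->slid >> 20);    /* SLID[23:20] */
	q1 = put_bits(q1, 12, 4, f->dlid >> 20);   /* DLID[23:20] */
	q1 = put_bits(q1, 16, 16, f->pkey);        /* PKEY */
	q1 = put_bits(q1, 32, 16, f->entropy);     /* Entropy */

	*qw0 = q0;
	*qw1 = q1;
}
```

A receiver working under the same assumptions could recover the full
24-bit SLID as ``get_bits(qw0, 0, 20) | (get_bits(qw1, 8, 4) << 20)``.
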
The Ethernet packet is padded on the transmit side to ensure that the VNIC
OPA packet is quad word aligned. The 'Tail' field contains the number of
bytes padded. On the receive side, the 'Tail' field is read and the padding
is removed (along with the ICRC, Tail and OPA header) before the packet is
passed up the network stack.

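The padding arithmetic can be sketched as follows. The byte counts are
derived from the table above rather than from the driver source and are
therefore assumptions: 20 bytes of OPA header (Quad Words 0 and 1 plus
the first 32 bits of Quad Word 2), 4 bytes of ICRC and 1 byte holding
the Tail and LT bits. All names are hypothetical.

```c
#include <stddef.h>

/* Sizes derived from the packet format table (assumptions, see above). */
#define VNIC_HDR_BYTES   20u  /* QW0 + QW1 + bits 0-31 of QW2 */
#define VNIC_ICRC_BYTES   4u
#define VNIC_TAIL_BYTES   1u  /* 6-bit Tail + 2-bit LT */
#define VNIC_OVERHEAD    (VNIC_HDR_BYTES + VNIC_ICRC_BYTES + VNIC_TAIL_BYTES)

/* Pad bytes needed so the whole packet is a multiple of 8 bytes (0..7). */
static unsigned vnic_pad_bytes(size_t eth_len)
{
	size_t unpadded = VNIC_OVERHEAD + eth_len;

	return (unsigned)((8u - (unpadded & 7u)) & 7u);
}

/* Total packet length in quad words, as carried in the Length field. */
static size_t vnic_wire_qwords(size_t eth_len)
{
	return (VNIC_OVERHEAD + eth_len + vnic_pad_bytes(eth_len)) / 8u;
}

/* Receive side: recover the Ethernet length from wire size and Tail. */
static size_t vnic_rx_eth_len(size_t wire_bytes, unsigned tail)
{
	return wire_bytes - VNIC_OVERHEAD - tail;
}
```

Under these assumptions a 60-byte Ethernet frame needs 3 pad bytes and
occupies 11 quad words (88 bytes) on the wire. The pad count never
exceeds 7, so it fits in the 6-bit 'Tail' field.
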
The L4 header field contains the id of the virtual Ethernet switch that the
VNIC port belongs to. On the receive side, this field is used to
de-multiplex the received VNIC packets to different VNIC ports.

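A minimal sketch of this de-multiplexing step is shown below. It assumes
that bit 0 is the least significant bit of Quad Word 2 and, for
simplicity, treats the whole 16-bit L4 header as the virtual Ethernet
switch id; the actual encoding inside the L4 header is not described
here. The ``toy_vnic_port`` structure and the linear lookup are
hypothetical and only illustrate the idea.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-port record; not a driver structure. */
struct toy_vnic_port {
	uint16_t vesw_id;   /* virtual Ethernet switch id of this port */
	const char *name;   /* illustrative interface name */
};

/* Extract the L4 header (bits 16-31 of Quad Word 2, bit 0 = LSB). */
static uint16_t toy_l4_hdr(uint64_t qw2)
{
	return (uint16_t)((qw2 >> 16) & 0xFFFFu);
}

/* Return the port whose switch id matches the packet, or NULL. */
static const struct toy_vnic_port *
toy_vnic_demux(const struct toy_vnic_port *ports, size_t nports, uint64_t qw2)
{
	uint16_t id = toy_l4_hdr(qw2);
	size_t i;

	for (i = 0; i < nports; i++) {
		if (ports[i].vesw_id == id)
			return &ports[i];
	}
	return NULL;  /* no VNIC port on this switch: drop the packet */
}
```

A packet whose L4 header matches no configured port has nowhere to go,
which is why the sketch returns ``NULL`` instead of a default port.
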
Driver Design
==============
The Intel OPA VNIC software design is presented in the diagram below.
OPA VNIC functionality has a HW dependent component and a HW
independent component.

Support has been added for the IB device to allocate and free RDMA
netdev devices. The RDMA netdev supports interfacing with the network
stack, thus creating standard network interfaces. OPA_VNIC is an RDMA
netdev device type.

The HW dependent VNIC functionality is part of the HFI1 driver. It
implements the verbs to allocate and free the OPA_VNIC RDMA netdev.
It involves HW resource allocation/management for VNIC functionality.
It interfaces with the network stack and implements the required
net_device_ops functions. It expects Omni-Path encapsulated Ethernet
packets in the transmit path and provides HW access to them. It strips
the Omni-Path header from the received packets before passing them up
the network stack. It also implements the RDMA netdev control operations.

The OPA VNIC module implements the HW independent VNIC functionality.
It consists of two parts. The VNIC Ethernet Management Agent (VEMA)
registers itself with IB core as an IB client and interfaces with the
IB MAD stack. It exchanges management information with the Ethernet
Manager (EM) and the VNIC netdev. The VNIC netdev part allocates and frees
the OPA_VNIC RDMA netdev devices. It overrides the net_device_ops functions
set by the HW dependent VNIC driver where required to accommodate any
control operation. It also handles the encapsulation of Ethernet packets
with an Omni-Path header in the transmit path. For each VNIC interface, the
information required for encapsulation is configured by the EM via the VEMA
MAD interface. The VNIC netdev part also passes any control information to
the HW dependent driver by invoking the RDMA netdev control operations::

        +-------------------+ +----------------------+
        |                   | |       Linux          |
        |     IB MAD        | |      Network         |
        |                   | |       Stack          |
        +-------------------+ +----------------------+
                 |               |          |
                 |               |          |
        +----------------------------+      |
        |                            |      |
        |      OPA VNIC Module       |      |
        |  (OPA VNIC RDMA Netdev     |      |
        |     & EMA functions)       |      |
        |                            |      |
        +----------------------------+      |
                    |                       |
                    |                       |
           +------------------+             |
           |     IB core      |             |
           +------------------+             |
                    |                       |
                    |                       |
        +--------------------------------------------+
        |                                            |
        |      HFI1 Driver with VNIC support         |
        |                                            |
        +--------------------------------------------+

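The layering described above, where the HW independent part overrides
selected operations set by the HW dependent driver and then hands the
encapsulated packet down to it, can be modelled with a toy user-space
sketch. Nothing here is kernel API: ``toy_netdev``, ``toy_netdev_ops``
and the functions are hypothetical stand-ins, the 20-byte header size is
derived from the packet format table, and the header itself is left
zeroed rather than filled in.

```c
#include <stddef.h>
#include <string.h>

#define TOY_ENCAP_BYTES 20    /* header size derived from the format table */
#define TOY_MAX_PKT     2048  /* arbitrary limit for this sketch */

struct toy_netdev;

/* Toy stand-in for an operations table such as net_device_ops. */
struct toy_netdev_ops {
	int (*open)(struct toy_netdev *dev);
	int (*xmit)(struct toy_netdev *dev, const unsigned char *buf, size_t len);
};

struct toy_netdev {
	const struct toy_netdev_ops *ops;     /* ops seen by the stack */
	const struct toy_netdev_ops *hw_ops;  /* ops saved from the HW driver */
	struct toy_netdev_ops vnic_ops;       /* HW ops with overrides applied */
	int opened;
	size_t last_xmit_len;
};

/* "HW dependent" side: expects an already encapsulated packet. */
static int toy_hw_open(struct toy_netdev *dev)
{
	dev->opened = 1;
	return 0;
}

static int toy_hw_xmit(struct toy_netdev *dev, const unsigned char *buf,
		       size_t len)
{
	(void)buf;                 /* a real driver would hand this to HW */
	dev->last_xmit_len = len;
	return 0;
}

static const struct toy_netdev_ops toy_hw_ops = { toy_hw_open, toy_hw_xmit };

/* "HW independent" side: prepend a header, then call the saved HW op. */
static int toy_vnic_xmit(struct toy_netdev *dev, const unsigned char *buf,
			 size_t len)
{
	unsigned char pkt[TOY_MAX_PKT];

	if (len > TOY_MAX_PKT - TOY_ENCAP_BYTES)
		return -1;
	memset(pkt, 0, TOY_ENCAP_BYTES);           /* placeholder header */
	memcpy(pkt + TOY_ENCAP_BYTES, buf, len);   /* Ethernet frame */
	return dev->hw_ops->xmit(dev, pkt, len + TOY_ENCAP_BYTES);
}

/* Save the HW ops and override only the entries that need wrapping. */
static void toy_vnic_attach(struct toy_netdev *dev)
{
	dev->hw_ops = dev->ops;
	dev->vnic_ops = *dev->ops;
	dev->vnic_ops.xmit = toy_vnic_xmit;
	dev->ops = &dev->vnic_ops;
}
```

In this model ``open`` still reaches the HW side unchanged, while
``xmit`` passes through the encapsulation wrapper first, which mirrors
the "overrides where required" wording above.
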