.. SPDX-License-Identifier: GPL-2.0+

=====================================
Meta Platforms Host Network Interface
=====================================

Firmware Versions
-----------------

fbnic has three components stored on the flash which are provided in one PLDM
image:

1. fw - The control firmware used to view and modify firmware settings, request
   firmware actions, and retrieve firmware counters outside of the data path.
   This is the firmware which fbnic_fw.c interacts with.
2. bootloader - The firmware which validates firmware security and controls
   basic operations including loading and updating the firmware. This is also
   known as the cmrt firmware.
3. undi - This is the UEFI driver which is based on the Linux driver.

fbnic stores two copies of these three components on flash. This allows fbnic
to fall back to an older version of firmware automatically in case the firmware
fails to boot. Version information for both is provided as running and stored.
The undi is only provided in stored as it is not actively running once the Linux
driver takes over.

``devlink dev info`` provides version information for all three components. In
addition to the version, the hg commit hash of the build is included as a
separate entry.

Configuration
-------------

Ringparams (ethtool -g / -G)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~

fbnic has two submission (host -> device) rings for every completion
(device -> host) ring. The three ring objects together form a single
"queue" as used by higher layer software (an Rx, or a Tx queue).

For Rx the two submission rings are used to pass empty pages to the NIC.
Ring 0 is the Header Page Queue (HPQ); the NIC uses its pages to place
L2-L4 headers (or full frames if the frame is not header-data split).
Ring 1 is the Payload Page Queue (PPQ) and is used for packet payloads.
The completion ring is used to receive packet notifications / metadata.
ethtool ``rx`` ringparam maps to the size of the completion ring,
``rx-mini`` to the HPQ, and ``rx-jumbo`` to the PPQ.

For Tx both submission rings can be used to submit packets, and the completion
ring carries notifications for both. fbnic uses one of the submission
rings for normal traffic from the stack and the second one for XDP frames.
ethtool ``tx`` ringparam controls the size of both the submission rings
and the completion ring.

Every single entry on the HPQ and PPQ (``rx-mini``, ``rx-jumbo``)
corresponds to 4kB of allocated memory, while entries on the remaining
rings are in units of descriptors (8B). The ideal ratio of submission
and completion ring sizes depends on the workload, because multiple small
packets can fit into a single page.
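
The sizing rules above can be turned into a rough memory estimate for one Rx
queue. The following is a minimal sketch, not driver code: the function name
and the example ring sizes are hypothetical, and it only counts the 4kB pages
behind HPQ/PPQ entries and the 8B descriptors of the completion ring.

.. code-block:: python

   PAGE_ENTRY_BYTES = 4 * 1024  # each HPQ/PPQ entry corresponds to 4kB
   DESCRIPTOR_BYTES = 8         # entries on the remaining rings are 8B


   def rx_queue_footprint(rx, rx_mini, rx_jumbo):
       """Return (page_bytes, descriptor_bytes) for a single Rx queue.

       rx       - completion ring size (ethtool ``rx``)
       rx_mini  - HPQ size (ethtool ``rx-mini``)
       rx_jumbo - PPQ size (ethtool ``rx-jumbo``)
       """
       # Pages handed to the NIC through the two submission rings.
       page_bytes = (rx_mini + rx_jumbo) * PAGE_ENTRY_BYTES
       # Descriptor storage of the completion ring only.
       descriptor_bytes = rx * DESCRIPTOR_BYTES
       return page_bytes, descriptor_bytes


   # Hypothetical ring sizes, chosen only for illustration.
   pages, descriptors = rx_queue_footprint(rx=1024, rx_mini=256, rx_jumbo=512)
   print(pages, descriptors)

With these example values the page memory (3 MiB) dwarfs the completion ring
(8 KiB), which is why ``rx-mini`` and ``rx-jumbo`` dominate the memory cost of
a queue.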

Upgrading Firmware
------------------

fbnic supports updating firmware using signed PLDM images with ``devlink dev
flash``. PLDM images are written into the flash. Flashing does not interrupt
the operation of the device.

On host boot the latest UEFI driver is always used, so no explicit activation
is required. Firmware activation is required to run new control firmware. cmrt
firmware can only be activated by power cycling the NIC.

Health reporters
----------------

fw reporter
~~~~~~~~~~~

The ``fw`` health reporter tracks FW crashes. Dumping the reporter will
show the core dump of the most recent FW crash or, if no FW crash has
happened since power cycle, a snapshot of the FW memory. The diagnose callback
shows FW uptime based on the most recently received heartbeat message
(crashes are detected by checking whether uptime goes down).
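
The uptime-based detection described above can be illustrated with a short
sketch. This is not the driver's implementation: the function name is
hypothetical, and the uptime values are assumed to be plain numbers taken from
successive heartbeat messages in arrival order.

.. code-block:: python

   def count_fw_crashes(uptimes):
       """Count how many times the reported FW uptime went backwards."""
       crashes = 0
       previous = None
       for uptime in uptimes:
           # A restart resets the uptime, so a drop implies a crash happened.
           if previous is not None and uptime < previous:
               crashes += 1
           previous = uptime
       return crashes


   # Two drops (30 -> 5 and 15 -> 2) mean two detected crashes.
   print(count_fw_crashes([10, 20, 30, 5, 15, 2]))

A limitation of this scheme is that a crash is only noticed once a heartbeat
with a lower uptime arrives.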

otp reporter
~~~~~~~~~~~~

OTP memory ("fuses") is used for secure boot and anti-rollback
protection. The OTP memory is ECC protected; ECC errors indicate
either a manufacturing defect or a part deteriorating with age.

Statistics
----------

TX MAC Interface
~~~~~~~~~~~~~~~~

- ``ptp_illegal_req``: packets sent to the NIC with PTP request bit set but routed to BMC/FW
- ``ptp_good_ts``: packets successfully routed to MAC with PTP request bit set
- ``ptp_bad_ts``: packets destined for MAC with PTP request bit set but aborted because of some error (e.g., DMA read error)

TX Extension (TEI) Interface (TTI)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

- ``tti_cm_drop``: control messages dropped at the TX Extension (TEI) Interface because of credit starvation
- ``tti_frame_drop``: packets dropped at the TX Extension (TEI) Interface because of credit starvation
- ``tti_tbi_drop``: packets dropped at the TX BMC Interface (TBI) because of credit starvation

RXB (RX Buffer) Enqueue
~~~~~~~~~~~~~~~~~~~~~~~

- ``rxb_integrity_err[i]``: frames enqueued with integrity errors (e.g., multi-bit ECC errors) on RXB input i
- ``rxb_mac_err[i]``: frames enqueued with MAC end-of-frame errors (e.g., bad FCS) on RXB input i
- ``rxb_parser_err[i]``: frames that experienced RPC parser errors
- ``rxb_frm_err[i]``: frames that experienced signaling errors (e.g., missing end-of-packet/start-of-packet) on RXB input i
- ``rxb_drbo[i]_frames``: frames received at RXB input i
- ``rxb_drbo[i]_bytes``: bytes received at RXB input i

RXB (RX Buffer) FIFO
~~~~~~~~~~~~~~~~~~~~

- ``rxb_fifo[i]_drop``: transitions into the drop state on RXB pool i
- ``rxb_fifo[i]_dropped_frames``: frames dropped on RXB pool i
- ``rxb_fifo[i]_ecn``: transitions into the ECN mark state on RXB pool i
- ``rxb_fifo[i]_level``: current occupancy of RXB pool i

RXB (RX Buffer) Dequeue
~~~~~~~~~~~~~~~~~~~~~~~

- ``rxb_intf[i]_frames``: frames sent to output i
- ``rxb_intf[i]_bytes``: bytes sent to output i
- ``rxb_pbuf[i]_frames``: frames sent to output i from the perspective of the internal packet buffer
- ``rxb_pbuf[i]_bytes``: bytes sent to output i from the perspective of the internal packet buffer

RPC (Rx parser)
~~~~~~~~~~~~~~~

- ``rpc_unkn_etype``: frames containing unknown EtherType
- ``rpc_unkn_ext_hdr``: frames containing unknown IPv6 extension header
- ``rpc_ipv4_frag``: frames containing IPv4 fragment
- ``rpc_ipv6_frag``: frames containing IPv6 fragment
- ``rpc_ipv4_esp``: frames with IPv4 ESP encapsulation
- ``rpc_ipv6_esp``: frames with IPv6 ESP encapsulation
- ``rpc_tcp_opt_err``: frames which encountered TCP option parsing error
- ``rpc_out_of_hdr_err``: frames where header was larger than parsable region
- ``ovr_size_err``: oversized frames

Hardware Queues
~~~~~~~~~~~~~~~

1. RX DMA Engine:

   - ``rde_[i]_pkt_err``: packets with MAC EOP, RPC parser, RXB truncation, or RDE frame truncation errors. These errors are flagged in the packet metadata because of cut-through support, but the actual drop happens once PCIE/RDE is reached.
   - ``rde_[i]_pkt_cq_drop``: packets dropped because RCQ is full
   - ``rde_[i]_pkt_bdq_drop``: packets dropped because HPQ or PPQ ran out of host buffer

PCIe
~~~~

The fbnic driver exposes PCIe hardware performance statistics through debugfs
(``pcie_stats``). These statistics provide insights into PCIe transaction
behavior and potential performance bottlenecks.

1. PCIe Transaction Counters:

   These counters track PCIe transaction activity:

   - ``pcie_ob_rd_tlp``: Outbound read Transaction Layer Packets count
   - ``pcie_ob_rd_dword``: DWORDs transferred in outbound read transactions
   - ``pcie_ob_wr_tlp``: Outbound write Transaction Layer Packets count
   - ``pcie_ob_wr_dword``: DWORDs transferred in outbound write
     transactions
   - ``pcie_ob_cpl_tlp``: Outbound completion TLP count
   - ``pcie_ob_cpl_dword``: DWORDs transferred in outbound completion TLPs

2. PCIe Resource Monitoring:

   These counters indicate PCIe resource exhaustion events:

   - ``pcie_ob_rd_no_tag``: Read requests dropped due to tag unavailability
   - ``pcie_ob_rd_no_cpl_cred``: Read requests dropped due to completion
     credit exhaustion
   - ``pcie_ob_rd_no_np_cred``: Read requests dropped due to non-posted
     credit exhaustion
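
One way to read the TLP and DWORD counter pairs together is as an average
transfer size per TLP. The sketch below assumes the counters have already been
collected into a dictionary keyed by counter name; the layout of the
``pcie_stats`` debugfs file is not described here, so no parsing is attempted.
The function name and sample values are hypothetical.

.. code-block:: python

   DWORD_BYTES = 4  # a PCIe DWORD is 32 bits


   def avg_bytes_per_tlp(stats, kind):
       """Average bytes per outbound TLP for kind 'rd', 'wr' or 'cpl'.

       ``stats`` maps counter names to integer values. Returns None when
       no TLPs of the requested kind were counted.
       """
       tlps = stats[f"pcie_ob_{kind}_tlp"]
       dwords = stats[f"pcie_ob_{kind}_dword"]
       if tlps == 0:
           return None
       return dwords * DWORD_BYTES / tlps


   # Made-up sample values, for illustration only.
   sample = {"pcie_ob_wr_tlp": 100, "pcie_ob_wr_dword": 6400}
   print(avg_bytes_per_tlp(sample, "wr"))

Because the counters are cumulative, differences between two snapshots are
usually more informative than the raw totals.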

XDP Length Error:
~~~~~~~~~~~~~~~~~

For XDP programs without frags support, fbnic tries to make sure that the MTU
fits into a single buffer. If an oversized frame is received and gets
fragmented, it is dropped and the following netlink counters are updated:

- ``rx-length``: number of frames dropped due to lack of fragmentation
  support in the attached XDP program
- ``rx-errors``: total number of packets with errors received on the interface