Skip to content

[EVPN-MH] Add kernel patches for EVPN VXLAN Multihoming support#540

Merged
saiarcot895 merged 6 commits into
sonic-net:masterfrom
bdfriedman:evpn_mh
May 12, 2026
Merged

[EVPN-MH] Add kernel patches for EVPN VXLAN Multihoming support#540
saiarcot895 merged 6 commits into
sonic-net:masterfrom
bdfriedman:evpn_mh

Conversation

@bdfriedman
Copy link
Copy Markdown
Contributor

@bdfriedman bdfriedman commented Feb 25, 2026

Why I did it

This PR adds three critical Linux kernel patches required to enable EVPN VXLAN Multihoming in SONiC. These kernel enhancements provide the necessary infrastructure for:

  1. Extended neighbor flags for multi-homing peer synchronization
  2. Protocol field tracking in bridge FDB entries to distinguish control plane vs data plane learned MACs
  3. External validation flag to prevent kernel from invalidating externally managed neighbor entries

These patches are essential for implementing the EVPN-MH feature as described in the EVPN VXLAN Multihoming HLD.

NOTE: The file patches-sonic/0003-neighbor-Add-NTF_EXT_VALIDATED-flag-for-externally-v.patch has been committed upstream to the Linux kernel master branch. See torvalds/linux@03dc03f

Work item tracking
  • Microsoft ADO (number only):

How I did it

Added three kernel patches to patches-sonic directory:

1. NDA_FLAGS_EXT Support with NTF_EXT_MH_PEER_SYNC (0001-vxlan-bridge-Add-NDA_FLAGS_EXT-support-with-NTF_EXT_.patch)

This patch adds extended flags support for VXLAN and bridge FDB entries to enable multi-homing peer synchronization:

  • New field: ext_flags in vxlan_fdb structure
  • New flag: NTF_EXT_MH_PEER_SYNC - Indicates FDB entry is synchronized across EVPN-MH peers
  • New neighbor update flag: NEIGH_UPDATE_F_EXT_MH_PEER_SYNC for propagating sync state
  • Modified functions:
    • vxlan_fdb_alloc() - Initialize ext_flags
    • vxlan_fdb_create() - Pass ext_flags parameter
    • vxlan_fdb_update_existing() - Handle ext_flags updates and notifications
    • vxlan_fdb_update_create() - Create FDB with ext_flags
    • vxlan_fdb_info() - Include NDA_FLAGS_EXT in netlink messages
    • Bridge FDB functions - Propagate ext_flags through bridge layer

Files modified:

  • drivers/net/vxlan/vxlan_core.c (140 lines)
  • drivers/net/vxlan/vxlan_private.h (21 lines)
  • drivers/net/vxlan/vxlan_vnifilter.c (11 lines)
  • include/net/neighbour.h (4 lines)
  • include/uapi/linux/neighbour.h (1 line)
  • net/bridge/br.c (4 lines)
  • net/bridge/br_fdb.c (35 lines)
  • net/bridge/br_private.h (5 lines)
  • net/core/neighbour.c (13 lines)

2. Protocol Field in Bridge FDB (0001-net-bridge-vxlan-Protocol-field-in-bridge-fdb.patch)

This patch introduces an optional "protocol" field for bridge FDB entries to distinguish between control plane and data plane learned MAC addresses:

Purpose: In EVPN Multihoming, MAC addresses can be learned via:

  • Control plane (ZEBRA protocol): Static MACs distributed by FRR/BGP
  • Data plane (HW protocol): Dynamic MACs learned from traffic with aging enabled

This distinction enables:

  • Proper state machine management during MAC transitions
  • Handling traffic hashing between EVPN-MH peers
  • Managing MAC mobility across EVPN peers
  • Synchronization between control and data planes

Implementation:

  • New field: protocol in net_bridge_fdb_entry and vxlan_fdb structures
  • Protocol values: Uses standard routing protocol values (RTPROT_UNSPEC, RTPROT_ZEBRA, RTPROT_KERNEL, etc.)
  • Default: RTPROT_UNSPEC when protocol not specified (backward compatible)
  • NDA_PROTOCOL attribute: Encoded in netlink messages for FDB entries

Usage Example:

# Add MAC with hardware protocol (data plane learned)
bridge fdb add 00:00:00:00:00:88 dev hostbond2 vlan 1000 master dynamic extern_learn proto hw

# Display with protocol field
bridge -d fdb show dev hostbond2
# Output: 00:00:00:00:00:88 vlan 1000 extern_learn master br1000 proto hw

# Transition to zebra (control plane)
bridge fdb replace 00:00:00:00:00:88 dev hostbond2 vlan 1000 master dynamic extern_learn proto zebra

Files modified:

  • drivers/net/vxlan/vxlan_core.c (55 lines)
  • drivers/net/vxlan/vxlan_private.h (5 lines)
  • drivers/net/vxlan/vxlan_vnifilter.c (4 lines)
  • net/bridge/br.c (2 lines)
  • net/bridge/br_fdb.c (55 lines)
  • net/bridge/br_private.h (5 lines)

3. NTF_EXT_VALIDATED Flag for External Validation (0001-neighbor-Add-NTF_EXT_VALIDATED-flag-for-externally-v.patch)

This patch adds a new "extern_valid" neighbor flag to indicate entries learned and validated externally that should not be invalidated by the kernel:

Background: In EVPN multi-homing:

  • Each host is multi-homed via Ethernet Segment (ES/LAG) to multiple VTEPs
  • Neighbor entries are distributed to ES peers using EVPN MAC/IP advertisement routes
  • When an ES link goes down, EVPN routes are withdrawn, causing intermittent failures

Solution (based on draft-rbickhart-evpn-ip-mac-proxy-adv-03):

  • ES peers install neighbor entries and inject proxy EVPN MAC/IP advertisements
  • When ES link goes down, ES peers start aging timers instead of immediately withdrawing
  • If an ES peer locally learns the entry (becomes "reachable"), it restarts timer and removes proxy indication
  • Prevents intermittent routing failures during ES link transitions

Implementation:

  • New flag: NTF_EXT_VALIDATED (extern_valid) - Entry is externally validated
  • Behavior:
    • Kernel will NOT remove or invalidate the entry
    • Kernel can probe the entry and notify user space when it becomes "reachable"
    • If no confirmation received, kernel returns entry to "stale" state (NOT "failed" state)
    • Control plane (FRR) manages entry lifecycle
  • Initial state: "stale" when installed by control plane
  • State transitions: Kernel notifies control plane when entry becomes "reachable"

Use case: Required for EVPN-MH proxy advertisements where control plane needs full control over neighbor entry validity and removal decisions.

Files modified:

  • Neighbor subsystem for external validation support
  • Netlink attributes for extern_valid flag
  • State machine modifications

How to verify it

  1. Build kernel with these patches applied:

    cd sonic-linux-kernel
    make BLDENV=bookworm
  2. Verify NDA_FLAGS_EXT support:

    # Add FDB entry with extended flags
    bridge fdb add <mac> dev <vxlan-dev> dst <vtep-ip> vni <vni> extern_learn
    
    # Verify in kernel via netlink dump
    bridge -d fdb show | grep <mac>
  3. Verify protocol field support:

    # Add MAC with specific protocol
    bridge fdb add <mac> dev <device> vlan <vid> master dynamic extern_learn proto hw
    
    # Verify protocol shows up
    bridge -d fdb show dev <device> | grep <mac>
    # Expected output includes: proto hw
    
    # Transition protocol
    bridge fdb replace <mac> dev <device> vlan <vid> master dynamic extern_learn proto zebra
    
    # Verify protocol changed
    bridge -d fdb show dev <device> | grep <mac>
    # Expected output includes: proto zebra
  4. Verify extern_valid flag:

    # Add neighbor with extern_valid flag (via FRR/control plane)
    # Entry should remain in "stale" state and not be removed by kernel GC
    
    # Monitor neighbor state transitions
    ip -d neigh show
  5. Integration testing with EVPN-MH:

    • Configure EVPN multi-homing with ES peers
    • Verify MAC/neighbor synchronization across peers
    • Test ES link failure scenarios
    • Verify proxy advertisements and aging behavior
    • Confirm no intermittent routing/ARP failures during transitions
  6. Compatibility testing:

    • Verify existing bridge/VXLAN functionality still works
    • Test backward compatibility (entries without new fields/flags)
    • Confirm no regressions in non-EVPN scenarios

Which release branch to backport (provide reason below if selected)

  • 202305
  • 202311
  • 202405
  • 202411
  • 202505
  • 202511

Tested branch (Please provide the tested image version)

Description for the changelog

Add kernel patches for EVPN VXLAN Multihoming: extended FDB flags (NTF_EXT_MH_PEER_SYNC), protocol field for bridge FDB entries, and extern_valid flag for externally validated neighbor entries

Link to config_db schema for YANG model changes

N/A - This PR only adds kernel patches, no CONFIG_DB schema changes

Depends on

Related upstream work

  • EVPN MAC/IP proxy advertisement draft: draft-rbickhart-evpn-ip-mac-proxy-adv-03
  • Kernel patch for protocol field: Authored by Mrinmoy Ghosh mrghosh@cisco.com

Summary:

  • Total patches: 3
  • Total lines added: +1,558
  • Kernel subsystems modified: VXLAN driver, bridge FDB, neighbor subsystem, netlink attributes
  • Backward compatible: Yes - all new fields/flags are optional with sensible defaults

Critical for EVPN-MH:
✅ Peer synchronization flag (NTF_EXT_MH_PEER_SYNC)
✅ Control/data plane MAC distinction (protocol field)
✅ External neighbor validation (extern_valid flag)
✅ Proxy advertisement support
✅ Prevents intermittent EVPN-MH failures

@bdfriedman bdfriedman requested a review from a team as a code owner February 25, 2026 22:19
@mssonicbld
Copy link
Copy Markdown

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@lguohan
Copy link
Copy Markdown
Contributor

lguohan commented Mar 11, 2026

@bdfriedman , are these PR upstreamed to linux kernel already? or in the process of upstream?

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Bug in br.c (patch 2): br_switchdev_event passes fdb_info->locked (bool) as the protocol parameter and RTPROT_UNSPEC as ext_flags — argument order is wrong after the signature change
  • Bug in br_fdb_add (patch 2): protocol is initialized to RTPROT_UNSPEC but never parsed from tb[NDA_PROTOCOL] — the add path silently ignores the user-supplied protocol, unlike the delete path which parses it correctly
  • Patch 1 (NDA_FLAGS_EXT) and Patch 3 (NTF_EXT_VALIDATED) look structurally sound
  • Patches are not upstream-accepted — these appear to be custom SONiC patches carrying significant kernel neighbor/FDB subsystem changes

- bool swdev_notify, u32 ext_flags)
+ const unsigned char *addr, u16 vid, u8 protocol,
+ bool locked, bool swdev_notify, u32 ext_flags)
{
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: protocol is never parsed from netlink in the add path. The variable is initialized to RTPROT_UNSPEC but there's no if (tb[NDA_PROTOCOL]) protocol = nla_get_u8(tb[NDA_PROTOCOL]); anywhere in br_fdb_add. Compare with br_fdb_delete which correctly parses it. This means bridge fdb add ... proto hw will silently ignore the protocol — it will always store RTPROT_UNSPEC.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still open. br_fdb_add initializes u8 protocol = RTPROT_UNSPEC but never parses tb[NDA_PROTOCOL]. The parsing only exists in br_fdb_delete and vxlan_fdb_parse, not in the bridge add path.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still unaddressed. br_fdb_add declares u8 protocol = RTPROT_UNSPEC but never parses tb[NDA_PROTOCOL]. The parsing block (if (tb[NDA_PROTOCOL]) protocol = nla_get_u8(...)) exists in br_fdb_delete but is missing from br_fdb_add.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still unaddressed in iteration 2. This bug remains in the current diff.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 41a91c8. br_fdb_add now parses tb[NDA_PROTOCOL] via nla_get_u8. ✅

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still unaddressed in 0ae2cb5. br_fdb_add declares u8 protocol = RTPROT_UNSPEC but never parses tb[NDA_PROTOCOL]. The parsing exists in br_fdb_delete but is still missing from br_fdb_add.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 27a7faf. br_fdb_add now correctly parses tb[NDA_PROTOCOL] via nla_get_u8. ✅

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified fixed in 27a7faf. br_fdb_add now parses NDA_PROTOCOL.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in ad1ebce. br_fdb_add now parses tb[NDA_PROTOCOL] via nla_get_u8. ✅

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in ad1ebce. ✅

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Patch ordering concern: Patch 2 (protocol field) modifies br_fdb_external_learn_add signature from patch 1 but gets argument order wrong in br.c (already noted by prior reviewer).
  • Missing NDA_PROTOCOL parse in br_fdb_add: protocol is initialized to RTPROT_UNSPEC but never read from tb[NDA_PROTOCOL], so bridge fdb add silently ignores user-specified protocol (already noted).
  • Internal callers can silently overwrite protocol: vxlan_snoop and vxlan_fdb_external_learn_add pass RTPROT_UNSPEC — if the entry already has a real protocol set, it gets silently cleared on update.
  • Patches are well-structured overall: Clean kernel patch backports with proper commit messages and sign-offs. The NTF_EXT_VALIDATED and NTF_EXT_MH_PEER_SYNC patches look correct.

int err;

if (!(ndm->ndm_state & (NUD_PERMANENT|NUD_REACHABLE))) {
@@ -1281,7 +1296,7 @@ static int vxlan_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Protocol overwrite on internal update. When vxlan_fdb_update_existing is called by internal callers like vxlan_snoop or vxlan_fdb_external_learn_add with RTPROT_UNSPEC, this block will overwrite any previously-set protocol value back to 0. Consider guarding with if (protocol != RTPROT_UNSPEC && f->protocol != protocol) to avoid unintentional protocol clearing by internal callers that don't carry protocol context.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still open. vxlan_fdb_update_existing unconditionally sets f->protocol = protocol when they differ. Internal callers like vxlan_snoop pass RTPROT_UNSPEC, which will overwrite a previously-set protocol value.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still unaddressed. vxlan_fdb_update_existing unconditionally overwrites f->protocol when values differ. Internal callers like vxlan_snoop pass RTPROT_UNSPEC, which will clear a previously-set protocol value (e.g. RTPROT_ZEBRARTPROT_UNSPEC).

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still unaddressed in iteration 2. This bug remains in the current diff.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still unaddressed in 41a91c8. vxlan_fdb_update_existing unconditionally overwrites f->protocol when values differ. Internal callers like vxlan_snoop pass RTPROT_UNSPEC, which will clear a previously-set protocol (e.g. RTPROT_ZEBRARTPROT_UNSPEC). Needs a guard: if (protocol != RTPROT_UNSPEC && ...).

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed. vxlan_fdb_update_existing now guards with if (protocol != RTPROT_UNSPEC && f->protocol != protocol), preventing snoop from clearing control-plane protocol. ✅

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in ad1ebce. vxlan_fdb_update_existing now guards with if (protocol != RTPROT_UNSPEC && f->protocol != protocol). ✅

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in ad1ebce. ✅

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in ad1ebce. vxlan_fdb_update_existing now guards with if (protocol != RTPROT_UNSPEC && f->protocol != protocol). ✅

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c2ee918. vxlan_fdb_update_existing now guards with if (protocol != RTPROT_UNSPEC && f->protocol != protocol) preventing internal callers from clobbering the protocol. ✅

Comment thread patches-sonic/0002-net-bridge-vxlan-Protocol-field-in-bridge-fdb.patch Outdated
Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • VXLAN FDB protocol field not emitted in netlink dumps: vxlan_fdb_info() is not updated to include NDA_PROTOCOL in the protocol patch, so VXLAN FDB dumps won't show the protocol even though it's stored internally. Bridge side (fdb_fill_info) does this correctly.
  • Existing review comments correctly identify: wrong argument order in br_switchdev_event call, missing NDA_PROTOCOL parsing in br_fdb_add, protocol not propagated through fdb_add_entry path, and protocol overwrite risk from internal VXLAN callers.
  • NTF_EXT_VALIDATED patch is well-designed — proper state validation, GC exemption, stale-instead-of-failed fallback, and carrier-down protection.
  • NTF_EXT_MH_PEER_SYNC / ext_flags plumbing through VXLAN and bridge layers looks correct.
  • All three patches use 0001- prefix — consider renaming to 0001-/0002-/0003- for clarity in the series.

Comment thread patches-sonic/0002-net-bridge-vxlan-Protocol-field-in-bridge-fdb.patch Outdated
@tahmed-dev
Copy link
Copy Markdown
Contributor

Patch 3 (NTF_EXT_VALIDATED) is upstreamed and merged — great. Patches 1 and 2 appear to be Cisco-specific and not yet submitted upstream. Are there plans to upstream these? Carrying out-of-tree kernel patches long-term creates maintenance burden on every kernel version bump. If there are upstream submissions in progress, linking them here would help reviewers assess the risk.

@tahmed-dev
Copy link
Copy Markdown
Contributor

The series file applies NDA_FLAGS_EXT first, then Protocol, then NTF_EXT_VALIDATED. But NTF_EXT_VALIDATED is already in mainline (merged by Kicinski). When SONiC bumps to a kernel that includes it natively, the patch will fail to apply. Should this be gated with a version check, or is the plan to drop it from the series once the kernel is rebased past that commit?

@tahmed-dev
Copy link
Copy Markdown
Contributor

The bugs flagged by banidoru (wrong argument order in br_fdb_external_learn_add, protocol never parsed from netlink in the add path, protocol overwrite on internal update, non-extern_learn path missing propagation, NDA_PROTOCOL not emitted in vxlan_fdb_info) look like they'd break the protocol field feature on real hardware. Are fixes in progress?

@saiarcot895
Copy link
Copy Markdown
Contributor

saiarcot895 commented Apr 20, 2026

Agree with what Tamer has said about upstreaming patches. There is already a fullcone NAT patch in this repo that hasn't been upstreamed, requires significant maintenance work to update, and is currently not getting applied on the Trixie kernel because no one has updated it. I do not want another instance of that.

Additionally, if a patch has been upstreamed, please use the git patch from upstream, adjust for any patch conflicts, and (preferably) add the commit ID in the patch description.

@mssonicbld
Copy link
Copy Markdown

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review (iteration 2):

  • Fixed: vxlan_fdb_info() now emits NDA_PROTOCOL — resolved
  • Still open (bug): br_switchdev_event passes fdb_info->locked as protocol — wrong argument order after patch 0002 changes the signature
  • Still open (bug): br_fdb_add never parses tb[NDA_PROTOCOL] — protocol is always RTPROT_UNSPEC in the bridge add path
  • Still open: vxlan_fdb_update_existing unconditionally overwrites protocol from internal callers passing RTPROT_UNSPEC
  • Still open: non-extern_learn path in __br_fdb_add ignores the protocol parameter

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review (iteration 2) — 4 of 5 previous issues remain unaddressed:

  • br_switchdev_event wrong arg order: fdb_info->locked passed as protocol param — will interpret bool as protocol type
  • br_fdb_add missing NDA_PROTOCOL parse: protocol always RTPROT_UNSPEC on add path
  • vxlan_fdb_update_existing unconditional overwrite: internal callers (snoop) clear previously-set protocol to RTPROT_UNSPEC
  • fdb_add_entry ignores protocol: non-extern_learn bridge FDB adds never store protocol

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Patch 0002 bug: br_switchdev_event passes fdb_info->locked as protocol — wrong argument order after signature change
  • Patch 0002 bug: br_fdb_add never parses tb[NDA_PROTOCOL]bridge fdb add ... proto hw silently ignored
  • Patch 0002 design: vxlan_fdb_update_existing unconditionally overwrites protocol — internal callers (e.g. vxlan_snoop) will clear previously-set protocol values
  • Patch 0002 design: fdb_add_entry (non-extern_learn path) ignores protocol param — silent data loss
  • Patch 0002 fix confirmed: vxlan_fdb_info now emits NDA_PROTOCOL
  • Patches 0001, 0003: No new issues found; well-structured kernel patches with proper state machine handling

@mssonicbld
Copy link
Copy Markdown

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Fixed: br_switchdev_event arg order — RTPROT_UNSPEC now correctly passed as protocol
  • Fixed: br_fdb_add now parses tb[NDA_PROTOCOL] from netlink
  • Still open: vxlan_fdb_update_existing unconditionally overwrites f->protocol — internal callers like vxlan_snoop will clear a previously-set protocol value
  • Still open: fdb_add_entry (non-extern_learn path) ignores protocol — bridge fdb add without extern_learn always stores RTPROT_UNSPEC

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iteration 3 re-review:

  • br_switchdev_event argument order fixed — RTPROT_UNSPEC now correctly passed as protocol
  • br_fdb_add now parses tb[NDA_PROTOCOL]
  • vxlan_fdb_info() emits NDA_PROTOCOL (confirmed in prior iteration)
  • vxlan_fdb_update_existing still unconditionally overwrites f->protocol — internal callers passing RTPROT_UNSPEC will clear a previously-set protocol
  • fdb_add_entry (non-extern_learn path) still ignores the protocol parameter

int err;

if (!(ndm->ndm_state & (NUD_PERMANENT|NUD_REACHABLE))) {
@@ -1281,7 +1296,7 @@ static int vxlan_fdb_add(struct ndmsg *ndm, struct nlattr *tb[],
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still unaddressed in iteration 3. vxlan_fdb_update_existing does if (f->protocol != protocol) { f->protocol = protocol; } but internal callers like vxlan_snoop pass RTPROT_UNSPEC. This will silently clear a protocol value set by the control plane (e.g. RTPROT_ZEBRARTPROT_UNSPEC).

Suggested fix: if (protocol != RTPROT_UNSPEC && f->protocol != protocol)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 0ae2cb5. vxlan_fdb_update_existing now correctly guards with if (protocol != RTPROT_UNSPEC && f->protocol != protocol). ✅

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confirmed fixed in 0ae2cb5. ✅

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified fixed in 0ae2cb5.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in ad1ebce. ✅

Comment thread patches-sonic/0002-net-bridge-vxlan-Protocol-field-in-bridge-fdb.patch Outdated
@mssonicbld
Copy link
Copy Markdown

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@linux-foundation-easycla
Copy link
Copy Markdown

linux-foundation-easycla Bot commented May 4, 2026

CLA Signed

The committers listed above are authorized under a signed CLA.

@mssonicbld
Copy link
Copy Markdown

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.

@mssonicbld
Copy link
Copy Markdown

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command.

Add three Linux kernel patches required for EVPN VXLAN Multihoming in SONiC:

1. NDA_FLAGS_EXT with NTF_EXT_MH_PEER_SYNC - Extended FDB flags for
   multi-homing peer synchronization across EVPN-MH peers.

2. Protocol field in bridge FDB - Distinguishes control plane (zebra)
   vs data plane (hw) learned MACs for proper EVPN-MH state management.

3. NTF_EXT_VALIDATED for externally validated neighbors - Prevents kernel
   from invalidating neighbor entries managed by the control plane during
   ES link transitions.

These patches are essential for the EVPN-MH feature described in:
https://github.com/sonic-net/SONiC/blob/master/doc/vxlan/EVPN/EVPN_VxLAN_Multihoming.md

Co-authored-by: Barry Friedman (friedman) <friedman@cisco.com>
Signed-off-by: Tamer Ahmed <tamerahmed@microsoft.com>
Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iteration 7 re-review (ad1ebce)

✅ All previously reported bugs are now fixed:

  • br_switchdev_event argument order
  • br_fdb_add parsing NDA_PROTOCOL
  • vxlan_fdb_update_existing RTPROT_UNSPEC guard
  • fdb_add_entry protocol propagation
  • vxlan_fdb_info emitting NDA_PROTOCOL
  • br_fdb_external_learn_add update-path RTPROT_UNSPEC guard

Remaining (non-blocking):

  • ext_flags declared but unused in vxlan_fdb_delete (will cause compiler warning)
  • Minor style: ext_flags/protocol fields in vxlan_private.h lost tab alignment vs surrounding members
  • Patch 0002 reformats alignment in include/net/neighbour.h (removes tabs from NTF_EXT_MASK, NTF_MH_PEER_SYNC, etc.) — cosmetic churn unrelated to protocol field

Verdict: Approve. Core bugs resolved. Only cosmetic nits remain.

goto out;
}
@@ -1301,7 +1316,7 @@ out:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still present in ad1ebce. ext_flags is declared and parsed via vxlan_fdb_parse but never referenced in vxlan_fdb_delete. This will produce a compiler warning (-Wunused-variable). Non-blocking but should be cleaned up.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still present in ad1ebce. ext_flags declared and parsed but unused in vxlan_fdb_delete. Will cause compiler warning. Non-blocking nit.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still present in ad1ebce. ext_flags parsed but unused in vxlan_fdb_delete. Non-blocking nit.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still present — ext_flags parsed but unused in vxlan_fdb_delete. Non-blocking.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c2ee918. ext_flags is no longer declared in vxlan_fdb_delete — passes NULL for ext_flags to vxlan_fdb_parse. ✅

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • All previously reported bugs are fixed: argument order, NDA_PROTOCOL parsing, RTPROT_UNSPEC guard, fdb_add_entry propagation, br_fdb_external_learn_add guard, vxlan_fdb_info emission
  • Protocol-aware FDB delete logic (both bridge and vxlan) is a solid addition — prevents cross-protocol entry clobbering
  • Two minor issues remain open: unused ext_flags variable in vxlan_fdb_delete (compiler warning), and tab+spaces alignment inconsistency for new struct fields in vxlan_private.h
  • Patch 0003 (NTF_EXT_VALIDATED backport) looks clean and well-documented
  • Verdict: Approve — remaining items are non-blocking style/warning nits

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • All previously reported bugs are fixed: argument order in br_switchdev_event, NDA_PROTOCOL parsing in br_fdb_add, RTPROT_UNSPEC guard in vxlan_fdb_update_existing and br_fdb_external_learn_add, fdb_add_entry protocol propagation, vxlan_fdb_info emission
  • Two minor nits remain from prior reviews: unused ext_flags variable in vxlan_fdb_delete (compiler warning), and ext_flags/protocol struct fields in vxlan_fdb drop tab+space alignment used by surrounding members
  • Patch 0002 formatting commit drops tab alignment in include/net/neighbour.h defines (NTF_EXT_MASK, NTF_MH_PEER_SYNC, NEIGH_UPDATE_F_EXT_MH_PEER_SYNC) — cosmetic only
  • Overall: correctness issues are resolved, design is sound, backward compatible. Approving with the two non-blocking nits noted above.

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All reviewers approved. LGTM.

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • All prior bugs fixed: argument order, NDA_PROTOCOL parsing in br_fdb_add, RTPROT_UNSPEC guards, fdb_add_entry propagation
  • Protocol-aware delete logic in both bridge and VXLAN is clean and correct
  • Patch 0003 (NTF_EXT_VALIDATED) is well-structured with proper state validation on create/replace paths
  • Minor nits remain: unused ext_flags in vxlan_fdb_delete, inconsistent indentation in vxlan_private.h — non-blocking
  • LGTM, approve

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • All previously reported bugs (argument order, protocol parsing, RTPROT_UNSPEC guards, vxlan_fdb_info emission, fdb_add_entry propagation) are fixed
  • Protocol-aware FDB deletion logic is a nice addition for control-plane/data-plane coexistence
  • Two non-blocking nits remain: ext_flags unused in vxlan_fdb_delete, and struct field indentation style mismatch
  • Overall design is sound — protocol field correctly propagated through all bridge and VXLAN FDB paths
  • LGTM

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • All previously reported bugs (argument order, missing netlink parse, protocol overwrite, fdb_add_entry propagation, NTF_EXT bit conflict) are fixed in ad1ebce
  • ext_flags still parsed but unused in vxlan_fdb_delete — harmless dead code, non-blocking
  • Indentation style inconsistency (ext_flags/protocol fields use spaces instead of tab+spaces) persists — non-blocking
  • Protocol-guarded delete in both bridge and vxlan layers is a nice addition for control/data plane coexistence
  • Patch 0003 (NTF_EXT_VALIDATED) is clean upstream cherry-pick, no concerns

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All reviewers approved. LGTM.

@mssonicbld
Copy link
Copy Markdown

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

Comment thread patches-sonic/0002-net-bridge-vxlan-Protocol-field-in-bridge-fdb.patch Outdated
Signed-off-by: Tamer Ahmed <tamerahmed@microsoft.com>
@mssonicbld
Copy link
Copy Markdown

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c2ee918. ext_flags is no longer declared in vxlan_fdb_delete — the function now passes NULL for the ext_flags parameter to vxlan_fdb_parse, and the parse function guards with if (ext_flags) before dereferencing. ✅

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in c2ee918. The indentation in patch 0003 now uses tab+spaces alignment matching the surrounding struct members. ✅

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Iteration 9 re-review of c2ee918:

  • ext_flags unused variable in vxlan_fdb_delete — fixed by passing NULL and adding null guards in vxlan_fdb_parse
  • ✅ Indentation in vxlan_private.h and patch 0003 defines — now uses proper tab+spaces alignment
  • include/net/neighbour.h / include/uapi/linux/neighbour.h removed from patch 0002 (moved to patch 0003 where they belong) — addresses saiarcot895's comment about removing non-functional sections
  • ⚠️ New issue: u32 ext_flags declared but unused in vxlan_fdb_external_learn_add (line 4772 in the patched file). The function passes literal 0 to vxlan_fdb_update rather than using this variable. Will produce a compiler warning.
  • Non-blocking style nit: u8 protocol; in br_private.h still lacks tab alignment matching surrounding struct members — cosmetic only.

struct netlink_ext_ack *extack;
u32 hash_index;
+ u32 ext_flags;
int err;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused variable. ext_flags is declared here but never referenced — the call to vxlan_fdb_update below passes literal 0 for the ext_flags argument. Either remove this declaration or use fdb_info->ext_flags (if the struct carries one) to pass through the actual value. Will produce a -Wunused-variable compiler warning.

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-review of c2ee918. All prior bugs are confirmed fixed. Remaining nits:

  • vxlan_fdb_external_learn_add declares u32 ext_flags (patch 0002) that is never used — literal 0 is passed to vxlan_fdb_update instead. Compiler warning risk.
  • br_private.h: u8 protocol; uses space-only alignment while surrounding struct members use tab+spaces. Non-blocking.
  • Overall: clean implementation, protocol-aware deletion logic is a nice addition, RTPROT_UNSPEC guards are consistently applied.

struct netlink_ext_ack *extack;
u32 hash_index;
+ u32 ext_flags;
int err;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused variable. ext_flags is declared here but the call to vxlan_fdb_update below passes literal 0 for the ext_flags argument — this variable is never read. Will produce a -Wunused-variable compiler warning. Either remove it or use it in the vxlan_fdb_update call.

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • All previously reported bugs (argument order, protocol parsing, RTPROT_UNSPEC guard, vxlan_fdb_delete ext_flags, indentation, neighbour.h section removal) are now fixed in c2ee918
  • Unused ext_flags variable in vxlan_fdb_external_learn_add (patch 0002) — declared but literal 0 passed to vxlan_fdb_update. Non-blocking compiler warning
  • Patch 0003 (NTF_EXT_VALIDATED) is well-structured with proper validation of state before setting the flag, correct gc exemption, and fallback to NUD_STALE instead of NUD_FAILED
  • Protocol-aware delete logic in vxlan_fdb_delete is a good addition for control-plane/data-plane coexistence
  • LGTM overall — remaining nits are non-blocking style issues

Signed-off-by: Tamer Ahmed <tamerahmed@microsoft.com>
@mssonicbld
Copy link
Copy Markdown

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

banidoru

This comment was marked as duplicate.

banidoru

This comment was marked as duplicate.

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • All previously identified bugs are fixed in 5e7b511.
  • Protocol-aware FDB management and external validation flag (NTF_EXT_VALIDATED) are well-implemented.
  • Remaining nits are cosmetic and non-blocking.
  • LGTM.

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • All previously raised issues (protocol parsing, RTPROT_UNSPEC guard, unused variables, bit conflict, indentation) are fixed in this revision.
  • Patch 0002 still carries cosmetic reformatting of existing call sites (e.g. vxlan_snoop, __vxlan_dev_create) that add churn but are non-blocking.
  • NTF_EXT_MH_PEER_SYNC uses bit 3, NTF_EXT_EXT_VALIDATED uses bit 2 — no conflict. Series ordering is correct.
  • LGTM. Approve.

Copy link
Copy Markdown

@banidoru banidoru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All reviewers approved. LGTM.

@saiarcot895 saiarcot895 merged commit fd6cc0a into sonic-net:master May 12, 2026
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants