US9497073B2 - Distributed link aggregation group (LAG) for a layer 2 fabric - Google Patents

Distributed link aggregation group (LAG) for a layer 2 fabric

Publication number
US9497073B2
Authority
US
United States
Prior art keywords
bridge
link
lag
switch
data network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active, expires
Application number
US13/314,455
Other versions
US20120320926A1 (en)
Inventor
Dayavanti G. Kamath
Keshav Kamble
Dar-Ren Leu
Nilanjan Mukherjee
Vijoy A. Pandey
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US13/314,455
Publication of US20120320926A1
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. ASSIGNMENT OF ASSIGNORS INTEREST (SEE DOCUMENT FOR DETAILS). Assignors: KAMATH, DAYAVANTI G., KAMBLE, KESHAV, LEU, DAR-REN, MUKHERJEE, NILANJAN, PANDEY, VIJOY A.
Application granted
Publication of US9497073B2
Active
Adjusted expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L41/00 Arrangements for maintenance, administration or management of data switching networks, e.g. of packet switching networks
    • H04L41/06 Management of faults, events, alarms or notifications
    • H04L41/0654 Management of faults, events, alarms or notifications using network fault recovery
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/24 Multipath
    • H04L45/245 Link aggregation, e.g. trunking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/28 Routing or path finding of packets in data switching networks using route fault recovery
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L45/00 Routing or path finding of packets in data switching networks
    • H04L45/66 Layer 2 routing, e.g. in Ethernet based MAN's
    • Y02B60/33
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/50 Reducing energy consumption in communication networks in wire-line communication networks, e.g. low power modes or reduced link rate

Definitions

  • the present invention relates in general to data networks, and in particular, to a link aggregation group (LAG) for a Layer 2 data network, such as a Transparent Interconnection of Lots of Links (TRILL) network.
  • LAG link aggregation group
  • TRILL Transparent Interconnection of Lots of Links
  • the IEEE 802.1D standard defines the Spanning Tree Protocol (STP), which is a conventional data link layer protocol that ensures that a bridged Ethernet network is free of bridge loops and that a single active network path exists between any given pair of network nodes.
  • STP Spanning Tree Protocol
  • LAN local area network
  • SAN storage area network
  • FCoE Fibre Channel over Ethernet
  • iSCSI Internet Small Computer System Interface
  • Because STP permits only a single active network path between any two network nodes and blocks all alternate network paths, aggregate network bandwidth is artificially reduced and inefficiently utilized. STP also reacts to even small topology changes and may force partitioning of virtual LANs due to network connectivity changes.
  • The Ethernet header of STP frames does not include a hop count (or Time to Live (TTL)) field, which limits flexibility.
  • TTL Time to Live
  • TRILL routing is a special routing methodology applied in a TRILL campus, which comprises a network of RBridges and links (and possibly intervening standard L2 bridges) bounded by end stations. Multi-pathing is currently supported for unicast and multidestination traffic within a TRILL campus, but not on its boundary.
  • TRILL permits an external switch or server to have only one active link connected to a TRILL campus for the same Virtual LAN (VLAN).
  • VLAN Virtual LAN
  • the present application recognizes that it is desirable to promote high availability by supporting redundant links between external nodes and multiple RBridges in a TRILL campus.
  • the present application additionally recognizes that it is also desirable to place these redundant links into a Link Aggregation Group (LAG) in order to utilize the bandwidth of all the links effectively.
  • the present application discloses mechanisms and associated methodologies, referred to herein as TRILL LAG or t-LAG, that support connection of external network nodes (e.g., switches and/or servers) to a TRILL campus via a DMLT (Distributed Multi-Link Trunk).
  • TRILL LAG Link Aggregation Group
  • each of first and second bridges of a data network having respective links to an external node implements a network bridge component that forwards traffic inside the data network and a virtual bridge component that forwards traffic outside of the data network.
  • a virtual bridge is formed including the virtual bridge components of the first and second bridges and an interswitch link (ISL) between the virtual bridge components of the first and second bridges.
  • Data frames are communicated with each of multiple external network nodes outside the data network via a respective one of multiple link aggregation groups all commonly supported by the virtual bridge.
  • each of first and second bridges of a data network having respective external links to an external node implements a network bridge component that forwards traffic inside the network and a virtual bridge component that forwards traffic outside of the network.
  • a virtual bridge is formed including the virtual bridge components of the first and second bridges and an interswitch link (ISL) between the virtual bridge components of the first and second bridges. Data frames are redirected via the ISL in response to a link-down condition of one of the external links.
  • ISL interswitch link
  • a switch of a data network implements both a bridge and a virtual bridge.
  • the switch In response to receipt of a data frame by the switch from an external link, the switch performs a lookup in a data structure using a source media access control (SMAC) address specified by the data frame.
  • SMAC source media access control
  • the switch determines if the external link is configured in a link aggregation group (LAG) and if the SMAC address is newly learned.
  • the switch associates the SMAC with the virtual bridge and communicates the association to a plurality of bridges in the data network.
  • FIG. 1 is a high level block diagram of a conventional TRILL campus in accordance with the prior art;
  • FIG. 2 depicts an exemplary network environment in which a network node external to a TRILL campus can be connected to multiple RBridges (RBs) in the TRILL campus via multiple redundant links forming a LAG;
  • RBs RBridges
  • FIG. 3 illustrates an exemplary network environment in which a TRILL RB handles ingress and egress traffic for multiple RBs coupled to a TRILL campus via t-LAGs;
  • FIG. 4 depicts an exemplary network environment in which unicast traffic is autonomously distributed across the links of a t-LAG
  • FIG. 5 illustrates an exemplary network environment in which the use of the ingress virtual-RB as the source RB in TRILL encapsulation of frames may cause problems in distribution of multidestination traffic in the TRILL campus;
  • FIG. 6 depicts an exemplary switch, which can be utilized to implement a TRILL RB (or vRB) in accordance with one or more embodiments;
  • FIGS. 7-8 respectively illustrate more detailed views of the Forwarding Database (FDB) and RB data structures in accordance with one embodiment;
  • FIG. 9 is a high level logical flowchart of an exemplary process by which an edge RB (or vRB) of a TRILL campus implements forwarding for UC traffic ingressing the TRILL campus in accordance with one embodiment
  • FIG. 10 depicts an exemplary embodiment of a TRILL data frame in which a native Ethernet frame is augmented with a TRILL header and an outer Ethernet header;
  • FIG. 11 is a high level logical flowchart of an exemplary process by which an edge RB (or vRB) of a TRILL campus implements forwarding for multidestination (MC/BC/DLF) traffic ingressing the TRILL campus in accordance with one embodiment;
  • FIG. 12 is a high level logical flowchart of an exemplary process by which an RB (or vRB) of a TRILL campus implements forwarding for UC traffic received at a network port coupled to an internal link of the TRILL campus in accordance with one embodiment;
  • FIG. 13 is a high level logical flowchart of an exemplary process by which an RB (or vRB) of a TRILL campus implements forwarding for MC traffic received at a network port coupled to an internal link of the TRILL campus in accordance with one embodiment;
  • FIG. 14 is a high level logical flowchart of an exemplary process by which an ACL installed at an egress t-LAG port of an edge RB of a TRILL campus can be applied to prevent frame looping for multidestination traffic in accordance with one embodiment;
  • FIG. 15 is a high level logical flowchart of an exemplary process by which an ingress RB of a TRILL campus performs MAC learning at a t-LAG port in accordance with one embodiment
  • FIG. 16 is a high level logical flowchart of an exemplary process by which an egress RB of a TRILL campus performs MAC learning in response to receipt of an End Station Address Distribution Instance (ESADI) frame from another RB in accordance with one embodiment;
  • ESADI End Station Address Distribution Instance
  • FIG. 17 is a high level logical flowchart of an exemplary method of configuring a RB of a TRILL campus to support a t-LAG in accordance with one embodiment
  • FIG. 18 is a high level logical flowchart of an exemplary process by which an egress RB of a TRILL campus implements MAC learning in response to a TRILL data frame in accordance with one embodiment
  • FIG. 19 is a high level logical flowchart of an exemplary process by which an RB of a TRILL campus supports fault tolerant communication via a t-LAG in accordance with one embodiment
  • FIGS. 20-21 illustrate an exemplary network environment in which, in the event of a failure of a link of a t-LAG, unicast traffic is redirected via the t-LAG ISL to a peer RB in the same t-LAG cluster for egress through a healthy t-LAG link;
  • FIG. 22 depicts an exemplary network environment in which, if the number of failed t-LAG links exceeds a predetermined threshold, unicast traffic is rerouted to a different egress RB;
  • FIGS. 23-24 illustrate an exemplary network environment in which, in the event of a failure of a t-LAG link, the t-LAG ISL is used to pass multidestination traffic to a peer RB in the same t-LAG cluster, which then sends egress frames out;
  • FIG. 25 is a high level logical flowchart of an exemplary process by which a t-LAG-enabled RB is configured by default at startup in accordance with one embodiment
  • FIG. 26 is a high level logical flowchart of an exemplary configuration process at a t-LAG-enabled RB in response to a local link-up event in accordance with one embodiment
  • FIG. 27 is a high level logical flowchart of an exemplary t-LAG reconfiguration process in accordance with one embodiment
  • FIG. 28 is a high level logical flowchart of an exemplary configuration process at a t-LAG-enabled RB in response to a remote link-up event in accordance with one embodiment
  • FIG. 29 is a high level logical flowchart of an exemplary configuration process at a t-LAG-enabled RB in response to a local link-down event in accordance with one embodiment
  • FIG. 30 is a high level logical flowchart of an exemplary configuration process at a t-LAG-enabled RB in response to a remote link-down event in accordance with one embodiment
  • FIG. 31 is a high level logical flowchart of a prior art process of MAC learning in a conventional TRILL network.
  • TRILL LAG or t-LAG
  • network nodes e.g., servers and/or switches
  • virtual-RB virtual routing bridge
  • Multiple t-LAGs may additionally be hosted by a set of multiple physical switches, herein referred to as a t-LAG cluster, with all t-LAGs in a given t-LAG cluster preferably (but not necessarily) sharing the same virtual-RB.
  • the use of the virtual-RB for the t-LAGs can resolve load distribution for unicast (UC) traffic.
  • UC unicast
  • multidestination e.g., multicast (MC), broadcast (BC), destination lookup fail (DLF)
  • MC multicast
  • BC broadcast
  • DLF destination lookup fail
  • different mechanisms are employed to ensure traffic is properly delivered to a peer RB of a t-LAG cluster; otherwise, either more than one copy of a multidestination frame may be sent to the same destination or a frame may be erroneously returned to an external network node that sourced the frame via the same t-LAG at which the frame ingressed the TRILL campus.
  • Prior art TRILL campus 100 includes a packet-switched data network including a plurality of RBridges (RBs) interconnected by network links. As shown, various of the RBs are coupled to external LANs and/or network nodes, such as switch 102.
  • the present TRILL protocols permit multi-paths within TRILL campus 100 , but not at its boundary. Consequently, if an external network node, such as switch 102 , wants to connect to a TRILL campus by multiple physical links, such as links 104 and 106 , the TRILL protocols will determine an appointed forwarder for each VLAN running on top of the links and, as a result, will utilize only a single link for data forwarding at run time for each VLAN. Accordingly, for a given VLAN, traffic between switch 102 and RB 112 on link 104 is blocked (as shown) if RB 110 is chosen as the appointed forwarder for that VLAN. Consequently, all traffic for that VLAN will be forwarded from TRILL campus 100 to switch 102 via link 106 .
  • the exemplary network environment includes a TRILL campus 200 comprising a packet-switched data network including a plurality of RBs (e.g., RB 1 -RB 6 ) coupled by internal network links 202 a - 202 h .
  • RB 1 through RB 6 are connected by external links to external networks or external nodes.
  • RB 1 and RB 2 connect to an external LAN 210 a supporting end stations 220 a - 220 c by external links 212 a and 212 b , respectively.
  • RB 5 connects to an external LAN 210 b, which supports end stations 220 e - 220 f, by an external link 212 e.
  • RB 4 and RB 6 connect to an external switch 202 , which supports end station 220 g , by external links 212 c and 212 d , respectively, and RB 4 and RB 6 further connect to an end station 220 d by external links 212 f and 212 g .
  • external links 212 c and 212 d form t-LAG 230 a
  • external links 212 f and 212 g form t-LAG 230 b .
  • an additional RBridge referred to as a virtual-RB or vRB herein, is created and deployed for each t-LAG.
  • vRB 7 running on top of RB 4 and RB 6 supports t-LAG 230 a
  • vRB 8 running on top of RB 4 and RB 6 supports t-LAG 230 b .
  • All the virtual-RBs in a TRILL campus created for the same t-LAG preferably employ the same RB nickname, which, as known to those skilled in the art, is utilized to identify an ingress RB in the TRILL tunneling header encapsulating an Ethernet frame. Further details regarding the TRILL header are described below with reference to FIG. 10 .
  • All the virtual-RBs supporting t-LAGs are preferably involved in the TRILL IS-IS communication in active-active mode, as well as End Station Address Distribution Instance (ESADI) communication.
  • each t-LAG-enabled switch preferably handles all the MAC addresses learned at its local t-LAG ports.
  • a t-LAG-enabled RB preferably conducts this communication on behalf of the virtual-RB(s) running on top of it, if any.
  • LSP Link State PDU (Protocol Data Unit)
  • SPF Shortest path first
  • the switch chip(s) providing the switching intelligence of RB 1 through RB 6 in TRILL campus 200 preferably have the capability of contemporaneously handling traffic for more than one RB.
  • RB 4 handles ingress and egress traffic for RB 4 (the switch itself), as well as vRB 7 and vRB 8 ;
  • RB 6 similarly handles ingress and egress traffic for itself (i.e., RB 6 ), as well as vRB 7 and vRB 8 .
  • the edge RBs (i.e., those connected to at least one external link 212 ) within TRILL campus 200 are preferably able to employ the corresponding ingress virtual-RB nickname as the ingress RB for TRILL encapsulation of the frames.
  • the traffic ingressing at RB 4 may use RB 4 , vRB 7 or vRB 8 as the ingress RB in the TRILL header, depending upon which local port the frame is ingressing on.
  • the traffic ingressing at RB 6 may use RB 6 , vRB 7 or vRB 8 as the ingress RB, again depending on the local port the frame is ingressing on.
  • the MAC learning performed at egress RBs will automatically bind the client source Media Access Control (SMAC) address to the ingress virtual-RB. Once this binding is established, UC traffic destined for a t-LAG will be autonomously load balanced across the external links comprising the t-LAG, as shown in FIG. 4 .
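  • The port-dependent choice of ingress RB described above can be illustrated with a minimal sketch. It assumes a simple mapping from local access ports to virtual-RB nicknames; the port names, the nickname strings, and the function name are hypothetical and do not come from the disclosure.

```python
# Hypothetical configuration for the switch hosting RB 4 in FIG. 2.
SWITCH_NICKNAME = "RB4"

# Assumed port names. A port that is a member of a t-LAG maps to the
# virtual-RB created for that t-LAG; all other ports use the switch RB.
TLAG_PORT_TO_VRB = {
    "port-212c": "vRB7",  # member link of t-LAG 230a
    "port-212f": "vRB8",  # member link of t-LAG 230b
}


def ingress_nickname(local_port: str) -> str:
    """Return the nickname to place in the ingress-RB field of the TRILL header."""
    return TLAG_PORT_TO_VRB.get(local_port, SWITCH_NICKNAME)
```

  • With this mapping, a frame arriving on a t-LAG member port is encapsulated with the virtual-RB nickname, so an egress RB that learns the client SMAC binds it to the virtual-RB rather than to one physical switch.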
  • switch chips may not be capable of contemporaneously handling TRILL data frames for more than one RB or may support only a limited number of RBs (i.e., fewer than the number of RBs deployed).
  • the number of distribution trees supported on a switch chip can also be very limited. Due to these factors, some adjustments may be required to adapt to such switching hardware limitations.
  • FIG. 3 there is illustrated a high level view of a network environment in which multiple t-LAGs supported by a TRILL campus form a t-LAG cluster.
  • FIG. 3 depicts a network environment similar to that described above with reference to FIG. 2, with a couple of differences.
  • the network environment of FIG. 3 includes an additional end station 220 h , which is coupled to RB 4 and RB 6 via an additional t-LAG 230 c including external links 212 h and 212 i .
  • t-LAGs 230 a - 230 c which all belong to the same t-LAG cluster, are supported by a single virtual-RB (i.e., vRB 9 ) rather than two virtual RBs (i.e., vRB 7 and vRB 8 ) and thus can share one RB nickname, if desired.
  • vRB 9 virtual-RB
  • vRB 7 and vRB 8 virtual RBs
  • the total number of RBs used in TRILL campus 200 will be reduced as compared to embodiments in which one virtual-RB is implemented per t-LAG.
  • FIG. 3 further depicts that RB 4 and RB 6 are each comprised of two components: an intra-campus RB component (RB 4 ′ and RB 6 ′) designated to handle traffic forwarding inside the TRILL campus 200 and an extra-campus RB component (RB 4 ′′ and RB 6 ′′) designated to handle the traffic forwarding outside of TRILL campus 200 (i.e., in the regular L2 switching domain).
  • the virtual-RB supporting the t-LAG cluster i.e., vRB 9
  • T-LAG ISL 300 is utilized for control communication and for failure handling.
  • t-LAG ISL 300 can be utilized for frame redirection in the event of a link failure on any local t-LAG port, as discussed further herein with reference to FIGS. 19-30 .
  • For frames ingressing into TRILL campus 200, vRB 9 passes the frame either to RB 4′ or to RB 6′ based upon whether the frame was received at RB 4′′ or RB 6′′, respectively. As noted in FIG. 3, for traffic that needs to pass beyond TRILL campus 200, RB 4′′ is only connected to RB 4′, and RB 6′′ is only connected to RB 6′.
  • the virtual links connecting RB 4′ to RB 4′′ and RB 6′ to RB 6′′ are zero cost and should be handled transparently by the switch chips on RB 4 and RB 6, respectively. It is recommended, but not required, that all local L2 switching in a virtual-RB (e.g., vRB 9) be handled locally within the RB itself.
  • a link in a t-LAG may go down at run time. Consequently, it is desirable to handle such link failures in a manner that minimizes or reduces frame loss. At least two techniques of failure handling are possible:
  • the virtual link between the intra-campus RB component (e.g., RB 4 ′) and the virtual-RB (e.g., RB 4 ′′ or, actually, vRB 9 ) will be claimed link-down.
  • the UC traffic previously routed to RB 4 will be routed to RB 6 for egress via a t-LAG link in RB 6 ′′.
  • the local access ports on edge RBs (those like RB 4′′ and RB 6′′ that interface with external links 212 a - 212 i) will need to be adjusted at run time to allow the traffic to be delivered via a healthy link in RB 6′′ for the same t-LAG.
  • the t-LAG ISL e.g., ISL 300
  • ISL 300 is used to redirect UC or multidestination frames to the peer RB in the same t-LAG cluster in case a t-LAG port on the local RB has a link down.
  • the t-LAG ISL (e.g., ISL 300 ) may get over-loaded if too much traffic needs to pass through it. It is therefore presently preferred if both the first and second solutions are implemented in order to better address link failures on t-LAGs.
  • a threshold is preferably implemented and pre-specified so that a t-LAG-enabled switch can stop claiming the connectivity between the switch RB (e.g., RB 4 ) and the virtual-RB (e.g., vRB 9 ) if the number of the local t-LAG ports that are link-down exceeds the threshold.
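  • The combination of the two failure-handling techniques with a link-down threshold can be sketched as follows. This is a minimal illustration under assumed names (the `TLagSwitch` class, its fields, and the action strings are hypothetical); the disclosure does not prescribe a particular data model.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TLagSwitch:
    """Hypothetical model of one t-LAG-enabled switch (e.g., RB 4)."""
    link_down_threshold: int
    # Local t-LAG port name -> True if the link is up.
    tlag_port_up: Dict[str, bool] = field(default_factory=dict)

    def down_count(self) -> int:
        return sum(1 for up in self.tlag_port_up.values() if not up)

    def claims_vrb_connectivity(self) -> bool:
        # Keep claiming the virtual link to the virtual-RB until the number
        # of failed local t-LAG ports exceeds the pre-specified threshold.
        return self.down_count() <= self.link_down_threshold

    def egress_action(self, port: str) -> str:
        # While connectivity is still claimed, frames destined for a failed
        # local t-LAG port are redirected to the peer RB over the t-LAG ISL.
        if self.tlag_port_up.get(port, False):
            return "send-local"
        return "redirect-via-isl"
```

  • Below the threshold, only the ISL redirection is used; once the threshold is exceeded, the switch stops claiming connectivity so that the campus routes the traffic to a different egress RB, which limits the load placed on the ISL.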
  • In TRILL, multidestination traffic (MC/BC/DLF) is handled differently from UC traffic.
  • a distribution tree is predetermined and followed for a specific flow of multidestination traffic ingressing a TRILL campus at an RB.
  • all RBs in a TRILL campus will be visited in all the distribution trees unless a VLAN or pruning has been applied to the distribution tree.
  • more than one copy of a frame will (undesirably) be delivered to external switches or servers via a t-LAG, if the frame is flooded in the TRILL campus following a distribution tree and all RBridges transmit the frame out of their local access ports.
  • a primary link for each t-LAG is preferably predetermined and followed for a specific multidestination (MC/BC/DLF) traffic flow egressing from a TRILL campus.
  • the pre-determined selection of the primary link for a t-LAG may need to be adjusted at run time if a link-down event occurs in a t-LAG.
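  • The disclosure does not state how the primary link of a t-LAG is chosen. The sketch below assumes one possible deterministic rule (the healthy member link with the lowest name), so that every RB in the t-LAG cluster reaches the same answer from the same link state and can recompute it after a link-up or link-down event. The function name and the link naming scheme are hypothetical.

```python
from typing import Dict, Optional


def select_primary_link(link_state: Dict[str, bool]) -> Optional[str]:
    """Pick the primary link for multidestination egress on one t-LAG.

    link_state maps a member link name (e.g., "RB4:212c") to True if the
    link is up. Assumed rule: the lowest-named healthy link wins.
    """
    healthy = sorted(name for name, up in link_state.items() if up)
    return healthy[0] if healthy else None
```

  • Only the RB that owns the selected link transmits a given multidestination frame on the t-LAG, which avoids delivering duplicate copies to the external node.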
  • the RBs in a t-LAG cluster preferably inter-communicate link-up and link-down event notifications.
  • the t-LAG ISL e.g., t-LAG ISL 300
  • the t-LAG ISL 300 can be used for frame redirection to avoid frame drop due to frames being sent to a failed primary t-LAG link.
  • the ingress virtual-RB can be used as the ingress RB in TRILL encapsulation for a frame when it enters at a t-LAG, as the MAC learning performed at egress RBs will do this binding automatically.
  • the use of the ingress virtual-RB as the ingress RB in TRILL encapsulation of frames may cause problems in distribution of multidestination traffic in the TRILL campus for some switch chips, as now described with reference to FIG. 5 .
  • Assuming the illustrated distribution tree rooted at vRB 9 is used for a multidestination flow and the link between RB 4′ and RB 4′′ is chosen as part of the distribution tree, if a data frame ingresses into the TRILL campus via a t-LAG in RB 6′′, the data frame may get dropped as it traverses TRILL campus 200 (e.g., by RB 1 or RB 3) because vRB 9 is used as the ingress RB in the TRILL header of the frame, but is actually on the destination side of the distribution tree.
  • the switch RB e.g., RB 6
  • the switch RB should be used as the source RB in TRILL encapsulation in the above case in order to prevent erroneous frame dropping.
  • This ingress RB designation should be applied to both UC and multidestination traffic to avoid MAC flapping at egress RBs.
  • An important aspect of t-LAG is the binding of the client SMAC learned at a t-LAG to the virtual-RB created for that t-LAG.
  • the virtual-RB e.g., vRB 9
  • the desired binding can be automatically achieved (e.g., by hardware) via the MAC learning performed at egress RBs.
  • the switch RB e.g., RB 6
  • a different technique must be employed to achieve the desired binding of the client SMAC to the virtual-RB.
  • One alternative technique to achieve the desired binding of the client SMAC to the virtual-RB is through software-based MAC learning performed on a t-LAG-enabled switch (as described, for example, with reference to FIG. 15 ).
  • a MAC address learned at a t-LAG port can be specially manipulated in software to bind to ingress virtual-RB; this newly learned MAC entry can then be propagated via ESADI to all other RBs in the TRILL campus for configuration. In this way, the load distribution of UC traffic at any ingress RB can then be achieved automatically.
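  • The software-based learning step can be sketched as below. It assumes the FDB is a dictionary keyed by (MAC, VLAN) and uses a plain list as a stand-in for the ESADI advertisement channel; these structures and all names are hypothetical and are not taken from the disclosure.

```python
def learn_smac(fdb, smac, vlan, port, tlag_port_to_vrb, switch_nickname, esadi_queue):
    """Learn a source MAC at an access port and bind it to the right RB.

    fdb:              dict mapping (mac, vlan) -> {"rb": ..., "port": ...}
    tlag_port_to_vrb: dict mapping local t-LAG port -> virtual-RB nickname
    esadi_queue:      list standing in for ESADI advertisements to other RBs
    """
    key = (smac, vlan)
    newly_learned = key not in fdb
    # A MAC seen on a t-LAG port is bound to the virtual-RB, not the switch RB.
    owner = tlag_port_to_vrb.get(port, switch_nickname)
    fdb[key] = {"rb": owner, "port": port}
    if newly_learned and port in tlag_port_to_vrb:
        # Propagate the binding so that other RBs address the virtual-RB.
        esadi_queue.append({"mac": smac, "vlan": vlan, "rb": owner})
    return owner
```

  • Because remote RBs install the virtual-RB as the owner of the MAC address, UC traffic toward that address can reach the client through any switch hosting the virtual-RB.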
  • Filtering Database for Bridge (FDB) sync for SMACs learned at t-LAG ports is preferably implemented between the peer RBs in a t-LAG cluster, especially if the LAG hashing algorithm performed on external switches or servers is SMAC-based.
  • This FDB synchronization avoids unnecessary flooding or dropping of known UC traffic at egress to a t-LAG if the egress RB has no related MAC information.
  • the MAC information of the peer RB in the same cluster is also needed upon making a decision to redirect traffic to the t-LAG ISL when a local t-LAG link fails.
  • a data frame may attempt to return to the t-LAG at which it ingresses, for example, through a link for the same t-LAG but on a different RB than the ingress RB.
  • Actions such as the enforcement of ACLs, can be applied on all the t-LAG-enabled RBs to ensure that such looping data frames are dropped before egressing from TRILL campus 200 , as described further below with reference to FIG. 14 .
  • switch 600 includes a plurality of physical ports 602 a - 602 m .
  • Each physical port 602 includes a respective one of a plurality of receive (Rx) interfaces 604 a - 604 m and a respective one of a plurality of ingress queues 606 a - 606 m that buffers frames of data traffic received by the associated Rx interface 604 .
  • Rx receive
  • Each of ports 602 a - 602 m further includes a respective one of a plurality of egress queues 614 a - 614 m and a respective one of a plurality of transmit (Tx) interfaces 620 a - 620 m that transmit frames of data traffic from an associated egress queue 614 .
  • Ports 602 connected to external links 212 are referred to herein as “local access ports,” while ports 602 connected to internal links 202 of TRILL campus 200 are referred to herein as “local network ports.”
  • Switch 600 additionally includes a switch fabric 610 , such as a crossbar or shared memory switch fabric, which is operable to intelligently switch data frames from any of ingress queues 606 a - 606 m to any of egress queues 614 a - 614 m under the direction of switch controller 630 .
  • switch controller 630 can be implemented with one or more centralized or distributed, special-purpose or general-purpose processing elements or logic devices (also referred to as “switch chips”), which may implement control entirely in hardware, or more commonly, through the execution of firmware and/or software by a processing element. Switch controller 630 thus provides the switching intelligence that implements the RB (and vRB) behavior herein described.
  • switch controller 630 implements a number of data structures in volatile or non-volatile data storage, such as cache, memory or disk storage.
  • these data structures are commonly referred to as “tables,” those skilled in the art will appreciate that a variety of physical data structures including, without limitation, arrays, lists, trees, or composites thereof, etc. may be utilized to implement various ones of the data structures.
  • the depicted data structures include FDB data structure 640 , which as illustrated in FIG. 7 , includes multiple entries each including fields for specifying an RB (or vRB), a virtual local area network (VLAN) identifier (VID), a destination media access control (DMAC) address, and a destination port (i.e., either a local access port (lport) or virtual port (vport) on a remote RB).
  • For L2 switching, FDB data structure 640 returns, based on a (DMAC, VLAN) tuple, the destination port of the frame, which can be a local access port, a vport for a remote RB (for UC traffic), or a vport for a distribution tree (for multidestination traffic).
  • FDB data structure 640, responsive to an input (RB, VLAN) or (RB, DMAC, VLAN) tuple, returns a vport for a distribution tree for the multidestination traffic.
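  • As the text notes, the "table" can be realized with many physical data structures. The following minimal sketch models the L2 lookup of the FDB with a hash map keyed by (DMAC, VLAN); the class and field names are hypothetical, and only the fields listed for FIG. 7 are represented.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class FdbEntry(NamedTuple):
    rb: str         # RB or vRB associated with the entry
    vid: int        # VLAN identifier
    dmac: str       # destination MAC address
    dest_port: str  # local access port (lport) or virtual port (vport)


class Fdb:
    """Assumed in-memory model of the L2 lookup in FDB data structure 640."""

    def __init__(self) -> None:
        self._by_mac_vlan: Dict[Tuple[str, int], FdbEntry] = {}

    def add(self, entry: FdbEntry) -> None:
        self._by_mac_vlan[(entry.dmac, entry.vid)] = entry

    def lookup(self, dmac: str, vid: int) -> Optional[str]:
        """Return the destination port, or None on a destination lookup fail."""
        entry = self._by_mac_vlan.get((dmac, vid))
        return entry.dest_port if entry else None
```

  • A `None` result corresponds to the DLF case, in which the frame is treated as multidestination traffic.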
  • the data structures of switch controller 630 additionally include RB data structure 642, which, as depicted in FIG. 8, includes multiple entries each including fields for specifying an RB (or vRB) and a destination port (i.e., an lport or a vport).
  • RB data structure 642, responsive to an indication of the egress RB of a data frame, returns a destination port for sending out data traffic, where the destination port can be a local access port or a vport for a remote RB.
  • Based on the specification of an ingress RB, RB data structure 642 additionally provides the vport for MAC learning at an egress RB.
  • RB data structure 642 provides the vport for a distribution tree based on the root RB.
  • the data structures employed by switch controller 630 further include:
  • an edge RB (or vRB) of TRILL campus 200 implements forwarding for UC traffic ingressing TRILL campus 200 in accordance with one embodiment.
  • the process begins at block 900 and then proceeds to block 902 , which depicts an edge RB of TRILL campus 200 receiving a UC data frame at an access port (e.g., a port 602 connected to one of external links 212 a - 212 i ).
  • the edge RB performs a lookup in FDB data structure 640 based on a tuple including the DMAC and VLAN specified in the data frame.
  • If the lookup finds no matching entry in FDB data structure 640 , the edge RB forwards the UC data frame in accordance with the MC forwarding process depicted in FIG. 11 , which is described below. Thereafter, the UC forwarding process depicted in FIG. 9 ends at block 930 .
  • the edge RB determines at block 910 if the destination port indicated by FDB data structure 640 is a vport for a remote RB. If not, the edge RB sends the data frame out of the local access port indicated by FDB data structure 640 (i.e., performs regular L2 forwarding on an external link 212 outside of TRILL campus 200 ) as shown at block 912 , and the UC forwarding process of FIG. 9 ends at block 930 .
  • If the edge RB determines at block 910 that the destination port specified by FDB data structure 640 is a vport for a remote RB, the edge RB, which will serve as the ingress RB, further determines whether ECMP is enabled (block 920 ). If not, the process proceeds to block 924 , described below. If ECMP is enabled, the edge RB accesses ECMP data structure 646 to determine the next hop for the data frame (block 922 ). Following either block 920 (if ECMP is disabled) or block 922 (if ECMP is enabled), the edge RB accesses next hop data structure 648 to retrieve information for the next hop interface (block 924 ).
  • the edge RB adds a TRILL header and an outer encapsulating Ethernet header to the data frame (block 926 ) and sends the encapsulated data frame out of a local network port on an internal link 202 of TRILL campus 200 to the next hop (block 928 ). Thereafter, the UC forwarding process terminates at block 930 .
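  • The three outcomes of the UC ingress process of FIG. 9 (fall back to multidestination forwarding, send on a local access port, or encapsulate toward a remote RB) can be sketched as a single decision function. The function name, the dictionary-based FDB, and the action labels are hypothetical and chosen only for illustration; ECMP and next hop resolution are omitted.

```python
from typing import Dict, Optional, Set, Tuple


def classify_uc_ingress(dmac: str, vid: int,
                        fdb: Dict[Tuple[str, int], str],
                        remote_vports: Set[str]) -> Tuple[str, Optional[str]]:
    """Return (action, port) for a UC frame received at an access port."""
    port = fdb.get((dmac, vid))
    if port is None:
        # No FDB match: handle the frame with the multidestination process.
        return ("multidestination", None)
    if port in remote_vports:
        # Destination is behind a remote RB: add TRILL and outer Ethernet headers.
        return ("encapsulate", port)
    # Destination is a local access port: regular L2 forwarding.
    return ("local", port)
```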
  • a TRILL data frame 1000 in accordance with one embodiment.
  • a conventional (native) Ethernet data frame 1010 includes an Ethernet header 1012 and an Ethernet payload 1014 .
  • the edge RB prepends a TRILL header to Ethernet frame 1010 and then encapsulates the whole with an outer Ethernet header 1020 (which specifies a TRILL Ethertype) and an Ethernet FCS 1022 .
  • the TRILL header begins with a collection of fields 1030 including a TRILL version field (V), a reserved field (R), a multi-destination bit (M) indicating whether the TRILL data frame is a multidestination frame, an op-length field (OpLen) that gives the length of the TRILL header optional fields, if any, terminating the TRILL header, and a hop count field (HC) decremented by each RB “hop” as TRILL data frame 1000 is forwarded in TRILL campus 200 .
  • the TRILL header additionally includes an egress RB nickname field 1032 that, for UC data frames, identifies by RB nickname the last RB (i.e., egress RB) in TRILL campus 200 that will handle the data frame and is therefore responsible for decapsulating native Ethernet data frame 1010 and forwarding it to an external node.
  • the TRILL header further includes an ingress RB nickname field 1034 that indicates the RB nickname of the edge RB. As indicated above, it is preferable if the specified RB nickname is the RB nickname of the edge switch RB (e.g., RB 4 ) rather than the RB nickname of the edge vRB (e.g., vRB 9 ).
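  • As a rough illustration of the header layout just described, the fixed part of a TRILL header can be packed and parsed as three 16-bit words. The field widths used here (V: 2 bits, R: 2 bits, M: 1 bit, OpLen: 5 bits, HC: 6 bits, followed by 16-bit egress and ingress nicknames) follow the TRILL base specification (RFC 6325) and are not stated in this description; the function names are hypothetical, and optional header fields are ignored.

```python
import struct
from typing import Dict


def pack_trill_header(version: int, multi_dest: bool, op_len: int,
                      hop_count: int, egress: int, ingress: int,
                      reserved: int = 0) -> bytes:
    """Pack V, R, M, OpLen, HC and the two nicknames into 6 bytes."""
    first16 = ((version & 0x3) << 14
               | (reserved & 0x3) << 12
               | (1 if multi_dest else 0) << 11
               | (op_len & 0x1F) << 6
               | (hop_count & 0x3F))
    # Network byte order: flags word, egress nickname, ingress nickname.
    return struct.pack("!HHH", first16, egress & 0xFFFF, ingress & 0xFFFF)


def unpack_trill_header(data: bytes) -> Dict[str, object]:
    """Parse the first 6 bytes of a TRILL header into named fields."""
    first16, egress, ingress = struct.unpack("!HHH", data[:6])
    return {
        "version": (first16 >> 14) & 0x3,
        "reserved": (first16 >> 12) & 0x3,
        "multi_dest": bool((first16 >> 11) & 0x1),
        "op_len": (first16 >> 6) & 0x1F,
        "hop_count": first16 & 0x3F,
        "egress": egress,
        "ingress": ingress,
    }
```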
  • an edge RB (or vRB) of TRILL campus 200 implements forwarding for multidestination (MC/BC/DLF) traffic ingressing TRILL campus 200 in accordance with one embodiment.
  • the process begins at block 1100 and then proceeds to block 1102 , which depicts an edge RB of TRILL campus 200 receiving a multidestination data frame at an access port (e.g., a port 602 coupled to one of external links 212 a - 212 i ).
  • the edge RB determines at block 1104 if the multidestination data frame is a MC data frame.
  • an MC data frame can be detected by determining whether the least significant bit of the first octet of the DMAC specified by the data frame is set. In response to a determination at block 1104 that the data frame is not a MC data frame, the process proceeds to block 1112 , which is described below. If, however, the edge RB determines at block 1104 that the multidestination frame is a MC data frame, the edge RB performs a lookup in FDB data structure 640 based on a tuple including the DMAC and VLAN specified in the data frame (block 1106 ).
  • the edge RB accesses VLAN data structure 652 to obtain the vport for the distribution tree (block 1112 ).
  • the edge RB accesses vport data structure 644 and MC bitmap data structure 650 to obtain L2 and L3 bitmaps for the data frame (block 1114 ).
  • the edge RB then sends a copy of the native data frame out of each local access port, if any, indicated by the L2 bitmap (block 1116 ), which are the local access port(s) of the edge RB connected to external links 212 outside of TRILL campus 200 .
  • the edge RB adds a TRILL header and an outer encapsulating Ethernet header to the data frame and sends the encapsulated data frame out of each local network port, if any, of TRILL campus 200 indicated by the L3 bitmap (block 1118 ). Thereafter, the multidestination forwarding process of FIG. 11 terminates at block 1120 .
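  • Two small steps of the multidestination process of FIG. 11 lend themselves to a sketch: testing the group bit of the DMAC and expanding a port bitmap into the list of ports that receive a copy. Representing the L2 and L3 bitmaps as Python integers with bit i standing for port i is an assumption made for illustration, as are the function names. Note that the broadcast address also has the group bit set.

```python
from typing import List


def is_group_address(dmac: str) -> bool:
    """True if the group (multicast) bit of a colon-separated MAC is set."""
    first_octet = int(dmac.split(":")[0], 16)
    return bool(first_octet & 0x01)


def ports_in_bitmap(bitmap: int) -> List[int]:
    """Return the indices of the bits set in a port bitmap, lowest first."""
    ports = []
    index = 0
    while bitmap:
        if bitmap & 1:
            ports.append(index)
        bitmap >>= 1
        index += 1
    return ports
```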
  • FIG. 12 there is illustrated a high level logical flowchart of an exemplary process by which an RB (or vRB) of TRILL campus 200 implements forwarding for UC traffic received at a network port coupled to an internal link 202 of TRILL campus 200 in accordance with one embodiment.
  • the process begins at block 1200 and then proceeds to block 1202 , which depicts an RB of TRILL campus 200 receiving a UC data frame at a network port coupled to an internal link 202 of TRILL campus 200 .
  • the RB performs a lookup in RB data structure 642 based on the egress RB specified in egress RB nickname field 1032 of the TRILL header of the data frame.
  • If no matching entry is found in RB data structure 642 , the RB discards the UC data frame.
  • the UC forwarding process depicted in FIG. 12 ends at block 1230 .
  • the RB determines whether or not the egress port indicated by RB data structure 642 is a local access port, that is, a port connected to an external link 212 . If not (i.e., the egress port is a network port), the process proceeds to block 1220 , which is described below. If, however, the RB determines at block 1210 that the egress port is a local access port, the RB performs MAC learning for the data frame, if enabled (block 1212 ). An exemplary process for MAC learning is described below with reference to FIG. 18 . The RB then decapsulates the native L2 data frame by removing outer Ethernet header 1020 and the TRILL header (block 1214 ) and sends the native L2 data frame out of the local access port indicated by RB data structure 642 .
  • the RB determines whether ECMP is enabled. If not, the process proceeds to block 1224 , described below. If, however, ECMP is enabled, the RB accesses ECMP data structure 646 to determine the next hop for the data frame (block 1222 ). Following either block 1220 (if ECMP is disabled) or block 1222 (if ECMP is enabled), the RB accesses next hop data structure 648 to retrieve information for the next hop interface (block 1224 ). Thereafter, the RB modifies the outer encapsulating Ethernet header of the UC data frame to specify the appropriate source and destination MAC addresses (block 1226 ) and sends the data frame out of a local network port to the next hop in TRILL campus 200 (block 1228 ). Thereafter, the UC forwarding process depicted in FIG. 12 terminates at block 1230 .
  • FIG. 13 there is illustrated a high level logical flowchart of an exemplary process by which an RB (or vRB) of TRILL campus 200 implements forwarding for MC data frames received at a network port connected to an internal link 202 of TRILL campus 200 in accordance with one embodiment.
  • the process begins at block 1300 and then proceeds to block 1302 , which depicts a RB of TRILL campus 200 receiving a MC data frame at a network port (e.g., a port 602 coupled to one of internal links 202 of TRILL campus 200 ).
  • the RB performs a lookup in FDB data structure 640 based on a tuple including the RB and the DMAC and VLAN specified in the data frame (block 1304 ).
  • the process proceeds to block 1320 , which is described below.
  • the vport for the distribution tree for the MC data frame is returned, and the process proceeds to block 1310 .
  • the RB accesses vport data structure 644 and MC bitmap data structure 650 to obtain L2 and L3 bitmaps for the data frame.
  • the RB then sends a copy of the data frame out of each local access port, if any, indicated by the L2 bitmap (block 1312 ), which are the local access port(s) of the RB connected to external links 212 outside of TRILL campus 200 .
  • the RB sends a copy of the data frame out of each local network port, if any, of TRILL campus 200 indicated by the L3 bitmap after updating the outer encapsulating Ethernet header of the MC data frame to specify the appropriate source MAC addresses (block 1314 ).
  • the MC forwarding process of FIG. 13 terminates at block 1330 .
  • the RB performs a lookup for the MC data frame in FDB data structure 640 based on a tuple including the identifier of the RB and the VLAN specified by the MC data frame. If the RB determines at block 1322 that a matching entry for the MC data frame is found in FDB data structure 640 , RB forwards the MC data frame as has been described with respect to blocks 1310 - 1314 .
  • If the RB determines at block 1322 that no matching entry for the data frame is present in FDB data structure 640 , the RB performs a lookup in RB data structure 642 utilizing the egress RB specified in egress RB nickname field 1032 of the TRILL header of the data frame (block 1324 ). If the RB determines at block 1326 that a matching entry for the data frame is present in RB data structure 642 , the RB forwards the MC data frame as has been described with respect to blocks 1310 - 1314 . If, however, the RB determines at block 1326 that no matching entry for the data frame is found in RB data structure 642 , the RB discards the data frame at block 1328 . Thereafter, the MC data frame forwarding process depicted in FIG. 13 terminates at block 1330 .
  • FIG. 14 there is depicted a high level logical flowchart of an exemplary process by which an ACL installed at an egress t-LAG port of an edge RB of a TRILL campus 200 can be applied to prevent frame looping for multidestination traffic in accordance with one embodiment.
  • the process begins at block 1400 in response to an edge RB of a TRILL campus receiving a data frame at an egress local access port configured as part of a t-LAG.
  • the RB determines whether the data frame is a TRILL MC data frame, for example, by examining the multicast bit in TRILL header fields 1030 .
  • If the data frame is not a TRILL MC data frame, the RB allows the data frame to egress through the local access port (block 1406 ).
  • the RB applies an ACL at block 1404 by determining whether or not the RB identified in ingress RB nickname field 1034 of the TRILL header is a peer RB belonging to the same t-LAG cluster as the current RB. If not, the RB allows the data frame to egress through the local access port (block 1406 ). If, however, the RB determines that the RB identified in the ingress RB nickname field 1034 of the TRILL header is a peer RB belonging to the same t-LAG cluster as the current RB, the RB enforces the ACL by discarding the data frame (block 1408 ), thus preventing frame looping. Following block 1406 or block 1408 , the process depicted in FIG. 14 terminates at block 1410 .
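  • The egress ACL of FIG. 14 reduces to one predicate: a multidestination TRILL frame is dropped at a t-LAG port when its ingress RB nickname belongs to a peer in the same t-LAG cluster. The sketch below is a minimal model under that reading; the function name and the representation of nicknames as integers are assumptions.

```python
from typing import Set


def tlag_egress_acl_permits(is_multidest: bool, ingress_nickname: int,
                            cluster_peer_nicknames: Set[int]) -> bool:
    """Return True if the frame may egress through the t-LAG access port."""
    if not is_multidest:
        # The ACL only applies to TRILL multidestination frames.
        return True
    # Discard frames that entered the campus at a peer RB of the same
    # t-LAG cluster, which prevents the frame from looping back.
    return ingress_nickname not in cluster_peer_nicknames
```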
  • source pruning for TRILL multidestination frames can be performed by employing a different distribution tree for frames entering at different switch RBs supporting a t-LAG cluster.
  • vRB 9 can implement source pruning for multidestination traffic by employing differing distribution trees for each combination of switch RB (i.e., RB 4 or RB 6 ) and t-LAG.
  • A prior art MAC learning process in a conventional TRILL campus is shown in FIG. 31 .
  • the depicted process begins at block 3100 and then proceeds to block 3102 , which illustrates an egress RB of a conventional TRILL campus receiving a TRILL data frame at one of its local network ports.
  • the egress RB performs an RB lookup in its RB data structure based on the egress RB nickname specified in the TRILL header of the TRILL data frame (block 3104 ).
  • the conventional MAC learning process depicted in FIG. 31 terminates at block 3120 . If, on the other hand, the egress RB determines at block 3106 that the destination port returned by the RB lookup is a local access port (i.e., the local RB is the egress RB for the TRILL data frame), then the egress RB performs hardware SMAC learning and binds the SMAC to the ingress RB indicated by the TRILL header of the TRILL data frame (block 3110 ). Thereafter, the process depicted in FIG. 31 ends at block 3120 .
  • the conventional MAC learning process depicted in FIG. 31 is replaced in TRILL campus 200 with a more comprehensive MAC learning methodology supporting the use of t-LAGs and t-LAG clusters as described herein.
  • This comprehensive MAC learning methodology includes MAC learning at t-LAG ports of ingress RBs (e.g., as depicted in FIG. 15 ), MAC learning at RBs in the same t-LAG cluster as an edge RB (e.g., as illustrated in FIG. 16 ), and MAC learning at egress RBs that binds SMACs to ingress vports (e.g., as depicted in FIG. 18 ).
  • FIG. 15 there is illustrated a high level logical flowchart of an exemplary process by which an ingress RB of a TRILL campus 200 performs MAC learning at a t-LAG port in accordance with one embodiment.
  • the process begins at 1500 and then proceeds to block 1502 , which depicts an ingress RB of a TRILL campus 200 receiving a native L2 data frame at a local access port connected to an external link 212 .
  • the ingress RB performs a lookup of the data frame in FDB data structure 640 using the SMAC address specified by the data frame (block 1504 ).
  • the ingress RB determines at block 1506 whether or not the FDB entry obtained by the lookup performed at block 1504 is newly learned at a local access port that is configured in a t-LAG. If not, the process depicted in FIG. 15 terminates at block 1520 . If, however, a determination is made at block 1506 that the entry obtained by the FDB lookup is newly learned at a local access port configured in a t-LAG, the contents of the FDB entry are passed to software for MAC learning (block 1510 ). Software accordingly binds the SMAC of the data frame to the ingress vRB if the ingress local access port is a t-LAG port (block 1512 ).
  • Binding the SMAC of the data frame to the ingress vRB (rather than ingress RB) in this manner supports the automatic load balancing and fault tolerant communication described herein.
  • the ingress RB then passes the contents of the FDB entry to all other RBs of TRILL campus 200 via ESADI (block 1514 ). Thereafter, the process illustrated in FIG. 15 ends at block 1520 .
  • FIG. 16 there is depicted a high level logical flowchart of an exemplary process by which an egress RB of a TRILL campus 200 performs MAC learning in accordance with one embodiment.
  • the process begins at block 1600 and then proceeds to block 1602 , which illustrates an egress RB of TRILL campus 200 receiving an ESADI frame from another RB in TRILL campus 200 .
  • the ESADI frame can be originated, for example, at block 1514 of the ingress RB MAC learning process depicted in FIG. 15 .
  • the egress RB determines at block 1604 whether or not it is configured within a common t-LAG cluster with the remote RB from which the ESADI frame originated.
  • the egress RB configures its switch controller 630 to bind the SMAC to a vport for the ingress vRB of the traffic flow (block 1610 ). If, however, the egress RB determines at block 1604 that it is configured in the same t-LAG cluster as the remote RB, the egress RB configures its switch controller 630 to bind the SMAC to a local t-LAG port of the t-LAG cluster (block 1606 ). Following either block 1606 or block 1610 , the egress RB MAC learning process depicted in FIG. 16 terminates at block 1612 .
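  • The binding choice made in FIG. 16 when an ESADI frame arrives can be sketched as follows. The MAC table keyed by (SMAC, VLAN), the parameter names, and the port labels are hypothetical; the sketch only captures which port the SMAC is bound to.

```python
from typing import Dict, Set, Tuple


def bind_smac_from_esadi(mac_table: Dict[Tuple[str, int], str],
                         smac: str, vid: int, origin_rb: str,
                         cluster_peers: Set[str],
                         local_tlag_port: str,
                         ingress_vrb_vport: str) -> str:
    """Bind an SMAC learned via ESADI and return the port it was bound to."""
    if origin_rb in cluster_peers:
        # Originating RB shares a t-LAG cluster with this RB:
        # bind to the local t-LAG port.
        port = local_tlag_port
    else:
        # Otherwise bind to the vport for the ingress vRB of the flow.
        port = ingress_vrb_vport
    mac_table[(smac, vid)] = port
    return port
```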
  • FIG. 17 there is illustrated a high level logical flowchart of an exemplary method of configuring a RB of a TRILL campus to support a t-LAG in accordance with one embodiment.
  • the process begins at block 1700 and then proceeds to block 1702 , which illustrates a RB of TRILL campus 200 receiving a t-LAG configuration specifying which ports 602 of the RB belong to a link aggregation group (LAG).
  • the RB configures switch controller 630 to map both vRB(s) and switch-based (i.e., physical) RBs in the same t-LAG to the same vport (block 1704 ).
  • FIG. 18 there is depicted a high level logical flowchart of an exemplary process by which an egress RB of a TRILL campus implements MAC learning in response to a TRILL data frame in accordance with one embodiment.
  • the process begins at block 1800 and then proceeds to block 1802 , which illustrates an egress RB of TRILL campus 200 receiving a TRILL data frame 1000 as illustrated in FIG. 10 via an internal link 202 of TRILL campus 200 at one of its local network ports.
  • the egress RB performs a lookup in RB data structure 642 based on the egress RB nickname specified in the egress RB nickname field 1032 of the TRILL header (block 1804 ).
  • the process depicted in FIG. 18 terminates at block 1820 . If, on the other hand, the egress RB determines at block 1806 that the destination port returned by the RB lookup is a local access port, then the egress RB again performs a lookup in RB data structure 642 based on the ingress RB nickname specified in the ingress RB nickname field 1034 of the TRILL header in order to determine the ingress vport (block 1810 ). As noted above with reference to FIG.
  • both the ingress RB and any related vRB are preferably configured to map to the same vport.
  • the egress RB then performs hardware SMAC learning and binds the SMAC to the ingress vport returned by the second RB lookup (block 1812 ). Thereafter, the process depicted in FIG. 18 ends at block 1820 .
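  • The two RB lookups of FIG. 18 can be sketched as below. Because a switch RB and its related vRB are mapped to the same vport, either ingress nickname yields the same binding. The table shapes, the function name, and the choice to skip learning when the ingress nickname is unknown are assumptions for illustration.

```python
from typing import Dict, Optional, Set, Tuple


def learn_on_trill_frame(rb_table: Dict[str, str], access_ports: Set[str],
                         mac_table: Dict[Tuple[str, int], str],
                         egress_nick: str, ingress_nick: str,
                         smac: str, vid: int) -> Optional[str]:
    """Bind the SMAC to the ingress vport if this RB is the egress RB."""
    dest_port = rb_table.get(egress_nick)
    if dest_port not in access_ports:
        # Not the egress RB for this frame (or unknown egress RB): no learning.
        return None
    # Second lookup, keyed by the ingress RB nickname, yields the ingress vport.
    ingress_vport = rb_table.get(ingress_nick)
    if ingress_vport is None:
        return None  # Assumption: skip learning for an unknown ingress RB.
    mac_table[(smac, vid)] = ingress_vport
    return ingress_vport
```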
  • FIG. 19 there is illustrated a high level logical flowchart of an exemplary process by which a RB of TRILL campus 200 provides fault-tolerant communication via a t-LAG cluster in accordance with one embodiment.
  • the process begins at block 1900 and then proceeds to block 1902 , which depicts an RB supporting a t-LAG (hereafter assumed for the sake of example to be RB 4 ) determining whether or not a t-LAG link-down event has been detected for one of its external links 212 . If not, the process iterates at block 1902 until a t-LAG link-down event is detected for one of its external links 212 .
  • RB 4 determines at block 1904 whether the number of its currently downed links exceeds a predetermined threshold (in at least some embodiments, RBs (or vRBs at different RBs) can have different numbers of external links and different thresholds). If so, the process proceeds to block 1920 , which is described below. If, however, RB 4 determines at block 1904 that the number of its current downed links does not exceed the predetermined threshold, the process proceeds to block 1910 .
  • RB 4 utilizes an ISL of the t-LAG cluster to redirect egress traffic of TRILL campus 200 that was directed to the downed link.
  • FIG. 20 depicts an exemplary flow of UC traffic via TRILL campus 200 to an external node (i.e., switch 202 ) via a t-LAG 230 a prior to a link down event. If an external link, such as external link 212 c , fails as shown in FIG.
  • RB 4 ′′ redirects the UC traffic via t-LAG ISL 300 to the peer RB (i.e., RB 6 ′′) in the same t-LAG cluster for egress through a healthy t-LAG link, such as link 212 d.
  • RB 4 continues to monitor to determine if its downed external link 212 has been restored.
  • RB 4 reverts communication of egress traffic from t-LAG ISL 300 to the restored external link (block 1914 ). Thereafter, the process returns to block 1902 , which has been described.
  • RB 4 reports a link-down condition (e.g., via TRILL IS-IS) to TRILL campus 200 indicating that the connectivity between its intra-campus RB (i.e., RB 4 ′) and the vRB including its extra-campus RB (i.e., RB 4 ′′) is down (even though the actual link failure events impact external links 212 ).
  • TRILL campus 200 automatically reroutes traffic that was previously routed to RB 4 to a peer RB of the t-LAG cluster for egress.
  • RB 4 further determines at block 1922 whether or not one or more of its downed external links 212 have been restored. If so, RB 4 additionally determines at block 1924 whether or not the number of its external links 212 that are down still exceeds the threshold. If so the process returns to block 1922 .
  • If RB 4 determines at block 1924 that the restoration of one or more external links 212 has caused the number of its external links 212 that are down to not exceed the threshold, RB 4 communicates to TRILL campus 200 a link-up event for the link between its intra-campus RB (i.e., RB 4 ′) and the vRB (i.e., vRB 9 ) including its extra-campus RB (i.e., RB 4 ′′).
  • TRILL campus 200 re-establishes routing for the egress traffic through RB 4 , as shown in FIG. 20 .
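  • The threshold logic of FIG. 19 can be modelled as a small state machine: while the number of downed external links stays at or below the threshold, traffic is redirected over the t-LAG ISL; once it exceeds the threshold, the RB reports the RB-to-vRB link as down and later reports it up again when enough links recover. The class name, the action strings, and the link labels are hypothetical.

```python
from typing import Set


class TlagFaultMonitor:
    """Toy model of the link-down and link-restore handling of FIG. 19."""

    def __init__(self, threshold: int) -> None:
        self.threshold = threshold
        self.down: Set[str] = set()
        self.reported_down = False

    def link_down(self, link: str) -> str:
        self.down.add(link)
        if len(self.down) > self.threshold:
            # Too many failures: report the RB-to-vRB link as down so the
            # campus reroutes egress traffic to a peer RB.
            self.reported_down = True
            return "report_link_down"
        # Otherwise redirect affected egress traffic over the t-LAG ISL.
        return "redirect_via_isl"

    def link_up(self, link: str) -> str:
        self.down.discard(link)
        if self.reported_down:
            if len(self.down) > self.threshold:
                return "keep_waiting"
            self.reported_down = False
            return "report_link_up"
        # Revert egress traffic from the ISL to the restored link.
        return "revert_to_link"
```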
  • FIGS. 23-24 illustrate that the same technique depicted in FIG. 19 can be utilized to provide fault-tolerant communication for multidestination traffic ingressing at a t-LAG cluster.
  • vRB 9 can distribute the multidestination flow to TRILL campus 200 and its external nodes utilizing a distribution tree rooted at vRB 9 , as shown in FIG. 23 .
  • RB 4 can utilize t-LAG ISL 300 to redirect the multidestination traffic to the peer RB (RB 6 ′′) in the same t-LAG cluster in order to send out the egress frames to external switch 202 , as shown in FIG. 24 .
  • RB 4 can report a link down between its intra-campus RB 4 ′ and vRB 9 to TRILL campus 200 in order to enforce use of a different primary link for the egress multidestination traffic directed to the external node coupled to TRILL campus 200 by the downed link until the number of its downed external links is less than or equal to the threshold.
  • dynamic reconfiguration of RBs is preferably implemented as now described with reference to FIGS. 25-30 .
  • FIG. 25 there is illustrated a high level logical flowchart of an exemplary process by which a t-LAG-enabled RB of a TRILL campus is configured by default at startup in accordance with one embodiment.
  • the process begins at block 2500 following startup of a t-LAG-enabled RB of TRILL campus 200 (e.g., RB 4 or RB 6 ).
  • the t-LAG-enabled RB then applies a default configuration for traffic flow in the t-LAG, as depicted at blocks 2502 - 2508 .
  • the t-LAG-enabled RB configures its switch controller 630 to not allow traffic to flow from any local access port or local network port to the port for t-LAG ISL 300 (blocks 2502 and 2504 ).
  • the t-LAG-enabled RB configures its switch controller 630 to not allow traffic to flow from the port for t-LAG ISL 300 to any local access port or local network port (blocks 2506 and 2508 ). Thereafter, the default t-LAG configuration process illustrated in FIG. 25 ends at block 2510 .
  • FIG. 26 there is depicted a high level logical flowchart of an exemplary configuration process at a t-LAG-enabled RB of a TRILL campus in response to a local link-up event in accordance with one embodiment.
  • the process begins at block 2600 and then proceeds to block 2602 , which illustrates a t-LAG-enabled RB of TRILL campus 200 (e.g., RB 4 or RB 6 ) detecting a link-up event on a local t-LAG of the RB (block 2602 ).
  • the t-LAG-enabled RB notifies its peer RB in the t-LAG cluster of the link-up event (block 2604 ).
  • the t-LAG-enabled RB determines at block 2606 whether or not the t-LAG supported by the peer RB of the t-LAG cluster is currently up. If not, the t-LAG-enabled RB configures its switch controller 630 to allow traffic to flow from the port connected to the t-LAG ISL 300 to the local port having the link-up event detected at block 2602 (block 2620 ). The process then proceeds to block 2622 , which depicts the t-LAG-enabled RB initiating a t-LAG reconfiguration, as described in detail below with reference to FIG. 27 . Thereafter, the process depicted in FIG. 26 ends at block 2630 .
  • If, however, the t-LAG supported by the peer RB is currently up, the t-LAG-enabled RB configures its switch controller 630 to not allow traffic to flow from the port connected to the t-LAG ISL 300 to the local port having the link-up event detected at block 2602 (block 2610 ).
  • the t-LAG-enabled RB updates the MAC entries to bind to the local port that just experienced the link-up event (block 2612 ). From block 2612 , the process proceeds to blocks 2622 and 2630 , which have been described.
  • FIG. 27 there is illustrated a high level logical flowchart of an exemplary t-LAG reconfiguration process in accordance with one embodiment. The process is performed, for example, at block 2622 of FIG. 26 , as well as block 2810 of FIG. 28 , block 2920 of FIG. 29 and block 3010 of FIG. 30 , as described further below.
  • FIG. 27 begins at block 2700 and thereafter proceeds to block 2702 , which depicts a t-LAG-enabled RB of TRILL campus 200 determining whether or not any local t-LAG link of the RB is down while the t-LAG of a remote RB in the same t-LAG cluster is up, for example, as shown in FIG. 21 . If not, the t-LAG-enabled RB configures its switch controller 630 to not allow traffic to flow from any local network port or from any local t-LAG port to the port connected to the t-LAG ISL 300 (blocks 2704 and 2706 ).
  • If the t-LAG-enabled RB makes an affirmative determination at block 2702 , the t-LAG-enabled RB configures its switch controller 630 to allow traffic to flow from any local network port or from any local t-LAG port to the port connected to the t-LAG ISL 300 (blocks 2710 and 2712 ). Following either of blocks 2706 or 2712 , the t-LAG reconfiguration process illustrated in FIG. 27 ends at block 2714 .
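  • The t-LAG reconfiguration of FIG. 27 amounts to a single condition on two link states. The sketch below treats each t-LAG as simply up or down, which is a simplification; the function name is hypothetical.

```python
def allow_local_to_isl(local_tlag_up: bool, remote_tlag_up: bool) -> bool:
    """Decide whether local network and t-LAG ports may send to the ISL port.

    Traffic is allowed toward the t-LAG ISL only when the local t-LAG link
    is down while the peer's t-LAG in the same cluster is still up.
    """
    return (not local_tlag_up) and remote_tlag_up
```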
  • FIG. 28 there is depicted a high level logical flowchart of an exemplary configuration process at a t-LAG-enabled RB in response to a remote link-up event in accordance with one embodiment.
  • the process begins at block 2800 and then proceeds to block 2802 , which illustrates a t-LAG-enabled RB of TRILL campus 200 (e.g., RB 4 or RB 6 ) detecting a link-up event for a remote t-LAG in the same t-LAG cluster.
  • the t-LAG-enabled RB may detect the event based on a notification communicated by a peer RB in the t-LAG cluster as described at block 2604 of FIG. 26 .
  • the t-LAG-enabled RB determines at block 2804 whether or not the local t-LAG it supports is currently up. If not, the t-LAG-enabled RB initiates a t-LAG reconfiguration, as described with reference to FIG. 27 (block 2810 ). Thereafter, the process depicted in FIG. 28 ends at block 2812 .
  • If, however, the local t-LAG is currently up, the t-LAG-enabled RB configures its switch controller 630 to not allow traffic to flow from the port connected to the t-LAG ISL 300 to its local t-LAG (block 2806 ). Thereafter, the process proceeds to blocks 2810 and 2812 , which have been described.
  • FIG. 29 there is illustrated a high level logical flowchart of an exemplary configuration process at a t-LAG-enabled RB of a TRILL campus in response to a local t-LAG link-down event in accordance with one embodiment.
  • the depicted process begins at block 2900 and then proceeds to block 2902 , which illustrates a t-LAG-enabled RB of TRILL campus 200 (e.g., RB 4 or RB 6 ) detecting a link-down event on a local t-LAG of the RB (block 2902 ).
  • In response to detecting the link-down event, the t-LAG-enabled RB notifies its peer RB in the t-LAG cluster of the link-down event (block 2904 ).
  • the t-LAG-enabled RB also configures its switch controller 630 to not allow traffic to flow from the port connected to t-LAG ISL 300 to the port connected to the downed t-LAG link (block 2906 ).
  • the t-LAG-enabled RB additionally determines at block 2910 whether or not the t-LAG supported by the peer RB of the t-LAG cluster is currently up. If not, the t-LAG-enabled RB clears all the MAC entries learned for the t-LAG cluster. The process then proceeds to block 2920 , which depicts the t-LAG-enabled RB initiating a t-LAG reconfiguration, as described with reference to FIG. 27 . Thereafter, the process depicted in FIG. 29 ends at block 2922 .
  • If, however, the t-LAG supported by the peer RB is currently up, the t-LAG-enabled RB updates the MAC entries to bind entries for the local t-LAG to the port connected to t-LAG ISL 300 (block 2914 ). From block 2914 , the process proceeds to blocks 2920 and 2922 , which have been described.
  • FIG. 30 there is depicted a high level logical flowchart of an exemplary configuration process at a t-LAG-enabled RB in response to a remote link-down event in accordance with one embodiment.
  • the depicted process begins at block 3000 and then proceeds to block 3002 , which illustrates a t-LAG-enabled RB of TRILL campus 200 (e.g., RB 4 or RB 6 ) detecting a link-down event for a remote t-LAG in the same t-LAG cluster.
  • the t-LAG-enabled RB may detect the event based on a notification communicated by a peer RB in the t-LAG cluster as described at block 2904 of FIG. 29 .
  • the t-LAG-enabled RB determines at block 3004 whether or not the local t-LAG it supports is currently up. If not, the t-LAG-enabled RB initiates a t-LAG reconfiguration, as described with reference to FIG. 27 (block 3010 ). Thereafter, the process depicted in FIG. 30 ends at block 3012 .
  • If, however, the local t-LAG is currently up, the t-LAG-enabled RB configures its switch controller 630 to allow traffic to flow from the port connected to the t-LAG ISL 300 to its local t-LAG (block 3006 ). Thereafter, the process proceeds to blocks 3010 and 3012 , which have been described.
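  • Taken together, the four event handlers of FIGS. 26 and 28-30 appear to converge on one rule for the reverse direction, from the t-LAG ISL port to the local t-LAG port. The sketch below encodes that rule as a predicate; condensing the four flowcharts this way is one possible reading rather than a statement of the disclosed implementation, and the function name is hypothetical.

```python
def allow_isl_to_local(local_tlag_up: bool, remote_tlag_up: bool) -> bool:
    """Decide whether traffic arriving on the ISL may egress the local t-LAG.

    Local link-up with peer down (FIG. 26) and remote link-down with local
    up (FIG. 30) both allow the flow; local link-up with peer up (FIG. 26),
    remote link-up with local up (FIG. 28), and local link-down (FIG. 29)
    all disallow it.
    """
    return local_tlag_up and not remote_tlag_up
```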
  • the t-LAG support disclosed herein is designed to provide DMLT capability for external network nodes (e.g., switches or servers) connected to a TRILL campus, with all the links in a t-LAG used in an active-active mode for the same VLAN.
  • the use of a virtual-RB for each t-LAG leads to efficient load distribution of UC traffic in the t-LAG.
  • the use of this virtual-RB as the ingress RB in the TRILL encapsulation enables the MAC learning performed at egress RBs to be performed by hardware automatically.
  • the switch RB can alternatively be used as the ingress RB in TRILL encapsulation.
  • switch chips are capable of handling traffic for multiple RBs, but in some cases switch chips may lack such support in terms of capability or capacity. If such support is lacking, a t-LAG cluster including multiple RBs can be employed to adapt available hardware to provide t-LAG support. All the t-LAGs in a t-LAG cluster need to use just one virtual-RB in this case.
  • a link in a t-LAG is preferably selected as the primary link for multidestination transmission for each specific frame flow.
  • the selection of the primary link for a t-LAG can be system-based or based on a combination of distribution tree, VLAN, and/or DMAC.
  • Actions, such as enforcement of ACLs, are applied at egress RBs to make sure a multidestination frame will not be returned to its originating t-LAG.
  • Traffic handling in a t-LAG cluster is preferably separated into two domains: one for traffic routing within the TRILL campus and the other for the traffic switching in the regular L2 domain. It is recommended to totally separate the traffic handling in these two domains in a t-LAG cluster.
  • a t-LAG ISL is utilized in a t-LAG cluster between peer RBs to handle the traffic redirection in the event of a local link failure on a t-LAG. The traffic redirection via the t-LAG ISL is employed until a new route or distribution tree for affected traffic can be determined and applied.
  • present invention has been particularly shown as described with reference to one or more preferred embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention.
  • a data processing system e.g., server computer, network switch, etc.
  • present invention may alternatively be implemented as a program product including a data storage medium/device storing program code that can be processed by a data processing system to implement the functionality.
  • the data storage medium/device can be, for example, an optical or magnetic disk, a volatile or non-volatile memory device, etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

Each of first and second bridges of a data network having respective links to an external node implement a network bridge component that forwards traffic inside the data network and a virtual bridge component that forwards traffic outside of the data network. A virtual bridge is formed including the virtual bridge components of the first and second bridges and an interswitch link (ISL) between the virtual bridge components of the first and second bridges. Data frames are communicated with each of multiple external network nodes outside the data network via a respective one of multiple link aggregation groups all commonly supported by the virtual bridge.

Description

PRIORITY CLAIM
The present application claims priority to U.S. Provisional Patent Application 61/498,316, filed Jun. 17, 2011.
BACKGROUND OF THE INVENTION
1. Technical Field
The present invention relates in general to data networks, and in particular, to a link aggregation group (LAG) for a Layer 2 data network, such as a Transparent Interconnection of Lots of Links (TRILL) network.
2. Description of the Related Art
The IEEE 802.1D standard defines the Spanning Tree Protocol (STP), which is a conventional data link layer protocol that ensures that a bridged Ethernet network is free of bridge loops and that a single active network path exists between any given pair of network nodes. Current trends for packet-switched data networks—including the convergence of local area network (LAN) and storage area network (SAN) traffic (e.g., Fibre Channel, Fibre Channel over Ethernet (FCoE), Internet Small Computer System Interface (iSCSI), etc.), rapidly increasing bandwidth capacities of (and demand on) network links, and increased virtualization of network resources and infrastructure—place significant additional demands on network infrastructure and management.
These demands have exposed weaknesses in STP and have generated significant industry interest in replacing STP with a more robust, efficient, and flexible Layer 2 protocol. For example, because STP permits only a single active network path between any two network nodes and blocks all alternate network paths, aggregate network bandwidth is artificially reduced and is inefficiently utilized. STP also reacts to even small topology changes and may force partitioning of virtual LANs due to network connectivity changes. In addition, the Ethernet header of STP frames does not include a hop count (or Time to Live (TTL)) field, limiting flexibility. Furthermore, because only a single active network link is supported between any two nodes, STP has poor fault tolerance, lengthy failure recovery (which can require broadcast traffic to relearn forwarding paths) and low reliability (i.e., dropped traffic).
In view of the weaknesses of STP, the Internet Engineering Task Force (IETF) has recently proposed to replace STP with a new set of Transparent Interconnection of Lots of Links (TRILL) protocols, defined, for example, in Perlman, R., et al., “RBridges: Appointed Forwarders”, Internet-Draft, expires Nov. 18, 2011, and Perlman, R., et al., “RBridges: Base Protocol Specification”, Internet-Draft, expires September 2010, which has been superseded by RFC6325 “RBridges: Base Protocol Specification,” dated July 2011 and incorporated herein by reference. These and other TRILL protocols presuppose the use of IS-IS (as defined, for example, in IETF RFC6165) in the control plane.
With the use of TRILL protocols, regular L2 traffic is tunneled and passed via a special routing methodology (referred to herein as TRILL routing) in a TRILL campus comprising a network of RBridges and links (and possibly intervening standard L2 bridges) bounded by end stations. Multi-pathing is currently supported for unicast and multidestination traffic within a TRILL campus, but not on its boundary. Thus, at run time TRILL permits an external switch or server to have only one active link connected to a TRILL campus for the same Virtual LAN (VLAN).
The present application recognizes that it is desirable to promote high availability by supporting redundant links between external nodes and multiple RBridges in a TRILL campus. The present application additionally recognizes that it is also desirable to place these redundant links into a Link Aggregation Group (LAG) in order to utilize the bandwidth of all the links effectively. Accordingly, the present application discloses mechanisms and associated methodologies, referred to herein as TRILL LAG or t-LAG, that support connection of external network nodes (e.g., switches and/or servers) to a TRILL campus via a DMLT (Distributed Multi-Link Trunk).
SUMMARY OF THE INVENTION
In at least one embodiment, each of first and second bridges of a data network having respective links to an external node implement a network bridge component that forwards traffic inside the data network and a virtual bridge component that forwards traffic outside of the data network. A virtual bridge is formed including the virtual bridge components of the first and second bridges and an interswitch link (ISL) between the virtual bridge components of the first and second bridges. Data frames are communicated with each of multiple external network nodes outside the data network via a respective one of multiple link aggregation groups all commonly supported by the virtual bridge.
In at least one embodiment, each of first and second bridges of a data network having respective external links to an external node implement a network bridge component that forwards traffic inside the network and a virtual bridge component that forwards traffic outside of the network. A virtual bridge is formed including the virtual bridge components of the first and second bridges and an interswitch link (ISL) between the virtual bridge components of the first and second bridges. Data frames are redirected via the ISL in response to a link-down condition of one of the external links.
In at least one embodiment, a switch of a data network implements both a bridge and a virtual bridge. In response to receipt of a data frame by the switch from an external link, the switch performs a lookup in a data structure using a source media access control (SMAC) address specified by the data frame. The switch determines if the external link is configured in a link aggregation group (LAG) and if the SMAC address is newly learned. In response to a determination that the external link is configured in a LAG and the SMAC address is newly learned, the switch associates the SMAC with the virtual bridge and communicates the association to a plurality of bridges in the data network.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a high level block diagram of a conventional TRILL campus in accordance with the prior art;
FIG. 2 depicts an exemplary network environment in which a network node external to a TRILL campus can be connected to multiple RBridges (RBs) in the TRILL campus via multiple redundant links forming a LAG;
FIG. 3 illustrates an exemplary network environment in which a TRILL RB handles ingress and egress traffic for multiple RBs coupled to a TRILL campus via t-LAGs;
FIG. 4 depicts an exemplary network environment in which unicast traffic is autonomously distributed across the links of a t-LAG;
FIG. 5 illustrates an exemplary network environment in which the use of the ingress virtual-RB as the source RB in TRILL encapsulation of frames may cause problems in distribution of multidestination traffic in the TRILL campus;
FIG. 6 depicts an exemplary switch, which can be utilized to implement a TRILL RB (or vRB) in accordance with one or more embodiments;
FIGS. 7-8 respectively illustrate more detailed views of the Forwarding Database (FDB) and RB data structures in accordance with one embodiment;
FIG. 9 is a high level logical flowchart of an exemplary process by which an edge RB (or vRB) of a TRILL campus implements forwarding for UC traffic ingressing the TRILL campus in accordance with one embodiment;
FIG. 10 depicts an exemplary embodiment of a TRILL data frame in which a native Ethernet frame is augmented with a TRILL header and an outer Ethernet header;
FIG. 11 is a high level logical flowchart of an exemplary process by which an edge RB (or vRB) of a TRILL campus implements forwarding for multidestination (MC/BC/DLF) traffic ingressing the TRILL campus in accordance with one embodiment;
FIG. 12 is a high level logical flowchart of an exemplary process by which an RB (or vRB) of a TRILL campus implements forwarding for UC traffic received at a network port coupled to an internal link of the TRILL campus in accordance with one embodiment;
FIG. 13 is a high level logical flowchart of an exemplary process by which an RB (or vRB) of a TRILL campus implements forwarding for MC traffic received at a network port coupled to an internal link of the TRILL campus in accordance with one embodiment;
FIG. 14 is a high level logical flowchart of an exemplary process by which an ACL installed at an egress t-LAG port of an edge RB of a TRILL campus can be applied to prevent frame looping for multidestination traffic in accordance with one embodiment;
FIG. 15 is a high level logical flowchart of an exemplary process by which an ingress RB of a TRILL campus performs MAC learning at a t-LAG port in accordance with one embodiment;
FIG. 16 is a high level logical flowchart of an exemplary process by which an egress RB of a TRILL campus performs MAC learning in response to receipt of an End Station Address Distribution Instance (ESADI) frame from another RB in accordance with one embodiment;
FIG. 17 is a high level logical flowchart of an exemplary method of configuring a RB of a TRILL campus to support a t-LAG in accordance with one embodiment;
FIG. 18 is a high level logical flowchart of an exemplary process by which an egress RB of a TRILL campus implements MAC learning in response to a TRILL data frame in accordance with one embodiment;
FIG. 19 is a high level logical flowchart of an exemplary process by which an RB of a TRILL campus supports fault tolerant communication via a t-LAG in accordance with one embodiment;
FIGS. 20-21 illustrate an exemplary network environment in which, in the event of a failure of a link of a t-LAG, unicast traffic is redirected via the t-LAG ISL to a peer RB in the same t-LAG cluster for egress through a healthy t-LAG link;
FIG. 22 depicts an exemplary network environment in which, if the number of failed t-LAG links exceeds a predetermined threshold, unicast traffic is rerouted to a different egress RB;
FIGS. 23-24 illustrate an exemplary network environment in which, in the event of a failure of a t-LAG link, the t-LAG ISL is used to pass multidestination traffic to a peer RB in the same t-LAG cluster, which then sends egress frames out;
FIG. 25 is a high level logical flowchart of an exemplary process by which a t-LAG-enabled RB is configured by default at startup in accordance with one embodiment;
FIG. 26 is a high level logical flowchart of an exemplary configuration process at a t-LAG-enabled RB in response to a local link-up event in accordance with one embodiment;
FIG. 27 is a high level logical flowchart of an exemplary t-LAG reconfiguration process in accordance with one embodiment;
FIG. 28 is a high level logical flowchart of an exemplary configuration process at a t-LAG-enabled RB in response to a remote link-up event in accordance with one embodiment;
FIG. 29 is a high level logical flowchart of an exemplary configuration process at a t-LAG-enabled RB in response to a local link-down event in accordance with one embodiment;
FIG. 30 is a high level logical flowchart of an exemplary configuration process at a t-LAG-enabled RB in response to a remote link-down event in accordance with one embodiment; and
FIG. 31 is a high level logical flowchart of a prior art process of MAC learning in a conventional TRILL network.
In the drawings, common reference characters are utilized to identify like or corresponding features.
DETAILED DESCRIPTION OF ILLUSTRATIVE EMBODIMENT
The present application describes mechanisms and associated methodologies, referred to herein as TRILL LAG or t-LAG, that facilitate the connection of network nodes (e.g., servers and/or switches) external to a TRILL campus in a Link Aggregation Group (LAG) through the use of a virtual routing bridge (virtual-RB). Multiple t-LAGs may additionally be hosted by a set of multiple physical switches, herein referred to as a t-LAG cluster, with all t-LAGs in a given t-LAG cluster preferably (but not necessarily) sharing the same virtual-RB. The use of the virtual-RB for the t-LAGs can resolve load distribution for unicast (UC) traffic. For multidestination (e.g., multicast (MC), broadcast (BC), destination lookup fail (DLF)) traffic, different mechanisms are employed to ensure traffic is properly delivered to a peer RB of a t-LAG cluster; otherwise, either more than one copy of a multidestination frame may be sent to the same destination or a frame may be erroneously returned to an external network node that sourced the frame via the same t-LAG at which the frame ingressed the TRILL campus.
It is presently preferred to separate the traffic forwarding in a t-LAG cluster into two domains: the TRILL routing domain and the regular L2 switching domain. That is, it is preferred if the data switching in the regular L2 domain in a t-LAG cluster is handled within the virtual-RB itself and does not go through TRILL routing at all, if possible. An interswitch link (ISL) for a t-LAG cluster can advantageously be used for frame redirection in the regular L2 switching domain in the event of a link failure on any t-LAG in the t-LAG cluster.
With reference now to the figures and with particular reference to FIG. 1, there is illustrated a high level block diagram of a conventional TRILL campus 100 in accordance with the prior art. Prior art TRILL campus 100 includes a packet-switched data network including a plurality of RBridges (RBs) interconnected by network links. As shown, various of the RBs are coupled to external LANs and/or network nodes, such as switch 102.
The present TRILL protocols permit multi-paths within TRILL campus 100, but not at its boundary. Consequently, if an external network node, such as switch 102, wants to connect to a TRILL campus by multiple physical links, such as links 104 and 106, the TRILL protocols will determine an appointed forwarder for each VLAN running on top of the links and, as a result, will utilize only a single link for data forwarding at run time for each VLAN. Accordingly, for a given VLAN, traffic between switch 102 and RB 112 on link 104 is blocked (as shown) if RB 110 is chosen as the appointed forwarder for that VLAN. Consequently, all traffic for that VLAN will be forwarded from TRILL campus 100 to switch 102 via link 106.
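The single-active-link behavior just described can be illustrated with a minimal sketch. The link and RB labels follow FIG. 1, but the dictionary layout and the function name are assumptions made for illustration only and are not defined by the TRILL protocols.

```python
# Illustrative model of FIG. 1 (names are assumptions, not TRILL-defined APIs).
# Each external link of switch 102 terminates on one RBridge.
LINK_TO_RB = {"link_104": "RB112", "link_106": "RB110"}


def forwarding_links(appointed_forwarder, link_to_rb=LINK_TO_RB):
    """Map each VLAN to the links that carry its traffic at run time.

    appointed_forwarder: dict of VLAN ID -> RBridge elected for that VLAN.
    Only links terminating on the appointed forwarder forward the VLAN;
    every other link is blocked for it.
    """
    return {
        vlan: [link for link, rb in link_to_rb.items() if rb == forwarder]
        for vlan, forwarder in appointed_forwarder.items()
    }


if __name__ == "__main__":
    # RB 110 is the appointed forwarder for VLAN 10, so link 104 is blocked.
    print(forwarding_links({10: "RB110"}))
```

In this sketch a second VLAN could elect RB 112 and use link 104, but no single VLAN ever uses both links at once, which is the limitation that t-LAG is intended to remove.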
Referring now to FIG. 2, an exemplary network environment is shown in which a network node external to a TRILL campus 200 can be connected to multiple RBs in the TRILL campus via multiple redundant links forming a LAG. The exemplary network environment includes a TRILL campus 200 comprising a packet-switched data network including a plurality of RBs (e.g., RB1-RB6) coupled by internal network links 202 a-202 h. Various of RB1 through RB6 are connected by external links to external networks or external nodes. For example, RB1 and RB2 connect to an external LAN 210 a supporting end stations 220 a-220 c by external links 212 a and 212 b, respectively. Similarly, RB5 connects to an external LAN 210 b, which supports end stations 220 e-220 f, by an external link 212 e. Further, RB4 and RB6 connect to an external switch 202, which supports end station 220 g, by external links 212 c and 212 d, respectively, and RB4 and RB6 further connect to an end station 220 d by external links 212 f and 212 g. As further shown, external links 212 c and 212 d form t-LAG 230 a, and external links 212 f and 212 g form t-LAG 230 b.
In at least one embodiment, for TRILL campus 200 to support t-LAGs to external nodes (e.g., t-LAGs 230 a and 230 b), an additional RBridge, referred to as a virtual-RB or vRB herein, is created and deployed for each t-LAG. Thus, for example, vRB7 running on top of RB4 and RB6 supports t-LAG 230 a, and vRB8 running on top of RB4 and RB6 supports t-LAG 230 b. All the virtual-RBs in a TRILL campus created for the same t-LAG preferably employ the same RB nickname, which, as known to those skilled in the art, is utilized to identify an ingress RB in the TRILL tunneling header encapsulating an Ethernet frame. Further details regarding the TRILL header are described below with reference to FIG. 10.
All the virtual-RBs supporting t-LAGs are preferably involved in the TRILL IS-IS communication in active-active mode, as well as End Station Address Distribution Instance (ESADI) communication. For ESADI communication, each t-LAG-enabled switch preferably handles all the MAC addresses learned at its local t-LAG ports. A t-LAG-enabled RB preferably conducts this communication on behalf of the virtual-RB(s) running on top of it, if any. In addition, a LSP (Link State PDU (Protocol Data Unit)) is preferably generated automatically by a local switch for each virtual-RB on it. Shortest path first (SPF) computation preferably also takes these virtual-RBs into account, at least for UC traffic.
As further shown in FIG. 2, the switch chip(s) providing the switching intelligence of RB1 through RB6 in TRILL campus 200 preferably have the capability of contemporaneously handling traffic for more than one RB. For example, in the depicted embodiment, RB4 handles ingress and egress traffic for RB4 (the switch itself), as well as vRB7 and vRB8; RB6 similarly handles ingress and egress traffic for itself (i.e., RB6), as well as vRB7 and vRB8. To support this capability, for traffic ingressing at a t-LAG, the edge RBs (i.e., those connected to at least one external link 212) within TRILL campus 200 are preferably able to employ the corresponding ingress virtual-RB nickname as the ingress RB for TRILL encapsulation of the frames. For example, the traffic ingressing at RB4 may use RB4, vRB7 or vRB8 as the ingress RB in the TRILL header, depending upon which local port the frame is ingressing on. Similarly, the traffic ingressing at RB6 may use RB6, vRB7 or vRB8 as the ingress RB, again depending on the local port the frame is ingressing on. In this way, when a frame exits TRILL campus 200, the MAC learning performed at egress RBs will automatically bind the client source Media Access Control (SMAC) address to the ingress virtual-RB. Once this binding is established, UC traffic destined for a t-LAG will be autonomously load balanced across the external links comprising the t-LAG, as shown in FIG. 4. This use of the ingress virtual-RB as the ingress RB in TRILL headers may, however, cause problems for multidestination traffic traversing inside a TRILL campus, as discussed below.
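The port-dependent choice of ingress RB nickname can be sketched as a simple lookup. This is a minimal illustration, assuming a hypothetical per-switch table that maps each local t-LAG port to the nickname of its virtual-RB; the table contents mirror FIG. 2 for RB4, while the function name and the port label "access_1" are invented for the example.

```python
# Hypothetical configuration for RB4 in FIG. 2: local t-LAG port -> virtual-RB.
# External link 212c belongs to t-LAG 230a (vRB7), 212f to t-LAG 230b (vRB8).
RB4_TLAG_PORTS = {"212c": "vRB7", "212f": "vRB8"}


def ingress_nickname(switch_rb, tlag_ports, ingress_port):
    """Pick the ingress RB nickname written into the TRILL header.

    Frames arriving on a t-LAG port are encapsulated with the nickname of the
    virtual-RB serving that t-LAG; frames arriving on any other local access
    port use the nickname of the switch RB itself.
    """
    return tlag_ports.get(ingress_port, switch_rb)


if __name__ == "__main__":
    for port in ("212c", "212f", "access_1"):
        print(port, "->", ingress_nickname("RB4", RB4_TLAG_PORTS, port))
```

Because RB6 would map its own t-LAG ports (212d and 212g) to the same two nicknames, egress RBs learn a client SMAC against the virtual-RB regardless of which physical switch the frame entered on.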
In some cases, switch chips may not be capable of contemporaneously handling TRILL data frames for more than one RB or may support only a limited number of RBs (i.e., fewer than the number of RBs deployed). In addition, the number of distribution trees supported on a switch chip can also be very limited. Due to these factors, some adjustments may be required to adapt to such switching hardware limitations.
With reference now to FIG. 3, there is illustrated a high level view of a network environment in which multiple t-LAGs supported by a TRILL campus form a t-LAG cluster. As seen by comparison of FIGS. 2 and 3, FIG. 3 depicts a network environment similar to that described above with reference to FIG. 2, with a few differences.
First, the network environment of FIG. 3 includes an additional end station 220 h, which is coupled to RB4 and RB6 via an additional t-LAG 230 c including external links 212 h and 212 i. Second, t-LAGs 230 a-230 c, which all belong to the same t-LAG cluster, are supported by a single virtual-RB (i.e., vRB9) rather than two virtual RBs (i.e., vRB7 and vRB8) and thus can share one RB nickname, if desired. As a result, the total number of RBs used in TRILL campus 200 will be reduced as compared to embodiments in which one virtual-RB is implemented per t-LAG. It should be noted that it is possible for a t-LAG cluster to use more than one RB nickname if desired, meaning, for example, the assignment of a virtual-RB to a t-LAG can be t-LAG-based.
Third, FIG. 3 further depicts that RB4 and RB6 are each comprised of two components: an intra-campus RB component (RB4′ and RB6′) designated to handle traffic forwarding inside the TRILL campus 200 and an extra-campus RB component (RB4″ and RB6″) designated to handle the traffic forwarding outside of TRILL campus 200 (i.e., in the regular L2 switching domain). As shown, the virtual-RB supporting the t-LAG cluster (i.e., vRB9) is formed of extra-campus RB components RB4″ and RB6″ linked by a t-LAG ISL 300 and thus may be distributed across multiple physical switch platforms. T-LAG ISL 300 is utilized for control communication and for failure handling. For example, t-LAG ISL 300 can be utilized for frame redirection in the event of a link failure on any local t-LAG port, as discussed further herein with reference to FIGS. 19-30.
For frames ingressing into TRILL campus 200, vRB9 passes the frame either to RB4′ or to RB6′ based upon whether the frame was received at RB4″ or RB6″, respectively. As noted in FIG. 3, for traffic that needs to pass beyond TRILL campus 200, RB4″ is only connected to RB4′, and RB6″ is only connected to RB6′. The virtual links connecting RB4′ to RB4″ and RB6′ to RB6″ are zero cost and should be handled transparently by the switch chips on RB4 and RB6, respectively. It is recommended, but not required, that all local L2 switching in a virtual-RB (e.g., vRB9) be handled locally within the RB itself.
As with all network links, a link in a t-LAG may go down at run time. Consequently, it is desirable to handle such link failures in a manner that minimizes or reduces frame loss. At least two techniques of failure handling are possible:
    • 1. To adjust the connectivity between the intra-campus RB (e.g., RB4′ in FIG. 3) and its virtual-RBs (e.g., RB4″ or vRB9) at run time; and/or
    • 2. To use the t-LAG ISL (e.g., ISL 300 between RB4″ and RB6″ in FIG. 3) for frame redirection whenever a link failure occurs in the t-LAG cluster.
With the first solution, if a t-LAG link drops on a switch (e.g., RB4″), the virtual link between the intra-campus RB component (e.g., RB4′) and the virtual-RB (e.g., RB4″ or, actually, vRB9) will be claimed link-down. In this way, after the topology change has been communicated to all the RBs and a new path has taken effect, the UC traffic previously routed to RB4 will be routed to RB6 for egress via a t-LAG link in RB6″. For multidestination (MC/BC/DLF) traffic, the local access ports on edge RBs (those like RB4″ and RB6″ that interface with external links 212 a-212 i) will need to be adjusted at run time to allow the traffic to be delivered via a healthy link in RB6″ for the same t-LAG. With the second solution, the t-LAG ISL (e.g., ISL 300) is used to redirect UC or multidestination frames to the peer RB in the same t-LAG cluster in case a t-LAG port on the local RB has a link down.
Because more than one t-LAG shares the same virtual link in the first solution (e.g., the virtual link from RB4′ to RB4″ in FIG. 3), all other healthy t-LAG links on that RB (e.g., RB4″) will not be used for UC frame delivery once the connectivity between RB4′ and RB4″ is claimed link-down. Thus, some bandwidth of healthy t-LAG links can be wasted in this case. In the second solution, the t-LAG ISL (e.g., ISL 300) may get over-loaded if too much traffic needs to pass through it. It is therefore presently preferred if both the first and second solutions are implemented in order to better address link failures on t-LAGs. In this combined solution, a threshold is preferably implemented and pre-specified so that a t-LAG-enabled switch can stop claiming the connectivity between the switch RB (e.g., RB4) and the virtual-RB (e.g., vRB9) if the number of the local t-LAG ports that are link-down exceeds the threshold. It should be noted that it will take time for related TRILL IS-IS communication as well as SPF computation to occur and complete before a new topology path can be applied in response to a t-LAG link-down event. Before these complete, all the traffic directed to a failed t-LAG link should be redirected as soon as possible via the t-LAG ISL to the peer RB for delivery to external network nodes (e.g., switches or servers).
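The combined solution can be sketched as two independent decisions made by a t-LAG-enabled switch: whether to keep claiming the virtual link to the virtual-RB, and how to deliver a frame destined for a given local t-LAG port. The following is a minimal sketch under stated assumptions; the class, its field names, and the string return values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class TLagSwitch:
    """Hypothetical run-time state of one t-LAG-enabled switch (e.g., RB4)."""

    down_threshold: int
    # Local t-LAG port name -> True if the link is currently up.
    port_up: dict = field(default_factory=dict)

    def down_ports(self):
        """Return the local t-LAG ports that are currently link-down."""
        return sorted(p for p, up in self.port_up.items() if not up)

    def claims_virtual_link(self):
        """First solution: keep claiming the switch RB <-> virtual-RB link
        until the number of link-down t-LAG ports exceeds the threshold."""
        return len(self.down_ports()) <= self.down_threshold

    def egress_path(self, port):
        """Second solution: deliver locally if the t-LAG port is up,
        otherwise redirect the frame to the peer RB via the t-LAG ISL."""
        return "local" if self.port_up[port] else "t-LAG ISL"


if __name__ == "__main__":
    rb4 = TLagSwitch(down_threshold=1,
                     port_up={"212c": True, "212f": True, "212h": True})
    rb4.port_up["212c"] = False   # one failure: redirect, topology unchanged
    print(rb4.claims_virtual_link(), rb4.egress_path("212c"))
    rb4.port_up["212f"] = False   # second failure exceeds the threshold
    print(rb4.claims_virtual_link())
```

In this sketch the ISL redirection applies immediately on a link failure, while withdrawal of the virtual link (and the resulting IS-IS and SPF activity) occurs only once the failure count exceeds the pre-specified threshold.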
In TRILL, multidestination traffic (MC/BC/DLF) is handled differently from UC traffic. A distribution tree is predetermined and followed for a specific flow of multidestination traffic ingressing a TRILL campus at an RB. Usually, all RBs in a TRILL campus will be visited in all the distribution trees unless a VLAN or pruning has been applied to the distribution tree. Unless some provision is made, more than one copy of a frame will (undesirably) be delivered to external switches or servers via a t-LAG, if the frame is flooded in the TRILL campus following a distribution tree and all RBridges transmit the frame out of their local access ports.
To prevent delivery of duplicate frames, a primary link for each t-LAG is preferably predetermined and followed for a specific multidestination (MC/BC/DLF) traffic flow egressing from a TRILL campus. Several methodologies are possible for selecting the primary link for a t-LAG, including:
    • System-based: The same link in a t-LAG is always used across a TRILL campus as the primary link for multidestination transmission, if the link is available.
    • Distribution tree-based: Different distribution trees can use different t-LAG links as the primary link for multidestination transmission.
    • (Distribution tree, VLAN)-based: Different t-LAG links can be used as the primary link for different VLANs in a distribution tree.
    • (Distribution tree, VLAN, DMAC)-based: Different t-LAG links can be used as the primary link for different destination MAC (DMAC) addresses for the same distribution tree and the same VLAN.
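The four selection methodologies listed above differ only in which fields of a flow contribute to the choice of primary link. The following minimal sketch makes that explicit. The hashing scheme (CRC-32 over the selected fields) and the fallback to the next available link are assumptions made for illustration; the text above does not prescribe a particular selection function, only that the selection be predetermined.

```python
import zlib

# Which flow fields feed the selection, per methodology (names are illustrative).
KEY_FIELDS = {
    "system": lambda tree, vlan, dmac: (),
    "tree": lambda tree, vlan, dmac: (tree,),
    "tree_vlan": lambda tree, vlan, dmac: (tree, vlan),
    "tree_vlan_dmac": lambda tree, vlan, dmac: (tree, vlan, dmac),
}


def primary_link(links, up_links, mode, tree=None, vlan=None, dmac=None):
    """Choose the primary t-LAG link for a multidestination flow.

    links:    all links configured in the t-LAG
    up_links: the subset of links currently up
    Every RB evaluating this function with the same inputs picks the same
    link, so only one copy of a frame egresses via the t-LAG.
    Returns None if no link in the t-LAG is up.
    """
    ordered = sorted(links)
    key = KEY_FIELDS[mode](tree, vlan, dmac)
    # Deterministic hash (unlike Python's built-in hash() for strings).
    start = zlib.crc32(repr(key).encode()) % len(ordered)
    for offset in range(len(ordered)):
        candidate = ordered[(start + offset) % len(ordered)]
        if candidate in up_links:
            return candidate
    return None


if __name__ == "__main__":
    tlag = ["212c", "212d"]
    print(primary_link(tlag, set(tlag), "system"))
    print(primary_link(tlag, set(tlag), "tree_vlan", tree=1, vlan=10))
```

In "system" mode the key is empty, so the same link is chosen for every flow as long as it is available; the finer-grained modes can spread multidestination flows across the links of the t-LAG.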
As will be appreciated, the pre-determined selection of the primary link for a t-LAG may need to be adjusted at run time if a link-down event occurs in a t-LAG. Accordingly, the RBs in a t-LAG cluster preferably inter-communicate link-up and link-down event notifications. Before any required adjustment in the predetermined selection of the primary link is implemented in response to a link-down event, the t-LAG ISL (e.g., t-LAG ISL 300) can be used for frame redirection to avoid frame drop due to frames being sent to a failed primary t-LAG link.
It is important to the t-LAG design to bind a client SMAC to the ingress virtual-RB for a t-LAG. It would also be beneficial if the ingress virtual-RB can be used as the ingress RB in TRILL encapsulation for a frame when it enters at a t-LAG, as the MAC learning performed at egress RBs will do this binding automatically. However, the use of the ingress virtual-RB as the ingress RB in TRILL encapsulation of frames may cause problems in distribution of multidestination traffic in the TRILL campus for some switch chips, as now described with reference to FIG. 5.
Assuming the illustrated distribution tree rooted at vRB9 is used for a multidestination flow and the link between RB4′ and RB4″ is chosen as part of the distribution tree, if a data frame ingresses into the TRILL campus via a t-LAG in RB6″, the data frame may get dropped as it traverses in TRILL campus 200 (e.g., by RB1 or RB3) because vRB9 is used as the ingress RB in the TRILL header of the frame, but is actually on the destination side of the distribution tree. Instead of using the virtual-RB (e.g., vRB9) as the source, the switch RB (e.g., RB6) should be used as the source RB in TRILL encapsulation in the above case in order to prevent erroneous frame dropping. This ingress RB designation should be applied to both UC and multidestination traffic to avoid MAC flapping at egress RBs.
As mentioned, one aspect of the implementation of t-LAG is the binding of the client SMAC learned at a t-LAG to the virtual-RB created for that t-LAG. If the virtual-RB (e.g., vRB9) can be used as the ingress RB in TRILL encapsulation, then the desired binding can be automatically achieved (e.g., by hardware) via the MAC learning performed at egress RBs. If the switch RB (e.g., RB6) is instead used as the ingress RB for TRILL encapsulation to avoid erroneous frame dropping of multidestination traffic as discussed above, then a different technique must be employed to achieve the desired binding of the client SMAC to the virtual-RB.
One alternative technique to achieve the desired binding of the client SMAC to the virtual-RB is through software-based MAC learning performed on a t-LAG-enabled switch (as described, for example, with reference to FIG. 15). A MAC address learned at a t-LAG port can be specially manipulated in software to bind to the ingress virtual-RB; this newly learned MAC entry can then be propagated via ESADI to all other RBs in the TRILL campus for configuration. In this way, the load distribution of UC traffic at any ingress RB can then be achieved automatically. It is also possible to perform the MAC learning via hardware at egress RBs if the chips of the relevant switches allow multiple RBs to be mapped into the same virtual port so that the MAC learning performed on the chips can bind a client SMAC to the corresponding ingress virtual-RB.
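The software-based learning step can be sketched as follows. This is a minimal illustration, not the process of FIG. 15: the FDB is modeled as a plain dictionary, ESADI propagation is modeled as appending to a list, and the handling of SMACs learned on non-t-LAG ports (binding to the switch RB without an announcement) is an assumption added to make the example complete.

```python
def learn_at_access_port(fdb, esadi_queue, vlan, smac, port, tlag_ports, switch_rb):
    """Software MAC learning for a frame received on a local access port.

    fdb:         dict of (vlan, smac) -> owning RB nickname
    esadi_queue: list collecting (vlan, smac, nickname) entries to announce
    tlag_ports:  dict of local t-LAG port -> virtual-RB nickname
    Returns the nickname the SMAC is bound to.
    """
    key = (vlan, smac)
    if key in fdb:
        # Already known: no new binding and nothing to announce.
        return fdb[key]
    # Assumption: SMACs learned on non-t-LAG ports bind to the switch RB.
    owner = tlag_ports.get(port, switch_rb)
    fdb[key] = owner
    if port in tlag_ports:
        # Newly learned at a t-LAG port: bind to the virtual-RB and queue
        # the association for distribution to the other RBs via ESADI.
        esadi_queue.append((vlan, smac, owner))
    return owner


if __name__ == "__main__":
    fdb, announcements = {}, []
    ports = {"212d": "vRB9"}  # hypothetical t-LAG port on RB6
    learn_at_access_port(fdb, announcements, 10, "00:11:22:33:44:55",
                         "212d", ports, "RB6")
    print(fdb, announcements)
```

With this binding in place, other RBs address UC frames for the client to the virtual-RB nickname even though the switch RB nickname was used as the ingress RB in the TRILL header.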
Filtering Database for Bridge (FDB) sync for SMACs learned at t-LAG ports is preferably implemented between the peer RBs in a t-LAG cluster, especially if the LAG hashing algorithm performed on external switches or servers is SMAC-based. This FDB synchronization avoids unnecessary flooding or dropping of known UC traffic at egress to a t-LAG if the egress RB has no related MAC information. The MAC information of the peer RB in the same cluster is also needed upon making a decision to redirect traffic to the t-LAG ISL when a local t-LAG link fails.
Because all RBs in a TRILL campus will usually be part of a distribution tree, it is possible that a data frame may attempt to return to the t-LAG at which it ingresses, for example, through a link for the same t-LAG but on a different RB than the ingress RB. Actions, such as the enforcement of ACLs, can be applied on all the t-LAG-enabled RBs to ensure that such looping data frames are dropped before egressing from TRILL campus 200, as described further below with reference to FIG. 14.
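The effect of such an egress filter can be sketched as a single predicate. The match criterion used here (the t-LAG at which the frame's SMAC was learned, as known from the synchronized FDB) is an assumption chosen for illustration; the actual ACL match fields are those described with reference to FIG. 14 and may differ.

```python
def permit_tlag_egress(vlan, smac, egress_tlag, source_tlag_by_mac):
    """Return False (drop) if the frame would exit via its originating t-LAG.

    source_tlag_by_mac: dict of (vlan, smac) -> t-LAG at which that SMAC was
    learned (hypothetical table, kept consistent between peer RBs by FDB sync).
    Frames from unknown or non-t-LAG sources are permitted.
    """
    return source_tlag_by_mac.get((vlan, smac)) != egress_tlag


if __name__ == "__main__":
    learned = {(10, "00:11:22:33:44:55"): "t-LAG 230a"}
    # The frame entered via t-LAG 230a on RB4 and reaches RB6 over the tree.
    print(permit_tlag_egress(10, "00:11:22:33:44:55", "t-LAG 230a", learned))
    print(permit_tlag_egress(10, "00:11:22:33:44:55", "t-LAG 230b", learned))
```

Applying the same check on every t-LAG-enabled RB means a flooded frame is dropped at each link of its originating t-LAG while still being delivered via other t-LAGs and access ports.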
With reference now to FIG. 6, there is illustrated an exemplary embodiment of a physical switch 600 that may be utilized to implement any of the RBs or vRBs of TRILL campus 200, as depicted in FIG. 2 or FIG. 3. As shown, switch 600 includes a plurality of physical ports 602 a-602 m. Each physical port 602 includes a respective one of a plurality of receive (Rx) interfaces 604 a-604 m and a respective one of a plurality of ingress queues 606 a-606 m that buffers frames of data traffic received by the associated Rx interface 604. Each of ports 602 a-602 m further includes a respective one of a plurality of egress queues 614 a-614 m and a respective one of a plurality of transmit (Tx) interfaces 620 a-620 m that transmit frames of data traffic from an associated egress queue 614. Ports 602 connected to external links 212 are referred to herein as “local access ports,” while ports 602 connected to internal links 202 of TRILL campus 200 are referred to herein as “local network ports.”
Switch 600 additionally includes a switch fabric 610, such as a crossbar or shared memory switch fabric, which is operable to intelligently switch data frames from any of ingress queues 606 a-606 m to any of egress queues 614 a-614 m under the direction of switch controller 630. As will be appreciated, switch controller 630 can be implemented with one or more centralized or distributed, special-purpose or general-purpose processing elements or logic devices (also referred to as “switch chips”), which may implement control entirely in hardware, or more commonly, through the execution of firmware and/or software by a processing element. Switch controller 630 thus provides the switching intelligence that implements the RB (and vRB) behavior herein described.
In support of the RB and vRB behavior described herein, switch controller 630 implements a number of data structures in volatile or non-volatile data storage, such as cache, memory or disk storage. Although these data structures are commonly referred to as “tables,” those skilled in the art will appreciate that a variety of physical data structures including, without limitation, arrays, lists, trees, or composites thereof, etc. may be utilized to implement various ones of the data structures.
The depicted data structures include FDB data structure 640, which as illustrated in FIG. 7, includes multiple entries each including fields for specifying an RB (or vRB), a virtual local area network (VLAN) identifier (VID), a destination media access control (DMAC) address, and a destination port (i.e., either a local access port (lport) or virtual port (vport) on a remote RB). For L2 switching, based on a (DMAC, VLAN) tuple, FDB data structure 640 returns the destination port of the frame, which can be a local access port, a vport for a remote RB (for UC traffic), or a vport for a distribution tree (for multidestination traffic). For TRILL multidestination traffic, FDB data structure 640, responsive to an input (RB, VLAN) or (RB, DMAC, VLAN) tuple, returns a vport for a distribution tree for the multidestination traffic.
The data structures of switch controller 630 additionally include RB data structure 642, which, as depicted in FIG. 8, includes multiple entries each including fields for specifying an RB (or vRB) and a destination port (i.e., an lport or a vport). For TRILL routing, RB data structure 642, responsive to an indication of the egress RB of a data frame, returns a destination port for sending out data traffic, where the destination port can be a local access port or a vport for a remote RB. Based on the specification of an ingress RB, RB data structure 642 additionally provides the vport for MAC learning at an egress RB. For TRILL multidestination traffic, RB data structure 642 provides the vport for a distribution tree based on the root RB.
The data structures employed by switch controller 630 further include:
    • Vport data structure 644: for UC traffic, given a vport, vport data structure 644 returns the egress RB and an index to next hop data structure 648 (if Equal-Cost Multi-Path routing (ECMP) is disabled) or an index to ECMP data structure 646 (if ECMP is enabled); for multidestination traffic, vport data structure 644 returns the root RB (or vRB) of the distribution tree plus an index to MC bitmap data structure 650;
    • ECMP data structure 646: given an index, ECMP data structure 646 resolves the index to a next hop for TRILL routing;
    • Next hop data structure 648: for a given traffic flow, next hop data structure 648 indicates the local port for egress, the next-hop DMAC, and the SMAC and VLAN to use for forwarding;
    • MC bitmap data structure 650: given an index, MC bitmap data structure 650 returns both a Layer 2 (L2) and a Layer 3 (L3) bitmap; the L2 bitmap is used for flooding to local access ports, and the L3 bitmap is used for tree distribution inside the TRILL campus, where a bit turned on in the L3 bitmap can be used in port data structure 654 to index into next hop data structure 648;
    • VLAN data structure 652: VLAN data structure 652 contains a vport for a distribution tree for BC/DLF flooding;
    • Port data structure 654: for each local port 602, port data structure 654 contains an index to next hop data structure 648 to support TRILL distribution trees and further indicates the ingress RB to use for multidestination traffic flows.
With reference now to FIG. 9, there is illustrated a high level logical flowchart of an exemplary process by which an edge RB (or vRB) of TRILL campus 200 implements forwarding for UC traffic ingressing TRILL campus 200 in accordance with one embodiment. The process begins at block 900 and then proceeds to block 902, which depicts an edge RB of TRILL campus 200 receiving a UC data frame at an access port (e.g., a port 602 connected to one of external links 212 a-212 i). In response to receipt of the UC data frame, the edge RB performs a lookup in FDB data structure 640 based on a tuple including the DMAC and VLAN specified in the data frame. As indicated at block 906, if no matching entry for the data frame is found in FDB data structure 640, the edge RB forwards the UC data frame in accordance with the MC forwarding process depicted in FIG. 11, which is described below. Thereafter, the UC forwarding process depicted in FIG. 9 ends at block 930.
Returning to block 906, in response to finding a matching entry for the (DMAC, VLAN) tuple in FDB data structure 640, the edge RB determines at block 910 if the destination port indicated by FDB data structure 640 is a vport for a remote RB. If not, the edge RB sends the data frame out of the local access port indicated by FDB data structure 640 (i.e., performs regular L2 forwarding on an external link 212 outside of TRILL campus 200) as shown at block 912, and the UC forwarding process of FIG. 9 ends at block 930. If, however, the edge RB determines at block 910 that the destination port specified by FDB data structure 640 is a vport for a remote RB, the edge RB, which will serve as the ingress RB, further determines whether ECMP is enabled (block 920). If not, the process proceeds to block 924, described below. If ECMP is enabled, the edge RB accesses ECMP data structure 646 to determine the next hop for the data frame (block 922). Following either block 920 (if ECMP is disabled) or block 922 (if ECMP is enabled), the edge RB accesses next hop data structure 648 to retrieve information for the next hop interface (block 924). Thereafter, the edge RB adds a TRILL header and an outer encapsulating Ethernet header to the data frame (block 926) and sends the encapsulated data frame out of a local network port on an internal link 202 of TRILL campus 200 to the next hop (block 928). Thereafter, the UC forwarding process terminates at block 930.
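The UC ingress decision sequence of FIG. 9 can be illustrated with a minimal Python sketch. The dictionary-based tables, the `Port` class, the function name, and the use of a `flow_hash` to pick among ECMP members are all illustrative assumptions and are not taken from the described embodiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    name: str
    kind: str  # "lport" = local access port, "vport" = virtual port for a remote RB


def forward_uc_ingress(fdb, vport_table, ecmp_table, next_hop_table,
                       dmac, vlan, ecmp_enabled, flow_hash=0):
    """Return (action, detail) for a UC frame received at an access port."""
    port = fdb.get((dmac, vlan))
    if port is None:
        # Block 906: no FDB match, so the frame is handled as multidestination.
        return ("multidestination", None)
    if port.kind == "lport":
        # Block 912: regular L2 forwarding out of a local access port.
        return ("l2_forward", port.name)
    egress_rb, index = vport_table[port.name]
    if ecmp_enabled:
        # Block 922: resolve the ECMP index to a next hop index.
        # Selecting a member by flow hash is an assumption of this sketch.
        members = ecmp_table[index]
        index = members[flow_hash % len(members)]
    hop = next_hop_table[index]  # block 924
    # Blocks 926-928: the caller would add the TRILL and outer Ethernet headers.
    return ("trill_encap", {"egress_rb": egress_rb, "out_port": hop})
```

In this sketch the index returned by the vport table is interpreted as a next hop index when ECMP is disabled and as an ECMP group index when ECMP is enabled, mirroring the description of the vport data structure.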
Referring to FIG. 10, there is depicted an exemplary embodiment of a TRILL data frame 1000 in accordance with one embodiment. As received at an edge RB (e.g., at block 902 of FIG. 9), a conventional (native) Ethernet data frame 1010 includes an Ethernet header 1012 and an Ethernet payload 1014. As described at block 926 of FIG. 9, the edge RB prepends a TRILL header to Ethernet frame 1010 and then encapsulates the whole with an outer Ethernet header 1020 (which specifies a TRILL Ethertype) and an Ethernet FCS 1022. As depicted, the TRILL header begins with a collection of fields 1030 including a TRILL version field (V), a reserved field (R), a multi-destination bit (M) indicating whether the TRILL data frame is a multidestination frame, an op-length field (OpLen) that gives the length of the TRILL header optional fields, if any, terminating the TRILL header, and a hop count field (HC) decremented by each RB “hop” as TRILL data frame 1000 is forwarded in TRILL campus 200. The TRILL header additionally includes an egress RB nickname field 1032 that, for UC data frames, identifies by RB nickname the last RB (i.e., egress RB) in TRILL campus 200 that will handle the data frame and is therefore responsible for decapsulating native Ethernet data frame 1010 and forwarding it to an external node. The TRILL header further includes an ingress RB nickname field 1034 that indicates the RB nickname of the edge RB. As indicated above, it is preferable if the specified RB nickname is the RB nickname of the edge switch RB (e.g., RB4) rather than the RB nickname of the edge vRB (e.g., vRB9).
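As an illustration of fields 1030-1034, the following Python sketch packs and unpacks the fixed six-byte portion of a TRILL header. The bit widths (2-bit V, 2-bit R, 1-bit M, 5-bit OpLen, 6-bit HC, and 16-bit nicknames) follow the base TRILL protocol specification (RFC 6325) and are not stated in the text above; the function names are hypothetical, and optional header fields are not handled.

```python
import struct


def pack_trill_header(egress_nick, ingress_nick, hop_count,
                      multi_dest=False, version=0, op_len=0):
    """Pack V, R, M, OpLen, HC plus the egress and ingress RB nicknames."""
    flags = (((version & 0x3) << 14)       # V: 2 bits
             | (int(multi_dest) << 11)     # M: 1 bit (the 2 reserved bits stay 0)
             | ((op_len & 0x1F) << 6)      # OpLen: 5 bits
             | (hop_count & 0x3F))         # HC: 6 bits
    return struct.pack("!HHH", flags, egress_nick, ingress_nick)


def unpack_trill_header(data):
    """Decode the first six bytes of a TRILL header into a dict."""
    flags, egress_nick, ingress_nick = struct.unpack("!HHH", data[:6])
    return {
        "version": flags >> 14,
        "multi_dest": bool(flags & 0x0800),
        "op_len": (flags >> 6) & 0x1F,
        "hop_count": flags & 0x3F,
        "egress_nick": egress_nick,
        "ingress_nick": ingress_nick,
    }
```

For example, a UC frame from an ingress RB with nickname 4 to an egress RB with nickname 6 and a hop count of 5 would be packed with `pack_trill_header(6, 4, 5)`; the nickname values here are arbitrary.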
With reference now to FIG. 11, there is illustrated a high level logical flowchart of an exemplary process by which an edge RB (or vRB) of TRILL campus 200 implements forwarding for multidestination (MC/BC/DLF) traffic ingressing TRILL campus 200 in accordance with one embodiment. The process begins at block 1100 and then proceeds to block 1102, which depicts an edge RB of TRILL campus 200 receiving a multidestination data frame at an access port (e.g., a port 602 coupled to one of external links 212 a-212 i). In response to receipt of the multidestination data frame, the edge RB determines at block 1104 if the multidestination data frame is an MC data frame. If, for example, the data frame is an Ethernet data frame, an MC data frame can be detected by determining whether the least significant bit of the first octet of the DMAC specified by the data frame is set. In response to a determination at block 1104 that the data frame is not an MC data frame, the process proceeds to block 1112, which is described below. If, however, the edge RB determines at block 1104 that the multidestination frame is an MC data frame, the edge RB performs a lookup in FDB data structure 640 based on a tuple including the DMAC and VLAN specified in the data frame (block 1106).
As indicated at block 1110, if a matching entry is located in FDB data structure 640, the vport for the distribution tree for the multidestination data frame is returned, and the process proceeds to block 1114, which is described below. If, however, no matching entry for the multidestination data frame is found in FDB data structure 640, the edge RB accesses VLAN data structure 652 to obtain the vport for the distribution tree (block 1112). In addition, the edge RB accesses vport data structure 644 and MC bitmap data structure 650 to obtain L2 and L3 bitmaps for the data frame (block 1114).
The edge RB then sends a copy of the native data frame out of each local access port, if any, indicated by the L2 bitmap (block 1116), which are the local access port(s) of the edge RB connected to external links 212 outside of TRILL campus 200. In addition, the edge RB adds a TRILL header and an outer encapsulating Ethernet header to the data frame and sends the encapsulated data frame out of each local network port, if any, of TRILL campus 200 indicated by the L3 bitmap (block 1118). Thereafter, the multidestination forwarding process of FIG. 11 terminates at block 1120.
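The multidestination ingress process of FIG. 11 can be sketched as follows. This is a minimal illustration under assumed table shapes: the FDB and VLAN tables are plain dictionaries, bitmaps are Python integers whose set bit positions stand for port numbers, and all function names are hypothetical.

```python
def is_group_address(dmac: bytes) -> bool:
    """True if the least significant bit of the first DMAC octet is set."""
    return bool(dmac[0] & 0x01)


def set_bits(bitmap: int):
    """Return the positions of the bits that are set in an integer bitmap."""
    return [i for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]


def forward_multidest_ingress(fdb, vlan_table, vport_table, mc_bitmaps, dmac, vlan):
    """Return (access_ports, network_ports) that receive a copy of the frame."""
    tree_vport = None
    if is_group_address(dmac):              # block 1104
        tree_vport = fdb.get((dmac, vlan))  # blocks 1106-1110
    if tree_vport is None:
        tree_vport = vlan_table[vlan]       # block 1112: BC/DLF tree for the VLAN
    _root_rb, bitmap_index = vport_table[tree_vport]   # block 1114
    l2_bitmap, l3_bitmap = mc_bitmaps[bitmap_index]
    # Block 1116 sends native copies; block 1118 sends TRILL-encapsulated copies.
    return set_bits(l2_bitmap), set_bits(l3_bitmap)
```

A frame whose DMAC does not match any FDB entry falls through to the per-VLAN distribution tree, which matches the handling of BC and DLF traffic described above.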
Referring now to FIG. 12, there is illustrated a high level logical flowchart of an exemplary process by which an RB (or vRB) of TRILL campus 200 implements forwarding for UC traffic received at a network port coupled to an internal link 202 of TRILL campus 200 in accordance with one embodiment. The process begins at block 1200 and then proceeds to block 1202, which depicts an RB of TRILL campus 200 receiving a UC data frame at a network port coupled to an internal link 202 of TRILL campus 200. In response to receipt of the UC data frame, the RB performs a lookup in RB data structure 642 based on the egress RB specified in egress RB nickname field 1032 of the TRILL header of the data frame. As indicated at block 1206, if no matching entry for the data frame is found in RB data structure 642, the RB discards the UC data frame. Thereafter, the UC forwarding process depicted in FIG. 12 ends at block 1230.
Returning to block 1206, in response to finding a matching entry for the egress RB in RB data structure 642, the RB determines at block 1210 whether or not the egress port indicated by RB data structure 642 is a local access port, that is, a port connected to an external link 212. If not (i.e., the egress port is a network port), the process proceeds to block 1220, which is described below. If, however, the RB determines at block 1210 that the egress port is a local access port, the RB performs MAC learning for the data frame, if enabled (block 1212). An exemplary process for MAC learning is described below with reference to FIG. 18. The RB then decapsulates the native L2 data frame by removing outer Ethernet header 1020 and the TRILL header (block 1214) and sends the native L2 data frame out of the local access port indicated by RB data structure 642.
Referring to block 1220, the RB determines whether ECMP is enabled. If not, the process proceeds to block 1224, described below. If, however, ECMP is enabled, the RB accesses ECMP data structure 646 to determine the next hop for the data frame (block 1222). Following either block 1220 (if ECMP is disabled) or block 1222 (if ECMP is enabled), the RB accesses next hop data structure 648 to retrieve information for the next hop interface (block 1224). Thereafter, the RB modifies the outer encapsulating Ethernet header of the UC data frame to specify the appropriate source and destination MAC addresses (block 1226) and sends the data frame out of a local network port to the next hop in TRILL campus 200 (block 1228). Thereafter, the UC forwarding process depicted in FIG. 12 terminates at block 1230.
With reference now to FIG. 13, there is illustrated a high level logical flowchart of an exemplary process by which an RB (or vRB) of TRILL campus 200 implements forwarding for MC data frames received at a network port connected to an internal link 202 of TRILL campus 200 in accordance with one embodiment. The process begins at block 1300 and then proceeds to block 1302, which depicts an RB of TRILL campus 200 receiving an MC data frame at a network port (e.g., a port 602 coupled to one of internal links 202 of TRILL campus 200). In response to receipt of the MC data frame, the RB performs a lookup in FDB data structure 640 based on a tuple including the RB and the DMAC and VLAN specified in the data frame (block 1304).
As indicated at block 1306, if no matching entry is located in FDB data structure 640, the process proceeds to block 1320, which is described below. In response to the RB locating a matching entry for the MC data frame in FDB data structure 640, the vport for the distribution tree for the MC data frame is returned, and the process proceeds to block 1310. At block 1310, the RB accesses vport data structure 644 and MC bitmap data structure 650 to obtain L2 and L3 bitmaps for the data frame. The RB then sends a copy of the data frame out of each local access port, if any, indicated by the L2 bitmap (block 1312), which are the local access port(s) of the RB connected to external links 212 outside of TRILL campus 200. In addition, the RB sends a copy of the data frame out of each local network port, if any, of TRILL campus 200 indicated by the L3 bitmap after updating the outer encapsulating Ethernet header of the MC data frame to specify the appropriate source MAC addresses (block 1314). Thereafter, the MC forwarding process of FIG. 13 terminates at block 1330.
Referring now to block 1320, the RB performs a lookup for the MC data frame in FDB data structure 640 based on a tuple including the identifier of the RB and the VLAN specified by the MC data frame. If the RB determines at block 1322 that a matching entry for the MC data frame is found in FDB data structure 640, the RB forwards the MC data frame as has been described with respect to blocks 1310-1314. If, however, the RB determines at block 1322 that no matching entry for the data frame is present in FDB data structure 640, the RB performs a lookup in RB data structure 642 utilizing the egress RB specified in egress RB nickname field 1032 of the TRILL header of the data frame (block 1324). If the RB determines at block 1326 that a matching entry for the data frame is present in RB data structure 642, the RB forwards the MC data frame as has been described with respect to blocks 1310-1314. If, however, the RB determines at block 1326 that no matching entry for the data frame is found in RB data structure 642, the RB discards the data frame at block 1328. Thereafter, the MC data frame forwarding process depicted in FIG. 13 terminates at block 1330.
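The three-stage lookup of FIG. 13 (most specific key first, then progressively broader keys, then discard) can be sketched in a few lines of Python. The dictionary tables and the function name are illustrative assumptions, and the `rb` argument stands for whichever RB identifier the FDB tuple uses.

```python
def resolve_tree_vport(fdb, rb_table, rb, dmac, vlan, egress_nick):
    """Return the distribution tree vport for an MC frame, or None to discard."""
    # Block 1304 tries (RB, DMAC, VLAN); block 1320 falls back to (RB, VLAN).
    for key in ((rb, dmac, vlan), (rb, vlan)):
        if key in fdb:
            return fdb[key]
    # Blocks 1324-1328: last resort is the RB table keyed by the egress RB
    # nickname; a miss (None) corresponds to discarding the frame.
    return rb_table.get(egress_nick)
```

When a vport is returned, forwarding proceeds through the L2 and L3 bitmaps as in blocks 1310-1314.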
Referring now to FIG. 14, there is depicted a high level logical flowchart of an exemplary process by which an ACL installed at an egress t-LAG port of an edge RB of a TRILL campus 200 can be applied to prevent frame looping for multidestination traffic in accordance with one embodiment. The process begins at block 1400 in response to an edge RB of a TRILL campus receiving a data frame at an egress local access port configured as part of a t-LAG. As indicated at block 1402, the RB determines whether the data frame is a TRILL MC data frame, for example, by examining the multicast bit in TRILL header fields 1030. In response to a determination that the data frame is not a TRILL MC data frame, the RB allows the data frame to egress through the local access port (block 1406).
If, however, the RB determines at block 1402 that the data frame is a TRILL MC data frame, the RB applies an ACL at block 1404 by determining whether or not the RB identified in ingress RB nickname field 1034 of the TRILL header is a peer RB belonging to the same t-LAG cluster as the current RB. If not, the RB allows the data frame to egress through the local access port (block 1406). If, however, the RB determines that the RB identified in the ingress RB nickname field 1034 of the TRILL header is a peer RB belonging to the same t-LAG cluster as the current RB, the RB enforces the ACL by discarding the data frame (block 1408), thus preventing frame looping. Following block 1406 or block 1408, the process depicted in FIG. 14 terminates at block 1410.
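The ACL check of FIG. 14 reduces to a small predicate, sketched below in Python. The function and parameter names are hypothetical; the set of peer nicknames is assumed to be configured on each t-LAG-enabled RB.

```python
def tlag_egress_permitted(multi_dest_bit, ingress_nick, cluster_peer_nicks):
    """Decide whether a frame may egress through a local t-LAG access port."""
    if not multi_dest_bit:
        # Block 1402: not a TRILL multidestination frame, so it may egress.
        return True
    # Blocks 1404-1408: drop frames that entered the campus at a peer RB of
    # the same t-LAG cluster, since they would loop back to the source.
    return ingress_nick not in cluster_peer_nicks
```

For RB6 in a cluster with RB4, for instance, the peer set would contain the nickname of RB4, so a broadcast frame that ingressed at RB4 would not be sent back out of the t-LAG at RB6.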
As an alternative to the process depicted in FIG. 14, source pruning for TRILL multidestination frames can be performed by employing a different distribution tree for frames entering at different switch RBs supporting a t-LAG cluster. For example, in TRILL campus 200, vRB9 can implement source pruning for multidestination traffic by employing different distribution trees for each combination of switch RB (i.e., RB4 or RB6) and t-LAG.
In a conventional TRILL campus, MAC learning is performed at egress RBs to bind the SMAC of a data frame exiting the TRILL campus to the ingress RB. A prior art MAC learning process in a conventional TRILL campus is shown in FIG. 31. The depicted process begins at block 3100 and then proceeds to block 3102, which illustrates an egress RB of a conventional TRILL campus receiving a TRILL data frame at one of its local network ports. In response to receipt of the TRILL data frame, the egress RB performs an RB lookup in its RB data structure based on the egress RB nickname specified in the TRILL header of the TRILL data frame (block 3104).
If the egress RB determines at block 3106 that the destination port returned by the RB lookup is not a local access port, then the conventional MAC learning process depicted in FIG. 31 terminates at block 3120. If, on the other hand, the egress RB determines at block 3106 that the destination port returned by the RB lookup is a local access port (i.e., the local RB is the egress RB for the TRILL data frame), then the egress RB performs hardware SMAC learning and binds the SMAC to the ingress RB indicated by the TRILL header of the TRILL data frame (block 3110). Thereafter, the process depicted in FIG. 31 ends at block 3120.
In a preferred embodiment, the conventional MAC learning process depicted in FIG. 31 is replaced in TRILL campus 200 with a more comprehensive MAC learning methodology supporting the use of t-LAGs and t-LAG clusters as described herein. This comprehensive MAC learning methodology includes MAC learning at t-LAG ports of ingress RBs (e.g., as depicted in FIG. 15), MAC learning at RBs in the same t-LAG cluster as an edge RB (e.g., as illustrated in FIG. 16), and MAC learning at egress RBs that binds SMACs to ingress vports (e.g., as depicted in FIG. 18).
With reference now to FIG. 15, there is illustrated a high level logical flowchart of an exemplary process by which an ingress RB of a TRILL campus 200 performs MAC learning at a t-LAG port in accordance with one embodiment. As shown, the process begins at 1500 and then proceeds to block 1502, which depicts an ingress RB of a TRILL campus 200 receiving a native L2 data frame at a local access port connected to an external link 212. In response to receipt of the native L2 data frame, the ingress RB performs a lookup of the data frame in FDB data structure 640 using the SMAC address specified by the data frame (block 1504).
The ingress RB then determines at block 1506 whether or not the FDB entry obtained by the lookup performed at block 1504 is newly learned at a local access port that is configured in a t-LAG. If not, the process depicted in FIG. 15 terminates at block 1520. If, however, a determination is made at block 1506 that the entry obtained by the FDB lookup is newly learned at a local access port configured in a t-LAG, the contents of the FDB entry are passed to software for MAC learning (block 1510). Software accordingly binds the SMAC of the data frame to the ingress vRB if the ingress local access port is a t-LAG port (block 1512). Binding the SMAC of the data frame to the ingress vRB (rather than ingress RB) in this manner supports the automatic load balancing and fault tolerant communication described herein. The ingress RB then passes the contents of the FDB entry to all other RBs of TRILL campus 200 via ESADI (block 1514). Thereafter, the process illustrated in FIG. 15 ends at block 1520.
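The ingress-side learning of FIG. 15 can be sketched as follows. This is an illustrative Python model only: the FDB is a dictionary keyed by (SMAC, VLAN), `tlag_vrb_by_port` is an assumed mapping from each t-LAG access port to its vRB, and the returned dictionary stands in for the information that would be distributed via ESADI.

```python
def learn_at_ingress(fdb, tlag_vrb_by_port, smac, vlan, port):
    """Learn an SMAC at an access port; return an advertisement or None."""
    key = (smac, vlan)
    if key in fdb:
        return None                     # block 1506: entry is not newly learned
    vrb = tlag_vrb_by_port.get(port)    # None if the port is not in a t-LAG
    fdb[key] = (port, vrb)
    if vrb is None:
        return None                     # block 1506: not a t-LAG port
    # Blocks 1510-1514: bind the SMAC to the ingress vRB and advertise it.
    return {"smac": smac, "vlan": vlan, "rb": vrb}
```

Binding to the vRB rather than to the switch RB is what allows remote RBs to treat all member links of the t-LAG as one destination.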
Referring now to FIG. 16, there is depicted a high level logical flowchart of an exemplary process by which an egress RB of a TRILL campus 200 performs MAC learning in accordance with one embodiment. The process begins at block 1600 and then proceeds to block 1602, which illustrates an egress RB of TRILL campus 200 receiving an ESADI frame from another RB in TRILL campus 200. The ESADI frame can be originated, for example, at block 1514 of the ingress RB MAC learning process depicted in FIG. 15. In response to receipt of the ESADI frame, the egress RB determines at block 1604 whether or not it is configured within a common t-LAG cluster with the remote RB from which the ESADI frame originated. If not, the egress RB configures its switch controller 630 to bind the SMAC to a vport for the ingress vRB of the traffic flow (block 1610). If, however, the egress RB determines at block 1604 that it is configured in the same t-LAG cluster as the remote RB, the egress RB configures its switch controller 630 to bind the SMAC to a local t-LAG port of the t-LAG cluster (block 1606). Following either block 1606 or block 1610, the egress RB MAC learning process depicted in FIG. 16 terminates at block 1612.
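The branch at block 1604 can be sketched in Python as shown below. The advertisement is modeled as a dictionary with assumed keys `smac`, `vlan`, and `rb` (the vRB to which the SMAC was bound at ingress); the two lookup tables and the function name are likewise hypothetical.

```python
def apply_esadi_advert(fdb, advert, origin_rb, cluster_peers,
                       local_tlag_port_by_vrb, vport_by_rb):
    """Install a MAC entry learned from an ESADI advertisement."""
    key = (advert["smac"], advert["vlan"])
    if origin_rb in cluster_peers:
        # Block 1606: a peer in the same t-LAG cluster learned the SMAC, so
        # the client is reachable through this RB's own t-LAG port.
        fdb[key] = local_tlag_port_by_vrb[advert["rb"]]
    else:
        # Block 1610: bind the SMAC to the vport for the ingress vRB.
        fdb[key] = vport_by_rb[advert["rb"]]
    return fdb[key]
```

The peer case is what provides the FDB synchronization between cluster members, so that known UC traffic arriving at either peer can egress locally without flooding.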
With reference now to FIG. 17, there is illustrated a high level logical flowchart of an exemplary method of configuring a RB of a TRILL campus to support a t-LAG in accordance with one embodiment. The process begins at block 1700 and then proceeds to block 1702, which illustrates a RB of TRILL campus 200 receiving a t-LAG configuration specifying which ports 602 of the RB belong to a link aggregation group (LAG). In response to receipt of the t-LAG configuration, the RB configures switch controller 630 to map both vRB(s) and switch-based (i.e., physical) RBs in the same t-LAG to the same vport (block 1704). Mapping both vRBs and RBs in the same t-LAG to the same vport in this manner supports the egress RB MAC learning process described below with reference to FIG. 18. Following block 1704, the process of FIG. 17 ends at block 1706.
Referring now to FIG. 18, there is depicted a high level logical flowchart of an exemplary process by which an egress RB of a TRILL campus implements MAC learning in response to a TRILL data frame in accordance with one embodiment. As shown, the process begins at block 1800 and then proceeds to block 1802, which illustrates an egress RB of TRILL campus 200 receiving a TRILL data frame 1000 as illustrated in FIG. 10 via an internal link 202 of TRILL campus 200 at one of its local network ports. In response to receipt of the TRILL data frame, the egress RB performs a lookup in RB data structure 642 based on the egress RB nickname specified in the egress RB nickname field 1032 of the TRILL header (block 1804).
If the egress RB determines the destination port returned by the RB lookup is not a local access port, but is instead a vport for a remote RB (block 1806), then the process depicted in FIG. 18 terminates at block 1820. If, on the other hand, the egress RB determines at block 1806 that the destination port returned by the RB lookup is a local access port, then the egress RB again performs a lookup in RB data structure 642 based on the ingress RB nickname specified in the ingress RB nickname field 1034 of the TRILL header in order to determine the ingress vport (block 1810). As noted above with reference to FIG. 17, both the ingress RB and any related vRB are preferably configured to map to the same vport. The egress RB then performs hardware SMAC learning and binds the SMAC to the ingress vport returned by the second RB lookup (block 1812). Thereafter, the process depicted in FIG. 18 ends at block 1820.
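The interaction between the configuration step of FIG. 17 and the egress learning of FIG. 18 can be illustrated with the following Python sketch. The RB table is modeled as a dictionary from nickname to a (kind, port) pair; the function names and table layout are assumptions made for illustration.

```python
def map_tlag_to_vport(rb_table, vport, vrb, switch_rbs):
    """FIG. 17, block 1704: map the vRB and its switch RBs to one vport."""
    for nick in (vrb, *switch_rbs):
        rb_table[nick] = ("vport", vport)


def learn_at_egress(fdb, rb_table, egress_nick, ingress_nick, smac, vlan):
    """FIG. 18: bind the SMAC to the ingress vport if this RB is the egress RB."""
    dest = rb_table.get(egress_nick)              # block 1804
    if dest is None or dest[0] != "lport":
        return False                              # block 1806: not the egress RB
    ingress_vport = rb_table[ingress_nick][1]     # block 1810: second lookup
    fdb[(smac, vlan)] = ingress_vport             # block 1812
    return True
```

Because RB4, RB6, and vRB9 would all resolve to the same vport in this model, a frame carrying the switch RB nickname in its ingress nickname field still produces a binding that is equivalent to a binding to the vRB.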
With reference now to FIG. 19, there is illustrated a high level logical flowchart of an exemplary process by which a RB of TRILL campus 200 provides fault-tolerant communication via a t-LAG cluster in accordance with one embodiment. The process begins at block 1900 and then proceeds to block 1902, which depicts an RB supporting a t-LAG (hereafter assumed for the sake of example to be RB4) determining whether or not a t-LAG link-down event has been detected for one of its external links 212. If not, the process iterates at block 1902 until a t-LAG link-down event is detected for one of its external links 212.
In response to RB4 detecting a link-down event for one of its external links 212, RB4 determines at block 1904 whether the number of its currently downed links exceeds a predetermined threshold (in at least some embodiments, RBs (or vRBs at different RBs) can have different numbers of external links and different thresholds). If so, the process proceeds to block 1920, which is described below. If, however, RB4 determines at block 1904 that the number of its currently downed links does not exceed the predetermined threshold, the process proceeds to block 1910.
At block 1910, RB4 utilizes an ISL of the t-LAG cluster to redirect egress traffic of TRILL campus 200 that was directed to the downed link. For example, FIG. 20 depicts an exemplary flow of UC traffic via TRILL campus 200 to an external node (i.e., switch 202) via a t-LAG 230 a prior to a link down event. If an external link, such as external link 212 c, fails as shown in FIG. 21, RB4″, the extra-campus component of RB4, redirects the UC traffic via t-LAG ISL 300 to the peer RB (i.e., RB6″) in the same t-LAG cluster for egress through a healthy t-LAG link, such as link 212 d.
As depicted at block 1912, during and after the redirection RB4 continues to monitor to determine if its downed external link 212 has been restored. In response to detection that the downed external link (e.g., link 212 c) is restored, RB4 reverts communication of egress traffic from t-LAG ISL 300 to the restored external link (block 1914). Thereafter, the process returns to block 1902, which has been described.
Referring now to block 1920, in response to a determination of RB4 that the number of its downed external links 212 exceeds the threshold, RB4 reports a link-down condition (e.g., via TRILL IS-IS) to TRILL campus 200 indicating that the connectivity between its intra-campus RB (i.e., RB4′) and the vRB including its extra-campus RB (i.e., RB4″) is down (even though the actual link failure events impact external links 212). In response, TRILL campus 200 automatically reroutes traffic that was previously routed to RB4 to a peer RB of the t-LAG cluster for egress. One example of this rerouting behavior is shown in FIG. 22, which depicts TRILL campus 200 automatically rerouting egress UC traffic intended for switch 202 from RB4 to RB6, which transmits the egress traffic to switch 202 via external link 212 d of the t-LAG cluster.
During the rerouting illustrated at block 1920, RB4 further determines at block 1922 whether or not one or more of its downed external links 212 have been restored. If so, RB4 additionally determines at block 1924 whether or not the number of its external links 212 that are down still exceeds the threshold. If so, the process returns to block 1922. If, however, RB4 determines at block 1924 that the restoration of one or more external links 212 has caused the number of its external links 212 that are down to not exceed the threshold, RB4 communicates to TRILL campus 200 a link-up event for the link between its intra-campus RB (i.e., RB4′) and the vRB (i.e., vRB9) including its extra-campus RB (i.e., RB4″). In response, TRILL campus 200 re-establishes routing for the egress traffic through RB4, as shown in FIG. 20.
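The threshold decision that drives FIG. 19 can be summarized by a small Python function. The function name and the returned action labels are hypothetical; the sketch only captures which of the two recovery mechanisms applies for a given count of downed external links.

```python
def tlag_failover_action(downed_links, threshold):
    """Select the recovery behavior for an RB supporting a t-LAG."""
    if downed_links == 0:
        return "forward_locally"      # block 1902: no link-down event pending
    if downed_links > threshold:
        # Block 1920: report a link-down between the intra-campus RB and the
        # vRB so that the campus reroutes egress traffic to a peer RB.
        return "report_link_down"
    # Block 1910: redirect traffic for the downed link over the t-LAG ISL.
    return "redirect_via_isl"
```

Re-evaluating the function after each link-down or link-restoration event reproduces the transitions described above, including the return from campus-level rerouting once the count of downed links no longer exceeds the threshold.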
FIGS. 23-24 illustrate that the same technique depicted in FIG. 19 can be utilized to provide fault-tolerant communication for multidestination traffic ingressing at a t-LAG cluster. For example, assuming a multidestination flow (e.g., broadcast flow) ingresses TRILL campus 200 on external link 212 h of t-LAG 230 c, vRB9 can distribute the multidestination flow to TRILL campus 200 and its external nodes utilizing a distribution tree rooted at vRB9, as shown in FIG. 23. In response to a link-down event for an external link (e.g., external link 212 c) of a t-LAG cluster supporting the multidestination flow, RB4 can utilize t-LAG ISL 300 to redirect the multidestination traffic to the peer RB (RB6″) in the same t-LAG cluster in order to send out the egress frames to external switch 202, as shown in FIG. 24. Further, in response to failure of a number of external links 212 of the t-LAG cluster that exceeds a threshold, RB4 can report a link down between its intra-campus RB4′ and vRB9 to TRILL campus 200 in order to enforce use of a different primary link for the egress multidestination traffic directed to the external node coupled to TRILL campus 200 by the downed link until the number of its downed external links is less than or equal to the threshold.
In support of the fault tolerant communication process depicted in FIG. 19, dynamic reconfiguration of RBs is preferably implemented as now described with reference to FIGS. 25-30.
With reference now to FIG. 25, there is illustrated a high level logical flowchart of an exemplary process by which a t-LAG-enabled RB of a TRILL campus is configured by default at startup in accordance with one embodiment. The process begins at block 2500 following startup of a t-LAG-enabled RB of TRILL campus 200 (e.g., RB4 or RB6). The t-LAG-enabled RB then applies a default configuration for traffic flow in the t-LAG, as depicted at blocks 2502-2508. Specifically, the t-LAG-enabled RB configures its switch controller 630 to not allow traffic to flow from any local access port or local network port to the port for t-LAG ISL 300 (blocks 2502 and 2504). In addition, the t-LAG-enabled RB configures its switch controller 630 to not allow traffic to flow from the port for t-LAG ISL 300 to any local access port or local network port (blocks 2506 and 2508). Thereafter, the default t-LAG configuration process illustrated in FIG. 25 ends at block 2510.
Referring now to FIG. 26, there is depicted a high level logical flowchart of an exemplary configuration process at a t-LAG-enabled RB of a TRILL campus in response to a local link-up event in accordance with one embodiment. As shown, the process begins at block 2600 and then proceeds to block 2602, which illustrates a t-LAG-enabled RB of TRILL campus 200 (e.g., RB4 or RB6) detecting a link-up event on a local t-LAG of the RB (block 2602). In response to detecting the link-up event, the t-LAG-enabled RB notifies its peer RB in the t-LAG cluster of the link-up event (block 2604).
In addition, the t-LAG-enabled RB determines at block 2606 whether or not the t-LAG supported by the peer RB of the t-LAG cluster is currently up. If not, the t-LAG-enabled RB configures its switch controller 630 to allow traffic to flow from the port connected to the t-LAG ISL 300 to the local port having the link-up event detected at block 2602 (block 2620). The process then proceeds to block 2622, which depicts the t-LAG-enabled RB initiating a t-LAG reconfiguration, as described in detail below with reference to FIG. 27. Thereafter, the process depicted in FIG. 26 ends at block 2630.
Returning to block 2606, in response to a determination that the t-LAG supported by the peer RB of the t-LAG cluster is currently up, the t-LAG-enabled RB configures its switch controller 630 to not allow traffic to flow from the port connected to the t-LAG ISL 300 to the local port having the link-up event detected at block 2602 (block 2610). In addition, for all MAC entries learned at the t-LAG, the t-LAG-enabled RB updates the MAC entries to bind to the local port that just experienced the link-up event (block 2612). From block 2612, the process proceeds to blocks 2622 and 2630, which have been described.
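The local link-up handling of FIG. 26 might be sketched as follows. The dictionary-based state, the callback parameters `notify_peer` and `reconfigure`, and all key names are hypothetical choices for the example; only the branch structure follows the flowchart.

```python
ISL = "isl"  # hypothetical label for the port attached to t-LAG ISL 300


def on_local_link_up(state, port, peer_tlag_up, notify_peer, reconfigure):
    """Handle a link-up event on local t-LAG port `port`.

    `state` is a dict with:
      "isl_to_port": port -> bool, whether ISL traffic may egress on that port
      "tlag_macs":   MAC addresses learned at the t-LAG
      "mac_table":   MAC -> port binding
    """
    notify_peer("link-up", port)                # block 2604
    if peer_tlag_up:                            # block 2606
        state["isl_to_port"][port] = False      # block 2610
        for mac in state["tlag_macs"]:          # block 2612
            state["mac_table"][mac] = port
    else:
        state["isl_to_port"][port] = True       # block 2620
    reconfigure(state)                          # block 2622


sent = []
state = {"isl_to_port": {}, "tlag_macs": ["aa:bb"], "mac_table": {"aa:bb": ISL}}
on_local_link_up(state, "p1", True, lambda *e: sent.append(e), lambda s: None)
print(state["mac_table"], state["isl_to_port"], sent)
```

In this run the peer's t-LAG is already up, so the MAC entry that pointed at the ISL is rebound to the newly active local port and ISL-to-port forwarding stays closed.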
With reference now to FIG. 27, there is illustrated a high level logical flowchart of an exemplary t-LAG reconfiguration process in accordance with one embodiment. The process is performed, for example, at block 2622 of FIG. 26, as well as block 2810 of FIG. 28, block 2920 of FIG. 29 and block 3010 of FIG. 30, as described further below.
The process illustrated in FIG. 27 begins at block 2700 and thereafter proceeds to block 2702, which depicts a t-LAG-enabled RB of TRILL campus 200 determining whether or not any local t-LAG link of the RB is down while the t-LAG of a remote RB in the same t-LAG cluster is up, for example, as shown in FIG. 21. If not, the t-LAG-enabled RB configures its switch controller 630 to not allow traffic to flow from any local network port or from any local t-LAG port to the port connected to the t-LAG ISL 300 (blocks 2704 and 2706). If, on the other hand, the t-LAG-enabled RB makes an affirmative determination at block 2702, the t-LAG-enabled RB configures its switch controller 630 to allow traffic to flow from any local network port or from any local t-LAG port to the port connected to the t-LAG ISL 300 (blocks 2710 and 2712). Following either of blocks 2706 or 2712, the t-LAG reconfiguration process illustrated in FIG. 27 ends at block 2714.
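A minimal sketch of the reconfiguration decision of FIG. 27 appears below. It assumes link state is tracked per t-LAG identifier on both the local RB and the peer RB, and it reads block 2702 as a per-t-LAG comparison; both are interpretations for illustration, as are all names.

```python
ISL = "isl"  # hypothetical label for the port attached to t-LAG ISL 300


def reconfigure_tlag(allowed, local_up, remote_up, network_ports, tlag_ports):
    """Open or close the path from local ports to the ISL.

    local_up / remote_up map a t-LAG id to the link state seen on this RB and
    on the peer RB respectively. `allowed` is a set of (src, dst) port pairs.
    """
    # Block 2702: is some local t-LAG link down while the peer's link is up?
    need_isl = any(not local_up[t] and remote_up.get(t, False) for t in local_up)
    for port in list(network_ports) + list(tlag_ports):
        if need_isl:
            allowed.add((port, ISL))      # blocks 2710 and 2712
        else:
            allowed.discard((port, ISL))  # blocks 2704 and 2706
    return need_isl


allowed = set()
print(reconfigure_tlag(allowed, {"lag1": False}, {"lag1": True}, ["n1"], ["t1"]))
print(sorted(allowed))
```

With the local link of `lag1` down and the peer's link up, both the network port and the t-LAG port gain a path to the ISL; once the local link recovers, a second call removes those pairs again.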
Referring now to FIG. 28, there is depicted a high level logical flowchart of an exemplary configuration process at a t-LAG-enabled RB in response to a remote link-up event in accordance with one embodiment. As shown, the process begins at block 2800 and then proceeds to block 2802, which illustrates a t-LAG-enabled RB of TRILL campus 200 (e.g., RB4 or RB6) detecting a link-up event for a remote t-LAG in the same t-LAG cluster. For example, the t-LAG-enabled RB may detect the event based on a notification communicated by a peer RB in the t-LAG cluster as described at block 2604 of FIG. 26.
In response to detecting the link-up event for the remote t-LAG of the t-LAG cluster, the t-LAG-enabled RB determines at block 2804 whether or not the local t-LAG it supports is currently up. If not, the t-LAG-enabled RB initiates a t-LAG reconfiguration, as described with reference to FIG. 27 (block 2810). Thereafter, the process depicted in FIG. 28 ends at block 2812.
Returning to block 2804, in response to a determination by the t-LAG-enabled RB that its t-LAG is currently up, the t-LAG-enabled RB configures its switch controller 630 to not allow traffic to flow from the port connected to the t-LAG ISL 300 to its local t-LAG (block 2806). Thereafter, the process proceeds to blocks 2810 and 2812, which have been described.
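The remote link-up handling of FIG. 28 reduces to one conditional followed by reconfiguration. The sketch below assumes a single boolean, `isl_to_local_tlag`, stands in for the relevant switch controller setting; that key and the callback are hypothetical.

```python
def on_remote_link_up(state, local_tlag_up, reconfigure):
    """React to the peer RB reporting that its t-LAG link came up."""
    if local_tlag_up:                          # block 2804
        # Both links are up, so frames arriving over the ISL must not also be
        # sent out of the local t-LAG.
        state["isl_to_local_tlag"] = False     # block 2806
    reconfigure(state)                         # block 2810


state = {"isl_to_local_tlag": True}
calls = []
on_remote_link_up(state, local_tlag_up=True, reconfigure=calls.append)
print(state["isl_to_local_tlag"], len(calls))
```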
With reference now to FIG. 29, there is illustrated a high level logical flowchart of an exemplary configuration process at a t-LAG-enabled RB of a TRILL campus in response to a local t-LAG link-down event in accordance with one embodiment. The depicted process begins at block 2900 and then proceeds to block 2902, which illustrates a t-LAG-enabled RB of TRILL campus 200 (e.g., RB4 or RB6) detecting a link-down event on a local t-LAG of the RB (block 2902). In response to detecting the link-down event, the t-LAG-enabled RB notifies its peer RB in the t-LAG cluster of the link-down event (block 2904). The t-LAG-enabled RB also configures its switch controller 630 to not allow traffic to flow from the port connected to t-LAG ISL 300 to the port connected to the downed t-LAG link (block 2906).
The t-LAG-enabled RB additionally determines at block 2910 whether or not the t-LAG supported by the peer RB of the t-LAG cluster is currently up. If not, the t-LAG-enabled RB clears all the MAC entries learned for the t-LAG cluster. The process then proceeds to block 2920, which depicts the t-LAG-enabled RB initiating a t-LAG reconfiguration, as described with reference to FIG. 27. Thereafter, the process depicted in FIG. 29 ends at block 2922.
Returning to block 2910, in response to a determination that the t-LAG supported by the peer RB of the t-LAG cluster is currently up, the t-LAG-enabled RB updates the MAC entries to bind entries for the local t-LAG to the port connected to t-LAG ISL 300 (block 2914). From block 2914, the process proceeds to blocks 2920 and 2922, which have been described.
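The local link-down handling of FIG. 29 can be sketched in the same style. As before, the state dictionary, the key names, and the callbacks are assumptions for illustration; the MAC-clearing branch models "clears all the MAC entries learned for the t-LAG cluster" as removal of the tracked t-LAG addresses from a simple table.

```python
ISL = "isl"  # hypothetical label for the port attached to t-LAG ISL 300


def on_local_link_down(state, port, peer_tlag_up, notify_peer, reconfigure):
    """Handle a link-down event on local t-LAG port `port`."""
    notify_peer("link-down", port)             # block 2904
    state["isl_to_port"][port] = False         # block 2906
    if peer_tlag_up:                           # block 2910
        # The peer can still reach the external node: point entries at the ISL.
        for mac in state["tlag_macs"]:         # block 2914
            state["mac_table"][mac] = ISL
    else:
        # No path remains through the cluster: forget the learned entries.
        for mac in state["tlag_macs"]:
            state["mac_table"].pop(mac, None)
    reconfigure(state)                         # block 2920


state = {"isl_to_port": {"p1": True}, "tlag_macs": ["aa:bb"],
         "mac_table": {"aa:bb": "p1"}}
on_local_link_down(state, "p1", True, lambda *e: None, lambda s: None)
print(state["mac_table"], state["isl_to_port"])
```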
Referring now to FIG. 30, there is depicted a high level logical flowchart of an exemplary configuration process at a t-LAG-enabled RB in response to a remote link-down event in accordance with one embodiment. The depicted process begins at block 3000 and then proceeds to block 3002, which illustrates a t-LAG-enabled RB of TRILL campus 200 (e.g., RB4 or RB6) detecting a link-down event for a remote t-LAG in the same t-LAG cluster. For example, the t-LAG-enabled RB may detect the event based on a notification communicated by a peer RB in the t-LAG cluster as described at block 2904 of FIG. 29.
In response to detecting the link-down event for the remote t-LAG of the t-LAG cluster, the t-LAG-enabled RB determines at block 3004 whether or not the local t-LAG it supports is currently up. If not, the t-LAG-enabled RB initiates a t-LAG reconfiguration, as described with reference to FIG. 27 (block 3010). Thereafter, the process depicted in FIG. 30 ends at block 3012.
Returning to block 3004, in response to a determination by the t-LAG-enabled RB that its t-LAG is currently up, the t-LAG-enabled RB configures its switch controller 630 to allow traffic to flow from the port connected to the t-LAG ISL 300 to its local t-LAG (block 3006). Thereafter, the process proceeds to blocks 3010 and 3012, which have been described.
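The remote link-down handling of FIG. 30 is sketched below, together with a one-line rule that appears to summarize the ISL-to-local-t-LAG setting across FIGS. 26 and 28-30. The rule is an inference drawn from the four flowcharts rather than a statement in the text, and every name in the sketch is hypothetical.

```python
def on_remote_link_down(state, local_tlag_up, reconfigure):
    """React to the peer RB reporting that its t-LAG link went down."""
    if local_tlag_up:                          # block 3004
        # The local link is now the only way out, so accept redirected frames.
        state["isl_to_local_tlag"] = True      # block 3006
    reconfigure(state)                         # block 3010


def isl_to_local_allowed(local_up, remote_up):
    """Inferred steady state: forward from the ISL to the local t-LAG only
    when the local link is up and the peer's link is down."""
    return local_up and not remote_up


state = {"isl_to_local_tlag": False}
on_remote_link_down(state, local_tlag_up=True, reconfigure=lambda s: None)
print(state["isl_to_local_tlag"] == isl_to_local_allowed(True, False))
```

Keeping the setting tied to this single condition is what prevents a frame redirected over the ISL from being delivered twice to the same external node while both links are healthy.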
As has been described, the t-LAG support disclosed herein is designed to provide DMLT capability for external network nodes (e.g., switches or servers) connected to a TRILL campus, with all the links in a t-LAG used in an active-active mode for the same VLAN. The use of a virtual-RB for each t-LAG leads to efficient load distribution of UC traffic in the t-LAG. The use of this virtual-RB as the ingress RB in the TRILL encapsulation enables the MAC learning performed at egress RBs to be performed by hardware automatically. In cases in which the switch chips have difficulty in employing the virtual-RB as the source RB, the switch RB can alternatively be used as the ingress RB in TRILL encapsulation.
To support t-LAG, it is preferable if switch chips are capable of handling traffic for multiple RBs, but in some cases switch chips may lack such support in terms of capability or capacity. If such support is lacking, a t-LAG cluster including multiple RBs can be employed to adapt available hardware to provide t-LAG support. All the t-LAGs in a t-LAG cluster need to use just one virtual-RB in this case.
To eliminate frame duplication in a t-LAG for multidestination traffic, a link in a t-LAG is preferably selected as the primary link for multidestination transmission for each specific frame flow. The selection of the primary link for a t-LAG can be system-based or based on a combination of distribution tree, VLAN, and/or DMAC. Actions, such as enforcement of ACLs, are applied at egress RBs to make sure a multidestination frame will not be returned to its originating t-LAG.
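One way to realize a primary-link selection based on a combination of distribution tree, VLAN, and DMAC is a deterministic hash over those fields. The sketch below uses CRC32 purely as an example; the patent does not name a hash function, and the function name, key format, and link labels are assumptions.

```python
import zlib


def select_primary_link(links, tree_id, vlan, dmac):
    """Pick one t-LAG link per flow so that only it carries the frame.

    Every RB that evaluates this with the same inputs must reach the same
    answer, so the links are sorted and the MAC is normalized before hashing.
    """
    ordered = sorted(links)
    key = f"{tree_id}|{vlan}|{dmac.lower()}".encode()
    return ordered[zlib.crc32(key) % len(ordered)]


links = ["link-a", "link-b"]
a = select_primary_link(links, tree_id=9, vlan=10, dmac="FF:FF:FF:FF:FF:FF")
b = select_primary_link(list(reversed(links)), 9, 10, "ff:ff:ff:ff:ff:ff")
print(a == b)  # True: the choice does not depend on link order or MAC case
```

A system-based selection, the other option mentioned above, would correspond to ignoring the flow key and always returning the same configured link.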
Traffic handling in a t-LAG cluster is preferably separated into two domains: one for traffic routing within the TRILL campus and the other for the traffic switching in the regular L2 domain. It is recommended to totally separate the traffic handling in these two domains in a t-LAG cluster. A t-LAG ISL is utilized in a t-LAG cluster between peer RBs to handle the traffic redirection in the event of a local link failure on a t-LAG. The traffic redirection via the t-LAG ISL is employed until a new route or distribution tree for affected traffic can be determined and applied.
While the present invention has been particularly shown as described with reference to one or more preferred embodiments, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention. For example, although aspects have been described with respect to a data processing system (e.g., server computer, network switch, etc.) executing program code that directs the functions of the present invention, it should be understood that the present invention may alternatively be implemented as a program product including a data storage medium/device storing program code that can be processed by a data processing system to implement the functionality. The data storage medium/device can be, for example, an optical or magnetic disk, a volatile or non-volatile memory device, etc.

Claims (28)

What is claimed is:
1. A method of network bridging, the method comprising:
implementing, on each of first and second bridges of a data network having respective links to an external node, a network bridge component that forwards traffic inside the data network and a virtual bridge component that forwards traffic outside of the data network;
forming a virtual bridge including the virtual bridge components of the first and second bridges and an interswitch link (ISL) between the virtual bridge components of the first and second bridges; and
communicating data frames with each of multiple external network nodes outside the data network via a respective one of multiple link aggregation groups all commonly supported by the virtual bridge.
2. The method of claim 1, wherein:
the multiple link aggregation groups include a first link aggregation group having a first link coupling a first external node to the first bridge and a second link coupling the first external node to the second bridge; and
the communicating includes applying to a data frame ingressing into the data network via the first link an ingress bridge identifier identifying the first bridge as an ingress bridge rather than the virtual bridge.
3. The method of claim 2, wherein:
the data network is a Transparent Interconnection of Lots of Links (TRILL) network;
the applying comprises identifying the first bridge as the ingress bridge in a TRILL header of the data frame.
4. The method of claim 1, and further comprising:
performing L2 switching for traffic external to the data network utilizing the virtual bridge.
5. The method of claim 1, and further comprising:
applying an access control list (ACL) at the first and second bridges to implement source pruning for a multidestination flow ingressing on one of the multiple link aggregation groups.
6. The method of claim 1, and further comprising:
employing a first distribution tree for a first multidestination flow ingressing at the first bridge via one of the multiple link aggregation groups and employing a different second distribution tree for a second multidestination flow ingressing at the second bridge via the one of the multiple link aggregation groups.
7. The method of claim 1, and further comprising:
in the first bridge, implementing multiple different techniques for selecting a primary link for multidestination traffic among the multiple links in a link aggregation group among the multiple link aggregation groups, wherein at least one of the multiple different techniques selects the primary link based on one or more of a set including a VLAN identifier and a destination media access control address.
8. A data network, comprising:
a plurality of interconnected bridges implementing a common encapsulating layer two protocol, the plurality of bridges including first and second bridges each having respective links to an external node, a network bridge component that forwards traffic inside the data network and a virtual bridge component that forwards traffic outside of the data network;
wherein the first and second bridges are configured to support a virtual bridge including the virtual bridge components of the first and second bridges and an interswitch link (ISL) between the virtual bridge components of the first and second bridges; and
wherein the data network communicates data frames with each of multiple external network nodes outside the data network via a respective one of multiple link aggregation groups all commonly supported by the virtual bridge.
9. The data network of claim 8, wherein:
the multiple link aggregation groups include a first link aggregation group having a first link coupling a first external node to the first bridge and a second link coupling the first external node to the second bridge; and
the first bridge applies to a data frame ingressing into the data network via the first link an ingress bridge identifier identifying the first bridge as an ingress bridge rather than the virtual bridge.
10. The data network of claim 9, wherein:
the data network is a Transparent Interconnection of Lots of Links (TRILL) network;
the first bridge identifies itself as the ingress bridge in a TRILL header of the data frame.
11. The data network of claim 8, wherein the data network performs L2 switching for traffic external to the data network utilizing the virtual bridge.
12. The data network of claim 8, wherein:
each of the first and second bridges applies an access control list (ACL) to implement source pruning for a multidestination flow ingressing on one of the multiple link aggregation groups.
13. The data network of claim 8, wherein:
the first bridge is configured to employ a first distribution tree for a first multidestination flow ingressing at the first bridge via one of the multiple link aggregation groups; and
the second bridge is configured to employ a different second distribution tree for a second multidestination flow ingressing at the second bridge via the one of the multiple link aggregation groups.
14. The data network of claim 8, wherein the first bridge implements multiple different techniques for selecting a primary link for multidestination traffic among the multiple links in a link aggregation group among the multiple link aggregation groups, wherein at least one of the multiple different techniques selects the primary link based on one or more of a set including a VLAN identifier and a destination media access control address.
15. A switch for a data network including a plurality of interconnected bridges implementing a common encapsulating layer two protocol, the switch comprising:
multiple ports including at least one port connectable to an external link outside the data network;
a switch fabric that switches data frames between the multiple ports; and
a switch controller configured to implement:
a network bridge component that forwards traffic inside the data network and a first virtual bridge component that forwards traffic outside of the data network; and
a virtual bridge including the first virtual bridge component of the switch, a second virtual bridge component of another switch in the data network, and an interswitch link (ISL) between the first and second virtual bridge components;
wherein the switch communicates data frames with each of multiple external network nodes outside the data network via a respective one of multiple link aggregation groups all commonly supported by the virtual bridge.
16. The switch of claim 15, wherein:
the multiple link aggregation groups include a first link aggregation group having a first link coupling a first external node to the switch and a second link coupling the first external node to the another switch; and
the switch controller is configured to apply to a data frame ingressing into the data network via the first link an ingress bridge identifier identifying the switch as an ingress bridge rather than the virtual bridge.
17. The switch of claim 16, wherein:
the data network is a Transparent Interconnection of Lots of Links (TRILL) network;
the switch controller is configured to identify the switch as the ingress bridge in a TRILL header of the data frame.
18. The switch of claim 15, wherein the switch controller is configured to perform L2 switching for traffic external to the data network utilizing the virtual bridge.
19. The switch of claim 15, wherein:
the switch controller is configured to apply an access control list (ACL) to implement source pruning for a multidestination flow ingressing on one of the multiple link aggregation groups.
20. The switch of claim 15, wherein:
the switch controller is configured such that the virtual bridge employs different distribution trees for multidestination traffic depending on which of the multiple link aggregation groups the multidestination traffic ingresses on.
21. The switch of claim 15, wherein the switch controller is configured to implement multiple different techniques for selecting a primary link for multidestination traffic among the multiple links in a link aggregation group among the multiple link aggregation groups, wherein at least one of the multiple different techniques selects the primary link based on one or more of a set including a VLAN identifier and a destination media access control address.
22. A program product, comprising:
a data storage device;
program code stored within the data storage device and executable to cause a switch of a data network to perform:
forming a virtual bridge including a first virtual bridge component of the switch, a second virtual bridge component of another switch in the data network, and an interswitch link (ISL) between the first and second virtual bridge components;
communicating data frames with each of multiple external network nodes outside the data network via a respective one of multiple link aggregation groups all commonly supported by the virtual bridge.
23. The program product of claim 22, wherein:
the multiple link aggregation groups include a first link aggregation group having a first link coupling a first external node to the switch and a second link coupling the first external node to the another switch; and
the program code causes the switch to apply to a data frame ingressing into the data network via the first link an ingress bridge identifier identifying the switch as an ingress bridge rather than the virtual bridge.
24. The program product of claim 23, wherein:
the data network is a Transparent Interconnection of Lots of Links (TRILL) network;
the program code causes the switch to identify itself as the ingress bridge in a TRILL header of the data frame.
25. The program product of claim 22, wherein the program code causes the switch to perform L2 switching for traffic external to the data network utilizing the virtual bridge.
26. The program product of claim 22, wherein the program code causes the switch to apply an access control list (ACL) to implement source pruning for a multidestination flow ingressing on one of the multiple link aggregation groups.
27. The program product of claim 22, wherein the program code causes the virtual bridge to employ different distribution trees for multidestination traffic depending on which of the multiple link aggregation groups the multidestination traffic ingresses on.
28. The program product of claim 22, wherein the program code is executable to cause the switch to implement multiple different techniques for selecting a primary link for multidestination traffic among the multiple links in a link aggregation group among the multiple link aggregation groups, wherein at least one of the multiple different techniques selects the primary link based on one or more of a set including a VLAN identifier and a destination media access control address.
US13/314,455 2011-06-17 2011-12-08 Distributed link aggregation group (LAG) for a layer 2 fabric Active 2034-01-29 US9497073B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/314,455 US9497073B2 (en) 2011-06-17 2011-12-08 Distributed link aggregation group (LAG) for a layer 2 fabric

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201161498316P 2011-06-17 2011-06-17
US13/314,455 US9497073B2 (en) 2011-06-17 2011-12-08 Distributed link aggregation group (LAG) for a layer 2 fabric

Publications (2)

Publication Number Publication Date
US20120320926A1 US20120320926A1 (en) 2012-12-20
US9497073B2 true US9497073B2 (en) 2016-11-15

Family

ID=47353586

Family Applications (5)

Application Number Title Priority Date Filing Date
US13/314,455 Active 2034-01-29 US9497073B2 (en) 2011-06-17 2011-12-08 Distributed link aggregation group (LAG) for a layer 2 fabric
US13/315,463 Active 2032-11-25 US8750307B2 (en) 2011-06-17 2011-12-09 Mac learning in a trill network
US13/315,443 Active 2032-09-23 US8948003B2 (en) 2011-06-17 2011-12-09 Fault tolerant communication in a TRILL network
US13/655,975 Expired - Fee Related US8767738B2 (en) 2011-06-17 2012-10-19 MAC learning in a TRILL network
US13/780,530 Active US8948004B2 (en) 2011-06-17 2013-02-28 Fault tolerant communication in a trill network

Country Status (1)

Country Link
US (5) US9497073B2 (en)

Families Citing this family (113)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8665886B2 (en) 2009-03-26 2014-03-04 Brocade Communications Systems, Inc. Redundant host connection in a routed network
US8369335B2 (en) 2010-03-24 2013-02-05 Brocade Communications Systems, Inc. Method and system for extending routing domain to non-routing end stations
US9769016B2 (en) 2010-06-07 2017-09-19 Brocade Communications Systems, Inc. Advanced link tracking for virtual cluster switching
US9461840B2 (en) 2010-06-02 2016-10-04 Brocade Communications Systems, Inc. Port profile management for virtual cluster switching
US8989186B2 (en) 2010-06-08 2015-03-24 Brocade Communication Systems, Inc. Virtual port grouping for virtual cluster switching
US9270486B2 (en) 2010-06-07 2016-02-23 Brocade Communications Systems, Inc. Name services for virtual cluster switching
US9231890B2 (en) 2010-06-08 2016-01-05 Brocade Communications Systems, Inc. Traffic management for virtual cluster switching
US8867552B2 (en) 2010-05-03 2014-10-21 Brocade Communications Systems, Inc. Virtual cluster switching
US9716672B2 (en) 2010-05-28 2017-07-25 Brocade Communications Systems, Inc. Distributed configuration management for virtual cluster switching
US9001824B2 (en) 2010-05-18 2015-04-07 Brocade Communication Systems, Inc. Fabric formation for virtual cluster switching
US8885488B2 (en) 2010-06-02 2014-11-11 Brocade Communication Systems, Inc. Reachability detection in trill networks
US9806906B2 (en) 2010-06-08 2017-10-31 Brocade Communications Systems, Inc. Flooding packets on a per-virtual-network basis
US8446914B2 (en) 2010-06-08 2013-05-21 Brocade Communications Systems, Inc. Method and system for link aggregation across multiple switches
US9246703B2 (en) 2010-06-08 2016-01-26 Brocade Communications Systems, Inc. Remote port mirroring
US9608833B2 (en) 2010-06-08 2017-03-28 Brocade Communications Systems, Inc. Supporting multiple multicast trees in trill networks
US9628293B2 (en) 2010-06-08 2017-04-18 Brocade Communications Systems, Inc. Network layer multicasting in trill networks
US9807031B2 (en) 2010-07-16 2017-10-31 Brocade Communications Systems, Inc. System and method for network configuration
US9270572B2 (en) 2011-05-02 2016-02-23 Brocade Communications Systems Inc. Layer-3 support in TRILL networks
US9497073B2 (en) 2011-06-17 2016-11-15 International Business Machines Corporation Distributed link aggregation group (LAG) for a layer 2 fabric
US8879549B2 (en) 2011-06-28 2014-11-04 Brocade Communications Systems, Inc. Clearing forwarding entries dynamically and ensuring consistency of tables across ethernet fabric switch
US8948056B2 (en) 2011-06-28 2015-02-03 Brocade Communication Systems, Inc. Spanning-tree based loop detection for an ethernet fabric switch
US9401861B2 (en) 2011-06-28 2016-07-26 Brocade Communications Systems, Inc. Scalable MAC address distribution in an Ethernet fabric switch
US9407533B2 (en) 2011-06-28 2016-08-02 Brocade Communications Systems, Inc. Multicast in a trill network
US9007958B2 (en) 2011-06-29 2015-04-14 Brocade Communication Systems, Inc. External loop detection for an ethernet fabric switch
US8885641B2 (en) 2011-06-30 2014-11-11 Brocade Communication Systems, Inc. Efficient trill forwarding
US9736085B2 (en) 2011-08-29 2017-08-15 Brocade Communications Systems, Inc. End-to end lossless Ethernet in Ethernet fabric
US9960992B2 (en) * 2011-09-01 2018-05-01 Itxc Ip Holdings S.A.R.L. Systems and methods for routing data in a network
US20130083660A1 (en) * 2011-10-03 2013-04-04 Cisco Technology, Inc. Per-Group ECMP for Multidestination Traffic in DCE/TRILL Networks
US8750129B2 (en) 2011-10-06 2014-06-10 International Business Machines Corporation Credit-based network congestion management
US9699117B2 (en) * 2011-11-08 2017-07-04 Brocade Communications Systems, Inc. Integrated fibre channel support in an ethernet fabric switch
US9450870B2 (en) 2011-11-10 2016-09-20 Brocade Communications Systems, Inc. System and method for flow management in software-defined networks
CN102625986A (en) * 2011-12-09 2012-08-01 华为技术有限公司 Method, device and network equipment for processing loops in two layer network
US8995272B2 (en) 2012-01-26 2015-03-31 Brocade Communication Systems, Inc. Link aggregation in software-defined networks
WO2013117166A1 (en) * 2012-02-08 2013-08-15 Hangzhou H3C Technologies Co., Ltd. Implement equal cost multiple path of trill network
US9742693B2 (en) 2012-02-27 2017-08-22 Brocade Communications Systems, Inc. Dynamic service insertion in a fabric switch
US10103980B1 (en) * 2012-03-14 2018-10-16 Juniper Networks, Inc. Methods and apparatus for maintaining an integrated routing and bridging interface
US9154416B2 (en) 2012-03-22 2015-10-06 Brocade Communications Systems, Inc. Overlay tunnel in a fabric switch
US9025432B2 (en) * 2012-05-07 2015-05-05 Cisco Technology, Inc. Optimization for trill LAN hellos
CN102710510B (en) * 2012-05-18 2018-03-13 中兴通讯股份有限公司 Information processing method, apparatus and system
US9374301B2 (en) 2012-05-18 2016-06-21 Brocade Communications Systems, Inc. Network feedback in software-defined networks
US10277464B2 (en) 2012-05-22 2019-04-30 Arris Enterprises Llc Client auto-configuration in a multi-switch link aggregation
EP2853066B1 (en) 2012-05-23 2017-02-22 Brocade Communications Systems, Inc. Layer-3 overlay gateways
CN102724120B (en) * 2012-06-08 2015-12-16 华为技术有限公司 The method of transmitting announcement routing information and route-bridge
US9602430B2 (en) 2012-08-21 2017-03-21 Brocade Communications Systems, Inc. Global VLANs for fabric switches
US8717944B2 (en) * 2012-08-23 2014-05-06 Cisco Technology, Inc. TRILL optimal forwarding and traffic engineered multipathing in cloud switching
US9083645B2 (en) * 2012-09-07 2015-07-14 Dell Products L.P. Systems and methods providing reverse path forwarding compliance for a multihoming virtual routing bridge
US9401872B2 (en) 2012-11-16 2016-07-26 Brocade Communications Systems, Inc. Virtual link aggregations across multiple fabric switches
US9979595B2 (en) 2012-12-18 2018-05-22 Juniper Networks, Inc. Subscriber management and network service integration for software-defined networks having centralized control
US9100285B1 (en) 2012-12-18 2015-08-04 Juniper Networks, Inc. Dynamic control channel establishment for software-defined networks having centralized control
US8711855B1 (en) * 2012-12-18 2014-04-29 Juniper Networks, Inc. Topology discovery, control channel establishment, and datapath provisioning within an aggregation network with centralized control
CN103916319B (en) * 2013-01-06 2017-03-15 杭州华三通信技术有限公司 Link selecting method and stack equipment in LACP stacking networkings
US9350680B2 (en) 2013-01-11 2016-05-24 Brocade Communications Systems, Inc. Protection switching over a virtual link aggregation
US9413691B2 (en) 2013-01-11 2016-08-09 Brocade Communications Systems, Inc. MAC address synchronization in a fabric switch
US9548926B2 (en) 2013-01-11 2017-01-17 Brocade Communications Systems, Inc. Multicast traffic load balancing over virtual link aggregation
US8891516B2 (en) 2013-01-15 2014-11-18 International Business Machines Corporation Extended link aggregation (LAG) for use in multiple switches
US9565113B2 (en) 2013-01-15 2017-02-07 Brocade Communications Systems, Inc. Adaptive link aggregation and virtual link aggregation
US9356884B2 (en) 2013-01-17 2016-05-31 Cisco Technology, Inc. MSDC scaling through on-demand path update
US10616049B2 (en) 2013-01-25 2020-04-07 Dell Products, L.P. System and method for determining the configuration of switches in virtual link trunking environments
CN103973471B (en) * 2013-01-31 2018-11-02 中兴通讯股份有限公司 A kind of notifying method and device of TRILL distribution trees failure
CN103986650B (en) * 2013-02-07 2017-08-11 新华三技术有限公司 The treating method and apparatus that nickname conflicts in a kind of TRILL network
US9565099B2 (en) 2013-03-01 2017-02-07 Brocade Communications Systems, Inc. Spanning tree in fabric switches
US9143444B2 (en) 2013-03-12 2015-09-22 International Business Machines Corporation Virtual link aggregation extension (VLAG+) enabled in a TRILL-based fabric network
WO2014145750A1 (en) 2013-03-15 2014-09-18 Brocade Communications Systems, Inc. Scalable gateways for a fabric switch
US9699001B2 (en) 2013-06-10 2017-07-04 Brocade Communications Systems, Inc. Scalable and segregated network virtualization
US9565028B2 (en) 2013-06-10 2017-02-07 Brocade Communications Systems, Inc. Ingress switch multicast distribution in a fabric switch
CN104348726B (en) 2013-08-02 2018-12-11 新华三技术有限公司 Message forwarding method and device
US9806949B2 (en) 2013-09-06 2017-10-31 Brocade Communications Systems, Inc. Transparent interconnection of Ethernet fabric switches
US9912612B2 (en) 2013-10-28 2018-03-06 Brocade Communications Systems LLC Extended ethernet fabric switches
US9515918B2 (en) 2013-11-18 2016-12-06 International Business Machines Corporation Computing forwarding tables for link failures
CN104717140B (en) * 2013-12-11 2018-03-09 华为技术有限公司 Fault handling method and device for an edge routing bridge device in a TRILL network
US9300528B2 (en) 2013-12-13 2016-03-29 International Business Machines Corporation Trill network with multipath redundancy
CN104717089A (en) * 2013-12-16 2015-06-17 华为技术有限公司 Equipment switching method and routing bridge equipment and system
CN104753790B (en) * 2013-12-26 2018-05-04 华为技术有限公司 Message transmission method and device based on a TRILL network
CN104753782B (en) * 2013-12-26 2018-07-03 华为技术有限公司 Method and apparatus for sending a message in a Transparent Interconnection of Lots of Links (TRILL) network
US9548873B2 (en) 2014-02-10 2017-01-17 Brocade Communications Systems, Inc. Virtual extensible LAN tunnel keepalives
US10581758B2 (en) 2014-03-19 2020-03-03 Avago Technologies International Sales Pte. Limited Distributed hot standby links for vLAG
US10476698B2 (en) * 2014-03-20 2019-11-12 Avago Technologies International Sales Pte. Limited Redundent virtual link aggregation group
US10063473B2 (en) 2014-04-30 2018-08-28 Brocade Communications Systems LLC Method and system for facilitating switch virtualization in a network of interconnected switches
US9800471B2 (en) 2014-05-13 2017-10-24 Brocade Communications Systems, Inc. Network extension groups of global VLANs in a fabric switch
US10616108B2 (en) 2014-07-29 2020-04-07 Avago Technologies International Sales Pte. Limited Scalable MAC address virtualization
US9544219B2 (en) 2014-07-31 2017-01-10 Brocade Communications Systems, Inc. Global VLAN services
US9807007B2 (en) 2014-08-11 2017-10-31 Brocade Communications Systems, Inc. Progressive MAC address learning
WO2016048390A1 (en) * 2014-09-26 2016-03-31 Hewlett Packard Enterprise Development Lp Link aggregation configuration for a node in a software-defined network
US9634928B2 (en) 2014-09-29 2017-04-25 Juniper Networks, Inc. Mesh network of simple nodes with centralized control
US9524173B2 (en) 2014-10-09 2016-12-20 Brocade Communications Systems, Inc. Fast reboot for a switch
US9699029B2 (en) 2014-10-10 2017-07-04 Brocade Communications Systems, Inc. Distributed configuration management in a switch group
CN105610708B (en) * 2014-10-31 2019-11-12 新华三技术有限公司 Method for implementing multicast FRR in a TRILL network, and RB device
JP6375206B2 (en) * 2014-10-31 2018-08-15 APRESIA Systems株式会社 Relay system and switch device
CN105743780B (en) * 2014-12-09 2019-05-28 华为技术有限公司 Message transmitting method and device
US9998247B1 (en) 2014-12-30 2018-06-12 Juniper Networks, Inc. Controller-based network device timing synchronization
US9626255B2 (en) 2014-12-31 2017-04-18 Brocade Communications Systems, Inc. Online restoration of a switch snapshot
US9628407B2 (en) 2014-12-31 2017-04-18 Brocade Communications Systems, Inc. Multiple software versions in a switch group
US10003552B2 (en) 2015-01-05 2018-06-19 Brocade Communications Systems LLC Distributed bidirectional forwarding detection protocol (D-BFD) for cluster of interconnected switches
US9942097B2 (en) 2015-01-05 2018-04-10 Brocade Communications Systems LLC Power management in a network of interconnected switches
US10038592B2 (en) 2015-03-17 2018-07-31 Brocade Communications Systems LLC Identifier assignment to a new switch in a switch group
US9807005B2 (en) 2015-03-17 2017-10-31 Brocade Communications Systems, Inc. Multi-fabric manager
US10579406B2 (en) 2015-04-08 2020-03-03 Avago Technologies International Sales Pte. Limited Dynamic orchestration of overlay tunnels
US10439929B2 (en) 2015-07-31 2019-10-08 Avago Technologies International Sales Pte. Limited Graceful recovery of a multicast-enabled switch
US10707952B2 (en) * 2015-07-31 2020-07-07 Viasat, Inc. Flexible capacity satellite constellation
US20170048139A1 (en) * 2015-08-14 2017-02-16 Futurewei Technologies, Inc. Interconnecting an Overlay Management Control Network (OMCN) to a Non-OMCN
US10171303B2 (en) 2015-09-16 2019-01-01 Avago Technologies International Sales Pte. Limited IP-based interconnection of switches with a logical chassis
US9912614B2 (en) 2015-12-07 2018-03-06 Brocade Communications Systems LLC Interconnection of switches based on hierarchical overlay tunneling
US10237090B2 (en) 2016-10-28 2019-03-19 Avago Technologies International Sales Pte. Limited Rule-based network identifier mapping
CN106789644B (en) * 2016-11-29 2020-01-07 深圳市楠菲微电子有限公司 Method and device for forwarding TRILL multicast message
CN107968825B (en) * 2017-11-28 2021-06-29 新华三技术有限公司 Message forwarding control method and device
US11245644B2 (en) * 2018-01-19 2022-02-08 Super Micro Computer, Inc. Automatic multi-chassis link aggregation configuration
CN110149730B (en) 2018-02-13 2021-01-29 华为技术有限公司 Communication method and device
US10721163B1 (en) * 2019-03-15 2020-07-21 Dell Products L.P. Spanning tree protocol bridge-based link selection system
US10999187B2 (en) 2019-06-13 2021-05-04 Juniper Networks, Inc. Wireless control and fabric links for high-availability cluster nodes
CN110971519A (en) * 2019-12-12 2020-04-07 迈普通信技术股份有限公司 Port interconnection management method and device
US11444667B2 (en) * 2020-04-22 2022-09-13 Qualcomm Incorporated Methods and apparatus for orthogonal sequence transmission with frequency hopping
CN112235175B (en) * 2020-09-01 2022-03-18 深圳市共进电子股份有限公司 Access method and access device of network bridge equipment and network bridge equipment
CN111935337B (en) * 2020-09-17 2021-01-08 南京中兴软件有限责任公司 MAC address keep-alive method, equipment and storage medium of aggregation link

Citations (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070263640A1 (en) * 2006-05-10 2007-11-15 Finn Norman W Technique for efficiently managing bandwidth for multipoint-to-multipoint services in a provider network
US20090129385A1 (en) * 2004-09-17 2009-05-21 Hewlett-Packard Development Company, L. P. Virtual network interface
US20090185571A1 (en) * 2008-01-23 2009-07-23 Francois Edouard Tallet Translating mst instances between ports of a bridge in a computer network
US20100158024A1 (en) * 2008-12-23 2010-06-24 Ali Sajassi Optimized forwarding for provider backbone bridges with both i&b components (ib-pbb)
US20100246388A1 (en) * 2009-03-26 2010-09-30 Brocade Communications Systems, Inc. Redundant host connection in a routed network
US20110019678A1 (en) * 2009-07-24 2011-01-27 Juniper Networks, Inc. Routing frames in a shortest path computer network for a multi-homed legacy bridge node
US7974223B2 (en) * 2004-11-19 2011-07-05 Corrigent Systems Ltd. Virtual private LAN service over ring networks
US20110228780A1 (en) * 2010-03-16 2011-09-22 Futurewei Technologies, Inc. Service Prioritization in Link State Controlled Layer Two Networks
US20110235523A1 (en) * 2010-03-24 2011-09-29 Brocade Communications Systems, Inc. Method and system for extending routing domain to non-routing end stations
US20110280572A1 (en) * 2010-05-11 2011-11-17 Brocade Communications Systems, Inc. Converged network extension
US20110299409A1 (en) * 2010-06-02 2011-12-08 Brocade Communications Systems, Inc. Reachability detection in trill networks
US20110299532A1 (en) * 2010-06-08 2011-12-08 Brocade Communications Systems, Inc. Remote port mirroring
US20110299406A1 (en) * 2010-06-02 2011-12-08 Brocade Communications Systems, Inc. Path detection in trill networks
US20110299536A1 (en) * 2010-06-08 2011-12-08 Brocade Communications Systems, Inc. Method and system for link aggregation across multiple switches
US20120014387A1 (en) * 2010-05-28 2012-01-19 Futurewei Technologies, Inc. Virtual Layer 2 and Mechanism to Make it Scalable
US20120014261A1 (en) * 2010-07-14 2012-01-19 Cisco Technology, Inc. Monitoring A Flow Set To Detect Faults
US20120177045A1 (en) * 2011-01-07 2012-07-12 Berman Stuart B Methods, systems and apparatus for the interconnection of fibre channel over ethernet devices using a trill network
US8271680B2 (en) * 1998-12-24 2012-09-18 Ericsson Ab Domain isolation through virtual network machines
US20120243544A1 (en) * 2011-03-21 2012-09-27 Avaya Inc. Usage of masked bmac addresses in a provider backbone bridged (pbb) network
US20120243539A1 (en) * 2011-03-21 2012-09-27 Avaya Inc. Usage of masked ethernet addresses between transparent interconnect of lots of links (trill) routing bridges
US8325598B2 (en) * 2009-05-20 2012-12-04 Verizon Patent And Licensing Inc. Automatic protection switching of virtual connections

Family Cites Families (111)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5394402A (en) 1993-06-17 1995-02-28 Ascom Timeplex Trading Ag Hub for segmented virtual local area network with shared media access
US5617421A (en) 1994-06-17 1997-04-01 Cisco Systems, Inc. Extended domain computer network using standard links
US5515359A (en) 1994-08-26 1996-05-07 Mitsubishi Electric Research Laboratories, Inc. Credit enhanced proportional rate control system
US5633859A (en) 1994-09-16 1997-05-27 The Ohio State University Method and apparatus for congestion management in computer networks using explicit rate indication
ZA959722B (en) 1994-12-19 1996-05-31 Alcatel Nv Traffic management and congestion control for packet-based networks
US6035105A (en) 1996-01-02 2000-03-07 Cisco Technology, Inc. Multiple VLAN architecture system
US5742604A (en) 1996-03-28 1998-04-21 Cisco Systems, Inc. Interswitch link mechanism for connecting high-performance network switches
US5832484A (en) 1996-07-02 1998-11-03 Sybase, Inc. Database system with methods for parallel lock management
EP0853405A3 (en) 1997-01-06 1998-09-16 Digital Equipment Corporation Ethernet network with credit based flow control
US5893320A (en) 1997-05-20 1999-04-13 Demaree; Michael S. Device for cooking fowl
US6192406B1 (en) 1997-06-13 2001-02-20 At&T Corp. Startup management system and method for networks
US6147970A (en) 1997-09-30 2000-11-14 Gte Internetworking Incorporated Quality of service management for aggregated flows in a network system
US6567403B1 (en) 1998-04-30 2003-05-20 Hewlett-Packard Development Company, L.P. Virtual-chassis switch network topology
US6347337B1 (en) 1999-01-08 2002-02-12 Intel Corporation Credit based flow control scheme over virtual interface architecture for system area networks
US6646985B1 (en) 1999-06-03 2003-11-11 Fujitsu Network Communications, Inc. Congestion control mechanism in a network access device
US6922408B2 (en) 2000-01-10 2005-07-26 Mellanox Technologies Ltd. Packet communication buffering with dynamic flow control
US6977930B1 (en) 2000-02-14 2005-12-20 Cisco Technology, Inc. Pipelined packet switching and queuing architecture
US6901452B1 (en) 2000-03-02 2005-05-31 Alcatel Selectable prioritization for data communication switch
US6992984B1 (en) 2000-03-07 2006-01-31 Lucent Technologies Inc. Credit-based adaptive flow control for multi-stage multi-dimensional switching architecture
US6880086B2 (en) 2000-05-20 2005-04-12 Ciena Corporation Signatures for facilitating hot upgrades of modular software components
US6947419B2 (en) 2001-06-12 2005-09-20 Acute Technology Corp. Apparatus for multicast forwarding in a virtual local area network environment
US7042842B2 (en) 2001-06-13 2006-05-09 Computer Network Technology Corporation Fiber channel switch
US7263060B1 (en) 2001-06-28 2007-08-28 Network Appliance, Inc. Multiple switch protected architecture
US7173934B2 (en) * 2001-09-10 2007-02-06 Nortel Networks Limited System, device, and method for improving communication network reliability using trunk splitting
US7035220B1 (en) 2001-10-22 2006-04-25 Intel Corporation Technique for providing end-to-end congestion control with no feedback from a lossless network
US7561517B2 (en) 2001-11-02 2009-07-14 Internap Network Services Corporation Passive route control of data networks
US7668966B2 (en) 2001-11-02 2010-02-23 Internap Network Services Corporation Data network controller
US20030185206A1 (en) 2002-03-29 2003-10-02 Bhaskar Jayakrishnan Destination device bit map for delivering an information packet through a switch fabric
KR100472416B1 (en) 2002-11-01 2005-03-11 삼성전자주식회사 Device and Method for Controlling Packet Flow
US7702729B2 (en) 2003-04-08 2010-04-20 Johanson Bradley E Event heap: a coordination infrastructure for dynamic heterogeneous application interactions in ubiquitous computing environments
JP2004355125A (en) 2003-05-27 2004-12-16 Pioneer Electronic Corp Software update processing device, system, its method and program, and recording medium with the program recorded thereon
US20050047405A1 (en) 2003-08-25 2005-03-03 International Business Machines Corporation Switching device for controlling data packet flow
US7508763B2 (en) 2003-09-04 2009-03-24 Hewlett-Packard Development Company, L.P. Method to regulate traffic congestion in a network
US7386848B2 (en) 2003-10-02 2008-06-10 International Business Machines Corporation Method and system to alleviate denial-of-service conditions on a server
WO2005038599A2 (en) 2003-10-14 2005-04-28 Raptor Networks Technology, Inc. Switching system with distributed switching fabric
CN101087238B (en) 2003-10-21 2010-08-04 华为技术有限公司 Dynamic bandwidth allocation device and method of passive optical network
US7483370B1 (en) 2003-12-22 2009-01-27 Extreme Networks, Inc. Methods and systems for hitless switch management module failover and upgrade
JP2005277804A (en) 2004-03-25 2005-10-06 Hitachi Ltd Information relaying apparatus
US7593320B1 (en) 2004-04-30 2009-09-22 Marvell International, Ltd. Failover scheme for stackable network switches
US7475397B1 (en) 2004-07-28 2009-01-06 Sun Microsystems, Inc. Methods and apparatus for providing a remote serialization guarantee
US8238347B2 (en) 2004-10-22 2012-08-07 Cisco Technology, Inc. Fibre channel over ethernet
US7564869B2 (en) 2004-10-22 2009-07-21 Cisco Technology, Inc. Fibre channel over ethernet
US7830793B2 (en) 2004-10-22 2010-11-09 Cisco Technology, Inc. Network device architecture for consolidating input/output and reducing latency
US7715382B2 (en) 2004-11-01 2010-05-11 Alcatel-Lucent Usa Inc. Softrouter
US7385925B2 (en) 2004-11-04 2008-06-10 International Business Machines Corporation Data flow control method for simultaneous packet reception
JP2006189937A (en) 2004-12-28 2006-07-20 Toshiba Corp Reception device, transmission/reception device, reception method, and transmission/reception method
US20070036178A1 (en) 2005-02-02 2007-02-15 Susan Hares Layer 2 virtual switching environment
ATE525829T1 (en) 2005-02-28 2011-10-15 Ibm BLADE SERVER SYSTEM HAVING AT LEAST ONE STACKED SWITCH WITH MULTIPLE SWITCHES CONNECTED TO EACH OTHER AND CONFIGURED FOR MANAGEMENT AND OPERATION AS A SINGLE VIRTUAL SWITCH
US8085657B2 (en) 2005-04-01 2011-12-27 Sony Corporation Flow control in a cellular communication system
CN100486216C (en) 2005-07-15 2009-05-06 华为技术有限公司 Method for improving transmission reliability in virtual exchange system
US7962923B2 (en) 2005-12-30 2011-06-14 Level 3 Communications, Llc System and method for generating a lock-free dual queue
CN100571249C (en) 2006-02-27 2009-12-16 中兴通讯股份有限公司 Real-time Ethernet communication method with deterministic transmission
BRPI0806396B1 (en) 2007-02-02 2020-04-28 Interdigital Tech Corp method and apparatus for improving rlc for flexible rlc pdu size
US9661112B2 (en) 2007-02-22 2017-05-23 International Business Machines Corporation System and methods for providing server virtualization assistance
US8140696B2 (en) 2007-03-12 2012-03-20 International Business Machines Corporation Layering serial attached small computer system interface (SAS) over ethernet
US8320245B2 (en) 2007-03-13 2012-11-27 Alcatel Lucent Policy enforcement points
JP4888186B2 (en) 2007-03-28 2012-02-29 富士通株式会社 Communication system, repeater, and relay method
US8649370B2 (en) 2007-05-17 2014-02-11 Ciena Corporation Systems and methods for programming connections through a multi-stage switch fabric with blocking recovery, background rebalancing, and rollback
WO2008154556A1 (en) 2007-06-11 2008-12-18 Blade Network Technologies, Inc. Sequential frame forwarding
US9667442B2 (en) 2007-06-11 2017-05-30 International Business Machines Corporation Tag-based interface between a switching device and servers for use in frame processing and forwarding
US7912003B2 (en) 2007-06-27 2011-03-22 Microsoft Corporation Multipath forwarding algorithms using network coding
US8584138B2 (en) 2007-07-30 2013-11-12 Hewlett-Packard Development Company, L.P. Direct switching of software threads by selectively bypassing run queue based on selection criteria
US8139358B2 (en) 2007-09-25 2012-03-20 International Business Machines Corporation Apparatus for externally changing the direction of air flowing through electronic equipment
US7839777B2 (en) 2007-09-27 2010-11-23 International Business Machines Corporation Method, system, and apparatus for accelerating resolution of network congestion
US20090125882A1 (en) 2007-10-08 2009-05-14 Matteo Frigo Method of implementing hyperobjects in a parallel processing software programming environment
US8867341B2 (en) 2007-11-09 2014-10-21 International Business Machines Corporation Traffic management of client traffic at ingress location of a data center
US8553537B2 (en) 2007-11-09 2013-10-08 International Business Machines Corporation Session-less load balancing of client traffic across servers in a server group
US8082418B2 (en) * 2007-12-17 2011-12-20 Intel Corporation Method and apparatus for coherent device initialization and access
US8194674B1 (en) 2007-12-20 2012-06-05 Quest Software, Inc. System and method for aggregating communications and for translating between overlapping internal network addresses and unique external network addresses
US8625592B2 (en) 2008-02-26 2014-01-07 Cisco Technology, Inc. Blade switch with scalable interfaces
US20110035494A1 (en) 2008-04-15 2011-02-10 Blade Network Technologies Network virtualization for a virtualized server data center environment
US8131983B2 (en) 2008-04-28 2012-03-06 International Business Machines Corporation Method, apparatus and article of manufacture for timeout waits on locks
US8307422B2 (en) 2008-08-14 2012-11-06 Juniper Networks, Inc. Routing device having integrated MPLS-aware firewall
US8385202B2 (en) 2008-08-27 2013-02-26 Cisco Technology, Inc. Virtual switch quality of service for virtual machines
US9426095B2 (en) 2008-08-28 2016-08-23 International Business Machines Corporation Apparatus and method of switching packets between virtual ports
US9237034B2 (en) * 2008-10-21 2016-01-12 Iii Holdings 1, Llc Methods and systems for providing network access redundancy
CN102334112B (en) 2009-02-27 2014-06-11 美国博通公司 Method and system for virtual machine networking
US8238340B2 (en) 2009-03-06 2012-08-07 Futurewei Technologies, Inc. Transport multiplexer—mechanisms to force ethernet traffic from one domain to be switched in a different (external) domain
US8265075B2 (en) 2009-03-16 2012-09-11 International Business Machines Corporation Method and apparatus for managing, configuring, and controlling an I/O virtualization device through a network switch
US9213586B2 (en) 2009-03-18 2015-12-15 Sas Institute Inc. Computer-implemented systems for resource level locking without resource level locks
AU2010232526B2 (en) 2009-04-01 2014-06-26 Nicira, Inc. Method and apparatus for implementing and managing virtual switches
US8174984B2 (en) 2009-05-29 2012-05-08 Oracle America, Inc. Managing traffic on virtualized lanes between a network switch and a virtual machine
US8638799B2 (en) 2009-07-10 2014-01-28 Hewlett-Packard Development Company, L.P. Establishing network quality of service for a virtual machine
US8204061B1 (en) 2009-07-23 2012-06-19 Cisco Technology, Inc. Virtual port channel switches with distributed control planes
US9031081B2 (en) 2009-08-06 2015-05-12 Broadcom Corporation Method and system for switching in a virtualized platform
US8625427B1 (en) 2009-09-03 2014-01-07 Brocade Communications Systems, Inc. Multi-path switching with edge-to-edge flow control
KR101337559B1 (en) 2009-10-27 2013-12-06 한국전자통신연구원 Virtualization support programmable platform and Method for transferring packet
US8537860B2 (en) 2009-11-03 2013-09-17 International Business Machines Corporation Apparatus for switching traffic between virtual machines
US8665747B2 (en) 2009-12-03 2014-03-04 Cisco Technology, Inc. Preventing loops on network topologies built with virtual switches and VMS
US8509069B1 (en) 2009-12-22 2013-08-13 Juniper Networks, Inc. Cell sharing to improve throughput within a network device
US8400915B1 (en) 2010-02-23 2013-03-19 Integrated Device Technology, Inc. Pipeline scheduler for a packet switch
US20110299533A1 (en) 2010-06-08 2011-12-08 Brocade Communications Systems, Inc. Internal virtual network identifier and internal policy identifier
US8406128B1 (en) 2010-06-29 2013-03-26 Amazon Technologies, Inc. Efficient highly connected data centers
US8417800B2 (en) 2010-07-16 2013-04-09 Broadcom Corporation Method and system for network configuration and/or provisioning based on open virtualization format (OVF) metadata
US8873551B2 (en) * 2010-07-30 2014-10-28 Cisco Technology, Inc. Multi-destination forwarding in network clouds which include emulated switches
US9059940B2 (en) 2010-08-04 2015-06-16 Alcatel Lucent System and method for transport control protocol in a multi-chassis domain
US8345697B2 (en) 2010-08-17 2013-01-01 Dell Products, Lp System and method for carrying path information
US8498299B2 (en) 2010-08-19 2013-07-30 Juniper Networks, Inc. Flooding-based routing protocol having average-rate and burst-rate control
US9749241B2 (en) 2010-11-09 2017-08-29 International Business Machines Corporation Dynamic traffic management in a data center
US8730963B1 (en) * 2010-11-19 2014-05-20 Extreme Networks, Inc. Methods, systems, and computer readable media for improved multi-switch link aggregation group (MLAG) convergence
US20120131662A1 (en) 2010-11-23 2012-05-24 Cisco Technology, Inc. Virtual local area networks in a virtual machine environment
US20120163164A1 (en) * 2010-12-27 2012-06-28 Brocade Communications Systems, Inc. Method and system for remote load balancing in high-availability networks
US8478961B2 (en) 2011-03-02 2013-07-02 International Business Machines Corporation Dynamic migration of virtual machines based on workload cache demand profiling
KR101780423B1 (en) 2011-03-18 2017-09-22 삼성전자주식회사 Semiconductor device and method of forming the same
US8588224B2 (en) 2011-05-14 2013-11-19 International Business Machines Corporation Priority based flow control in a distributed fabric protocol (DFP) switching network architecture
US20120287785A1 (en) 2011-05-14 2012-11-15 International Business Machines Corporation Data traffic handling in a distributed fabric protocol (dfp) switching network architecture
US8837499B2 (en) 2011-05-14 2014-09-16 International Business Machines Corporation Distributed fabric protocol (DFP) switching network architecture
US9497073B2 (en) 2011-06-17 2016-11-15 International Business Machines Corporation Distributed link aggregation group (LAG) for a layer 2 fabric
US9736085B2 (en) 2011-08-29 2017-08-15 Brocade Communications Systems, Inc. End-to end lossless Ethernet in Ethernet fabric
US8767529B2 (en) 2011-09-12 2014-07-01 International Business Machines Corporation High availability distributed fabric protocol (DFP) switching network architecture
US8830466B2 (en) 2011-11-10 2014-09-09 Cisco Technology, Inc. Arrangement for placement and alignment of opto-electronic components

Patent Citations (22)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8271680B2 (en) * 1998-12-24 2012-09-18 Ericsson Ab Domain isolation through virtual network machines
US20090129385A1 (en) * 2004-09-17 2009-05-21 Hewlett-Packard Development Company, L. P. Virtual network interface
US8213429B2 (en) * 2004-09-17 2012-07-03 Hewlett-Packard Development Company, L.P. Virtual network interface
US7974223B2 (en) * 2004-11-19 2011-07-05 Corrigent Systems Ltd. Virtual private LAN service over ring networks
US20070263640A1 (en) * 2006-05-10 2007-11-15 Finn Norman W Technique for efficiently managing bandwidth for multipoint-to-multipoint services in a provider network
US20090185571A1 (en) * 2008-01-23 2009-07-23 Francois Edouard Tallet Translating mst instances between ports of a bridge in a computer network
US20100158024A1 (en) * 2008-12-23 2010-06-24 Ali Sajassi Optimized forwarding for provider backbone bridges with both i&b components (ib-pbb)
US20100246388A1 (en) * 2009-03-26 2010-09-30 Brocade Communications Systems, Inc. Redundant host connection in a routed network
US8325598B2 (en) * 2009-05-20 2012-12-04 Verizon Patent And Licensing Inc. Automatic protection switching of virtual connections
US20110019678A1 (en) * 2009-07-24 2011-01-27 Juniper Networks, Inc. Routing frames in a shortest path computer network for a multi-homed legacy bridge node
US20110228780A1 (en) * 2010-03-16 2011-09-22 Futurewei Technologies, Inc. Service Prioritization in Link State Controlled Layer Two Networks
US20110235523A1 (en) * 2010-03-24 2011-09-29 Brocade Communications Systems, Inc. Method and system for extending routing domain to non-routing end stations
US20110280572A1 (en) * 2010-05-11 2011-11-17 Brocade Communications Systems, Inc. Converged network extension
US20120014387A1 (en) * 2010-05-28 2012-01-19 Futurewei Technologies, Inc. Virtual Layer 2 and Mechanism to Make it Scalable
US20110299409A1 (en) * 2010-06-02 2011-12-08 Brocade Communications Systems, Inc. Reachability detection in trill networks
US20110299406A1 (en) * 2010-06-02 2011-12-08 Brocade Communications Systems, Inc. Path detection in trill networks
US20110299536A1 (en) * 2010-06-08 2011-12-08 Brocade Communications Systems, Inc. Method and system for link aggregation across multiple switches
US20110299532A1 (en) * 2010-06-08 2011-12-08 Brocade Communications Systems, Inc. Remote port mirroring
US20120014261A1 (en) * 2010-07-14 2012-01-19 Cisco Technology, Inc. Monitoring A Flow Set To Detect Faults
US20120177045A1 (en) * 2011-01-07 2012-07-12 Berman Stuart B Methods, systems and apparatus for the interconnection of fibre channel over ethernet devices using a trill network
US20120243544A1 (en) * 2011-03-21 2012-09-27 Avaya Inc. Usage of masked bmac addresses in a provider backbone bridged (pbb) network
US20120243539A1 (en) * 2011-03-21 2012-09-27 Avaya Inc. Usage of masked ethernet addresses between transparent interconnect of lots of links (trill) routing bridges

Non-Patent Citations (7)

Title
Brocade, "BCEFE in a Nutshell First Edition", Global Education Services Revision 0111, pp. 1-70, Copyright 2011 Brocade Communications Systems, Inc.
Cisco Systems Inc., "Cisco FabricPath Overview", pp. 1-20, Copyright 2009.
D.E. Eastlake, "RBridges and the IETF TRILL Protocol", pp. 1-39, TRILL Protocol, Dec. 2009.
Dar-Ren Leu, "dLAG-DMLT over TRILL", Blade Network Technologies, pp. 1-20, Copyright 2009.
Manral, et al., "RBridges: Bidirectional Forwarding Detection (BFD) support for TRILL", draft-manral-trill-bfd-encaps-01, pp. 1-10, TRILL Working Group Internet-Draft, Mar. 13, 2011.
Perlman, et al., "RBridges: Base Protocol Specification", pp. 1-117, TRILL Working Group Internet-Draft, Mar. 3, 2010.
Mike Fratto, "Cisco's FabricPath and IETF TRILL: Cisco Can't Have Standards Both Ways", Dec. 17, 2010; https://www.networkcomputing.com/data-networking-management/229500205.

Also Published As

Publication number Publication date
US20120320800A1 (en) 2012-12-20
US20130170339A1 (en) 2013-07-04
US8948004B2 (en) 2015-02-03
US8767738B2 (en) 2014-07-01
US8750307B2 (en) 2014-06-10
US20120320739A1 (en) 2012-12-20
US8948003B2 (en) 2015-02-03
US20120320926A1 (en) 2012-12-20
US20130148662A1 (en) 2013-06-13

Similar Documents

Publication Publication Date Title
US9497073B2 (en) Distributed link aggregation group (LAG) for a layer 2 fabric
US11438219B2 (en) Advanced link tracking for virtual cluster switching
CN110098992B (en) Private virtual local area network for transporting peer-to-peer traffic between switches
US10284469B2 (en) Progressive MAC address learning
US8811398B2 (en) Method for routing data packets using VLANs
EP2282453B1 (en) Routing frames in a shortest path computer network for a multi-homed legacy bridge node
US9059940B2 (en) System and method for transport control protocol in a multi-chassis domain
US9258185B2 (en) Fibre channel over Ethernet support in a trill network environment
KR101507675B1 (en) Priority based flow control in a distributed fabric protocol (dfp) switching network architecture
KR101485728B1 (en) Distributed fabric protocol (dfp) switching network architecture
US9036637B2 (en) Message transmission in virtual private networks
US8059638B2 (en) Inter-node link aggregation system and method
US9401872B2 (en) Virtual link aggregations across multiple fabric switches
US12081458B2 (en) Efficient convergence in network events
EP4344137A2 (en) Fast forwarding re-convergence of switch fabric multi-destination packets triggered by link failures
US8228823B2 (en) Avoiding high-speed network partitions in favor of low-speed links
Leu et al. A distributed LAG mechanism for TRILL enabled fabrics

Legal Events

Date Code Title Description
AS Assignment
Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y
Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:KAMATH, DAYAVANTI G.;KAMBLE, KESHAV;LEU, DAR-REN;AND OTHERS;REEL/FRAME:032002/0106
Effective date: 20111202

STCF Information on status: patent grant
Free format text: PATENTED CASE

MAFP Maintenance fee payment
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
Year of fee payment: 4

MAFP Maintenance fee payment
Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY
Year of fee payment: 8