<?xml version="1.0" encoding="utf-8"?>
<?xml-model href="rfc7991bis.rnc"?>  <!-- Required for schema
      validation and schema-aware editing -->

<!DOCTYPE rfc [
  <!ENTITY filename "draft-eastlake-bess-enhance-evpn-all-active-11">
  <!ENTITY nbsp     "&#160;">
  <!ENTITY zwsp     "&#8203;">
  <!ENTITY nbhy     "&#8209;">
  <!ENTITY wj       "&#8288;">
]>
<!-- <?xml-stylesheet type="text/xsl" href="rfc2629.xslt" ?> -->
<!-- This third-party XSLT can be enabled for direct transformations
in XML processors, including most browsers -->
<!-- If further character entities are required then they should be
added to the DOCTYPE above. Use of an external entity file is not
recommended. -->
<?rfc strict="yes" ?>
<?rfc toc="yes"?>

<rfc
  xmlns:xi="http://www.w3.org/2001/XInclude"
  category="std"
  docName="&filename;"
  ipr="trust200902"
  obsoletes=""
  submissionType="IETF"
  xml:lang="en"
  version="3">
<!--
    * docName should be the name of your draft * category should be
    one of std, bcp, info, exp, historic * ipr should be one of
    trust200902, noModificationTrust200902, noDerivativesTrust200902,
    pre5378Trust200902 * updates can be an RFC number as NNNN *
    obsoletes can be an RFC number as NNNN
-->


<!-- ____________________FRONT_MATTER____________________ -->

<front>

<title abbrev="Enhance All Active EVPN">EVPN All Active Usage
Enhancement</title>
<!-- The abbreviated title is required if the full
title is longer than 39 characters -->

<seriesInfo name="Internet-Draft"
		value="&filename;"/>
    
<author fullname="Donald Eastlake" initials="D." surname="Eastlake">
  <organization>Futurewei Technologies</organization>
  <address>
    <postal>
      <street>2386 Panoramic Circle</street>
      <city>Apopka</city>
      <region>FL</region>
      <code>32703</code>
      <country>USA</country>
    </postal>
    <phone>+1-508-333-2270</phone>
    <email>d3e3e3@gmail.com</email>
    <email>donald.eastlake@futurewei.com</email>
  </address>
</author>

<author fullname="Zhenbin Li" initials="Z." surname="Li">
  <organization>Huawei Technologies</organization>
  <address>
    <postal>
      <street>Huawei Building, No.156 Beiqing Road</street>
      <city>Beijing</city>
      <code>100095</code>
      <country>China</country>
    </postal>
    <email>lizhenbin@huawei.com</email>
  </address>
</author>

<author fullname="Shunwan Zhuang" initials="S." surname="Zhuang">
  <organization>Huawei Technologies</organization>
  <address>
    <postal>
      <street>Huawei Building, No.156 Beiqing Road</street>
      <city>Beijing</city>
      <code>100095</code>
      <country>China</country>
    </postal>
    <email>zhuangshunwan@huawei.com</email>
  </address>
</author>

<author fullname="Russ White" initials="R." surname="White">
  <organization>Juniper Networks</organization>
  <address>
    <email>russ@riw.us</email>
  </address>
</author>

<date year="2023" month="May" day="30"/>
  <!-- Meta-data Declarations -->

<area>Routing</area>

<workgroup>BESS Working Group</workgroup>
    <!-- WG name at the upperleft corner of the doc, IETF is fine for
       individual submissions.  If this element is not present, the
       default is "Network Working Group", which is used by the RFC
       Editor as a nod to the history of the IETF. -->

<keyword></keyword>
  <!-- Keywords will be incorporated into HTML output files in a meta
       tag but they have no effect on text or nroff output. If you
       submit your draft to the RFC Editor, the keywords will be used
       for the search engine. -->

<abstract>
  <t>A principal feature of EVPN is the ability to support multihoming
  from a customer edge (CE) device to multiple provider edge (PE)
  devices over all-active links. This draft specifies an improvement
  to load balancing across such links.</t>
</abstract>
  
</front>


<!-- ***** MIDDLE MATTER ***** -->

<middle>


<section anchor="Introduction">  <!-- 1. -->
  <name>Introduction</name>

<t>A principal feature of EVPN (Ethernet VPN [rfc7432bis]) is the
ability to support multihoming from a customer edge (CE) device to
multiple provider edge (PE) devices with links used in an
all-active redundancy mode. In that mode, a device is multihomed
to a group of two or more PEs and all PEs in such a redundancy
group can forward traffic to/from the multihomed device or network for
a given VLAN <xref target="RFC7209"/>. This draft specifies an
improvement in load balancing across such PE-to-CE all-active
multihoming links.</t>

<t>In the case where a CE is multihomed to multiple PE nodes, using a
Link Aggregation Group (LAG) with All-Active redundancy, it is
possible that only a single PE learns a set of the MAC addresses
associated with traffic transmitted by the CE.  This leads to a
situation where remote PE nodes receive MAC/IP Advertisement routes
for these addresses from a single PE, even though multiple PEs are
connected to the multihomed segment.</t>

<t>To address this issue, EVPN introduces the concept of "aliasing",
which is the ability of a PE to signal that it has reachability to an
EVPN instance (EVI) on a given Ethernet segment (ES) even when it has
learned no MAC addresses from that EVI/ES.  The Ethernet A-D per EVI
route is used for this purpose.  A remote PE that receives a MAC/IP
Advertisement route with a non-reserved ESI SHOULD consider the
advertised MAC address to be reachable via all PEs that have
advertised reachability to that MAC address's EVI/ES via the
combination of an Ethernet A-D per EVI route for that EVI/ES (and
Ethernet tag, if applicable) AND Ethernet A-D per ES routes for that
ES with the "Single-Active" bit in the flags of the ESI Label extended
community set to 0.</t>
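
<t>As a non-normative illustration, the aliasing rule above can be
sketched as a set intersection. The record types AdPerEvi and AdPerEs
and the function aliased_next_hops are hypothetical names introduced
only for this sketch; they are not defined by EVPN or by this
document, and real routes carry more fields than shown.</t>

<sourcecode type="python"><![CDATA[
```python
from typing import Iterable, NamedTuple, Set


class AdPerEvi(NamedTuple):
    """Hypothetical view of an Ethernet A-D per EVI route."""
    pe: str
    esi: str
    evi: int


class AdPerEs(NamedTuple):
    """Hypothetical view of an Ethernet A-D per ES route."""
    pe: str
    esi: str
    single_active: bool  # "Single-Active" bit of the ESI Label EC


def aliased_next_hops(esi: str, evi: int,
                      per_evi: Iterable[AdPerEvi],
                      per_es: Iterable[AdPerEs]) -> Set[str]:
    """PEs through which a MAC on (evi, esi) is considered reachable."""
    # PEs that advertised an A-D per EVI route for this EVI/ES.
    evi_pes = {r.pe for r in per_evi if r.esi == esi and r.evi == evi}
    # PEs that advertised an A-D per ES route with Single-Active = 0.
    es_pes = {r.pe for r in per_es
              if r.esi == esi and not r.single_active}
    # Both routes are required, hence the intersection.
    return evi_pes & es_pes
```
]]></sourcecode>

<t>A remote PE following this sketch would load balance traffic for
the MAC address across every PE in the returned set, including PEs
that never advertised the MAC address themselves.</t>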

<section>  <!-- 1.1 -->
  <name>Terminology and Acronyms</name>

<t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
"SHOULD", "SHOULD NOT", "RECOMMENDED", "NOT RECOMMENDED", "MAY", and
"OPTIONAL" in this document are to be interpreted as described in BCP
14 <xref target="RFC2119"/> <xref target="RFC8174"/> when, and only
when, they appear in all capitals, as shown here.</t>

<t>This document uses the following acronyms and terms:</t>

<dl>
<dt>A-D</dt><dd>- Auto Discovery.</dd>

<dt>All-Active Redundancy Mode</dt><dd>- When a device is multihomed to a group of
two or more PEs and when all PEs in such a redundancy group can forward
traffic to/from the multihomed device or network for a given VLAN.</dd>

<dt>CE</dt><dd>- Customer Edge equipment.</dd>

<dt>ES</dt><dd>- Ethernet Segment.</dd>

<dt>ESI</dt><dd>- Ethernet Segment Identifier.</dd>

<dt>EVI</dt><dd>- EVPN Instance.</dd>

<dt>EVPN</dt><dd>- Ethernet VPN <xref target="RFC7432"/>.</dd>

<dt>FRR</dt><dd>- Fast ReRoute.</dd>

<dt>MAC</dt><dd>- Media Access Control.</dd>

<dt>PE</dt><dd>- Provider Edge equipment.</dd>

<dt>Single-Active Redundancy Mode</dt><dd>- When a device or a network is
multihomed to a group of two or more PEs and when only a single PE in
such a redundancy group can forward traffic to/from the multihomed
device or network for a given VLAN.</dd>

<dt>VLAN</dt><dd>- Virtual Local Area Network.</dd>

<dt>VPN</dt><dd>- Virtual Private Network.</dd>
</dl>

</section>

</section>

<section>  <!-- 2. -->
  <name>Improved Load Balancing</name>

<t>Consider the example in Figure 1. CE1 is multihomed to PE1 and
PE2. CE1 typically uses a hash algorithm to determine whether to send
a particular traffic flow to PE1 or to PE2. Thus, if such traffic from
CE1 is only sent to PE1, then PE1 will learn CE1's MAC address(es) and
PE2 will not.</t>

<t>PE3 and PE4 can do aliasing [rfc7432bis] because PE1 and PE2 will be
advertising the same ESI. Thus PE3 and PE4 will expect that a MAC
address reachable from PE1 will also be reachable from PE2. This
aliasing will cause PE3 and PE4 to load balance to CE1's MAC(s),
sending some traffic to PE1 and some to PE2.</t>

<figure anchor="Current">
   <name>Current Situation</name>
   <artwork align="center"><![CDATA[
                         .........
      +----------+      .       .      +----------+
      | PE1 MAC  +------+       +------+ PE3      |
      | Learning |      .       .      |          |
      +----------+      .       .      +----------+
     /     ^            .       .            |     \
+---+      |            . EVPN  .            |     +---+
|CE1|      |            . MPLS  .            |     |CE2|
+---+      |            .       .            |     +---+
     \     |            .       .            |    /
      +----------+      .       .      +----------+
      | PE2      |      .       .      | PE4      |
      |          +------+       +------+          |
      +----------+      .       .      +----------+
                        .........
]]></artwork>
</figure>

<t>There are two problems associated with this situation that are
described in the subsections below.  Section 3 describes the mechanism
to address these problems.</t>

<section>  <!-- 2.1 -->
  <name>Problem 1: Traffic Bypassing</name>

<t>Since PE2 has not learned CE1's MAC(s), when aliased traffic
addressed to CE1 arrives at PE2, the MAC lookup at PE2 will find that
MAC address associated with PE1. PE2 will then tunnel the traffic to
PE1 instead of sending it directly to CE1.</t>

<t>As an enhancement that solves this problem, PE1 can send MAC
address(es) with VLAN and ESI information. PE2 will then
receive the MAC address(es) and VLAN that PE1 associates with the ESI
and PE2 can use this to update its forwarding tables (see Figure
2). As a result, when traffic addressed to a CE1 MAC arrives at PE2,
it can send it on the appropriate local interface and VLAN. This
avoids the unnecessary extra hop through PE1 for such traffic arriving
at PE2.</t>

<figure anchor="Enhancement">
   <name>With Enhancement</name>
   <artwork align="center"><![CDATA[
                         .........
      +----------+      .       .      +----------+
      | PE1 MAC  +------+       +------+ PE3      |
      | Learning |      .       .      |          |
      +----------+      .       .      +----------+
     /     ^            .       .            |     \
+---+      |            . EVPN  .            |     +---+
|CE1|    Sy|nc          . MPLS  .            |     |CE2|
+---+      |            .       .            |     +---+
     \     v            .       .            |    /
      +----------+      .       .      +----------+
      | PE2      |      .       .      | PE4      |
      |          +------+       +------+          |
      +----------+      .       .      +----------+
                        .........
]]></artwork>
</figure>

</section>

<section>  <!-- 2.2 -->
  <name>Problem 2: VID Encapsulation Confusion</name>

<t>If CE1 is connected through a VLAN and has only one VLAN under the
EVPN instance of PE2, the unicast traffic can be directly sent to the
appropriate interface and encapsulated with the appropriate VID and
forwarded to CE1.</t>

<t>However, there may be multiple ways for CE1 to connect to PE1 and PE2,
including Ethernet Tag, Ethernet Tag termination, and Q-in-Q.  PE2
cannot always obtain the appropriate VLANs and in such cases PE2 is
missing the information needed to forward the unicast traffic to CE1
directly.</t>

</section>
</section>

<section>  <!-- 3. -->
  <name>VLAN-Redirect-Extended Community Attribute</name>

<t>This document defines a new BGP extended community attribute called
the VLAN-Redirect-Extended Community attribute as shown in Figure 3.</t>

<figure anchor="VextComm">
   <name>VLAN-Redirect-Extended Community Attribute</name>
   <artwork align="center"><![CDATA[
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|       0x06    | Sub-Type=TBA  |    Flags      |   Reserved    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|            S-VLAN             |            C-VLAN             |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
]]></artwork>
</figure>

<t>Where:</t>

<dl>
<dt>0x06:</dt><dd>EVPN Extended Community Type field.</dd>

<dt>Sub-Type:</dt><dd>Sub-Type field indicating that the extended community
attribute is a VLAN-Redirect-Extended Community attribute, and the
value is TBA as assigned by the IANA.</dd>

<dt>Flags:</dt><dd>8 bits of identification information. Bit 0 set to 0 indicates
that the action is to redirect traffic to the VLANs in this community.</dd>

<dt>Reserved:</dt><dd>Not used. MUST be sent as zero and ignored on receipt.</dd>

<dt>S-VLAN:</dt><dd>Outer VLAN information. MUST NOT be 0 or 0xFFFF. If it is one
of those values, which are not valid VLAN IDs, the attribute is
ignored.</dd>

<dt>C-VLAN:</dt><dd>Inner VLAN information. When 0, it means there is no
C-VLAN. MUST NOT be 0xFFFF, which is not a valid VLAN ID. If it is
that illegal value, the attribute is ignored.</dd>
</dl>
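
<t>As a non-normative illustration, the 8-octet layout in Figure 3
and the validity rules above can be sketched as follows. The Sub-Type
value is still TBA, so SUB_TYPE_PLACEHOLDER is a made-up value used
only to make the sketch runnable; the names VlanRedirect, encode, and
decode are likewise hypothetical.</t>

<sourcecode type="python"><![CDATA[
```python
import struct
from typing import NamedTuple, Optional

EVPN_EC_TYPE = 0x06          # EVPN Extended Community Type field
SUB_TYPE_PLACEHOLDER = 0xFF  # hypothetical: real value is TBA (IANA)

# Type, Sub-Type, Flags, Reserved (1 octet each), S-VLAN, C-VLAN
# (2 octets each), all in network byte order.
LAYOUT = "!BBBBHH"


class VlanRedirect(NamedTuple):
    flags: int
    s_vlan: int
    c_vlan: int  # 0 means there is no C-VLAN


def encode(ec: VlanRedirect) -> bytes:
    """Pack the attribute; Reserved is always sent as zero."""
    return struct.pack(LAYOUT, EVPN_EC_TYPE, SUB_TYPE_PLACEHOLDER,
                       ec.flags, 0, ec.s_vlan, ec.c_vlan)


def decode(data: bytes) -> Optional[VlanRedirect]:
    """Return the parsed attribute, or None if it is to be ignored."""
    if len(data) != struct.calcsize(LAYOUT):
        return None
    ec_type, sub_type, flags, _reserved, s_vlan, c_vlan = \
        struct.unpack(LAYOUT, data)
    if ec_type != EVPN_EC_TYPE or sub_type != SUB_TYPE_PLACEHOLDER:
        return None  # some other extended community
    # Reserved is ignored on receipt, so it is not checked here.
    if s_vlan in (0, 0xFFFF) or c_vlan == 0xFFFF:
        return None  # invalid VLAN value: the attribute is ignored
    return VlanRedirect(flags, s_vlan, c_vlan)
```
]]></sourcecode>

<t>In this sketch, a return value of None from decode stands for
"ignore the attribute"; how the rest of the route is then processed
is outside the scope of the example.</t>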

</section>

<section>  <!-- 4. -->
  <name>Operation</name>

<t>Operation with the solution specified in Section 3 and the topology
shown in Figure 2 is described below.</t>

<section>  <!-- 4.1 -->
  <name>Establishment</name>

<ol>
<li>PE1 learns MAC addresses from CE1 and advertises them to PE2. The
advertisement carries the ESI value (ES1), the next hop (PE1), and the
VLAN-Redirect-Extended Community attributes.</li>

<li>PE2 receives the MAC route advertised by PE1 and finds the
interface that connects to CE1 locally according to the ESI value.  At
the same time, PE2 fills in the VLAN information according to the
VLAN-Redirect-Extended Community attributes.</li>

<li>At the same time, PE2 generates a fast reroute (FRR) entry
according to the next hop information (PE1) of the MAC route, that is,
a MAC address entry on PE2, where the primary path points to the CE1
link and the standby path points to PE1.</li>

<li>PE2 also sends the MAC as a local MAC route to PE1.</li>

<li>PE1 receives the MAC route advertised by PE2 and generates an
FRR entry for the MAC route it learned from CE1, that is, a MAC address
entry on PE1, with the primary path pointing to the CE1 link and the
secondary path pointing to PE2.</li>
</ol>
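
<t>As a non-normative illustration, steps 2 and 3 as performed by PE2
can be sketched as follows. The MacEntry structure, the
install_synced_mac function, and the interface name used in the test
are hypothetical; an implementation's forwarding table will differ.</t>

<sourcecode type="python"><![CDATA[
```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class MacEntry:
    primary: str           # local interface toward the CE
    backup: Optional[str]  # peer PE used as the FRR standby path
    s_vlan: int            # from the VLAN-Redirect attribute
    c_vlan: int            # 0 means there is no C-VLAN


def install_synced_mac(table: Dict[str, MacEntry],
                       local_es_ports: Dict[str, str],
                       mac: str, esi: str, next_hop: str,
                       s_vlan: int, c_vlan: int) -> Optional[MacEntry]:
    """Install a MAC learned from a peer PE on a shared ES."""
    # Step 2: find the local interface for the advertised ESI.
    port = local_es_ports.get(esi)
    if port is None:
        # The ES is not attached locally; this sketch does nothing
        # and leaves ordinary remote MAC handling to other code.
        return None
    # Steps 2 and 3: fill in the VLANs and build the FRR entry with
    # the CE link as primary and the advertising PE as standby.
    entry = MacEntry(primary=port, backup=next_hop,
                     s_vlan=s_vlan, c_vlan=c_vlan)
    table[mac] = entry
    return entry
```
]]></sourcecode>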

</section>

<section>  <!-- 4.2 -->
  <name>Handling Link Failure</name>

<ol>
  <li>When the link between PE1 and CE1 fails, PE1 withdraws the MAC
address that PE1 advertised to PE2.</li>

<li>PE2 receives the MAC withdrawal from PE1 but does not delete the
MAC immediately. Instead, it starts an aging timer and does not yet
withdraw the MAC address that PE1 advertised to PE2.</li>

<li>When the aging timer expires, if PE2 cannot receive the traffic
from CE1, then PE2 withdraws the MAC address that was advertised to
PE2 by PE1 and deletes the MAC entry. If PE2 can communicate directly
with CE1, it just eliminates the FRR standby path to PE1.</li>
</ol>
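
<t>As a non-normative illustration, the handling at PE2 in steps 2
and 3 can be sketched as follows. The aging interval AGING_SECONDS,
the SyncedMac structure, the seen_from_ce flag (assumed to be set by
the data plane when traffic from the CE arrives during the aging
interval), and the function names are all assumptions made for this
sketch; this document does not specify a timer value.</t>

<sourcecode type="python"><![CDATA[
```python
from dataclasses import dataclass
from typing import Optional

AGING_SECONDS = 300.0  # hypothetical value, not from this document


@dataclass
class SyncedMac:
    primary: str                            # local link toward the CE
    backup: Optional[str]                   # FRR standby (peer PE)
    aging_deadline: Optional[float] = None  # None: timer not running
    seen_from_ce: bool = False              # set by the data plane


def on_peer_withdraw(entry: SyncedMac, now: float) -> None:
    """Step 2: keep the entry and start the aging timer."""
    entry.aging_deadline = now + AGING_SECONDS
    entry.seen_from_ce = False


def on_timer(entry: SyncedMac, now: float) -> str:
    """Step 3: decide what to do once the aging timer expires."""
    if entry.aging_deadline is None or now < entry.aging_deadline:
        return "keep"
    entry.aging_deadline = None
    if entry.seen_from_ce:
        # The CE is directly reachable: only drop the standby path.
        entry.backup = None
        return "drop-backup"
    # No traffic from the CE: the caller withdraws and deletes the MAC.
    return "delete"
```
]]></sourcecode>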

</section>
</section>

<section>  <!-- 5. -->
  <name>IANA Considerations</name>

<t>IANA is requested to assign a new EVPN Extended Community Sub-Type as
follows:</t>

  <table>
    <thead>
<tr><th>Sub-Type Value</th><th
align="center">Name</th><th>Reference</th></tr> 
    </thead>
    <tbody>
<tr><td align="center">TBA</td><td>VLAN-Redirect Extended
Community</td><td>[this doc]</td></tr>
    </tbody>
  </table>

</section>

<section>  <!-- 6. -->
  <name>Security Considerations</name>

<t>TBD</t>

<t>For general EVPN Security Considerations, see [rfc7432bis].</t>

</section>


</middle>


<!-- ____________________BACK_MATTER____________________ -->
<back>

<references>
  <name>Normative References</name>

<xi:include
    href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.2119.xml"/>
<xi:include
    href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.7432.xml"/>
<xi:include
    href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.8174.xml"/>

</references>

<references>
  <name>Informative References</name>

<xi:include
    href="https://www.rfc-editor.org/refs/bibxml/reference.RFC.7209.xml"/>

</references>

<section anchor="Acknowledgements" numbered="false">
  <name>Acknowledgements</name>
  
<t>The authors would like to thank the following for their comments
and review of this document:</t>

<t>TBD</t>

</section>


</back>
  
</rfc>
