IPv4 Address Exhaustion Causing Harmful Effects on the Earth

Today, I received a very disturbing email on NANOG which was forwarded from a recipient on the Global Environment Watch (GEW) mailing list.  If this is true, we all need to take steps to make an orderly and smooth transition to IPv6 as quickly as possible, lest we suffer from the harmful effects described in this email.


From: Stephen H. Inden
To: Global Environment Watch (GEW) mailing list
Date: Fri, 1 Apr 2011 00:19:08 +0200
Subject: IPv4 Address Exhaustion Effects on the Earth

At a ceremony held on February 3, 2011 the Internet Assigned Numbers Authority (IANA) allocated the remaining last five /8s of IPv4 address space to the Regional Internet Registries (RIRs). With this action, the free pool of available IPv4 addresses was completely depleted. Continue reading “IPv4 Address Exhaustion Causing Harmful Effects on the Earth”

Bluehost IPv6 Epic Fail

Recently, I had a conversation with my hosting provider to determine if they had IPv6 support.  I'm interested in getting my web site set up and reachable via IPv6.  Below is a copy of the conversation I had with their customer support; it clearly indicates we've got a long way to go before IPv6 is ready for the masses:

Bluehost Support: Hi, how can I help you?
Stefan: Hi, I am a hosting customer and I was wondering if you currently have support for IPv6?
Bluehost Support: Let me look into that for you.
Bluehost Support: Yes, we do support IPv6.
Stefan: Great! Is there a cost associated with that and how do I go about setting that up?
Bluehost Support: I am sorry, we cannot give you IPv6 until our IP5 runs out.
Stefan: Wait, you just told me you have support for IPv6.  What the heck is IP5?
Bluehost Support: IP5 is the version before IP6.  We can’t give you an IP6 until our IP5 runs out. I am sorry for the misunderstanding.

Interview with Chris Grundemann, Author of ‘Day One: Exploring IPv6’

Spend a little time in Juniper, ARIN, or a wide variety of other networking forums, and you'll likely see the name Chris Grundemann.  Recently, we had the opportunity to catch up with him and discuss the nature of his involvement in deploying IPv6 at tw telecom, as well as his recently published Juniper booklet entitled "Day One: Exploring IPv6".


Thanks Chris for joining us today.  Tell us a little bit about yourself and your career experience, and specifically tell us about your day-to-day experience working with IPv6.

Certainly. Career-wise, I am currently engaged as a Network Architect with tw telecom inc., where I am responsible for setting forward-looking architectures and leading various technology development efforts. I am also the Founding Chair of the Colorado Chapter of the Internet Society, Founding Editor of Burning with the Bush and an active participant (and current AC nominee) in the ARIN policy development process. Obviously I am also the author of the Juniper "Day One: Exploring IPv6" booklet.

My day-to-day experience with IPv6 is actually pretty minimal at this point. Last year while I was still on the IP backbone team here at tw telecom, I rolled out IPv6 across all of our PE routers – in one night. Since then, there has been very little technical work needed from a networking perspective. We still have plenty of work to fully operationalize IPv6 but it is mostly systems and process issues now, much less exciting.

For any readers who are interested, you can find a lot more about me on my personal site. This includes links to my Facebook and LinkedIn profiles, so feel free to send me an invite to connect!

You rolled out IPv6 across all of your PE routers in a single night! That’s a pretty big accomplishment. Would you say that Juniper’s implementation of IPv6 made it easy to deploy and support IPv6 across a large number of devices?

Thanks! There was of course plenty of preparation leading up to that night, but we "flipped the switch" all at once and it went extremely smoothly.

All of Juniper's carrier routers forward IPv6 in hardware, which is huge. Also, IPv6 was integrated into Junos very well; most of the commands are similar, if not the same, between IPv4 and IPv6, which makes it really easy operationally speaking. So, yes, I would definitely agree that Juniper's implementation of IPv6 makes it easy to deploy and scale.

Ok, so let's specifically talk about the current state of affairs with IPv6.  Hurricane Electric, one of the leading providers of IPv6 connectivity, states that as of the time of this writing we have less than a year remaining until complete IPv4 exhaustion.  This is based on the fact that there are only sixteen /8 network blocks available for allocation (approximately 6%).  We've heard figures such as this for many years now, but techniques like NAT have allowed people to extend the life of the existing IPv4 address pool.  Based on your experience working with IPv6 and also your involvement with ARIN, can you help us to understand what is fact and what is fiction – how long do you really think we have before total address exhaustion becomes a reality and customers will have no choice but to start looking at IPv6 for future deployments?

Let me re-phrase your query into two distinct questions if I may: How long do we have with IPv4 and when will network operators be forced to consider IPv6 deployment? The answers are very different so I think they should be addressed individually.

First, how long do we have with IPv4? As you state, Hurricane Electric's widget gives us less than a year. But let's start with a quick level-set. There are actually three distinct points leading up to what I would call "complete IPv4 exhaustion." The first is IANA unallocated pool exhaustion. This is the point when the global pool of IPv4 /8s designated for unicast routing reaches 5 remaining and subsequently each of the 5 RIRs receives one (thus depleting the unallocated pool completely). The second point is RIR exhaustion, when the Regional Internet Registries can no longer allocate or assign IPv4 addresses that they received from IANA (because they don't have any). Finally, true exhaustion happens when the ISPs/LIRs exhaust their remaining IPv4 addresses and end users simply cannot get a routable IPv4 address.

As I understand it, Hurricane Electric is getting its data from the IPv4 Address Report built by Geoff Huston and is predicting the date of the first point: exhaustion of the IANA IPv4 unallocated address pool. As of today that date is projected to be 1 July, 2011 – less than a year away. However, this projection is based on the current and historical run-rate, on how fast we have consumed IPv4 addresses up to this point. Because so many folks have not paid attention to IPv6 and are still wholly dependent on IPv4, it is quite likely that the run-rate will increase, perhaps drastically, as we get closer to IANA unallocated pool exhaustion. If this happens, we actually have much less than one year before reaching that first point.

Predicting the second point gets a little murkier, because different folks define this point differently. Should we declare that RIR exhaustion is upon us when the first RIR runs out of unallocated IPv4 address space? When the last one does? Perhaps when the RIR for your region has no unallocated IPv4 to give you? Mr. Huston projects the date “where the first RIR has exhausted its available pool of addresses” and since he has already done all the work, it is a convenient place to set the bar. As of today that date is predicted to be 20 January, 2012. Remember again that this does not take into account any possible run on IPv4 addresses that may happen between now and then and that other RIRs will have IANA allocated IPv4 space for some time after that date.

The final point is the hardest one to pin down. This is mostly because it would be very hard, if not impossible to quantify how much currently allocated/assigned address space is unused or underused.

Many ISPs may be able to feed off of current reserves for months or even years, while many more will run out of IPv4 addresses within weeks of receiving their last traditional allocation from their RIR.

You also have to take into account things like IPv4 address transfers which are now allowed in many regions, other possible policy changes and transition technologies such as carrier-grade-NAT (CGN). All of these things pull IPv4 use in different directions. So no one can intelligently predict this final date.

Although I cannot tell you that IPv4 will be dead in X years, there are some very important facts that we should not overlook. The first is that Geoff Huston’s projections have remained quite consistent over the past two years, and the time remaining has steadily decreased over those two years. The second is that we are running out of usable IPv4 addresses. NAT was a stop gap to allow folks time to adopt IPv6. That time has largely been wasted unfortunately. The bottom line is that IPv4 will continue to become more expensive to use on interconnected networks while IPv6 continues to become less expensive.

This is where the second question comes into play: When will network operators be forced to look at IPv6 deployment? The truth is that they should be looking into it now. If you are not adding only IPv6-capable hardware and software to your network now, you are going to be forced to spend extra money upgrading sooner than you would like. As IPv4 becomes ever more expensive (both directly, as ISPs charge more for it or you are forced to pay for addresses through a transfer, and indirectly, as CGNs and other transition mechanisms drive up operational costs), many will turn to IPv6 – more and more as the next two years play out. Businesses that have IPv6-capable networks now will have a competitive advantage over those who are forced to upgrade their network to get IPv6 connectivity.

An often overlooked aspect of this question is security. If your network is not IPv6 enabled today, you likely have IPv6 traffic being tunneled right through your firewalls. Another is mobile access – very soon mobile phone operators will be migrating to 4G technologies that take advantage of IPv6 addressing for all new phones on their networks. These IPv6 mobile devices will be reaching your website(s) via IPv6; if you want them to have the best possible experience, your site needs to be running IPv6 natively.  As soon as a website is IPv6-only, ISPs will be required to provide IPv6 connectivity or lose customers to those who do.

So, in short, the answer really is now. Everyone should be thinking of IPv6 when planning all future network deployments, starting now (if not yesterday).

Many industry experts are already speculating that an IPv4 black market will exist because of the depletion of IPv4 address space and the lack of a large IPv6 installed base. Do you suspect there will be a black market for IPv4 addresses and what impacts might this have?

The answer varies a bit depending on how you choose to define black market. Under many definitions, that market already exists. I think that it already impacts us and that it will get worse as we near and ultimately cross the free pool depletion threshold. Think spammers and phishers operating out of address blocks that they beg, borrow, steal and often “rent” or “buy.” There are also instances where much more legitimate businesses make back-room deals for the use of IP addresses.

Overall, some of the most negative impacts surround the WHOIS database and the integrity of the data it contains. When folks get addresses through grey or black markets instead of from the RIR, they are probably not going to report proper reassignment registration information to the RIR. This leads to stale WHOIS data, which makes troubleshooting and abuse reporting much harder for operators and investigation much harder for law enforcement. I helped author a recently adopted ARIN policy change to start addressing some of this and am actually spearheading an effort to continue that work with another policy proposal in the ARIN region as we speak.

Another concern is prefix hijacking. This is not really a black market issue, but it is another facet of the problems we will face more and more as IPv4 gets more expensive, unless and until IPv6 adoption picks up across the board.

There is a lot of work going on right now within ARIN, the IETF and other RIRs to try and limit the impacts of any IPv4 black market and other abuses, and also to ease the overall IPv4-IPv6 transition. Anyone interested in this work should join the ARIN-PPML (Public Policy Mailing List) [http://lists.arin.net/mailman/listinfo/arin-ppml] (or their local equivalent) and/or show up at a meeting, and join the conversation. ARIN and the other RIRs are open, transparent, ground-up organizations and your voice can make a huge impact.

It has been observed that the number of Autonomous Systems supporting IPv6, as well as the volume of IPv6 DNS queries for AAAA records, has significantly increased in the last several years.  Have we reached that critical mass where widespread adoption is imminent, and if so, what can we expect to see in the next few years?

I think that widespread adoption is imminent now but I don’t believe that it is an issue of critical mass, more an issue of network operators starting to see the dates we discussed above nearing. I make the distinction because there is still very little IPv6 traffic actually flowing. What I think is happening is that the folks who have been watching this are getting a sense of urgency that is finally breaking through to the folks who write the checks. The real critical mass moment will be when we see IPv6 traffic levels really climbing and IPv4 traffic growth starting to slow. I think you can expect to see that in the next few years. Within five certainly and probably within three.

Let’s talk for a minute about the Juniper “Day One” guides.  Can you tell us what they are all about, and more specifically, tell us a little bit about the “Day One: Exploring IPv6” guide that you’ve written.  What is it all about and what can potential readers hope to gain from reading it?

The Day One guides are exactly what the name implies, they are booklets that give you everything you need to know to get through your first day working on the covered topic. They are hands-on, example-driven, cut-to-the-chase primers on all sorts of topics surrounding Juniper Networks gear and the Junos OS. In “Exploring IPv6” I tried to really provide that common sense starting point for implementing IPv6. The booklet covers enabling IPv6, adding IPv6 addresses to interfaces, configuring static routes in IPv6, implementing IPv6 IGPs (RIPng, IS-IS and OSPF v3) as well as all the basic verification and troubleshooting that surrounds those topics. If you follow along through the book examples and work through all of the “try it yourself” exercises, you should gain a solid understanding of IPv6 LANs and how IPv6 is implemented in JUNOS, as well as a great general / vendor-agnostic view of IPv6 itself and how it differs from IPv4.

Tell us a little bit about the “Advanced IPv6” Day One Guide that you are currently working on?  What should the network practitioner hope to gain from reading it?

Advanced IPv6 is kind of a “Day Two” guide on IPv6 in JUNOS. It continues right where Exploring IPv6 left off and moves onto more advanced topics such as BGP, VRRP, CoS, Multicast and system management. It takes you another big step towards being able to fully implement IPv6 in a production environment.

After you are done with “Advanced IPv6” do you have any other writing aspirations?

I do. Writing is definitely work but I am finding that it’s work I really enjoy. Hopefully others like my writing in these Day One booklets and that gives me the opportunity to continue writing!

Thanks for joining us today Chris.  This has been extremely informative and we are all really excited about reading your next Day One guide and are anxiously awaiting its arrival!

Book Review :: JUNOS High Availability: Best Practices for High Network Uptime

JUNOS High Availability: Best Practices for High Network Uptime
by James Sonderegger, Orin Blomberg, Kieran Milne, Senad Palislamovic
Paperback: 688 pages
Publisher: O’Reilly Media
ISBN-13: 978-0596523046

5 stars: High Praises for JUNOS High Availability

Building a network capable of providing connectivity for simple business applications is a fairly straightforward and well-understood process. However, building networks capable of surviving varying degrees of failure and providing connectivity for mission-critical applications is a completely different story. After all, what separates a good network from a great network is how well it can withstand failures and how rapidly it can respond to them.

While there are a great deal of books and resources available to assist the network designer in establishing simple network connectivity, there aren’t many books which discuss the protocols, technologies, and the myriad ways in which high availability can be achieved, much less tie it all together into one consistent thread. “JUNOS High Availability” does just that, in essence providing a single, concise resource covering all of the bits and pieces which are required in highly available networks, allowing the network designer to build networks capable of sustaining five, six, or even seven nines of uptime.

In general, there are a lot of misconceptions and misunderstandings amongst Network Engineers with regard to implementing high availability in Junos. One only needs to look at the fact that Graceful Restart (GR) protocol extensions and Graceful Routing Engine Switchover (GRES) are often mistaken for the same thing, thanks in no small part to the fact that these two technologies share similar letters in their acronyms. This book does a good job of clarifying the difference between the two and steers clear of the pitfalls typically prevalent in coverage of the subject matter.

The chapter on 'Control Plane High Availability' covers the technical underpinnings of the underlying architecture on most Juniper platforms; coverage of topics like the separation between the control and forwarding planes, and kernel replication between the Master and Backup Routing Engines, gives the reader a solid foundation for understanding concepts like Non-Stop Routing, Non-Stop Bridging, and In-Service Software Upgrades (ISSU). In particular, I found this book to be very useful on several consulting engagements in which seamless high availability was required during software upgrades, as the chapter on 'Painless Software Upgrades' discusses the methodology for achieving ISSU and provides a checklist of things to be performed before, during, and after the upgrade process. Similarly, I found the chapter on 'Fast High Availability Protocols' to be very informative, providing excellent coverage of BFD as well as the differences between Fast Reroute vs. Link and Node Protection.
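To make the GR/GRES distinction mentioned above concrete, the two features are enabled in entirely different parts of the configuration. Below is a minimal sketch of my own (not taken from the book) of the relevant Junos statements; GRES preserves kernel and forwarding state across a Routing Engine switchover, while Graceful Restart is a set of per-protocol extensions that asks neighbors to keep forwarding while the local routing process restarts:

chassis {
    redundancy {
        /* GRES: switch to the backup Routing Engine without resetting the PFE */
        graceful-switchover;
    }
}
routing-options {
    /* GR: protocol-level restart extensions, entirely separate from GRES */
    graceful-restart;
}

Note that on platforms supporting Non-Stop Routing, NSR (routing-options nonstop-routing, paired with system commit synchronize) is generally used instead of GR, since the two approaches solve the control-plane continuity problem in different ways.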

Overall, I feel this book is a valuable addition to any networking library and I reference it often when I need to implement certain high availability mechanisms, or simply to evaluate the applicability of one mechanism versus another for a certain deployment. The inclusion of cost as a factor in high availability design is a welcome addition and one that all too many authors fail to cover. Naturally, it only makes sense that costs should be factored into the equation, even when high availability is the desired end-state, in order to ensure that ultimately the business is profitable. If I had to make one suggestion for this book, it is that there should be additional coverage of implementing High Availability on the SRX Series Services Gateways using JSRP, as this is a fundamental high availability component within Juniper's line of security products. To the authors' credit, however, this book was written just as the SRX line was being released, so I don't fault them for providing limited coverage. Perhaps more substantial coverage could be provided in the future if a Second Edition is published.

The bottom line is this – if you are a Network Engineer or Architect responsible for the continuous operation or design of mission-critical networks, “JUNOS High Availability” will undoubtedly serve as an invaluable resource. In my opinion, the chapters on ‘Control Plane High Availability’, ‘Painless Software Upgrades’, and ‘Fast High Availability Protocols’ are alone worth the entire purchase price of the book. The fact that you get a wealth of information beyond that in addition to the configuration examples provided makes this book a compelling addition to any networking library.

What’s the BFD with BFD?

Many networks today are striving for "five nines" high availability and beyond. This means that network operators must configure their networks to detect and respond to failures as quickly as possible, preferably on the order of milliseconds. This is in contrast to the failure detection inherent in most routing protocols, which is typically on the order of several seconds or more. For example, the default hold-time for BGP in JUNOS is 90 seconds, which means that in certain scenarios BGP will have to wait upwards of 90 seconds before a failure is detected, during which time a large percentage of traffic may be blackholed. It is only after the failure is detected that BGP can reconverge on a new best path.

Another example is OSPF, which has a default dead interval of 40 seconds, or IS-IS, which has a default hold-time of 9 seconds (for DIS routers) and 27 seconds (for non-DIS routers). For many environments which support mission-critical data, Voice/Video, or other real-time applications, any failure that takes multiple seconds to detect is simply too long.

While it is possible to lower timers in OSPF or IS-IS to such an extent that a failure between two neighbors can be detected rather quickly (~1 second), it comes at the cost of increased protocol state and a considerable burden on the Routing Engine's CPU.  As an example, consider a router with several hundred neighbors. Maintaining such aggressive Hello timers for all of these neighbors dramatically increases the amount of work that the Routing Engine must perform. Therefore, it is a widely accepted view that a reduction in IGP timers is not the best overall solution to the problem of fast failure detection.
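For illustration only, a hedged sketch of what aggressive (but still second-granularity) OSPF timers might look like in JUNOS is shown below; the interface name and values are hypothetical:

protocols {
    ospf {
        area 0.0.0.0 {
            /* send Hellos every second; declare the neighbor down after 3 seconds */
            interface ge-0/0/0.0 {
                hello-interval 1;
                dead-interval 3;
            }
        }
    }
}

Every neighbor configured this way adds per-second Hello processing on the Routing Engine, which is exactly the scaling problem described above.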

Another reason that adjusting protocol timers is not the best solution is that there are many protocols which don’t support a reduction of timers to such an extent that fast failure detection can be realized. For example, the minimum BGP holdtime is 20 seconds, which means that the best an operator can hope for is a bare minimum of 20 seconds for failure detection.

Furthermore, lowering protocol timers does nothing for situations in which there is no protocol at all, for example, Ethernet environments in which two nodes are connected via a switch, as can be seen in the figure below.  In this type of environment, R1 has no idea that R2 is not reachable, since R1's local Ethernet segment connected to the switch remains up.  Therefore, R1 can't rely on an 'Interface Down' event to trigger reconvergence onto a new path and instead must wait for higher layer protocol timers to age out before determining that the neighbor is not reachable.  (Note to the astute reader: Yes, Ethernet OAM is certainly one way to deal with this situation, but that is a discussion which is beyond the scope of this article.)

[Figure: R1 and R2 connected to each other through an intermediate Layer 2 switch]

Essentially, at the root of the problem is either a lack of suitable protocols for fast failure detection at lower layers, or worse, no protocol at all.  The solution was Bidirectional Forwarding Detection, or BFD, developed jointly by Cisco and Juniper.  It has been widely deployed and continues to gain acceptance, with more and more protocols being adapted to use BFD for fast failure detection.

So what is the Big Freaking Deal with Bidirectional Forwarding Detection anyway, and why are so many operators implementing it in their networks?  BFD is a simple hello protocol with the express purpose of rapidly detecting failures at lower layers.  The developers wanted a low-overhead mechanism for exchanging hellos between two neighbors without all the nonessential bits which are typical in an IGP Hello or BGP Keepalive.  Furthermore, the method had to be able to quickly detect faults in the bidirectional path between two neighbors in the forwarding plane.  Originally, BFD was developed to provide a simple mechanism for use on Ethernet links, as in the example above, prior to the development of Ethernet OAM capabilities.  Hence, BFD was designed to provide fault identification on the end-to-end path between two neighbors.

Once BFD was developed, the protocol designers quickly found that it could be used for numerous applications beyond Ethernet.  In fact, one of the main benefits of BFD is that it provides a common failure-detection method for a large number of protocols, giving operators a single, centralized mechanism which can be reused.  In other words, let routing protocols do what they do best (exchange routing information and recalculate routing tables as necessary) rather than detect faults at lower layers.  An offshoot of this is that it allows network operators to configure higher protocol timer values for their IGPs, further reducing the burden placed on the Routing Engine.

BFD timers can be tuned such that failure detection can be realized in just a few milliseconds, allowing failure detection and reconvergence to take place in timeframes similar to SONET Automatic Protection Switching.  A word of caution – while BFD can dramatically decrease the time it takes to detect a failure, operators should be careful not to set the intervals too low.  Very aggressive BFD timers could cause a link to be declared down even when there is only a slight variance in link quality, which could cause flapping and other disastrous behavior to ensue.  The best current practice with regard to BFD timers is to set a transmit and receive interval of 300ms and a multiplier of 3, which equates to 900ms for failure detection.  This is generally considered fine for most environments, and only the most stringent of environments should need to set their timers more aggressively than this.
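As a rough illustration of that best practice, here is a hedged sketch of how those timers might be applied to an OSPF neighbor in JUNOS; the interface name is hypothetical, and similar bfd-liveness-detection stanzas exist for other protocols such as BGP and static routes:

protocols {
    ospf {
        area 0.0.0.0 {
            interface ge-0/0/0.0 {
                /* 300ms intervals with a multiplier of 3 = 900ms detection time */
                bfd-liveness-detection {
                    minimum-interval 300;
                    multiplier 3;
                }
            }
        }
    }
}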

A question that is commonly asked is how BFD can send hello packets at millisecond intervals without becoming a burden on the router.  The answer lies in the fact that BFD was intended to be lightweight and to run in the forwarding plane, as opposed to the control plane (as is the case with routing protocols).  While early implementations of BFD ran in the control plane, most newer implementations run in the forwarding plane, taking advantage of the dedicated processors built into the forwarding plane and alleviating the burden which would otherwise be placed on the RE.  In JUNOS releases prior to 9.4, BFD Hello packets were generated by RPD running on the RE; in order to enable BFD to operate in the PFE on those releases, delegation to the Periodic Packet Management Daemon (PPMD) had to be enabled using the command 'set routing-options ppm delegate-processing'.  In JUNOS 9.4 and higher this is the default behavior, and BFD Hello packets are automatically handled by PPMD operating within the PFE.
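For completeness, the stanza form of that statement (needed only on pre-9.4 releases, where delegation was not yet the default) would look something like the following sketch:

routing-options {
    ppm {
        /* hand periodic packet (e.g. BFD Hello) generation down to the PFE */
        delegate-processing;
    }
}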

Facilitating Firewall Filter Configuration in JUNOS using ‘apply-path’

Undoubtedly, one of the coolest features in JUNOS is the apply-path statement. Using apply-path, an operator can build a prefix-list whose contents are derived dynamically from the IP prefixes found at a defined path within the configuration. This facilitates tasks like configuring firewall filters that allow traffic from configured BGP neighbors, keeping them up to date automatically as neighbors are added or removed; a brief sketch follows below.
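As a hedged example (the prefix-list, filter, and term names here are hypothetical), a prefix-list built with apply-path from every configured BGP neighbor might be referenced in a Routing Engine protection filter like this:

policy-options {
    prefix-list bgp-neighbors {
        /* expands to the address of every neighbor under every BGP group */
        apply-path "protocols bgp group <*> neighbor <*>";
    }
}
firewall {
    family inet {
        filter protect-re {
            term allow-bgp {
                from {
                    source-prefix-list {
                        bgp-neighbors;
                    }
                    protocol tcp;
                    port bgp;
                }
                then accept;
            }
        }
    }
}

Adding or removing a BGP neighbor then updates the filter automatically on the next commit, with no change to the filter itself.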

Continue reading “Facilitating Firewall Filter Configuration in JUNOS using ‘apply-path’”

Implementing Provider-Provisioned VPNs using Route Reflectors

MPLS/BGP Provider-Provisioned VPNs, such as those proposed in RFC 4364 (formerly RFC 2547) or draft-kompella variants, suffer from some scalability issues due to the fact that all PE routers are required to have a full iBGP mesh in order to exchange VPN-IPv4 NLRI and associated VPN label information.  In a modern network consisting of a large number of PE devices, it becomes readily apparent that this requirement can quickly become unmanageable.

The formula to compute the number of sessions for an iBGP full mesh is n * (n-1)/2.  10 PE devices would only require a total of 45 iBGP sessions (10 * 9/2 = 45).  However, by simply adding 5 additional PEs into this environment, the total number of sessions grows quadratically to 105 (15 * 14/2 = 105).  Scalability issues arise because maintaining this number of iBGP sessions on each PE is an operational nightmare, and control plane resources are quickly exhausted.

An alternative that has gained widespread adoption is to utilize Route Reflectors to reflect the VPN-IPv4 NLRI and associated VPN label between PE devices.  However, several issues arise when using Route Reflectors in such an environment.  In a normal environment without Route Reflectors, MPLS tunnels exist between each pair of PE routers, such that when the VPN-IPv4 NLRI and associated VPN label are received, a PE router can recurse through its routing table to find the underlying MPLS tunnel used to reach the remote BGP next-hop within the VPN-IPv4 NLRI.  In the Route Reflection model, the Route Reflector typically doesn't have an MPLS tunnel to each PE for which it is receiving VPN-IPv4 NLRI.  As a result, these routes never become active and are therefore not candidates for reflection back to other client and non-client peers.

A few methods have been developed to circumvent this issue.  One method is to simply define MPLS tunnels from the Route Reflector to each PE.  This solves the problem by allowing the Route Reflector to find a recursive match (i.e. an MPLS tunnel) in order to reach the remote PE.  However, this approach suffers from the drawback that it requires a large number of MPLS tunnels to be configured whose only purpose is to allow the received VPN-IPv4 NLRI to be considered active.  Remember, these tunnels are completely useless for the actual forwarding of data; they are only used within the control plane to instantiate routes.

An alternative and much more graceful solution to this problem is to configure the Route Reflector with a static discard route within the routing table which is used to reference BGP next-hops in MPLS environments (inet.3 for example in JUNOS).  This static discard route only serves to function as a recursive match when incoming VPN-IPv4 NLRI are received for the express purpose of making these routes active and therefore candidates for redistribution.  In JUNOS, one can accomplish this using the following configuration:

routing-options {
    rib inet.3 {
        static {
            route 0.0.0.0/0 discard;
        }
    }
}

With the above, any VPN-IPv4 NLRI received from a PE router is immediately made active due to the fact that a static route has been created in inet.3 which is the routing table used in JUNOS to recurse for BGP next-hops in MPLS environments.
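To put the pieces together, a hedged sketch of what the Route Reflector's corresponding BGP configuration might look like is shown below; the group name, addresses, and cluster ID are hypothetical:

protocols {
    bgp {
        group rr-clients {
            type internal;
            local-address 10.0.0.1;
            /* carry VPN-IPv4 NLRI so it can be reflected between PE clients */
            family inet-vpn {
                unicast;
            }
            cluster 10.0.0.1;
            neighbor 10.0.0.11;
            neighbor 10.0.0.12;
        }
    }
}

Combined with the static discard route in inet.3 shown above, the reflector will accept the VPN-IPv4 routes from one PE client and reflect them to the others without needing any MPLS tunnels of its own.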

An excellent whitepaper entitled “BGP Route Reflection in Layer 3 VPN Networks” expands upon this and describes the benefits of using Route Reflection in such environments. It also builds the case for using a distributed Route Reflection design to further enhance scalability and redundancy.

One thing to keep in mind is that with the Route Reflector approach, we have merely moved the problem from the PE devices to the Route Reflector.  Although it minimizes the number of iBGP sessions required on PE devices, the Route Reflector must be capable of supporting a large number of iBGP sessions and, in addition, must be able to store all of the VPN-IPv4 NLRI for all of the VPNs it is servicing.  It is highly recommended that adequate memory be in place on the Route Reflector to store this large amount of routing information.

Finally, while using Route Reflectors is an acceptable interim solution to the scaling concerns with Provider-Provisioned VPNs, it is not clear if this approach is sufficient for the long term.  There are several other options being examined, with some of them outlined in a presentation entitled "2547 L3VPN Control Plane Scaling" given at APRICOT in Kyoto, Japan in 2005.