9.1 BGP Overview | Juniper Networks Reference Guide: JUNOS Routing, Configuration, and Architecture

EGPs, in a nutshell, provide the information necessary for interdomain routing. More specifically, they provide network layer reachability information (NLRI), which is used to determine the best paths through multiple ASs. As discussed in Section 8.1, an IGP provides the best route information within an AS. This chapter focuses on the larger, interdomain scale.

BGP4, a member of the EGP family, is the standard implemented across the Internet. Figure 9-1 shows an overview of the relationship of BGP and a given IGP to the Internet. BGP provides NLRI between any two or more ASs. Furthermore, each AS can use a different IGP to accomplish intradomain routing. Although not shown in the figure, there are also cases in which BGP is used internally to provide reachability information. Doing so can optimize intradomain routing by providing detailed routing information for multiple points of egress.

Figure 9-1. BGP/IGP AS Overview

graphics/09fig01.gif

9.1.1 External BGP and Internal BGP Conceptual Model

BGP was developed to handle interdomain routing and eliminate issues surrounding such topics as routing loops, backbone-to-backbone peering, and reachability information for IGPs for external networks. BGP is used for both internal (IBGP) and external (EBGP) peering.

Figure 9-2 shows a very basic BGP logical view. Washington D.C., in AS100, has an EBGP session with New York, in AS200. This session is considered external because it includes two neighbors, each in its own AS. They are therefore external neighbors. There is also an IBGP session between New York and Boston in AS200, so they are internal neighbors. Notice that there is no physical connection between the two. IBGP does not require that the peers be directly connected. To take it one step further, we have included an internal router in the physical path. There can be several non-BGP speaking routers within an AS. As long as the BGP speakers can create a connection logically to each other, the peering session can be established.

Figure 9-2. IBGP and EBGP Conceptual Model

graphics/09fig02.gif

The main distinguishers between EBGP and IBGP include how the BGP process handles routing updates, the use of attributes, and the propagation of routing information.

EBGP sessions work differently than IBGP sessions. Typical EBGP peering routers are directly connected. However, EBGP can take advantage of logical connectivity, like IBGP sessions, through the use of the multihop statement. This will be covered later in Section 9.4.2.21. Again, EBGP peers are not in the same AS; therefore, when setting up peering, EBGP peers typically will not listen to other devices advertising BGP routes to them unless they are explicitly configured to do so.

IBGP sessions, on the other hand, operate differently. Peer sessions can be established either through directly connected interfaces or logical connectivity. There is no true dependence on physical topology. In BGP, the next-hop is almost always validated by a recursive lookup in the IGP routing table. If there is no route to the next-hop (the border router named in the NEXT_HOP attribute), then BGP will not allow the route to be installed into the routing table. IBGP also mandates that each IBGP-speaking router have a peering session with every other IBGP-speaking router. This concept is known as IBGP full-mesh, or the IBGP mesh. In IBGP, when a route is learned from another IBGP peer, the receiving system will not advertise the route to other IBGP neighbors, owing to the full-mesh concept. Since the advertisement will already have been sent to all peers from the original sender, there is no need to duplicate this information.
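The full-mesh requirement and the rule against re-advertising IBGP-learned routes can be illustrated with a short Python sketch. This is only a model of the two rules just described, not router code; the function names and the session-type labels ("ibgp", "ebgp") are assumptions made for the illustration, and the router names are borrowed from the IBGP example in this section.

```python
from itertools import combinations


def full_mesh_sessions(routers):
    """Return every IBGP session a full mesh needs: one per unordered pair."""
    return list(combinations(sorted(routers), 2))


def may_advertise(learned_from, send_to):
    """Apply the IBGP rule from the text.

    A route learned from an IBGP peer is not passed on to another IBGP peer.
    The labels 'ibgp' and 'ebgp' are hypothetical names used in this sketch.
    """
    return not (learned_from == "ibgp" and send_to == "ibgp")


routers = ["Washington D.C.", "New York", "Atlanta", "Dallas"]
sessions = full_mesh_sessions(routers)

# n routers need n * (n - 1) / 2 sessions, so the mesh grows quadratically.
print(len(sessions))
for a, b in sessions:
    print(a, "<->", b)
```

The quadratic growth in session count is the practical cost of the full-mesh rule as the number of IBGP speakers increases.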

In Figure 9-3, we see a sample of IBGP peering sessions. Routers Washington D.C., New York, Atlanta, and Dallas are configured to create a logical full-mesh. They do not have a physical full-mesh topology, however. Washington D.C. does not have a physical connection to Dallas; New York does not have a physical connection to Atlanta. However, in terms of IBGP connectivity, each router must have a peering session with every other IBGP-speaking router. Because there is no longer a requirement for the IBGP session to be established over a physically meshed topology, Washington D.C. and Dallas can establish a peering session through either Atlanta or New York. In this particular scenario, if Atlanta were to fail, the IBGP session would remain up through New York.

Figure 9-3. IBGP Logical Connectivity

graphics/09fig03.gif

Because IBGP can be established through logical connectivity, it is recommended that all IBGP sessions be established between loopback interfaces. If the loopback addresses are advertised into the IGP properly, the IBGP peering can be established as long as a single physical interface to the local system is up.

9.1.2 Autonomous System Numbers

The introductory discussion of IBGP and EBGP mentioned autonomous systems, or ASs, which are basically single administrative entities when it comes to the Internet. Each AS is unique, with no duplication occurring, and is typically managed by a single network administration. Autonomous system numbers (ASNs) can range from 1 to 65,535; however, 32,768 to 64,511 are reserved for future use, and 64,512 to 65,535 are reserved for use as private ASNs. For larger enterprise networks and some carrier networks, the use of private ASNs works well for scalability and for certain designs. These numbers should not be advertised beyond the boundary of the public AS using them. If this sounds familiar, it is because private ASNs are not unlike the private IP address blocks defined in RFC 1918.
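As a quick illustration, the ranges quoted above can be turned into a small Python check. This is a minimal sketch that uses only the boundaries stated in this section; the function name `classify_asn` and the labels it returns are assumptions made for the example, and current registry allocations may differ from the ranges given here.

```python
def classify_asn(asn):
    """Classify an ASN using the ranges quoted in this section."""
    if not 1 <= asn <= 65535:
        raise ValueError(f"ASN {asn} is outside the range 1 to 65535")
    if asn >= 64512:
        return "private"   # 64,512 to 65,535
    if asn >= 32768:
        return "reserved"  # 32,768 to 64,511
    return "public"        # 1 to 32,767


for asn in (100, 32768, 64512):
    print(asn, classify_asn(asn))
```

A check of this kind is the sort of test a policy would apply before letting an advertisement leave the public AS.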

For instance, a large carrier network may have an ASN of 100. This ASN is advertised throughout the Internet. Through the use of BGP, prefixes are associated with AS100. BGP advertises or announces a given prefix to other ASs. They, in turn, will announce the prefixes associated with AS100 to their neighbors. As this occurs, each AS attaches its own ASN to the advertisement, creating a path list. The AS_PATH attribute, explained in Section 9.2.5.2, provides a listing of all ASs that have announced the prefix that originated in AS100. A remote system in AS333 might receive an advertisement of a prefix from multiple external neighbors. Through the route-selection process, explained in Section 9.1.4, the router will make a decision about which one of these routes it will install into the local routing table.
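The way the path list builds up can be sketched in a few lines of Python. This is a simplified model rather than protocol code: the function names are hypothetical, the path is a plain list of ASNs, and AS200 is an assumed intermediate AS added for the example. It also shows why the list is useful for loop prevention, since an AS that finds its own ASN in a received path can discard the advertisement.

```python
def advertise(as_path, local_asn):
    """Return the path as sent to an external neighbor: local ASN prepended."""
    return [local_asn] + list(as_path)


def accept(as_path, local_asn):
    """Reject an advertisement whose path already contains the local ASN."""
    return local_asn not in as_path


# AS100 originates the prefix, then AS200 passes it on to its neighbors.
path = advertise([], 100)
path = advertise(path, 200)
print(path)

print(accept(path, 333))  # AS333 has not seen this path before
print(accept(path, 100))  # AS100 would be accepting its own route back
```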

Three regional Internet registries located throughout the world assign ASNs.

ARIN (www.arin.net) assigns numbers for North America, South America, the Caribbean, and sub-Saharan Africa.
RIPE NCC (www.ripe.net) assigns numbers for Europe, the Middle East, and part of Africa.
APNIC (www.apnic.net) assigns numbers for Asia Pacific.

These registries provide IP and ASN assignment and allocation. The private address space and private ASNs can be used without any approval. However, take care to ensure that no private IP addresses or private ASNs are advertised into the public space. In JUNOS, this can be accomplished through the use of policy. Private ASNs can also be filtered with the remove-private statement. Since it is vitally important that private ASNs do not get advertised into the Internet routing table, providers have routing policy in place to ensure this type of information does not traverse the external border routers.

In order for BGP to operate properly, the ASN must be set. This is accomplished with the following command, shown together with the resulting configuration for AS100:

    set routing-options autonomous-system ASN

    routing-options {
        autonomous-system 100;
    }

9.1.3 Topologies: Transit and Homing

BGP provides reachability information for the Internet. This section will discuss a few of the common scenarios regarding transit and nontransit AS and homing.

9.1.3.1 Transit and Nontransit AS

The nature of internetworking has resulted in several different types of network topologies. Depending on topology and how networks are connected to other networks, unintended results, such as suboptimal routing or loss of routing information altogether, can occur. Figure 9-4 illustrates two topologies. In diagram 1, AS100 is connected to AS200 and AS300. In diagram 2, AS100 is connected to AS200 in two places. These scenarios can create a transit AS, which allows traffic not destined for networks residing in the AS to pass through to the intended destination. Nontransit means that only traffic destined for the given AS should enter. Usually this is not a problem, but when multihoming to different ASs and having large geographical topologies, the potential for unintended transit issues to arise is greater. Maintaining a specific peering policy with upstream providers and downstream customers can eliminate some transit problems. By performing in-depth analysis of your peering arrangements and understanding all external connection points, you can administer your network and cut down on the number of potential problems.

Figure 9-4. Transit/Homing Overview Topologies

graphics/09fig04.gif

9.1.3.2 Homing

Homing generally refers to how an AS is interconnected. This section describes common homing topologies and provides more detailed explanation of transit and nontransit ASs. There are three different characteristics that an AS may take on:

Single-homed (stub AS), which has one ingress/egress point for all traffic
Multihomed nontransit, which has multiple ingress/egress points for traffic flow (nontransit indicates that the AS should not be used to carry traffic or advertise routes for traffic not destined for the AS)
Multihomed transit, which also has multiple ingress/egress points for traffic flow (transit means policy conditions are applied to allow negotiated traffic not destined for the AS to flow through the AS)

A quick way to determine if a topology is considered multihomed or not is to determine if there are multiple border routers or multiple links exiting a single router or AS.

Figure 9-5 illustrates a stub/single-homed router. Washington D.C. is connected to New York via some circuit. New York has multiple connections to the rest of the Internet. Since Washington D.C. has only one egress point, it is single-homed and a stub. Figure 9-6 has a topology similar to that of Figure 9-5, but now there are two circuits between Washington D.C. and New York. Washington D.C. is still a stub, but it is multihomed. It would not make any sense to attempt to use Washington D.C. in this case for any transit traffic. It would only add an additional hop to the path and would be a poor implementation of policy.

Figure 9-5. Single-Homed Router (Stub)

graphics/09fig05.gif

Figure 9-6. Stub/Multihomed Router

graphics/09fig06.gif

Figure 9-7 shows how the same physical topology is applied in the context of ASs and BGP. Washington D.C. is in AS100 and New York is in AS200. As long as Washington D.C. is the only border router in AS100 running EBGP, AS100 is single-homed. There is only one circuit connected to its upstream provider. AS100 would also be considered a stub AS. In reality, if Washington D.C. in AS100 were a customer of the provider AS200, then AS200 would most likely assign a private ASN to the customer.

Figure 9-7. Stub/Single-Homed AS

graphics/09fig07.gif

In Figure 9-8, there are two circuits between Washington D.C. and New York. The dual circuits and the fact that multipath can be run tell us that the topology is multihomed. Again, any attempt to take advantage of AS100 for transit would be a waste of time, and in reality AS100 probably would not advertise itself as being able to reach any prefixes other than those assigned to the AS itself. Nonetheless, if poor policy is used, suboptimal routing conditions could occur. It is vital to understand your connection points and policy.

Figure 9-8. Stub/Multihomed AS

graphics/09fig08.gif

In Figure 9-9, Washington D.C. in AS100 is connected to New York and Boston in AS200. In this situation, AS100 is still only using one border router running EBGP; thus, it is still considered a stub AS. However, having two circuits, it is also considered multihomed due to the multiple routers to which it has connectivity. In this scenario, Washington D.C. would probably receive prefixes for the same networks from both New York and Boston. Washington D.C. could manipulate the LOCAL_PREF attribute to make more efficient use of both links. AS200 could also manipulate the MULTI_EXIT_DISC (MED) value of prefixes sent to AS100 in an attempt to influence how AS100 will send traffic. More likely, AS200 would be a provider and AS100 would be a customer. AS200 would probably work with AS100 to establish policy that lets both organizations utilize the links in the most efficient manner.

Figure 9-9. Stub/Multihomed Dual Routers

graphics/09fig09.gif

Figure 9-10 shows a slightly different environment. Both AS100 and AS200 are connected to each other in Los Angeles and New York. AS100 is still a stub AS and multihomed with AS200. A common problem would entail either AS100 or AS200 taking advantage of the other. The New York router in AS100 could send traffic destined to Los Angeles by using the New York router in AS200. AS100 would essentially be using AS200 resources to traverse the country instead of using its own. Depending on each AS's policy, the New York router in AS200 could send the traffic right back to AS100, thereby creating a route loop. The advantage to this scenario, however, is that AS100, as the customer, can advertise certain prefixes to AS200 in an attempt to influence how traffic is sent into the customer network. Thus, traffic destined to Los Angeles would come from the Los Angeles router in AS200.

Figure 9-10. Stub/Multihomed AS with Load-Sharing and Multipath Configurations

graphics/09fig10.gif

In Figure 9-11, Los Angeles and Atlanta in AS100 are connected to Denver in AS300 and Boston in AS400, respectively. This is a multihomed scenario. There is a potential for AS400 to use AS100 as a transit AS to AS300. If AS100 does not have arrangements with either AS300 or AS400 to use it as a transit AS, then strict policy should be implemented to prevent the exploitation of AS100 connection points. It would also be in AS100's best interest to advertise prefixes in the western region to prefer router Los Angeles and prefixes in the eastern region to prefer router Atlanta.
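The strict policy mentioned above amounts, in its simplest form, to announcing only the prefixes that originate inside AS100. The Python sketch below models that idea under stated assumptions: routes are plain dictionaries, a locally originated route is one whose AS_PATH is still empty, and the prefixes and the function name `nontransit_export` are invented for the illustration. It is not JUNOS policy syntax; Chapter 11 covers how policy is actually written.

```python
def nontransit_export(routes):
    """Keep only locally originated routes (empty AS_PATH) for EBGP export."""
    return [route for route in routes if not route["as_path"]]


# Routes known to a border router in AS100 (hypothetical prefixes).
routes = [
    {"prefix": "10.100.0.0/16", "as_path": []},     # originated in AS100
    {"prefix": "10.30.0.0/16", "as_path": [300]},   # learned from AS300
    {"prefix": "10.40.0.0/16", "as_path": [400]},   # learned from AS400
]

exported = nontransit_export(routes)
for route in exported:
    print(route["prefix"])
```

Because AS400 never hears about the AS300 prefix from AS100, it has no reason to send traffic for AS300 through AS100.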

Figure 9-11. Multihomed AS and Transit AS Scenario 1

graphics/09fig11.gif

Figure 9-12 shows a slightly different type of multihomed environment. In this figure, Denver in AS100 is connected to Los Angeles in AS200 and New York in AS300. Though there is only a single router in AS100, the same conditions apply as those in Figure 9-11. Proper design would attempt to utilize both links for resiliency and prefix influence.

Figure 9-12. Multihomed AS and Transit AS Scenario 2

graphics/09fig12.gif

Figure 9-13 adds a new twist to the previous two scenarios. Though this scenario is similar to that presented in Figure 9-10, three ASs are being used instead of two. In this situation, AS100 and AS300 both have routers in Los Angeles, and AS100 and AS400 each have a router in New York. There is a link between AS300 and AS400, but it is expensive to use it. AS300 or AS400 could take advantage of the cross-country link between Los Angeles and New York in AS100 by sending their own transit traffic to AS100 to get across the country.

Figure 9-13. Multihomed AS and Transit AS Scenario 3

graphics/09fig13.gif

Consider the flip side. AS100 could take advantage of the link between AS300 and AS400 to pass nonpriority traffic cross-country. Ultimately, each AS here has a responsibility to apply policy and announce prefixes responsibly. In topologies like this one, poor prefix advertisement and poor policy implementation can easily create suboptimal routing and leave an AS open to being taken advantage of.

Ultimately, both the customer and provider are responsible for maintaining the most optimized routing architecture they can. These scenarios have pointed out just a few of the potential cases that can occur, depending on transit and homing conditions.

9.1.4 Routing

Routing, in general, is the process for selecting a path over which to route traffic. BGP evaluates several attributes prior to selecting a route to a prefix and installing it into the routing table. You will recall that an IGP is typically based on either distance vector or link state routing principles. These are used to determine a loop-free routing topology internal to the AS. BGP, on the other hand, is a path vector protocol and uses a list of ASNs within the routing information to determine a loop-free path to destination networks.

The next few sections will look at how route-preference and default-route selection criteria play key roles in how JUNOS decides to select routes. The first subject is the RIB and how it applies to BGP.

9.1.4.1 RIB

The routing information base (RIB) stores routing information for incoming route advertisements, locally stored routing information, and outbound route advertisements. There are three types of RIBs:

Adj-RIB-In contains all inbound routes received from other BGP peers. There is an Adj-RIB-In for each BGP peer. The default behavior of JUNOS is to store learned BGP routes here that do not have AS_PATH routing loops. The routes here have not been manipulated by any import policies. These are the routes on which your BGP import policy would be performed, as would recursive lookups in the routing table for valid next-hop addresses. See the keep statement in Section 9.4.2 for additional information.
Loc-RIB contains routes that have come from the Adj-RIB-In tables and have gone through validation and policy. These would be considered your inet.0 routing table, which is discussed in Section 9.1.4.2.
Adj-RIB-Out contains routes that will be advertised by the local router to its BGP peers. This table is where routes would be placed after being evaluated by your BGP export policy. Once here, they will be advertised to the corresponding BGP peer.
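The flow of a route through the three RIBs can be sketched in Python as a chain of filters. This is a deliberately simplified model: routes are dictionaries, policies are plain functions that return True or False, next-hop resolution is a set-membership test, and best-path selection within the Loc-RIB is left out. All names and addresses are assumptions made for the example.

```python
def receive(routes, local_asn):
    """Adj-RIB-In: store received routes that have no AS_PATH loop."""
    return [r for r in routes if local_asn not in r["as_path"]]


def to_loc_rib(adj_rib_in, import_policy, resolvable_next_hops):
    """Loc-RIB: routes that pass import policy and have a valid next-hop."""
    return [r for r in adj_rib_in
            if import_policy(r) and r["next_hop"] in resolvable_next_hops]


def to_adj_rib_out(loc_rib, export_policy):
    """Adj-RIB-Out: routes that pass export policy, ready to be advertised."""
    return [r for r in loc_rib if export_policy(r)]


received = [
    {"prefix": "172.16.0.0/16", "as_path": [200], "next_hop": "192.0.2.1"},
    {"prefix": "172.17.0.0/16", "as_path": [200, 100], "next_hop": "192.0.2.1"},
    {"prefix": "172.18.0.0/16", "as_path": [300], "next_hop": "192.0.2.9"},
]

adj_rib_in = receive(received, local_asn=100)            # drops the looped route
loc_rib = to_loc_rib(adj_rib_in, lambda r: True, {"192.0.2.1"})
adj_rib_out = to_adj_rib_out(loc_rib, lambda r: True)
print([r["prefix"] for r in adj_rib_out])
```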

Chapter 11 provides in-depth explanation of JUNOS policy.

9.1.4.2 Routing Tables

JUNOS places all unicast routes in table inet.0. So, the routes that are learned via BGP, which pass through the import policy and have a valid next-hop, will be placed here. Table 9-1 lists the various predefined routing tables in JUNOS and their use.

Table 9-1. JUNOS Routing Tables

Table	Function
`inet.0`	Default unicast routing table
`instance-name.inet.0`	Unicast routing table for a given routing instance
`inet.1`	Multicast forwarding cache
`inet.2`	Unicast routes used for multicast reverse path forwarding (RPF) lookup
`inet.3`	MPLS routing table for path information
`mpls.0`	MPLS routing table for LSP next-hops

BGP route decisions work a bit differently from IGP route decisions. JUNOS will prefer routes in table inet.3 for next-hop resolution over those listed in table inet.0 when MPLS and traffic engineering are enabled and in use. Details on inet.3 can be found in Section 12.3.3. In addition to these tables, you have the ability to create your own tables for customized purposes. You can apply import and export policies to these tables just as you would to any of the default tables.

9.1.4.3 Route Preference

JUNOS uses protocol preference to choose which routes will become active in the routing table. This can be easily manipulated in JUNOS and can be useful in scenarios where it is necessary to prefer routes learned from one protocol over those learned from another. There is also the ability to use preference in tiebreaking. The preference statement is used to specify a preference of the protocol in terms of its association with the other protocols. Thus, you can cause BGP to be preferred over OSPF, for example. Table 9-2 lists the protocols and their default preference. The lower the value associated with the protocol, the more preferred it is to others. Care should be taken when manipulating these preferences as unintended results may occur, causing suboptimal routing within your particular routing domain.

Table 9-2. JUNOS Routing Protocol Ranking

Routing Protocol	JUNOS Default Preference Value
Direct	0
Static	5
MPLS (RSVP)	7
MPLS (LDP)	9
OSPF Internal Route	10
IS-IS Level 1 Internal Route	15
IS-IS Level 2 Internal Route	18
Redirects	30
RIP	100
Point-to-Point Interfaces	110
Generated or Aggregate	130
OSPF AS External Routes	150
IS-IS Level 1 External Route	160
IS-IS Level 2 External Route	165
BGP	170
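To make the ranking concrete, the following Python sketch picks the active route among candidates learned from different protocols by taking the lowest preference value. It is an illustration only: the dictionary holds a handful of the defaults from Table 9-2, and the protocol labels, route structure, and function name are assumptions made for the example.

```python
# A few default preference values from Table 9-2 (lower is more preferred).
DEFAULT_PREFERENCE = {
    "static": 5,
    "ospf-internal": 10,
    "rip": 100,
    "ospf-external": 150,
    "bgp": 170,
}


def active_route(candidates, preference=DEFAULT_PREFERENCE):
    """Return the candidate whose protocol has the lowest preference value."""
    return min(candidates, key=lambda route: preference[route["protocol"]])


candidates = [
    {"protocol": "bgp", "next_hop": "192.0.2.1"},
    {"protocol": "ospf-internal", "next_hop": "10.0.0.2"},
]
print(active_route(candidates)["protocol"])

# Overriding the BGP preference makes BGP win over OSPF for the same prefix.
tuned = dict(DEFAULT_PREFERENCE, bgp=8)
print(active_route(candidates, tuned)["protocol"])
```

The second call mirrors what the preference statement does: the candidates are unchanged, and only the ranking between protocols moves.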

9.1.4.4 BGP Route Selection

JUNOS typically chooses routes to become active by their preference. When it comes to BGP, there is an even more defined set of criteria for route selection.

For each route available to a given destination in the routing table, JUNOS must select one of those routes as the active route and install it into the master forwarding table. JUNOS uses a process to select the active route. It chooses first based upon protocol preference, as listed in Table 9-2. For instance, RIP routes (default preference value of 100) will take preference over OSPF AS external routes (default preference value of 150), while OSPF internal routes (default preference value of 10) will take preference over RIP routes. Routes that are rejected and marked as unusable are given a preference of -1. These routes will never be installed as active routes, regardless of their protocol ranking. If the decision-making process decides to use BGP routes, the following process is followed.
As long as BGP can do a recursive lookup and resolve the next-hop for a prefix in the existing routing table, then the route will be installed in the Loc-RIB. If there is no valid next-hop in the routing table, then JUNOS will hide the route. Once routes make it to the Loc-RIB, JUNOS will continue through the route-selection process.
JUNOS will then select the BGP route with the highest LOCAL_PREF. The higher the value, the more preferred the route is. LOCAL_PREF is set internally to the AS according to policy. If this is equal, then JUNOS will move on to the next step.
If the routes being compared both contain AS_PATH information, JUNOS will select the route with the shortest AS_PATH list. Confederation sequences are given a path length of 0, whereas confederation sets and AS_SET are treated as having a path length of 1. This can be seen in Section 10.4.2.
The next step in the route-selection process is to evaluate the ORIGIN attribute. Values for ORIGIN rank as follows:
1. IGP
2. EGP
3. Incomplete
The MED value is the next criterion to be compared. The path with the lowest MED value will be chosen. Think of MED as a metric, for which the lowest value is typically preferred. This comparison only occurs if the prefixes are learned from the same AS. There is a MED case study to refer to in Section 10.2.4.
JUNOS will next choose routes learned via EBGP over IBGP. EBGP is generated externally and is considered cheaper than internally generated routes. This is commonly referred to as hot potato routing.
Next, JUNOS will prefer routes whose next-hop has the lowest IGP metric to the border router.
Routes in table inet.3 will be chosen over routes in inet.0 (MPLS-centric).
Routes with a greater number of next-hops will be preferred. This means if prefix 172.16/16 has two next-hops, and next-hop A has four routes in the IGP, and next-hop B has two routes in the IGP, BGP will install the route with next-hop A.
If route reflectors are used, BGP will choose the route with the shorter CLUSTER_LIST.
The route with the lowest router ID (RID) will be chosen.
Routes from the peer with the lowest peer ID will be selected.
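The order of these comparisons is easier to follow when written out as code. The Python sketch below compares two BGP routes to the same prefix using most of the steps listed above; it is a simplified model, not the JUNOS implementation. The `Route` class, its field names, and its default values are assumptions made for the example. "Learned from the same AS" is approximated by comparing the first ASN in each AS_PATH, and the inet.3, next-hop count, and peer ID steps are omitted for brevity.

```python
import ipaddress
from dataclasses import dataclass
from functools import reduce

ORIGIN_RANK = {"igp": 0, "egp": 1, "incomplete": 2}


@dataclass(frozen=True)
class Route:
    name: str
    local_pref: int = 100
    as_path: tuple = ()
    origin: str = "igp"
    med: int = 0
    ebgp: bool = True
    igp_metric: int = 0
    cluster_list: tuple = ()
    router_id: str = "0.0.0.0"


def prefer(a, b):
    """Return the preferred of two routes, applying the steps in order."""
    if a.local_pref != b.local_pref:                    # highest LOCAL_PREF
        return a if a.local_pref > b.local_pref else b
    if len(a.as_path) != len(b.as_path):                # shortest AS_PATH
        return a if len(a.as_path) < len(b.as_path) else b
    if a.origin != b.origin:                            # IGP, then EGP, then Incomplete
        return a if ORIGIN_RANK[a.origin] < ORIGIN_RANK[b.origin] else b
    if a.as_path[:1] == b.as_path[:1] and a.med != b.med:   # MED, same AS only
        return a if a.med < b.med else b
    if a.ebgp != b.ebgp:                                # EBGP over IBGP
        return a if a.ebgp else b
    if a.igp_metric != b.igp_metric:                    # lowest IGP metric
        return a if a.igp_metric < b.igp_metric else b
    if len(a.cluster_list) != len(b.cluster_list):      # shorter CLUSTER_LIST
        return a if len(a.cluster_list) < len(b.cluster_list) else b
    rid_a = ipaddress.IPv4Address(a.router_id)          # lowest router ID
    rid_b = ipaddress.IPv4Address(b.router_id)
    return a if rid_a <= rid_b else b


def select_best(routes):
    """Reduce a list of candidate routes to the single preferred one."""
    return reduce(prefer, routes)


candidates = [
    Route("via-new-york", local_pref=200, as_path=(200, 300, 400)),
    Route("via-boston", local_pref=100, as_path=(200,)),
]
print(select_best(candidates).name)
```

In this usage example the longer path wins, because LOCAL_PREF is evaluated before AS_PATH length. That is also why LOCAL_PREF is such an effective tool for overriding the later steps through policy.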

This process is valid only for BGP route selection. Route selection in JUNOS, in general, encompasses these criteria, plus more relating to individual IGPs. Policy can also be used to manipulate routes, so you can force them to pass certain criteria in this selection process. Use caution when doing this, as suboptimal routing may occur.

9.1.4.5 Default Routes

A default route will be used if no other more specific route for a given destination exists in the current routing table. In BGP, this is important for several reasons. If your local AS is not receiving full Internet routes, or does not need to receive full Internet routes, then you may attempt to send data to a destination that is not listed in the routing table. You do not want to statically assign this default route unless it is absolutely necessary. The preferred method is to have the upstream provider announce a default (0.0.0.0/0) route to your BGP border router. In turn, you want your border router to announce the default into the IGP. In this way, your entire routing domain should be able to see the default route. If the link for the default should go away, the default route will go away, and the packet will be dropped, as it should be.

The benefit of doing this over static definition is obvious. If static definition were used, the default route could point to a destination that may not actually be reachable or that could go away due to a change. The dynamic method of introducing the default route is preferred because it will go away only when it is not physically accessible.
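The role of the default route in a lookup can be shown with a short Python sketch of longest-prefix matching. It uses the standard `ipaddress` module; the routing table contents, next-hop labels, and function name are assumptions made for the example, and a real forwarding lookup is of course far more optimized than this linear scan.

```python
import ipaddress


def lookup(table, destination):
    """Return the next hop of the most specific matching route, or None."""
    dest = ipaddress.ip_address(destination)
    matches = []
    for prefix, next_hop in table.items():
        network = ipaddress.ip_network(prefix)
        if dest in network:
            matches.append((network.prefixlen, next_hop))
    if not matches:
        return None  # no route at all: the packet is dropped
    return max(matches, key=lambda match: match[0])[1]


table = {
    "0.0.0.0/0": "upstream provider",   # default learned via BGP
    "172.16.0.0/16": "internal router",
}

print(lookup(table, "172.16.5.1"))      # matched by the more specific route
print(lookup(table, "198.51.100.7"))    # falls through to the default

# If the default is withdrawn, unknown destinations no longer match anything.
del table["0.0.0.0/0"]
print(lookup(table, "198.51.100.7"))
```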