Static blackhole route does not honor distance attribute #2230

Open
solevi opened this issue May 14, 2018 · 13 comments
Assignees
Labels
bug policy matter Here be dragons - involves layer 9!

Comments

@solevi

solevi commented May 14, 2018

Hello,
We have run into the following problem with the latest FRR, version 4.0-1~ubuntu16.04+1.
When we add a static null route with distance 200, FRR does not accept the route received from BGP.
Below is the relevant configuration.

router bgp 65002
 coalesce-time 1000
 bgp graceful-restart
 bgp graceful-shutdown
 neighbor 192.168.20.1 remote-as 65020
 address-family ipv4 unicast
!
  aggregate-address 192.168.3.0/24
  redistribute connected
  redistribute static
  neighbor 192.168.20.1 soft-reconfiguration inbound
 exit-address-family
!
ip route 0.0.0.0/0 blackhole 200
!

As you can see below, the default route received from the neighbor is not accepted.

# show ip bgp neighbors 192.168.20.1 routes 
BGP table version is 225, local router ID is 192.168.3.70
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
              i internal, r RIB-failure, S Stale, R Removed
Origin codes: i - IGP, e - EGP, ? - incomplete

   Network          Next Hop            Metric LocPrf Weight Path
*  0.0.0.0          192.168.20.1             0      0      0 65020 ?
*> 192.168.3.5/32   192.168.20.1             0      0      0 65020 ?

# show ip ro
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       > - selected route, * - FIB route

S>* 0.0.0.0/0 [200/0] unreachable (blackhole), 00:00:23
C>* 192.168.1.0/24 is directly connected, eth0, 00:37:10
B   192.168.3.0/24 [200/0] via 0.0.0.0 inactive, 00:00:00

However, if we add the blackhole route after the BGP neighbor is established, the distance is honored.

# show ip ro
Codes: K - kernel route, C - connected, S - static, R - RIP,
       O - OSPF, I - IS-IS, B - BGP, P - PIM, E - EIGRP, N - NHRP,
       T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
       > - selected route, * - FIB route

S   0.0.0.0/0 [200/0] unreachable (blackhole), 00:00:03
B>* 0.0.0.0/0 [20/0] via 192.168.20.1, eth1.4092, 00:02:06
C>* 192.168.1.0/24 is directly connected, eth0, 00:36:11
B   192.168.3.0/24 [200/0] via 0.0.0.0 inactive, 00:08:34 

However, after a peer restart the default route is no longer accepted from the peer, and the blackhole route is left as the preferred one.

@donaldsharp donaldsharp self-assigned this May 15, 2018
@qlyoung qlyoung added the bug label May 15, 2018
@kssoman
Contributor

kssoman commented Jul 19, 2018

       A ------------------------------------- B
                   EBGP

A

ip route 50.1.1.0/24 blackhole 200
!
interface ens192
ip address 10.1.1.1/24
!
router bgp 1
bgp router-id 1.1.1.1
neighbor 10.1.1.2 remote-as 2
!
address-family ipv4 unicast
redistribute static
exit-address-family

B

ip route 50.1.1.0/24 ens192
!
interface ens192
ip address 10.1.1.2/24
!
router bgp 2
bgp router-id 2.2.2.2
neighbor 10.1.1.1 remote-as 1
!
address-family ipv4 unicast
network 50.1.1.0/24
exit-address-family

Case 1)

The static route is already present in the RIB and is redistributed into BGP with weight 32768 (the default).

Then the route from the peer is received.

The best route selected in BGP is the static route (since its weight is higher).

Therefore the peer route is not installed in the RIB, as expected and as shown in the logs below.

LOGS

A

show ip bgp
BGP table version is 3, local router ID is 1.1.1.1, vrf id 0
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path

*  50.1.1.0/24 10.1.1.2 0 0 2 i <========== Peer
*>             0.0.0.0 0 32768 ? <============ Redist

show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR,
> - selected route, * - FIB route

K>* 0.0.0.0/0 [0/0] via 10.112.157.253, ens160, 1d22h08m
C>* 10.1.1.0/24 is directly connected, ens192, 1d21h31m
C>* 10.112.156.0/23 is directly connected, ens160, 1d22h08m
S>* 50.1.1.0/24 [200/0] unreachable (blackhole), 00:02:52 <================

Case 2)

Static route NOT in rib when bgp session is established
Peer route received with distance 20

dev(config-router-af)# do show ip bgp
BGP table version is 10, local router ID is 1.1.1.1, vrf id 0
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path
*> 50.1.1.0/24 10.1.1.2 0 0 2 i

Displayed 1 routes and 1 total paths
dev(config-router-af)# do show ip route
Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR,
> - selected route, * - FIB route

K>* 0.0.0.0/0 [0/0] via 10.112.157.253, ens160, 1d22h38m
C>* 10.1.1.0/24 is directly connected, ens192, 1d22h01m
C>* 10.112.156.0/23 is directly connected, ens160, 1d22h38m
B>* 50.1.1.0/24 [20/0] via 10.1.1.2, ens192, 00:16:13

Add static route

dev(config)# ip route 50.1.1.0/24 blackhole 200

show ip route

Codes: K - kernel route, C - connected, S - static, R - RIP,
O - OSPF, I - IS-IS, B - BGP, E - EIGRP, N - NHRP,
T - Table, v - VNC, V - VNC-Direct, A - Babel, D - SHARP,
F - PBR,
> - selected route, * - FIB route

K>* 0.0.0.0/0 [0/0] via 10.112.157.253, ens160, 1d22h42m
C>* 10.1.1.0/24 is directly connected, ens192, 1d22h05m
C>* 10.112.156.0/23 is directly connected, ens160, 1d22h42m
S 50.1.1.0/24 [200/0] unreachable (blackhole), 00:00:26
B>* 50.1.1.0/24 [20/0] via 10.1.1.2, ens192, 00:19:45

The RIB selected the BGP route since it has the lower admin distance. The static route is not selected, so the RIB will not send an update to BGP.

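static void zebra_redistribute(struct zserv *client, int type,
                               unsigned short instance, vrf_id_t vrf_id,
                               int afi)
{
        /* Excerpt only: the surrounding loop over route entries is elided.
         * Routes that are not selected are skipped, so they are never
         * redistributed to the client. */
        if (!CHECK_FLAG(newre->flags, ZEBRA_FLAG_SELECTED))
                continue;
}

show ip bgp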

BGP table version is 10, local router ID is 1.1.1.1, vrf id 0
Status codes: s suppressed, d damped, h history, * valid, > best, = multipath,
i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes: i - IGP, e - EGP, ? - incomplete

Network Next Hop Metric LocPrf Weight Path
*> 50.1.1.0/24 10.1.1.2 0 0 2 i

Displayed 1 routes and 1 total paths

Therefore this works as expected.

If the peer is restarted, this results in the same processing as case (1) and causes the static route to be selected in the RIB.

@donaldsharp
Member

No, BGP needs to respect admin distance. BGP should store the redistributed route's admin distance, and in section 3 of bgp_info_cmp, when we look at the static route type, we should do a quick comparison of the route's admin distance versus BGP's admin distance and select the better one.
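The comparison suggested here could look roughly like the sketch below. It is not FRR code: `struct candidate`, `effective_distance`, and `compare_admin_distance` are hypothetical names, and the value 20 for the peer-learned route is taken from the `[20/0]` shown in the outputs earlier in this thread.

```c
#include <stdbool.h>
#include <stdint.h>

/* Distance the thread's outputs show for the eBGP-learned route ([20/0]). */
#define EBGP_DISTANCE 20

/* Hypothetical, simplified path record; not FRR's real structure. */
struct candidate {
    bool    redistributed; /* true if the path came from zebra via "redistribute" */
    uint8_t distance;      /* admin distance remembered at redistribution time */
};

/* A peer-learned path is treated as having BGP's own distance. */
static uint8_t effective_distance(const struct candidate *c)
{
    return c->redistributed ? c->distance : EBGP_DISTANCE;
}

/*
 * Returns -1 if a should win, 1 if b should win, and 0 if this check does
 * not apply (both paths are of the same kind) or the distances tie.
 */
static int compare_admin_distance(const struct candidate *a,
                                  const struct candidate *b)
{
    if (a->redistributed == b->redistributed)
        return 0;

    uint8_t da = effective_distance(a);
    uint8_t db = effective_distance(b);

    if (da < db)
        return -1;
    if (da > db)
        return 1;
    return 0;
}
```

With the blackhole route from the report (distance 200) against the peer path, this check would pick the peer path. A static route with a distance below 20 would still win.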

@donaldsharp
Member

To wit: we want a reproducible, deterministic network, and with this bug it is not.

@dslicenc
Member

Historically, this is also how it worked at the last vendor I worked at. BGP best path calculation prefers locally injected routes over those learned from BGP peers and never tries to install the peer route if the locally injected route arrives first. Admin distance isn't in play in this case. If the BGP prefix is learned first and installed in the RIB, then zebra will use admin distance to choose. Definitely non-deterministic.

Here is a snippet of the first three steps of the bestpath algorithm.

  1. Prefer the path with the highest WEIGHT.
  2. Prefer the path with the highest LOCAL_PREF.
  3. Prefer the path that was locally originated via a network or aggregate BGP subcommand or through redistribution from an IGP. Local paths that are sourced by the network or redistribute commands are preferred over local aggregates that are sourced by the aggregate-address command.
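As a rough illustration of why the redistributed static wins at the very first step, the three rules above can be sketched as a comparison function. This is a simplified model, not the FRR implementation: `struct path` and `prefer_by_first_three_steps` are hypothetical, and the sub-rule about local aggregates is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified path record; not an FRR structure. */
struct path {
    uint32_t weight;     /* 32768 by default for locally originated paths */
    uint32_t local_pref;
    bool     local;      /* sourced by network, redistribute or aggregate */
};

/*
 * Returns true when a is preferred over b by the first three steps.
 * Returns false when b is preferred or when these steps do not decide.
 */
static bool prefer_by_first_three_steps(const struct path *a,
                                        const struct path *b)
{
    /* Step 1: highest weight wins. */
    if (a->weight != b->weight)
        return a->weight > b->weight;

    /* Step 2: highest local preference wins. */
    if (a->local_pref != b->local_pref)
        return a->local_pref > b->local_pref;

    /* Step 3: locally originated paths win over peer-learned ones. */
    if (a->local != b->local)
        return a->local;

    return false;
}
```

A redistributed path with weight 32768 beats a peer path with weight 0 in step 1, so admin distance is never consulted.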

@dslicenc
Member

I believe if you don't have a need to redistribute the static into bgp, the behavior would become deterministic because zebra would be doing the deciding.

@kssoman
Contributor

kssoman commented Jul 19, 2018

As mentioned by Don Slice, admin distance is not a parameter in bgp route selection process (as per the standards), therefore the existing behavior is correct and the issue raised is not a defect.

@srimohans
Contributor

srimohans commented Aug 24, 2018

Analysis

1st Scenario: Configure static route followed by configuration for BGP.

  1. A static blackhole route is configured by the user and added to the RIB (0.0.0.0/0 with admin distance 200). It is selected as the best route in the RIB and is downloaded to the FIB.
  2. When BGP is configured and the redistribute command (“redistribute static”) is issued, the RIB notifies BGP about the static route, which happens to be the best route.
  3. At this time, BGP also has the route 0.0.0.0/0 with admin distance 20 (received from the neighbor). However, BGP selects the static route as the best route, because a static route gets a default weight of 32768 when it is redistributed into BGP. This is why the static route is selected over the BGP route. Note that admin distance plays no role in the BGP best path algorithm.
  4. So BGP doesn’t download 0.0.0.0/0 with admin distance 20 to the RIB, and the RIB ends up with only one route: 0.0.0.0/0 with admin distance 200.

2nd Scenario: Configure BGP followed by configuration for static route

  1. The BGP session is configured, and the “network 0.0.0.0/0” command is configured as well.
  2. BGP downloads the route 0.0.0.0/0 with admin distance 20 into the RIB. It is selected as the best route in the RIB and is downloaded to the FIB.
  3. A static blackhole route is configured with admin distance 200. Following its own best route selection, the RIB doesn’t select this new route and keeps the old best route (from BGP, with admin distance 20). The RIB doesn’t send a notification back to BGP, as there is no change in the best route.
  4. BGP is not even aware of this new route, so it will not download any new route, and the RIB ends up with the route 0.0.0.0/0 with admin distance 20.

Conclusion
• The RIB ends up with 0.0.0.0/0 at admin distance 200 in the first scenario and 0.0.0.0/0 at admin distance 20 in the second scenario.
• The bug report used 0.0.0.0/0, but the same thing happens for any static route whose admin distance is greater than that of BGP.
• The end effect for the user is that the FRR stack doesn’t honor admin distance when downloading routes to the RIB/FIB in certain scenarios.
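The order dependence in the two scenarios can be reproduced with a toy model. The sketch below is not FRR code: `struct state` and all function names are hypothetical, and the model only encodes three behaviours described in this thread (the RIB picks the lowest admin distance, zebra redistributes only the selected route, and the redistributed path's weight of 32768 beats the peer path in BGP).

```c
#include <stdbool.h>

enum { STATIC_DISTANCE = 200, BGP_DISTANCE = 20 };

/* Hypothetical state for a single prefix such as 0.0.0.0/0. */
struct state {
    bool static_in_rib; /* static blackhole configured with distance 200 */
    bool bgp_in_rib;    /* peer route installed by BGP with distance 20 */
    bool redist_path;   /* BGP holds the static as a redistributed path */
};

/* Distance of the route the RIB selects, or -1 if it has none. */
static int rib_selected_distance(const struct state *s)
{
    if (s->bgp_in_rib)
        return BGP_DISTANCE;        /* lower distance wins in the RIB */
    if (s->static_in_rib)
        return STATIC_DISTANCE;
    return -1;
}

static void add_static(struct state *s)
{
    s->static_in_rib = true;
    /* Only the selected route is redistributed to BGP. */
    if (rib_selected_distance(s) == STATIC_DISTANCE)
        s->redist_path = true;
}

static void learn_from_peer(struct state *s)
{
    /* If the redistributed path exists, its weight of 32768 makes it the
     * BGP best path, so the peer route is never installed in the RIB. */
    if (!s->redist_path)
        s->bgp_in_rib = true;
}

/* 1st scenario: static route first, then the BGP route arrives. */
static int static_then_bgp(void)
{
    struct state s = {0};
    add_static(&s);
    learn_from_peer(&s);
    return rib_selected_distance(&s);
}

/* 2nd scenario: BGP route first, then the static route is added. */
static int bgp_then_static(void)
{
    struct state s = {0};
    learn_from_peer(&s);
    add_static(&s);
    return rib_selected_distance(&s);
}
```

In this model `static_then_bgp()` ends with the distance 200 route selected and `bgp_then_static()` ends with the distance 20 route selected, matching the two scenarios in the analysis.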

Fix proposed:
Consider the admin distance of each routing protocol while selecting the best route in BGP (i.e. when the scenario involves redistributed routes), the way the RIB does. With this fix, in the 1st scenario, BGP selects the route with admin distance 20 as the best route over the route with admin distance 200 and downloads it to the RIB. The RIB then ends up with the admin distance 20 route as its best route and downloads it to the FIB. In other words, the FRR stack always honors the admin distance while selecting the best route.

Other routing stacks behavior: Cisco also behaves like FRR stack. i.e. in the above scenario, Cisco equipment also doesn’t honor admin distance.

@nikos-github

This isn't a bug. Coming to the part of getting FRR to behave the way it is described above, I believe the desired behavior can be achieved if a route-map is applied to modify the weight of BGP paths to be greater than 32,768.
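A minimal sketch of such a configuration, using the neighbor and AS number from the original report. The route-map name PEER-WEIGHT and the weight value 40000 are arbitrary choices for illustration; any weight above 32768 should have the same effect.

```
route-map PEER-WEIGHT permit 10
 set weight 40000
!
router bgp 65002
 address-family ipv4 unicast
  neighbor 192.168.20.1 route-map PEER-WEIGHT in
 exit-address-family
!
```

Because the route-map is applied inbound, every path learned from that neighbor gets the higher weight, so it would win the first step of best path selection against the redistributed static.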

@srimohans
Contributor

srimohans commented Aug 31, 2018

I tested by configuring a route map to increase the weight of the BGP prefix to more than 32768.
Please find the outputs and configuration for both cases. i.e. without any route map and with route map.
With route map configuration, the route downloaded to RIB is consistent in both scenarios.

[5 attached images (outputs and configuration for both cases); not preserved in this text]

@mdash-vmware

Hi All,
I somewhat disagree with this solution. What if we have many neighbors? It would be very tedious for the user to configure a route-map for all of them.

Why can't we just ignore the default weight (32768) while calculating the best route in the BGP table? If we ignore the default weight, the next step of the best path algorithm would kick in, and there IGP/EGP would win the battle against Unknown (a redistributed static has the UNKNOWN type), so the BGP route would be selected as best and trigger an update to zebra.

I would suggest always ignoring default parameters (such as weight and local preference) while calculating best paths in BGP. These defaults are set by the system, not configured by the user. We should honor user-configured parameters over defaults.

-Manas

@eqvinox
Contributor

eqvinox commented Sep 4, 2018

=> meeting this week
=> rough consensus on call is that deterministic behaviour is desired

@srimohans
Contributor

image

@solevi
Author

solevi commented Sep 11, 2018

Hello everyone,
I would like to thank you all for diving so deep. I really did not expect to cause so much trouble.

@qlyoung qlyoung added the policy matter Here be dragons - involves layer 9! label Apr 24, 2019
9 participants