== Configuration Example ==

keepalived.conf:

 ! Balancer-Set for udp/53
 virtual_server 194.97.173.124 53 {
     delay_loop 10
     lb_algo wrr
     lb_kind DR
     protocol UDP
     ! persistence_timeout 1
     ! persistence_granularity 255.255.255.255
     ! eth1.105 -> kai eth1.105
     real_server 10.1.53.2 53 {
         weight 1
         MISC_CHECK {
             misc_path "/usr/bin/dig -b 10.1.53.1 a resolve.test.roka.net @10.1.53.2 +time=1 +tries=5 +fail > /dev/null"
             misc_timeout 6
         }
     }
     ! eth1.109 -> kai eth1.109
     real_server 10.3.53.2 53 {
         weight 1
         MISC_CHECK {
             misc_path "/usr/bin/dig -b 10.3.53.1 a resolve.test.roka.net @10.3.53.2 +time=1 +tries=5 +fail > /dev/null"
             misc_timeout 6
         }
     }
 }

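Not shown in keepalived.conf is how the service IP ends up on the director in the first place. One common way (an assumption here, not taken from the original setup) is to configure it as a /32 on the director itself, e.g. on the loopback; that is also what lets the "redistribute connected" statement in the BGP configuration further down announce it:

 # assumption, not part of the original setup: put the service IP on the director
 # as a /32 so LVS accepts traffic for it and Quagga can pick it up as a
 # connected route
 ip addr add 194.97.173.124/32 dev lo
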
As you can dig (;-), we use an A record with a low TTL to test the service, since this setup is a recursive DNS cluster. So far the dig checks work fine with 44 real_servers configured, on an otherwise idle dual PIII 800.

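The test record itself could look something like the following zone snippet. Only the name resolve.test.roka.net and the idea of a low TTL come from the check above; the nameserver names, TTL value and address are placeholders:

 ; hypothetical zone data for test.roka.net
 $ORIGIN test.roka.net.
 $TTL 60
 @        IN  SOA  ns1.test.roka.net. hostmaster.test.roka.net. (
                       2006012001 ; serial
                       3600       ; refresh
                       900        ; retry
                       604800     ; expire
                       60 )       ; negative caching TTL
          IN  NS   ns1.test.roka.net.
 resolve  60  IN  A  192.0.2.53   ; low TTL, so each check forces a fresh recursive lookup
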
On real_server kai we use the following netfilter setup to direct the traffic to different BIND processes on the same machine/MAC:

 #DNAT 194.97.173.124->10.1.53.2 eth1.105
 $ipt -t nat -A PREROUTING -i eth1.105 -s $net -d 194.97.173.124 -p tcp --dport 53 -j DNAT --to-destination 10.1.53.2:53
 $ipt -t nat -A PREROUTING -i eth1.105 -s $net -d 194.97.173.124 -p udp --dport 53 -j DNAT --to-destination 10.1.53.2:53
 #DNAT 194.97.173.124->10.3.53.2 eth1.109
 $ipt -t nat -A PREROUTING -i eth1.109 -s $net -d 194.97.173.124 -p tcp --dport 53 -j DNAT --to-destination 10.3.53.2:53
 $ipt -t nat -A PREROUTING -i eth1.109 -s $net -d 194.97.173.124 -p udp --dport 53 -j DNAT --to-destination 10.3.53.2:53

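For this to work, each BIND process has to answer on its own DNAT target address. A minimal named.conf sketch for the instance behind eth1.105 (an assumption for illustration, not taken from the actual setup) could look like this; the instance behind eth1.109 would listen on 10.3.53.2 instead:

 // hypothetical options fragment for the BIND instance behind eth1.105
 options {
     listen-on { 10.1.53.2; };   // answer only on this DNAT target address
     recursion yes;              // the cluster provides recursive service
 };
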
If you have more than one load balancer at different locations, and you can convince your local network people to let you speak BGP4+ to their routers, you can use Quagga with something like the following configuration to fail the service IP over to the second LB if the first one goes down:

 !
 router bgp 5430
  no synchronization
  bgp router-id a.b.c.d
  redistribute connected route-map benice
  neighbor c.d.e.f remote-as 5430
  neighbor c.d.e.f description ffm4-j2
  neighbor c.d.e.f send-community both
  neighbor c.d.e.f soft-reconfiguration inbound
  neighbor c.d.e.f route-map nixda in
  neighbor c.d.e.f route-map benice out
  neighbor d.c.f.e remote-as 5430
  neighbor d.c.f.e description ffm4-j
  neighbor d.c.f.e send-community both
  neighbor d.c.f.e soft-reconfiguration inbound
  neighbor d.c.f.e route-map nixda in
  neighbor d.c.f.e route-map benice out
  no auto-summary
 !
 access-list line permit 127.0.0.1/32 exact-match
 access-list line deny any
 !
 ip prefix-list cns-dus2 description dus2 high-metric eq low-preference
 ip prefix-list cns-dus2 seq 5 permit 194.97.173.125/32
 ip prefix-list cns-dus2 seq 10 deny any
 ip prefix-list cns-ffm4 description ffm4 low-metric eq high-preference
 ip prefix-list cns-ffm4 seq 5 permit 194.97.173.124/32
 ip prefix-list cns-ffm4 seq 10 deny any
 !
 route-map benice permit 10
  match ip address prefix-list cns-ffm4
  set local-preference 100
  set metric 0
 !
 route-map benice permit 20
  match ip address prefix-list cns-dus2
  set local-preference 100
  set metric 1
 !
 route-map nixda deny 10
 !

This is the LB at FFM4; note that the metrics at the DUS2 LB are just the other way around (see the sketch below). Here we fancy talking to two core routers from each LB for extra redundancy.
You can also run an internal anycast service IP if you use the same metric at both LBs and make sure they are attached at the same level of the network, topology-wise. This way traffic is shared between the two load balancers according to your network topology, which is of course most interesting for large dial-in ISPs.
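For illustration, the benice route-map on the DUS2 LB would then look something like this (a sketch of the idea, not the actual configuration there):

 ! DUS2: prefer the local service IP (.125), carry the FFM4 one (.124) as backup
 route-map benice permit 10
  match ip address prefix-list cns-ffm4
  set local-preference 100
  set metric 1
 !
 route-map benice permit 20
  match ip address prefix-list cns-dus2
  set local-preference 100
  set metric 0
 !
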
== Conclusion ==