RADSClassFall06/labs/lab4
Part 1: failover with HAProxy
In this part you'll use HAProxy (High Availability Proxy) to enable failover and make the multi-server site more dependable.
In the next part, if there's time, you'll leverage HAProxy to do "poor man's migration" to dynamically adapt the system configuration to different workloads.
- To simplify, assume the database workload is read-only (SELECTs but no INSERTs or UPDATEs). Deploy 2 independent databases (on different VMs), and enough dispatchers and web servers (based on your experience from lab 3) to keep the database from becoming the bottleneck.
- Deploy HAProxy between the web server and the dispatchers, and between the dispatchers and the databases.
- What does HAProxy do to baseline performance?
- Verify that if you kill a dispatcher, HAProxy fails over gracefully to a remaining dispatcher, and that if you kill a database copy, HAProxy fails over gracefully to the second database.
- What is the effect on baseline performance if you kill a dispatcher? If you kill a database instance? Do you observe roughly linear slowdown?
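One way to judge "roughly linear slowdown" is to compare the throughput you measure after a failure with the throughput you would expect if capacity scaled with the number of surviving instances. The sketch below assumes that simple model; the function names and the sample numbers are hypothetical and are not part of the lab's tools.

```python
def expected_throughput(baseline_rps: float, total: int, alive: int) -> float:
    """Throughput predicted if capacity scales linearly with live instances."""
    if total <= 0 or not 0 <= alive <= total:
        raise ValueError("need total > 0 and 0 <= alive <= total")
    return baseline_rps * alive / total


def relative_deviation(measured_rps: float, predicted_rps: float) -> float:
    """Signed fraction by which the measurement differs from the prediction."""
    return (measured_rps - predicted_rps) / predicted_rps


if __name__ == "__main__":
    # Hypothetical numbers: 10 dispatchers at 200 req/s, then one is killed.
    predicted = expected_throughput(200.0, total=10, alive=9)
    measured = 171.0  # placeholder; substitute your own measurement
    deviation = relative_deviation(measured, predicted)
    print(f"predicted {predicted:.1f} req/s, deviation {deviation:+.1%}")
```

A deviation close to zero suggests the slowdown is roughly linear; a large negative value suggests that something other than the killed instance is limiting throughput.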
Part 2: Poor man's migration
- A switch will be provided in the workload generator to enable/disable requests that include full-text database searches. Start the workload generator, flip the switch, and ensure the deployment is configured so that there are enough dispatchers to keep the database busy under this workload. You should find that fewer dispatchers are needed, because each dispatcher request generates more work for the database.
- Flip the switch back, and verify that the dispatchers once again become the bottleneck.
The next goal is to provide some machinery to allow the VM hosting the 2nd copy of the database to instead host more dispatchers when the workload changes. Initially, you can do this manually.
- Start the workload generator with search enabled and with 2 databases; then switch over to search disabled. Manually kill the database (HAProxy should fail over to the remaining database), start some more dispatchers, and modify HAProxy's settings so that the existing webservers will transparently start using the extra dispatchers.
- During the transition interval, try to measure how many requests were lost or were delivered with a latency that is more than about 2 standard deviations higher than the steady-state mean.
- Now go the other way: switch the workload generator back to the search-enabled workload, and replace the extra dispatchers with a 2nd copy of the database again. Measure how many requests were affected by this operation.
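The measurement described above (lost requests, plus requests slower than roughly the steady-state mean plus 2 standard deviations) could be computed along these lines. This is a minimal sketch: it assumes you can export per-request latencies in milliseconds, with None standing for a lost request, and the function names are hypothetical.

```python
import statistics
from typing import Optional, Sequence, Tuple


def latency_threshold(steady_ms: Sequence[float], k: float = 2.0) -> float:
    """Mean plus k sample standard deviations of steady-state latencies."""
    return statistics.mean(steady_ms) + k * statistics.stdev(steady_ms)


def count_affected(transition_ms: Sequence[Optional[float]],
                   threshold_ms: float) -> Tuple[int, int]:
    """Return (lost, slow) for the transition interval.

    A latency of None marks a lost request; a request counts as slow
    when its latency exceeds threshold_ms.
    """
    lost = sum(1 for t in transition_ms if t is None)
    slow = sum(1 for t in transition_ms
               if t is not None and t > threshold_ms)
    return lost, slow


if __name__ == "__main__":
    steady = [10.0, 12.0, 11.0, 9.0, 13.0]           # made-up sample data
    during = [11.0, None, 15.0, 14.0, 40.0, None]    # made-up sample data
    limit = latency_threshold(steady)
    lost, slow = count_affected(during, limit)
    print(f"threshold {limit:.2f} ms, lost {lost}, slow {slow}")
```

Compute the threshold from a steady-state window taken before the transition starts, so that the transition itself does not inflate the mean and standard deviation.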
What you will end up reporting is essentially the cost of migration: specifically, the cost of dynamically adapting system resources to cope with a bimodal workload. This is a good prelude to any project that involves VM migration.
HAProxy config files
for part 1
We run HAProxy on the web server VM: port 10000 handles requests from the web server to the dispatchers, and port 10001 handles requests from the dispatchers to the database. vm83 and vm84 run the dispatchers, vm86 runs MySQL, and vm85 runs 5 dispatchers plus a MySQL instance that are used only for failover. This means that vm85 is not used in the default configuration, but HAProxy will fail over to vm85 if *both* vm83 and vm84 are down, or if vm86 is down.
The setting check port 22 inter 3000 rise 1 fall 2 means that HAProxy makes a TCP health check every 3 seconds (it just connects to port 22). If it can't connect twice in a row (fall 2), the server is considered down and HAProxy stops sending requests to it. As soon as one check succeeds again (rise 1), HAProxy adds the server back to the list of active servers.
Notice that we're not actually checking the health of the dispatchers, only that of the whole VM; if a single dispatcher dies, HAProxy won't notice and will keep sending requests to that port. We don't check the actual dispatcher ports because an empty TCP connection to a dispatcher port kills the dispatcher.
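The rise/fall behavior described above can be modeled in a few lines. The sketch below is an illustrative model of the description in this section, not HAProxy's actual implementation; the names tcp_check and ServerState are hypothetical, and it assumes a server starts in the "up" state.

```python
import socket


def tcp_check(host: str, port: int, timeout_s: float = 1.0) -> bool:
    """Return True if a plain TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        return False


class ServerState:
    """Tracks up/down status from consecutive health-check results."""

    def __init__(self, rise: int = 1, fall: int = 2, up: bool = True):
        self.rise = rise
        self.fall = fall
        self.up = up
        self._streak = 0  # consecutive results contradicting current status

    def record(self, ok: bool) -> bool:
        """Feed one check result and return the resulting status."""
        if ok == self.up:
            self._streak = 0
        else:
            self._streak += 1
            needed = self.fall if self.up else self.rise
            if self._streak >= needed:
                self.up = not self.up
                self._streak = 0
        return self.up


if __name__ == "__main__":
    state = ServerState(rise=1, fall=2)
    for result in [True, False, True, False, False, True]:
        print(result, "->", "up" if state.record(result) else "down")
```

With fall 2, a single failed check does not take the server out of rotation; two consecutive failures do. If you experiment with tcp_check, point it at port 22 as in the config, not at a dispatcher port, for the reason given above.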
You can run HAProxy with the -d switch to get debug information such as connections and health checks. HAProxy versions 1.2.14 and newer have an HTML status page at vm80.vm:10000/haproxy?stats; unfortunately, these machines are running version 1.2.11.
If something is not clear, read the documentation: http://haproxy.1wt.eu/download/1.2/doc/haproxy-en.txt
Here's the config file we used; it seems to work. You have to change the VM numbers to match your own VMs.
# this config needs haproxy-1.1.28 or haproxy-1.2.1
global
    log 127.0.0.1 local0
    log 127.0.0.1 local1 notice
    #log loghost local0 info
    maxconn 4096
    # chroot /usr/share/haproxy
    chroot /tmp/haproxy
    uid 99
    gid 99
    daemon
    #debug
    #quiet

defaults
    log global
    mode http
    option httplog
    option dontlognull
    retries 1
    redispatch
    maxconn 2000
    contimeout 500 # You may need a longer value, 4000 is suggested in the HAProxy manual
    clitimeout 5000 # If you are getting blank pages (especially collection/list) increase this
    srvtimeout 5000 # If you are getting blank pages (especially collection/list) increase this
    #stats enable

listen rails_dispatchers 0.0.0.0:10000
    mode tcp
    balance roundrobin
    #stats enable
    server disp83_0 vm83.vm:9000 check port 22 inter 3000 rise 1 fall 2
    server disp84_0 vm84.vm:9000 check port 22 inter 3000 rise 1 fall 2
    server disp83_1 vm83.vm:9001 check port 22 inter 3000 rise 1 fall 2
    server disp84_1 vm84.vm:9001 check port 22 inter 3000 rise 1 fall 2
    server disp83_2 vm83.vm:9002 check port 22 inter 3000 rise 1 fall 2
    server disp84_2 vm84.vm:9002 check port 22 inter 3000 rise 1 fall 2
    server disp83_3 vm83.vm:9003 check port 22 inter 3000 rise 1 fall 2
    server disp84_3 vm84.vm:9003 check port 22 inter 3000 rise 1 fall 2
    server disp83_4 vm83.vm:9004 check port 22 inter 3000 rise 1 fall 2
    server disp84_4 vm84.vm:9004 check port 22 inter 3000 rise 1 fall 2
    server disp85_0 vm85.vm:9000 backup
    server disp85_1 vm85.vm:9001 backup
    server disp85_2 vm85.vm:9002 backup
    server disp85_3 vm85.vm:9003 backup
    server disp85_4 vm85.vm:9004 backup
    option allbackups

listen MySQL_servers 0.0.0.0:10001
    #stats enable
    mode tcp
    balance roundrobin
    server mysql86 vm86.vm:3306 check port 22 inter 3000 rise 1 fall 2
    server mysql85 vm85.vm:3306 backup

listen memcached 0.0.0.0:10002
    #stats enable
    mode tcp
    balance roundrobin
    server memcached86 vm86.vm:11211 check port 22 inter 3000 rise 1 fall 2
    server memcached85 vm85.vm:11211 backup
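Since the dispatcher server lines differ only in VM number and port, you may prefer to generate them when adapting the config to your own VMs (or when adding dispatchers in part 2). The following is a minimal sketch; the helper name dispatcher_servers is hypothetical, and it assumes the same naming scheme, port range (9000 to 9004), and check options as the config above.

```python
from typing import Iterable, List


def dispatcher_servers(
    active_vms: Iterable[int],
    backup_vms: Iterable[int],
    ports: Iterable[int] = range(9000, 9005),
    check: str = "check port 22 inter 3000 rise 1 fall 2",
) -> List[str]:
    """Build 'server' lines for the rails_dispatchers listen section."""
    active_vms = list(active_vms)
    backup_vms = list(backup_vms)
    ports = list(ports)
    lines = []
    # Active servers, interleaved across VMs as in the config above.
    for i, port in enumerate(ports):
        for vm in active_vms:
            lines.append(f"server disp{vm}_{i} vm{vm}.vm:{port} {check}")
    # Backup servers receive traffic only when no active server is up.
    for vm in backup_vms:
        for i, port in enumerate(ports):
            lines.append(f"server disp{vm}_{i} vm{vm}.vm:{port} backup")
    return lines


if __name__ == "__main__":
    # VM numbers from the example config; replace with your own.
    for line in dispatcher_servers([83, 84], [85]):
        print(line)
```

Paste the output under the listen rails_dispatchers line, keeping the mode, balance, and option allbackups settings from the original section.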
changing the config files for lighty and Rails
You also need to change the configuration of lighty and Rails to use HAProxy (instead of the static allocation of VMs).
The original lighty config file specifies all the dispatcher VMs (and ports). With HAProxy, you need to specify just the HAProxy VM and port. Make the following changes to /etc/lighttpd/lighttpd.cfg. This assumes that HAProxy is running on the web server VM.
fastcgi.server = ( ".fcgi" =>
  ( "dispatcher-proxy" => ( "host" => "127.0.0.1", "port" => 10000 ) )
)

#fastcgi.server = ( ".fcgi" =>
#  (
#    "ri1-0" => ( "host" => "192.168.7.183", "port" => 9000 ),
#    "ri2-1" => ( "host" => "192.168.7.184", "port" => 9001 ),
#    "ri3-2" => ( "host" => "192.168.7.185", "port" => 9002 ),
#    "ri1-3" => ( "host" => "192.168.7.183", "port" => 9003 ),
#    "ri2-4" => ( "host" => "192.168.7.184", "port" => 9004 ),
#    "ri3-0" => ( "host" => "192.168.7.185", "port" => 9000 ),
#    "ri1-1" => ( "host" => "192.168.7.183", "port" => 9001 ),
#    "ri2-2" => ( "host" => "192.168.7.184", "port" => 9002 ),
#    "ri3-3" => ( "host" => "192.168.7.185", "port" => 9003 ),
#    "ri1-4" => ( "host" => "192.168.7.183", "port" => 9004 ),
#    "ri2-0" => ( "host" => "192.168.7.184", "port" => 9000 ),
#    "ri3-1" => ( "host" => "192.168.7.185", "port" => 9001 ),
#    "ri1-2" => ( "host" => "192.168.7.183", "port" => 9002 ),
#    "ri2-3" => ( "host" => "192.168.7.184", "port" => 9003 ),
#    "ri3-4" => ( "host" => "192.168.7.185", "port" => 9004 )
#  )
#)
For dispatchers, you need to change the database server and port. Make the following changes in the config/database.yml (in the production section):
# host: vm86.vm
host: vm80.vm
port: 10001
Also, if you're using memcached and you want to shut down the VM that's running it (and fail over to another VM), you need to access memcached through HAProxy as well. Make the following changes in config/environment.rb for all dispatcher VMs.
#memcache_servers = [ 'vm86.vm:11211' ]
memcache_servers = [ 'vm80.vm:10002' ]
