RADSClassFall06/team3/lab1

From RAD Lab

Jump to: navigation, search

Contents

Lab1

Team 3

Zhangxi Tan

Lilia Gutnik

Methodology and tools

In this lab, we configured vm30.vm as the web server (running on X7) and vm39.vm as the workload generator running on X8. The workload are collected from Rails production.log and researchindex_load's log. The production log is analyzed with a ruby script and railsanalyzer (pl_analyzer).

We fixed the number for fcgi dispatchers of rails server and varied the number of concurrent threads of workload generator. Although we didn't increase the webserver dispatchers, we still reach some very interesting conclusion regarding VM. In our analysis, we present the potential bottleneck of the current system and argue the performance measurment done under VMs are not faithful and realistic.

Production log parse script

We wrote this ruby parse script to use in addition to pl_analyze. This script does several things:

  • Separate the production.log based on PID of dispatcher
  • Sort the total response time/database time/rendering time based on the name of rails controller
 # system ('ps -ef | grep dispatch.fcgi> pidfile')
 
 
 pf = File.new("pidfile","r")
 
 actionlist = Hash.new
 loglist = Hash.new
 action = String.new  
 
 pf.each {
   |str|  
 
   if str =~ /\/www\/dispatch.fcgi/ 
     result = str.match(/\d+/)
     loglist["#{result}"] = ""
   end  
 }
 
 lf = File.new("production.log","r")
 
 lf.each { |line|
   if line =~ /host\srails\[\d+\]/ 
       str = line.match(/\[\d+\]/).to_s
       key = str.match(/\d+/)
       if loglist[key.to_s]
         loglist[key.to_s] = loglist[key.to_s] + line
       end
   end
 }
 
 
 loglist.each_pair do  |key, value|
   filename = "log" + key.to_s
   wf = File.new(filename,"w")
   wf.write value
   
 
   value.each {
    |line|

    str = line.match(/Processing[a-zA-Z\d\s#_\.\/]+/)
 
    if str != nil
      action = str.to_s.match(/\s[a-zA-Z\d\s#_\.\/]+/).to_s
 
      if actionlist.has_key?(action) == false        
         actionlist[action] = {"Action"=>action,"Total"=>0.0, "DB"=>0.0,"Rendering"=>0.0, "count"=>0, "URL"=>" "}
      end
    end
  
    if line =~ /Completed\sin/
         str = line.match(/http:\/\/[a-zA-Z\d\/\._\?@\+#]+/)     
         actionlist[action]["URL"] = str.to_s
       
         str = line.match(/Completed\sin\s(\d+)\.(\d+)/).to_s
         actionlist[action]["Total"] = actionlist[action]["Total"] + str.match(/(\d+)\.(\d+)/).to_s.to_f
 
         str = line.match(/Rendering:\s(\d+)\.(\d+)/).to_s
         actionlist[action]["Rendering"] = actionlist[action]["Rendering"] + str.match(/(\d+)\.(\d+)/).to_s.to_f
 
         str = line.match(/DB:\s(\d+)\.(\d+)/).to_s
         actionlist[action]["DB"] = actionlist[action]["DB"] + str.match(/(\d+)\.(\d+)/).to_s.to_f
 
         actionlist[action]["count"] = actionlist[action]["count"] + 1         
    end   
   }
 end
 
 totaltime = 0.0
 totalcount = 0
 
 wf = File.new("detailedlog","w")
  actionlist.each do |key, value|
   if  value["count"].to_f > 0
    avgTotal = value["Total"] / value["count"].to_f
    avgRendering = value["Rendering"] / value["count"].to_f
    avgDB = value["DB"] / value["count"].to_f
  else
    avgTotal = 0
    avgRendering = 0
    avgDB = 0
 end
   totaltime = totaltime + value["Total"]
   totalcount = totalcount + value["count"]
   wf.write key.to_s + "URL:" + value["URL"].to_s + ' ' + "Total:" + avgTotal.to_s + ' ' + "Rendering:" + avgRendering.to_s + ' ' +     "DB:" + avgDB.to_s + "\n"
 end 
 
 wf.write "Total response time:" + ' '+ totaltime.to_s + " # of requests:" + ' ' + totalcount.to_s

Logs collected

All of our result can be found on either vm30.vm or vm39.vm.

The following files represent the result of pl_analyze being run against the resulting production.log file from a run of researchindex_load. The format is as follows: #X th #Y iter _ PL.txt. #X represents the number of threads being run concurrently. #Y represents the number of iterations for each test.

The production logs are renamed in the same format to represent the number of threads in the test and the number of iterations per test. The production log format is production #x_#Y.log, with #x and #Y the thread and iteration count respectively.

as found in

 [root@vm30 home/railsdev/websites/researchindex/production/log]# 
 
 1th2iter_PL.txt
 1th5iter_PL.txt   
 2th5iter_PL.txt     
 3th5iter_PL.txt   
 4th5iter_PL.txt      
 5th5iter_PL.txt   
 6th5iter_PL.txt
 7th5iter_PL.txt 
 8th5iter_PL.txt
 10th5iter_PL.txt  
 12th5iter_PL.txt  
 16th5iter_PL.txt  
 20th5iter_PL.txt     
 30th5iter_PL.txt  
 40th5iter_PL.txt  
 80th5iter_PL.txt  

as found in

 [railsdev@vm39 ]#
 
 production1_2.log   
 production2_5.log   
 production3_5.log
 production4_5.log
 production5_5.log
 production6_5.log
 production7_5.log
 production8_5.log
 production10_5.log 
 production12_5.log
 production16_5.log    
 production20_5.log  
 production30_5.log
 production40_5.log
 production80_5.log

All the data are with 5 dispatchers.

Data Visualization and Bottleneck Analysis

The following figure shows average response time of different rails controller, when increase more users to web server. e.g. 10 thread means 10 concurrent workload generator. The x-axis presents are the names of rails controller invoked.

Image:Respond.JPG

This figure shows the average DB time in second.

Image:Dbt.JPG

This figure presents the average rendering time.

Image:Render.JPG

This graph is the percentage time spent in database.

Image:Dbr.JPG

From the workload generator side (user side) we observed:

10-thread:

      avg user time:   0.073717
       avg sys time:   0.158183
     total cpu time:   139.1400
       elapsed time:   152.2190
           req rate:     3.9417     

20-thread:

      avg user time:   0.102567
       avg sys time:   0.206667
     total cpu time:    92.7700
       elapsed time:    95.1076
           req rate:     3.1543  

30-thread:

      avg user time:   0.102567
       avg sys time:   0.206667
     total cpu time:    92.7700
       elapsed time:    95.1076
           req rate:     3.1543

80-thread:

      avg user time:   0.057275
       avg sys time:   0.216567
     total cpu time:   328.6100
       elapsed time:   332.4182
           req rate:     3.6099

The request rate is stable, which is 3-4/s.

We draw the following conclusions

  • With a fixed number of dispatchers (5 in all experiement), increasing number of concurrent thread won't help to saturate applications server. Increasing the number of dispatchers will help increase the concurrency of application server, because more request can be processed and throw to database concurrently.
  • SearchController#ajax_externalsearch takes longer time than other request but it requires relative less rendering and DB time. Probably, some external traffic dominates.
  • For most of rails controllers that access DB, database access time dominates (over 70%). If increasing number of dispatcher, this percentage is likely higher. Tables that are used by CollectionsController#show takes longer time to complete than others.

What is the real problem

Given the number saw above, it seems that increasing the number of dispatcher will exert more pressure on the DB backend. Before doing futher adjustment, we should have a look at who kills the CPU time? Was the server fully loaded?

The answer is YES! Let's look at this figure, which is the CPU utilization on VM during our experiement over the whole day. When we were doing the experiement, the CPU utilization already goes to 100%! Most of them are system CPU time (usuall over 90%).

Enlarge

More interesting results come from the top program. It indicates over 70% CPU time is consumed by 5 dispatchers. Therefore,

  • Adding more dispatchers will further increase the CPU share of dispatcher and make DB worse.
  • DB should be moved to another machine!

Why is the system CPU time so high?

System CPU time comes from system call which usually invoked by IO operations. To help understand this problem, let's look at the cpu utilization on x7 (physic node for vm30). This graph is the utilization of x7 for the whole day. x7 is a dual CPU machine. Assume this version of vmware server doesn't support multiprocessor, the utilization should be 50% if the VM CPU utilization is 100%. While we put more load on application server, the CPU utilization on x7 increases but the maximum is around 40%. If we increase the number of dispatchers of lighthttpd and use more concurrent copies of workload generator, the CPU utilization on x7 will goes down!

Enlarge

This is because I/O is more expensive under VM, if more requests are handled more I/O will be required to be performed (or rather translated by VM hypervisor. The I/O operations of rails application server come from two parts. One is from mysql database's disk access . Another is from dispatcher's network access. The more requests the more overhead using VM. Finally, CPU is no longer a bottle neck for our application.

We argue using VM as a way doing performance measurement will not offer faithful results. In addition, the VM timer interrupt is no longer accurate enough for finegrained operations. All of the problems mentioned can be worse if we use binary translation style VM on x86 machine.

Personal tools