Chef

From RAD Lab

Introduction

Chef is a systems integration framework written in Ruby. The basic idea is that you create a cookbook (a git repository) of recipes, which describe the services you might want to run, such as Hadoop, Hypertable, or Rails. Recipes are composed of resources (files, directories, processes) that you specify declaratively. When you run Chef, it goes through the cookbook, looks at the specific configuration for the node it is running on, and takes whatever actions are needed to bring the node into the right state. For example, on the Hadoop namenode the file system must be formatted before the namenode is started. In Chef this is expressed as follows:

execute "/opt/hadoop/bin/hadoop namenode -format" do 
  cwd "/opt/hadoop"
  creates "/mnt/hadoop/tmp/dfs/name"
end
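The creates line is what makes this resource safe to run repeatedly: Chef skips the command when the named path already exists, so the namenode is formatted only once. The plain-Ruby sketch below imitates that guard outside of Chef. The helper run_unless_created is hypothetical and only illustrates the idea; it is not part of Chef's API.

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical helper (not part of Chef): run the block only when the
# path given as `creates` does not exist yet, mirroring the guard above.
def run_unless_created(creates)
  return :skipped if File.exist?(creates)
  yield
  :ran
end

Dir.mktmpdir do |dir|
  # Stand-in for /mnt/hadoop/tmp/dfs/name in the recipe above.
  marker = File.join(dir, "name")

  first  = run_unless_created(marker) { FileUtils.mkdir_p(marker) }
  second = run_unless_created(marker) { FileUtils.mkdir_p(marker) }

  puts "first run: #{first}, second run: #{second}"
end
```

The second call is skipped because the first one created the path, which is the behaviour that lets a Chef run be repeated without reformatting the file system.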

It's also easy to tell Chef that a service should be running and should be restarted if it fails:

supervised_service "hadoop_namenode" do
  command "/opt/hadoop/bin/hadoop namenode"
end

Note that this also makes sure that only one copy is running at a time and that the logs get shipped somewhere useful.

Getting the Cookbook

The cookbook is available in a git repository at ssh://scm.millennium.berkeley.edu/project/cs/radlab/src/chef-repo.git.

Starting a Node with Chef

To start a node with Chef installed, go to the directory where you have checked out the cookbook and run:

ec2run ami-99cc2bf0 -k [key]

Creating a JSON Configuration File

You can set node-specific configuration by creating a file filled with JSON configuration variables, for example at /root/json. The file can contain the following settings:

  • Recipes - Here you list the recipes that should run on this node. Options include:
    • hadoop::namenode
    • hadoop::datanode
    • hypertable::master
    • hypertable::rangeserver
    • ec2::disk_prep
  • Hypertable Specific Config
    • If you want the Hypertable source and build in /usr/hypertable/, include the config "hypertable":{"install_type":"source"} after the recipe list.
  • Hadoop Specific Config
    • All of the slaves need to know where the master is, so set hadoop[:namenode] (in the JSON file, "hadoop":{"namenode":...}) to the EC2 internal address of the namenode.
  • Logging Config - There are two logging options that are configured by setting "log_type":
    • file (default) - log to rotated files in /mnt/services/<service name>/log
    • nc - send each log line over UDP on port 9001 to "log_host" (very useful for debugging)
  • ec2 Specific Config
    • The ec2::disk_prep recipe performs a first write to the disks of your EC2 instance.
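With the nc logging option, something has to be listening on "log_host" to receive the log lines. The sketch below is a minimal UDP receiver in plain Ruby. It assumes each datagram carries one log line as plain text, which the description above suggests but does not spell out. The helper receive_lines and the sample log line are made up for illustration, and the demo binds to an ephemeral loopback port so it can run anywhere.

```ruby
require 'socket'

# Hypothetical helper: read `count` datagrams from the socket and return
# their payloads (assumed to be one log line each).
def receive_lines(socket, count)
  Array.new(count) { socket.recvfrom(65_535).first }
end

receiver = UDPSocket.new
# Port 0 asks the OS for a free port for this demo. On a real log host
# you would bind to "0.0.0.0" and port 9001 instead.
receiver.bind("127.0.0.1", 0)
port = receiver.addr[1]

# Stand-in for a service that ships its log lines with log_type "nc".
sender = UDPSocket.new
sender.send("hadoop_namenode: example log line\n", 0, "127.0.0.1", port)

puts receive_lines(receiver, 1)

sender.close
receiver.close
```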

Example Hadoop/Hypertable master:

{"recipes":["hadoop::namenode","hadoop::datanode","hypertable::master", "hypertable::rangeserver"], 
"hadoop":{"namenode":"domU-12-31-39-00-88-27.compute-1.internal"}, "log_type":"nc", "log_host":"128.32.132.72"}

Example Hadoop/Hypertable slave:

{"recipes":["hadoop::datanode","hypertable::rangeserver"], 
"hadoop":{"namenode":"domU-12-31-39-00-88-27.compute-1.internal"}, "log_type":"nc", "log_host":"128.32.132.72"}

Making It All Happen

After creating the config file, run:

chef-solo -r http://cs.berkeley.edu/~marmbrus/cookbooks.tgz -j /root/json -l debug
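Writing the JSON file by hand is error-prone when many nodes share most of their settings. As a rough sketch, the file can be generated from Ruby with the standard json library. The helper node_config is hypothetical; it only uses the keys shown in the examples above, and it leaves out log_type and log_host when they are not given so that the default applies.

```ruby
require 'json'

# Hypothetical helper (not part of Chef or the cookbook): assemble the
# node configuration as a Ruby hash using the keys from the examples.
def node_config(recipes:, namenode:, log_type: nil, log_host: nil)
  config = {
    "recipes" => recipes,
    "hadoop"  => { "namenode" => namenode }
  }
  # Optional keys are added only when set, so the defaults still apply.
  config["log_type"] = log_type if log_type
  config["log_host"] = log_host if log_host
  config
end

slave = node_config(
  recipes:  ["hadoop::datanode", "hypertable::rangeserver"],
  namenode: "domU-12-31-39-00-88-27.compute-1.internal",
  log_type: "nc",
  log_host: "128.32.132.72"
)

puts JSON.pretty_generate(slave)
```

To use the result, write it to the path passed to chef-solo with -j, for example File.write("/root/json", JSON.generate(slave)).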

Chukwa Integration

If you use the golden image, you can use Chukwa to collect and archive your logs and system metrics.

To deploy Chukwa agents on an instance, you need a Chukwa configuration file such as the following:

   {
       "recipes": ["chukwa::default"],
       "chukwa": {
           "adaptors": [
               "add org.apache.hadoop.chukwa.datacollection.adaptor.ExecAdaptor Top 15000 /usr/bin/top -b -n 1 -c 0",
               "add org.apache.hadoop.chukwa.datacollection.adaptor.ExecAdaptor Df 60000 /bin/df -x nfs -x none 0",
               "add org.apache.hadoop.chukwa.datacollection.adaptor.ExecAdaptor Sar 1000 /usr/bin/sar -q -r -n ALL 55 0",
               "add org.apache.hadoop.chukwa.datacollection.adaptor.ExecAdaptor Iostat 1000 /usr/bin/iostat -x -k 55 2 0"
           ]
       }
   }

If you want to collect XTrace reports, add the following two lines to the end of the 'adaptors' list, and add a comma after the preceding entry so that the JSON stays valid:

           "add edu.berkeley.chukwa_xtrace.XtrAdaptor XTrace TcpReportSource 0",
           "add edu.berkeley.chukwa_xtrace.XtrAdaptor XTrace UdpReportSource 0"

The adaptors can be changed; see the Chukwa programming guide and the rest of the Chukwa documentation. This configuration file can be found in the Chef repository (git).
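When the adaptor list is edited by hand it is easy to leave out or misplace a comma. One way to avoid that is to build the list in Ruby and let the json library serialize it. The sketch below reuses the adaptor strings from this page; the constant names and the helper chukwa_config are made up for illustration.

```ruby
require 'json'

# Adaptor strings copied from the configuration shown above.
BASE_ADAPTORS = [
  "add org.apache.hadoop.chukwa.datacollection.adaptor.ExecAdaptor Top 15000 /usr/bin/top -b -n 1 -c 0",
  "add org.apache.hadoop.chukwa.datacollection.adaptor.ExecAdaptor Df 60000 /bin/df -x nfs -x none 0",
  "add org.apache.hadoop.chukwa.datacollection.adaptor.ExecAdaptor Sar 1000 /usr/bin/sar -q -r -n ALL 55 0",
  "add org.apache.hadoop.chukwa.datacollection.adaptor.ExecAdaptor Iostat 1000 /usr/bin/iostat -x -k 55 2 0"
].freeze

XTRACE_ADAPTORS = [
  "add edu.berkeley.chukwa_xtrace.XtrAdaptor XTrace TcpReportSource 0",
  "add edu.berkeley.chukwa_xtrace.XtrAdaptor XTrace UdpReportSource 0"
].freeze

# Hypothetical helper: build the Chukwa node configuration, optionally
# appending the two XTrace adaptors.
def chukwa_config(xtrace: false)
  adaptors = BASE_ADAPTORS.dup
  adaptors.concat(XTRACE_ADAPTORS) if xtrace
  { "recipes" => ["chukwa::default"], "chukwa" => { "adaptors" => adaptors } }
end

puts JSON.pretty_generate(chukwa_config(xtrace: true))
```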

To configure and deploy the Chukwa agent, run:

chef-solo -j <path/to/chukwa/config.js>

Custom Resources
