Archive for January, 2012

DevOps on AWS Cloud using Opscode Chef

‘Rule the Cloud’ with Chef
Chef is Infrastructure as Code: an API for your entire infrastructure. You are probably familiar with cloud computing, or have at least heard of it; it is still an evolving paradigm, and cloud computing companies are the newest buzz in the IT sector. Chef is used together with cloud infrastructure from providers such as Amazon's AWS. If the software being developed is a mix of interdependent technologies working in harmony, why shouldn't the people behind it work the same way? That thought has led to a new cultural trend called DevOps. Now suppose you have set up a number of instances in the cloud. What next? New cloud instances are just like bare-metal servers: they have to be configured from scratch. Doing that by hand is feasible for a couple of them, but what if the count grows to, say, 100 live instances running different Unix distributions? You could write a script, but it will not suffice in the long run once ongoing management is taken into account. This is where Chef comes into play.

“chef is sysadmin robot performing configuration tasks automatically and much more quickly than a single admin could ever hope to” – Jesse Robbins, Opscode CEO.

Chef is an open source configuration management tool written in Ruby. System configuration (recipes and cookbooks) is written in the Chef domain-specific language, which is pure Ruby.

Chef brings a new feel with its cookery-themed naming conventions: Cookbooks (which contain the code for installing and configuring a software package, in the form of Recipes), Knife (the command-line tool that talks to the Chef server API), Data Bags (which act like global variables), and so on.

Although many configuration management tools are in use across the industry, Chef has been able to secure its position in the race.

“Chef takes a step further, past Puppet and CFEngine, by doing ‘live search’ within configuration management: a load balancer can call out to get a list of the app servers it needs to balance, or an application server can call out and get a reference to the master database server, and so on. The centralised Chef server is indexing all the information about your infrastructure, so you can search it from the command line using knife in real time, and your applications can leverage that data.” – Seth Chisamore, Opscode.
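To picture what this "live search" buys you, here is a minimal plain-Ruby sketch. It is not Chef code: the `NODES` list, the role names, the IP addresses and the `nodes_with_role` helper are all made up for illustration, with an in-memory array standing in for the index that the Chef server maintains. In a real recipe the equivalent lookup would go through Chef's own `search` call against the server.

```ruby
# Toy stand-in for the Chef server's index. All node data below is made up.
NODES = [
  { name: 'web-lb-1', roles: ['load_balancer'], ip: '10.0.0.5'  },
  { name: 'app-1',    roles: ['app_server'],    ip: '10.0.0.11' },
  { name: 'app-2',    roles: ['app_server'],    ip: '10.0.0.12' },
  { name: 'db-1',     roles: ['db_master'],     ip: '10.0.0.21' }
].freeze

# Hypothetical helper: return every node that carries the given role.
def nodes_with_role(nodes, role)
  nodes.select { |node| node[:roles].include?(role) }
end

# A load balancer asking "which app servers should I balance across?"
backends = nodes_with_role(NODES, 'app_server').map { |node| node[:ip] }
puts "balance across: #{backends.join(', ')}"

# An app server asking "where is the master database?"
db = nodes_with_role(NODES, 'db_master').first
puts "database master: #{db[:ip]}"
```

The point of the pattern is that nothing is hard-coded in the load balancer's configuration: add a third app server to the index and the next run picks it up.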

A technology peak that isn't fluffy – Cloud
For those new to cloud: it is a whole set of activities that began as innovations, were later offered as products, and have now become so widespread and feature-complete that they are suitable to be delivered as utility services.

So saying you don't want cloud in your business is like saying you don't want to use the electricity grid and would rather build your own generator and run it as needed. What you lose by continuing that way is your competitive edge: you are under constant pressure to keep your own stuff upgraded just to hold your place relative to the others in the ecosystem.

Cloud is API oriented; everything you see in the cloud is ultimately programmable.

Virtualization is the foundation of cloud, but virtualization is not cloud by itself. It certainly enables many of the things we talk about when we talk cloud, but it is not necessarily sufficient to make a cloud, and it is not strictly required either: Google App Engine, for example, is a cloud that does not incorporate virtualization. One of the reasons virtualization is great is that you can automate the procurement of new boxes.

A culture that's on a path to revolutionize IT – DevOps
DevOps originated predominantly in web shops, and it requires a kind of tooling that was not really available except as home-grown tools, which the big web shops built over and over again. Organisations that wanted to adopt DevOps started using the tools that enable this transition. Since most organisations depend on the web as a source of revenue in a variety of ways, even enterprises want to be as agile as the web shops. This has started a revolution that is spreading from web companies into the enterprise.

Consider a real-life example of DevOps, say Facebook, the most popular social networking site. There is a lot of communication and cross talk between developers, QA and operations: developers have to write code, QA has to make sure that only good code goes out, and the operations team has to make sure everything is up and running. On top of that, all of it has to be recorded, which altogether is inefficient, and this led the whole system to evolve. In conventional practice, developers write the code and throw it over to testing; once testing is done, it moves on to operations, and so on. In DevOps, by contrast, developers and operations are involved in the entire lifecycle of the project as one team. This creates a symbiotic relationship: operations people understand what the engineers need most, and developers see the value that operations people bring as they make architecture decisions.

Cloud combined with DevOps offers some fantastic properties: the ability to apply to your infrastructure all the advances that software development has made in repeatability and testability, the ability to scale up in real time as needed (autoscaling), and, among other things, the ability to harness self-healing systems. DevOps is better with cloud.

Configuration management, for example Chef, is one of the most fundamental elements enabling DevOps in the cloud. It lets you start from VMs that have just enough OS to be provisioned automatically through virtualization, and then assign each one a distinct purpose within the cloud through configuration management. The CM system turns the lightly provisioned VM into the type of server it is intended to be.

DevOps & Chef
DevOps is nothing but a cultural movement in which everybody (developers, QA, operations, testing and so on) gets along. Project groups are formed with a mixed skill set that blurs the line between, say, a developer and a sysadmin. This helps the project meet its deadlines and avoid unexpected situations. Cloud computing acts as a catalyst for this movement, and that is where Chef hops in.

Chef forms a critical layer in the DevOps stack. Thanks to infrastructure as code and virtualization, we can define and build our infrastructure from text files. Those files can be version-controlled and tested like regular code. The resulting artifact (an AMI or other image) can then be deployed onto an infrastructure. The following image gives an overview of the similarities.

Inadvertently, arguments over whether something is an application issue or an infrastructure issue get resolved: the fact is that the application is the infrastructure and the infrastructure is the application, and we are all here to enable the business. It has also helped bring the people on the team into better alignment across the board.

Chef configuration is written in pure Ruby.

DevOps == Ruby

For those who think Bash is enough as a scripting language: Bash becomes a liability rather than an asset once your script exceeds 100 lines, and a total nightmare if you need to parse or output HTML, CSV, XML, JSON and the like. A significant point is that Chef recipes are written in Ruby, unlike Puppet, which uses its own declarative configuration language (Puppet itself is implemented in Ruby), although Chef is heavily inspired by Puppet. If you choose Chef, you are effectively scripting your infrastructure with Ruby.

Though Chef was only released on January 15th, 2009, it has seen rapid adoption and gained a large number of contributors. According to the Opscode wiki, there are 545 approved contributors to Opscode projects, along with 106 companies. Beyond that, the #chef IRC channel is typically attended by over 100 users and Opscode staff: signs of a healthy, growing open source community.

The SpringSource division of VMware has signed on to contribute to the project. They are even being very public about it, as seen in this endorsement:

“We are excited about the open source contributions the Springsource Division of VMware has made to Opscode Chef.” said Javier Soltero, CTO of Springsource Management Products at VMware. “Chef is an important tool for automating infrastructure management and we look forward to its continued growth and success.”

Moreover, in my own experience of using Chef, I really enjoyed the quick responses I got from the Opscode support team to all my queries; they were always able to direct me towards a solution.

Automation using Chef: creating an instance on Amazon's cloud with the Apache web server configured on it

Memo
chef-workstation – the machine where we customize our cookbooks and maintain the chef-repo
chef node – the managed node that we create using Chef; it configures itself based on its run list and the cookbooks it downloads

The really cool thing about Chef is that you can rerun cookbooks against a node and it will not redo anything it has already done, i.e. it will not change the end result on the target node as defined by the recipes being run against it. You always get the same outcome no matter what state the node starts in: actions that have already been carried out are skipped, and actions that have not yet been carried out are run. When reading about Chef you will see this described as being idempotent (there, I've saved you looking it up).
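To make idempotence concrete, here is a minimal plain-Ruby sketch. It is not how Chef is implemented; the `ensure_line` helper, the file name and the config line are hypothetical. It only shows the check-then-act pattern described above: compare the current state with the desired state and act only when they differ.

```ruby
require 'tmpdir'

# Hypothetical helper (not part of Chef): make sure `line` is present in the
# file at `path`. Returns true if the file had to be changed, false if it was
# already in the desired state.
def ensure_line(path, line)
  current = File.exist?(path) ? File.readlines(path, chomp: true) : []
  return false if current.include?(line) # desired state already reached

  File.open(path, 'a') { |f| f.puts(line) }
  true
end

Dir.mktmpdir do |dir|
  conf = File.join(dir, 'example.conf')
  first  = ensure_line(conf, 'ServerName example.local')
  second = ensure_line(conf, 'ServerName example.local')
  puts "first run changed the file:  #{first}"
  puts "second run changed the file: #{second}"
end
```

Running the same step twice leaves the file with a single copy of the line, which is exactly the property that makes it safe to rerun a cookbook against a node.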

Prerequisites – an AWS account, EC2 API configured, OS – Ubuntu.

1. Sign up for an account at http://www.opscode.com/hosted-chef/# . Here we use OHC (Opscode Hosted Chef), where we get to create up to 5 nodes for free!

2. Verify your Opscode account.

3. Download the files

Create an organization on the Console page at www.manage.opscode.com, and then download the following files:

  • Your Organization validation key. This is used to automatically register new Chef Clients (like servers you manage).
  • The Knife configuration file.
  • Your User key. This is used to authenticate your user with Hosted Chef.

Then edit knife.rb to add your AWS access key and secret access key:

    knife[:aws_access_key_id]     = "Your AWS Access Key"
    knife[:aws_secret_access_key] = "Your AWS Secret Access Key"

At this stage I have a Chef-ready user environment and an Opscode organisation set up, and I want to start by spinning up an EC2 instance. I will not go into any depth on the EC2 specifics, as that would make this post far too long.

4. Setting up the chef-workstation

Install Ruby and Development Tools

#sudo apt-get update
#sudo apt-get install ruby ruby-dev libopenssl-ruby rdoc ri irb build-essential wget ssl-cert git-core
#sudo gem update --system

Install RubyGems

#cd /tmp
#wget http://production.cf.rubygems.org/rubygems/rubygems-1.8.10.tgz
#tar zxf rubygems-1.8.10.tgz
#cd rubygems-1.8.10
#sudo ruby setup.rb --no-format-executable

Install Chef

#sudo gem install chef

5. Verify the Chef installation

#chef-client -v

6. Build the Chef repository

#cd ~
#git clone https://github.com/opscode/chef-repo.git

Knife reads its configuration files from the .chef directory, so we need to create that as well:

#mkdir -p ~/chef-repo/.chef

Copy the keys and knife configuration you downloaded earlier into this directory:

#cp USERNAME.pem ~/chef-repo/.chef
#cp ORGANIZATION-validator.pem ~/chef-repo/.chef
#cp knife.rb ~/chef-repo/.chef

Run the following command to confirm knife is working with the Hosted Chef API.

#cd ~/chef-repo
#knife client list

output : “ORGANIZATION-validator”

7. Now I need to download the apache2 cookbook onto my workstation, customize it if required, and then upload it to my account on the Opscode platform

#knife cookbook site install apache2

This pulls down the desired cookbook from the community site and commits it into the chef-repo Git repository.

8. Upload the cookbook using the following command

#knife cookbook upload apache2

9. Enter the following command, sit back and enjoy the show!

#knife ec2 server create -G default -I ami-1212ef7b -f m1.small -S <aws ssh key id> -i <ssh identity file> -x root -r 'recipe[apache2]'
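If you launch instances often, it can help to assemble that command line in a script rather than retyping it. The following is only a sketch: the `knife_ec2_create_command` helper, the key pair name and the identity file path are hypothetical placeholders, while the flags and default values are the ones used in step 9. It builds the command string and does not run knife.

```ruby
require 'shellwords'

# Hypothetical helper: assemble the `knife ec2 server create` command from
# step 9. It only builds the string; it does not execute anything.
def knife_ec2_create_command(ssh_key_id:, identity_file:, run_list:,
                             ami: 'ami-1212ef7b', flavor: 'm1.small',
                             group: 'default', ssh_user: 'root')
  args = ['knife', 'ec2', 'server', 'create',
          '-G', group, '-I', ami, '-f', flavor,
          '-S', ssh_key_id, '-i', identity_file,
          '-x', ssh_user, '-r', run_list]
  Shellwords.join(args) # escapes shell-special characters such as [ and ]
end

puts knife_ec2_create_command(
  ssh_key_id: 'my-keypair',                         # placeholder key pair name
  identity_file: '/home/ubuntu/.ssh/my-keypair.pem', # placeholder path
  run_list: 'recipe[apache2]'
)
```

The square brackets in the run list are glob characters to the shell, which is why the command in step 9 quotes `recipe[apache2]`; `Shellwords.join` takes care of the same escaping here.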


Before proceeding, it is probably a good idea to take some time to read the Opscode Chef Recipe wiki, which has a nice, clear explanation of cookbook namespaces. Also remind yourself of the components that make up a cookbook. It is worth noting that recipes manage resources, and that those resources are executed in the order in which they occur.

Splunk on AWS EC2 Cloud

What is Splunk?

Splunk is a log, monitoring and reporting tool for IT system administrators, with search capabilities. It crawls logs, metrics and other data from applications, servers and network devices, and indexes them in a searchable repository from which it can generate graphs, reports and alerts. Splunk can easily be set up on an AWS machine with EBS volumes as archival storage; the archive can then be synced periodically from EBS to an S3 bucket, or EBS snapshots can be taken, to back up the logs for future use.

Generally it is hard to track logs across servers. We already have monitoring tools such as Nagios and Zabbix; Splunk is a newer tool and a broader solution that provides visibility into dynamic and complex environments. For example, suppose an application seems very slow, not because the app itself has an issue but because the server is short of free memory. That kind of detail can be found from within the Splunk server.

Why do we go for Splunk?

In auto-scaled setups, where instances run behind a load balancer, servers scale up and down, and sometimes an instance is terminated without any alert. In that situation it helps to have the login sessions from around the time the server went down, along with the server access logs, so that we can track down the reason. Managing logs on each server is really hard, and the logs end up spread across different locations. To address this, we set Splunk up to listen on a TCP port and have all the other servers pass their logs to this host over the network, which gives you a centralized, indexed log repository for all of your services.

Here I will guide you through deploying Splunk on AWS EC2 and configuring the Splunk forwarder on a remote machine. Splunk is very flexible and easy to install on any server. You can choose appropriate hardware for your deployment by following Splunk's capacity planning guidance.
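The idea of servers passing their logs to one host over TCP can be sketched in a few lines of plain Ruby. This is an illustration under assumptions, not Splunk's forwarder: the `send_log_line` helper and the message format are hypothetical, and a throwaway local listener stands in for the Splunk host so the example runs on its own.

```ruby
require 'socket'

# Hypothetical helper: send one log line to a TCP listener, such as a TCP
# data input on a central log host. Host and port are supplied by the caller.
def send_log_line(host, port, message)
  stamp = Time.now.utc.strftime('%Y-%m-%dT%H:%M:%SZ')
  TCPSocket.open(host, port) do |sock|
    sock.puts("#{stamp} #{Socket.gethostname} #{message}")
  end
end

# Demo against a throwaway local listener standing in for the log host.
server = TCPServer.new('127.0.0.1', 0) # port 0 = let the OS pick a free port
port = server.addr[1]
reader = Thread.new do
  client = server.accept
  line = client.gets
  client.close
  line
end

send_log_line('127.0.0.1', port, 'httpd restarted')
puts reader.value
server.close
```

In practice the Splunk forwarder covered below does this job for you, including buffering and reconnects; the sketch only shows what "pass the logs to a central host" means at the network level.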

Once you have installed the Splunk server, start it using the command given below.
[NOTE: Here Splunk is installed under /opt]

/opt/splunk/bin/splunk start

Now you can access the Splunk web UI using the URL given below:

http://domain.com:8000

Splunk needs to be configured so that it can receive data from the remote machine. To do this, follow these steps:

1. Log in to the Splunk web UI, e.g. http://10.10.10.35:8000
2. Go to Manager –> Forwarding and receiving –> Receive data
3. Click the New button and add the default port, i.e. 9997
4. Click the Save button to save the settings.
NOTE: Make sure that the port is open on the server so that it can accept data from the remote machine.

Next, install the Splunk forwarder on the remote machine. Once it is installed, start it as shown below:

/opt/splunk/bin/splunk start

Then, from the /opt/splunk/bin directory, enable the light forwarder, point it at the Splunk server and restart Splunk:

./splunk enable app SplunkLightForwarder -auth
Splunk username: admin
Password: changeme
./splunk add forward-server 10.10.10.35:9997 -auth admin
./splunk restart

After a few minutes you can see the logs being indexed on the Splunk real-time dashboard.

Generally, in a Splunk deployment, a deployment server pushes configuration out to deployment clients, which are grouped into server classes. The Splunk deployment server is a centralized manager for several Splunk instances, known as deployment clients. A deployment client is the Splunk instance installed on the remote machine, which passes its logs on to the Splunk server.

 

 

Splunk generally collects data from remote machines, covering both machine-to-machine and human-to-machine interaction. It indexes the collected data, generates reports from it and drives alerts. Email alerts can be configured for specific conditions; for example, we can have an alert mail sent whenever a log entry containing an error message is found. Splunk can work through these large volumes of data, providing visibility and intelligence to IT and the data warehouse, and it can perform both real-time and historical analysis of the bulk data from the remote machines.

Its ease of use, installation and deployment set this application apart from others. Splunk is very useful for development teams in finding and fixing bugs, and it helps provide real-time insights.
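The alert condition described above ("any log containing an error message") is simple to picture in code. The sketch below is plain Ruby, not Splunk's search language or alerting API; the `ERROR_PATTERN` regular expression, the `error_lines` helper and the sample log lines are all made up for illustration.

```ruby
# Hypothetical alert condition: a log line counts as an error if it contains
# one of these words (case-insensitive, whole word).
ERROR_PATTERN = /\b(error|fatal|exception)\b/i

# Return only the lines that match the alert condition.
def error_lines(lines)
  lines.select { |line| line.match?(ERROR_PATTERN) }
end

# Made-up sample log lines.
sample = [
  '2012-01-20 10:01:02 INFO  request served in 120ms',
  '2012-01-20 10:01:05 ERROR out of memory while handling /report',
  '2012-01-20 10:01:09 WARN  free memory below 5%'
]

matches = error_lines(sample)
puts "ALERT: #{matches.size} error line(s) found" unless matches.empty?
matches.each { |line| puts "  #{line}" }
```

In Splunk itself the same condition would be expressed as a saved search with an alert action attached; the value Splunk adds is running it continuously over the indexed logs of every machine.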