
Splunk on AWS EC2 Cloud

What is Splunk?

Splunk is a log monitoring and reporting tool for IT system administrators, with powerful search capabilities. It crawls logs, metrics and other data from applications, servers and network devices and indexes them in a searchable repository from which it can generate graphs, reports and alerts. Splunk can easily be set up on an AWS machine with EBS volumes used as archival storage, periodically syncing the archive from EBS to an S3 bucket or taking EBS snapshots to back up the logs for future use.

Generally it is hard to track logs across servers. We do have monitoring tools such as Nagios and Zabbix, but Splunk is a broader solution that provides visibility inside dynamic and complex environments. For example, an application may seem very slow not because the application itself has an issue, but because the server is running out of free memory. These kinds of details can be obtained from the Splunk server.

Why do we go for Splunk?

In auto-scaled scenarios where instances run behind a load balancer, servers scale up and down, and sometimes an instance gets terminated without any alert. In such situations it is useful to have the login sessions and server access logs from the period when the server went down, so that we can track the reason for the failure. Managing logs on each server is hard, and the logs live in different locations. To address this problem, we set up Splunk to listen on a TCP port, have all the other servers forward their logs to this host, and thereby get a centralized, indexed log repository for all of our services.

Here I will guide you through deploying Splunk on AWS EC2 and configuring the Splunk forwarder on the remote machines. Splunk is very flexible and easy to install on any server. You can select the appropriate hardware capacity planning for your Splunk deployment from here.

Once you have installed the Splunk server, follow the steps given below to start the app.

Start Splunk using the command given below:
[NOTE: Here Splunk is installed under /opt]

/opt/splunk/bin/splunk start

Now you can access the Splunk web UI using the URL given below:

http://domain.com:8000

Splunk needs to be configured so that it can receive data from the remote machines. For this, follow the steps given below:

1. Log in to the Splunk web UI, e.g. http://10.10.10.35:8000
2. Go to Manager -> Forwarding and receiving -> Receive data
3. Click on the New button and add the default port, i.e. 9997
4. Click on the Save button to save the settings.
NOTE: Make sure that this port is open so that the server can accept data from the remote machines.
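The same receiving port can also be enabled from the command line instead of the web UI; a minimal sketch, assuming the default admin credentials:

/opt/splunk/bin/splunk enable listen 9997 -auth admin:changeme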

Next you will need to install the Splunk forwarder on the remote machine. Once you have installed the forwarder, start it as shown below:

/opt/splunk/bin/splunk start

Then enable the forwarder using the commands given below and restart Splunk.

./splunk enable app SplunkLightForwarder -auth
Splunk username: admin
Password: changeme
./splunk add forward-server 10.10.10.35:9997 -auth admin
./splunk restart
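The forwarder also needs to be told which files to send. A minimal sketch, assuming you want to ship everything under /var/log (adjust the path to your own log locations):

./splunk add monitor /var/log -auth admin:changeme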

After a few minutes you can see Splunk indexing all the logs on the real-time dashboard.

Generally, in a Splunk deployment we have a deployment server which pushes configuration on to deployment clients, grouped into server classes. The Splunk deployment server is a centralized manager which manages several Splunk instances known as deployment clients. A deployment client is the Splunk instance installed on a remote machine, and it forwards its logs to the Splunk deployment server.

Splunk generally collects data from the remote machines covering both machine-to-machine and human-to-machine interaction. It indexes the collected data in its engine, generates reports and drives alerts. Email alerts can be configured for specific conditions; for example, we can configure an alert mail to be sent whenever a log containing error messages is found. Splunk handles these large volumes of data, provides visibility and intelligence to IT and data warehouse teams, and can perform both real-time and historical analysis of the bulk data from the remote machines.

Its ease of use, installation and deployment make this application stand out from others. Splunk is very useful for development teams to find and fix bugs, and it also helps provide real-time insights.

Mapreduce using Hadoop + pig/hive on AWS EC2 hadoop cluster

This article discusses running MapReduce jobs using the Apache tools Pig and Hive. Before we can process the data, we need to upload the files to be processed to HDFS/S3. We recommend uploading to HDFS and keeping copies of the important files in S3 for backup. S3 is easily accessible from the command line using tools like s3cmd, while HDFS is a failover cluster filesystem which provides enough protection for your data against instance failures.

Mapreduce

MapReduce is a programming model and an associated implementation for processing and generating large data sets. We can specify a map function that processes a key/value pair to generate a set of intermediate key/value pairs, and a reduce function that merges all intermediate values associated with the same intermediate key.

The main entities involved in running a Hadoop job are:

  1. The client, which submits the MapReduce job.
  2. The jobtracker, which coordinates the job run. The jobtracker is a Java application whose main class is JobTracker.
  3. The tasktrackers, which run the tasks that the job has been split into. Tasktrackers are Java applications whose main class is TaskTracker.
  4. The distributed filesystem (normally HDFS), which is used for sharing job files between the other entities.

Hadoop Map/Reduce is very powerful, but writing raw MapReduce jobs:

• Requires a Java programmer.

• Is harder to write and time consuming.

• Is difficult to update frequently.

A solution is to run the jobs using Pig (Pig Latin) or Hive (HiveQL).

Pig

• An engine for executing programs on top of Hadoop

• It provides a language, Pig Latin, to specify these programs

Pig has two main parts:

– A high level language to express data analysis

– Compiler to generate mapreduce programs (which can run on top of Hadoop)

Pig Latin is the name of the language in which Pig scripts are written. Pig also provides an interactive shell, called Grunt, for executing simple commands. Pig Latin is a high-level language, and Pig runs on top of Hadoop: it collects the data for processing from the Hadoop HDFS filesystem and submits the jobs to the Hadoop MapReduce system.

A sample MapReduce job (like a Hello World program) using Pig is given below.

It is assumed that you are on one of the machines that is part of a Hadoop cluster having a NameNode/DataNode as well as a JobTracker/TaskTracker set up.

We will be executing Pig Latin commands using the Grunt shell. Switch to the hadoop user first.

Consider we have a file 'users' on our local filesystem which contains the data to be processed. First we have to upload it to HDFS; a sample command is given below.
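A minimal sketch of the upload, assuming the hadoop user's HDFS home directory (/user/hadoop) as the destination; the 'pages' file used later in the script can be uploaded the same way:

# hadoop fs -put users /user/hadoop/users

# hadoop fs -put pages /user/hadoop/pages

With the data in place, launch Pig: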

# pig -x mapreduce

this command will take you to the Grunt shell. Pig Latin statements are generally organized in the following manner: a LOAD statement reads data from the file system, then we process the data, and a STORE statement writes the output back to the file system. A DUMP statement displays output on the screen.

grunt> Users = load 'users' as (name, age);

grunt> Fltrd = filter Users by age >= 18 and age <= 25;

grunt> Pages = load 'pages' as (user, url);

grunt> Jnd = join Fltrd by name, Pages by user;

grunt> Grpd = group Jnd by url;

grunt> Smmd = foreach Grpd generate group, COUNT(Jnd) as clicks;

grunt> Srtd = order Smmd by clicks desc;

grunt> Top5 = limit Srtd 5;

grunt> store Top5 into 'top5sites';
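The same statements can also be saved to a script file and run in batch mode; a rough sketch, where the file name top5sites.pig is chosen here purely for illustration and the output directory must not already exist:

# pig -x mapreduce top5sites.pig

# hadoop fs -cat top5sites/part-*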

We can also view the progress of the job through the web interface http://<ipaddress of jobtracker machine>:50030.

Tools like PigPen (an Eclipse plugin) are available that help us create Pig scripts, test them using the example generator and then submit them to a Hadoop cluster.

There is another tool called Oozie, a server-based workflow engine specialized in running workflow jobs with actions that run Hadoop Map/Reduce and Pig jobs.

Pig tasks can be modeled as a workflow in Oozie. These workflows are deployed to the Oozie server using a command line utility. Once deployed, the workflows can be started and manipulated as necessary using the same utility, and once a workflow is started Oozie runs through each step of the flow. The web console for the Oozie server can be used to monitor the progress of the various workflow jobs being managed by the server.

Hive

Pig was causing some slowdowns at Facebook, as it needed training to bring business intelligence users up to speed. So the development team decided to write Hive, which has an SQL-like syntax.

Apache Hive is a data warehouse infrastructure built on top of Apache Hadoop. It provides tools for querying and analysis of large data sets stored in Hadoop files. Hive defines a simple SQL-like query language, called HiveQL, that enables users familiar with SQL to query the data. Also it allows custom mappers and reducers to perform more sophisticated analysis that may not be supported by the built-in capabilities of the language.

Some sample queries in HiveQL, which is very similar to SQL, are given below.

# show tables;

# describe <tablename>;

# SELECT * FROM <tablename> LIMIT 10;

#  CREATE TABLE table_name

#  ALTER TABLE table_name RENAME TO new_table_name

#  DROP TABLE table_name
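As a slightly fuller sketch, the 'users' file from the Pig example above could be loaded into a Hive table and queried; the table name, column types and tab delimiter here are assumptions for illustration only:

# hive -e "CREATE TABLE users (name STRING, age INT) ROW FORMAT DELIMITED FIELDS TERMINATED BY '\t';"

# hive -e "LOAD DATA INPATH '/user/hadoop/users' INTO TABLE users;"

# hive -e "SELECT name, age FROM users WHERE age >= 18 AND age <= 25 LIMIT 10;"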

NoSQL databases like Cassandra provide support for hadoop. Cassandra supports running Hadoop MapReduce jobs against the Cassandra cluster. With proper cluster configuration, MapReduce jobs can retrieve data from Cassandra and then output results either back into Cassandra, or into a file system.

HADOOP Cluster on AWS EC2 with hadoop-0.20 and ubuntu-10.04

Let’s start with a small introduction: what is Hadoop? Hadoop is an open-source project administered by the Apache Software Foundation. Apache Hadoop is a Java software framework that supports data-intensive distributed applications under a free license. It enables applications to work with thousands of nodes and petabytes of data. Hadoop was inspired by Google’s MapReduce and Google File System (GFS) papers.

Technically, Hadoop consists of two key services: reliable data storage using the Hadoop Distributed File System (HDFS) and high-performance parallel data processing using a technique called MapReduce.

Dealing with big data requires two things:

  • Inexpensive, reliable storage; and
  • New tools for analyzing unstructured and structured data.

Hadoop creates clusters of machines and coordinates work among them. Clusters can be built with inexpensive computers. If one fails, Hadoop continues to operate the cluster without losing data or interrupting work, by shifting work to the remaining machines in the cluster.

HDFS manages storage on the cluster by breaking incoming files into pieces, called “blocks,” and storing each of the blocks redundantly across the pool of servers.

The main services running in a hadoop cluster will be

1) namenode

2) jobtracker

3) secondarynamenode

These three services will be running only on a single node (machine); that machine is the central machine which controls the cluster.

4) datanode

5) tasktracker

These two services will be running on all other nodes in the cluster.

HDFS has a master/slave architecture. An HDFS cluster consists of a single NameNode, a master server that manages the file system namespace and regulates access to files by clients. In addition, there are a number of DataNodes, usually one per node in the cluster, which manage storage attached to the nodes that they run on.

Above the file system comes the MapReduce engine, which consists of one JobTracker, to which client applications submit MapReduce jobs. The JobTracker pushes work out to available TaskTracker nodes in the cluster, striving to keep the work as close to the data as possible.

The only purpose of the secondary namenode is to perform periodic checkpoints: it periodically downloads the current namenode image and edits log files, merges them into a new image and uploads the new image back to the (primary and only) namenode.

Now let us have a look at how to build a Hadoop cluster using Cloudera hadoop-0.20 on Ubuntu 10.04.

You should install the Sun JDK first. Then add the following repositories to the apt sources list.

vim /etc/apt/sources.list.d/cloudera.list

[bash]

deb http://archive.cloudera.com/debian lucid-cdh3u0 contrib

deb-src http://archive.cloudera.com/debian lucid-cdh3u0 contrib

[/bash]

Import key

[bash]curl -s http://archive.cloudera.com/debian/archive.key | apt-key add -[/bash]

Then run

[bash]apt-get update[/bash]

For Namenode/Jobtracker ( These two services should run only on a single central machine in the cluster)

[bash]

apt-get install hadoop --yes

apt-get install hadoop-0.20-namenode

apt-get install hadoop-0.20-jobtracker

apt-get install hadoop-0.20-secondarynamenode

[/bash]

Configuration

vim /etc/hadoop/conf/hadoop-env.sh

Append these

[bash]

export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.24/   # your java home comes here

export HADOOP_CONF_DIR=/etc/hadoop/conf

export HADOOP_HOME=/usr/lib/hadoop-0.20

export HADOOP_NAMENODE_USER=hdfs

export HADOOP_SECONDARYNAMENODE_USER=hdfs

export HADOOP_DATANODE_USER=hdfs

export HADOOP_JOBTRACKER_USER=mapred

export HADOOP_TASKTRACKER_USER=mapred

export HADOOP_IDENT_STRING=hadoop

[/bash]

vim /etc/hadoop/conf/core-site.xml

[bash]

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>

<name>fs.default.name</name>

<value>hdfs://< ip address of this machine >:8020</value>

</property>

</configuration>

[/bash]

vim /etc/hadoop/conf/hdfs-site.xml

[bash]

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>

<name>dfs.name.dir</name>

<value>/var/lib/hadoop-0.20/name</value>

</property>

<property>

<name>dfs.data.dir</name>

<value>/var/lib/hadoop-0.20/data</value>

</property>

<property>

<name>dfs.replication</name>

<value>2</value>

</property>

</configuration>

[/bash]

vim /etc/hadoop/conf/mapred-site.xml

[bash]

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>

<name>mapred.job.tracker</name>

<value>< ip address of this machine >:8021</value>

</property>

<property>

<name>mapred.system.dir</name>

<value>/var/lib/hadoop-0.20/system</value>

</property>

<property>

<name>mapred.local.dir</name>

<value>/var/lib/hadoop-0.20/mapred</value>

</property>

</configuration>

[/bash]

——————————————————————————————————————————————

[bash]

mkdir /var/lib/hadoop-0.20/name

mkdir /var/lib/hadoop-0.20/data

mkdir /var/lib/hadoop-0.20/system

mkdir /var/lib/hadoop-0.20/mapred

chown -R hdfs /var/lib/hadoop-0.20/name

chown -R hdfs /var/lib/hadoop-0.20/data

chown -R mapred /var/lib/hadoop-0.20/mapred

[/bash]

Now format NameNode

[bash]yes Y | /usr/bin/hadoop namenode -format[/bash]

Start namenode

[bash]/etc/init.d/hadoop-0.20-namenode start[/bash]

Check the log files for errors:

less /usr/lib/hadoop-0.20/logs/hadoop-hadoop-namenode-<ip>.log

Also you can check whether the Namenode process is up or not using the command

[bash]# jps[/bash]

Start the SecondaryNamenode

[bash]/etc/init.d/hadoop-0.20-secondarynamenode start[/bash]

Log: less /usr/lib/hadoop-0.20/logs/hadoop-hadoop-secondarynamenode-<ip>.log

[bash]

sudo -u hdfs hadoop fs -mkdir /var/lib/hadoop-0.20/system

sudo -u hdfs hadoop fs -chown mapred /var/lib/hadoop-0.20/system

[/bash]

Now Start the JobTracker

[bash]/etc/init.d/hadoop-0.20-jobtracker start[/bash]

Log : less /usr/lib/hadoop-0.20/logs/hadoop-hadoop-jobtracker-ip-10-108-39-34.log

Now the jps command will show the three processes up:

# jps

19233 JobTracker

18994 SecondaryNameNode

18871 NameNode

For Datanode/Tasktracker ( These two services should be running on all the other machines in the cluster )

[bash]

apt-get install hadoop-0.20-datanode

apt-get install hadoop-0.20-tasktracker

[/bash]

Configuration

vim /etc/hadoop/conf/core-site.xml

[bash]

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>

<name>fs.default.name</name>

<value>hdfs://< ip address of the namenode >:8020</value>

</property>

</configuration>

[/bash]

vim /etc/hadoop/conf/hdfs-site.xml

[bash]

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>

<name>dfs.name.dir</name>

<value>/var/lib/hadoop-0.20/name</value>

</property>

<property>

<name>dfs.data.dir</name>

<value>/var/lib/hadoop-0.20/data</value>

</property>

<property>

<name>dfs.replication</name>

<value>2</value>

</property>

</configuration>

[/bash]

vim /etc/hadoop/conf/mapred-site.xml

[bash]

<?xml version="1.0"?>

<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>

<property>

<name>mapred.job.tracker</name>

<value>< ip address of jobtracker  >:8021</value>

</property>

<property>

<name>mapred.system.dir</name>

<value>/var/lib/hadoop-0.20/system</value>

</property>

<property>

<name>mapred.local.dir</name>

<value>/var/lib/hadoop-0.20/mapred</value>

</property>

</configuration>

[/bash]

———————————————————————————————————————————————

[bash]

mkdir  /var/lib/hadoop-0.20/data/

chown -R hdfs /var/lib/hadoop-0.20/data

mkdir /var/lib/hadoop-0.20/mapred

chown -R mapred /var/lib/hadoop-0.20/mapred

[/bash]

Start the DataNode

[bash]/etc/init.d/hadoop-0.20-datanode start[/bash]

Log : less /usr/lib/hadoop-0.20/logs/hadoop-hadoop-datanode-<ip>.log

Start the TaskTracker

[bash]/etc/init.d/hadoop-0.20-tasktracker start[/bash]

Log: less /usr/lib/hadoop-0.20/logs/hadoop-hadoop-tasktracker-<ip>.log

You can now check the web interfaces:

http://< namenode-ip >:50070 – for HDFS overview

and

http://< jobtracker-ip >:50030 – for MapReduce overview
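To sanity-check the cluster you can verify that the datanodes have registered and run one of the bundled example jobs. This is only a rough sketch; the exact name of the examples jar varies with the CDH release, hence the wildcard.

[bash]

sudo -u hdfs hadoop dfsadmin -report

hadoop jar /usr/lib/hadoop-0.20/hadoop-examples*.jar pi 2 1000

[/bash]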

Cassandra Cluster on AWS EC2 with Cassandra 0.7.x and Ubuntu 10.04

Cassandra is a highly scalable, eventually consistent, distributed, structured key-value store. Cassandra brings together  Dynamo’s fully distributed design  and Bigtable’s ColumnFamily-based data model.

In a cluster, Cassandra nodes exchange information about one another using a mechanism called Gossip, so the nodes in a cluster need to know one another. Nodes designated as "seeds" are the centre of this communication mechanism. It is customary to pick a small number of relatively stable nodes to serve as your seeds, and to make sure that each seed also knows of at least one other; having two seeds is preferred.

Let's have a look at how we can bring up a Cassandra cluster with Cassandra 0.7.x on Ubuntu 10.04.

First of all you have to install Java (the JDK). As that is out of scope for this discussion, please do it on your own, and let's start with Cassandra.

Add the following repositories to your apt sources list

vim /etc/apt/sources.list.d/cassandra.list

[bash]deb http://www.apache.org/dist/cassandra/debian 07x main
deb-src http://www.apache.org/dist/cassandra/debian 07x main[/bash]

Import the following keys and add them to apt-key

[bash]

gpg --keyserver keyserver.ubuntu.com --recv-keys 4BD736A82B5C1B00

gpg --export --armor 4BD736A82B5C1B00 | sudo apt-key add -

gpg --keyserver keyserver.ubuntu.com --recv-keys F758CE318D77295D

gpg --export --armor F758CE318D77295D | sudo apt-key add -

[/bash]

Execute

[bash]apt-get update[/bash]

and make sure that there are no errors while accessing the packages.

Install Cassandra on all the nodes (machines) with which we intend to build the cluster.

[bash]apt-get install cassandra --yes[/bash]

Now edit the configuration file for Cassandra

vim /etc/cassandra/cassandra.yaml

Here I will discuss the important directives that have to be edited for the cluster to take effect.

initial_token:

eg:  initial_token:  136112946768375385385349842972707284582

This parameter determines the position of each node in the Cassandra ring. The initial token for the first seed node should be '0'. Here is a simple Python script that helps calculate the token values.

[bash]

#! /usr/bin/python

import sys

if (len(sys.argv) > 1):
    num = int(sys.argv[1])
else:
    num = int(raw_input("How many nodes are in your cluster? "))

for i in range(0, num):
    print 'node %d: %d' % (i, (i * (2 ** 127) / num))

[/bash]

Executing this script will prompt you for the number of nodes in your cluster (or you can pass it as an argument). It will then output the initial token for each node.

For example, for a 2-node cluster the tokens will be

node 0: 0

node 1: 85070591730234615865843651857942052864

auto_bootstrap: false

You can set this to false as we are just going to start the cluster for the first time.

seeds:

-< ip address >

As mentioned earlier, the seeds listed here control the communication between the nodes.

You can give the IPs of the two nodes to which you assigned the first two initial tokens generated by the script above.

Example:

seeds:

- 192.168.1.10

- 192.168.1.13

These seed entries should be the same on all nodes of the cluster.

listen_address:

&

rpc_address:

You can leave both empty.

Starting Cassandra

To start Cassandra you can either use the init script or the command "cassandra". Here I will use the second option.

Since the Cassandra service was started during installation, some values will already be stored in the /var/lib/cassandra/data directory. So before starting Cassandra, follow these steps.

[bash]

1) /etc/init.d/cassandra stop

2) rm -rf /var/lib/cassandra/data

3) mkdir /var/lib/cassandra/data

[/bash]

After doing these steps on all the nodes, run the following command to start Cassandra on each node, starting from seed node 1.

[bash]# cassandra &[/bash]

After starting Cassandra on all the nodes you can check the cluster status using the following command

[bash]nodetool -h <ip of the node >  -p 8080 ring[/bash]

or

[bash]nodetool -h localhost -p 8080 ring[/bash]
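Once the ring shows all the nodes as Up, you can connect to any node with the bundled cassandra-cli client and start creating keyspaces; a minimal sketch, assuming the default Thrift port 9160:

[bash]cassandra-cli -host localhost -port 9160[/bash]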

Achieving HIPAA on AWS / EC2 with Windows Server 2008

When you are creating a HIPAA-compliant system on a cloud service like AWS / EC2 / S3, you have to carefully examine the different levels of data security provided by the cloud service provider.

At a minimum level, the following should be ascertained:

i) Where is the cloud provider's data center physically located? In some countries, HIPAA restricts Protected Health Information (PHI) from being stored on servers located outside of the country.

ii) Is the cloud provider contractually obligated to protect the customer's data at the same level as the customer's own internal policies?

iii) What are the cloud provider's backup and recovery policies?

iv) What are the provider’s policies on data handling/management and access control? Do adequate controls exist to prevent impermissible copying or removal of customer data by the provider, or by unauthorized employees of the company?

v) What happens to data when it is deleted? This is very important as customers will be storing data on virtual machines. Also, what happens to cloud hardware when the hardware is replaced?

In this blog we are only looking at the security measures to be taken by the application developer to make sure that a web application built on AWS / EC2 using Windows Server 2008 / .NET / MSSQL / IIS 7 is HIPAA compliant. The basic requirement is to encrypt all data at rest and in transit.

1. Encrypting Data in transit between the user ( clients ) and the server ( Webserver )

SSL over HTTP ( HTTPS )

Steps used to Implement SSL on IIS are the following:

[bash]
1. Open IIS Manager.
2. Click on the server name.
3. Double-click the "Server Certificates" button in the "Security" section.
4. Click on "Create Self-Signed Certificate" in the Actions pane.
5. Enter a certificate name and click OK.
6. Select the name of the server to which the certificate was installed.
7. From the "Actions" menu (on the right), click on "Bindings". This will open the "Site Bindings" window.
8. In the "Site Bindings" window, click "Add". This will open the "Add Site Binding" window.
9. Under "Type" choose https. The IP address should be the IP address of the site, and the port over which traffic will be secured by SSL is usually 443. The "SSL Certificate" field should specify the certificate that was installed in step 5.
10. Click "OK". SSL is now installed.
[/bash]

2 ) Encrypting Data at Rest ( Document Root )

EFS with IIS

You can use EFS (Encrypting File System) in Windows Server 2008 to automatically encrypt your data when it is stored on the hard disk.

Encrypt a Folder:

[bash]
1. Open Windows Explorer.
2. Right-click the folder that you want to encrypt, and then click Properties.
3. On the General tab, click Advanced.
4. Under Compress or Encrypt attributes, select the Encrypt contents to secure data check box and then click OK.
5. Click OK.
6. In the Confirm Attribute Changes dialog box that appears, use one of the following steps:
i) If you want to encrypt only the folder, click Apply changes to this folder only, and then click OK.
ii) If you want to encrypt the existing folder contents along with the folder, click Apply changes to this folder, subfolders and files, and then click OK.
[/bash]

The folder becomes an encrypted folder, and new files that you create in this folder are automatically encrypted.
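The same encryption can also be applied from the command line with the built-in cipher utility; a minimal sketch, where the folder path is only an example:

[bash]
cipher /E /S:C:\inetpub\wwwroot\myapp
[/bash]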


3 ) Encrypting MSSQL Database ( Data at Rest )

TDE ( Transparent Data Encryption )

TDE is a new feature built into MSSQL Server 2008 Enterprise Edition. Data is encrypted before it is written to disk and decrypted when it is read from disk. The "transparent" aspect of TDE is that the encryption is performed by the database engine, and SQL Server clients are completely unaware of it. There is absolutely no code that needs to be written to perform the encryption and decryption, so there is no need to change any code (database queries) in the application.

STEPS

i) Create a Master Key

A master key is a symmetric key that is used to protect the private keys of certificates and asymmetric keys. Execute the following script to create a master key:

[bash]
USE master;
CREATE MASTER KEY
ENCRYPTION BY PASSWORD = 'Pass@word1';
GO
[/bash]

ii) Create a Certificate

Certificates can be used to create symmetric keys for data encryption or to encrypt the data directly. Execute the following script to create a certificate:

[bash]
CREATE CERTIFICATE TDECert
WITH SUBJECT = 'TDE Certificate'
GO
[/bash]

iii) Create a Database Encryption Key and Protect it with the Certificate

[bash]
1. Go to Object Explorer in the left pane of SQL Server Management Studio.
2. Right-click the database on which TDE is required.
3. Click Tasks and navigate to Manage Database Encryption.
4. Select the encryption algorithm (AES 128/192/256) and select the certificate you have created.
5. Then mark the check box for Set Database Encryption On.
[/bash]
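If you prefer to script this step instead of using the GUI, the equivalent T-SQL can be run through sqlcmd; a rough sketch, where mydb is a placeholder for your database name:

[bash]
sqlcmd -S localhost -Q "USE mydb; CREATE DATABASE ENCRYPTION KEY WITH ALGORITHM = AES_256 ENCRYPTION BY SERVER CERTIFICATE TDECert; ALTER DATABASE mydb SET ENCRYPTION ON;"
[/bash]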

You can query the is_encrypted column in sys.databases to determine whether TDE is enabled for a particular database.

[bash]
SELECT [name], is_encrypted FROM sys.databases
GO
[/bash]


4 ) Encrypting Data in transit between the Webserver and the MSSQL Database

MSSQL secure connection using SSL

i) Creating a self-signed cert using makecert
[bash]
makecert -r -pe -n "CN=YOUR_SERVER_FQDN" -b 01/01/2000 -e 01/01/2036 -eku 1.3.6.1.5.5.7.3.1 -ss my -sr localMachine -sky exchange -sp "Microsoft RSA SChannel Cryptographic Provider" -sy 12 c:\test.cer
[/bash]

ii) Install this cert

[bash]
Copy c:\test.cer to your client machine and run c:\test.cer from a command window, then select "Install Certificate" -> click "Next" -> select "Place all certificates in the following store" -> click "Browse" -> select "Trusted Root Certification Authorities" -> select OK and Finish.
[/bash]

iii) Open SQL Server Configuration Manager

[bash]
Expand SQL Server Network Configuration, right-click "Protocols for MSSQLSERVER", then click "Properties". On the "Certificate" tab select the certificate just installed. On the "Flags" tab, set "ForceEncryption" to YES.
[/bash]

Now SSL is ready to be used on the server. The only modification needed in the .NET code is the connection string, which will be:

[bash]
connectionString="Data Source=localhost;Initial Catalog=mydb;User ID=user1;Password=pas@123;Encrypt=true;TrustServerCertificate=true"
[/bash]

SSL for Tomcat on AWS EC2

To launch an AWS/EC2 instance, first set up a security group to specify what network traffic is allowed to reach the instance. Then select an AMI and launch an instance from it. Create a volume in the same zone as the instance and attach it to the instance, then format the device and mount it to a directory. After that, follow the steps below to set up SSL for Tomcat:

1. Tomcat needs Java, so create a directory to hold the Java binary file.

[shell] mkdir /usr/java
cd /usr/java [/shell]

2. Download jdk binary file (jdk-x-linux-ix.bin) here
Use URL http://www.oracle.com/technetwork/java/archive-139210.html

3. Execute the Binary file

[shell] /usr/java/jdk-x-linux-ix.bin [/shell]

Now we have Java on our machine. Next, download Tomcat and install it following the instructions below:

1. Create a directory to hold Tomcat

[shell] mkdir /usr/tomcat
cd /usr/tomcat [/shell]

2. Download the Tomcat binary tarball (apache-tomcat-x.tar.gz) here
Use URL http://apache.hoxt.com/tomcat/tomcat-6/v6.0.32/bin/

3. Extract that file

[shell] tar -zxvf apache-tomcat-x.tar.gz [/shell]

4. Edit the catalina.sh file

[shell] vim /usr/tomcat/apache-tomcat-x/bin/catalina.sh [/shell]

[shell]

#** Add at the top **
JAVA_HOME=/usr/java/jdk1.x.x_x

[/shell]

save and exit
5. Start the tomcat

[shell] /usr/tomcat/apache-tomcat-x/bin/startup.sh [/shell]

6. We can see the logs by using the given command

[shell]tail -f /usr/tomcat/apache-tomcat-x/logs/catalina.out [/shell]

7. Open the browser and enter the URL http://localhost:8080
Now we can see the Tomcat index page

8. To stop the tomcat

[shell]/usr/tomcat/apache-tomcat-x/bin/shutdown.sh [/shell]

Now configure the SSL certificate for Tomcat. When you choose to activate SSL on your web server, you will be prompted to complete a number of questions about the identity of your website and your company. Your web server then creates two cryptographic keys: a Private Key and a Public Key. The Public Key does not need to be secret and is placed into a Certificate Signing Request (CSR), a data file that also contains your details.

Create a self signed certificate authority (CA) and keystore.

1. Make a directory to hold the certs and keystore. This might be something like:

[shell] mkdir /usr/tomcat/ssl
cd /usr/tomcat/ssl [/shell]

2. Generate a private key for the server and remember the pass phrase for the next steps

[shell]openssl genrsa -des3 -out server.key 1024[/shell]

Generating RSA private key, 1024 bit long modulus
…………………..++++++
…++++++
e is 65537 (0x10001)
Enter pass phrase for server.key:
Verifying – Enter pass phrase for server.key:

3. Generate a CSR (Certificate Signing Request). Provide the requested details after executing this command

[shell]openssl req -new -key server.key -out server.csr[/shell]

Enter pass phrase for server.key:
You are about to be asked to enter information that will be incorporated
into your certificate request.
What you are about to enter is what is called a Distinguished Name or a DN.
There are quite a few fields but you can leave some blank
For some fields there will be a default value,
If you enter ‘.’, the field will be left blank.
—–
Country Name (2 letter code) [GB]:
State or Province Name (full name) [Berkshire]:
Locality Name (eg, city) [Newbury]:
Organization Name (eg, company) [My Company Ltd]:
Organizational Unit Name (eg, section) []:
Common Name (eg, your name or your server’s hostname) []:
Email Address []:

Please enter the following ‘extra’ attributes
to be sent with your certificate request
A challenge password []:
An optional company name []:

4. Remove the passphrase from the key

[shell]cp server.key server.key.org
openssl rsa -in server.key.org -out server.key[/shell]

Enter pass phrase for server.key.org:
writing RSA key

5. Generate the self-signed certificate

[shell]openssl x509 -req -days 365 -in server.csr -signkey server.key -out server.crt[/shell]

Signature ok
subject=/C=GB/ST=Berkshire/L=Newbury/O=My Company Ltd
Getting Private key

You should then submit the CSR. During the SSL Certificate application process, the Certification Authority will validate your details and issue an SSL Certificate containing your details and allowing you to use SSL. Typically an SSL Certificate will contain your domain name, your company name, your address, your city, your state and your country. It will also contain the expiration date of the Certificate and details of the Certification Authority responsible for the issuance of the Certificate.

Create a certificate for tomcat and add both to the keystore

1. Change to the ssl directory

[shell]cd /usr/tomcat/ssl[/shell]

2. Create a keypair for ‘tomcat’

[shell]keytool -genkey -alias tom -keyalg RSA -keystore tom.ks[/shell]

Enter keystore password:
Re-enter new password:
What is your first and last name?
[Unknown]:
What is the name of your organizational unit?
[Unknown]:
What is the name of your organization?
[Unknown]:
What is the name of your City or Locality?
[Unknown]:
What is the name of your State or Province?
[Unknown]:
What is the two-letter country code for this unit?
[Unknown]:

Is CN=Unknown, OU=Unknown, O=Unknown, L=Unknown, ST=Unknown, C=Unknown correct?
[no]: yes

Enter key password for <tom>
(RETURN if same as keystore password):
Re-enter new password:

3. Generate a CSR (Certificate Signing Request) for tomcat

[shell]keytool -keystore tom.ks -alias tom -certreq -file tom.csr[/shell]

Enter keystore password:

4. Create a unique serial number

[shell]echo 02 > serial.txt[/shell]

5. Sign the tomcat CSR

[shell]openssl x509 -CA server.crt -CAkey server.key -CAserial serial.txt -req -in tom.csr -out tom.cer -days 365[/shell]

Signature ok
subject=/C=Unknown/ST=Unknown/L=Unknown/O=Unknown/OU=Unknown/CN=Unknown
Getting CA Private Key

6. Import the server CA certificate into the keystore

[shell]keytool -import -alias serverCA -file server.crt -keystore tom.ks[/shell]

Enter keystore password:
Owner: O=My Company Ltd, L=Newbury, ST=Berkshire, C=GB
Issuer: O=My Company Ltd, L=Newbury, ST=Berkshire, C=GB
Serial number: ee13c90cb351968b
Valid from: Thu May 19 02:12:51 EDT 2011 until: Fri May 18 02:12:51 EDT 2012
Certificate fingerprints:
MD5: EE:F0:69:01:4D:D2:DA:A2:4E:88:EF:DC:A8:3F:A9:00
SHA1: 47:97:72:EF:30:02:F7:82:BE:CD:CA:F5:CE:4E:ED:89:73:23:4E:24
Signature algorithm name: SHA1withRSA
Version: 1
Trust this certificate? [no]: yes
Certificate was added to keystore

7. Add the tomcat certificate to the keystore

[shell]keytool -import -alias tom -file tom.cer -keystore tom.ks[/shell]

Enter keystore password:
Certificate reply was installed in keystore
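At this point you can verify that the keystore contains both entries (the CA certificate and the tomcat key) with a quick check:

[shell]keytool -list -keystore tom.ks[/shell]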

To configure a secure (SSL) HTTP connector for Tomcat, verify that it is activated in the $TOMCAT_HOME/conf/server.xml file. Edit this file and add the following lines.

Tomcat configuration

1. Edit the given portion of the Tomcat configuration file and change the port to 80

[shell]vim /usr/tomcat/apache-tomcat-6.0.13/conf/server.xml[/shell]

[shell]
<!-- change the default connector below -->
<Connector port="8080" protocol="HTTP/1.1"
connectionTimeout="20000"
redirectPort="8443" />

<!-- to listen on port 80 -->
<Connector port="80" protocol="HTTP/1.1"
connectionTimeout="20000"
redirectPort="8443" />

[/shell]

2. Add the portion given below to server.xml and put your keystore password in the keystorePass field

[shell]

<Connector port="443" protocol="HTTP/1.1" SSLEnabled="true"
maxThreads="150" scheme="https" secure="true"
keystoreFile="tom.ks"
keystorePass="password"
clientAuth="false" sslProtocol="TLS" />

[/shell]

When you start Tomcat, your web server will match the issued SSL Certificate to your Private Key. Your web server will then be able to establish an encrypted link between the website and your customer's web browser.

Start the tomcat with SSL Certificate

1. Restart tomcat

[shell]/usr/tomcat/apache-tomcat-6.0.13/bin/shutdown.sh
/usr/tomcat/apache-tomcat-6.0.13/bin/startup.sh[/shell]

2. Go to https://<Public DNS name>:443/

Then your browser shows a security warning because the certificate is self-signed; accept it and you can access Tomcat over your certificate. When a browser connects to a secure site it will retrieve the site's SSL Certificate and check that it has not expired, that it has been issued by a Certification Authority the browser trusts, and that it is being used by the website for which it has been issued. If it fails any one of these checks, the browser will display a warning to the end user letting them know that the site is not secured by SSL.
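You can also verify the certificate from the command line; a quick sketch using openssl, replacing the placeholder with your instance's public DNS name:

[shell]openssl s_client -connect <Public DNS name>:443[/shell]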

You are Done !!!

Simulating multiple IP-Camera with h.264 stream in Amazon EC2 using Wowza

Consider setting up a Wowza-based streaming application which needs to stream and record more than a thousand cameras; in the testing stage you need to see how the system works by providing multiple H.264 camera streams. But when you have only one camera for testing purposes, you cannot overload it by taking a thousand streams from it to test the application. And if the camera gives an MPEG-4 stream, Wowza is not going to play it, since H.264 is the only format it supports. We did a workaround to overcome this situation in Amazon EC2. We launched a large Wowza instance from the paid AMI and installed VLC on it. Using VLC we transcoded the MPEG-4 video stream to H.264. An illustration is given below.
vlc -vvv rtsp://camera.hostname:port/stream-name --sout "#transcode{venc=x264{keyint=60,profile=baseline,level=3.0,nocabac,qpmax=36,qpmin=10,me=hex,merange=24,subme=9,qcomp=0.6},vcodec=x264,vb=128,scale=1,width=640,height=480,acodec=mp4a,channels=1,fps=15,samplerate=4750} :rtp{dst=local.amazon.ip.ofwowzainstance,port-video=10000,port-audio=10002,sdp=file:///wowza-installation-dir/content/vlc.sdp}" -R -d

Next we added a username and password to the file /usr/local/WowzaMediaServer/conf/admin.password so that we can access the Stream Manager. Then we started the Wowza server and accessed the Stream Manager using the URL http://public-dns-name-of.instance:8086/streammanager/

Log in using the username and password mentioned in /usr/local/WowzaMediaServer/conf/admin.password and click on "start receiving stream" under rtplive.

In the configuration window we set Application to rtplive/_definst_, MediaCaster Type to rtp, and Stream Name to vlc.sdp, and clicked "OK" to submit and start the stream. The RTSP URL to access the stream is rtsp://public-dns-name-of.instance:1935/rtplive/vlc.sdp, which gives an H.264 stream equivalent to a stream from an H.264 camera. The advantage of this setup is that you need not overload a single IP camera by taking 1000 streams from it, as this single RTSP output can be used multiple times to simulate a multiple IP-camera system and feed it as input to the Wowza streaming infrastructure we are developing in Amazon EC2.
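To check the simulated stream, you can play it back with another VLC instance (any RTSP-capable player will do); a quick sketch:

vlc rtsp://public-dns-name-of.instance:1935/rtplive/vlc.sdp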