• Call: +1 (858) 429-9131

cloud vendors and interoperability

It’s Phab! That makes your life easier

We have been using plenty of different tools for tracking bugs/product management/project management/to do lists/code review; such as ClearCase, ClearQuest, Bugzilla, Github, Asana, Pivotal Tracker, Google Drive etc. We found Phabricator as a “Too Good To Be True” software engineering web application platform originally developed at Facebook. It has code review, wiki, repository browsing,tickets and a lot more to make Phab more fabulous.

Phabricator is an open source collaboration of web applications which help software companies to build better software. It is a suite of applications. Following are the most important tools in phabricator :
Maniphest – Bug tracker/task management tracker
Diffusion- source code browser
Differential – code review tool that allows developers to easily submit reviews to one another via command line tool when they check in code using Git or Subversion
Phriction – wiki tool

How to setup and configure the code review and project management tool – Phabricator


Server – 4GB Digital ocean droplet
OS – Ubuntu 14.04

1. Install dependencies

apt-get install mysql-server apache2 dpkg-dev php5 php5-mysql php5-gd php5-dev php5-curl php-apc php5-cli php5-json

2. Get code

#cd /var/www/codereview

git clone https://github.com/phacility/libphutil.git

git clone https://github.com/phacility/arcanist.git

git clone https://github.com/phacility/arcanist.git

3. Configure virtual host entry

#add below lines


DocumentRoot /var/www/codereview/webroot
RewriteEngine on
RewriteRule ^/rsrc/(.*) – [L,QSA]
RewriteRule ^/favicon.ico – [L,QSA]
RewriteRule ^(.*)$ /index.php?__path__=$1 [B,L,QSA]
Order allow,deny
allow from all
4. Enable the virtual host entry for phabricator.

# a2ensite phabricator.conf
# service apache2 reload

5. Configure the MySQL database configuration for phabricator

– create database
# /var/www/codereview/phabricator/bin/config set mysql.user mysql_username
# /var/www/codereview/phabricator/bin/config get mysql.pass mysql_password
# /var/www/codereview/phabricator/bin/config get mysql.host mysql_host
# /var/www/codereview/phabricator/bin/config storage upgrade
-tweak mysql

Open /etc/mysql/my.cnf and add the following line under [mysqld] section:


#service mysql restart

Set the Base URI of Phabricator install

# /var/www/codereview/phabricator/bin/config set phabricator.base-uri

(eg: phabricator.your-domain.com)

Configure Outbound Email – External SMTP (Google Apps)

Set the following configuration keys using /var/www/codereview/phabricator/bin/config set value

– metamta.mail-adapter -> PhabricatorMailImplementationPHPMailerAdapter
– phpmailer.mailer -> smtp
– phpmailer.smtp-host -> smtp.gmail.com
– phpmailer.smtp-port -> 465
– phpmailer.smtp-user -> Your Google apps mail id
– phpmailer.smtp-password -> set to your password used for authentication
– phpmailer.smtp-protocol -> ssl

Start the phabricator daemons

You can start all the phabricator deamons using the script
# /var/www/codereview/phabricator/bin/phd start
To start daemons at the boot time, add this entry to the file /etc/rc.local

/var/www/codereview/phabricator/bin/phd start

Diffusion repository hosting with git

1. Install git

#apt-get install git

2. Create a local repository directory:

#mkdir -p /data/repo

3. Edit the repository.default-local-path key to the new local repository directory.

Go to the Config -> Repositories -> repository.default-local-path

4. Configure System user accounts

Phabricator uses as many as three user accounts. These are system user accounts on the machine Phabricator runs on, not Phabricator user accounts.

* daemon-user – The user the daemons run as

We will configure the root user to run the daemons

* www-user – The user the web server run as

We will use www-data to be the web user

* vcs-user – The user that users will connect over SSH as

We will configure git user to the vcs-user

To enable SSH access to repositories, edit /etc/sudoers file using visudo to contain:

#includedir /etc/sudoers.d
git ALL=(root) SETENV: NOPASSWD: /usr/bin/git-upload-pack, /usr/bin/git-receive-pack, /usr/bin/git

Since we are going to enable SSH access to the repository, ensure the following holds good.

– Open /etc/shadow and find the line for vcs-user, git.

The second field (which is the password field) must not be set to !!. This value will prevent login. If it is set to !!, edit it and set it to NP (“no password”) instead.

– Open /etc/passwd and find the line for the vcs-user, git.
The last field (which is the login shell) must be set to a real shell. If it is set to something like /bin/false, then sshd will not be able to execute commands. Instead, you should set it to a real shell, like /bin/sh.

– Use phd.user as our daemon user;
# /var/www/phab/phabricator/bin/config phd.user root
# /var/www/phab/phabricator/bin/config set diffusion.ssh-user git

5. Configuring SSH

We will move the normal sshd daemon to another port, say 222. We will use this port to get a normal login shell. We will run highly restrictive sshd on port 22 managed by Phabricator.

Move Normal SSHD

– make a backup of sshd_config before making any changes.

#cp /etc/ssh/sshd_config /etc/ssh/sshd_config.backup

– Update /etc/ssh/sshd_config, change the port to some othert port like 222.

Port 222

– Restart sshd and verify that you are able to connect to the new port

ssh -p 222 user@host

Configure and start Phabricator SSHD

We now configure and start a second SSHD instance which will run on port 22. This instance will use special locked down configuration that uses Phabricator to handle the authentication and command execution.

– Create a phabricator-ssh-hook.sh file

– Create a sshd_phabricator config file

– Start a copy of sshd using the new configuration

Create phabricator-ssh-hook.sh: Copy the template in phabricator/resources/sshd/ phabricator-ssh-hook.sh to somewhere like /usr/lib/phabricator-ssh-hook.sh and edit it to have the correct settings



# NOTE: Replace this with the username that you expect users to connect with.

# NOTE: Replace this with the path to your Phabricator directory.

if [ “$1” != “$VCSUSER” ];
exit 1

exec “$ROOT/bin/ssh-auth” $@

Make it owned by root and restrict editing;

#sudo chown root /usr/lib/phabricator-ssh-hook.sh
#chmod 755 /usr/lib/phabricator-ssh-hook.sh

Create sshd_config for Phabricator: Copy the template in /phabricator/sshd/sshd_config.phabricator.example to somewhere like /etc/ssh/sshd_config.phabricator

Start Phabricator SSHD

#sudo /usr/sbin/sshd -f /etc/ssh/sshd_config.phabricator

Add this entry to the /etc/rc.local to start the daemon on startup.

If you did everything correctly, you should be able to run this;

#echo {} | ssh git@phabricator.your-company.com conduit conduit.ping

and get a response like this;


You should now be able to access your instance over ssh on port 222 for normal login and administrative purposes. Phabricator SSHD runs on port 22 to handle authentication and command execution.

6. To create a git repository

Go to Diffusion -> New Repository -> Create a New Hosted Repository

Upgrade Phabricator

Since phabricator is under development, you should update frequently. To update phabricator:

– Stop the web server
– Run git pull in libphutil/, arcanist/, and phabricator.
– Run phabricator/bin/storage upgrade.
– Restart the web server.
Also you can use a script similar to this one to automate the process:

“IAM user, who can write to the S3 bucket”

Here we are to educate ourselves as to what “IAM user, who can write to the S3 bucket” is, by using cloudfront distribution and S3 objects, which are of world readable.


1.Create a bucket in s3 my-bucket

1. Log in to the AWS Management Console

2. Click on s3 tab

3. Create a new bucket

4. Create a custom/aws bucket policy to make it world readable

Read more…

Web 2.0 application architecture Template

Application created for a Startup based in Chicago

The term ‘Load Balancer’ is quite self-explanatory, it balances the load on application servers behind it. There can be ‘n’ number of application servers behind the Load Balancer  (LB) which would not be directly facing the end users.

Read more…

Iozone – file system benchmark quick installation

Iozone – open source file system benchmarking tool is a handy tool which  lets you to inspect  filesystem performance like read, write, rewrite, random read etc etc. Major UNI*X flavours and even Microsfot Windows is supported.

Iozone graph - file system benchmark

This is a quick how to on installation and usage on Redhat flavor Linux & MacOS.



Installation steps

  1. Download the rpm source fro the Iozone from the following link or enter the follwing command at the command prompt
wget http://www.iozone.org/src/current/iozone-3-338.i386.rpm
  1. install Iozone using the following command
rpm -ivh /path/to/iozone-3-338.i386.rpm

Now the filesystem benchmarking tool is installed. To verify the installation you can find the files of Iozone tool in the directory /opt/iozone/

Plotting graphs from Iozone analysis results

For the plotting of a graph you should ensure that gnuplot is already installed in your machine. Gnuplot is a command-driven opensource function plotting tool. If it isn’t installed you may install it using the following command

yum install gnuplot

After ensuring that you have got gnuplot installed. Lets try plotting the analysis of the Iozone tool on a graph. Here we will analyze the filesystem with in auto mode. Run th iozone command and redirect its output to a file.

/opt/iozone/bin/iozone -a -g 4m > /tmp/test_analysis

In the above command we have set the maximum size of a file for auto mode  to 4mb with -g option. For brief study about Iozone commands use

/opt/iozone/bin/iozone -h

To generate the graph there is another command which is installed with the iozone tool ie Generate_Graphs. To plot the graph use the following commands

/opt/iozone/bin/Generate_Graphs /tmp/test_analysis

A new window will pop up with the plotted graph of the write performance. At the command line if you hit enter then new graph is plotted fro the rewrite performance. Thus there will be graph plotted for many IO performances as you hit enter each time. Evaluate the required IO performances from the set of graphs depending on the application that you are planning to deploy.


If you have brew installed, its as simple as

brew install iozone

Apache on the Cloud – The things you should know

    LAMP forms the base of most web applications.  As the load on an server increases, the bottlenecks in the underlying infrastructure become more apparent in the form of slow response to user requests.

     To overcome this slow response  the primary choice of most people is to add more hardware resources ( incase of AWS increasing the instance type). This will definitely  increases performance but will cost you more money.  The webserver and database eat most of the resources. Most commonly used web server is apache and database is MySQL. So if we can optimize these two we can improve the performance.

   Apache optimization techniques can often provide significant acceleration boosts  even when other acceleration techniques are in use, such as a CDN.  mod_pagespeed is a module from Google for Apache HTTP Servers that can improve the page load times of your website. you can read more on this from here.  If you want to deploy a PHP app on AWS Cloud, Its better to using some kind of caching mechanism.  Its already discussed in our blog .

      Once we came into a situation where we have to use a micro instance for a web server with less than 500 hits a day

      When the site started running live, and we feel like disappointed. when accessing website, it would sometimes pause for several seconds before serving the requested page. It took  hours to figure out what was going on. finally we run the command top and quickly discovered that when the site was accessing by certain amount of users the CPU would spike, but the spike was not the typical user or system CPU. For testing what’s happening in  server we used the apache benchmark tool ‘ab’ and run the following command on  localhost.

                                             #ab -n 100 -c 10 http://mywebserver.com/

      This will show  how fast our web server can handle 100 requests, with a maximum of 10 requests running concurrently. In the meantime we were monitoring the output of top command on web server.

     For further investigation we started with  sar – Linux command to  Collect, report, or save system activity information

  #sar 1

      According to amazon documentation “Micro instances (t1.micro) provide a small amount of consistent CPU resources and allow you to increase CPU capacity in short bursts when additional cycles are available”.

       If you use 100% CPU for more than a few minutes, Amazon will “steal” CPU time from the instance, meaning that they throttle your instance.  This last  as long as five minutes, and then you get a few seconds of 100% again, then the restrictions are back.  This will effect your website, making it slow, and even timing-out requests. basically means the physical hardware is busy and the hypervisor can’t give the VM the amount of CPU cycles it wants.

   Real tuning required on prefork. This is where we can tell apache to only generate so many processes. The defaults values  are high, and which cant be handled by micro instance. Suppose you get 10 concurrent requests for a php page and require around 64MB of RAM when requested (you have to make sure that  php memory_limit is above that value). That’s around 640MB of RAM on micro instance of 613MB RAM.  This is the case  with 10 connections – apache is configured to allow 256 clients by default,  We need to  scale these down , normally with 10-12 MaxClients. As per out case, this is still a huge number because 10-12 concurrent connections would use all our memory. If you want to be really cautious, make sure that your max memory usage is less than 613MB. Something like 64M php memory limit and 8 max clients keeps you under your limit with space to spare – this helps ensure that our MySQL process when your server is under load.

           Maxclients an important tuning parameter regarding the performance of the Apache web server. We can calculate the value of this for a t1.micro instance


MaxClients =(Total Memory – Operating System Memory – MySQL memory) / Size Per Apache process.

t1.micro have a server with 613MB of Total memory. Suppose We are using RDS instead of mysql server.

Stop apache and run

#ps aux | awk ‘{sum1 +=$4}; END {print sum1}’.

 we will get the amount of memory thats used by processes other than apache.

Suppose we get a value around 30.

from top command we can check the average memory that each apache resources use.

suppose its 60mb.

Max clients = (613 – 30 ) 60 = 9.71 ~ 10 approx …

       Micro instances are awesome, especially when cost becomes a major concern, however that they are not right for all applications. A simple website with only a few hundreds  hits a day will do just fine since it will only need CPU in short bursts.

      For Servers that serves dynamic content, better approach is to employ a reverse-proxy. This would be done this apache’s mod_proxy or Squid. The main advantages of this configurations are content caching, load balancing etc. Easy method is to use mod_proxy and the ProxyPass directive to pass content to another server. mod_proxy supports a degree of caching that can offer a significant performance boost. But another advantage is that since the proxy server and the web server are likely to have a very fast interconnect, the web server can quickly serve up large content, freeing up a apache process, why the proxy slowly feeds out the content to clients

If you are using ubuntu, you can enable module by

                                        #a2enmod proxy

                                        #a2enmod proxy_http    

and in apache2.conf

                                         ProxyPass  /

                                         ProxyPassReverse  /

         The ProxyPassreverse directive captures the responses from the web server and masks the URL as it would be directly responded by the Apache  hiding the identity/location of the web server. This is a good security practice, since the attacker won’t be able to know the ip of our web server.

      Caching with Apache2 is another important consideration.  We can configure apache  to set the Expires HTTP header, max-age directive of the Cache-Control HTTP header of static files ,such as images, CSS and JS files, to a date in the future so that these files will be cached by your visitors browsers. This saves bandwidth and makes web site appear faster if a user visits your site for a second time, static files will be fetched from the browser cache

                                      #a2enmod expires

  edit  /etc/apache2/sites-available/default

  <IfModule mod_expires.c>
               ExpiresActive On
               ExpiresByType image/gif “access plus 4 weeks”
               ExpiresByType image/jpg “access plus 4 weeks”


This would tell browsers to cache .jpg, .gif  files for four week.

       If your server requires a large amount of read / write operations, you might consider provisioned IOPS ebs volumes on your server. This is really effective if you use database server on ec2 instances.  we can use iostat on the command line to take a look at your read/sec and write/sec. You can also use CloudWatch metrics to determine read and write operations.

       Once we move to the security side of apache, our major concern is DDos attacks. If a server is under a DDoS attack, it is quite difficult to detect the attack before the damage is done.  Attack packets usually have spoofed source IP addresses. Hence, it is more difficult to trace them back to their real source. The limit on the number of simultaneous requests that will be served by Apache is decided by the MaxClients directive, and is set to safe limit, by default. Any connection attempts over this limit will normally be queued up.

     If you want to protect your apache against DOS,  DDOS attacks use mod_evasive module.  This module is designed specifically as a remedy for Apache DoS attacks. This module will allow you to specify a maximum number of requests executed by the same IP address. If the limit is reached, the IP address is blacklisted for the time period you specify.

PHP Caching : The way to speed up PHP sites.

     There are many sites which  is built in PHP. PHP provides the power to simply ‘pull’ content from an external source.   it could just as easily be an MySQL database or an XML file etc.

    The downside to this is processing time, each request for one page can trigger multiple database queries, processing of the output, and formatting it for display. This can be quite slow on complex sites (or slower servers).  Dynamic sites probably have very little changing content, this page will almost never be updated after the day it is written. Each time someone requests it the scripts goes and fetches the content, applies various functions and filters to it, then outputs it to you

       This is where caching can help us out, instead of regenerating the page every time, the scripts running this site generate it the first time they’re asked to, then store a copy of what they send back to your browser. The next time a visitor requests the same page, the script will know it’d already generated one recently, and simply send that to the browser without all the hassle of re-running database queries or searches.

Different Caching mechanism are discussed below.


      APC stands for Alternative PHP Cache, and is a free and open opcode cache for PHP. It provides a robust framework for caching and optimizing PHP performance. APC also provides a user cache for storing application data. APC for caches that do not change often and will not grow too big to avoid fragmentation. The default setting of APC will allow you to store 32 MiB for the opcode cache and the user cache combined

Installing apc on ubuntu

#apt-get install php-apc

edit  apc.ini   ; default location on new php5 is –> /etc/php5/conf.d/20-apc.ini

extension = apc.so;  uncomment this line   
apc.shm_segments=1;   ( by default its enabled .. give 0 to disable)

you can customize your values here. for getting the default values install php5-cli and from command line run

#php -i | grep apc

For monitoring apc cache hits and miss, apc providing a php script. which is located at /usr/share/doc/php-apc/apc.php. Copy this file to your document root and you will be able to monitor your apc status.


for performance benchmarking we created two php files


         $start = microtime(true);
         for ($i = 0; $i < 500000; $i++)
         $end = microtime(true);          echo "Start: " . $start . "<br />";
         echo "End: " . $end . "<br />";
         echo "Diff: ". ($end-$start) . "<br />";


          $t =    "migrate2cloud";

Without apc…

Start: 1360937874.8965 
End: 1360937883.1506 
Diff: 8.2541239261627

With apc..

Start: 1360937935.5746 
End: 1360937936.7291 
Diff: 1.1545231342316

 without apc it took 8 seconds to complete the request .. with apc.. 1.15 seconds..


        Memcached system uses a client–server architecture. The servers maintain a key–value associative array; the clients populate this array and query it. Keys are up to 250 bytes long and values can be at most 1 megabyte in size. Clients use client-side libraries to contact the servers which, by default, expose their service at port 11211. Amazon provides a Service called Amazon elasticache for memcache through which we can configure memcache clusters for caching purposes.

installation and configuration

apt-get install memcached 
apt-get install php5-memcached

enable memcache module in /etc/php5/apache2/conf.d/20-memcached.ini  or in php.ini

edit php.ini 
session.save_handler = memcached 

Restart apache and memcache..

php script used for memcache testing..

//memcached simple test  
$memcache = new Memcache;
$memcache->connect('localhost', 11211) or die ("Could not connect");
$key = md5('42data');  //something unique  
for ($k=0; $k<5; $k++) {
$data = $memcache->get($key);
    if ($data == NULL) {
       $data = array();
       //generate an array of random shit  
       echo "expensive query";
       for ($i=0; $i<100; $i++) {
           for ($j=0; $j<10; $j++) {
               $data[$i][$j] = 42;  //who cares  
    } else 
       echo "cached";
    }  } 

You can monitor memcache using phpmemcacheadmin


Varnish – Cache

Varnish has a concept of “backend” or “origin” servers. A backend server is the server providing the content Varnish will accelerate. Our first task is to tell Varnish where it can find its content. open the varnish default configuration file. Iif you installed from a package it is probably /etc/varnish/default.vcl.

Somewhere in the top there will be a section that looks a bit like this.:

backend default { .host = ""; .port = "80"; }

Change the port number to your apache ( or whatever the webserver you are using) port number.

this piece of configuration defines a backend in Varnish called default. When Varnish needs to get content from this backend it will connect to port 80 on localhost (

# varnishd -f /etc/varnish/default.vcl -s malloc,1G -T -a

The -f options specifies what configuration varnishd should use.

The -s options chooses the storage type Varnish should use for storing its content

-T — Varnish has a built-in text-based administration interface

-a — specify that I want Varnish to listen on port 80

For logging varnish — In terminal window you started varnish type varnishlog

When someone accessing your page you will get log like

11 SessionOpen c 58912 
11 ReqStart c 58912 595005213 
11 RxRequest c GET 
11 RxURL c / 
11 RxProtocol c HTTP/1.1 
11 RxHeader c Host: localhost:80
11 RxHeader c Connection: keep-alive

Where not to use Caching

          Caching should not be used for some things like search results, forums etc… where the content has to be upto the times and changes depending on user’s input. It’s also advisable to avoid using this method for things like a Flash news page, in general dont use it on any page that you wouldn’t want the end users browser or proxy to cache.