Posts Tagged ‘cache’

DevOps stories 1: working with a high traffic e-commerce portal

It seems like a good idea to write down first-person stories of the various DevOps and cloud migration scenarios that we come across.

In this particular case we have a beast of a server with 32 processors with 8 cores each and 256 GB of RAM, running a LAMP stack, CakePHP and the X-Cart shopping cart. And yes, everything is dead slow.

Cleaning up the X-cart cache

By default (?), the cache is at /var/www/html/cache or [DOCUMENT_ROOT]/cache. If there are too many files, a plain rm with a wildcard will not be able to delete them, because the expanded argument list gets too long. The following commands can help.


# Run these from inside the cache directory.
touch /root/agileblaze/cache-file-list.txt   # empty exclusion list; names listed here are skipped
find . -name '*.js' | grep -vFf /root/agileblaze/cache-file-list.txt | xargs /bin/rm -f
find . -name 'sql.*' | grep -vFf /root/agileblaze/cache-file-list.txt | xargs /bin/rm -f
find . -name 'rf*.php' | grep -vFf /root/agileblaze/cache-file-list.txt | xargs /bin/rm -f
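The same clean-up can be sketched in Python, which removes files one at a time and so never builds a long argument list. This is only an illustration: the function name `purge_cache`, the `keep` parameter and the glob patterns are assumptions modelled on the commands above, and unlike `find` it does not descend into subdirectories.

```python
import fnmatch
import os
import tempfile

# Assumed patterns, mirroring the find commands above.
PATTERNS = ("*.js", "sql.*", "rf*.php")


def purge_cache(cache_dir, patterns=PATTERNS, keep=()):
    """Delete matching files in cache_dir one by one; return how many were removed."""
    keep = set(keep)  # hypothetical exclusion list, matched on exact file name
    with os.scandir(cache_dir) as entries:
        victims = [
            entry.path
            for entry in entries
            if entry.is_file(follow_symlinks=False)
            and entry.name not in keep
            and any(fnmatch.fnmatchcase(entry.name, p) for p in patterns)
        ]
    for path in victims:
        os.unlink(path)
    return len(victims)


if __name__ == "__main__":
    # Demo on a throwaway directory, never on a live cache.
    with tempfile.TemporaryDirectory() as demo_dir:
        for name in ("a.js", "sql.1234.php", "rf_menu.php", "index.php"):
            open(os.path.join(demo_dir, name), "w").close()
        print(purge_cache(demo_dir), sorted(os.listdir(demo_dir)))
```

Try it on a copy of the cache directory first, since the patterns may not match every X-Cart version.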

The permanent fix for this X-Cart behaviour is to change the following line in the config.php file from:

define('USE_SQL_DATA_CACHE', true);
to
define('USE_SQL_DATA_CACHE', false);

MySQL

There are tons of issues, such as a database that has not been upgraded and joins without indexes. We decided to make use of the RAM and keep the MySQL MyISAM temporary files there for faster access. Don’t forget to create the required directory and add the necessary entries to /etc/fstab so that the change persists across reboots.

/etc/my.cnf is changed as follows

tmpdir = /var/mysqltmp # changed from /var/lib/mysql/tmp

Now that we have some room to look into other matters, things should be easier.

We also had the not-so-friendly max connections error, so we increased max_connections from the default.

# MAX CONNECTIONS
max_connections = 300 #Sat Apr 30 03:35:25 CDT 2016

Slow Queries

If the slow query log is enabled, mysqldumpslow can be a very handy command:

[root@714219-db1 mysql]# mysqldumpslow -a -s r -t 10 /var/log/mysql/slow.log

Reading mysql slow query log from /var/log/mysql/slow.log
Count: 376687  Time=1.63s (613441s)  Lock=0.00s (36s)  Rows=203657.1 (76714970948), 2users@localhost
  SELECT productid, COUNT(remote_ip) AS total, AVG(vote_value) AS rating FROM xcart_product_votes GROUP BY productid
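To put those numbers in perspective, the summary line can be parsed and the totals converted into something easier to read. The sketch below is a rough illustration; the regular expression and the name `parse_summary` are assumptions written against the single line shown above, not a general mysqldumpslow parser.

```python
import re

# Matches the "Count: ... Time=... Lock=... Rows=..." summary line shown above.
SUMMARY = re.compile(
    r"Count: (?P<count>\d+)\s+"
    r"Time=(?P<avg_time>[\d.]+)s \((?P<total_time>\d+)s\)\s+"
    r"Lock=(?P<avg_lock>[\d.]+)s \((?P<total_lock>\d+)s\)\s+"
    r"Rows=(?P<avg_rows>[\d.]+) \((?P<total_rows>\d+)\)"
)


def parse_summary(line):
    """Return the numeric fields of a summary line as floats, or None if it does not match."""
    match = SUMMARY.search(line)
    if match is None:
        return None
    return {name: float(value) for name, value in match.groupdict().items()}


line = ("Count: 376687 Time=1.63s (613441s) Lock=0.00s (36s) "
        "Rows=203657.1 (76714970948), 2users@localhost")
stats = parse_summary(line)
total_hours = stats["total_time"] / 3600          # total time spent in this query
avg_seconds = stats["total_time"] / stats["count"]  # should agree with Time=1.63s
print(round(total_hours, 1), round(avg_seconds, 2))
```

A single GROUP BY over the whole votes table, run hundreds of thousands of times, accounts for roughly 170 hours of query time, which makes it an obvious candidate for caching or an index.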

Controlling the RAM usage

 

RAM usage on GNU/Linux based systems can sometimes be quite weird. The immediate path taken is to play around with sysctl, tweak swappiness and maybe run drop_caches.

That is, change swappiness to, say, 10 and do a cache + buffer cleanup. These may not help much on their own, but changing /proc/sys/vm/vfs_cache_pressure seems to help further (we have it around 512).

In addition, the minimum free memory size (vm.min_free_kbytes) is a parameter which can help prevent OOM errors. A sample value is shown below.

sysctl -w vm.min_free_kbytes=2621440

Further:

sysctl -w vm.vfs_cache_pressure=1024
sysctl -w vm.swappiness=10
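As a quick sanity check on the sample vm.min_free_kbytes value, the arithmetic below converts it to GiB and expresses it as a share of total memory. The 256 GiB total is an assumption taken from the server described at the top of this post, and the helper names are made up for the illustration.

```python
KIB_PER_GIB = 1024 * 1024


def kib_to_gib(kib):
    """Convert a size in KiB (the unit used by vm.min_free_kbytes) to GiB."""
    return kib / KIB_PER_GIB


def reserve_fraction(min_free_kib, total_ram_gib):
    """Fraction of total RAM that the kernel is asked to keep free."""
    return kib_to_gib(min_free_kib) / total_ram_gib


min_free_kib = 2621440   # sample value from the sysctl command above
total_ram_gib = 256      # assumption: RAM size of the server described earlier

print(kib_to_gib(min_free_kib))                             # reserve in GiB
print(100 * reserve_fraction(min_free_kib, total_ram_gib))  # reserve as a percentage of RAM
```

On a much smaller machine the same absolute value would reserve a far larger share of memory, so scale it to the box rather than copying it.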

 

Keep an eye on Caches and Buffers

This is often something people miss. The difference between the used memory reported by the free command and the total memory used by processes gives the cache + buffer usage. slabtop is a very handy command to get exact details.

slabtop --delay=10 -s c

This gives a neat summary.

[Screenshot: slabtop output, 2016-05-11]
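The cache + buffer figure can also be read straight from /proc/meminfo. The sketch below parses meminfo-style text and adds up Buffers, Cached and SReclaimable, which is roughly what recent versions of free report as buff/cache. The sample numbers are invented for the illustration and the function names are assumptions.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            values[key.strip()] = int(fields[0])
    return values


def cache_and_buffers_kb(meminfo):
    """Buffers + page cache + reclaimable slab, in kB; missing fields count as 0."""
    return sum(meminfo.get(name, 0) for name in ("Buffers", "Cached", "SReclaimable"))


# Hypothetical sample; on a real system read the text from /proc/meminfo instead.
SAMPLE = """\
MemTotal:       264000000 kB
MemFree:          4000000 kB
Buffers:          1500000 kB
Cached:         180000000 kB
SReclaimable:    20000000 kB
"""

info = parse_meminfo(SAMPLE)
print(cache_and_buffers_kb(info), "kB in cache and buffers")
```

A large SReclaimable share is the part that slabtop breaks down further, and it is also the part that vfs_cache_pressure influences.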

 

Another very useful tool is dstat.

Running dstat -lrvn 10 gives colourful details of cache usage. The memory, CPU, network and IO columns of its output give useful information.

 

How to read dstat : On a fully warmed-up system, memory should be around 95% in-use, with most of it in the cache column. CPUs should be in use with no more than 1-2% of iowait and 2-15% system time.

 

How to set up automatic updates:

Sometimes it is quite good to have automatic updates in place. For Ubuntu, automatic updates can be done following these instructions.

 

 

PHP Caching : The way to speed up PHP sites.

Many sites are built in PHP. PHP provides the power to simply ‘pull’ content from an external source; it could just as easily be a MySQL database or an XML file.

The downside to this is processing time. Each request for one page can trigger multiple database queries, processing of the output, and formatting it for display. This can be quite slow on complex sites (or slower servers). Yet dynamic sites often have very little changing content: this page, for example, will almost never be updated after the day it is written. Each time someone requests it, the scripts go and fetch the content, apply various functions and filters to it, then output it to you.

This is where caching can help us out. Instead of regenerating the page every time, the scripts running this site generate it the first time they’re asked to, then store a copy of what they send back to your browser. The next time a visitor requests the same page, the script will know it has already generated one recently, and simply send that to the browser without all the hassle of re-running database queries or searches.
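The idea in the paragraph above can be sketched in a few lines. This is a minimal in-memory illustration in Python rather than PHP, with made-up names (`PageCache`, `get_or_render`); a real site would store the copies in files, APC or Memcached instead of a dict.

```python
import time


class PageCache:
    """Keep rendered pages for ttl_seconds and regenerate them only when stale."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock   # injectable so that expiry can be tested without sleeping
        self._store = {}     # key -> (time stored, rendered page)

    def get_or_render(self, key, render):
        now = self.clock()
        hit = self._store.get(key)
        if hit is not None and now - hit[0] < self.ttl:
            return hit[1]               # recent copy: skip the expensive work
        page = render()                 # first request, or the copy expired
        self._store[key] = (now, page)
        return page


def render_home():
    # Stand-in for database queries, filters and templating.
    return "<html>home</html>"


cache = PageCache(ttl_seconds=300)
first = cache.get_or_render("/", render_home)   # generated
second = cache.get_or_render("/", render_home)  # served from the cache
print(first == second)
```

The time-to-live is the knob that trades freshness for speed: the longer it is, the fewer regenerations, and the staler a page can get.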

Different caching mechanisms are discussed below.

APC

APC stands for Alternative PHP Cache, and is a free and open opcode cache for PHP. It provides a robust framework for caching and optimizing PHP performance. APC also provides a user cache for storing application data. Use APC for caches that do not change often and will not grow too big, to avoid fragmentation. The default setting of APC allows you to store 32 MiB for the opcode cache and the user cache combined.

Installing APC on Ubuntu

# apt-get install php-apc

Edit apc.ini. The default location on newer php5 packages is /etc/php5/conf.d/20-apc.ini.

extension = apc.so    ; uncomment this line
apc.enabled = 1       ; enabled by default, set to 0 to disable
apc.shm_segments = 1  ; number of shared memory segments

You can customize the values here. To see the default values, install php5-cli and run the following from the command line:

# php -i | grep apc

To monitor APC cache hits and misses, APC provides a PHP script, located at /usr/share/doc/php-apc/apc.php. Copy this file to your document root and you will be able to monitor your APC status at:

http://localhost/apc.php

For performance benchmarking we created two PHP files:

test1.php

<?php
// Time 500,000 includes of a tiny file to measure the effect of the opcode cache.
$start = microtime(true);
for ($i = 0; $i < 500000; $i++) {
    include('test2.php');
}
$end = microtime(true);
echo "Start: " . $start . "<br />";
echo "End: " . $end . "<br />";
echo "Diff: " . ($end - $start) . "<br />";
?>

test2.php

<?php
          $t =    "migrate2cloud";
 ?>

Without apc…

Start: 1360937874.8965 
End: 1360937883.1506 
Diff: 8.2541239261627

With apc..

Start: 1360937935.5746 
End: 1360937936.7291 
Diff: 1.1545231342316

Without APC the request took about 8.25 seconds to complete; with APC it took about 1.15 seconds, roughly seven times faster.

Memcached

The Memcached system uses a client–server architecture. The servers maintain a key–value associative array; the clients populate this array and query it. Keys are up to 250 bytes long and values can be at most 1 megabyte in size. Clients use client-side libraries to contact the servers which, by default, expose their service at port 11211. Amazon provides a service called Amazon ElastiCache for Memcached, through which we can configure Memcached clusters for caching purposes.
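Those size limits are worth checking on the client side before an item is sent. The sketch below does only that, in plain Python with no Memcached client involved; the function name `check_item` and the constants are assumptions based on the limits quoted above.

```python
MAX_KEY_BYTES = 250             # key limit quoted above
MAX_VALUE_BYTES = 1024 * 1024   # 1 megabyte value limit quoted above


def check_item(key, value):
    """Return a list of problems that would stop this item from being stored."""
    problems = []
    key_bytes = key.encode("utf-8")
    if not key_bytes:
        problems.append("key is empty")
    if len(key_bytes) > MAX_KEY_BYTES:
        problems.append("key is %d bytes, limit is %d" % (len(key_bytes), MAX_KEY_BYTES))
    if len(value) > MAX_VALUE_BYTES:
        problems.append("value is %d bytes, limit is %d" % (len(value), MAX_VALUE_BYTES))
    return problems


print(check_item("session:42", b"small payload"))   # no problems
print(check_item("k" * 300, b"x" * (2 * 1024 * 1024)))  # both limits exceeded
```

Hashing long keys, as the md5() call in the PHP test script does, is a common way to stay under the key limit.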

Installation and configuration

apt-get install memcached
apt-get install php5-memcached
apt-get install php5-memcache   # provides memcache.so and the Memcache class used by the test script

Enable the memcache modules in /etc/php5/apache2/conf.d/20-memcached.ini or in php.ini:

session.save_handler = memcached
extension=memcache.so
extension=memcached.so

Restart Apache and memcached.

PHP script used for memcache testing:

<?php
// Simple memcached test: the first pass runs the "expensive query", later passes hit the cache.
$memcache = new Memcache;
$memcache->connect('localhost', 11211) or die("Could not connect");
$key = md5('42data');  // something unique
for ($k = 0; $k < 5; $k++) {
    $data = $memcache->get($key);
    if ($data === false) {  // Memcache::get returns false on a miss
        echo "expensive query";
        // generate an array of dummy data
        $data = array();
        for ($i = 0; $i < 100; $i++) {
            for ($j = 0; $j < 10; $j++) {
                $data[$i][$j] = 42;
            }
        }
        $memcache->set($key, $data, 0, 3600);  // no flags, expire after one hour
    } else {
        echo "cached";
    }
}
?>

You can monitor memcache using phpmemcacheadmin

http://code.google.com/p/phpmemcacheadmin/

Varnish – Cache

Varnish has a concept of “backend” or “origin” servers. A backend server is the server providing the content Varnish will accelerate. Our first task is to tell Varnish where it can find its content. Open the Varnish default configuration file. If you installed from a package it is probably /etc/varnish/default.vcl.

Somewhere near the top there will be a section that looks a bit like this:

backend default { .host = "127.0.0.1"; .port = "80"; }

Change the port number to the port of your Apache (or whatever web server you are using). Since Varnish itself listens on port 80 in the command below, the web server has to listen on a different port.

This piece of configuration defines a backend in Varnish called default. With the values shown, when Varnish needs to get content from this backend it will connect to port 80 on localhost (127.0.0.1).

# varnishd -f /etc/varnish/default.vcl -s malloc,1G -T 127.0.0.1:2000 -a 0.0.0.0:80

The -f option specifies which configuration file varnishd should use.

The -s option chooses the storage type Varnish should use for storing its content; here it is malloc with 1G of memory.

-T 127.0.0.1:2000: Varnish has a built-in text-based administration interface, which listens on this address.

-a 0.0.0.0:80: specifies that Varnish should listen on port 80 on all interfaces.

For logging, type varnishlog in the terminal window where you started Varnish.

When someone accesses your page you will get a log like:

#varnishlog
11 SessionOpen c 127.0.0.1 58912 0.0.0.0:80 
11 ReqStart c 127.0.0.1 58912 595005213 
11 RxRequest c GET 
11 RxURL c / 
11 RxProtocol c HTTP/1.1 
11 RxHeader c Host: localhost:80
11 RxHeader c Connection: keep-alive

Where not to use Caching

Caching should not be used for things like search results or forums, where the content has to be up to date and changes depending on the user’s input. It is also advisable to avoid it for things like a flash news page. In general, don’t use it on any page that you wouldn’t want the end user’s browser or proxy to cache.

MySQL Optimization

Database optimization is the process of configuring a database to use system resources efficiently and perform tasks quickly. To optimize MySQL you should know the workflow of the entire system, your hardware, operating system, disk I/O performance etc.
Why Optimize
You can do more with less. The default MySQL setup is tuned for a minimal system because it should work well on minimal hardware. But when you use a dedicated MySQL server with high traffic and complex queries you have to optimize MySQL.
MySQL Server Tuning Considerations
Here are some common optimization areas.

  • MySQL variables
  • Hardware
  • Disk
  • Application

MySQL Optimization
MySQL global variables don’t have any predefined optimum values. It is a trial and monitor process. It depends on all the above parameters. Here you will see some of the common parameters.
key_buffer_size
This is the size of the buffer used for index blocks of MyISAM tables. On a dedicated MySQL server using the MyISAM storage engine you can allocate 25-30% of the system’s total memory to key_buffer_size. To fine tune key_buffer_size you can compare the status variables Key_reads and Key_read_requests.
The ratio of Key_reads to Key_read_requests should be 1:100 or better, that is, at most one disk read per 100 requests.

SHOW STATUS LIKE '%key_read%';
+-------------------+-------------+
| Variable_name     | Value       |
+-------------------+-------------+
| Key_read_requests | 10726813161 |
| Key_reads         | 92790146    |
+-------------------+-------------+
Here the ratio is about 1:115, which is acceptable.
But suppose you get a ratio of 1:10; then you need to increase the key buffer and upgrade hardware accordingly.
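The ratio is simple arithmetic on the two counters. The short sketch below reproduces the 1:115 figure from the status output; the function name and the threshold parameter are illustrative assumptions.

```python
def key_read_ratio(key_reads, key_read_requests):
    """Return N such that Key_reads : Key_read_requests is 1:N."""
    if key_reads == 0:
        return float("inf")   # every request was served from the key buffer
    return key_read_requests / key_reads


def key_buffer_ok(key_reads, key_read_requests, minimum=100):
    """True when at most one request in `minimum` had to read from disk."""
    return key_read_ratio(key_reads, key_read_requests) >= minimum


ratio = key_read_ratio(92790146, 10726813161)   # values from SHOW STATUS above
print("1:%d" % ratio, key_buffer_ok(92790146, 10726813161))
```

The counters are cumulative since server start, so compare two samples taken some time apart if you want the ratio for the current workload.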
Query Cache
“My website is too slow while loading dynamic pages”. If it is a MySQL database related issue, the following MySQL variables can be your solution.
query_cache_type
Sets the query cache type. There are 3 values: 0, 1 or 2.

0 Do not cache any query result.
1 Cache all cacheable query results, except those that begin with SELECT SQL_NO_CACHE.
2 Cache results on demand: only cacheable queries that begin with SELECT SQL_CACHE.

query_cache_size
The amount of memory used to cache query results. The default is 0, which disables the query cache.
The optimum value depends on your application.
query_cache_limit
Do not cache results that are larger than this number of bytes. The default value is 1MB.
Status checking
SHOW STATUS LIKE '%qcache%';
+-------------------------+----------+
| Variable_name           | Value    |
+-------------------------+----------+
| Qcache_free_blocks      | 1        |
| Qcache_free_memory      | 8371272  |
| Qcache_hits             | 23547551 |
| Qcache_inserts          | 46909131 |
| Qcache_lowmem_prunes    | 5110536  |
| Qcache_not_cached       | 2760196  |
| Qcache_queries_in_cache | 0        |
| Qcache_total_blocks     | 1        |
+-------------------------+----------+
Here 46909131 query results were inserted into the cache and 23547551 queries were answered from it, while 2760196 queries were not cached at all. A result is not cached when it is larger than query_cache_limit or larger than query_cache_size itself. You have to trial and monitor 🙂
Qcache_lowmem_prunes
This value is incremented each time a query is removed from the query cache to free memory for new entries. If it increases quickly and you still have memory to spare, you can raise query_cache_size. If it never increases, you can reduce the cache size.
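A rough hit rate can be derived from the same counters. The sketch below treats hits + inserts + not-cached as an approximation of the total number of SELECT statements, which is an assumption (the exact figure would use Com_select, which is not in the output above); the function names are made up.

```python
def query_cache_hit_rate(hits, inserts, not_cached):
    """Approximate share of SELECTs answered from the query cache."""
    total = hits + inserts + not_cached   # assumption: approximates all SELECTs
    return hits / total if total else 0.0


def prune_rate(lowmem_prunes, inserts):
    """Share of inserted results that were later evicted for lack of memory."""
    return lowmem_prunes / inserts if inserts else 0.0


# Values from the SHOW STATUS output above.
hit_rate = query_cache_hit_rate(hits=23547551, inserts=46909131, not_cached=2760196)
pruned = prune_rate(lowmem_prunes=5110536, inserts=46909131)
print("hit rate %.1f%%, pruned %.1f%%" % (100 * hit_rate, 100 * pruned))
```

A hit rate of about a third, with around one in ten inserted results pruned for lack of memory, suggests trying a larger query_cache_size and watching whether Qcache_lowmem_prunes slows down.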

sort_buffer
The sort_buffer is useful for speeding up myisamchk operations. It can also be useful when performing large numbers of sorts.

tmp_table_size

This variable determines the maximum size of a temporary table in memory. The effective in-memory limit is the minimum of tmp_table_size and max_heap_table_size. You can compare Created_tmp_disk_tables and Created_tmp_tables to optimize tmp_table_size.
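Both points can be written down as small helpers: the effective limit is a minimum, and the comparison of the two counters is a ratio. The numbers in the example are hypothetical, not taken from the server discussed here, and the function names are assumptions.

```python
MIB = 1024 * 1024


def effective_tmp_limit(tmp_table_size, max_heap_table_size):
    """In-memory temporary tables are capped by the smaller of the two settings."""
    return min(tmp_table_size, max_heap_table_size)


def disk_tmp_table_ratio(created_tmp_disk_tables, created_tmp_tables):
    """Share of temporary tables that had to be created on disk."""
    if created_tmp_tables == 0:
        return 0.0
    return created_tmp_disk_tables / created_tmp_tables


# Hypothetical settings: raising tmp_table_size alone does not help here.
print(effective_tmp_limit(64 * MIB, 16 * MIB) // MIB, "MiB")
# Hypothetical counters: a quarter of the temporary tables went to disk.
print(disk_tmp_table_ratio(250, 1000))
```

If the disk share stays high after raising both settings, the queries themselves are probably the cause, for example large result sets or TEXT and BLOB columns in the temporary table.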

innodb_buffer_pool_size

This variable applies to InnoDB tables and is similar to key_buffer_size for MyISAM tables.
On a dedicated MySQL server using InnoDB you can set this to up to 80% of RAM.
Hardware for mysql
If you have large tables (>3GB), you should consider 64-bit hardware, as MySQL uses a lot of 64-bit integers internally.

You need more memory (RAM) if you want MySQL to handle a large number of connections simultaneously. More RAM will also speed up key updates by keeping most of the pages in RAM.

Another consideration is the Ethernet device. You can use 1G Ethernet for a dedicated MySQL server for fast remote connections.

Disk performance is also an important parameter.
Disk Optimization
Striping disks (RAID 0) will increase both read and write throughput.

Don’t use RAID 1 or mirroring on disk for temporary files.

On Linux, mount the disks with async (default) and noatime.
Optimizing your application
Cache data in your application.

Specify the column names in queries (e.g. don’t use SELECT * FROM ...).

Use persistent connections.

Use EXPLAIN to explain! You will see the details below.

Queries and Indexes

Let us start with a simple query: SELECT firstname FROM student WHERE id='145870';
Without an index, MySQL starts searching from the first row to find the student with id 145870. It does not even know whether the row exists or not. An index is a sorted file which has an entry for each row. MySQL can find the corresponding record very quickly by referring to this index.
EXPLAIN is a nice tool to understand your queries.

EXPLAIN SELECT firstname,lastname FROM student WHERE id='145870';

+---------+------+---------------+------+---------+------+-------+------------+
| table   | type | possible_keys | key  | key_len | ref  | rows  | Extra      |
+---------+------+---------------+------+---------+------+-------+------------+
| student | ALL  | NULL          | NULL | NULL    | NULL | 10000 | where used |
+---------+------+---------------+------+---------+------+-------+------------+
The possible_keys column is NULL. In this case MySQL will check all the 10000 rows. We can say this query (or table) is not optimized.

Now suppose we add an index to the above table and run EXPLAIN again. Then we will get:
+---------+-------+---------------+---------+---------+-------+------+-------+
| table   | type  | possible_keys | key     | key_len | ref   | rows | Extra |
+---------+-------+---------------+---------+---------+-------+------+-------+
| student | const | PRIMARY       | PRIMARY | 10      | const | 1    |       |
+---------+-------+---------------+---------+---------+-------+------+-------+
The type is “const”, which means that the table has only one matching row. The primary key is being used to find this particular record.
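The same before-and-after effect can be reproduced on any machine with Python's built-in sqlite3 module. Note that this uses SQLite as a stand-in, not MySQL: its EXPLAIN QUERY PLAN output looks different from the tables above, and the index name `idx_student_id` and helper `query_plan` are made up for the illustration.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's plan description for a query (last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id TEXT, firstname TEXT, lastname TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    ((str(i), "first%d" % i, "last%d" % i) for i in range(10000)),
)

sql = "SELECT firstname, lastname FROM student WHERE id = ?"
before = query_plan(conn, sql, ("145870",))   # no index yet: full table scan

conn.execute("CREATE INDEX idx_student_id ON student (id)")
after = query_plan(conn, sql, ("145870",))    # index available: direct lookup

print(before)
print(after)
```

The first plan reports a scan of the whole table and the second a search using the index, which corresponds to the move from type ALL to an indexed lookup in the MySQL output.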

There are many more optimization variables and indexing methods. It is difficult to include everything in a single article, but this is enough to start fine tuning MySQL when your database is underperforming.

How to configure Memcached on AWS EC2: A Starter’s Guide

Memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load and session management. Let’s focus on session management first and build up a caching daemon to store PHP sessions in a load balanced environment. In this post I will explain how you can easily install it and make it available in LAMP.