Cache Cow

If you’ve been around PHP for a while, you’ve probably heard about APC, the Alternative PHP Cache. APC is an opcode cache that can significantly speed up your PHP applications by caching both compiled PHP code and user variables. Adding APC to an application usually results in improved application response times, reduced server load and happier users.

In this article, I’ll introduce you to APC, guiding you through the process of installing and configuring it and showing you a few examples of how it works. I’ll also walk you through the APC administrator interface, which lets you view APC performance in real time, and show you how you can use it with the Zend Framework. So come on in, and let’s get started!

Getting Started

First up, a quick description of what APC is and how it works.

As you probably already know, PHP is an interpreted language (unlike, say, Java or C++). Whenever a client requests a PHP page, the server will read in the source code of the page, compile it into bytecode and then execute it. In the normal scenario, this process is performed on every request…although PHP is so speedy that you probably won’t even notice it!

If you’re running an application or Web site that has many hundreds or thousands of requests coming in per minute, though, you’re going to want to speed things up as much as possible. And that’s where APC comes in. In the words of its Web site, APC is “a free, open, and robust framework for caching and optimizing PHP intermediate code.” Very simply, APC caches the compiled output of each PHP script run and reuses it for subsequent requests. This reduces the time and processing cycles needed to fully satisfy each request, leading to better performance and lower response times.

Does it work? You bet (there are some benchmarks at the end of the article). And it’s easy to set up as well. To install it, use the pecl command, as shown below:

shell> pecl install apc-3.1.4

The PECL installer will now download the source code, compile it and install it to the appropriate location on your system.

Alternatively, manually download the source code archive (v3.1.4 at this time) and compile it into a loadable PHP module with phpize:

shell> cd apc-3.1.4
shell# phpize
shell# ./configure
shell# make
shell# make install

This procedure should create a loadable PHP module named apc.so in your PHP extension directory. You should now enable the extension in the php.ini configuration file, restart your Web server, and check that the extension is enabled with a quick call to phpinfo().

Windows users have a much easier time of it; pre-compiled Windows versions of php_apc.dll can be downloaded from here. Once you’ve got the file, place it in your PHP extension directory, activate it via your php.ini configuration file, and restart your Web server. You should now be able to see the extension active with phpinfo(), as described above.
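
If you’d rather not scan through the phpinfo() output, a short script can perform the same check. This is only a sketch: is_apc_available() is a hypothetical helper name of my own, not part of APC, and it assumes the extension registers itself under the name ‘apc’ and honours the ‘apc.enabled’ directive.

```php
<?php
// Hypothetical helper, not part of APC itself: reports whether the
// extension is loaded and switched on for the current request.
function is_apc_available() {
  // ini_get() returns false if the directive is unknown, "0" if disabled
  return extension_loaded('apc') && (bool) ini_get('apc.enabled');
}

echo is_apc_available() ? 'APC is enabled' : 'APC is not available';
```

If the script reports that APC is not available, double-check that the extension line in php.ini points to the right file and that the Web server was restarted afterwards.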

Digging Deeper

The APC source code archive includes a script named apc.php. This script serves as the administrator interface for the cache, allowing you to look inside the cache at any time to view usage or inspect cached variables. It’s a good idea to get familiar with how this works before starting to write any code.

To begin, extract the script from the source code archive and copy it to your Web server document root. Then, open it in a text editor and set the administrator password (you’ll find it near the top of the script). Once you’ve got that done, try accessing the script through your Web browser. You should see something like this:

As you can see, the script provides a bird’s-eye view of the current cache status, displaying general cache information, usage and statistics on hits and misses. The information is presented for both the system cache (which handles opcodes) and the user cache (which handles user variables). You’ll also see some interesting data derived from these statistics, such as the number of cache requests per second, the hit rate and miss rate.

This information is useful for understanding how well your cache is working and for identifying areas that are under-optimized. In particular, note the cache full count value, which indicates how often the cache has filled up; a high number indicates heavy cache churn and suggests that you should assign more memory to the cache.
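
The hit rate shown in the interface can also be worked out in your own code. Here’s a minimal sketch: cache_hit_rate() is a hypothetical helper, and the ‘num_hits’ and ‘num_misses’ keys are assumed to match what apc_cache_info() returns on your APC version, so inspect the array with print_r() first if in doubt.

```php
<?php
// Hypothetical helper: hit rate as a percentage, from an info array.
// The 'num_hits' and 'num_misses' keys are assumptions about the
// structure returned by apc_cache_info().
function cache_hit_rate(array $info) {
  $hits = isset($info['num_hits']) ? $info['num_hits'] : 0;
  $misses = isset($info['num_misses']) ? $info['num_misses'] : 0;
  $total = $hits + $misses;
  return $total > 0 ? round(100 * $hits / $total, 1) : 0.0;
}

// only query the cache if APC is actually loaded
if (function_exists('apc_cache_info')) {
  $info = apc_cache_info('user', true);  // true = skip the list of entries
  if ($info !== false) {
    echo 'User cache hit rate: ' . cache_hit_rate($info) . '%';
  }
}
```

A hit rate that stays low after the cache has warmed up is usually a sign that entries are expiring or being evicted before they can be reused.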

The “System Cache Entries” tab lists the PHP scripts that are currently being cached, together with their filename, size and number of hits. Note that APC caches script opcodes automatically; you don’t need to change your code for this to happen.

The “User Cache Entries” tab lists user variables that have been stored in the cache, together with their identifier, size, and creation and modification times. You can select any of these entries to look inside the cache entry and see what it contains. Note that user cache entries must be manually created by a PHP script; you’ll see how in the next section.

Remember that you can clear the opcode cache or the user cache at any time with the “Clear cache” buttons at the top right corner of the page.

A Galaxy Far, Far Away

Now that you have a better idea of how APC works, let’s write some code. Caching user variables in APC is mainly done through the apc_add() and apc_fetch() functions, which allow variables to be added to, and retrieved from, the cache respectively. Here’s a simple example that illustrates:

<?php
if ($quote = apc_fetch('starwars')) {
  echo $quote;
  echo " [cached]";
} else { 
  $quote = "Do, or do not. There is no try. -- Yoda, Star Wars";  
  echo $quote; 
  apc_add('starwars', $quote, 120);
}
?>

Now, the first time you run this script, the page will be generated from scratch and you’ll see just the quotation.

Next, try refreshing the page. This time, the output will be retrieved from the cache, and the quotation will be followed by the “[cached]” marker.

The business logic to use a cache is fairly simple. The first step is to check if the required data already exists in the cache. If it doesn’t, it should be generated from the original data source, and a copy saved to the cache for future use. If it does, you can use it right away – write it to a file, pipe it to an external program or output it to the client.

In the previous example, checking whether the data already exists in the cache is accomplished with the apc_fetch() method, while writing a fresh snapshot to the cache is done with the apc_add() method. Notice that both methods require an ID; this ID uniquely identifies the object in the cache and serves as a key for saving and retrieving cache data. The apc_add() method additionally lets you specify a duration (in seconds) for which the cache is valid.
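
One thing to watch for with this pattern: a test like if ($quote = apc_fetch(...)) treats cached values such as 0, an empty string or false as a cache miss. The apc_fetch() function accepts an optional second argument, passed by reference, that reports whether the key was actually found. Here’s a sketch of the check-generate-save logic wrapped in a reusable function; cache_remember() and build_quote() are hypothetical names of my own, and the fallback for a missing APC extension is an assumption made so that the script still runs everywhere.

```php
<?php
// Hypothetical helper, not part of APC: return the cached value for $key,
// or build it with the $producer callback and cache it for $ttl seconds.
function cache_remember($key, $ttl, $producer) {
  // Assumption: without APC loaded, skip caching and just build the value.
  if (!function_exists('apc_fetch')) {
    return call_user_func($producer);
  }
  $found = false;
  $value = apc_fetch($key, $found);  // $found is set by reference
  if ($found) {
    return $value;                   // a hit, even for 0, '' or false
  }
  $value = call_user_func($producer);
  apc_add($key, $value, $ttl);
  return $value;
}

// hypothetical data source for the example
function build_quote() {
  return 'Do, or do not. There is no try. -- Yoda, Star Wars';
}

echo cache_remember('starwars', 120, 'build_quote');
```

The calling code no longer needs to know whether the value came from the cache or from the original data source.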

Take a look in the administrator interface, and you should see your cached data, together with statistics on cache hits and misses:

A-rray of Sunshine

In addition to caching strings, APC also allows you to cache arrays, objects, functions and references. Consider the following example, which caches an array of values:

<?php
// if available, use cached array
// if not, create and cache array
if ($data = apc_fetch('heroes')) {
  echo 'Cached data: ';
} else { 
  $data = array('Spiderman', 'Superman', 'Green Lantern', 'Human Torch');
  apc_add('heroes', $data, 120);
}
echo $data[1];  // Superman
?>

You can also cache nested or multi-dimensional arrays, as shown below:

<?php
// if available, use cached data
// if not, create and cache nested array
if ($data = apc_fetch('config')) {
  echo 'Cached data: ';
} else {
  $data = array(
    'site1' => array(
      'smtp' => array(
        'host' => '192.168.0.1',
        'user' => 'user1',
        'pass' => 'guessme'
      )
    ),
    'site2' => array(
      'smtp' => array(
        'host' => '192.168.10.10',
        'user' => 'user2',
        'pass' => 's3cr3t'
      )
    ),    
  );
  apc_add('config', $data, 120);
}

// display data
echo $data['site2']['smtp']['pass']; // s3cr3t
?>

An Object Lesson

In addition to caching arrays, APC also allows you to store objects in the cache. To illustrate, consider the next example, which initializes a simple User object, stores it in the cache, and then retrieves it back from the cache:

<?php
// define class
class User {

  private $name;
  private $location;

  function setName($value) {
    $this->name = $value;
  }

  function setLocation($value) {
    $this->location = $value;
  }

  function getName() {
    return $this->name;
  }

  function getLocation() {
    return $this->location;
  }
}

// if cached object available, use cached object
// if not, create new object instance and cache it
if (apc_exists('user')) {
  $obj = apc_fetch('user');
  echo "Cached data: ";  
} else {
  $obj = new User;
  $obj->setName('John');
  $obj->setLocation('London');
  apc_add('user', $obj, 120);
}

// print object properties
echo 'My name is ' . $obj->getName() . ' and I live in ' . $obj->getLocation();
?>

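On the first run, this script prints “My name is John and I live in London”. On subsequent runs within the 120-second lifetime, the same line is prefixed with “Cached data: ”.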

Another approach to arrive at the same result is to serialize the object into a string, and then store the string in the cache instead of the object. Here’s what that would look like:

<?php
// define class
class User {

  private $name;
  private $location;

  function setName($value) {
    $this->name = $value;
  }

  function setLocation($value) {
    $this->location = $value;
  }

  function getName() {
    return $this->name;
  }

  function getLocation() {
    return $this->location;
  }
}

// serialize object and cache
// retrieve from cache as needed and deserialize
if ($str = apc_fetch('user')) {
  $obj = unserialize($str);
  echo "Cached data: ";  
} else { 
  $obj = new User;
  $obj->setName('John');
  $obj->setLocation('London');
  $str = serialize($obj);
  apc_add('user', $str, 120);
}
echo 'My name is ' . $obj->getName() . ' and I live in ' . $obj->getLocation();
?>

Getting Closure

You can also use APC to cache references and (with a little tweaking) anonymous functions. Let’s take a look:

<?php
class Calendar {
  public $year = 2001;
  public function &getYear() {
    return $this->year;
  }
}

$obj = new Calendar;
$a = &$obj->getYear();  // 2001
$obj->year = 2010;      // 2010
apc_add('ref', $a, 60);
$ref = apc_fetch('ref');
$ref++;                 
echo $ref;              // 2011
?>

Anonymous functions or closures, new in PHP 5.3, offer an easy way to define functions “on the fly”. By default, however, closures cannot be cached with APC, as the Closure class does not implement serialization. Here’s a simple example that illustrates the problem:

<?php
// check if closure exists in cache
// if yes, retrieve and use
// if not, define and add to cache
if (!apc_exists('area')) {
  // simple closure
  // calculates area from length and width
  $area = function($length, $width) {
      return $length * $width;
  };
  apc_store('area', $area);
  echo 'Added closure to cache.';  
} else {
  $func = apc_fetch('area');
  echo 'Retrieved closure from cache. ';
  echo 'The area of a 6x5 polygon is: ' . $func(6,5);  
}
?>

When you try accessing this script, you’ll see an error about serialization, as shown below:
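
You can reproduce the underlying failure without involving APC at all, by passing a closure straight to serialize(). This is a small sketch; closure_serialization_error() is a hypothetical helper name, and the exact wording of the message may differ between PHP versions.

```php
<?php
// Hypothetical helper: tries to serialize a closure and returns the
// resulting error message, or null if serialization unexpectedly works.
function closure_serialization_error() {
  $area = function ($length, $width) {
    return $length * $width;
  };
  try {
    serialize($area);
    return null;
  } catch (Exception $e) {
    return $e->getMessage();
  }
}

echo closure_serialization_error();
```

The message complains that serialization of ‘Closure’ is not allowed, which is the same complaint that surfaces when the closure is handed to apc_store().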

What’s the solution? Well, Jeremy Lindblom has created a custom SuperClosure class that wraps a closure and supports both serialization and reflection. If you implement your closure using this class, you will be able to cache it. Here’s a revision of the previous script that demonstrates:

<?php
// include SuperClosure class
include 'SuperClosure.class.php';

// check if closure exists in cache
// if yes, retrieve and use
// if not, define and add to cache
if (!apc_exists('area')) {
  // simple closure
  // calculates area given length and width
  $area = new SuperClosure(
    function($length, $width) {
      return $length * $width;
    }
  );
  apc_store('area', $area);
  echo 'Added closure to cache.';  
} else {
  $func = apc_fetch('area');
  echo 'Retrieved closure from cache. ';
  echo 'The area of a 6x5 polygon is: ' . $func(6,5);  
}
?>

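The first request prints “Added closure to cache.”; subsequent requests print “Retrieved closure from cache. The area of a 6x5 polygon is: 30”.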

Note that these examples use apc_store() instead of apc_add(). In most cases, you can use these two functions interchangeably. The primary difference lies in how they behave when you attempt to store a value using an identifier that already exists in the cache: apc_add() will return false, while apc_store() will overwrite the previous value with the new value.
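
To see the difference for yourself, here’s a short sketch that runs both functions against the same key. The demo_add_vs_store() function and the ‘greeting’ key are hypothetical names used only for illustration, and the function returns null if APC isn’t loaded.

```php
<?php
// Hypothetical demo function; returns null when APC is not loaded.
function demo_add_vs_store() {
  if (!function_exists('apc_add')) {
    return null;
  }
  apc_delete('greeting');                         // start from a clean slate
  $results = array();
  $results[] = apc_add('greeting', 'hello');      // key is free
  $results[] = apc_add('greeting', 'goodbye');    // key is already taken
  $results[] = apc_fetch('greeting');             // still the first value
  $results[] = apc_store('greeting', 'goodbye');  // overwrites the value
  $results[] = apc_fetch('greeting');             // now the second value
  return $results;
}

var_dump(demo_add_vs_store());
```

With APC active, the five results should be true, false, ‘hello’, true and ‘goodbye’, in that order. If APC is loaded but disabled (for example, on the command line), every call will simply return false.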

You can download the SuperClosure class definition from Jeremy Lindblom’s GitHub account.

Utility Belt

The APC extension also comes with a few other functions of note. For example, there are the apc_inc() and apc_dec() functions, which can be used to increment or decrement cached values respectively:

<?php
// store a value
apc_store('a', 20);

// increment and decrement stored value
apc_inc('a');         // 21
apc_inc('a');         // 22
apc_inc('a');         // 23
apc_dec('a');         // 22

// retrieve final value
echo apc_fetch('a');  // 22
?>
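
A typical use for apc_inc() is a simple hit counter. The following is only a sketch: count_page_view() and the ‘page_views’ key are hypothetical names, and the counter lives in shared memory, so it is lost whenever the cache is cleared or the Web server restarts.

```php
<?php
// Hypothetical page-view counter; the key name 'page_views' is an assumption.
function count_page_view() {
  if (!function_exists('apc_inc')) {
    return null;                 // APC not loaded: nothing to count with
  }
  apc_add('page_views', 0);      // creates the counter only if it is missing
  return apc_inc('page_views');  // the new value, or false on failure
}

$views = count_page_view();
echo is_int($views) ? "This page has been viewed $views times" : 'Counter unavailable';
```

Because apc_add() refuses to overwrite an existing key, it is safe to call on every request; only the first call actually creates the counter.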

The apc_bin_dump() method dumps the current contents of the cache in binary form, while the apc_bin_load() method loads a binary dump into the cache. Consider the following example, which illustrates:

<?php
// clear cache
apc_clear_cache();
apc_clear_cache('user');

// store some values
apc_store('a', 20001);
apc_store('b', 7494);

// dump cache in binary form
$dump = apc_bin_dump();

// clear cache
apc_clear_cache();
apc_clear_cache('user');

// try accessing a stored value
if (apc_fetch('a')) {
  echo apc_fetch('a');
} else {
  echo 'Nothing in cache'; 
}
// reload cache from binary dump
apc_bin_load($dump);

// try accessing a stored value
if (apc_fetch('a')) {
  echo apc_fetch('a');     // 20001
} else {
  echo 'Nothing in cache'; 
}
?>

The apc_clear_cache() method can be used to clear the opcode cache or the user cache:

<?php
// clear opcode cache
apc_clear_cache();

// clear user cache
apc_clear_cache('user');
?>

The apc_cache_info() method presents information on current cache status and memory allocation:

<?php
print_r(apc_cache_info());
?>

Here’s what the output looks like:

Tweet Tweet

With all this background information at hand, let’s try APC with a real-world example. This next script uses APC to cache the result of a Twitter search:

<html>
  <head>
    <style type="text/css">
      div.outer {
      	border-bottom: dashed orange 1px;
      	padding: 4px;
      	clear: both;
      	height: 50px;
      }        
      div.img {
        float:left;
        padding-right: 2px;
      }
      span.attrib {
        font-style: italic;
      }
    </style>  
  </head>  
  <body>
    <h2>Twitter Search</h2>
    <form action="<?php echo htmlentities($_SERVER['PHP_SELF']); ?>" method="post">
    Search term: <input type="text" name="q" />
    <input type="submit" name="submit" />
    </form>   
<?php
    // if form submitted
    if (isset($_POST['submit'])):
      // sanitize query terms
      $q = strip_tags($_POST['q']);

      // generate cache id from query term
      $id = md5($q);

      // check if this search already exists in cache
      // use if yes, generate fresh results and add to cache if no
      if (apc_exists($id)) {
        $records = apc_fetch($id);        
      } else {
        // search Twitter for query term
        // URL-encode the term so that spaces and symbols don't break the request
        $result = simplexml_load_file('http://search.twitter.com/search.atom?q=' . urlencode($q) . '&lang=en');

        // process Atom feed of search results
        $records = array();
        foreach ($result->entry as $entry) {
          $item['image'] = (string)$entry->link[1]['href']; 
          $item['owner'] = (string)$entry->author->name;
          $item['uri'] = (string)$entry->author->uri;
          $item['tweet'] = (string)$entry->content;
          $item['time'] = date('d M Y, h:i', strtotime($entry->published)); 
          $records[] = $item;
        }

        // cache for 5 minutes
        apc_store($id, $records, 300);
      }    

      // display search results
?>
    <h2>Twitter Search Results for '<?php echo htmlspecialchars($q); ?>'</h2>
      <?php foreach ($records as $r): ?>
      <div class="outer">
        <div class="img"><img width="48" height="48" src="<?php echo $r['image']; ?>" /></div>
        <div><?php echo $r['tweet']; ?><br/> 
        <span class="attrib">By <a href="<?php echo $r['uri']; ?>"><?php echo $r['owner']; ?></a> 
        on <?php echo $r['time']; ?></span></div>
      </div> 
      <?php endforeach; ?>
    <?php endif; ?>
  </body>
</html>

Despite its length, this is actually a very simple script. It begins by creating a search form for the user to enter search terms into. Once this form is submitted, it connects to the Twitter Search API, retrieves an Atom-formatted list of search results matching the search term, processes the Atom feed and renders the final output as a series of HTML blocks. The results of the search are cached for five minutes, so that they can be used for subsequent searches containing the same search terms. Notice that each search query is assigned a unique identifier in the APC cache, by using its MD5 signature as key.

You’ll notice that there are two levels of caching in this script. First, APC’s opcode cache is automatically caching the compiled bytecode of the script, and using this cached bytecode for subsequent requests instead of recompiling it. Second, APC’s user cache is caching the results of each Twitter search, and reusing these results (instead of connecting to Twitter afresh) for subsequent searches containing the same query terms. As a result, subsequent searches for the same term will be served from the cache, leading to a noticeable reduction in load time (try it for yourself and see).
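
One refinement worth considering for the MD5-based identifier is to normalize the search term before hashing it, so that “PHP” and “ php ” share a single cache entry, and to add a prefix so that search keys can’t collide with other entries in the user cache. Here’s a sketch; search_cache_key() and the ‘twitter_search_’ prefix are hypothetical choices of my own, not something the script above requires.

```php
<?php
// Hypothetical helper: builds a cache key for a search term.
// The 'twitter_search_' prefix and the normalization are assumptions.
function search_cache_key($q) {
  $normalized = strtolower(trim($q));  // ignore case and surrounding spaces
  return 'twitter_search_' . md5($normalized);
}

echo search_cache_key('  PHP ');
```

Whether case-insensitive matching is appropriate depends on the search service; if it treats “PHP” and “php” differently, drop the strtolower() call.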

Here’s an example of what the output looks like:

In The Frame

If you’re a fan of the Zend Framework, you’ll be happy to hear that Zend_Cache comes with built-in support for APC, allowing you to begin using it out of the box. To illustrate, consider the following code, which revises the previous example into a Zend Framework controller:

<?php
class IndexController extends Zend_Controller_Action
{
    public function indexAction()
    {
        // action body
    }

    public function searchAction()
    {
      // initialize cache
      $cache = Zend_Cache::factory( 'Core', 
                                    'APC', 
                                    array('lifeTime' => 300, 'automatic_serialization' => true));

      // create form and attach to view                                          
      $form = new SearchForm();
      $this->view->form = $form;       

      // validate input      
      if ($this->getRequest()->isPost()) {
        if ($form->isValid($this->getRequest()->getPost())) {
          // get sanitized input
          $values = $form->getValues();        

          // calculate MD5 hash
          $id = md5($values['q']);

          // look for records in cache
          if (!$records = $cache->load($id) ){                       
            // if not present in cache
            // search Twitter for query term
            $result = simplexml_load_file("http://search.twitter.com/search.atom?q=" . urlencode($values['q']) . "&lang=en");

            // process Atom feed of search results
            $records = array();
            foreach ($result->entry as $entry) {
              $item['image'] = (string)$entry->link[1]['href']; 
              $item['owner'] = (string)$entry->author->name;
              $item['uri'] = (string)$entry->author->uri;
              $item['tweet'] = (string)$entry->content;
              $item['time'] = date('d M Y, h:i', strtotime($entry->published)); 
              $records[] = $item;
            }

            // save to cache
            $cache->save($records, $id);
          }

          // assign results to view
          $this->view->records = $records;
          $this->view->q = $values['q'];
        }           
      }       
    }
}

// search form
class SearchForm extends Zend_Form
{
  public function init()
  {
    // initialize form
    $this->setMethod('post');

    // create text input for search term
    $q = new Zend_Form_Element_Text('q');
    $q->setLabel('Search term:')
         ->setOptions(array('size' => '35'))
         ->setRequired(true)
         ->addValidator('NotEmpty', true)
         ->addFilter('HtmlEntities')
         ->addFilter('StringTrim');            

    // create submit button
    $submit = new Zend_Form_Element_Submit('submit');
    $submit->setLabel('Search');

    // attach elements to form
    $this->addElement($q)
         ->addElement($submit);
  }
}

Here, the searchAction() method first sets up the Zend_Cache instance, with the Core frontend and the APC backend. The form object, which extends Zend_Form, is then added to the view, together with all necessary validators and filters, and the view is rendered.

When the user submits the form, control transfers back to the action controller, which checks the input and retrieves the filtered values. It then checks the cache to see if a search result already exists for this search term, and uses it if available; if not, it connects to the Twitter Search API, retrieves a result set, and saves it to the cache for future use. The results are then rendered through the view script. On subsequent searches for the same term, the cached result set will be used, producing a much faster response.

Here’s the code for the view script:

<style type="text/css">
  div.outer {
  	border-bottom: dashed orange 1px;
  	padding: 4px;
  	clear: both;
  	height: 50px;
  }        
  div.img {
    float:left;
    padding-right: 2px;
  }
  span.attrib {
    font-style: italic;
  }
</style>
<h2>Twitter Search</h2>
<?php echo $this->form; ?>

<?php if ($this->records): ?>
  <h2>Twitter Search Results for '<?php echo $this->q; ?>'</h2>
  <?php foreach ($this->records as $r): ?>
  <div class="outer">
    <div class="img"><img width="48" height="48" src="<?php echo $r['image']; ?>" /></div>
    <div><?php echo $r['tweet']; ?><br/> 
    <span class="attrib">By <a href="<?php echo $r['uri']; ?>"><?php echo $r['owner']; ?></a> 
    on <?php echo $r['time']; ?></span></div>
  </div> 
  <?php endforeach; ?>
<?php endif; ?>

And here’s a sample of the output:

The Need For Speed

At this point, there’s only one question left to answer: does APC’s opcode caching really deliver the goods and produce a verifiable increase in performance?

A good way to test this is by benchmarking a PHP script with and without APC, and evaluating the performance differential if any. ApacheBench (ab) is my tool of choice for this test, and my testbed will be the default welcome page of a new Zend Framework project. You can create this by installing the Zend Framework and then using the zf command-line tool to initialize a new, empty project, like this:

shell> zf create project example

Now, turn off APC, by disabling the extension in your php.ini configuration file and restarting the Web server. Then, use ab to benchmark the application welcome page by sending it 1000 requests with a concurrency level of 5, as follows:

shell> ab -n 1000 -c 5 http://example.localhost/default/index/index

On my development system, this produces output like the following:

The main numbers to look at here are the requests per second and the average time per request. The lower the average time per request, the better the performance. Similarly, the greater the number of requests served per second, the better the performance.

Next, re-enable APC, restart the Web server and try the test again. On my development system, this produces output like the following:

As you can see, enabling APC has resulted in an almost 185% increase in performance, with the server now being able to handle 71 requests per second (up from 25 earlier) and the average time per request coming down to 69 ms (from 194 ms earlier).
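
For the record, here’s how those percentages are derived from the figures quoted above. The percent_change() function is a hypothetical helper written just for this calculation.

```php
<?php
// Hypothetical helper: percentage change from $before to $after,
// rounded to one decimal place.
function percent_change($before, $after) {
  return round(($after - $before) / $before * 100, 1);
}

// requests per second: 25 without APC, 71 with APC
echo percent_change(25, 71) . "\n";

// average time per request: 194 ms without APC, 69 ms with APC
echo percent_change(194, 69) . "\n";
```

The first figure works out to an increase of 184%, and the second to a reduction of roughly 64% in the average time per request.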

The above test was run with APC’s default settings. However, APC comes with a number of configuration settings that you can tweak further to squeeze even better performance from it. Here are some of the important ones:

  • ‘apc.shm_size’ controls the size of the APC memory cache;
  • ‘apc.stat’ controls whether APC checks each script to see if it has been modified and needs to be recompiled and recached;
  • ‘apc.optimization’ determines the degree of optimization to apply;
  • ‘apc.filters’ specifies which files should, or should not, be cached;
  • ‘apc.write_lock’ places an exclusive lock for caching compiled script bytecode;
  • ‘apc.lazy_functions’ and ‘apc.lazy_classes’ enable lazy loading for functions and classes.

You can read more about these and other configuration directives here.
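
Before changing any of these settings, it helps to know what they are currently set to. Here’s a small sketch that reads them with ini_get(); report_apc_settings() is a hypothetical helper, the directive names are simply those from the list above, and ini_get() returns false for any directive that your APC build doesn’t define.

```php
<?php
// Hypothetical helper: collects the current values of the directives
// listed above. A value of false means the directive is not defined.
function report_apc_settings() {
  $names = array(
    'apc.shm_size', 'apc.stat', 'apc.optimization', 'apc.filters',
    'apc.write_lock', 'apc.lazy_functions', 'apc.lazy_classes'
  );
  $report = array();
  foreach ($names as $name) {
    $report[$name] = ini_get($name);
  }
  return $report;
}

foreach (report_apc_settings() as $name => $value) {
  echo $name . ' = ' . var_export($value, true) . "\n";
}
```

Run this through the Web server rather than the command line, since the two often use different php.ini files.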

That’s about all I have for the moment. I hope this tutorial has given you some insight into how APC works, and how you can use it to improve the performance of your PHP applications. Try it out the next time you have a performance optimization problem, and see what you think!

Copyright Melonfire, 2010. All rights reserved.

Reposted from: DevZone
