Caching is not performance optimization

Caching is not a performance optimization… and it never will be

In a perfect world, caching would be a four-letter word: a way to mask poor, inefficient code throughout a system. Using it would be the greatest sin that could ever be committed. But this isn’t a perfect world, and caching? Well, it’s a requirement.

I remember the old days when I was coding in assembly and C. Back then, I had to do my own memory management and write code that could run in 4k of RAM while still being awesome. Nowadays, we have a bunch of people piecing together code snippets and calling it development. I was once one of these hackers, but I decided to educate myself in the world of programming. Seriously, by now I’ve studied so many concepts and design patterns that when I look back at my spaghetti code it makes my stomach turn.

So, before you start your witch hunt, know that I have valid reasons to say that caching is not the same as performance optimization. In fact, it’s quite the opposite: caching is just a way to avoid repeating tasks (which isn’t the same as automation), a way to hide screw-ups, and a decent way to keep the server from overloading (which, I guess, is a good thing).

So, let’s answer the obvious question: why not? It’s very simple: “A sub-optimal application, even when cached, remains sub-optimal. The execution time will be greatly reduced with caching, but the efficiency remains the same.”

Performance optimization is an art. It’s accomplished by analyzing for bottlenecks and removing them. It’s hunting down poorly written code and refactoring it. I could go on, but you get the point. Performance optimization is awesome; it’s basically making your code better.
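
To make the difference concrete, here is a minimal PHP sketch. It is not taken from any real application: the functions `sumSlow`, `sumSlowCached` and `sumFast` are hypothetical names made up for this illustration, and the iteration counter is only a crude stand-in for real work. Caching the wasteful function skips the work on a repeated request, but every new input still pays the full price, while the refactored version never pays it at all.

```php
<?php
// Hypothetical example: the sum of 1..n, written wastefully, cached, and refactored.
$iterations = 0;

// Wasteful version: loops n times and counts every iteration.
function sumSlow($n, &$iterations)
{
    $total = 0;
    for ($i = 1; $i <= $n; $i++) {
        $total += $i;
        $iterations++;
    }
    return $total;
}

// Cache wrapper around the wasteful version; the loop itself is untouched.
function sumSlowCached($n, array &$cache, &$iterations)
{
    if (!isset($cache[$n])) {
        $cache[$n] = sumSlow($n, $iterations);
    }
    return $cache[$n];
}

// Refactored version: closed-form formula, no loop and no cache needed.
function sumFast($n)
{
    return intdiv($n * ($n + 1), 2);
}

$cache = array();
sumSlowCached(1000, $cache, $iterations); // miss: 1000 iterations
sumSlowCached(1000, $cache, $iterations); // hit: no iterations
sumSlowCached(2000, $cache, $iterations); // new input, miss: 2000 iterations

echo "iterations spent behind the cache: $iterations", PHP_EOL;
echo "refactored result for 2000: ", sumFast(2000), PHP_EOL;
```

The cached version is faster on the second call, yet the code behind it is exactly as inefficient as before; only the refactoring changed that.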

I’m not saying that caching sucks because it’s not the same as performance optimization. My point is that we need to clearly understand what caching is and know where to use it, especially in the hack and slash world of Magento.

Now let’s look at how Wikipedia describes Caching:

In computer science, a cache is a component that transparently stores data so that future requests for that data can be served faster. The data that is stored within a cache might be values that have been computed earlier or duplicates of original values that are stored elsewhere. If requested data is contained in the cache (cache hit), this request can be served by simply reading the cache, which is comparatively faster. Otherwise (cache miss), the data has to be recomputed or fetched from its original storage location, which is comparatively slower. Hence, the greater the number of requests that can be served from the cache, the faster the overall system performance becomes.

To be cost efficient and to enable an efficient use of data, caches are relatively small. Nevertheless, caches have proven themselves in many areas of computing because access patterns in typical computer applications have locality of reference. References exhibit temporal locality if data is requested again that has been recently requested already. References exhibit spatial locality if data is requested that is physically stored close to data that has been requested already.
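
The hit/miss behaviour described in the quote can be sketched in a few lines of PHP. This is only an illustration, not how memcache or Magento implement it: the `ArrayCache` class, its `remember` method and the `slowSquare` function are hypothetical names invented for this example.

```php
<?php
// Hypothetical in-memory cache, for illustration only.
class ArrayCache
{
    private $store = array();
    public $hits = 0;
    public $misses = 0;

    /**
     * Returns the stored value for $key if there is one (cache hit);
     * otherwise computes it with $compute, stores it and returns it (cache miss).
     */
    public function remember($key, callable $compute)
    {
        if (array_key_exists($key, $this->store)) {
            $this->hits++;
            return $this->store[$key];
        }
        $this->misses++;
        $value = $compute();
        $this->store[$key] = $value;
        return $value;
    }
}

// Stand-in for an expensive computation (hypothetical).
function slowSquare($n)
{
    usleep(1000); // pretend this takes a while
    return $n * $n;
}

$cache = new ArrayCache();
$first  = $cache->remember('square:12', function () { return slowSquare(12); });
$second = $cache->remember('square:12', function () { return slowSquare(12); });

echo $first, ' ', $second, ' hits=', $cache->hits, ' misses=', $cache->misses, PHP_EOL;
```

The second request is served from the store without calling `slowSquare` again, which is all a cache does: it avoids repeating the work, it does not change the work.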

So future requests can be faster? Then why not make them faster to begin with?
This concept is application and language agnostic. There’s not a single application or language where caching makes the underlying code itself any more efficient.

But that doesn’t make any sense! Well, let’s look at a few quick examples. Remember when you were told to never do this?

$sql = "SELECT * FROM `database`.`table_name`";

Do you think caching the result of that SELECT statement in MySQL, Memcache or some other caching store somehow makes it more efficient? Of course not. No matter where you put it, the result is just a value stored in a system; the query that produced it is exactly as wasteful as before. Caching stuff like this won’t make any difference at all when optimizing.

Let’s take Magento for example. After all, we are Magento developers, right? I’m using Magento EE 1.12 configured to use nginx and memcache with full page caching disabled.

Consider the following SQL, used when generating a category page:

SELECT `e`.*, `cat_index`.`position` AS `cat_index_position`, `price_index`.`price`, `price_index`.`tax_class_id`, `price_index`.`final_price`, IF(price_index.tier_price IS NOT NULL, LEAST(price_index.min_price, price_index.tier_price), price_index.min_price) AS `minimal_price`, `price_index`.`min_price`, `price_index`.`max_price`, `price_index`.`tier_price`
FROM `catalog_product_entity` AS `e`
INNER JOIN `catalog_category_product_index` AS `cat_index` ON cat_index.product_id = e.entity_id
    AND cat_index.store_id = 1
    AND cat_index.visibility IN (2, 4)
    AND cat_index.category_id = '10'
    AND cat_index.is_parent = 1
INNER JOIN `catalog_product_index_price` AS `price_index` ON price_index.entity_id = e.entity_id
    AND price_index.website_id = '1'
    AND price_index.customer_group_id = 0
ORDER BY `cat_index`.`position` ASC
LIMIT 15

Showing rows 0 – 6 (7 total, Query took 0.0017 sec)

SELECT `e`.sku, `cat_index`.`position` AS `cat_index_position`, `price_index`.`price`, `price_index`.`tax_class_id`, `price_index`.`final_price`, IF(price_index.tier_price IS NOT NULL, LEAST(price_index.min_price, price_index.tier_price), price_index.min_price) AS `minimal_price`, `price_index`.`min_price`, `price_index`.`max_price`, `price_index`.`tier_price`
FROM `catalog_product_entity` AS `e`
INNER JOIN `catalog_category_product_index` AS `cat_index` ON cat_index.product_id = e.entity_id
    AND cat_index.store_id = 1
    AND cat_index.visibility IN (2, 4)
    AND cat_index.category_id = '10'
    AND cat_index.is_parent = 1
INNER JOIN `catalog_product_index_price` AS `price_index` ON price_index.entity_id = e.entity_id
    AND price_index.website_id = '1'
    AND price_index.customer_group_id = 0
ORDER BY `cat_index`.`position` ASC
LIMIT 15

Showing rows 0 – 6 (7 total, Query took 0.0015 sec)

Did you notice the difference? The second query selects only `e`.sku instead of `e`.*. On this tiny result set (7 rows) both queries take roughly the same time to run, but they can be improved greatly by using indexes and maybe even tuning MySQL.

You see? Performance improvement is as simple as it seems. Let’s look at another example…

Open Mage_Core_Model_Resource_Db_Abstract, look at the load method, and let’s implement some caching to improve performance:

    /**
     * Builds the cache key for a load request.
     * The key depends only on the model class, the field and the value,
     * so it is the same before and after the object has been loaded.
     *
     * @param Mage_Core_Model_Abstract $object
     * @param mixed $value
     * @param string $field
     * @return string
     */
    protected function getObjectCacheKey(Mage_Core_Model_Abstract $object, $value, $field) {
        return md5(get_class($object) . '|' . $field . '|' . $value);
    }

    /**
     * Fetches an object from the cache storage
     * @param string $key
     * @return boolean | object
     */
    protected function isThisObjectCached($key) {
        $helper = Mage::helper('performance');

        if ($cachedObject = $helper->fetchFromCache($key)) {
            return $cachedObject;
        }

        return false;
    }

    /**
     * Saves an object in cache
     * @param string $key
     * @param Mage_Core_Model_Abstract $object
     * @return Mage_Core_Model_Resource_Db_Abstract
     */
    protected function saveThisObjectInCache($key, Mage_Core_Model_Abstract $object) {
        $helper = Mage::helper('performance');

        $helper->saveObjectInCache($key, $object);

        return $this;
    }

    /**
     * Load an object
     *
     * @param Mage_Core_Model_Abstract $object
     * @param mixed $value
     * @param string $field field to load by (defaults to model id)
     * @return Mage_Core_Model_Resource_Db_Abstract
     */
    public function load(Mage_Core_Model_Abstract $object, $value, $field = null) {
        if (is_null($field)) {
            $field = $this->getIdFieldName();
        }
        $key = $this->getObjectCacheKey($object, $value, $field);

        if ($cachedObject = $this->isThisObjectCached($key)) {
            // Cache hit: copy the stored row into the object
            $data = $cachedObject->getData();
            if ($data) {
                $object->setData($data);
            }
        } else {
            // Cache miss: read from the database, then store the raw row
            $read = $this->_getReadAdapter();
            if ($read && !is_null($value)) {
                $select = $this->_getLoadSelect($field, $value, $object);
                $data = $read->fetchRow($select);

                if ($data) {
                    $object->setData($data);
                    $this->saveThisObjectInCache($key, $object);
                }
            }
        }
        $this->unserializeFields($object);
        $this->_afterLoad($object);
        return $this;
    }

You can find my siege results here.

As you can see, there’s not much to gain. One could even say that with caching, performance actually degrades.

What does that even mean?

It means that by adding an extra layer (caching) you’re actually increasing the overall processing time because now you have the additional overhead of checking if the object is cached and fetching or saving caching data.
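
The overhead can be sketched by counting the steps a request goes through. This is a rough illustration under stated assumptions, not a benchmark: `StepCounter`, `loadRow` and `loadRowCached` are hypothetical names, the array stands in for an already cheap database lookup, and real steps do not all cost the same.

```php
<?php
// Hypothetical sketch: counts the steps of a lookup with and without a cache layer.
class StepCounter
{
    public $steps = 0;

    public function tick()
    {
        $this->steps++;
    }
}

// Stand-in for an already cheap primary-key lookup (one step).
function loadRow(array $table, $id, StepCounter $counter)
{
    $counter->tick(); // the lookup itself
    return isset($table[$id]) ? $table[$id] : null;
}

// The same lookup behind a cache layer.
function loadRowCached(array $table, $id, array &$cache, StepCounter $counter)
{
    $counter->tick(); // build the cache key
    $key = md5('row|' . $id);

    $counter->tick(); // ask the cache
    if (isset($cache[$key])) {
        return $cache[$key];
    }

    $row = loadRow($table, $id, $counter); // miss: the lookup still happens

    $counter->tick(); // write the result to the cache
    $cache[$key] = $row;
    return $row;
}

$table = array(1 => array('sku' => 'abc'), 2 => array('sku' => 'def'));

$plain = new StepCounter();
loadRow($table, 1, $plain);
loadRow($table, 1, $plain);

$cached = new StepCounter();
$cache = array();
loadRowCached($table, 1, $cache, $cached); // miss: key + check + lookup + save
loadRowCached($table, 1, $cache, $cached); // hit: key + check

echo "plain: {$plain->steps} steps, cached: {$cached->steps} steps", PHP_EOL;
```

When the thing being cached is already cheap, even a hit costs about as much as the original lookup, and a miss costs more.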

To me, at least, this also indicates that performance isn’t affected by fetching data from one source over the other, in this case memcache vs. MySQL.

Why are you ranting about caching?

Because of this. That’s why. I’m a super-advocate of reverse proxies and caching, not because I’m lazy but because I’m nuts about performance optimization. When I think about optimization, I think of enhancements and refactoring, not of hiding errors and poorly thought-out code.

Architecture has a lot in common with engineering. When you’re constructing a building, you choose the right materials, the right spot and the right people to construct it. So, if you’re constructing a sh&^%$ building, are you going to paint the walls to hide the shoddy craftsmanship and poor materials? Yes, you probably would, and you would act like nothing happened until you got caught. Those are the rules of the game, right?

What can we do?

It’s totally up to you but here are a few good articles:

The best thing you can do, however, is to apply the boy scout rule to everything that you code:

It’s not enough to write the code well. The code has to be kept clean over time. We’ve all seen code rot and degrade as time passes. So we must take an active role in preventing this degradation.
The Boy Scouts of America have a simple rule that we can apply to our profession.
Leave the campground cleaner than you found it.

This is an updated version of the article; the previous version can be found here as a text file.


About Luis Tineo

Husband, father, performance improvement junkie, biker, video gamer and Linux user. In my day job I'm a Systems Architect at Blue Acorn.


  • Robert Henderson

    Your articles are very informative Luis.

    • http://www.kingletas.com letas

      Thank you Robert!!!

    • http://www.kingletas.com Luis Tineo

      Thanks Robert… :-D

    • letas

      Thanks Robert… I really appreciate it :-D

  • fbrnc

    Great post!

    Btw, something’s wrong with your http://www.kingletas.com/go/* links. They always redirect back to the homepage.

    Bye, and have a nice day,

    Fabrizio

    • letas

      Thanks Fabrizio, greatly appreciate it. Thanks for the tip on the links, I recently migrated the blog and enabled some 404 redirection but didn’t check these links.

  • http://www.facebook.com/Euklid Nils Boe Eriksen

    Great article on magento performance vs. caching.

    • http://www.kingletas.com Luis Tineo

      Thanks Nils… I really appreciate it :-D

  • letas

    Hey Christopher,

    Thanks for stopping by and I am sorry you have that perception of me. While I don’t consider myself to be one I cannot control the perception that you get of me. Could you elaborate as of why you think I get to be in that list?

  • jul

    It all depends on your application. I’m currently working on an app accessing files on a tape library. Even in the case where a tape drive is available, the access time to the file is counted in minutes. Caching dramatically increases performance.

    • letas

      Hey Jul – I think that is my point, you can dramatically speed up processing with caching. Caching is part of our current performance improvement methodologies but is not performance improvement on its own.

  • http://twitter.com/puppetMaster3 c vic

    Let me guess: U don’t have a CS degree? it’s ok.

    • letas

      Telecommunications Engineer and Electrical Technician – You? Let me guess? …..

  • Andrew Jackson

    I’m a bit late to the party but this was a good article. I just found your site today and am really enjoying all the different topics and different point of views.

    Thanks for the effort in writing these, I look forward to the next.

    • http://www.kingletas.com Luis Tineo

      Andrew welcome to the party please enjoy it as much as you can, this is just the beginning :-D,

      Thanks for stopping by
