You need php session-clustering and you need it done yesterday. The project is at risk. The suits are breathing down your neck and monitoring your every tweet. Memcache to the rescue.
Installing memcached
Not installed? Get it here. Alternatively, install via a *nix package manager, e.g. apt, yum, rpm etc. Start your memcache servers:
/usr/local/bin/memcached -u root -d -m 1024 -l 0.0.0.0 -p 11211
Add one more to the pool: (this will make sense later. promise)
/usr/local/bin/memcached -u root -d -m 1024 -l 0.0.0.0 -p 11212
Note: the port number differs for the second memcached daemon.
Installing memcache extension with “session feature”
PHP “talks” to a memcache server via the memcache extension but you already knew that. Something less known is that when installing this extension you can enable/disable the extension’s session.save_handler feature.
qbook:Locate quinton$ sudo pecl install memcache
Password:
downloading memcache-2.2.5.tgz ...
Starting to download memcache-2.2.5.tgz (35,981 bytes)
....done: 35,981 bytes
11 source files, building
running: phpize
Configuring for:
PHP Api Version: 20041225
Zend Module Api No: 20060613
Zend Extension Api No: 220060519
1. Enable memcache session handler support? : yes
Choose “yes”. You can now configure php to use memcache as its session store.
Configuring the memcache session store
ext/memcache introduces new directives to the php.ini. In this post I’ll use ini_set() within a php script so you can just copy-and-paste to try it out yourself. In your production environment you’ll prolly want these directives inside your php.ini and/or apache vhost. You decide.
<?php
// firstly, override the default session save_handler like this
ini_set('session.save_handler', "memcache"); // PHP_INI_ALL Supported since memcache 2.1.2
// now tell php where to store the sessions. two memcache servers specified here. more bout that later
ini_set('session.save_path', "tcp://localhost:11211, tcp://localhost:11212"); // PHP_INI_ALL Supported since memcache 2.1.2
// getting interesting. if primary server is down talk to the others in the server pool
ini_set('memcache.allow_failover', "1"); // PHP_INI_ALL Available since memcache 2.0.2.
// more tweaking stuff follows here
// may want to adjust this
ini_set('memcache.max_failover_attempts', "20"); // PHP_INI_ALL Available since memcache 2.1.0.
ini_set('memcache.default_port', "11211"); // PHP_INI_ALL Available since memcache 2.0.2.
ini_set('memcache.chunk_size', "8192"); // PHP_INI_ALL Available since memcache 2.0.2.
ini_set('memcache.hash_strategy', "standard"); // PHP_INI_ALL Available since memcache 2.2.0.
ini_set('memcache.hash_function', "crc32"); // PHP_INI_ALL Available since memcache 2.2.0.
?>
A crude test
This code can be appended to the php snippet from above
<?php
session_start();
echo "Session save_handler is: ".ini_get("session.save_handler")."\n";
echo "Session save_path is: ".ini_get("session.save_path")."\n";
if(isset($_SESSION['bleh']))
{
    echo "Bleh is already set: ".$_SESSION['bleh']."\n";
}
else
{
    $_SESSION['bleh'] = 'bwahahha';
    echo 'Bleh has been set with: '.$_SESSION['bleh']."\n";
}
?>
Where’s the clustering, you ask?
Well, truth be told, there isn’t any. What we have so far is a memcache pool consisting of 2 servers, and PHP is configured to write to that pool. The servers are the ones listed in the “session.save_path” ini directive. For each session the client hashes the key to pick a server from that list (that is what the hash_strategy and hash_function directives control). Because “failover” is enabled, if that server can’t be reached PHP will try the other servers in the pool one-by-one until the request is fulfilled. Every silver lining has a cloud.
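To make the pool behaviour a bit more concrete, here is a rough sketch of "hash the key, pick a server, fall over to the next one if it is down". It is a simplified illustration under my own assumptions, not the extension's actual algorithm: the function name pickServer() and the 'up' flag are made up for the example, and a plain crc32-modulo stands in for whatever the extension really does.

```php
<?php
// Simplified illustration only: NOT the memcache extension's real algorithm.
// Hypothetical pool; 'up' marks whether the server is reachable.
$pool = [
    ['host' => 'localhost:11211', 'up' => true],
    ['host' => 'localhost:11212', 'up' => true],
];

/**
 * Pick a server for a key: hash the key, take it modulo the pool size and,
 * if failover is allowed, walk on to the next servers when one is down.
 * (Assumes 64-bit PHP, where crc32() is never negative.)
 */
function pickServer(array $pool, string $key, bool $allowFailover = true, int $maxAttempts = 20): ?string
{
    $count = count($pool);
    if ($count === 0) {
        return null;
    }
    $start = crc32($key) % $count;
    $attempts = $allowFailover ? min($maxAttempts, $count) : 1;
    for ($i = 0; $i < $attempts; $i++) {
        $server = $pool[($start + $i) % $count];
        if ($server['up']) {
            return $server['host'];
        }
    }
    return null; // nothing reachable within the allowed attempts
}

$sessionId = 'abc123';
$primary = pickServer($pool, $sessionId);
echo "Session $sessionId lives on $primary\n";

// Simulate the primary crashing.
foreach ($pool as $i => $server) {
    if ($server['host'] === $primary) {
        $pool[$i]['up'] = false;
    }
}
$fallback = pickServer($pool, $sessionId);
echo "After the crash the client talks to $fallback\n";
```

Note what the sketch does not do: nothing copies the session from the crashed server to the fallback, so the fallback starts out without that session's data.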
Fragmentation
In the event of a memcache server crash, fragmentation will occur. The servers in the pool will not hold an identical cache, and it’s exceptionally hard to tell which keys each memcache server is storing. When a memcache daemon crashes it loses all its data. Memory, by nature, is volatile. You have been warned.
Replication
This method is most commonly referred to as client-side replication BUT is better described as redundancy. There is a method of achieving server-side replication (think mysql replication). AFAIK repcached is the only available patch of its kind. I would like to try it but haven’t had the opportunity. It’s a long story.
Summary
Memcache-based session save_handler is a quick win but it’s NOT scalable. As your server farm grows, this solution gets more complex and more costly to maintain and manage. Also, consider the implications of managing a php.ini per web server and the repercussions of the failover feature. Nasty. However, there is good news. A libmemcached-based extension is on the way.
Getting real
If you have more time to spare, take the safer option, i.e. storing sessions in a database. You could write a custom session save_handler or just use Zend Framework’s Zend_Session_SaveHandler_DbTable. What could be easier?

[...] Parker’s latest blog post looks at a handy feature of the memcache tool – session clustering – and how to set it up in your [...]
@tom3k,
Answers to your questions…
a) i just installed memcached via yum (running centos 5 latest, memcached came off rpmforge…)… any tweaking i should do to the memcached conf?
That depends on which extension you’ve installed. There are two memcache extensions for php viz. memcache and memcached. The latter is “better” since it’s based on the libmemcached C library.
http://www.php.net/manual/en/memcached.configuration.php
http://www.php.net/manual/en/memcache.ini.php
b) performance wise… since the sessions will now be stored in memory… im assuming this should be a tad faster? or does the in / out into memcached overhead negate any performance improvements? everythings running off 1 box btw, so it’ll be accessing localhost.
Since you’re running everything-on-1-box, network latency would never be an issue. Reading/writing to memory is always faster than disk I/O. Depending on your RAID configuration and filesystem, the performance gain could be negligible or very very significant. Don’t expect miracles, is all I’m saying.
yea, the reason i switched from file to memcached sessions wasnt performance related, i just needed to get around the sessions-locked-until-script complete issue… and wasnt in the mood to do a db driven system…
thanks again bud, write more articles… the crowd cant get enuf
We’re already using the PECL extension “memcache” which works great for us.
What is the difference to the “memcached” extension mentioned in the summary?
The new memcached extension is built entirely on libmemcached which is more feature-rich than ext/memcache
See http://blog.digg.com/?p=531 and http://tangent.org/552/libmemcached.html
It’s still in beta but it’s something to keep an eye on
Great post! Having to deploy a web cluster we really wondered a lot what to do with sessions and uploaded files. I’m not a fan of database sessions. Though it’s an easy solution i wouldn’t like to keep the transaction database handling session data for performance reasons. Memcached seems like a very nice solution even with no actual cluster
What do you suggest for uploaded files? We are thinking of rsyncing them.
Thanks Žilvinas,
Honestly, I would feel a lot safer with the database session store. A well-optimized mysql database will do the trick. The memcache session clustering is but a quick and dirty (note: dirty) win. I haven’t yet tried it in production under heavy loads.
With regards to files (or any static content) I recommend storing them on the filesystem in a manner which allows for replication (such as rsync) and does not interfere with the web server’s inherent ability to perform http-level caching (ETags, Last-Modified headers and stuff). *nix ext filesystems employ aggressive file-caching which will count in your favour.
Always be ready to serve up static content from a cluster of small web servers (like lighttpd/nginx/blah) AND/OR be able to move to a third-party CDN (akamai, cloudfront etc). Stay away from sharing filesystems over NFS! It’s slo-w-w-w-w
Rsync to the cluster – but beware of etags.
Make sure to set apache (for example) to NOT use inode data when calculating etags – otherwise your identical content on each node will still show to the end user as a different etag.
If you don’t want your OLTP database handling session data – then don’t – you can always throw up a separate database for dealing with session data alone.
There’s also…. http://memcachedb.org/
[...] planet-php.org i found this great post about how to solve PHP session clustering in an easy way. Though it’s no silver bullet but it’s definately worth knowing about. [...]
As a compromise, you can use a database-backed memcache solution where you write session information to memcache and then to the database every 5th time. This way, all sessions are not dropped when an instance goes down.
Erm. I think you misread the post. The memcache server pool handles failover automatically (well actually the client does). There won’t be any loss of session information unless all your memcache servers go down at the same time for some odd reason. And you don’t need to write a line of php to do that
Sorry bud. The solution you propose does not solve the data consistency problem. It creates a whole other problem. Think about it
If it’s consistency you want and/or need then read/write sessions to the database. See Zend_Session_SaveHandler_DbTable. And that’s a whole 3-4 lines of code. Oh well. Can’t win them all
Instead of writing session data to the db every 5th time, which could still cause data loss, you can md5 the session data after read, and then md5 the new session data before write. If the md5 value is the same, then nothing changed so there is no need to write; otherwise write. This can save lots of writes to the db, which could be the potential bottleneck long term
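A minimal sketch of that md5 idea, under loud assumptions: the class name ChangeAwareSessionStore is invented for the example and the "database" is a plain array so the snippet runs anywhere. In a real setup the same comparison would sit in the read/write callbacks of your own save handler.

```php
<?php
// Sketch only: the "database" is a plain array; the class name is made up.
class ChangeAwareSessionStore
{
    private array $db = [];        // stand-in for the session table
    private array $checksums = []; // md5 of the data as last read/written, per session id
    public int $dbWrites = 0;      // counts writes that actually hit the "database"

    public function read(string $id): string
    {
        $data = $this->db[$id] ?? '';
        $this->checksums[$id] = md5($data); // remember what we handed out
        return $data;
    }

    public function write(string $id, string $data): bool
    {
        if (isset($this->checksums[$id]) && $this->checksums[$id] === md5($data)) {
            return true; // nothing changed since the read, skip the db write
        }
        $this->db[$id] = $data;
        $this->checksums[$id] = md5($data);
        $this->dbWrites++;
        return true;
    }
}

$store = new ChangeAwareSessionStore();

$store->read('sess1');
$store->write('sess1', 'cart=1'); // changed: written

$store->read('sess1');
$store->write('sess1', 'cart=1'); // unchanged: skipped

$store->read('sess1');
$store->write('sess1', 'cart=2'); // changed: written

echo "Requests: 3, database writes: {$store->dbWrites}\n";
```

One thing to keep in mind: if session expiry is driven by a last-modified timestamp in the table, skipping the write also skips the refresh of that timestamp, so you would need to handle that separately.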
phpslacker: If a memcache daemon goes down, then any sessions stored in it are lost. So the failover works great for adding machines to the memcache pool, but fails when a machine is removed.
Yes. I discuss that in the post under “fragmentation”
IF your database-backed session store goes down – you lose all session data too.
This can be corrected on the code end – code needs to check for valid session data, and if it doesn’t exist – go get it, or otherwise deal with it gracefully.
First, a couple questions. Why do you feel that using memcache as a session store is a “quick and dirty” solution? How is using a DB session store more scalable than using a memcache backend?
In regards to my first question, in my opinion, properly setting up memcache to store sessions takes about the same amount of time as using a DB backend. That is assuming you are using the new memcached extension and consistent hashing (which you should be!). If you’re using a MySQL backend with a disk-based storage engine your session write master will eventually become bottlenecked when your site needs to scale. I can’t imagine a DB backend would be able to handle 25k-50k active users the way memcache can with aplomb.
It is for these reasons that I feel that memcache is the only way to store sessions if scalability is truly your concern.
Hi Rich,
Sorry bout the late reply.
Memcache certainly performs better than MySQL (or any other RDBMS for that matter). However an RDBMS offers data consistency and reliable replication out of the box whereas memcache is volatile and at present doesn’t offer ACID nor replication (the trusted kind). With confidence you could set up and grow a db cluster whereas these features are currently rough and a bit dodgy with memcache. So the more you need to scale out the more you’ll have data consistency issues with memcache. By definition scaling out is also the ability to do so gracefully
Scalability != Performance
It really comes down to this: Are we using memcache for its intended purpose? Memcache was designed to take the load off slower disk-based storage solutions but was not designed to replace them
The same hash-based distributed session storing you can do with memcache works equally well with any store – including mysql.
Rather than replicate, we do the same thing – partition session data to different mysql servers by hashing. IF we lose a server, we end up losing some session data – but everything will fail over to the other one, and the site code will re-establish the session, re-fetch what it needs to, and things move on.
I am using repcached (memcache servers with support for replication) servers and “libmemcached-based” PHP client.
The problem is that if the main repcached server goes for a toss, it does not fallback to the other servers in the pool.
The fallback is happening with “zlib-based” PHP memcached client.
Is failover mechanism not there in the libmemcached based client ?? If there is any, how can I achieve this ?
Hi Anshul,
I’m not entirely sure. I’ve heard that libmemcached (http://pecl.php.net/package/memcached) offers such a feature as the failover offered by http://pecl.php.net/package/memcache BUT I can’t find any substantiating doc
Best ask Brian Aker (http://tangent.org/552/libmemcached.html)
Hi PHPSlacker,
Thanks for the reply. I will surely do as suggested by you.
Hi PHPSlacker,
On the same line with my earlier post, is there any way to check if my repcached server is down, so that I can remove it dynamically from the server pool?
Thanks
You could try Monit
http://mmonit.com/monit/
With it you can monitor any daemon process and trigger an action to handle the event (in this case a fallen repcached server)
no .. i mean can I check using PHP client,say, while adding servers to the pool.
I understand. You can still use something like monit. When monit detects that a memcache server has fallen over it could trigger the writing of a record to filesystem or database or whatever that indicates that the server is down
Your php script could then check for that written record and update its server list on the fly. You can change your ini settings for memcache session handler using ini_set() (before calling session_start() of course)
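Roughly, that could look like the sketch below. Everything about the marker is an assumption for illustration: the convention that the monitor drops a file named "host_port.down" into a directory is made up here, and neither monit nor the memcache extension defines it. The function name buildSavePath() is hypothetical too.

```php
<?php
// Hypothetical convention: the monitor creates "<host>_<port>.down" inside
// $markerDir when a server falls over. This naming scheme is an assumption.
function buildSavePath(array $servers, string $markerDir): string
{
    $alive = [];
    foreach ($servers as $server) {
        [$host, $port] = $server;
        $marker = $markerDir . '/' . $host . '_' . $port . '.down';
        if (!file_exists($marker)) {
            $alive[] = "tcp://$host:$port";
        }
    }
    return implode(', ', $alive);
}

$servers = [['localhost', 11211], ['localhost', 11212]];

// Throwaway marker directory so the example is self-contained.
$markerDir = sys_get_temp_dir() . '/memcache-down-' . getmypid();
if (!is_dir($markerDir)) {
    mkdir($markerDir);
}

// Pretend the monitor flagged the second server as down.
touch($markerDir . '/localhost_11212.down');

$savePath = buildSavePath($servers, $markerDir);
echo $savePath, "\n";

// In a real script this is where you would do, before session_start():
// ini_set('session.save_path', $savePath);

// Clean up the throwaway directory.
unlink($markerDir . '/localhost_11212.down');
rmdir($markerDir);
```

If every server is flagged, the function returns an empty string, so a real script should decide what to do in that case (fall back to file sessions, show an error page, whatever suits you).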
Hi,
First off, thanks for the outline, it solves roughly half my problem instantly! the other half, well, I have been googling most of the day, and was wondering what your thoughts are on this ….
I want to use MEMCACHE for session storage as it is quick quick quick, but I need to keep the records for long periods of time, and the sessions are only activated infrequently. So I was looking for a solution where the session is stored in MySQL but when activated it gets put into memcache(d). All updates during the ‘use spike’ are written to the memcache, but then once the spike is over, the record is written back into the DB (write-back cache), after a period of inactivity. This way, if the memcache server goes offline, I only lose a small amount of data ..
Anyone doing anything like this ? Or am I missing something and it is a really stupid idea ?
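For what it's worth, the write-back idea described above can be sketched in a few lines. This is only a toy under stated assumptions: both stores are plain arrays (in practice the cache would be memcached and the db a MySQL table), timestamps are passed in by hand so the behaviour is easy to follow, and the class and method names are invented for the example.

```php
<?php
// Sketch only: $cache stands in for memcached, $db for a MySQL table.
class WriteBackSessionStore
{
    private array $cache = []; // id => ['data' => string, 'touched' => int, 'dirty' => bool]
    private array $db = [];    // id => string
    private int $idleSeconds;

    public function __construct(int $idleSeconds)
    {
        $this->idleSeconds = $idleSeconds;
    }

    public function read(string $id, int $now): string
    {
        if (!isset($this->cache[$id])) {
            // Cache miss: activate the session by pulling it from the db.
            $this->cache[$id] = ['data' => $this->db[$id] ?? '', 'touched' => $now, 'dirty' => false];
        }
        $this->cache[$id]['touched'] = $now;
        return $this->cache[$id]['data'];
    }

    public function write(string $id, string $data, int $now): void
    {
        // During the "use spike" only the cache is written.
        $this->cache[$id] = ['data' => $data, 'touched' => $now, 'dirty' => true];
    }

    /** Run periodically: write idle sessions back to the db and evict them. */
    public function flushIdle(int $now): int
    {
        $flushed = 0;
        foreach ($this->cache as $id => $entry) {
            if ($now - $entry['touched'] < $this->idleSeconds) {
                continue; // still active
            }
            if ($entry['dirty']) {
                $this->db[$id] = $entry['data'];
                $flushed++;
            }
            unset($this->cache[$id]);
        }
        return $flushed;
    }

    public function inDatabase(string $id): ?string
    {
        return $this->db[$id] ?? null;
    }
}

$store = new WriteBackSessionStore(300); // flush after 300s of inactivity

$store->write('sess1', 'step=1', 1000);
$store->write('sess1', 'step=2', 1010);

$earlyFlush = $store->flushIdle(1100); // only 90s idle, nothing is written back
$lateFlush  = $store->flushIdle(1400); // 390s idle, written back and evicted

echo "Flushed early: $earlyFlush, flushed late: $lateFlush\n";
```

The window between the last write and the flush is exactly the data you lose if the cache dies, so the idle threshold is the knob that trades database load against that risk.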
Hi Richard,
Not a dumb idea at all. That very issue is what caused me to write up the blog post
Indeed, the original idea behind memcache was to be a “fast frontend” to a slower backend. A DBMS is slower in comparison to memcached. But speed/performance is relative. By making memcached the “primary” store the roles are reversed hence the issue at hand.
I was somewhat forced to implement a memcache-based solution at the time of writing this post. So I settled on reading/writing to a pool of memcache servers. That gives you somewhat reliable yet kludgy data redundancy
Today things have changed. I now have the freedom to choose whatever solution I think suitable. For the needs of the project I’m currently working on I switched to using Zend_Session_SaveHandler_DbTable (with mysql) becoz my greatest concern is data consistency/redundancy.
MySQL certainly won’t be as quick as memcached but it is quick enough. MySQL, when configured correctly, is quick. It is the exception to the rule that “RDBMSes are slow-w-w-w”
I love memcached. It’s a great piece of tech innovation. But everything in its right place…
Zend_Session_SaveHandler_DbTable is also very very easy to setup. Little code reqd
Thanks for the reply, I will have a look at the zend thing (I am a huge supporter of ZendStudio, though I hate the Eclipse version), but I think I’m gonna use the write-back caching method with mysql+memcache. I nearly killed our NAS with session traffic last month (+2 million flat files at any one moment) and I have the luxury of time and resources to try and create a ‘near perfect solution’ (well I can dream can’t I!)
You could have a look at memcachedb instead…
what a great post!
i was looking for a solution to some of the (small..) issues i was having with sessions remaining locked while being called by another script (ajax…)…
two questions:
a) i just installed memcached via yum (running centos 5 latest, memcached came off rpmforge…)… any tweaking i should do to the memcached conf?
b) performance wise… since the sessions will now be stored in memory… im assuming this should be a tad faster? or does the in / out into memcached overhead negate any performance improvements? everythings running off 1 box btw, so it’ll be accessing localhost.
once again, thank you kindly for for a nice quick solution to an issue iv been pondering for a few days now
ciao!
for those that like colorful graphs… (you know you all do…)
a very apc-looking control panel for memcached:
http://livebookmark.net/journal/2008/05/21/memcachephp-stats-like-apcphp/
enjoy!
Hi,
Can I install php memcached on a cloud server running centos 5.2? If so, what are the steps?
Thanks
Madhav
Yes. I’ve actually done that on CentOS. CentOS packages are somewhat out-of-date. Best download memcached and compile from source
[...] Using memcache. We could install memcache extension with “session feature”, and configure the session store. Here is a instruction: http://phpslacker.com/2009/03/02/php-session-clustering-with-memcache/. [...]
Just wanted to thank you for the article. It saved my day.
Hi all,
I understand that when I set two memcached servers in “save_path”, sessions will be stored only in first instance, until it crashes.
There isn’t any loadbalancing between instances. Am I right ?
Could someone tell me if following parameters are useful ?
=> persistent=1&weight=1&timeout=1&retry_interval=15
Many thanks
actually sessions are stored in both instances until it crashes. how keys are stored in the server pool is dependent on which hash strategy is used. a “consistent” strategy will allow servers to be added and/or removed from the pool without remapping of keys.
memcache servers are independent and totally unaware of “peers”. the so-called load-balancing is done client-side ie. the hashing algorithm determines this
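A rough way to see why the hash strategy matters when the pool changes. The sketch compares a plain modulo mapping with a simple hash ring when one of three servers is removed. It is an illustration under my own assumptions, not the extension's actual implementation: the server names, the function names and the 100 points per server are all made up, and it assumes 64-bit PHP so crc32() is never negative.

```php
<?php
// Simplified illustration of "standard"-style vs "consistent"-style mapping.
// Not the memcache extension's real code; names and numbers are made up.

function moduloServer(array $servers, string $key): string
{
    return $servers[crc32($key) % count($servers)];
}

/** Place each server on the ring at several hashed points. */
function buildRing(array $servers, int $pointsPerServer = 100): array
{
    $ring = [];
    foreach ($servers as $server) {
        for ($i = 0; $i < $pointsPerServer; $i++) {
            $ring[crc32("$server#$i")] = $server;
        }
    }
    ksort($ring); // walk the ring in ascending hash order
    return $ring;
}

/** A key belongs to the first ring point at or after its own hash. */
function ringServer(array $ring, string $key): string
{
    $hash = crc32($key);
    foreach ($ring as $point => $server) {
        if ($point >= $hash) {
            return $server;
        }
    }
    return reset($ring); // wrapped past the last point, use the first one
}

$three = ['a:11211', 'b:11211', 'c:11211'];
$two   = ['a:11211', 'b:11211']; // server c removed from the pool

$ring3 = buildRing($three);
$ring2 = buildRing($two);

$movedModulo = 0;
$movedRing = 0;
$movedFromSurvivor = 0; // ring keys that moved although their server is still there

for ($i = 0; $i < 1000; $i++) {
    $key = "sess$i";
    if (moduloServer($three, $key) !== moduloServer($two, $key)) {
        $movedModulo++;
    }
    $before = ringServer($ring3, $key);
    $after  = ringServer($ring2, $key);
    if ($before !== $after) {
        $movedRing++;
        if ($before !== 'c:11211') {
            $movedFromSurvivor++;
        }
    }
}

echo "Keys remapped with modulo: $movedModulo of 1000\n";
echo "Keys remapped with ring:   $movedRing of 1000\n";
echo "Ring keys moved off a surviving server: $movedFromSurvivor\n";
```

On the ring, only keys that lived on the removed server get a new home, which is the property that makes adding and removing servers less painful. The sessions that were on the removed server are of course still gone.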
Ok…understood.
What are the best practises about parameters : persistent,weight, timeout, retry_interval for session management ?
set the parameters according to your unique requirements. there is no “best practice” that is the “best” for every situation
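If you do end up setting those parameters per server, building the save_path string by hand gets error-prone. A small helper along these lines can assemble it. The parameter names are the ones from the question above; the values are placeholders rather than recommendations, and savePathFor() is a made-up name.

```php
<?php
// Parameter names (persistent, weight, timeout, retry_interval) come from the
// question above. The values are placeholders, not tuning advice.
function savePathFor(array $servers): string
{
    $urls = [];
    foreach ($servers as $address => $params) {
        // Force a plain "&" separator regardless of arg_separator.output.
        $query = http_build_query($params, '', '&');
        $urls[] = "tcp://$address" . ($query !== '' ? "?$query" : '');
    }
    return implode(', ', $urls);
}

$params = ['persistent' => 1, 'weight' => 1, 'timeout' => 1, 'retry_interval' => 15];

$savePath = savePathFor([
    'localhost:11211' => $params,
    'localhost:11212' => $params,
]);

echo $savePath, "\n";

// Then, before session_start():
// ini_set('session.save_path', $savePath);
```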
One of the best information pages about memcached-session-storage, thank you!
Hi,
I have some question regarding the proper implementation of sessions.
On my site, I have a shopping cart that uses session_set_save_handler(), and on another page, I also have a form with captcha validation that uses ini_set('session.save_handler', 'files').
It turns out that there’s a conflict between the two.
When I visit the form ( has captcha ), my cart value becomes empty.
I guess, session.save_handler overwrites the value of session_set_save_handler().
Any idea guys on how to fix this?
well firstly, there’s nothing wrong with the behaviour of session.save_handler. Secondly, there’s no reason why ur shopping cart data cannot share the same storage as ur captcha data. thats what “namespaces” are for
$_SESSION['cart_data'] = 'blah';
$_SESSION['captcha_data'] = 'blah';
Does this solution ensure that session ids will be unique across the cluster? What is preventing duplicate session ids from being assigned?
php-memcache extension does not overwrite php’s session-handling mechanism; it merely implements a session save handler (http://www.php.net/manual/en/function.session-set-save-handler.php), so garbage collection and id generation are the same as if you used the default “files” save handler for php sessions. that answers both of ur comments
interesting idea to use memcachedb as persistent storage. u should run more than one instance of memcachedb for redundancy sake and bear in mind that memcachedb DOES NOT implement the full memcache protocol so it may not be compatible with php’s memcached extension
We are experimenting with php-memcache extension and memcachedb back-end.
memcachedb provides persistent storage. Does anyone know if the garbage collection is handled by the php-memcache front-end? Or do we have to manually delete the expired session ids ourselves from the back-end db?
Thank you.
Your solution is very useful. I’ve tested and it works fine.
before it, I’ve changed php.ini parameters but it failed.
Hi,
Thanks for a great article. I have a question about using this session clustering technique across two round-robin load-balanced LAMP servers.
From my understanding of this article, it should be possible to share sessions across two webservers by defining the save_path on each server to have a local memcache store AND a remote memcache store:
session.save_path="tcp://localhost:11211, tcp://192.168.0.[$othermachine]:11211"
That way, the users session should be accessible from whatever server they get landed on and 50% of the time it will be on the localhost and so avoid network traffic? Am I correct or did I misunderstand something?
I haven’t got it to work in testing but I’m thinking that on CentOS 5.5, the PHP memcache extension in the repos might not support transparent replication. It seems the session is only in the first instance and when the user gets sent to the other server, it just creates a new session in the new server’s local instance (therefore the user is logged out). i.e. it doesn’t seem to peek inside each instance in turn to work out whether it can find the session – it just peeks into the first instance and then creates a new session if it finds nothing.
Look. The solution i posted for session clustering is a fiery hack. FYI Memcached does not do replication between instances
For the default php.ini memcache configuration both servers should have an identical php.ini line like this:
session.save_path="tcp://primary_memcached_instance:11211, tcp://secondary_memcached_instance:11211"
That’s because the default hashing method is “standard”, not “consistent”. The type of hashing algorithm (chosen for memcached) determines how the keys are distributed across the pool of memcached servers. Consistent hashing would offer full failover. Then you can use the settings you describe in your comment to achieve 50% hit/miss on local reads
See http://www.php.net/manual/en/memcache.ini.php#ini.memcache.hash-strategy
Thanks for replying, again really useful info.
I used the technique to successfully get sessions to transparently failover. i.e. the user was not logged out if I stop a pooled memcached instance on one of the machines.
I didn’t manage to use the technique to share sessions across two servers: http://serverfault.com/questions/164350/can-a-pool-of-memcache-daemons-be-used-to-share-sessions-more-efficiently
Hello,
thanks for the article and your following of discussion since then.
Related to the DB approach which many follow (the guy from the Lift framework among others), what about using MongoDB instead of MySQL ?
I’m currently wondering what would be the best approach between MongoDB or simply use Zend cluster Manager. What do you think of ?
Does someone know the price of Zend cluster Manager by the way ?
This is not a general help forum but here’s the best possible advice anyone can give you. You have a lot of questions. Nobody can make those decisions for you. You have specific needs/requirements for your unique environment. You need to gather all the facts and make a decision yourself for yourself. Let go of broad-spectrum best practices. Every problem has context. Seek it. Understand it
OR
hire a consultant at $$$$ / hour