Thursday, March 13, 2008

Hibernate and Memcached

As I mentioned at the end of my last post, Cache Everything, I am working on building a library to use memcached from Hibernate as a second level cache. That project is hibernate-memcached and its in it's infancy over at googlecode.

The Hibernate second level cache is used to cache items between session invocations. Hibernate always uses it's session as a first level cache, but if you want to avoid database hits between sessions you need a second level cache. If you want these caches to stay in synch between several instances of your application, you need a distributed second level cache.

Memcached is a fantastically fast distributed cache, so why not use that? Once I build a jar of the hibernate-memcached project you'll be able to include it in your application and tell Hibernate to use it. Telling Hibernate to use it means adding the cache_provider property to your hibernate.cfg.xml, or your Spring LocalSessionFactoryBean if you're using that. The hiberante.cfg.xml looks like this...

<hibernate-configuration>
<session-factory>
<property name="hibernate.cache.provider_class">com.googlecode.hibernate.memcached.MemcachedCacheProvider</property>
<mapping class="com.yadda.yadda.Something"/>
</session-factory>
</hibernate-configuration>


Then, in your mappings you tell Hibernate that you want instances of your persistent objects to be cached, or collections from relation ships to be cached. Once Hibernate knows what you want cached you can watch it pump stuff into memcached, which is good clean geeky fun. Fire up memcahed (on windows or linux) using the -vv command line option to make it run "very verbose". You can see it storing and responding for each operation taken.

You can play with the code straight out of subversion at google code. It will actually work and cache stuff, but it is currently hardcoded to look only at localhost:11211 for a memcached instance. I'll get around to making it configurable. To build the code you'll need Maven2. I highly recommend using Don Brown's no-suck port of maven2 actually.

More to come...

Cache Everything

So I'm working on this Web Service project and its time to consider scalability. Which means its time to setup memcached. If you haven't looked at using memcached, you should.

Memcached is a totally stupid-simple distributed memory object cache. It's this super tiny application that slices off a chunk of RAM and listens on a port. Yeah, that's about it :)

It gets cool to talk about when you start talking about making it "distributed". When you start talking about "Distributed Caches" people start thinking about Coherence and other hugely complicated and hugely expensive enterprise applications. It really dosen't have to be so hard.

Here's what you do. Get a couple cheap boxes with one or two gigs of ram in them. Install the Linux flavor of your choise. Install memcached and run it with "memcached -d -m 1024". This will start memcached on it's default port (11211) and allocate 1gb of ram to the cache. If you do this on three boxes, you now have a 3gb distributed cache.

There are client libraries available for most popular languages. If you're using Java I'd highly recommend using spymemcached from Dustin Sallings.

So you give the clients a list of servers to connect to and they will basically treat that big distributed cache you created as a huge hashtable. Each get/set operation has a string key that is hashed and used to decide what "partition" of the cache to read/write to. If an instance goes down the client just begins distributing reads/writes to the other instances until the missing member comes back.

Using any of the client libraries, in just about any language, your usage looks like this:

  • Check the cache for your object.

  • If your object is not null, return the object.

  • Otherwise, read the object from the database or what have you.

  • Put the object in the cache.

  • Return the object.



Memcached uses a very light and fast binary protocol which makes it usable by just about any language. There are pre-built packages in apt if you're using Ubuntu, you can easily build it from source for just about any other *nux flavor. There's even a windows port. I've only used that for dev at this point though.

I'm working on writing a second level cache for hibernate that uses memcached. It is super simple, but I just started work on it last night (seriously). Check it out if you want. I plan to work on it over the next few nights and get it ready for prime time :)

Wednesday, March 05, 2008

Open Source Licensing: WTFPL

I think I'm going to start using this license for any open source stuff I do...
WTFPL