A couple of thoughts about Hibernate, caches and OQL

by perfstories

I needed to clarify several things about Hibernate for my work: what is the query cache? Does it have regions? What is the update timestamps cache? Does it have regions or not?

In general, there are a lot of articles, papers, notes, manuals, etc. on the subject. One excellent example is http://tech.puredanger.com/2009/07/10/hibernate-query-cache/.

After reading it we get some understanding of the query cache and the update timestamps cache: the query cache uses the query and its bound variables as keys, and the update timestamps cache keeps a record for each table that tells whether the table was modified after the query result was cached.
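A rough model of the query cache key described above may help. This is only a sketch in plain Java, not Hibernate's code: the class QueryCacheSketch, its CachedResult holder and the put/get methods are hypothetical names, and storing entity ids next to a timestamp is an assumption about what a cached entry looks like.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a query cache; hypothetical, not Hibernate's implementation. */
public class QueryCacheSketch {

    /** A cached result plus the moment it was stored (assumed structure). */
    public static final class CachedResult {
        public final List<Long> entityIds;
        public final long cachedAt;

        CachedResult(List<Long> entityIds, long cachedAt) {
            this.entityIds = entityIds;
            this.cachedAt = cachedAt;
        }
    }

    // Key = query text followed by the bound parameter values.
    private final Map<List<Object>, CachedResult> results = new HashMap<>();

    static List<Object> key(String query, Object... params) {
        List<Object> key = new ArrayList<>();
        key.add(query);
        key.addAll(Arrays.asList(params));
        return key;
    }

    public void put(long now, List<Long> entityIds, String query, Object... params) {
        results.put(key(query, params), new CachedResult(entityIds, now));
    }

    /** Returns null when this query was never cached with these parameters. */
    public CachedResult get(String query, Object... params) {
        return results.get(key(query, params));
    }

    public static void main(String[] args) {
        QueryCacheSketch cache = new QueryCacheSketch();
        cache.put(100L, Arrays.asList(1L, 2L), "from Cat where name = ?", "Tom");
        // Same query text, same parameter: hit.
        System.out.println(cache.get("from Cat where name = ?", "Tom") != null);
        // Same query text, different parameter: miss.
        System.out.println(cache.get("from Cat where name = ?", "Felix") != null);
    }
}
```

The point of the sketch is only that the same query text with different bound values produces a different cache entry, and that each entry remembers when it was cached so it can later be compared with table update timestamps.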

So I started debugging. When we try to get a result from the query cache, we ask the update timestamps cache whether the tables involved are up to date. Here is org.hibernate.cache.UpdateTimestampsCache, decompiled:

public synchronized boolean isUpToDate(Set spaces, Long timestamp)
        throws HibernateException {
    Iterator iter = spaces.iterator();
    while (iter.hasNext()) {
        Serializable space = (Serializable) iter.next();
        Long lastUpdate = (Long) this.updateTimestamps.get(space);
        if (lastUpdate != null) {
            if (log.isDebugEnabled()) {
                log.debug("[" + space + "] last update timestamp: "
                        + lastUpdate + ", result set timestamp: "
                        + timestamp);
            }
            if (lastUpdate.longValue() >= timestamp.longValue()) {
                return false;
            }
        }
    }
    return true;
}

OK, everything’s fine! But what is the answer if there is no record with such a key in the update timestamps cache? You will laugh, but I didn’t understand this for a long time :).

The next thing that was unclear to me: why are there far fewer elements in the update timestamps cache than there are cached entities in my application?

The answer to both questions is very simple: Hibernate puts information about an entity into the update timestamps cache only after the entity is modified. Since some of the entities in my application are read-only, the update timestamps cache never contains information about them. And the absence of a record in the update timestamps cache means the entity is up to date: in the code above a null lastUpdate is simply skipped, so the method returns true.
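The rule above can be played with in a small stand-alone sketch. It mirrors the logic of the decompiled isUpToDate method, but the class TimestampsSketch, the invalidate method and the table names are hypothetical, and a plain HashMap stands in for the real cache.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/** Toy model of the update timestamps cache; hypothetical, not Hibernate's class. */
public class TimestampsSketch {

    // Table name ("space") -> time of its last modification.
    // A table that was never modified has no entry at all.
    private final Map<String, Long> updateTimestamps = new HashMap<>();

    /** Called when a table is modified (assumed hook, named here for illustration). */
    public void invalidate(String space, long now) {
        updateTimestamps.put(space, now);
    }

    /** Same decision as the decompiled method: a missing entry never invalidates. */
    public boolean isUpToDate(Iterable<String> spaces, long resultTimestamp) {
        for (String space : spaces) {
            Long lastUpdate = updateTimestamps.get(space);
            if (lastUpdate != null && lastUpdate >= resultTimestamp) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        TimestampsSketch cache = new TimestampsSketch();
        cache.invalidate("ORDERS", 200L);

        // Read-only table, never modified, no entry: treated as up to date.
        System.out.println(cache.isUpToDate(Arrays.asList("COUNTRIES"), 100L));
        // Table modified after the result was cached: stale.
        System.out.println(cache.isUpToDate(Arrays.asList("ORDERS"), 100L));
        // Result cached after the last modification: up to date.
        System.out.println(cache.isUpToDate(Arrays.asList("ORDERS"), 300L));
    }
}
```

This also shows why the cache stays small: only tables that were actually modified ever get an entry.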

So, I promised a couple of thoughts in the headline. Where is the second one? 😉

The next interesting point for me was OQL (Object Query Language). I can’t tell you a lot about OQL’s history and ideology. From a practical point of view it’s very simple: we open a heap dump with some tool (JHat, Eclipse MAT) and use an SQL-like language to query information about the objects in our heap. The syntaxes of JHat’s OQL and Eclipse MAT’s OQL differ. I experimented with Eclipse MAT a little bit.

OQL could be really useful. For example:

  • You can check the actual number of instances. Say, I read in the article mentioned above: “There is only ever one timestamp cache shared by all query caches”. Yeah, I believe it, but I see in the code that the org.hibernate.impl.SessionFactoryImpl constructor creates an UpdateTimestampsCache. I have several SessionFactoryImpl instances, so will I have several update timestamps caches?

    To answer this question I should look at the code and note that UpdateTimestampsCache has a field called updateTimestamps, which has a field called cache, where all the data is stored. So the question is: do all UpdateTimestampsCache instances have the same updateTimestamps.cache?

    As you can see, all the objects are the same except one. But I have additional information: that Cache is loaded by another class loader (another EJB application in my application server). So the answer to my question is: yes, there is indeed a single update timestamps cache per application.

  • The next useful hint: you can look at the runtime cache settings. Sometimes it is useful to understand what we really have in production :): SELECT toString(name), maxElementsInMemory, timeToLiveSeconds, timeToIdleSeconds, overflowToDisk FROM net.sf.ehcache.Cache.
  • And a really nice thing is cache statistics: SELECT toString(name), memoryStoreHitCount, diskStoreHitCount, hitCount, missCountNotFound, missCountExpired FROM net.sf.ehcache.Cache. As far as I can see from the code, statistics in net.sf.ehcache.Cache are always collected. You don’t need to enable any config keys to turn them on!