20110513

Speed up your JDBC queries with MetaModel

Although the recent MetaModel version 1.7.1 was a minor release there is something about this release that excites me very much! That thing is our new FETCH_SIZE calculator. So what is that you might say? Well, here’s a little background.

In JDBC there is an option to set the so-called FETCH_SIZE of a query statement, like this:

Statement st = connection.createStatement();
st.setFetchSize(10000);
ResultSet rs = st.executeQuery(...);

What this does is that it tells the database how many rows should be fetched from the database when more rows are needed. In other words: A buffer size, measured in the amount of rows.

Cool – so this means you should set a rather high fetch size if you have plenty of memory? Say 20.000 records? Well, sometimes yes, but not quite at all times!

The guys over at Oracle has done an exemplary work on documenting the memory consumption of their latest JDBC driver. From this you can learn that if you put a too high fetch size, you might run out of memory! But on the other hand you would really like to optimize the buffering of your result sets! The trouble is that the row size in memory is hugely different if you select 1, 2, 10 or 300 columns in your query! Furthermore the size of a VARCHAR(4000) value in the buffer is about 8000 times larger than a BIT value! These findings lead to a very nice new feature in MetaModel: Dynamically setting an appropriate fetch size, depending on the query you are executing...

How does this work? Well basically MetaModel has all the needed meta-information for doing a strictly mathematical calculation. MetaModel knows the data types of all your columns, it knows their column sizes and most importantly – it knows which columns you are querying! So if we want to allow eg. up to 8 megs of memory to be used for the query buffer, then it’s very easy for us to figure out the needed FETCH_SIZE – in principle it’s just to divide the available memory with the size of a single row in the buffer.

So far I’ve tested it out together with some friends of DataCleaner that have both small tables and 200-300 column tables and they see a dramatic improvement in performance as well as it prevents a few OutOfMemory issues!

A final remark – I haven’t seen any other Java frameworks that do something like this. Even Hibernate only has the option to specify a constant FETCH_SIZE:
<property name="hibernate.jdbc.fetch_size">10000</property>

But as explained in this blog entry you will hopefully agree that such an approach is not optimal unless your application only executes a single query again and again!

For more information on MetaModel, please visit the website.

No comments: