The MetaModel project lets you do wonderful and advanced things like filtering, ordering, grouping etc. when working with otherwise static content in CSV files, Excel spreadsheets, XML files etc. Many times I have knocked my head to the door when trying to get simple summary data out of my excel spreadsheets or simply filtering on single rows in the sheets.
An example: Recently I was working on my tax return statement and throughout the year I had recorded all relevant activities in a simple spreadsheet. I had marked the activities on various accounts such as representation, spendings, earnings whatever. Simplified it looked something like this:
Title | Amount | Account |
---|---|---|
Asus EEE | 1.600 | Spendings |
Advisory for Company X | 4.000 | Earnings |
One thing I think is funny when dealing with Excel is that throughout time you've learned how to set up your spreadsheet for easy hacking instead of dynamic querying. What I mean by this is that if I wanted to get SUM for each account in my spreadsheet I would propbably have fundamentally changed my (otherwise nice and pure) spreadsheet in order to make it easy to perform the SUM function on singular accounts. For example I may have introduced an ammount column for each account. But alas, adding a new Account would be pretty cumbersome.
Enter MetaModel! With MetaModel I can write queries in a SQL-like manner. When I say "SQL-like" you may fret and think "oh but don't learn me another SQL dialect, just give me the real deal!". The answer to this is that what I'll learn you now will effectively replace all SQL dialects because the same model (or MetaModel ;)) is usable for all kinds of datastores: Excel, CSV/TSV files, XML and SQL-databases). Let's have a look at some code. First we'll load the Excel file into MetaModel and get a hold of our schema and table structure and identify our columns of interest:
// Use the factory to create a DataContext and automatically (true) narrow column types
DataContext dataContext = DataContextFactory.createExcelDataContext(new File("my_tax_return_activities.xls"), true);
Schema schema = dataContext.getDefaultSchema();
// A simple way of getting the table object is just to pick the first one (the first sheet in the spreadsheet)
Table table = schema.getTables()[0];
Column amountColumn = table.getColumnByName("Amount");
Column accountColumn = table.getColumnByName("Account");
Now that we have the columns represented and the DataContext which we can use to perform queries, here's how we make a couple of interesting queries using the MetaModel query API:
// SELECT SUM(Amount) FROM sheet WHERE Account = "Spendings"
Query q = new Query().select(new SelectItem(FunctionType.SUM, amountColumn)).from(table).where(accountColumn, OperatorType.EQUALS_TO, "Spendings");
// SELECT Account, SUM(Amount) FROM sheet GROUP BY Account
Query q = new Query().select(accountColumn).select(new SelectItem(FunctionType.SUM, amountColumn)).from(table).groupBy(accountColumn);
Now when we've authored the queries we can let MetaModel execute them and deal with the result as appropriate. If we just want to print out the results to the console we'll do something like this:
DataSet dataSet = dataContext.executeQuery(q);
while (dataSet.next()) {
System.out.println(dataSet.getRow());
}
dataSet.close();
30 comments:
it helped me a lot. congrats for you exceptional piece of work.
one question. is it possible to load directly one HSSFWorkbook object?
Hi Jorge,
Glad you found it useful! Currently what you are asking is not possible. But if you feel like it -take a look at the source code for MetaModel. I'm sure it will be an easy patch to develop and I will welcome it for sure.
i'll sure do that. i've read your blog entirely and i'm amazed about your work. i've only tried metamodel to query some excel files and didn't notice that metamodel is just the tip of the iceberg. congratulations
Hi Jorge,
I'm honored that you think my work is impressive! Thank you very much.
If you want to join in on the development I recommend that you add your email to this mailing list: http://groups.google.com/group/metamodel
Currently the mailing list is rather dead, but we can change that of course :-) Please email in your idea about the HSSFWorkbook constructor for the ExcelDataContextStrategy and we'll take it from there! Looking forward to having you on board :)
Hi.
Kasper, can you explain how to use 'where' clause to query data from two worksheets? How to connect them? Is it possible?
Hi Dimon4k,
Actually just yesterday I answered that question at the eobjects forums so I'm just going to direct you with a link :)
Hope it works for you:
http://eobjects.org/trac/discussion/7/76
Hi Kasper,
Thanks a lot for the useful information.
I tried the same example and I got correct result. However, Im trying to create a web service that get data from Excel, when I deployed my application, I got the error "NoClassDefFoundError" for the DataContext. I have included the jar file in the project. I'm currently using Eclipse and JBoss.. I hope you can advice me.
Thanks a lot
Entisar
Hi entisar,
My only guess would be that you are maybe not including all the needed JARs. If you need to use the excel spreadsheet functionality, you will need:
MetaModel-full, MetaModel-excel, MetaModel-querypostprocessor, MetaModel-core.
It just occurs to me that we should probably distribute an assembled jar as well. But that's probably for you to wait for MetaModel 1.1 for :)
Thanks a lot for your reply.
i've checked it and kept only these four JARs, but Im getting now
ava.lang.NoClassDefFoundError: dk/eobjects/metamodel/MetaModelException
at java.lang.Class.getDeclaredMethods0(Native Method)
at java.lang.Class.privateGetDeclaredMethods(Unknown Source)
at java.lang.Class.privateGetPublicMethods(Unknown Source)
at java.lang.Class.getMethods(Unknown Source)
at org.jboss.ws.metadata.builder.jaxws.JAXWSMetaDataBuilder.processWebMethods(JAXWSMetaDataBuilder.java:885)
at org.jboss.ws.metadata.builder.jaxws.JAXWSWebServiceMetaDataBuilder.buildWebServiceMetaData(JAXWSWebServiceMetaDataBuilder.java:150)
at org.jboss.ws.metadata.builder.jaxws.JAXWSServerMetaDataBuilder.setupProviderOrWebService(JAXWSServerMetaDataBuilder.java:52)
at org.jboss.ws.metadata.builder.jaxws.JAXWSMetaDataBuilderJSE.buildMetaData(JAXWSMetaDataBuilderJSE.java:63)
at org.jboss.wsf.stack.jbws.UnifiedMetaDataDeploymentAspect.create(UnifiedMetaDataDeploymentAspect.java:66)
at org.jboss.wsf.framework.deployment.DeploymentAspectManagerImpl.deploy(DeploymentAspectManagerImpl.java:118)
at org.jboss.wsf.container.jboss42.ArchiveDeployerHook.deploy(ArchiveDeployerHook.java:97)
at org.jboss.wsf.container.jboss42.DeployerInterceptor.start(DeployerInterceptor.java:90)
at org.jboss.deployment.SubDeployerInterceptorSupport$XMBeanInterceptor.start(SubDeployerInterceptorSupport.java:188)
at org.jboss.deployment.SubDeployerInterceptor.invoke(SubDeployerInterceptor.java:95)
at org.jboss.mx.server.Invocation.invoke(Invocation.java:88)
at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
at org.jboss.mx.util.MBeanProxyExt.invoke(MBeanProxyExt.java:210)
at $Proxy45.start(Unknown Source)
at org.jboss.deployment.MainDeployer.start(MainDeployer.java:1025)
at org.jboss.deployment.MainDeployer.deploy(MainDeployer.java:819)
at org.jboss.deployment.MainDeployer.deploy(MainDeployer.java:782)
at sun.reflect.GeneratedMethodAccessor21.invoke(Unknown Source)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
at java.lang.reflect.Method.invoke(Unknown Source)
at org.jboss.mx.interceptor.ReflectedDispatcher.invoke(ReflectedDispatcher.java:155)
at org.jboss.mx.server.Invocation.dispatch(Invocation.java:94)
at org.jboss.mx.interceptor.AbstractInterceptor.invoke(AbstractInterceptor.java:133)
at org.jboss.mx.server.Invocation.invoke(Invocation.java:88)
at org.jboss.mx.interceptor.ModelMBeanOperationInterceptor.invoke(ModelMBeanOperationInterceptor.java:142)
at org.jboss.mx.server.Invocation.invoke(Invocation.java:88)
at org.jboss.mx.server.AbstractMBeanInvoker.invoke(AbstractMBeanInvoker.java:264)
at org.jboss.mx.server.MBeanServerImpl.invoke(MBeanServerImpl.java:659)
at org.jboss.mx.util.MBeanProxyExt.invoke(MBeanProxyExt.java:210)
at $Proxy9.deploy(Unknown Source)
at org.jboss.deployment.scanner.URLDeploymentScanner.deploy(URLDeploymentScanner.java:421)
at org.jboss.deployment.scanner.URLDeploymentScanner.scan(URLDeploymentScanner.java:610)
at org.jboss.deployment.scanner.AbstractDeploymentScanner$ScannerThread.doScan(AbstractDeploymentScanner.java:263)
at org.jboss.deployment.scanner.AbstractDeploymentScanner$ScannerThread.loop(AbstractDeploymentScanner.java:274)
at org.jboss.deployment.scanner.AbstractDeploymentScanner$ScannerThread.run(AbstractDeploymentScanner.java:225)
Caused by: java.lang.ClassNotFoundException: dk.eobjects.metamodel.MetaModelException
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1358)
at org.apache.catalina.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1204)
at java.lang.ClassLoader.loadClassInternal(Unknown Source)
Im not expert in setting up the environment properly, i dontknow if how do i obtain the assembled JAr.
Entiser
Hi Entisar,
There must be some other explanation (another cause in the stack trace maybe?) because the class that it is missing (MetaModelException) is in the MetaModel-core.jar file.
I'm writing this from a borrowed machine so I can't make a testcase (for the next couple of weeks actually) myself... Are you manually adding the jar's or are you using Maven? If you use Maven it gets so much easier to manage the dependencies, but that's a long story :)
inglyThanks a lot Casper.
Im addind the jars in eclipse manually.
I dont have any knowledge about the maven.
I tried deleting them and add then again. but still giving errors.
I liked your API it seems clean and staraight forward comparing with other APIs i found for Excel.
but I may try another one.
Thanks again,
Ent.
Hi Entisar,
It just occurred to me, that one thing you might be forgetting is to include the Apache POI jars? As stated on the MetaModel, they are required for using MetaModels excel support. I believe that they are in the lib folder of the distributable zip.
Hi,
I downloaded latest version of your MetaModel (1.1.5) and tried your example to query an Excel spreadsheet.
The problem is that I can't even compile since there is not a select(FunctionType, Column) method provided by Query object!
I tried downloading some older versions of your project, but the problem remains.
Can you help me? What am I doing wrong?
Thanks!
Hi Alberto,
Oops, you're right ... That reveals that I made the example by hand and not in eclipse ;-)
The fix is:
query.select(new SelectItem(FunctionType, Column))
Sorry for the inconvenience, I will correct the example right away.
Thanks, that worked ;)
But I'm not done bothering you...
I'm doing some experiments with your MetaModel. I'm trying to query a spreadsheet with a financial report of some sort.
But I'm stuck at opening and reading the XLS file, with this exception which I don't understand:
Exception in thread "main" java.lang.IllegalStateException: Exception reading from file: d:\report.xls
at dk.eobjects.metamodel.ExcelDataContextStrategy.materializeMainSchemaTable(ExcelDataContextStrategy.java:106)
at dk.eobjects.metamodel.QueryPostprocessDataContextStrategy.materializeTable(QueryPostprocessDataContextStrategy.java:286)
at dk.eobjects.metamodel.QueryPostprocessDataContextStrategy.materializeFromItem(QueryPostprocessDataContextStrategy.java:231)
at dk.eobjects.metamodel.QueryPostprocessDataContextStrategy.executeQuery(QueryPostprocessDataContextStrategy.java:107)
at dk.eobjects.metamodel.QueryPostprocessDataContextStrategy.transformColumnTypes(QueryPostprocessDataContextStrategy.java:154)
at dk.eobjects.metamodel.QueryPostprocessDataContextStrategy.transformColumnTypes(QueryPostprocessDataContextStrategy.java:143)
at dk.eobjects.metamodel.QueryPostprocessDataContextStrategy.autoTransformColumnTypes(QueryPostprocessDataContextStrategy.java:193)
at dk.eobjects.metamodel.DataContextFactory.createExcelDataContext(DataContextFactory.java:156)
at it.previnet.bittoa.metamodel.ExcelQuery.main(ExcelQuery.java:17)
Caused by: java.lang.NullPointerException
at dk.eobjects.metamodel.ExcelDataContextStrategy.getCellValue(ExcelDataContextStrategy.java:168)
at dk.eobjects.metamodel.ExcelDataContextStrategy.materializeMainSchemaTable(ExcelDataContextStrategy.java:81)
... 8 more
Can you help me?
Thanks again ;)
I've been looking into this and I'm affraid you've found a "real" bug this time!
I will fix it right away and post a fix-release. Should be available in a couple of hours, so bear with me :)
Alright. MetaModel 1.1.6 is out. I'm uploading to the maven repository as I write this message, but you should be able to download it from google code already if you just need the jar. Let me know how it works.
It seems to work fine, now!
Thank you very much :)
Hi, i just want to ask how to get formula values of excel columns? when i tried using dataset.getRows(), i only get the formula and not the result value of the formula. thanks.
Hi Tina,
Formula calculation is currently not supported. I will add a feature request for it, because I actually think that POI (the underlying excel api that we use) is able of calculating at least rather simple formulas, so it might be a feature that is quite easy to add.
Thanks for the quick response, I hope you can add this feature on your next release it would be a great help for me.. thanks again :-)
Hi Tina,
Actually I got quite excited when looking into how easy it was to evaluate the formulas so I just implemented the solution today :)
It will be released together with the upcoming MetaModel 1.5 within the next coming weeks. If you can't wait, you can also check it out from SVN yourself and build the latest snapshot. It's available here: http://eobjects.org/svn/MetaModel/trunk
Hi Tina and others,
MetaModel 1.5 is now out :)
Read more and get it at:
http://metamodel.eobjects.org/
Hi,
Can you please help me understand, how can I query from a particular worksheet within the Excel workbook.
vk - each sheet is represented as a table. So simply use schema.getTableByName("foobar") and then you have your sheet to query from.
Hi Guys,
See if this freeware helps you. You can download it from the following link. The freeware product name is : FILLO.
http://www.codoid.com/products/view/2/29
-Shiva
Can we have an option for multiple
WHERE condition using metadata..?
Pls Help.
Hi Roshin,
Multiple WHERE items is already possible. If you're missing something, please elaborate some more.
Here there is a working example: http://stackoverflow.com/questions/38105932/apache-metamodel-bad-performance-querying-spreadsheet :)
Post a Comment