March 20th, 2009
For about three months now I've been using an open source BI application called Pentaho. I am somewhat new to BI, and thinking in data-warehouse terms took some getting used to. Now that I have the ETL under control, reporting is starting to be a lot of fun. With some relatively simple queries you can make powerful reports, and the use of dimensions gives you nice grouping options.

Pentaho itself is a collection of BI applications. It features ETL, data analysis, reporting and a scheduler. I only use the ETL, reporting and scheduling tools.

Using it has been somewhat frustrating at times, mainly due to the total lack of documentation. Documentation does supposedly exist, but I think it's only available to paying customers. I would personally also like to see a bit more integration between the different applications. For instance, I have a report, and on this report I want to show the parameters I used to create it. This can be done, but it requires first prepping the report, publishing it, then editing the .xaction file by hand, and finally publishing the edited .xaction file. Whenever I change something in the report and republish it, I have to go through all these steps again. Showing the calling parameters on the report seems to me such an obvious feature that I can't understand why it hasn't been implemented.

The ETL part of the tool, though, is exceptionally powerful. There are many built-in steps available. If a step isn't there, you can always create your own in Java. Jobs and transformations can be nested indefinitely. You can also insert JavaScript into jobs and transformations to do complex data manipulation.

For more information take a look at: