n yesterdays posting, I took a look at some of the practicalities behind my recent article on OBIEE “next-generation” architectures. The idea behind this architecture is that you can use the connectivity features of OBIEE to initially report against your data in-place, and then behind the scenes take this data, load it into a data warehouse and then “re-wire” OBIEE so that it eventually reports against the data warehouse, all without the metadata layer and user reports getting interrupted. Whilst this is a nice idea in principle, I was interested to see how well it worked in practice, and so I built an OBIEE metadata layer against three application data sources and created a logical model that spanned across all of them.
In todays posting, I’m going to take the data held in these three application schemas and create a data warehouse out of them using Oracle Warehouse Builder 10.2.0.3. Before I do this though, one thing I have done is extend the logical model in OBIEE to three fact tables and a few more dimensions, so that it becomes a complete dimensional model over all of my source data. When I come to create the data warehouse, it’ll have fact tables corresponding to the four fact tables in my logical model, and dimensions to match the total set of dimensions in the model.
Before I start the process, there’s a few issues that I can see cropping up with this migration. Firstly, OWB is going to create surrogate keys (or to be more precise, a “dimension key”) for each of the dimension tables, and it’s these rather than business keys such as CUST_ID, PROD_ID and so on that are going to join to the fact tables. The fact tables themselves will have surrogate/dimension keys rather than business keys in them, and together these issues are going to mean that the tables I map to in OBIEE are going to be keyed on different fields than before.
Another issue that immediately springs to mind is how OBIEE will handle slowly-changing dimensions. OWB has automatic support for slowly-changing dimensions in the tool, but it’s not clear (at least to me) how OBIEE will handle this. I’m actually going to cheat on this a bit and just use Type 1 dimensions in my example, but I’ll give some thought as I go on to how Type 2 dimensions, which can have multiple entries per dimension member, partitioned by time, will be handled.
The final issue is around all the extraneous columns and rows that OBIEE adds to the tables it creates, particularly the dimension tables. OWB creates a surrogate key for each dimension level, as well as a “dimension key” field that acts as table-wide surrogate key, which has the benefit of allowing slowly-changing dimensions and separation from changes to source system keys, but to the inexperienced it seems like an awful lot of keys. In addition, you get extra rows in your dimension tables for all the levels above the bottom level in your hierarchy, which in conjunction with the dimension key I mentioned earlier allows you to join to the dimension table at any level of aggregation, but which could confuse people if they just want to list out customers, or products for example. There are ways around this (OWB automatically creates views that at least remove the extra rows, and in fact the dimension key concept is quite simple once you work it out) but again it’ll need a bit of thought.
Anyway, the first thing I do is go into Oracle Warehouse Builder and create modules for the three source applications and the data warehouse that I’m going to create.
Then, I create the dimension and fact objects, together with their supporting tables, in the data warehouse module. In this instance, I create them as relational (ROLAP) objects as it’s a bit trickier, currently, to import MOLAP objects into OBIEE.
Once that’s done, I create the mappings to load data from the source tables into the dimension objects. OWB takes care of all the surrogate key handling for me, all I have to do is map in the source data.
Now I can deploy the dimension objects and tables to the Oracle database, and run the mappings to load data into them.
At this point, my dimension tables have data loaded in to them. The next thing to do is to map the transactional information from my source systems into the fact tables. Again, OWB takes care of converting business keys into surrogate keys, the only tricky thing is the daft way that OWB specifies the date business key as a number, in the format YYYYMMDD, which means I have to convert it from the date format used by the source into this numeric format, before I can load it into the fact table.
Finally, once all the data is loaded into the warehouse, I can take stock of the final OWB project, which looks like this:
and then load the resulting objects into my OBIEE repository, like this – note the views over the dimensions that I’ve brought in, I’ll come back to this in tomorrows’ posting.
So, that’s the data warehouse created from the source data I brought in to OBIEE yesterday. Although these steps seem all fairly simple, it was quite a time-consuming task to do this – it took me a good few hours to set up all the dimension and fact objects, make sure all the datatype lengths are correct, create the mappings, add all the constants and conversions that were needed and so on, so I’m looking forward to some sort of automation of this process in the future – automating the creation of a physical schema from the logical model to create the data warehouse schema would be an obvious easy win, and it should even be possible to automate the creation of the data mappings from the source systems to the warehouse tables.
Tomorrow though, I’ll be mapping the logical model in my OBIEE metadata layer to these new warehouse tables, and trying to see whether it’s possible to preserve all the reports and avoid changing the logical model around. It’ll be interesting to see how the table key changes are handled, and how easy it is to get rid of all the extra information OWB adds to the warehouse tables.