Reporting Against Multiple Datasources in OBIEE

I’m currently sitting in a hotel room in Budapest, with a day off between running seminars for Oracle University in Stockholm and Budapest. It was a fairly early flight so I’ve got a few hours free in the afternoon, which has given me the chance to add something to the seminar that I’ve been meaning to add for a long time – a section on joining in second and third data sources to an Oracle BI EE business model.
In the example I’m currently running with, I take a normalized schema and covert it into a denornalized, star schema using the mapping facility – logical table sources, logical joins – in the BI Administrator tool. This is a good “real world” example where the data you need to report on isn’t quite in the right star schema format to report on, but at the end of the day it’s all from the same, single data source and one of the major selling points of BIEE is it’s ability to bring data together from multiple data sources.
As a recap, my original physical model is based on the SOADEMO sample schema, with an additional TIMES table added in, and looks like this:
I then transform this into denormalized star schema with a single fact table, made up of the orders and items tables combined, and three dimension tables, like this:
In this new scenario, I have two additional physical data sources that I want to add to the business model. The first one is some sales quotas, dimensioned by month and product category, the physical model for which looks like this:
Taking a closer look at the tables and their columns, the first thing that jumps out at you is that this data source’s product dimension table starts at product category, whereas the product dimension in the business model starts at product. The same goes for time – this new data source starts at month, whereas the star schema in the business model goes down to day. Taking a look at the quotas table, it has a measure that is not currently found in the existing business model star schema, so we’ll want to add this in to the existing Items business model fact table, but only have it displayed when reports are being run at the product category and month level.
So, there’s two issues to solve here. Firstly, I need to join the two data sources so my quotas measure appears in the Items fact table, alongside quantity, price and so on. Secondly, this new data source is at a different level of granularity to the data in my business model, so I need to somehow tell BIEE to only use it at a specific level of summarization and upwards.
The first step then is to go to the physical model in BI Administrator, select all the new tables, and select the tables in the original physical model that they will join to. This includes the original items, orders, times and customers tables, like this:
Notice how the original schema, and the new one, are unconnected.
Now, I join the two schemas together, by joining the original product table to the new product category table, and the original times table to the new months table. The joins are from the month column in the original time table (not the primary key column, that’s the day (date) id) to the month column in the months table (which is it’s primary key), and the category column from the original product table (again, not it’s key, that’s product ID) to the category column in the new product category table. This establishes the link between the two schemas and means that we can introduce them into the same business model.
The join type I use is Foreign Key, and I draw the join going from the category table to the product table, and from the month table to the times table, as the former is the master and the latter is the detail table.
Now, I can drag the quotas measure from the new source on to the items fact table in the business model, which adds it to the list of measures and adds a second logical table source to the items logical table.
I then drag the category and months physical tables and place them on top of the products and times logical tables, which adds these two tables as logical table sources for the two logical tables.
Now as these two new dimension table logical table sources only apply at certain levels, I double-click on firstly the new category logical table source and then set it so that it only applies from the category level upwards, as defined by the product dimension object that’s also in the model.
This step is important as it’s vital that BIEE only goes to this additional logical table source when reporting at the category level and upwards.
Looking at individual columns within the product logical table, all of them except category still have the original, single data source pointing back to the CUSTDW physical schema. Category, though, has two logical table sources, as this column can be found in both the original CUSTDW schema, and the new one I’ve just introduced.
I then do the same for the times logical table, mapping in the new month physical table, and making sure that the MONTH_YYYYMM and MONTH_MON_YYYY columns now have two logical table sources, and that this new logical table source is only used from the month level upwards.
Finally I go to the items fact table, and edit the new logical table source so that it only applies at the correct levels. All of this of course relies on you having set up dimensions in your logical model, which you’ll also need to support drilling.
Then, after making sure the default aggregation method is set correctly for the new measure, I check the repository consistency and then go over to Answers to run a test report.
Not bad. One question though – can I still use the time dimension table along with a time dimension, so that I can use the AGO and TODATE functions?
Doesn’t look like it. BIEE is very picky with time dimensions, you also can’t use a physical table in a time dimension when it’s used in another business model not in a time dimension. I guess this is a similar sort of situation. Which gets me to my last point – although this is a pretty cool feature, don’t forget of course that joining across databases is going to be significantly slower than reporting against a single, integrated source, so whilst this approach is good to get you up and running, in time you should be trying to move your different reporting data sources into a single, integrated data warehouse. That’s it for now though, if anyone’s got an alternative (or better) approaches, let me know.