I know this blog title can be classified as a deviation from my more conventional ones but i have been planning on writing this for a while now, so hopefully this should be of some interest to everyone. As we start working with BI EE more and more, we get to know what its good features are and along with it we get exposed to its not so good ones(as with any tool) more. Recently i came across a requirement where BI EE was being implemented on a Data Warehouse that was built almost 4 years ago. The Data Warehouse had been built based on Kimball’s model and it had been satisfying the requirements of the customer so far(excel & MSAS reporting) unless and until BI EE came in as potential replacement for MSAS. When the customer first told me this, i was kind of confused on why BI EE failed to meet their requirement considering the fact that it was a very simple data warehouse(standard set of dimensions and 3 to 4 fact tables). On further analysis, there were 2 main requirements that stood out to be showstoppers in using BI EE
1. Ability to use any dimension member(Across dimensions) as a measure and the ability to use any measure as a dimension attribute – This is one key requirement that cannot be solved easily by BI EE currently. The problem with Business Model and Mapping layer of BI EE is the fact that we have to identify measures. Though this will work in most cases but if we want to do true multi-dimensional reporting, then the tool should ideally provide us with an ability to even treat measures as dimensions. For example, consider the report shown below
This is a very simple report where Amount Sold and Quantity Sold are measures (both in the data warehouse as well as in BI EE). But what if the customer requirement is to create a report as shown below
In the above case, Amount Sold and Quantity Sold measures in the first report have become dimension attributes. Also, the Channel dimension attributes Direct and Indirect have become measures in the second report.
In my customer scenario, end-users required the ability to treat any dimension as measure and vice versa. So, clearly they wanted a multi-dimensional reporting capability using their existing Data Warehouse and using BI EE. In retrospect, i would say this is a very common requirement for users who are used to Multi-dimensional tools like Essbase, MSAS etc. In some cases, with BI EE 10g, where we need to treat dimensions as measures, we just use Pivot Tables. But in cases, where measures in BI EE need to be shown as dimension attributes, then the only way to solve such requirements is to use set operations like using UNION etc.If we use them in Answers, we will start facing all sorts of issues like losing drills, losing the ability to use is prompted filters etc. There are other methods as well like the one that i have discussed before here. But the problem is the fact that we cannot make any method (BI EE modeling methods)generic enough to solve all these requirements.
2. The second requirement was to provide drilling ability on measures. For example, consider the report below. Here the 2 measures Amount Sold and Quantity Sold have been grouped under SALES. Similarly there are another 2 measures, Unit Price and Unit Cost which have been grouped under COSTS.
Drilling on COSTS or SALES should give us the individual measures.
Though the above example demonstrates a single step Measure hierarchy, the actual requirement was to have a multi-level measure hierarchy.
Ideally, considering the requirement, this would have been solved by using Essbase & Smart View. But considering the fact that the customer had already bought BI EE, a solution was to be arrived at using BI EE 10g. There were 2 things that we knew & were sure of
1. The data warehouse was working well. The customer was open to tweaking it a bit further to enable this requirement within BI EE.
2. Essbase provides this capability and will quite easily solve the customer’s requirement.
2. Essbase provides this capability and will quite easily solve the customer’s requirement.
We even thought of probably asking the customer to wait for BI EE 11g, but again the actual problem was in the fact that measures were actually required by BI EE which i don’t think will go away in the near future since the fundamental design of BI EE RPD revolved around the need to identify measures and dimensions(we weren’t sure of what changes in the RPD were coming in 11g and hence we had to find a solution in 10g itself). This is where we had to devise a DW model using the concepts of Essbase and then retrofit it back to the DW as well as BI EE.
I will illustrate what we did for the customer considering the SH schema of Oracle which is based on Kimball’s DW methodology.
As you see, we have 2 fact tables(of differing grain) and multiple dimension tables linking to the facts. Each fact table has a set of pre-defined measures as shown below
From an end-user perspective, each of the measures above are nothing but attributes of data. So, in effect if you model this in Essbase, what essbase will do is, it will maintain an internal single measure called (<Data>) and then everything else are nothing but attributes of this single measure(i.e dimensions. How these are internally stored within Essbase is determined by the 2 storage options BSO and ASO). So, to model this similar to Essbase, we start with separating out measures into its own dimension table called Measure as shown below
Then we create 2 new Materialized Views, Sales_Data & Costs_Data, on top the Sales and Costs fact tables as shown below
So the actual relationship diagram between the fact and dimension tables will look as shown below
So far so good. What this enables us is a fact that we can now sort of treat all measures as dimension attributes. But how do we now provide the ability to treat dimension attributes as measures.
This is done by applying Filter function on the DATA column. For example, to create a measure called Direct Sales we create a logical column and then apply a function as shown below
FILTER("SH - Multiple Measure Dimensions"."Sales - Facts".Sales USING "SH - Multiple Measure Dimensions".Channels."Channel Class" = 'Direct')
This way, we can create as many logical measure columns as possible depending on what we need to treat as measures in the presentation layer. The good part about this is, we can use an attribute as a measure and also as a dimension.
The idea is we build a model containing the following presentation layer
For every dimension table, there will be a corresponding Fact Measure table with measures required for reporting.
So, to achieve the first requirement where we need to convert the measures as dimension attributes, we choose the following columns in the report
If we look at the SQL of this report, you would notice that FILTER function converts it into relevant case when statements thereby providing us with the ability to do multi-dimensional analysis
select T14286.PROMO_NAME as c1, T14409.MEASURE as c2, sum(case when T14224.CHANNEL_CLASS = 'Direct' then T14394.DATA end ) as c3, sum(case when T14224.CHANNEL_CLASS = 'Indirect' then T14394.DATA end ) as c4, T14286.PROMO_ID as c5, T14409.MEASURE_ID as c6 from MEASURES T14409, PROMOTIONS T14286, CHANNELS T14224, SALES_DATA T14394 where ( T14224.CHANNEL_ID = T14394.CHANNEL_ID and T14286.PROMO_ID = T14394.PROMO_ID and T14394.MEASURE_ID = T14409.MEASURE_ID and (T14224.CHANNEL_CLASS in ('Direct', 'Indirect')) and (T14409.MEASURE in ('Amount Sold', 'Quantity Sold', 'Unit Cost', 'Unit Price')) ) group by T14286.PROMO_ID, T14286.PROMO_NAME, T14409.MEASURE, T14409.MEASURE_ID order by c1, c2
Similarly to solve the second requirement, we just need to pull the relevant measure attributes from the measure dimension and then use the DATA field for our measure
There are significant advantages of this method as this does not fix us to any specific dimension/measure. It provides an ability where we can swap measures and dimensions at will.
So the question to everyone reading this is, do we need measures at all in the Fact table(apart from the implicit Data Fact column)? Majority of the tools like BOBJ, Cognos etc all require measures to be assigned explicitly. How about forking out the measures into their own dimension and then treat everything else as normal? Are there any downsides to this approach? The features offered seem to outweigh the conventional modeling method. But i am aware that there could be potential problems with this as well. One problem which i could see is the performance. The size of the fact tables can grow quite quickly if there are a lot of measures. Also, the end-users might have to get used to this model. Is there anything else that you see could become potential problems in the future? If so, do put your thoughts in the comments section.