
Oracle BI EE 10.1.3.4.1 – Understanding Logical Calculations & Native DB Calculations – Repository Modeling

Ever wondered what the actual difference is between logical calculated measures in the repository and native DB calculated measures? I have been asked this question quite a few times in the recent past, so I thought it made sense to create a separate blog entry for it. The difference between the two can turn out to be pretty significant and can even result in different numbers if they are not used correctly. Let's look at the differences between these two and how, together, they can provide excellent SQL analytics capabilities in the repository.
Just to emphasize, logical calculated measures do not mean that the calculations will be done in the BI Server. Though it is possible in some cases for such logical calculations to be pushed to the BI Server layer, they have a completely different significance. Let's understand this through a couple of examples, using the SH schema that gets installed as a sample schema in the Oracle database. It is a schema that contains two fact tables (SALES and COSTS) and a set of dimension tables (PRODUCTS, PROMOTIONS, TIMES, CHANNELS & CUSTOMERS). Let's assume that the SALES fact table has the following measures:
1. AMOUNT_SOLD
2. AMOUNT_BUDGETED
Now, for the sake of illustration, let's assume that the SALES fact table has data going back to 1998 for the AMOUNT_SOLD measure, but that the budgeting process only started in 1999. So, for all the records in the SALES fact table, AMOUNT_BUDGETED would be null for the year 1998, as shown below.
image
Along the same lines, let's assume that the COSTS fact table (or the cost-tracking application) came into existence only in 1999. So the COSTS measures UNIT_PRICE and UNIT_COST will also be null for the year 1998.
image
Also, to give you some perspective: since SALES and COSTS are at differing grains, they have been modeled as separate fact tables, as shown below.
image
Let's start with a very simple requirement: no measures should be shown as nulls. All nulls should be replaced by zeroes. Seems pretty straightforward.
DB Calculation:
The most common way of doing this is to go into the AMOUNT_BUDGETED, UNIT_PRICE & UNIT_COST measures in the LTS mapping and apply a formula in the mapping, as shown below.
image
image
After making this change, let's go back to the report and see the result.
image
If you notice, the IFNULL is applied only on the AMOUNT_BUDGETED measure. The main reason is that when you apply a calculation on a measure in the LTS mapping area, it is applied only to existing rows. In our case, we have data combined from two different fact tables. For AMOUNT_BUDGETED, the row for 1998 existed and it was just a matter of the measure being NULL. But for UNIT_COST and UNIT_PRICE, the corresponding rows for the year 1998 do not even exist. Hence applying IFNULL in the mapping layer, at least for this example, is of no use. You can clearly see this by looking at the actual SQL query fired.
WITH
SAWITH0 AS (select sum(T6428.AMOUNT_SOLD) as c1,
     sum(nvl(T6428.AMOUNT_BUDGETED , 0)) as c2,
     T6476.CALENDAR_YEAR as c3,
     T6294.CHANNEL_DESC as c4
from
     TIMES T6476,
     CHANNELS T6294,
     SALES T6428
where  ( T6294.CHANNEL_ID = T6428.CHANNEL_ID and T6428.TIME_ID = T6476.TIME_ID )
group by T6294.CHANNEL_DESC, T6476.CALENDAR_YEAR),
SAWITH1 AS (select sum(nvl(T6302.UNIT_PRICE , 0)) as c1,
     sum(nvl(T6302.UNIT_COST , 0)) as c2,
     T6476.CALENDAR_YEAR as c3,
     T6294.CHANNEL_DESC as c4
from
     TIMES T6476,
     CHANNELS T6294,
     COSTS T6302
where  ( T6294.CHANNEL_ID = T6302.CHANNEL_ID and T6302.TIME_ID = T6476.TIME_ID )
group by T6294.CHANNEL_DESC, T6476.CALENDAR_YEAR)
select distinct case  when SAWITH1.c3 is not null then SAWITH1.c3 when SAWITH0.c3 is not null then SAWITH0.c3 end  as c1,
     case  when SAWITH1.c4 is not null then SAWITH1.c4 when SAWITH0.c4 is not null then SAWITH0.c4 end  as c2,
     SAWITH0.c2 as c3,
     SAWITH0.c1 as c4,
     SAWITH1.c2 as c5,
     SAWITH1.c1 as c6
from
     SAWITH0 full outer join SAWITH1 On SAWITH0.c3 = SAWITH1.c3 and SAWITH0.c4 = SAWITH1.c4
order by c1, c2
As you can see in the query above, the IFNULL is pushed only into the inner sub-queries. But the nulls in the COST measures appear when the conforming-dimension join happens, which is in the outer query. How do we push the IFNULL to the outer query?
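The behavior can be reproduced outside OBIEE with a toy example. The sketch below (hypothetical data in Python's sqlite3, not the SH schema or the generated query above) shows that IFNULL inside the per-fact sub-query only fixes NULL measure values on rows that exist, while the NULLs introduced by the stitch join itself have to be handled after the join:

```python
import sqlite3

# Toy reproduction (hypothetical data): budgets are NULL in 1998,
# and the costs fact has no 1998 rows at all.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE sales (year INTEGER, amount_sold INTEGER, amount_budgeted INTEGER)")
cur.execute("CREATE TABLE costs (year INTEGER, unit_cost INTEGER)")
cur.executemany("INSERT INTO sales VALUES (?, ?, ?)",
                [(1998, 100, None), (1999, 120, 110)])
cur.execute("INSERT INTO costs VALUES (1999, 70)")

# IFNULL pushed into the inner query (what the LTS-mapping formula does):
# there is no 1998 costs row to apply it to, so the stitch join still
# produces NULL for that year.
inner_only = cur.execute("""
    WITH s AS (SELECT year, SUM(IFNULL(amount_budgeted, 0)) AS budget
               FROM sales GROUP BY year),
         c AS (SELECT year, SUM(IFNULL(unit_cost, 0)) AS cost
               FROM costs GROUP BY year)
    SELECT s.year, s.budget, c.cost
    FROM s LEFT JOIN c ON s.year = c.year
    ORDER BY s.year
""").fetchall()
print(inner_only)   # [(1998, 0, None), (1999, 110, 70)] -- cost still NULL

# IFNULL applied after the join (what the logical calculated column does):
outer_too = cur.execute("""
    WITH s AS (SELECT year, SUM(IFNULL(amount_budgeted, 0)) AS budget
               FROM sales GROUP BY year),
         c AS (SELECT year, SUM(unit_cost) AS cost
               FROM costs GROUP BY year)
    SELECT s.year, s.budget, IFNULL(c.cost, 0)
    FROM s LEFT JOIN c ON s.year = c.year
    ORDER BY s.year
""").fetchall()
print(outer_too)    # [(1998, 0, 0), (1999, 110, 70)] -- NULL replaced
```

A LEFT JOIN stands in for the full outer join here, since the sales side covers all years in this toy data.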
Logical Calculations:
Let's introduce a logical calculated column for both UNIT_PRICE and UNIT_COST, as shown below.
image
image
image
Let's include these two measures in the report instead of the older measures.
image
This has effectively converted the null cost measures to zero. If you look at the query below, you will notice that the outer query has an IFNULL (NVL in this case) clause on both the cost measures.
WITH
SAWITH0 AS (select sum(T6428.AMOUNT_SOLD) as c1,
     sum(nvl(T6428.AMOUNT_BUDGETED , 0)) as c2,
     T6476.CALENDAR_YEAR as c3,
     T6294.CHANNEL_DESC as c4
from
     TIMES T6476,
     CHANNELS T6294,
     SALES T6428
where  ( T6294.CHANNEL_ID = T6428.CHANNEL_ID and T6428.TIME_ID = T6476.TIME_ID )
group by T6294.CHANNEL_DESC, T6476.CALENDAR_YEAR),
SAWITH1 AS (select sum(nvl(T6302.UNIT_PRICE , 0)) as c1,
     sum(nvl(T6302.UNIT_COST , 0)) as c2,
     T6476.CALENDAR_YEAR as c3,
     T6294.CHANNEL_DESC as c4
from
     TIMES T6476,
     CHANNELS T6294,
     COSTS T6302
where  ( T6294.CHANNEL_ID = T6302.CHANNEL_ID and T6302.TIME_ID = T6476.TIME_ID )
group by T6294.CHANNEL_DESC, T6476.CALENDAR_YEAR)
select distinct case  when SAWITH1.c3 is not null then SAWITH1.c3 when SAWITH0.c3 is not null then SAWITH0.c3 end  as c1,
     case  when SAWITH0.c4 is not null then SAWITH0.c4 when SAWITH1.c4 is not null then SAWITH1.c4 end  as c2,
     SAWITH0.c2 as c3,
     SAWITH0.c1 as c4,
     nvl(SAWITH1.c2 , 0) as c5,
     nvl(SAWITH1.c1 , 0) as c6
from
     SAWITH0 full outer join SAWITH1 On SAWITH0.c3 = SAWITH1.c3 and SAWITH0.c4 = SAWITH1.c4
order by c1, c2
Some might argue that the IFNULL in the mapping (DB calculation) is redundant for the AMOUNT_BUDGETED measure, as we could achieve the same effect with a logically calculated AMOUNT_BUDGETED measure (like what we did for UNIT_COST & UNIT_PRICE). That argument is true for this case. But database calculations are generally preferred over logical calculations unless the calculation cannot be pushed into the database. Let's see why.
Logical Calculation & DB Calculation:
For example, let's create two derived measures calculating the difference between AMOUNT_SOLD and AMOUNT_BUDGETED. Variance is a measure where the calculation AMOUNT_SOLD - AMOUNT_BUDGETED is in the LTS mapping layer. Logical Variance is a measure where the same calculation is in the logical calculation section, as shown below.
image
image
Let's create a report containing each of the measures separately and look at the SQL queries fired.
image
select T6476.CALENDAR_YEAR as c1,
     T6294.CHANNEL_DESC as c2,
     sum(T6428.AMOUNT_SOLD - T6428.AMOUNT_BUDGETED) as c3
from
     TIMES T6476,
     CHANNELS T6294,
     SALES T6428
where  ( T6294.CHANNEL_ID = T6428.CHANNEL_ID and T6428.TIME_ID = T6476.TIME_ID )
group by T6294.CHANNEL_DESC, T6476.CALENDAR_YEAR
order by c1, c2
This query is as expected. Now let's create the same report using the logical calculated column.
image
The results are the same. But if you look at the query fired,
WITH
SAWITH0 AS (select sum(T6428.AMOUNT_BUDGETED) as c1,
     sum(T6428.AMOUNT_SOLD) as c2,
     T6476.CALENDAR_YEAR as c3,
     T6294.CHANNEL_DESC as c4
from
     TIMES T6476,
     CHANNELS T6294,
     SALES T6428
where  ( T6294.CHANNEL_ID = T6428.CHANNEL_ID and T6428.TIME_ID = T6476.TIME_ID )
group by T6294.CHANNEL_DESC, T6476.CALENDAR_YEAR)
select distinct SAWITH0.c3 as c1,
     SAWITH0.c4 as c2,
     SAWITH0.c2 - SAWITH0.c1 as c3
from
     SAWITH0
order by c1, c2
you will notice a significant change: there is an unnecessary inline view that the BI Server creates. This is undesirable, since all you are doing, at least for this use case, is a simple difference of measures. In effect, the point is that while designing a model for reporting, we need to ensure that calculations are pushed to the right places to get maximum performance. The two types of calculation complement each other very well, though. In fact, in some complex scenarios we can control the order of joins made across the tables by mixing database and logical calculations. Well, that is a topic for another blog entry altogether! Though the use cases above are pretty simple, the same logic can be applied to complex scenarios (for example, a requirement where the DB calculation happens at a lower level but the logical calculation happens at another level). This way we can tweak how our SQL queries are constructed by the BI Server.
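Incidentally, the two variance queries return the same numbers because summation distributes over subtraction (ignoring NULL handling, where SUM skips rows differently); the inline view is overhead, not a correctness issue. A one-line sanity check of that identity, with made-up values:

```python
# SUM(sold - budgeted) == SUM(sold) - SUM(budgeted) for any row set,
# which is why the extra inline view changes the plan but not the result.
rows = [(100, 90), (120, 110), (80, 95)]  # hypothetical (sold, budgeted) pairs

lhs = sum(sold - budgeted for sold, budgeted in rows)    # aggregated difference
rhs = sum(s for s, _ in rows) - sum(b for _, b in rows)  # difference of aggregates
print(lhs, rhs)   # 5 5
```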

Logical Table Source(LTS) Override – A multiple LTS logical fact with Count and Count Distinct Measures

One of my colleagues encountered a strange problem that I had never noticed before, because the problem is a bit ridiculous. It is about applying Count Distinct and Count aggregation rules while building a measure, where the measure is mapped to multiple logical table sources of the fact table.
The problem in a nutshell: when applying Count on the column NUM_OF_CONTACT, the BI Server does SUM(NUM_OF_CONTACT) instead of COUNT(NUM_OF_CONTACT), whereas using Count Distinct on the same column always drives the distinct count off the lowest-level LTS (i.e., the query hits a single LTS), irrespective of the dimension and the dimensional content specification.
Initially it was a bit strange to me. Investigating on Metalink, I found a couple of such issues encountered by users, among them Doc ID 510223.1 (SR 38-1054325181) and Doc ID 526854.1 (SR 38-1505875337), but these did not give me proper information about a workaround or solution.
Re-engineering it reveals an interesting and little-known feature left behind by Oracle/Siebel to overcome such problems, where measures need to be driven entirely by the content specification of the fact table LTS. It is nothing but the "Logical Table Source Override". To prove this, I replicated the problem in my environment and did a POC; the outcome/workaround is below:
1. Create the tables below and insert the handcrafted data.
[screenshot: LTS Override1]

[screenshot: LTS Override2]

2. Configure the repository as below:
[screenshot: LTS Override3]

3. The logical fact table source content has been mapped as applicable. The content mapping for two of the LTSs is shown below; the other LTSs have been mapped similarly.
[screenshot: LTS Override4]

4. Now see how the Count Distinct has been defined.
[screenshot: LTS Override5]

However, the definition proves to be wrong and erroneous when you drag the measure together with a dimensional column. Here I have dragged LVL2, and as per the content mapping the query should be routed via "DIM_POSITION_FLM". But strangely, whatever column you choose from the dimension table, the query is always routed via "DIM_POSITION_KAM". Have a look below:
[screenshot: LTS Override7]

5. The above result is definitely wrong. The correct approach is to redefine the measure using an LTS override, as below:
[screenshot: LTS Override8]

6. And finally, see the correct result and the query log.
[screenshot: LTS Override9]

7. A totally different kind of behavior is observed when you drag the "NUM_OF_CONTACT_Count" measure, against which a simple Count aggregation has been defined. The result is below:
[screenshot: LTS Override10]

So this creates another major problem: whichever dimension you choose, a SUM aggregation is always applied, whereas as per the repository definition it should be Count.
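Part of why the BI Server must be careful here is that plain COUNT is additive across an LTS's partitions while COUNT DISTINCT is not, so a distinct count cannot be rolled up from pre-grouped results. A small illustration with hypothetical contact IDs (not the POC tables):

```python
# COUNT is additive across groups; COUNT DISTINCT is not.
contacts_by_position = {
    "FLM": ["c1", "c2", "c2"],
    "KAM": ["c2", "c3"],
}

# Plain COUNT: per-group counts can simply be summed.
total_count = sum(len(v) for v in contacts_by_position.values())            # 5

# COUNT DISTINCT: summing per-group distinct counts overstates the total,
# because "c2" appears under both positions.
sum_of_distincts = sum(len(set(v)) for v in contacts_by_position.values())  # 2 + 2 = 4
true_distinct = len({c for v in contacts_by_position.values() for c in v})  # 3
print(total_count, sum_of_distincts, true_distinct)
```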
8. The correct approach/workaround is to redefine the measure. However, for the Count aggregation there is no choice of LTS overriding; here you have to go with "Based on Dimensions". Let's see this, and the issue is resolved!
[screenshot: LTS Override11]

9. So apparently everything seems to work fine with some tweaking, using the LTS override and Based-on-Dimensions approaches. But wait, there are some more strange things awaiting, and I am really a bit astonished at how this could happen! If anybody can explain the reason to me, I will be grateful.
I deleted the measures "NUM_OF_CONTACT" and "NUM_OF_CONTACT_Sum" from the logical fact table, as they were of no use to me for this POC. However, after removing them, I observed that only the measure "NUM_OF_CONTACT_Count_Distinct_Override" remained functional; the rest no longer behaved as they should, as explained above.
Dragging "LVL2" with any of the measures (apart from "NUM_OF_CONTACT_Count_Distinct_Override") leads to an inconsistent query being fired against the DB:
select distinct T41.LVL2 as c1,
     T38.NUM_OF_CONTACT as c2
from
     DIM_POSITION_H T41,
     DIM_POSITION_FLM T38
where  ( T38.POSITION_WID = T41.POSITION_WID )
order by c1
And the more astonishing fact is that the query is the same for all three measures! How is this possible? I don't know if I am missing something, but it is quite strange how removing an independent measure impacts the queries of the remaining measures. I still can't find any justification...

Experience with creating and using custom indicators, and their performance

In this post I describe my experience with creating and using custom indicators (calculated columns), and their performance. The post builds on the ways to define your own calculations (indicators) in the Business Model. A custom indicator (e.g., an arithmetic operation on the values of two columns) can be defined in the following ways:


  1. Directly in the report design
  2. In the metadata, using logical columns
  3. In the metadata, using physical columns

1. Directly in the report design
  • Easy and quick to set up
  • Suitable for infrequently used indicators (the indicator is only available in the report where it is created)
  • The calculation is performed over the indicator either aggregated or at detail level (depending on whether an aggregate function was preset for the indicator in the metadata layer)
  • Misunderstanding which indicators the calculation is based on, and how functions are applied, can produce incorrect/unwanted results (e.g., totals being averaged for non-additive indicators)
  • I recommend using this only if the source columns are also present in the report
  • When columns not present in the report are used, the physical query COST is high and can grow by up to 6 orders of magnitude
  • This method is intended rather for end users' ad-hoc needs


2. In the metadata, using logical columns
  • The expression can reference columns/indicators in the logical table (from multiple sources)
  • Aggregation is performed first and the new column is calculated afterwards, which can affect the accuracy of the results
  • If the referenced columns are present in the report, everything is OK; if not, the physical query COST is high
  • The indicator is named and exposed in the presentation layer - it can be reused

Sample final query:

WITH
sawith0 AS (select distinct T17101.IS_ON_US_FLAG as c1,
     T1339.DATE_DT as c2
from
     L1_OWNER.D_DATE T1339,
     L2_OWNER.A_STAT_ATM_AGG T17101
where  ( T1339.DATE_DT = T17101.AUDIT_ID )),
sawith1 AS (select sum(T17101.AMT_AMT) as c1,
     T1339.DATE_DT as c2
from
     L1_OWNER.D_DATE T1339,
     L2_OWNER.A_STAT_ATM_AGG T17101
where  ( T1339.DATE_DT = T17101.AUDIT_ID )
group by T1339.DATE_DT)
select distinct case  when sawith0.c2 is not null then sawith0.c2 when sawith1.c2 is not null then sawith1.c2 end  as c1,
     case  when sawith0.c1 = '1' then sawith1.c1 else 0 end  as c2
from
     sawith0 full outer join sawith1 on sawith0.c2 = sawith1.c2
order by c1

3. In the metadata, using physical columns

  • The expression can reference columns in the individual attached (source) tables
  • The calculation is performed first and the new column is aggregated afterwards, which can affect the accuracy of the results
  • "Slight" physical query COST, whether or not the source columns are used in the report
  • The indicator is named and exposed in the presentation layer - it can be reused

Sample final query:

select T1339.DATE_DT as c1,
     sum(case when T5199.RESP_CODE_CD not like '00%' then T17101.TRN_CNT else 0 end) as c2
from
     L1_OWNER.D_DATE T1339,
     LOOKUP_OWNER.D_RESPONSE_CODE T5199,
     L2_OWNER.A_STAT_ATM_AGG T17101
where  ( T1339.DATE_DT = T17101.AUDIT_ID and T5199.RESPONSE_CODE_ID = T17101.RESP_CDE_ID )
group by T1339.DATE_DT
order by c1

Conclusion: from the above, my recommendation is clear - define your own indicators using physical columns to the maximum extent possible:

  • The indicator is exposed in the presentation layer, so it can be reused by end users themselves
  • The expression is less limited in what it can reference (see the final query)
  • Lower physical query COST
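The accuracy caveat in options 2 and 3 above comes down to operation order: aggregating first and then calculating is not the same as calculating per row and then aggregating, except for purely additive expressions. A sketch with made-up per-transaction values:

```python
# Order of aggregation vs. calculation matters for non-additive expressions.
rows = [(10, 2), (30, 4)]  # hypothetical (amount, quantity) per transaction

# Logical-column style: aggregate first, then calculate.
ratio_of_sums = sum(a for a, _ in rows) / sum(q for _, q in rows)  # 40 / 6

# Physical-column style: calculate per row, then aggregate.
sum_of_ratios = sum(a / q for a, q in rows)                        # 5.0 + 7.5

# Neither is "wrong"; they answer different questions, so the modeler
# must pick the order the business definition actually requires.
print(ratio_of_sums, sum_of_ratios)
```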

Ratios - Tržba/m2 (sales per m2)

A general recommendation for storing ratios in a data warehouse is to store the numerator and the denominator separately. In the case where the metric is calculated as the ratio of a fact measure and a dimensional attribute, this recommendation is met implicitly. The question is how to design the metadata in OBI. For example, data model: F_SALES_BY_STORE_PD.SALES_AMT (sales) and D_STORE.STO_SELLING_AREA (m2 of store selling area). We are interested in "sales per m2 of selling area" (hereinafter Tržba/m2). The BMM layer in OBI looks like this: a classic model where:

  • Fact - Sales has LTS F_SALES_BY_STORE_PD, and
  • STO_SELLING_AREA is an attribute of the dimension Dim - Store, which has LTS D_STORE
is unsatisfactory. The setup returns an incorrect result: when sales are totaled, they are divided by the selling area of only one of the stores. Nowhere is it stated that STO_SELLING_AREA needs to be added up across multiple stores.

Sales / selling area (m2) = "Base measures"."Sales" / Store."Selling area (m2)"

One possible solution is to pull STO_SELLING_AREA into Fact - Sales, whether by creating a view at the DB level or at the BMM layer (a new column for the current LTS). Now a SUM aggregation rule can be set on STO_SELLING_AREA. Still not good enough: this does not distinguish the situations where it makes sense to add up STO_SELLING_AREA from those where it does not (what is needed is a sum of selling areas in which each store contributes its selling area to the total exactly once in the examined period).

The real solution involves adding the STO_SELLING_AREA attribute to Fact - Sales, but mapping it to a new LTS consisting of the D_STORE table, with Content set to the Store detail level. A metric defined on this newly created logical column now generates the correct SQL query, consisting of two subqueries. And we have reached the goal, even if... it could still be improved in terms of business logic: this implementation of the calculation is unfair to stores that were open fewer days than others, or that started operating sometime during the reporting period.
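The modeling problem above can be stated in a few lines of code: when a store sells in several periods, a naive join repeats its selling area once per fact row, so summing the joined attribute overstates the denominator; each store must contribute its area exactly once. A sketch with hypothetical stores:

```python
# Sales facts: (store, period, sales_amt); selling area is a store attribute.
facts = [("S1", "P1", 100), ("S1", "P2", 150), ("S2", "P1", 50)]
selling_area = {"S1": 20, "S2": 10}  # m2 per store

total_sales = sum(amt for _, _, amt in facts)                   # 300

# Naive: join the area onto every fact row and sum it -- S1's 20 m2
# is counted twice because S1 has two fact rows.
naive_area = sum(selling_area[store] for store, _, _ in facts)  # 20 + 20 + 10 = 50

# Correct: each store that sold in the period contributes its area once,
# which is what the dedicated D_STORE-based LTS achieves.
stores = {store for store, _, _ in facts}
correct_area = sum(selling_area[s] for s in stores)             # 20 + 10 = 30

print(total_sales / naive_area, total_sales / correct_area)     # 6.0 vs 10.0
```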