
Analyzing BI Apps ETL Runs using OBIEE and the DAC Repository

The other week I posted an article on optimizing the Oracle BI Applications ETL process, and one useful set of diagnostic data I highlighted at the time was the ETL step timings that are available in the DAC Console. For every running or completed execution plan in the DAC Repository, you can list out the individual steps in the process together with the start time, end time, elapsed time, rows processed and so on, like this:
Dacsql1
Working out where the time has gone in an ETL process is therefore quite simple, as you can sort the list of tasks by Start Timestamp or Elapsed Time to quickly focus on the tasks that have taken the most time to complete. You can also generate reports and graphs that tell you how the time was spent in this ETL run, broken down by ETL stage and so on.
Dacsql2
But what if you want to see the trend for a particular mapping, task group or phase? You can view each execution plan run in the DAC Console, locating a particular mapping within each run and making a note of how long it took to run, but clearly it would make more sense if we could load the data into Excel or a database and analyze it there. You can, in fact, right-click on the list of tasks in an execution plan and select Output to File, and this generates a delimited flat file that you can import into Excel and analyze. But in my experience the import doesn’t handle dates and times very well, making it hard to analyze task timings, and besides – the data you’re interested in is in a couple of databases anyway, so it’d make more sense to analyze it there.
The first place that you can find task execution details and timings is the Informatica Repository. There’s a view there called REP_SESS_LOG that contains details of every workflow execution recorded in the Informatica Repository. If you run a query against this view, you can see the start and end times for a particular workflow (equating to a task in the DAC), the number of rows processed, and from this you can generate the total run time and the row throughput of the mapping, like this:
select workflow_name
,      to_char(actual_start,'DD-MON-YYYY HH:MI:SS') START_TIME
,      to_char(session_timestamp, 'DD-MON-YYYY HH:MI:SS') END_TIME
,      round((session_timestamp - actual_start) * 86400,2) ELAPSED_TIME_SECS
,      successful_rows
,      failed_rows
,      round(successful_rows / ((session_timestamp - actual_start) * 86400),2) ROWS_PER_SEC
from rep_sess_log where workflow_name = 'SDE_ORA_GLJournals_Full'
Giving you output in SQL Developer like this:
Dacsql3
Using this sort of report, you can see how the two key metrics of elapsed time in seconds, and rows processed per second, have changed over a series of ETL runs. Now this is quite neat, but what the Informatica Repository views don’t have access to is the additional metadata in the DAC, such as the Execution Plan name, the Source System Folder (so that you can see whether you’re running a custom or out-of-the-box version of the mapping), the ETL_PROC_WID and so on. For these, you need to run a similar query against the DAC Repository tables, which are a separate set of tables from the Informatica Repository ones, and usually held in their own schema.
select d.etl_defn_name    EXECUTION_PLAN
,      r.phy_folder_name  FOLDER
,      r.step_name        TASK
,      to_char(r.start_ts,'DD-MON-YYYY HH:MI:SS') START_TIME
,      to_char(r.end_ts,'DD-MON-YYYY HH:MI:SS') END_TIME
,      round((r.end_ts - r.start_ts) * 86400,2) ELAPSED_TIME_SECS
,      r.status
,      r.sucess_rows      SUCCESSFUL_ROWS
,      r.failed_rows      FAILED_ROWS
,      round((r.sucess_rows / (round((r.end_ts - r.start_ts) * 86400,2))),2) ROWS_PER_SEC
from   w_etl_defn_run d
,      w_etl_run_step r
where  d.row_wid = r.run_wid
and    r.step_name = 'SDE_ORA_GLJournals'
order  by d.etl_proc_wid desc
This query returns the same basic information as the previous one, except this time we’ve got some information on the Execution Plan, Source System Folder and so on, and it’s sorted by the ETL_PROC_WID so the most recent run is at the top.
Dacsql4
So far, so good. If you were particularly clever with analytic functions and so on, you could probably write some SQL now to show you the timings have changed for a particular mapping over time, or you could highlight those tasks whose timing over a window of ETL runs has changed more than a certain percentage. But given that we’ve got a perfectly good BI tool available that we can also use to analyze report run times, why don’t we use OBIEE to analyze this data as well?
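As a sketch of what that analytic-function SQL could look like (using the same DAC repository tables and date arithmetic as the query above; the 20% threshold and five-run window are just illustrative assumptions), something like this would flag task runs whose elapsed time has grown noticeably against the average of that task’s recent runs:

```sql
-- Hedged sketch: flag runs where a task's elapsed time is more than
-- 20% above the average of its previous (up to five) runs.
select step_name
,      etl_proc_wid
,      round(elapsed_secs,2)  elapsed_secs
,      round(avg_prev_secs,2) avg_prev_secs
,      round(100 * (elapsed_secs - avg_prev_secs)
               / nullif(avg_prev_secs,0),1) pct_change
from  (select r.step_name
       ,      d.etl_proc_wid
       ,      (r.end_ts - r.start_ts) * 86400 elapsed_secs
       ,      avg((r.end_ts - r.start_ts) * 86400)
                over (partition by r.step_name
                      order by d.etl_proc_wid
                      rows between 5 preceding and 1 preceding) avg_prev_secs
       from   w_etl_defn_run d
       ,      w_etl_run_step r
       where  d.row_wid = r.run_wid)
where  avg_prev_secs is not null
and    elapsed_secs > avg_prev_secs * 1.2
order  by step_name, etl_proc_wid;
```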
If you bring the key DAC tables into the OBIEE physical layer, and the S_ETL_DAY table from the Usage Tracking tables if you want to analyze your ETL runs over time, you can then create a logical dimensional model out of them. W_ETL_DEFN_RUN and W_ETL_RUN_STEP become your fact tables, the other tables in the DAC schema make up the task and execution plan dimensions, and the S_ETL_DAY table gives you a ready-made set of time dimension rows.
UPDATE 2/2/09 : I’ve uploaded the RPD file if you want to take a look. It’s by no means finished or error-tested, but if you want to see how I put the example together, feel free to download it.
Dacsql5-1
So now you can start to analyze your DAC repository data using OBIEE. Here’s a quick table of ETL runs:
Dacsql6
You can also create some nice graphs showing, for example, row throughput and time elapsed over time for a particular DAC task.
Dacsql7
… Or perhaps provide a breakdown of the total ETL time by task phase.
Dacsql8
So all in all this is not a bad complement to the DAC statistics you can get from the BI Management Pack, and the query run statistics you can get from the Usage Tracking tables and reports. In terms of how you set the RPD up for this, the joins between the physical tables look like this (W_ETL_GROUP is an alias for W_ETL_STEP):
Dacsql9
The logical model is then derived from this, with two fact tables (one for ETL runs, and one for task runs), two conformed dimensions and a single one used just by the Task Runs fact table.
Dacsql10
The Task Runs fact table is sourced from the W_ETL_RUN_STEP table that contains most of the details of the task execution, together with the W_ETL_DEFN_RUN table that provides the ETL_PROC_WID and details of the execution plan that called the mapping.
Dacsql11
The Task dimension is a bit more interesting, as you have to denormalize the W_ETL_STEP, W_ETL_FOLDER and W_ETL_PHASE tables to get the main task details, the folder it points to and the phase it’s assigned to; then you have to join through the W_ETL_GROUP_STEP table to find the task group, then join back to the W_ETL_STEP table (aliased as W_ETL_GROUP) to get the task group name. The rest of the tables are fairly simple.
Dacsql12
Watch out for the join between the S_ETL_DAY table and the DAY_DT column in the W_ETL_RUN_STEP table, as the latter includes minutes and seconds and the former doesn’t – I had to use a complex physical join between the two tables and CAST() the DAY_DT column as DATE to truncate the time portion off. Once you’ve got this all set up, though, you can start to create calculations based on the timings in the DAC tables, such as this one to calculate the elapsed time.
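The join expression in that complex physical join would look roughly like this (a sketch from memory – the exact quoting depends on your physical layer names):

```sql
-- Complex physical join: cast the datetime column to DATE so the
-- time-of-day portion is dropped before comparing with S_ETL_DAY.
"S_ETL_DAY"."DAY_DT" = CAST("W_ETL_RUN_STEP"."DAY_DT" AS DATE)
```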
Dacsql13
You can even create a DAC analysis dashboard, so that you can monitor your ETL process along with your query timings, create alerts and guided analytics, and provide the ability to drill-down from the overall execution plan to show individual mappings, task groups and phases.
Dacsql14
Like I said, not bad. Don’t forget that with the DAC tables at least, none of these are “public views” or APIs, so use them at your own risk; they may well change from release to release. But it’s a nice way of making use of the data in the DAC Repository, and it gives you the potential later on to combine it with query response time data from the Usage Tracking tables into a single “DBA view” of what’s going on in OBIEE and the DAC.

Performance Measurement of an OBIEE Application

It’s been over a year that we’ve been having arguments with the business regarding application performance. IT kept claiming that the application performs well, but the business kept saying – “I can click on a page and go for coffee and come back, go for lunch and come back, go for meetings and come back, and still the page keeps running”. Of course I am exaggerating. Along with the database team and the DBAs we worked on the issue and got to the point where we felt the results were good, but it was still a never-ending battle with the business.
Out of frustration, we started looking at other avenues. Partly just to hone my web development skills, I downloaded Firebug, the great Mozilla extension, and found an option called Net within the utility. The Net option is nothing but network monitoring.
Looking at the statistics in the Net option, I was amazed that almost 20% of the time was taken by CSS and .js files. That said, it doesn’t mean our reports were super-fast, but the agreed 9 seconds per report was not achievable when CSS and JS were taking around 3.5 seconds. Apart from the CSS, a few GIF files were missing, and the HTTP requests to fetch those images were taking almost 1.5–2 seconds. All these things were adding to the report rendering time.

In the above screenshot you can see the total time (22.08 sec) taken by the request to render on the page. Look at saw.dll: this is nothing but the report request to the SAW server, taking over 19.52 sec, and the rest is JavaScript, CSS, images and so on.
Coming back to the main point: if you are profiling your application and establishing an application baseline, I would highly encourage using the Firebug tool. Also keep in mind that it’s not just the query that constitutes report performance; the CSS, JS and other things contribute to it as well. Along with Firebug you can use Yahoo’s YSlow utility, which gives a nice pie-chart view of the total time spent. See the screenshot below.
YSlow
For more details on how to use Firebug in general, go to http://getfirebug.com/. For YSlow, go to http://developer.yahoo.com/yslow/.
In case of any questions, please email me at sunil_ranka@yahoo.com. If the need arises, I can do a conference call.

Basics of OBI Apps

Oracle BI Applications – Overview
Oracle Business Intelligence Applications are a set of prebuilt solutions that deliver business intelligence across an organization, giving users the key information they need to maximize effectiveness. These solutions transform and integrate data from a range of enterprise sources, including Siebel, Oracle E-Business Suite, and corporate data warehouses.
Oracle BI Applications includes the following modules:
1. Oracle Financial Analytics
2. Oracle Human Resources Analytics
3. Oracle Supply Chain and Order Management Analytics
4. Oracle Procurement and Spend Analytics
5. Oracle Project Analytics
6. Oracle Sales Analytics
7. Oracle Service Analytics
8. Oracle Contact Center Telephony Analytics
9. Oracle Marketing Analytics
10. Oracle Loyalty Analytics
The BI Applications are built upon the foundation of Oracle BI Enterprise Edition, which means they allow extension of the suite with additional data sources and your own dashboards and reports.
The Oracle BI Applications pre-packaged content comes with three main pre-built repositories:
1. The OBIEE repository and catalog
2. The ETL repository tailored for the source applications
3. The Data Warehouse model
The diagram below presents a functional overview of OBIA.
The pre-built ETL repository includes not only routines for changed data capture, extraction to the staging area and the load into the data warehouse tables, but also seed data for common dimensions like the time dimension, plus dimension lookup files with domain values or dimension value sets to support source-independent metric calculations.
The pre-built Business Analytics Warehouse is one single but modular data warehouse model to support one or a combination of the source systems.
The pre-built OBIEE repository and web catalog are aligned with the Business Analytics Warehouse and contain the domain-specific and end-user-facing dashboards, reports and KPIs.
The diagram below shows the architecture of OBI Apps.
Data Flow Description
1. First, the DAC scheduler kicks off jobs to load/refresh the OBAW at regular intervals; alternatively, these jobs can be kicked off manually from the DAC client.
2. The DAC server uses the object and configuration data stored in the DAC repository to issue commands to the Informatica server.
3. The Informatica server executes the commands issued from the DAC, using the objects and configuration data stored in the Informatica repository.
4. The data is then extracted, transformed and loaded from the transactional databases into the OBAW target tables.
5. After the ETL is complete and the OBAW is online, an OBIEE end user runs a dashboard or report in Answers or Interactive Dashboards.
6. The request comes through the web server and interacts with the presentation server.
7. The presentation server interacts with the OBI server, which interprets the request, hits the OBAW if the result isn’t cached, extracts the data and provides it to the presentation server.
8. The presentation server formats the data into the required format and through web server, the request is presented to the end user.
Oracle BI Applications vs OBIEE Implementation Approach
An OBIA implementation differs from a traditional data-warehouse build, in that it is complete with pre-built, pre-mapped and pre-packaged insights based upon best practice principles. When building a traditional data warehouse, a significant amount of time and effort is given over to the analysis and build of the warehouse.
The diagram below shows a comparative approach to a traditional data warehouse against OBIA.
The way this pre-built approach differs from plain OBIEE manifests itself in the application layer, as can be seen from the screenshots below. The upper image is of the BI Administration Tool with just the platform installed: there is no content, and it all needs adding manually after an analysis and design stage. The lower image is of the same tool as installed with OBIA: here all the content of the Physical, Business Model and Presentation layers comes already mapped, removing the need for the design, analysis and build stages.
Due to the large amount of pre-built content and pre-defined structures, implementation time is greatly reduced, upgrade paths are guaranteed, and integration with your source systems is assured. The following diagram shows a comparison of OBIEE implementations versus OBIA.

Refreshing Data From Source Tables In DAC


-- Update refresh dates to re-extract data for these tables
update w_etl_table_dt tbl_dt
set    last_refresh_dt = null
where  exists
       (select 'x'
        from   w_etl_table tbl, w_etl_dbconn dbconn
        where  tbl_dt.table_wid = tbl.row_wid
        and    dbconn.row_wid   = tbl_dt.dbconn_wid
        and    dbconn.name      = 'OLTP'
        and    tbl.name in ('S_DOC_AGREE','S_AGREE_ITEM'));
 
In this query, in place of in ('S_DOC_AGREE','S_AGREE_ITEM') you should substitute the full list of S_ tables you want to refresh. Also, instead of set last_refresh_dt = null, you can set last_refresh_dt to a date two years ago. You can then look at the SQL generated in the change capture phase to check that this is really working.
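Putting those two suggestions together, a hedged sketch of the variant looks like this (the two-year interval and the table list are illustrative; substitute your own S_ tables):

```sql
-- Variant: set the refresh date back two years instead of nulling it,
-- so only the last two years of data are re-extracted.
update w_etl_table_dt tbl_dt
set    last_refresh_dt = add_months(sysdate, -24)
where  exists
       (select 'x'
        from   w_etl_table tbl, w_etl_dbconn dbconn
        where  tbl_dt.table_wid = tbl.row_wid
        and    dbconn.row_wid   = tbl_dt.dbconn_wid
        and    dbconn.name      = 'OLTP'
        and    tbl.name in ('S_DOC_AGREE','S_AGREE_ITEM'));
```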

The Date dimension in BI Apps 7.9.5 is not checked as a Time dimension. Do you know if there is any particular reason why?


I found an answer to something that bothered me for a while.
In setting up a Time dimension, there are requirements on the physical tables that can be used. For example: a Time dimension table cannot join to any physical table other than the fact table; Time dimension sources in the physical layer cannot be part of a complex join; and in the Business Model layer, any logical dimension defined as a Time dimension cannot be part of any other logical table. In the current 7.9.5 RPD, the Date dimension does not meet some of these requirements. So if we were to designate the out-of-the-box OBI Apps RPD Date dimension as a Time dimension, we could not have complex joins to W_DAY_D in the physical layer. Currently the out-of-the-box OBI Apps RPD has several complex joins defined against the W_DAY_D_Common alias table, which is the detail-level LTS in the Date logical dimension. This causes errors if you then check the Time dimension flag and run a consistency check.
A comment on this from one of our consultants is below: “Apps 7.9.5 was not ready to convert it to a true time dim. If you check the checkbox you will see all of the consistency errors. Part of it is due to the date fields being used in the inner joins of LTSs on other Dims and Facts which is a no-no for the Time Dim. It is easier just to create your own Time Dim that is used for the Time Series formulas. Or you could configure time series the old school way.” However, OBI Apps 7.9.6 does have the Date dimension checked as a Time hierarchy. There was also an additional response from Engineering, as below.
The simple answer was that we didn’t use the OBIEE Time Series functions in BI Apps 7.9.5, and used them in BI Apps 7.9.6, hence configuring the Date dimension as a Time dimension there. In BI Apps 7.9.5, none of the new OBIEE Time Series functions were used. This was because the functionality was immature and had many bugs. These bugs were fixed in OBIEE 10.1.3.4.1, and BI Apps 7.9.6 has taken up and used the new Time Series functions quite a bit. So for correct functionality of BI Apps 7.9.6, OBIEE 10.1.3.4.1 is a must.
This is interesting. My favorite part is “In BI Apps 7.9.5, none of the new OBIEE Time Series functions were used. This was because the functionality was immature and had many bugs. These bugs were fixed in OBIEE 10.1.3.4.1” – so if you’re on 10.1.3.2, you should upgrade as soon as possible (although this might take months). I knew the time dimension was broken for a while – this is just official confirmation.