More on Connection Pools…

We have discussed the basics of the connection pool in one of the previous posts. The connection pool is a very important repository object, and we should know it well. Moreover, many interview questions are based on it. Now, let's understand something more about it…
Below is an image of the connection pool dialog.
We have 3 tabs:
1) General     2) XML     3) Write Back
In the General tab,
We need to give the Name of the connection pool.
Call Interface specifies the application programming interface (API) that Siebel Analytics will use to access the data source. The list of available call interfaces depends on the type of DB we are using as the data source; for Oracle, for example, we have OCI, ODBC, etc.
Maximum Connections sets the maximum number of concurrent connections allowed through this connection pool. The default value is 10.
Require fully qualified table names is used to have the table names carry complete details, like DB.schema.table_name. When this option is selected, all requests sent from the connection pool use fully qualified names to query the underlying database.
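For example, with the option checked, the physical SQL references tables as database.schema.table (hypothetical names here):

    -- Fully qualified table name in the generated query:
    SELECT region_name
    FROM   ORCL.SIEBEL.W_REGION_D;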
In Data Source Name we need to give the host string of the DB to which the queries will be routed. We need to give valid logon info for the data source, else we will fail to connect to that particular data source.
If the Shared Logon option is checked, then all connections to the database that use this connection pool will use the user name and password specified in the connection pool. If it is not checked, connections through the pool use the database user ID and password specified in the DSN or in the Siebel user profile.
In Username & Password and password u need to specify the Username and password of the underlying DB. We can also create a variable for the Username and make changes in that when we intend to change the underlying DB.
Enable Connection Pooling allows a single DB connection to remain open for a specified time so that future query requests can be routed through the same connection. This saves the overhead of opening and closing new connections; if the option is unchecked, the DB needs to open a new connection for each query.
Timeout specifies how long the connection to the data source remains open after a request completes. During this time, new requests use this connection instead of opening a new one. If the timeout value is set to 0, connection pooling is disabled.
Execute Queries Asynchronously specifies whether the data source supports asynchronous queries.
Execute on Connect allows the Siebel Analytics administrator to specify a command sent to the DB each time a connection to the DB is established. It can be any command accepted by the DB.
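For instance, we might fix the session's date format on every new connection (a hedged Oracle example):

    -- Issued once each time a new connection is opened:
    ALTER SESSION SET NLS_DATE_FORMAT = 'YYYY-MM-DD';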
Parameters Supported, when checked, indicates that all the DB parameters mentioned in the DB features file are supported by the Siebel Analytics Server.
Isolation Level controls the transaction locking behaviour for all statements issued by a connection. The four options are below, with a sketch of the equivalent standard SQL after the list.
  • Committed Read - Locks are held while the data is read, to avoid dirty reads. The data can still be changed by others before the transaction on that connection ends.
  • Dirty Read - No locking. Can read uncommitted or dirty data, and values in the data can change during the read within a transaction. The least restrictive of all levels.
  • Repeatable Read - Places locks on all data used in a query so that nobody can update it. New rows can still be inserted by other users, and these become visible in later reads within the current transaction.
  • Serializable - Places a range lock on the data set, preventing other users from inserting or updating rows in the set until the transaction is complete. The most restrictive of all.
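In database terms these map onto the standard SQL isolation levels; a minimal sketch of what would be issued on the connection (standard SQL, which the underlying DB may or may not fully support):

    -- Dirty Read: no locking, least restrictive
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
    -- Committed Read: no dirty reads
    SET TRANSACTION ISOLATION LEVEL READ COMMITTED;
    -- Repeatable Read: read locks held for the transaction
    SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
    -- Serializable: range locks, most restrictive
    SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;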

Connection Pool in Admin Tool….

I'm back again after some gap… Now, let's understand what a connection pool is…
The connection pool is an object in the Physical layer that describes access to the data source. It contains information about the connection between the Siebel Analytics Server and that data source. 
The Physical layer in the Administration Tool contains at least one connection pool for each database. When you create the physical layer by importing a schema for a data source, the connection pool is created automatically. You can configure multiple connection pools for a database. Connection pools allow multiple concurrent data source requests (queries) to share a single database connection, reducing the overhead of connecting to a database.
For each connection pool, you must specify the maximum number of concurrent connections allowed. After this limit is reached, the Analytics Server routes all other connection requests to another connection pool or, if no other connection pools exist, the connection request waits until a connection becomes available.
A word of caution: increasing the allowed number of concurrent connections can potentially increase the load on the underlying database accessed by the connection pool.
In addition to the potential load and costs on the database resources, the Siebel Analytics Server allocates shared memory for each connection upon server startup, so raising the number of connections also increases Siebel Analytics Server memory usage.
I would like users to share their comments also…

Implicit Fact Column in the Admin Tool

Many a time I have heard of users facing issues with the server picking improper joins in the Admin Tool. Let's take an example to understand this, and let's see the solution…


Take an example where we have 2 dimension tables, viz. Day Dim & Opty Dim, and 2 fact tables, viz. Revenue Fact & Order Fact. Now, I want to build my report taking only 2 dimension columns from the 2 dim tables and no fact column, e.g. I want to see the number of opportunities in a year.
What will happen???
When the report runs, the data is fetched by joining the dimension (Day Dim) with the dimension (Opty Dim) through a fact table (Order Fact or Revenue Fact). Now, in such cases the server sometimes gets confused about which fact table it should join through to fetch the data. What the server then does is join through the nearest fact (e.g. Order Fact) and pull the data through it.
But this can sometimes cause issues and may fetch incorrect data. If the user wants to see the data based on the Revenue Fact but the server joins with the Order Fact and pulls the data, the count of opportunities per year can differ, as there may be different opportunity WIDs in the two facts.
I guess you got what I meant…


So, to take care of this issue we can assign an Implicit Fact Column in the presentation catalog in the Admin Tool. This column is generally the lowest-granularity column of the fact table, like ROW_WID. By setting this column on the presentation catalog, the server takes the join from the particular fact table whose column is specified as the implicit fact column for that presentation catalog.
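To make this concrete, here is a rough sketch (hypothetical physical names) of the query the server could generate for "number of opptys per year" once a Revenue Fact column such as ROW_WID is set as the implicit fact column:

    SELECT d.cal_year,
           COUNT(DISTINCT o.row_wid) AS opty_count
    FROM   day_dim d
           JOIN revenue_fact f ON f.day_wid  = d.row_wid   -- join forced through Revenue Fact
           JOIN opty_dim o     ON f.opty_wid = o.row_wid
    GROUP  BY d.cal_year;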


We can set the implicit fact column as shown below:
User feedback is invited; kindly post if you have faced or are facing a similar issue.

Why start and stop SAS & SAW in a defined order?

Did it ever cross your mind why we start and stop SAS (Siebel Analytics Server) & SAW (Siebel Analytics Web) in a defined order???
Well, it came into mine… And here is the reason why…
But first, let's talk about the order of starting and stopping the SAS & SAW servers…
Start
SAS -> SAW
Stop
SAW-> SAS

Now, SAS is started before SAW because if you start SAW while SAS is not running, you won't be able to log in to the application, since authentication is done through SAS only. So SAS needs to be started before SAW.
While stopping, SAW is stopped before SAS because, if SAS is stopped before SAW, then:
1) Users won't be able to query anything or run any reports.
2) When we use the Microsoft IIS web server for SAW, IIS creates temporary files for its web caching in the location specified by the IIS installation. If SAS is stopped before SAW, these temp files remain behind, consuming disk space.
3) SAW saves backups of the Web Catalog file while a session is running and changes are being made to the Web Catalog. As part of the SAW shutdown process, the backup file is merged with the Web Catalog and the changes are synched into the original Web Catalog file; if we shut down SAS before SAW, there is a chance that some changes get missed.
What do you think, can there be other reasons too??

What's Dimensional Modelling…??

If we remember, previously we discussed the 2 common industry techniques, viz. Data Warehousing & Dimensional Modelling. So I have brought a brief overview of what exactly dimensional modelling (DM) is…
So let's start…
What's DM????
Dimensional modelling is a logical design technique used for DWs.
Let's understand it in a different way. If a user wants to see a report of, for example, dollar value by region, what will he want to see???
He would like to see a dollar column alongside region…
Based on the user requirement, the data needs to be segregated into facts and dimensions, and there should be a proper relationship between the fact & the dimension so that the user can see the data in the desired format.
So, segregating the data into facts and dimensions, and logically organizing it in a way the user can understand, is what is done in Dimensional Modelling.
Now people have a general tendency to confuse this with ER diagrams, so let's see what exactly ER diagrams are…
An ER diagram is also a logical design technique, but it is used for transactional DBs like OLTP systems. It aims at removing redundancy and hence is best suited to transactional queries, as it makes transactions simpler and deterministic.
DM is a logical design technique which aims to provide a standard framework for high query performance and is dimensional in nature.
So, what's the relationship between DM and an ER diagram, and how does an ER diagram take the shape of a DM???
Let's see how it happens…
The key to understanding the relationship between DM & ER is that a single ER diagram breaks down into many DM diagrams.
Think of a large ER diagram holding all the business processes & relations of a company. Now:
1) Separate the ER diagram into its distinct business processes and model each one separately.
2) Identify the many-to-many relations in the ER diagram that contain numeric and additive nonkey facts, and designate them as fact tables.
3) Denormalize the remaining tables into tables with single keys (dimension tables) and join them to the fact tables.
Conformed Dimension: When a dimension table joins to more than 1 fact table, it is represented in both schemas and is referred to as conformed between the 2 dimensional models.
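Putting steps 2 and 3 together, here is a tiny star-schema sketch with hypothetical names, where day_dim is conformed between the two fact tables:

    CREATE TABLE day_dim (
        row_wid   INTEGER PRIMARY KEY,
        cal_date  DATE,
        cal_year  INTEGER
    );
    CREATE TABLE revenue_fact (
        day_wid   INTEGER REFERENCES day_dim (row_wid),
        opty_wid  INTEGER,
        revenue   DECIMAL(15,2)    -- numeric, additive nonkey fact
    );
    CREATE TABLE order_fact (
        day_wid   INTEGER REFERENCES day_dim (row_wid),
        opty_wid  INTEGER,
        order_amt DECIMAL(15,2)
    );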
The above process can be better understood from the images of the ER diagram & DM diagram below.
ER Diagram
DM Diagram
This is how we end up doing DM and creating a DW from an OLTP system…

Back to basics…..

In some previous posts we have discussed Dimensional Modelling, Data Warehousing, Facts and Dimensions, etc…
But where and how are all these things used, and how are they related to Analytics????
Actually, Analytics is a reporting tool which aims to provide users the reports they need without much effort and in less time. So, to fulfill these requirements we use the above-mentioned techniques…
Analytics generally makes use of DWs for storing data. The DWs have dimensional modelling in place, which gives users a fast response time, meaning that when a user runs a report to fetch data, it all happens in seconds…
DWs built with the dimensional modelling technique have a denormalized data structure, which means the facts and dimensions in the DB carry duplicated data, and that helps fetch data in less time. This can be better understood by reading the previous posts about DW and DM…
DWs are maintained to store historical data; they don't hold live transactional data the way OLTP systems do. Data needs to be loaded into the DW at different frequencies depending on the client's requirements. This movement of data can be done through many tools.
This process of moving data between OLTP & OLAP is called ETL, i.e. Extraction, Transformation & Loading; a toy SQL sketch follows the three definitions below.
Extraction – The process of extracting data from the OLTP DB.
Transformation – The process of conditioning the data based on how it needs to be stored in the DW.
Loading – The process of loading the data into the DW.
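As a toy illustration of all three steps in one place, here is a hedged sketch in plain SQL with hypothetical table names (s_order as the OLTP source, order_fact and day_dim in the DW); real ETL tools do far more, but the shape is the same:

    -- Loading: insert into the DW fact table
    INSERT INTO order_fact (day_wid, opty_wid, order_amt)
    -- Extraction: read rows from the OLTP order table
    SELECT d.row_wid,                     -- Transformation: date -> surrogate key
           s.opty_id,
           s.amount * s.exchange_rate     -- Transformation: normalize currency
    FROM   s_order s
           JOIN day_dim d ON d.cal_date = s.order_date;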
There are several tools available for the ETL process…
Informatica, DataStage, Ab Initio, etc…

Clustering & Load Balancing Of BI Server

I have read something and would like to share it with you… Might help… have a look…
Clustering capability allows for the deployment of a multi-server environment to better manage large volumes of users and to provide high availability.
TYPES OF CLUSTERS
Below are the different types of computer clusters.
High-Availability (HA)
The main function of high-availability clusters is to improve the availability of the computing system in case of a breakdown. In HA clusters, not all nodes may be contributing to the operation at a given moment; some nodes are redundant standbys that operate only if the main node fails. This configuration is referred to as an 'active-passive' deployment.
Load-Balancing
Load-balancing clusters are used to increase performance by distributing the load among the different computing nodes in the cluster. A load balancer installed in front of all the computing nodes is responsible for distributing the load. If a node fails, the remaining node(s) take over to service future requests, so load-balancing clusters also act as HA clusters. This configuration is referred to as an 'active-active' deployment.
High-Performance Computing (HPC) 
The purpose of HPC is to build a massively parallel computer from a set of small machines. HPC is focused on raw performance, not on high availability; HPC installations are not highly available and are usually rebooted often. HPC is often used in scientific calculations.
Grid Computing 
Grid computing achieves some of the HPC features; unlike HPC, the grid treats each node in the cluster as a separate machine rather than as part of a single computer. This gives the grid the robustness missing in HPC, but at the expense of performance compared to HPC.
Below is a link to a picture showing how clustering works…
What is a Load Balancer????
The load balancer’s job is to distribute requests coming from clients across the different Oracle BI Publisher servers installed in the cluster. The minimum requirement for the load balancer is to support session persistence (sticky sessions), by which it redirects requests from each unique client to the same server.

Differences Between Siebel Analytics & OBIEE

I have heard this question multiple times from many people. So let me share the answer with you…
Oracle BI Answers-Based Metadata Dictionary Feature
This feature enables administrators to analyze metadata repository statistics in Oracle BI Answers. The OBIEE server can generate a metadata dictionary that describes the metrics contained in a repository and the attributes of repository objects. It is an XML document.
In OBIEE this can be accessed directly from the BI Answers selection pane, where specific metadata information is shown to guide report construction. Once this BI Answers-based functionality has been enabled, an icon appears next to each Subject Area, table name, and column shown in the selection pane for that Subject Area. Clicking the icon opens a Metadata Dictionary page showing information for that element, with links to related repository information.

Multi-Select Dashboard Prompts Feature
This feature enables users to analyze data more easily. Prior to this, a multi-select dashboard prompt with a large number of values could become difficult to work with.
A new search feature makes it easier to navigate and select values from a multi-select prompt when there are many values. Modifications to the existing multi-select dashboard prompt now allow the user to search through the available values using four matching schemes: begins with, ends with, contains, and is Like (pattern match); wildcards like % can also be used in the query (roughly the SQL LIKE patterns sketched below). A more sophisticated paging mechanism is also present, which lets the user page ahead through a configured search-set size as needed. Some other features have also been added to multi-select prompts, like copy-pasting of values and selecting multiple values using the Shift key.
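Roughly speaking, the four matching schemes correspond to SQL LIKE patterns (hypothetical column name):

    -- begins with 'West':
    WHERE region LIKE 'West%'
    -- ends with 'West':
    WHERE region LIKE '%West'
    -- contains 'West':
    WHERE region LIKE '%West%'
    -- is Like (pattern match): the user supplies the wildcards
    WHERE region LIKE 'W%t'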
Microsoft Office Integration with BI Dashboards Feature
2 new Interactive Dashboard features have been added to provide additional interactivity with Microsoft Office.
1) Download to PowerPoint
This new link is found within the “Report Links” Download menu and downloads a static version of the report to PowerPoint. The Oracle BI Office Add-in does not need to be installed to use this functionality.
2) Paste reports directly into MS Office applications
This functionality is also made available along with the other dashboard Report Links configured using the Dashboard Editor. A new “Copy” link can be added to a report, which copies the XML definition of the current report to the Windows clipboard. Pasting it using the Paste function found in the Oracle BI menu converts the copied XML into an Office version of that report. This functionality only works where the Oracle BI Office Add-in has been installed. Once pasted, the dashboard report is translated into native Office objects that can subsequently be refreshed with live data at any point.
Import Oracle Catalog Feature
Enables admins to import database objects through a native Oracle OCI connection.
The OBI Admin Tool supports importing physical database objects through a native Oracle OCI connection and reuses the same OCI connection for queries. This removes the need to set up an ODBC connection to an Oracle database purely for importing metadata objects into the Admin Tool.
Embedded Database Functions Feature
Enables users & Admins to create more powerful reports by directly calling Database functions from either Oracle BI Answers or by using a Logical column (in the Logical Table source) within the Metadata (repository). Eg: Evaluate function etc.
Presentation Variable
A new type of variable has been introduced which helps accept dynamic values from the user in Oracle BI Answers. A presentation variable must be declared in a dashboard prompt (using the Set Variable field), and its name and value are determined by the user.
Format: @{variables.variable_name}
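A quick sketch with a hypothetical variable name: if a dashboard prompt sets a presentation variable pv_region, a report filter can reference it like this:

    -- In an Answers filter; the value chosen in the prompt replaces the token:
    "Markets"."Region" = '@{variables.pv_region}'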
Time Series Functions
The Time Series Wizard has been removed from OBIEE, and new functions called time series functions have been introduced in its place.
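For example, the AGO and TODATE functions (a sketch with hypothetical measure and level names):

    -- Revenue for the same period one year ago:
    AGO("Sales"."Revenue", "Time Dim"."Year", 1)
    -- Year-to-date revenue:
    TODATE("Sales"."Revenue", "Time Dim"."Year")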

Working with Joins in the Admin Tool…

Joins are a very common thing, yet most people don't pay much attention to how and when we use physical joins and when we use complex joins… Let's see the how and why of joins…
Primary-Foreign Key Joins
PK-FK joins are generally used in the Physical layer. These joins are required to form the physical query with which the data is pulled from the underlying DB. They are the joins between the PK and FK columns of the facts and dimensions, and they are defined between specific columns of the 2 tables. As shown in the pic below, the join is defined between 2 columns, which can be seen from the expression.
Physical Foreign Key Joins Diag.
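A typical physical foreign key join expression simply equates the fact FK column with the dimension PK column (hypothetical names, reusing the star schema from earlier):

    -- Fact FK column equated to the dimension PK column:
    revenue_fact.day_wid = day_dim.row_wid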
We generally make use of PK-FK joins in the Physical layer, where they are called physical foreign key joins.
We can also make use of PK-FK joins in the BMM (logical) layer, where they are called logical foreign key joins.
PK-FK joins appear in the BMM when we drag an entire schema from the Physical layer into the BMM layer. These joins can exist in the BMM only when we have just 1 logical table source (LTS) per dimension.
But as a best practice we should restrict the use of logical FK joins in the BMM, because if a future requirement calls for more LTSes in a dimension, we can't add them if we have used PK-FK joins in the BMM.
Logical Foreign Key Joins Diag.
Complex Joins
These joins are generally used in the BMM layer. Specifying complex joins gives the Siebel Analytics Server the metadata necessary to translate a logical request against the business model into SQL queries against the physical data sources. The complex join information provides the Analytics Server with the many-to-one relationships between the logical tables, and this logical join information is used when the server generates queries against the underlying databases.
A complex join takes into consideration the joins in the Physical layer and makes the joins at the BMM layer accordingly, i.e. the join is specified between 2 tables and we don't specify any columns; the Analytics Server itself picks the join columns based on the joins in the Physical layer.
   Logical Join BMM Layer Diag.
When we use a complex join in the Physical layer, it acts as a one-to-one join, and the join type (left, right outer) and cardinality options are not available. Complex joins in the Physical layer are used only in specific cases where we have to join tables based on some condition.
We select the columns of the tables between which we need the join, and we can place any condition in the expression based on the requirement, as in the example after the diagram below.
Complex Join Physical Layer Diag.
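For instance, a physical-layer complex join expression carrying an extra condition might look like this (hypothetical names):

    -- Key match plus an effective-date restriction in one join expression:
    order_fact.day_wid = day_dim.row_wid
    AND day_dim.cal_date BETWEEN order_fact.start_dt AND order_fact.end_dt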

Let's start with the Basics

Let's start with some basics on why we use Analytics. I know this is very basic and everybody must know it already, but there are still a number of newbies who want to learn BI and search for material on it. So let's start…
WHY ANALYTICS???
First of all, Analytics is a reporting tool which provides users with the data and the tooling to answer the questions that are important to running their business. E.g. users such as managers need reports to check how their business is doing, what the profits and losses are, what opportunities exist in which regions, etc. These reports can be made available to users at just a click using Analytics. The users don't have to write queries or pull out Excel sheets and make all that effort.
Other analysis tools are generally difficult to use and require detailed knowledge; they may require writing queries or compiling data from many sources, while pure reporting tools are generally static and less interactive. Hence, Analytics acts as a complete package and can help users in all these scenarios.
Analytics provides insight, processing, and prebuilt solutions that allow users to access critical business information and acquire the business intelligence to achieve optimal results. We can access large amounts of business data frequently using Analytics. It makes use of the basic industry techniques of data warehousing and dimensional modelling.