As promised in the previous post.. lets learn some more on the DW Schemas and the Facts and the Dimension Tables. Before jumping directly to Facts and Dimensions lets first understand where and why exactly are these Facts and Dimensions used?? So lets start with the DW Schemas….
DW Schemas
A Schema is a collection of Database objects like tables, views, indexes etc. The DW schema is designed based on the model of the source schema and the requirements from the users. There are number of other things that decide the design of a DW and can be classified in 2 categories:
Physical Design
Logical Design
Physical Design
Logical Design
But we wont go into so much details about the DW and lets focus on the Schema Types.
So as discussed in the previous post there are 3 types of Schemas:
Star Schema
This is simplest DW schema. It resembles a star with points radiating from the center.The center of the star is the Fact Table and the points of the star are Dimension Tables. A star schema resembles a Start Topology in networking and in this the Fact Table at the center joins to the individual Dimension tables at the corners of the star with only one join. Each Dimesion table is joined to the Fact table using the Primary Key and the Foreign Key join. The Dimension Tables are not joined to each other.
A Star Schema optimizes performance by keeping queries simple and providing fast response time. All the information about each row is stored in one row.
A Star Schema optimizes performance by keeping queries simple and providing fast response time. All the information about each row is stored in one row.
Snowflake Schema
This is a more complex schema as compared to a Star Schema and is called Snowflake schema cz its ER Diagram resembles a Snowflake. The Snowflake Schema normalizes dimesions to eliminate the redundancy which means that the Dimension data is grouped into multiple tables instead of a single table. While this saves space at the same time it increases the number of Dimension tables and hence more number of foreign key joins and hence complex queries and reduced query performance. In this theDimesion table is joined to another Dimension table.
Hybrid Schemas
These are the third type of schemas and are hardly used. These are a combination of number of schemas.
Now you must be inquisitive to know that what are the Fact and Dimension tables which i have used in my previous posts as well as the present post. So lets move ahead with that…..
Facts and Dimensions are the DW objects.
Fact Tables
The fact tables are large tables that store business measurements known as measures or facts. They typically contain facts and foreign keys to the dimension tables.
Eg: We have Customer Dimension and Sales Fact. Then the Sales fact will contain the Sales information of the various customers like profit, total sales etc.
The fact tables generally contain facts at the same level of aggregation which means that all the facts present in the fact table are at the same level. To better understand this lets again take an Eg.
Now in the Sales Fact all the data that will be present will be based on the customer data, like a single customer A will have his profit or sales listed in the Sales Fact and same is the case with customer B and so on. Means the enitre data is at the same level.
Eg: We have Customer Dimension and Sales Fact. Then the Sales fact will contain the Sales information of the various customers like profit, total sales etc.
The fact tables generally contain facts at the same level of aggregation which means that all the facts present in the fact table are at the same level. To better understand this lets again take an Eg.
Now in the Sales Fact all the data that will be present will be based on the customer data, like a single customer A will have his profit or sales listed in the Sales Fact and same is the case with customer B and so on. Means the enitre data is at the same level.
3 types of facts:
Additive : Can be aggregated by simple maths calculation.
Semi Additive: Can b aggregated along some dimension and not along others.
Non Additive: Cant be aggregated at all.
Semi Additive: Can b aggregated along some dimension and not along others.
Non Additive: Cant be aggregated at all.
Dimension Tables
Dimension attributes help to describe dimension value. They are normally descriptive, textual values. Several distinct dimension when combined with facts enable u to answer a business question. Each Dimension table has a Unique Identifier for each record which becomes the PK for the Dimension tables.
If Fact is an Entity then Dimension are the attributes that best define the fact.