One of the nice little features of the BI Server that is rarely used in a pure BI EE implementation is the Persistent Connection Pool. As you probably know, the BI Server can add one more layer of processing, when required, through its in-memory processing engine. For example, in the last blog entry I showed how the BI Server performs an in-memory stitch join to achieve standard non-transactional dimension member reporting. But sometimes too much BI Server processing can slow queries down considerably. One of the most common issues with BI EE is that queries degrade badly when a large number of literals (upwards of 1000) lands in the IN clause of the generated SQL. To avoid such issues, the BI Server provides an option called the Persistent Connection Pool.
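To see why long literal lists hurt, consider the kind of SQL the BI Server pushes down. On Oracle, a single IN list is capped at 1000 expressions (exceeding it raises ORA-01795), and even below that limit very long literal lists bloat parsing. A minimal sketch against the SH schema (the literal PROD_ID values here are made up for illustration):

-- Pushed-down query with literals in the IN clause. On Oracle,
-- more than 1000 expressions in one list raises ORA-01795;
-- imagine thousands of PROD_ID literals in place of these four.
select T2976.PROD_NAME as c1, sum(T3013.AMOUNT_SOLD) as c2
from PRODUCTS T2976, SALES T3013
where T2976.PROD_ID = T3013.PROD_ID
  and T3013.PROD_ID in (13, 14, 15, 16)
group by T2976.PROD_NAME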
A persistent connection pool pushes the processing down into temporary tables rather than doing it in BI Server memory. Unfortunately, you cannot currently exercise much control over how and when the temporary tables are created, but used correctly they can improve performance to an extent. This is very similar to Oracle Data Integrator, where temporary tables can be leveraged to make joins across databases. For illustration, I shall be using a very simple report on which we will enable the persistent connection pool.
Assume that we are reporting on the SH schema with a very simple repository design. We have a report that references another report in its filter, as shown below.
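For reference, the logical SQL behind such a report embeds the second request as a sub-query in the filter. A rough sketch of what Answers generates (the subject area and presentation column names below are illustrative, not taken from the actual report):

SELECT Products."Prod Name" saw_0, "Sales Facts"."Amount Sold" saw_1
FROM "SH"
WHERE Products."Prod Name" IN
      (SELECT Products."Prod Name" FROM "SH"
       WHERE "Sales Facts"."Amount Sold" > 5000)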
Also, for now let's assume that the database does not support IN_SUBQUERIES, as shown below.
So, in order to generate the above report, the BI Server fires the two separate queries shown below.
select distinct T2976.PROD_NAME as c1
from PRODUCTS T2976
where ( T2976.PROD_NAME in ('1.44MB External 3.5 Diskette', '128MB Memory Card',
    '17 LCD w/built-in HDTV Tuner', '18 Flat Panel Graphics Monitor',
    '256MB Memory Card', '3 1/2 Bulk diskettes, Box of 100',
    '3 1/2 Bulk diskettes, Box of 50', '5MP Telephoto Digital Camera',
    '64MB Memory Card', '8.3 Minitower Speaker') )
select T2976.PROD_NAME as c1, sum(T3013.AMOUNT_SOLD) as c2
from PRODUCTS T2976, SALES T3013
where ( T2976.PROD_ID = T3013.PROD_ID
    and T2976.PROD_NAME in ('1.44MB External 3.5 Diskette', '128MB Memory Card',
    '17 LCD w/built-in HDTV Tuner', '18 Flat Panel Graphics Monitor',
    '256MB Memory Card', '3 1/2 Bulk diskettes, Box of 100',
    '3 1/2 Bulk diskettes, Box of 50', '5MP Telephoto Digital Camera',
    '64MB Memory Card', '8.3 Minitower Speaker') )
group by T2976.PROD_NAME
order by c1
So, in effect, the BI Server converts the sub-query into individual literals and then pushes them down into the WHERE clause of the main query. Now, let's go to the database properties and enable persistent connection pooling.
When a persistent connection pool is enabled, the BI Server assumes there is no need for in-memory processing wherever possible and tries to push all of the processing down to the physical layer using temporary tables. The temporary tables are created and populated based on the transaction boundary defined in the connection pool.
As you can see, one can specify the database and tablespace-related properties of the temporary table. Now let's go back and run the same report. This time you will notice three different queries. The first is the sub-select query fired separately. The second is a CREATE TABLE query. This is where it gets interesting, as it opens up other possibilities that we can put to good use; I will cover them later.
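For the record, that second query is just a plain CREATE TABLE issued against the database and tablespace chosen in the connection pool. A hedged sketch of what it might look like (the table name TT4107 is made up here; the real name is system-generated, and the column sizing follows PROD_NAME in the SH schema):

-- Sketch of the generated DDL; the BI Server picks the actual name
create table TT4107 (c1 varchar2(50)) tablespace USERS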
I would have preferred a Global Temporary Table to a normal table on Oracle, but the idea still looks promising. Once the table is created, the BI Server populates it with the sub-query results it holds in memory, and then references this temporary table in the sub-query of the third query.
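Putting the pieces together, the population and the rewritten main query would look roughly like this (again using the made-up TT4107 name from the sketch above):

-- BI Server inserts the sub-query results it holds in memory...
insert into TT4107 (c1) values ('1.44MB External 3.5 Diskette');
-- ...one row per member, then rewrites the main query against it:
select T2976.PROD_NAME as c1, sum(T3013.AMOUNT_SOLD) as c2
from PRODUCTS T2976, SALES T3013
where T2976.PROD_ID = T3013.PROD_ID
  and T2976.PROD_NAME in (select c1 from TT4107)
group by T2976.PROD_NAME
order by c1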
Unfortunately, the BI Server does not do a full clean-up of the tables using PURGE. The DROP commands are not logged, and if you are on Oracle 10g you can still see the dropped tables in the Recycle Bin after execution.
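If you want to verify this yourself on 10g, the recycle bin view shows the dropped tables, and a manual purge cleans them out:

-- Dropped temp tables linger here until purged
select object_name, original_name, droptime from user_recyclebin;
-- Manual clean-up
purge recyclebin;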
This is an interesting feature, used extensively by the Marketing Segmentation engine. Unfortunately, there is not much control we can exercise over it, and DBAs would not be happy, at least with the way the tables are currently created and dropped. Also, the isolation levels have to be set correctly to avoid deadlock-like scenarios. Having said that, this can be put to good use in some other reporting use cases, which I shall cover later.