Well! A very interesting and controversial topic indeed, and a typical one for group discussions. OBIEE is still not especially well established in the OLAP caching arena, and there is a great deal of debate humming around BI Server caching and, consequently, around the performance tuning it enables. There are real performance benefits to be had, though at the cost of a couple of trade-offs. At the end of the day the basic rule is: do enable caching, but do it judiciously and make sure it will not get you into trouble later. In other words, configure caching only where the substantial benefits outweigh the trade-offs. Here I focus solely on caching at the OBI server (SAS) level rather than the OBAW server (SAW).
Introduction
OBI has the ability to cache query results, so that previously processed requests do not pass through to the database again but are instead served from the filesystem.
Caching Framework
Benefits
It provides significant performance improvements by freeing up database resources for other tasks, so the database can entertain other sets of user queries.
Processing a query from the native filesystem and returning its results should be faster than shuttling the data to and from the database over the network.
Moreover, it conserves network resources by avoiding connections to the database server, so there is noticeably less network traffic.
It also reduces the BI Server engine's processing overhead across the queue of user requests.
However, all of the above is easy to appreciate and difficult to manage: the flexibility of a faster response can leave users with an inconsistent view of the data if it is not properly planned and managed.
This management and planning come with several glitches and plenty of dependencies; the activity is known as Cache Purging, as opposed to its counterpart, Cache Seeding.
Cache Management Techniques
Change the cache configuration parameters first. Note that if the disk space for cache files exceeds its limit, the least recently used (LRU) entries are discarded and replaced automatically to make room for new ones. The techniques below can be broadly distinguished as manual, automatic and programmatic ways to manage caching:
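As a reference point, these parameters live in the [ CACHE ] section of NQSConfig.ini. The snippet below is a minimal sketch; the paths and sizes are purely illustrative, not recommended values:

    [ CACHE ]
    ENABLE                   = YES;      # master switch for query caching
    # multiple paths can be listed to spread cache I/O across filesystems
    DATA_STORAGE_PATHS       = "/obi/cache1" 500 MB, "/obi/cache2" 500 MB;
    MAX_ROWS_PER_CACHE_ENTRY = 100000;   # 0 means unlimited rows per entry
    MAX_CACHE_ENTRY_SIZE     = 1 MB;
    MAX_CACHE_ENTRIES        = 1000;     # LRU eviction applies beyond this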
Cache Seeding
There are several ways to go about this business:
1) Set the global cache parameter on – This caches queries running against any physical table. By default, every table in the repository is cacheable.
2) Switch on the Cacheable property – This provides a table-level benefit and extra customisation of which physical tables participate in generating the query cache. E.g. sometimes you are more interested in caching a giant fact table than the tiny dimension tables.
3) Scheduling an iBot – An iBot can be configured and used for cache seeding. It silently builds the cache without any manual intervention, and can be triggered in a time window after the daily ETL load finishes, or customised and automated further based on the result retrieved from another iBot in a chained request. In that pattern the second iBot must ping the database to identify whether the database update (i.e. the ETL finish) has actually happened before passing the request on to trigger its chained counterpart. This builds the data and query cache for a dashboard request, not for the entire set of tables.
4) Running the nQCmd utility – Another excellent way to handle query caching, with no scheduled-time dependency on the ETL load. The overhead is that the superset of the physical queries fired by your requests/reports must be accumulated into a single file, which is then passed as a parameter to nQCmd.exe. It necessarily has to be invoked after the ETL run, and by the ETL job itself: a remote login to the BI server can trigger nQCmd automatically, which avoids the time dependency of iBot scheduling (see the sketch after this list).
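A minimal sketch of the nQCmd approach, assuming an AnalyticsWeb ODBC DSN and a hypothetical seed_queries.sql file holding the logical SQL of the reports you want pre-warmed; the subject area and column names are illustrative, so lift the real statements from your query log or Usage Tracking:

    # seed_queries.sql: one logical SQL statement per report to pre-warm
    cat > seed_queries.sql <<'EOF'
    SELECT "Time"."Year", "Facts"."Revenue" FROM "Sales";
    EOF

    # replay the queries through the BI Server so the results land in the
    # query cache; call this from the last step of the ETL job
    nqcmd -d AnalyticsWeb -u Administrator -p "$BI_PASSWORD" \
          -s seed_queries.sql -o seed_run.log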
Cache Purging
A very important and crucial mechanism, one that has to be rock solid to make caching a success story:
1) Manual purging – Usually a dedicated, mechanical administration job, and something of an overhead for a company. It can be done simply by deleting the existing cache .TBL files, or by firing a purge from the BI Admin Tool. From the Admin Tool in online mode, the purge can be applied by category, i.e. repository, subject area, user, or physical table.
2) Calling ODBC extensions – Bundled ODBC extension functions such as SAPurgeCacheByDatabase(), SAPurgeCacheByQuery(), SAPurgeCacheByTable() and SAPurgeAllCache() can be called to free the cache for specific queries, tables, a database, or everything; see the Oracle documentation for details. They should be called via the nQCmd utility, just after the ETL load and before the cache seed, so that no time-window gap or dependency remains (see the sketch after this list).
3) Event polling table – A nice and robust concept, but not so nice to manage, and it brings extra overhead. Still, it is a good technique for making the BI server aware that a database update has occurred, so it can carry on with its purging business. The cache polling frequency is an important setting and should be predefined carefully to make this a success. The polling table is populated by a database trigger that inserts a row each time a target table is updated; the Analytics server polls that table at the configured interval and invalidates the cache entries corresponding to the updated tables (see the second sketch after this list).
4) Calling iBots to purge the cache – This can be done by calling custom Java scripts, which in turn call nQCmd and the ODBC extensions to free the cache. The catch with this feature, again, is that the iBots need to be scheduled just after the ETL run and before the cache seed, so you cannot be sure about stale data if the ETL misses its SLA. This too can be handled by setting up chained iBots that trigger the purging activity at the proper time. The ground rule: never rely on an iBot's clock schedule alone; let the status picked up from the database trigger it.
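A minimal sketch of the ODBC-extension route (option 2), reusing the hypothetical AnalyticsWeb DSN from the seeding example; the database, schema and table names passed to SAPurgeCacheByTable() are illustrative:

    # purge_cache.sql: purge one physical table's entries, then everything
    cat > purge_cache.sql <<'EOF'
    {call SAPurgeCacheByTable('MyDB', 'MyCatalog', 'SALES_DW', 'W_SALES_F')};
    {call SAPurgeAllCache()};
    EOF

    # run the purge just after the ETL load and before re-seeding
    nqcmd -d AnalyticsWeb -u Administrator -p "$BI_PASSWORD" \
          -s purge_cache.sql -o purge_run.log

And for the event polling route (option 3), a sketch of the row an ETL job or after-load trigger would write into the standard S_NQ_EPT event table (connection details hypothetical); the BI server picks the row up on its next poll and invalidates the matching cache entries:

    sqlplus -s etl_user/"$ETL_PASSWORD"@orcl <<'EOF'
    INSERT INTO S_NQ_EPT (UPDATE_TYPE, UPDATE_TS, DATABASE_NAME,
                          CATALOG_NAME, SCHEMA_NAME, TABLE_NAME, OTHER_RESERVED)
    VALUES (1, SYSDATE, 'MyDB', NULL, 'SALES_DW', 'W_SALES_F', NULL);
    COMMIT;
    EOF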
Trade-off
Failing to purge outdated cache entries, known as stale caches, can potentially return inaccurate results over time. Think of a situation where your cache does not get purged on time after the ETL load finishes: the database has been updated, but the change is not reflected in your cached data, because the seeded cache on your filesystem holds outdated rows. The result is stale data, and inconsistent, wrong query results, which can cause huge confusion in users' minds. Cache retention and refresh timing are therefore important.
Not only that: on a large platform, the cache should be separated and spread across multiple folders/mount points for better utilisation of I/O across the filesystems. Query cache storage should sit on local, high-performance, highly reliable storage devices. The disk space consumed by cache files can itself become a bottleneck, so it should be governed by a proper cache replacement algorithm and a sensible policy for the maximum number of cache entries defined in NQSConfig.ini.

A potential cache problem arises in clustered environments: cache files built on one host are not shared with the others, so the same cache ends up being built on every cluster participant, since it cannot be guaranteed which BI server will handle which request. Until a request is actually processed, a cluster participant cannot determine that another node already has a cache hit for the generated query and that the request need not be processed again. Purging likewise needs to be done on every clustered server. The Cluster Aware Cache is propagated across clustered servers only when the cache is seeded via iBots, not by ordinary dashboard requests or Answers-based queries.

So, finally: if you understand your business and requirements, you can make cache management a success using any of the techniques above. But again, be cautious; caching is not the most reliable candidate, at least in my experience. I hope Oracle will keep looking at it to make it robust and beyond debate. Wish you the best and good luck with your cache implementation.