One of the issues that people often face while using BI EE is the lack of control on Multi-Select prompts. One cannot set Multi-Select prompts to a presentation variable & similarly one cannot display the multi-selected values in a report (only filter view supports the display of all the selected values). The most common requirement is to do string operations on the multi-selected values by treating them all as a single string.
The other requirement that i have seen in the past is to do String Aggregation on certain dimensional attributes. Currently BI EE does not have any specific aggregations for strings. One can theoretically use FIRST, LAST etc kind of aggregation on string columns but those are not aggregations per se. A typical string aggregation would involve concatenation of all the strings in a specific format.
Both the above requirements can be solved by a simple technique. From a requirements perspective, they are different. But from an implementation standpoint both are similar as both the requirements require BI EE to do a string concatenation. To accomplish this we shall be using the COLLECT database function that was introduced in 10g. If you are in pre-10g release, you can use the STRAGG function created by Tom Kyte. And if you somehow have upgraded to 11gR2, then you can useLISTAGG. The idea is, we push down the aggregation part of the strings to the database and then use them directly in BI EE. The example that i shall be going through today will involve a simple Mult-Select Prompt on the CHANNELS_DESC column of the CHANNELS table in the commonly used SH schema. The requirement is to display the chosen multi-select values as a column in the report as shown below
There are basically 2 ways of achieving this. I will demonstrate both of them here. Both the approaches require the aggregation to be pushed into the database. So, we start with creating a Type and a database function to convert the output of the COLLECT function to a set of strings.
create or replace type MultiSelect as table of varchar2(1000);
CREATE OR REPLACE FUNCTION MultiSelect_Pipe (p_MSP IN MultiSelect) RETURN VARCHAR2 IS var_msp VARCHAR2(4000); var_index NUMBER; BEGIN var_index := p_msp.FIRST; IF var_index IS NOT NULL THEN var_msp := '('''; WHILE var_index IS NOT NULL LOOP IF var_index <> 0 THEN var_msp := var_msp || p_msp(var_index) || ''','''; END IF; var_index := p_msp.NEXT(var_index); END LOOP; var_msp := substr(var_msp,1,length(var_msp) - 2) || ')'; END IF; RETURN var_msp; END MultiSelect_Pipe;
If we test the above function in SQL, we should get the output in a string concatenated form.
select distinct multiselect_pipe(cast((collect(channel_desc) Over ()) as Multiselect)) AS Channel_desc_agg from channels
Approach 1:
This approach will directly use the above function that we created above using EVALUATE from the repository. There are some drawbacks with this approach which shall be covering later. To implement this, we need to create a new column and basically use the same function within Evaluate function in the repository as shown below
EVALUATE('MULTISELECT_PIPE(CAST((COLLECT(%1) OVER ()) AS MULTISELECT))' AS CHARACTER ( 1000 ), "ORCL".""."SH"."CHANNELS"."CHANNEL_DESC")
If this column is exposed in the presentation layer, one can now use this column in the report directly. It will capture the multi-select values from the prompt as well.
And if you look at the query, you will notice that the database function gets pushed in to the sql. The filter values applied on the multi-select prompt will be honored by the database function as well.
Select MULTISELECT_PIPE(CAST((COLLECT(T26412.CHANNEL_DESC) OVER ()) AS MULTISELECT)) as c1, T26412.CHANNEL_DESC as c2, T24112.PROD_CATEGORY as c3, sum(T24170.AMOUNT_SOLD) as c4 from PRODUCTS T24112, CHANNELS T26412, SALES T24170 where ( T24112.PROD_ID = T24170.PROD_ID and T24170.CHANNEL_ID = T26412.CHANNEL_ID and (T26412.CHANNEL_DESC in ('Direct Sales', 'Tele Sales')) ) group by T24112.PROD_CATEGORY, T26412.CHANNEL_DESC order by c1, c2, c3
There are a couple of drawbacks with this approach. Since we are not isolating the filter on the String aggregated column, if your fact table does not have a value for a chosen prompt value, that will not be included in the output. For example, lets assume that in the multi-select prompt we are choosing 3 channel values i.e Catalog, Direct Sales and Tele Sales. There is no transaction for Catalog in the Fact table. So, the string aggregated column will show only Direct Sales and Tele Sales. This is not a true reflection of the Multi-Select prompt values.
Also, the CHANNEL_DESC_AGG column requires CHANNEL_DESC column to be part of the query as well. If not, BI EE will try to push the analytic function into the group by clause there by resulting in an error.
Approach 2:
Both the issues above can be negated by using another approach. This approach will use the same database function. But what we will be doing here is, we will be modeling our repository in such a way that BI EE will split the above the report’s single query into 2 separate queries. So we start with creating a new database connection (this is needed) and copying the CHANNELS table over to the new connection.
Then we create a physical layer join between the CHANNELS table in the original SH schema database and the CHANNELS table in the new database
In the BMM layer, we need to snowflake the Channels dimension by adding a new logical table called Channels_Agg. This new logical table will contain the EVALUATE column(same as in Approach 1) and the CHANNEL_DESC column.
Create a logical snowflaked join between Channels and Channels Agg logical tables. In the presentation layer, expose CHANNEL_DESC and CHANNEL_DESC_AGG columns from the Channel Agg logical table. Lets now test the above Multi-Select report again with the same 3 values (one missing in fact table)
As you see, the above is a true reflection of the Multi-Select prompt. It always shows the values chosen in the prompt no matter what values are present in the fact table. If you look at the SQL, you would notice 2 separate queries being fired by BI EE.
Also, this does not depend on the selection of CHANNEL_DESC column in the report.
Currently we had to use the analytic function since BI EE cannot do analytic equivalent functions directly from the repository (though one can mimic them using level based measures but we do not have complete control on the queries). Hopefully as new releases come out, we should see such capabilities in the repository itself.