Continuing on the discussion about OBIEE (Oracle BI EE) structure of the Logical Layer – please read Part 1 before this chapter.
To set the context, The discussion is around your choices in table design in the warehouse and I am only referring to the reporting in OBIEE on the datawarehouse, and your design considerations for Logical Tables, logical Table Sources and Physical Joins in the Physical Layer. Another point to note is that I am not too concerned about the theoretical difference between Snowflakes and Normalised, in the example shown in Part 1 you could call that a Snowflake or a Normalised design. Some have described Snowflakes as semi-normalised stars which is a fair decription, but the important point for the discussion is that there is at least one join between dimensions, which instantly makes it a non-star design, and is on the road to normalisation.
In part three I will look at a mixed model – Logical stars with normalised physical tables.
I do believe the actual dataWarehouse design and ETL routines should follow what you decide in your logical design considerations. Having said that though, I don’t think that the whole reporting system can be developed in a linear path, you often have to loop back at each stage when more clarity is found.
And finally, Please join in the discussion by posting your comments.
Now, on with the blog…
From Part 1 some of you may be thinking that I’m advocating that Stars are not always required in the warehouse. Well, I tried to put together a fair case for the use of normalised design as best as I could. Now, lets have a look at the case for a pure star.
Continuing with the Contact example used in part 1, we can redraw the model in a logical and physical pure star.
Running the same report.
And obviously it works, but look at the SQL:
T3489.”ATTRIB_11″ as c1,
sum(T3517.”BLAH_VALUE”) as c2
from “W_PERSON_DX” T3489,
where ( T3489.”ROW_WID” = T3517.”CONTACT_WID” )
group by T3489.”ATTRIB_11″order by 1
So this time the join is direct in the physical layer, and therefore the server does not add in another table to get the answer.
Is this reduction in joins THE main reason why you design a pure star? Or are there other reasons why star = good?
So on the face of it there is no problem with either method. The star method was very efficient in the query, minimising the number of joins. But at what cost to the ETL process? Here, in this example, none because it joined to an existing key, however if you were talking about say, Account Id, then the mapping would have to look it up and add it to the target table. Of course the Star Transformation in the db will struggle with the normalised schema (or will it – answers on a postcard).
But. I cheated. Twice. Firstly, this is a very simple model, with little data in, so design is not vital.
Secondly, what if I want to do this …..
Let’s build a hybrid. Surely we can use a star and introduce some level of normalisation and all will be well? Consider the following design.
The above ‘star’ has all the physical dimension tables in the single dimension logical table.
We have a logical star. With Physical table sources arranged in a physical star, except we have a few ‘spokes’ around some of the physical dimension tables (this is essentially what mosst people would call a Snowflake!). A common use here would be a ‘List of Values’.
Does it work? and is it ‘better’?
There are two considerations
- Errors is Analytics generating SQL
- Duplication of data.
I’ll explore these in part 3.