DO NOT COPY --CONFIDENTIAL
Homework 6b Partial Key
Generating a Fact Table for Henry Books
BCIS 4660
Dr. Nick Evangelopoulos
Spring 2012
Henry Books Final Star Diagram (from HW5)
Selecting the Fact Grain
• When you design a star schema, an important decision is the granularity level (grain).
• One valid approach would be to set the grain at the book copy level. Then you would record each individual book copy and add facts such as condition and price.
• Another valid choice would be to produce inventory facts at the book level. Then you could show the total on-hand quantity at each branch for each book.
• In order to associate books with all their co-authors, multiple fact entries would have to be made for each book. We continue this demonstration by setting the grain at the book-author level. Facts at the copy level will not be recorded.
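To make the difference between the book-level and book-author-level grains concrete, here is a minimal sketch. The book code, branch numbers, author numbers, and quantities are invented for illustration and are not taken from the Henry Books data.

```python
# Invented sample data: one book with two co-authors, stocked at two branches.
on_hand = {("B1", 1): 3, ("B1", 2): 1}   # (BookCode, BranchNum) -> number of copies
authors = {"B1": [10, 11]}               # BookCode -> list of AuthorNum values

# Book-level grain: one fact row per book-branch pair.
book_grain = [(book, branch, qty) for (book, branch), qty in on_hand.items()]

# Book-author grain: one fact row per book-branch-author combination.
book_author_grain = [
    (book, branch, author, qty)
    for (book, branch), qty in on_hand.items()
    for author in authors[book]
]

# The on-hand quantity is repeated once per co-author, so a plain sum over
# the book-author fact rows counts each copy once for every author.
total_copies = sum(on_hand.values())
naive_total = sum(row[3] for row in book_author_grain)
```

With this data the book-level grain has 2 rows and the book-author grain has 4. Because the quantity repeats for each co-author, totals over a book-author fact table should be taken per author (or for a single Sequence value) rather than over all rows.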
Time Dimension
• Use the same Time table you used for Premiere Products, the one listing all dates in 2013 (posted on the Web).
• You can import the Excel file directly into Access. Select External Data > Excel.
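If the posted Excel file is not at hand, a Time table of this kind can be approximated with a short script. This is only a sketch: it assumes that Time_key simply numbers the days of 2013 starting at 1 (which agrees with Time_key = 10 for 1/10/2013 later in these slides), and the function name and two-column layout are hypothetical rather than the layout of the posted file.

```python
from datetime import date, timedelta


def build_time_dimension(year=2013):
    """Return one (Time_key, date) row for every calendar day of the year."""
    first_day = date(year, 1, 1)
    rows = []
    day = first_day
    while day.year == year:
        time_key = (day - first_day).days + 1  # January 1 gets key 1
        rows.append((time_key, day))
        day += timedelta(days=1)
    return rows


time_rows = build_time_dimension()
```

The resulting rows could be written to a spreadsheet or CSV file and imported through External Data in the same way as the posted file.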
Branch Dimension
• Start by deleting the relationship between Copy and Branch (go to Database Tools > Relationships, select the line, then press Delete).
Branch Dimension
• Open the Branch table, select all entries, and transfer them to Excel.
• Then delete all data entries in the database table Branch.
• In Excel, create a new column called Branch_key and populate it with the values 1, 2, 3, 4.
Branch Dimension
• Open the Branch table in design view, “unkey” BranchNum, and add a new attribute called Branch_key. Specify “Number-integer” as the data type.
• In Excel, copy all data cells. In Access, open Branch in data view, select the first (blank) row, and paste all the data there.
The other Dimensions
• Continue this way and prepare Book, Publisher, Author, and Branch. For each one:
– In Access, delete the relationships that connect the table to other tables
– Transfer the data to Excel
– In Excel, create a new “key” column
– In Access, delete all rows
– In Access, in design view, remove the primary key tag from AuthorNum, BookCode, etc., and add the “Author_key”, “Book_key”, etc., attribute
– Transfer the data from Excel to Access
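The new “key” column added in Excel for each dimension is a surrogate key: a plain running number that is independent of the original code. A minimal sketch of that one step follows; the helper name and the branch rows are invented for illustration.

```python
def add_surrogate_key(rows, key_name):
    """Return copies of the rows with a new integer key 1, 2, 3, ... added."""
    return [{key_name: number, **row} for number, row in enumerate(rows, start=1)]


# Invented rows standing in for the exported Branch data.
branch_rows = [
    {"BranchNum": 101, "BranchName": "Branch A"},
    {"BranchNum": 102, "BranchName": "Branch B"},
    {"BranchNum": 103, "BranchName": "Branch C"},
    {"BranchNum": 104, "BranchName": "Branch D"},
]

branch_dimension = add_surrogate_key(branch_rows, "Branch_key")
branch_keys = [row["Branch_key"] for row in branch_dimension]
```

The original BranchNum stays in each row as an ordinary attribute, which is what later allows the fact query to match on BranchNum and output Branch_key instead. The same helper would apply to Book, Publisher, and Author with “Book_key”, “Publisher_key”, and “Author_key”.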
Working on the Fact table
• Open the Copy table and examine it. In your Fact table you want to record the number of copies (on-hand quantity) for each book-branch combination.
• This can be accomplished by executing a “count” query, using the pair “BookCode, BranchNum” in the “Group By” clause.
Working on the Fact table
• In Access, under the “Create” tab, select “Query Design”
• Press the “Totals” button
• Add the “Count” of “CopyNum”
• Specify BookCode and BranchNum as “Group By” attributes
• Execute the query and verify that it produces 48 rows
OnHand quantity
• We used the count() function in the previous slide to produce the OnHand quantity for each book-branch combination.
• Rename the “count(CopyNum)” field to “OnHand” by using the colon notation.
Figures: query in Design view; query results after execution
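For readers who prefer to see the query as text, the count query built in the design grid can be sketched in SQL. The example below uses Python's built-in sqlite3 module as a stand-in for Access, with a few invented Copy rows, so the row counts differ from the 48 rows expected for the real data. The "AS OnHand" alias plays the role of the colon notation in the Access design grid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Copy (BookCode TEXT, BranchNum INTEGER, CopyNum INTEGER)")
# Invented rows: two copies of B1 at branch 1, one at branch 2, one copy of B2.
conn.executemany(
    "INSERT INTO Copy VALUES (?, ?, ?)",
    [("B1", 1, 1), ("B1", 1, 2), ("B1", 2, 1), ("B2", 1, 1)],
)

cursor = conn.execute("""
    SELECT BookCode, BranchNum, COUNT(CopyNum) AS OnHand
    FROM Copy
    GROUP BY BookCode, BranchNum
    ORDER BY BookCode, BranchNum
""")
result_columns = [column[0] for column in cursor.description]
on_hand_rows = cursor.fetchall()
```

Each result row is one book-branch combination with its number of copies, and the third column is reported under the name OnHand rather than the raw count expression.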
Working on the Fact table
• Open FactQuery in design view again
• Add “BookCode” from table “Wrote”, specifying in the “Criteria” row that the BookCode should be the same as Copy.BookCode
• Add the attributes AuthorNum and Sequence from Wrote
• Execute the query. Verify that 51 rows are produced.
Avoiding a Cartesian Product
• In the previous screen, as soon as we added the new table “Wrote”, there was a danger of producing a Cartesian product
• To avoid this, specify which values of BookCode you want to see listed in the results
• If you do nothing, you will get all of them for each line of the previous results (Cartesian product). If you add some value in the Criteria line, you will only get the matching values
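The effect of the Criteria entry can be seen by running the same query with and without it. The sketch below again uses sqlite3 in place of Access and a handful of invented rows; the helper function fact_rows is hypothetical and exists only to run both variants.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Copy  (BookCode TEXT, BranchNum INTEGER, CopyNum INTEGER);
    CREATE TABLE Wrote (BookCode TEXT, AuthorNum INTEGER, Sequence INTEGER);
    -- Invented data: B1 has two copies and two co-authors, B2 has one of each.
    INSERT INTO Copy  VALUES ('B1', 1, 1), ('B1', 1, 2), ('B2', 1, 1);
    INSERT INTO Wrote VALUES ('B1', 10, 1), ('B1', 11, 2), ('B2', 12, 1);
""")


def fact_rows(where_clause=""):
    """Run the count query over Copy and Wrote, optionally with a criterion."""
    sql = f"""
        SELECT Copy.BookCode, Copy.BranchNum, Wrote.AuthorNum, Wrote.Sequence,
               COUNT(Copy.CopyNum) AS OnHand
        FROM Copy, Wrote
        {where_clause}
        GROUP BY Copy.BookCode, Copy.BranchNum, Wrote.AuthorNum, Wrote.Sequence
        ORDER BY Copy.BookCode, Copy.BranchNum, Wrote.AuthorNum
    """
    return conn.execute(sql).fetchall()


# No criterion: every book-branch line is paired with every row of Wrote.
cartesian = fact_rows()
# With the criterion: each line is paired only with the authors of that book.
matched = fact_rows("WHERE Wrote.BookCode = Copy.BookCode")
```

Without the criterion, the 2 book-branch lines are each combined with all 3 Wrote rows, giving 6 rows. With it, only the 3 rows where the author actually wrote the book remain.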
Physical Inventory Date
• Recall that we want our Fact table to include the date when we recorded our inventory (PhysInvDate).
• Think of your data warehouse as a repository of permanently stored historical records. You want to know how many books you had on the date you executed the query we are working on.
• Instead of saving that date in a “date” format, we will use the Time_key of the date when we took inventory.
• In practice, your fact table query would be executed periodically. For example, you execute it on the day when Time_key = 10 and you store the 51 records it produces. Then you execute it again when Time_key = 17, you store 51 more records, and so forth.
Adding Time_key to the fact table
• Assuming that inventory is taken on January 10, 2013, add the value 10 in the “Criteria” row of the Time_key column in your query design. (For Date = 1/10/2013, Time_key = 10.)
• Execute the query. Verify that 51 rows are again produced.
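The periodic snapshot idea can be sketched as follows, with sqlite3 standing in for Access. The Inventory table is a hypothetical stand-in for the results of the query built so far, the Fact table layout and the take_snapshot helper are assumptions made for this example, and all data values are invented. Because the criterion selects exactly one row of the Time table, each inventory line appears once per snapshot and the row count does not grow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Inventory (BookCode TEXT, BranchNum INTEGER, OnHand INTEGER);
    CREATE TABLE "Time" (Time_key INTEGER, "Date" TEXT);
    CREATE TABLE Fact (Time_key INTEGER, BookCode TEXT, BranchNum INTEGER, OnHand INTEGER);
    -- Invented data standing in for the current query results and the Time table.
    INSERT INTO Inventory VALUES ('B1', 1, 2), ('B2', 1, 1);
    INSERT INTO "Time" VALUES (9, '2013-01-09'), (10, '2013-01-10'), (17, '2013-01-17');
""")


def take_snapshot(time_key):
    """Append the current inventory lines to Fact, stamped with one Time_key."""
    conn.execute(
        """
        INSERT INTO Fact (Time_key, BookCode, BranchNum, OnHand)
        SELECT "Time".Time_key, Inventory.BookCode, Inventory.BranchNum, Inventory.OnHand
        FROM Inventory, "Time"
        WHERE "Time".Time_key = ?
        """,
        (time_key,),
    )


take_snapshot(10)  # inventory taken on January 10, 2013
take_snapshot(17)  # taken again one week later
fact = conn.execute(
    "SELECT * FROM Fact ORDER BY Time_key, BookCode"
).fetchall()
```

Each call adds one row per inventory line, so two snapshots of 2 lines give 4 fact rows, in the same way that two runs of the real query would store 51 + 51 records.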
Check the current status of your results
• Execute your query to verify that it produces the desired results
• These are still far away from the final Fact table contents, but we are on the right track!
Query in Design view
Query results after execution (51 rows)
Replacing BookCode by Book_key
• It’s now time to show Book_key for each book instead of BookCode:
1. Add BookCode from Book
2. Add Copy.BookCode in the Criteria line, to avoid a Cartesian product
3. Uncheck all columns where BookCode appears, so that only Book_key is listed in the query results
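In SQL terms, the three steps amount to matching Copy to the Book dimension on BookCode and then selecting Book_key instead of BookCode. A minimal sketch with sqlite3 in place of Access and invented rows (the Wrote and Time tables are left out to keep it short):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Book (Book_key INTEGER, BookCode TEXT);
    CREATE TABLE Copy (BookCode TEXT, BranchNum INTEGER, CopyNum INTEGER);
    -- Invented data: two books with surrogate keys 1 and 2.
    INSERT INTO Book VALUES (1, 'B1'), (2, 'B2');
    INSERT INTO Copy VALUES ('B1', 1, 1), ('B1', 1, 2), ('B2', 1, 1);
""")

keyed_rows = conn.execute("""
    SELECT Book.Book_key, Copy.BranchNum, COUNT(Copy.CopyNum) AS OnHand
    FROM Copy, Book
    WHERE Book.BookCode = Copy.BookCode   -- the Criteria entry from step 2
    GROUP BY Book.Book_key, Copy.BranchNum
    ORDER BY Book.Book_key, Copy.BranchNum
""").fetchall()
```

BookCode is still used for matching but no longer appears in the output, which corresponds to unchecking its columns in the design grid. The same pattern should carry over to Branch_key, Author_key, and Publisher_key.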
Check the current status of your results
• Execute your query to verify its results
• There is some excellent progress. You are already getting Book_key, Time_key, OnHand quantity, and author Sequence
• Next steps:
– First introduce Branch_key and Author_key, then hide BranchNum and AuthorNum
– Bring in PublisherCode (from Book), then cross-reference it with PublisherCode from Publisher, and finally replace it by Publisher_key
Query results after execution (51 rows)