BA 221 Mod 6

Cards (25)

  • Dimension Tables

    A dimensional model divides the world into measurements and context. Context intuitively divided into clumps called dimensions. Dimensions describe the "who, what, where, when, why, and how" of the facts. They can also identified as the "by" words in a business question that asks for a report.
  • Dimension Tables
    • Contain the parameters by which the fact table measures are analyzed
    • Provide the context to the fact table measures they describe
    • Contain descriptors of the business (nouns)
    • Utilize business terminology
    • Have many large columns
    • Contain textual and discrete data
    • Are usually smaller than fact tables
    • Have a single column surrogate primary key (called the warehouse dimension key)
    • Are joined to a fact table through a foreign key reference to their primary key
    • Can contain one or more hierarchies
    • The hierarchies are de-normalized into the dimension tables
  • Surrogate key
    A system assigned primary key. When the first row is added to a dimension, the system automatically assigns a key of 1 to the row. As each additional row is added, the system automatically increments the key by 1. It's meaningless, but essential as a foreign key in fact tables.
  • Warehouse Dimension Keys
    • Single column surrogate keys
    • Provide key control within the data warehouse
    • Substantially improve performance
    • Enable one method of tracking attribute history
    • Facilitate exception references from a fact table
  • Surrogate keys are implemented in every dimension, even date and time dimensions
  • Exception Condition Dimension Table Rows

    • 0 - the fact table row had an invalid legacy id for this dimension (Invalid)
    • -1 - The fact table row should reference a value for this dimension, but the value is unknown (Missing Mandatory)
    • -2 - The fact table row is not applicable for this dimension (Missing Optional)
  • Dimension Table Classifications and Examples
    • Date based (Date Dim, Month Dim)
    • Time based (Time Dim)
    • Business Entities (Store Dim, Customer Dim)
    • Analytical Profiles (Customer Demographics Dim)
  • Date Based Dimensions

    • Contain attributes like Sale Date, Sales Calendar Month, Sales Calendar Quarter, Sales Calendar Year, Sales Fiscal Month, Sales Fiscal Quarter, Sales Fiscal Year, Start Date, End Date, Arrival Date, Departure Date, Order Date, Ship Date
  • The Date Dimension Family
    Implement the family and name each dimension for its granularity - Date Dim, Week Dim, Month Dim, Quarter Dim, Year Dim
  • The Date Dimension Family uses at least one character column for date and puts in all attributes that simplify analysis. It also enables all date functions (add, subtract, etc).
  • Enterprise-Wide Events Dimension

    • Contains attributes like Special Event Description (World Cup final game, Significant stock market fall/rise, Interest rate raise/drop, Gasoline price increase/decrease)
  • Reporting Periods Dimension
    • Contains attributes like MTD IND (Month to Date Indicator), YTD IND (Year to Date Indicator), Current 3MTH IND, Current 12MTH IND
  • Simpler, less error-prone filters using Reporting Periods Dimension
    SELECT sum(sales) FROM fact <join statements> WHERE current_3mth_ind = 'YES'
  • Role-Playing Dimensions

    Entity taking on different roles or uses for the same entity. Examples: Order date, Ship date, Bill to customer, Ship to customer, Flight origination airport, Flight destination airport.
  • The wrong way to model role-playing dimensions is to have the same column headings for different roles, making them indistinguishable.
  • Dimensional Normal Form (DNF)

    • A creative and practical approach to design Dimension Table Families. Fact tables are highly normalized for maintainability and flexibility. Dimensions have their hierarchies de-normalized into them for usability and performance. The schema is limited to two levels - a single first level or central highly normalized table called a fact table, and multiple second level tables called dimension tables linked to the first level table in primarily one to many relationships.
  • Snowflaking, where normalized tables are used in the dimensional model to break dimension hierarchies into normalized tables connected by foreign key - primary key relationships, is bad because it leads to many joins which can impact performance.
  • Hotel Business Overview
    Maitutulog Mo Kaya Hotel (MMK Hotel) is composed of 500 hotels with three property types (luxury, economy, budget) and different room types. The goal is to maximize utilization and profit.
  • Hotel Business Strategic Planning
    1. Determine the most profitable hotel type and room type mixes
    2. Can I just build more of the most profitable hotel types or can I just put in more of the most profitable room types
    3. Is there a saturation point that would make utilization and profit go down
    4. What is the competition doing that will cause a change in my occupancy rates
    5. How do rates, discounts, promotions, campaigns, and advertising affect my profit
  • Daily Room Type Profitability Analysis
    1. What room types have the highest/lowest profitability across the chain, by property type
    2. Which hotels have room type profitability different from the norm
    3. How does weekend profitability compare with weekdays
    4. How does weekday profitability differ by day
  • The challenge is to build the Logical Data Models for Daily Room Billing and Daily Utilization and Profitability based on the Hotel Property Management System Source Data 3NF
  • Dimensional modeling

    A data warehousing approach that organizes data into measurements (facts) and context (dimensions) for easier analysis and understanding of relationships between data elements
  • Measurements (facts)

    Numerical values that are typically the focus of analysis, such as sales revenue, product quantity, or employee hours worked
  • Context (dimensions)

    Additional information about measurements, describing the "who, what, where, when, why, and how" of the facts, such as date, product, region, and salesperson
  • dimensions
    Can be identified as the "by" words in a business question that asks for a report, such as "by region" or "by product category"