Topic 4_1

Cards (20)

  • Data cube
    A lattice of cuboids, where the bottom-most cuboid is the base cuboid and the top-most cuboid (apex) contains only one cell
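    The lattice can be enumerated by taking every subset of the dimensions: each subset is one cuboid. The sketch below assumes three illustrative dimension names (time, item, location) and no concept hierarchies; the function name is hypothetical.

```python
from itertools import combinations

def cuboid_lattice(dimensions):
    """Group the cuboids (dimension subsets) by how many dimensions they keep."""
    n = len(dimensions)
    return {k: list(combinations(dimensions, k)) for k in range(n + 1)}

# Illustrative dimensions, not taken from the cards.
lattice = cuboid_lattice(["time", "item", "location"])

# Print from the base cuboid (all dimensions) up to the apex (no dimensions).
for k in sorted(lattice, reverse=True):
    print(f"{k}-D cuboids: {lattice[k]}")
```

    The 3-D entry is the base cuboid and the 0-D entry, the empty tuple, is the apex cuboid that holds a single cell.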
  • Materialization of data cube
    1. Materialize every cuboid (full materialization)
    2. Materialize none (no materialization)
    3. Materialize some (partial materialization)
  • Formula for calculating the number of cuboids in an n-dimensional cube with L levels
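    The card gives no answer. The usual textbook formula is T = (L1 + 1) x (L2 + 1) x ... x (Ln + 1), where Li is the number of levels of dimension i and the extra 1 stands for the virtual top level "all". A minimal sketch, with a hypothetical function name and made-up level counts:

```python
from math import prod

def total_cuboids(levels_per_dimension):
    """Number of cuboids: product of (levels + 1) over all dimensions.

    The +1 accounts for the virtual top level 'all' of each dimension.
    """
    return prod(levels + 1 for levels in levels_per_dimension)

# Three dimensions with one level each (no hierarchies): 2**3 cuboids.
print(total_cuboids([1, 1, 1]))   # 8

# Ten dimensions with four levels each: 5**10 cuboids.
print(total_cuboids([4] * 10))    # 9765625
```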
  • Data Generalization
    A process that abstracts a large set of task-relevant data in a database from a relatively low conceptual level to higher conceptual levels
  • Descriptive data mining

    • Describes data in a concise manner, highlighting interesting general properties. Supports interest.
  • Predictive data mining

    • Constructs a model and attempts to predict behavior of new data (classification, regression, etc.)
  • MOLAP (Multidimensional Online Analytical Processing)

    On-line analytical processing is fastest (minimum response time) when the aggregates for all cuboids are precomputed
  • Data Cube Materialization / Precomputation
    Precomputing some of the cuboids gives fast response times and avoids redundant computation during on-line analytical processing
  • Types of Data Cube Materialization
    • No materialization: Don't precompute any of the non-base cuboids
    • Full materialization: Precompute all the cuboids
    • Partial Materialization: Selectively compute a proper subset of the cuboids
  • Base cell
    A cell which belongs to a base cuboid
  • Aggregate cell
    A cell which belongs to a non-base cuboid
  • Ancestor-descendent relationship between cells

    An i-D cell a = (a_1, ..., a_n) is an ancestor of a j-D cell b = (b_1, ..., b_n) if: 1) i < j and 2) for 1 ≤ m ≤ n, a_m = b_m whenever a_m ≠ "*". If, in addition, j = i + 1, a is called a parent of b and b is a child of a
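    The two conditions translate directly into a check on tuples. This is a sketch that assumes cells are stored as equal-length tuples with "*" marking an aggregated dimension; the helper names and sample values are hypothetical.

```python
STAR = "*"

def num_dims(cell):
    """Number of non-* values, i.e. the i in 'i-D cell'."""
    return sum(1 for value in cell if value != STAR)

def is_ancestor(a, b):
    """True if a is an ancestor of b under the definition on the card."""
    if len(a) != len(b):
        raise ValueError("cells must have the same number of dimensions")
    fewer_dims = num_dims(a) < num_dims(b)
    agrees = all(am == bm for am, bm in zip(a, b) if am != STAR)
    return fewer_dims and agrees

def is_parent(a, b):
    """True if a is an ancestor of b exactly one dimension up."""
    return is_ancestor(a, b) and num_dims(b) == num_dims(a) + 1

a = ("Jan", "*", "*")            # 1-D cell
b = ("Jan", "*", "Toronto")      # 2-D cell
c = ("Jan", "TV", "Toronto")     # 3-D (base) cell

print(is_ancestor(a, c))  # True
print(is_parent(a, b))    # True
print(is_parent(a, c))    # False: c is two dimensions below a
```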
  • Full cube
    All cells and cuboids are materialized: every possible combination of dimensions and values.
  • Iceberg cube
    Partial materialization. Materializes only the cells of a cuboid whose measure value meets the minimum threshold (iceberg condition: count(*) >= min support)
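    The iceberg condition can be illustrated with a naive sketch that counts every cell of the full cube and then keeps only those with count >= min support. This is for intuition only: it assumes a tiny in-memory table with made-up rows, and practical iceberg algorithms avoid building the full cube first.

```python
from collections import Counter
from itertools import combinations

def iceberg_cube(rows, min_support):
    """Return {cell: count} for cells whose count meets min_support."""
    n = len(rows[0])
    counts = Counter()
    for row in rows:
        # Each row contributes to one cell in every cuboid (dimension subset).
        for k in range(n + 1):
            for kept in combinations(range(n), k):
                cell = tuple(row[i] if i in kept else "*" for i in range(n))
                counts[cell] += 1
    # Apply the iceberg condition: count(*) >= min_support.
    return {cell: c for cell, c in counts.items() if c >= min_support}

# Made-up (month, item) rows for illustration.
rows = [("Jan", "TV"), ("Jan", "TV"), ("Jan", "PC"), ("Feb", "TV")]
result = iceberg_cube(rows, min_support=2)
for cell, count in sorted(result.items()):
    print(cell, count)
```

    With these rows, the cells ("Feb", "*"), ("*", "PC"), ("Jan", "PC") and ("Feb", "TV") each have count 1 and are left out.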
  • Closed cube
    An ancestor cell is not materialized if its measure is equal to that of one of its descendant cells
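    A small sketch of the closedness test, assuming the cube is given as a dict from cell tuples to a count measure, with "*" for aggregated dimensions. The function names and the sample cube are hypothetical.

```python
STAR = "*"

def is_descendant(d, c):
    """True if d specializes c: same values wherever c is not *, and d != c."""
    return d != c and all(cv == STAR or cv == dv for cv, dv in zip(c, d))

def closed_cells(cube):
    """Keep only cells that have no descendant with the same measure."""
    return {
        c: m
        for c, m in cube.items()
        if not any(is_descendant(d, c) and cube[d] == m for d in cube)
    }

# Full cube (count measure) for the made-up rows
# ("Jan","TV"), ("Jan","TV"), ("Jan","PC").
cube = {
    ("*", "*"): 3,
    ("Jan", "*"): 3,
    ("*", "TV"): 2,
    ("*", "PC"): 1,
    ("Jan", "TV"): 2,
    ("Jan", "PC"): 1,
}
closed = closed_cells(cube)
for cell, count in sorted(closed.items()):
    print(cell, count)
```

    Here ("*", "*") is dropped because its descendant ("Jan", "*") has the same count, so its value can be recovered from the closed cells.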
  • Shell cube
    Only cuboids with a limited number of dimensions are created
  • Formula for calculating the number of cells in a full cube
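    The card gives no answer. One common counting rule, assuming no concept hierarchies, is that a full cube over n dimensions has at most (|D1| + 1) x (|D2| + 1) x ... x (|Dn| + 1) cells, where |Di| is the number of distinct values of dimension i and the extra 1 stands for "*". This counts every possible cell, including ones that may be empty. A sketch with hypothetical names and cardinalities:

```python
from math import prod

def total_cells(cardinalities):
    """Upper bound on cells in the full cube: each dimension takes a value or '*'."""
    return prod(c + 1 for c in cardinalities)

def base_cells(cardinalities):
    """Upper bound on cells in the base cuboid: every dimension takes a value."""
    return prod(cardinalities)

# Made-up example: 2 months x 3 items.
cardinalities = [2, 3]
print(total_cells(cardinalities))  # 12
print(base_cells(cardinalities))   # 6
```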
  • Iceberg cube
    A subset of the full cube that keeps only the cells whose aggregate measure satisfies the iceberg condition (for example, count(*) >= min support); cells below the threshold are not stored
  • Closed cube
    Contains only closed cells. A cell is closed if no descendant cell has the same measure value, so the cube is compressed without losing information
  • Shell cube
    Precomputes only the cuboids that involve a small number of dimensions (for example, 3 to 5); queries on other dimension combinations are computed on the fly