A lattice of cuboids, where the bottom-most cuboid is the base cuboid and the top-most cuboid (apex) contains only one cell
Materialization of data cube
1. Materialize every cuboid (full materialization)
2. Materialize none (no materialization)
3. Materialize some (partial materialization)
Formula for calculating the number of cuboids in an n-dimensional cube with L levels
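The card above names the formula without stating it. The standard textbook form is sketched here, assuming that L_i denotes the number of levels in the concept hierarchy of dimension i, not counting the virtual top level "all" (the symbol T is introduced only for illustration):

```latex
\[
T \;=\; \prod_{i=1}^{n} \left( L_i + 1 \right)
\]
% The +1 accounts for the virtual top level "all" of each dimension.
% With no hierarchies (L_i = 1 for every i) this reduces to T = 2^n.
% Example: n = 10 dimensions with 4 levels each gives 5^{10} = 9{,}765{,}625 cuboids.
```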
Data Generalization
A process of abstracting conceptual-level knowledge from a large set of task-relevant data in a database, moving from a relatively low conceptual level to higher conceptual levels
Descriptive data mining
Describes data in a concise manner, highlighting interesting general properties. Supports interest.
Predictive data mining
Constructs a model and attempts to predict behavior of new data (classification, regression, etc.)
On-line analytical processing responds fastest when the aggregates for all cuboids are precomputed
Data Cube Materialization / Precomputation
Precomputation of some of the cuboids in advance leads to fast response time and avoids redundant computations during on-line analytical processing
Types of Data Cube Materialization
No materialization: Don't precompute any of the non-base cuboids
Full materialization: Precompute all the cuboids
Partial Materialization: Selectively compute a proper subset of the cuboids
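The three options can be illustrated with a minimal sketch. It assumes a toy fact table with `count` as the measure; the names `DIMS`, `ROWS`, `cuboid`, and `materialize`, and the selection rules passed as `choose`, are hypothetical and only show the idea of picking which cuboids to precompute.

```python
from itertools import combinations
from collections import Counter

# Hypothetical toy fact table: (city, item, year) tuples; the measure is count.
DIMS = ("city", "item", "year")
ROWS = [
    ("Vancouver", "TV", 2023),
    ("Vancouver", "TV", 2024),
    ("Toronto", "TV", 2023),
    ("Toronto", "Phone", 2023),
]

def cuboid(rows, keep):
    """Count rows per cell, keeping the dimension positions in `keep`
    and replacing every other position with '*'."""
    counts = Counter()
    for row in rows:
        cell = tuple(v if i in keep else "*" for i, v in enumerate(row))
        counts[cell] += 1
    return dict(counts)

def materialize(rows, n_dims, choose=lambda dims: True):
    """Precompute every cuboid whose dimension subset passes `choose`."""
    cube = {}
    for k in range(n_dims + 1):
        for dims in combinations(range(n_dims), k):
            if choose(dims):
                cube[dims] = cuboid(rows, set(dims))
    return cube

n = len(DIMS)
full = materialize(ROWS, n)                              # all 2^3 = 8 cuboids
partial = materialize(ROWS, n, lambda d: len(d) >= 2)    # an arbitrary proper subset
base_only = materialize(ROWS, n, lambda d: len(d) == n)  # no non-base cuboid

print(len(full), len(partial), len(base_only))
```

Real systems choose the subset for partial materialization from the expected query workload and storage budget; the length-based rule here is only a placeholder.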
Base cell
A cell which belongs to a base cuboid
Aggregate cell
A cell which belongs to a non-base cuboid
Ancestor-descendant relationship between cells
An i-D cell a = (a1, ..., an) is an ancestor of a j-D cell b = (b1, ..., bn) if: 1) i<j and 2) for 1≤m≤n, am=bm whenever am≠"*". In the special case j=i+1, a is called a parent of b and b is a child of a
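The two conditions translate almost directly into code. The following is a minimal sketch, assuming cells are equal-length tuples with "*" marking an aggregated dimension; the names `STAR`, `n_dims`, `is_ancestor`, and `is_parent` are hypothetical.

```python
STAR = "*"

def n_dims(cell):
    """Number of non-aggregated positions, i.e. the i in 'i-D cell'."""
    return sum(v != STAR for v in cell)

def is_ancestor(a, b):
    """True if a is an ancestor of b: a has fewer fixed dimensions,
    and every fixed value of a matches the value of b at that position."""
    if len(a) != len(b):
        return False
    fewer_dims = n_dims(a) < n_dims(b)
    agrees = all(x == STAR or x == y for x, y in zip(a, b))
    return fewer_dims and agrees

def is_parent(a, b):
    """Special case of ancestor where b fixes exactly one more dimension."""
    return is_ancestor(a, b) and n_dims(b) == n_dims(a) + 1

a = ("Jan", "*", "*")
b = ("Jan", "Toronto", "*")
c = ("Jan", "Toronto", "Business")
print(is_ancestor(a, b), is_parent(a, b))  # a is a parent of b
print(is_ancestor(a, c), is_parent(a, c))  # a is an ancestor of c, but not a parent
```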
Full cube
All cells and cuboids are materialized: all possible combinations of dimensions and values.
Iceberg cube
Partial materialization. Materializes only the cells in a cuboid whose measure value meets the minimum threshold (iceberg condition: count(*) >= min support)
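The iceberg condition can be illustrated with a naive sketch that counts every cell and then filters. It assumes `count` as the measure and tuples as rows; `iceberg_cube` and the sample data are hypothetical. Dedicated algorithms such as BUC avoid computing the cells that will be discarded, which this sketch does not attempt.

```python
from itertools import combinations
from collections import Counter

def iceberg_cube(rows, min_sup):
    """Return the cells whose count is at least min_sup.
    Naive approach: count all cells of the full cube, then filter."""
    n = len(rows[0])
    counts = Counter()
    for row in rows:
        # Each row contributes to one cell in every cuboid (2^n cells in total).
        for k in range(n + 1):
            for keep in combinations(range(n), k):
                cell = tuple(row[i] if i in keep else "*" for i in range(n))
                counts[cell] += 1
    return {cell: c for cell, c in counts.items() if c >= min_sup}

rows = [("a1", "b1"), ("a1", "b2"), ("a1", "b1"), ("a2", "b1")]
result = iceberg_cube(rows, min_sup=2)
print(result)
```

With `min_sup=2`, cells such as `("a2", "*")` and `("*", "b2")`, each supported by a single row, are left out.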
Closed cube
An ancestor cell is not created if its measure is equal to that of one of its descendant cells; only such "closed" cells are stored
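A small sketch of the closedness test, assuming the non-empty cells of a cube are already available as a dictionary from cell to count. The names `CUBE`, `is_descendant`, and `closed_cells` and the sample counts are hypothetical.

```python
STAR = "*"

# Hypothetical non-empty cells of a 2-D cube with their counts.
CUBE = {
    ("*", "*"): 4,
    ("a1", "*"): 3, ("a2", "*"): 1,
    ("*", "b1"): 3, ("*", "b2"): 1,
    ("a1", "b1"): 2, ("a1", "b2"): 1, ("a2", "b1"): 1,
}

def is_descendant(d, c):
    """d specializes c: it matches every fixed value of c
    and differs from c, so it fixes at least one more dimension."""
    return d != c and all(x == STAR or x == y for x, y in zip(c, d))

def closed_cells(cube):
    """Keep a cell only if no descendant cell has the same measure value."""
    return {
        c: m for c, m in cube.items()
        if not any(md == m and is_descendant(d, c) for d, md in cube.items())
    }

closed = closed_cells(CUBE)
print(sorted(set(CUBE) - set(closed)))  # cells dropped as non-closed
```

Here `("a2", "*")` is dropped because its descendant `("a2", "b1")` has the same count, so its value can be recovered from that descendant.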
Shell cube
Only cuboids with a limited number of dimensions are created
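To show how much a shell cube can save, the sketch below lists the cuboids that would be precomputed under an assumed limit on the number of dimensions. The function `shell_cuboids`, the parameter `max_dims`, and the choice to include all cuboids below the limit (down to the apex) are assumptions made for illustration.

```python
from itertools import combinations

def shell_cuboids(dims, max_dims):
    """List the dimension subsets of size 0..max_dims (the cuboids to precompute)."""
    return [c for k in range(max_dims + 1) for c in combinations(dims, k)]

dims = ["A", "B", "C", "D", "E"]
shell = shell_cuboids(dims, max_dims=2)
print(len(shell), "of", 2 ** len(dims), "cuboids")  # 1 + 5 + 10 = 16 of 32
```

Queries on dimension combinations outside the shell would have to be answered from the base data or from other precomputed structures.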
Formula for calculating the number of cells in a full cube
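The card above names the formula without stating it. A commonly used form is sketched here, assuming an n-dimensional cube without concept hierarchies, where |D_i| is the number of distinct values of dimension i and every combination of values is non-empty (so the result is an upper bound in practice):

```latex
\[
\text{total cells} \;=\; \prod_{i=1}^{n} \left( |D_i| + 1 \right),
\qquad
\text{base cells} \;=\; \prod_{i=1}^{n} |D_i|
\]
% The +1 is the aggregated value "*" that each dimension may take.
% Aggregate cells = total cells - base cells.
% Example: two dimensions with 2 values each give at most 3 * 3 = 9 cells, 4 of them base cells.
```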
Iceberg cube
A subset of a full cube that stores only the cells whose aggregate value satisfies a minimum threshold (the iceberg condition), excluding low-value or insignificant cells
Closed cube
Contains only closed cells, i.e. cells that have no descendant cell with the same measure value; this compresses the full cube without losing information
Shell cube
Contains only the cuboids that involve a small number of dimensions (for example, 3 to 5); queries on other dimension combinations are computed on the fly