11.2.1 Data mining techniques

Cards (35)

  • What is the primary goal of data mining?
    Discover patterns in data
  • Data mining can be used to forecast future trends.
  • Data mining is the process of extracting patterns and insights from large datasets
  • Match the data mining technique with its purpose:
    Association Rule Learning ↔️ Discovers relationships between variables
    Classification ↔️ Assigns data to predefined categories
    Clustering ↔️ Groups similar data points together
    Anomaly Detection ↔️ Identifies unusual data points
  • Steps in Association Rule Learning
    1️⃣ Discover frequent itemsets
    2️⃣ Generate association rules
    3️⃣ Evaluate rule metrics
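
The three steps above can be sketched in plain Python. This is a brute-force illustration, not an optimized algorithm such as Apriori or FP-Growth; the toy transactions, the thresholds, and the function names are all assumptions made for the example.

```python
from itertools import combinations

# Hypothetical toy transactions (not from the source).
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "jam"},
    {"bread", "butter", "jam"},
    {"milk"},
]

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def frequent_itemsets(transactions, min_support):
    """Step 1: enumerate itemsets whose support meets min_support."""
    items = sorted(set().union(*transactions))
    result = {}
    for size in range(1, len(items) + 1):
        found = False
        for combo in combinations(items, size):
            s = support(combo, transactions)
            if s >= min_support:
                result[frozenset(combo)] = s
                found = True
        if not found:  # no frequent itemset of this size, so none larger
            break
    return result

def association_rules(freq, min_confidence):
    """Steps 2 and 3: generate rules from frequent itemsets and score them."""
    rules = []
    for itemset, supp in freq.items():
        for k in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), k):
                antecedent = frozenset(antecedent)
                consequent = itemset - antecedent
                confidence = supp / freq[antecedent]
                lift = confidence / freq[consequent]
                if confidence >= min_confidence:
                    rules.append(
                        (set(antecedent), set(consequent), supp, confidence, lift)
                    )
    return rules

freq = frequent_itemsets(transactions, min_support=0.4)
rules = association_rules(freq, min_confidence=0.7)
for antecedent, consequent, supp, confidence, lift in rules:
    print(antecedent, "->", consequent, supp, confidence, round(lift, 3))
```

On this toy data the only multi-item frequent itemset is bread and butter, so the sketch yields one rule in each direction between those two items.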
  • Association Rule Learning identifies relationships between variables in a dataset
  • What type of data is required for classification?
    Labeled data
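
To illustrate why classification needs labeled data, here is a minimal 1-nearest-neighbor sketch. The features (link count, exclamation-mark count), the training examples, and the `classify` helper are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical labeled training data: (features, label).
# Features are made up: (number of links, number of exclamation marks).
training = [
    ((8, 5), "spam"),
    ((7, 6), "spam"),
    ((1, 0), "not spam"),
    ((0, 1), "not spam"),
]

def classify(point, training):
    """Return the label of the closest training example (1-nearest neighbor)."""
    _, nearest_label = min(
        training, key=lambda example: math.dist(point, example[0])
    )
    return nearest_label

print(classify((6, 4), training))
print(classify((1, 1), training))
```

Without the labels attached to each training example, the function would have no category to return, which is the sense in which classification depends on labeled data.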
  • Clustering is sensitive to feature selection.
  • What is a common weakness of anomaly detection?
    High false positive rates
  • In Association Rule Mining, support measures the fraction of transactions in which the rule's items occur together
  • Confidence in Association Rule Mining measures how often the consequent occurs in transactions that contain the antecedent.
  • What does lift measure in Association Rule Mining?
    How much more often the antecedent and consequent occur together than expected if they were independent
  • Classification categorizes data into predefined classes
  • Classifying emails as spam or not spam is an example of regression.
    False
  • Clustering groups similar data points together based on shared characteristics
  • DBSCAN is capable of finding clusters of arbitrary shapes.
  • K-Means partitions data into clusters based on minimizing the distance to centroids
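
A minimal sketch of the K-Means idea, written as Lloyd's algorithm in plain Python. The sample points and the hand-picked initial centroids are assumptions; practical implementations usually choose initial centroids randomly or with a seeding scheme.

```python
import math

def assign(points, centroids):
    """Assignment step: put each point in the cluster of its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(
            range(len(centroids)), key=lambda i: math.dist(p, centroids[i])
        )
        clusters[nearest].append(p)
    return clusters

def kmeans(points, initial_centroids, max_iterations=100):
    """Alternate assignment and centroid update until centroids stop moving."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        # Update step: each centroid moves to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        updated = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)

# Hypothetical 2-D points forming two visually separate groups.
points = [(1, 1), (1, 2), (2, 1), (8, 8), (9, 8), (8, 9)]
centroids, clusters = kmeans(points, initial_centroids=[points[0], points[3]])
print(centroids)
print(clusters)
```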
  • Hierarchical clustering requires specifying the number of clusters beforehand.
    False
  • What does DBSCAN use to identify clusters?
    Density
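
The density idea behind DBSCAN can be sketched as follows. This is a simplified plain-Python version with quadratic neighbor search; the parameter names `eps` and `min_pts`, the sample points, and the chosen values are assumptions for illustration.

```python
import math

NOISE = -1

def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]

def dbscan(points, eps, min_pts):
    """Label each point with a cluster id, or NOISE if it is in no dense region."""
    labels = [None] * len(points)  # None means not yet visited
    cluster_id = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        cluster_id += 1
        labels[i] = cluster_id
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster_id  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster_id
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) >= min_pts:  # j is a core point: keep expanding
                queue.extend(j_neighbors)
    return labels

# Hypothetical data: two dense groups and one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1),
          (10, 10), (10, 11), (11, 10),
          (5, 5)]
labels = dbscan(points, eps=1.5, min_pts=3)
print(labels)
```

Clusters grow by chaining dense neighborhoods together rather than by distance to a center, which is why this approach is not limited to round cluster shapes.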
  • Data mining discovers patterns, trends, and useful information from datasets
  • Trends in data mining refer to recurring sequences or relationships.
    False
  • What is the purpose of decision-making in data mining?
    Selecting the best action
  • Association Rule Learning identifies relationships between variables
  • Clustering is sensitive to feature selection.
  • Which data mining technique predicts continuous numerical values?
    Regression
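
As a small illustration of predicting a continuous value, the sketch below fits a straight line by ordinary least squares. The data points and the `fit_line` helper are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data lying exactly on y = 2x + 1.
xs = [1, 2, 3, 4]
ys = [3, 5, 7, 9]
slope, intercept = fit_line(xs, ys)
prediction = slope * 5 + intercept  # a numeric output, not a category
print(slope, intercept, prediction)
```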
  • Match the metric in Association Rule Mining with its description:
    Support ↔️ Fraction of transactions that contain all items in the rule
    Confidence ↔️ Probability that the consequent occurs given the antecedent
    Lift ↔️ Ratio of observed co-occurrence to that expected under independence
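
For reference, the three metrics are commonly written as below for a rule X ⇒ Y over a set of transactions T. The notation is an assumption introduced here, not taken from the cards.

```latex
\begin{align*}
\operatorname{supp}(X \Rightarrow Y) &= \frac{\lvert \{\, t \in T : X \cup Y \subseteq t \,\} \rvert}{\lvert T \rvert} \\
\operatorname{conf}(X \Rightarrow Y) &= \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)} \\
\operatorname{lift}(X \Rightarrow Y) &= \frac{\operatorname{conf}(X \Rightarrow Y)}{\operatorname{supp}(Y)}
  = \frac{\operatorname{supp}(X \cup Y)}{\operatorname{supp}(X)\,\operatorname{supp}(Y)}
\end{align*}
```

A lift above 1 suggests X and Y occur together more often than independence would predict, a lift near 1 suggests no association, and a lift below 1 suggests they tend not to occur together.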
  • In market basket analysis, a rule might be "if a customer buys bread, they are likely to buy butter."
  • What is the primary difference between classification and regression?
    Output type
  • Regression outputs discrete categories.
    False
  • Give an example of a classification problem.
    Email spam detection
  • K-Means clustering partitions data into K clusters based on minimizing distance to centroids
  • Hierarchical clustering is computationally intensive for large datasets.
  • What is a key weakness of DBSCAN clustering?
    Sensitive to its density parameters (neighborhood radius and minimum number of points)
  • Anomaly detection identifies data points that deviate significantly from the norm
  • Machine learning models can be used for anomaly detection.
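
One simple, non-learned way to flag deviations from the norm is a z-score test, sketched below. The readings, the threshold of 2.0, and the `zscore_anomalies` helper are assumptions for illustration; the threshold in particular trades missed anomalies against false positives.

```python
import statistics

def zscore_anomalies(values, threshold=2.0):
    """Return the values lying more than threshold standard deviations from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # all values identical, nothing deviates
    return [v for v in values if abs(v - mean) / stdev > threshold]

# Hypothetical sensor readings with one obvious outlier.
readings = [10, 11, 9, 10, 12, 10, 11, 50]
anomalies = zscore_anomalies(readings)
print(anomalies)
```

Lowering the threshold flags more points, which is one way the high false positive rates mentioned earlier can arise.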