EDA-Lesson 2-3-4

    Cards (24)

    • Statistics deals with the collection, presentation, analysis, and use of data to make decisions, solve problems, and design products and processes
    • Descriptive Statistics (DS) involves describing the characteristics and properties of a group of persons, places, or things based on easily verifiable facts
    • Inferential Statistics (IS) draws inferences about a population based on data gathered from samples using techniques of Descriptive Statistics
    • Sample is a subset taken from the population; a numerical measure computed from a sample is called a statistic
    • Variables are the characteristics or attributes being studied in statistics
    • Qualitative Variables are non-numeric characteristics described by categories or labels
    • Population refers to the totality of all observations from which the data set is acquired; a numerical measure describing a population is called a parameter
    • Discrete Data are countable quantities that take separate, distinct values, like the number of individuals
    • Independent Variable is the variable that is deliberately changed or manipulated, for example by altering its magnitude
    • Dependent Variable is observed upon applying changes to the independent variable
    • Controlled Variable is kept constant so that external factors do not influence the dependent variable
    • Quantitative Variables are countable or measurable quantities
    • Continuous Data are measurable quantities that can take infinitely many values within an interval, like height and weight
    • Scales of Measurement:
      • Nominal: Numbers or labels serve only as names for categories, with no order implied
      • Ordinal: Numbers designate the rank order of data
      • Interval: Equal differences between consecutive values, so addition and subtraction are meaningful; the zero point is arbitrary
      • Ratio: All basic mathematical operations can be performed, non-arbitrary zero point
    • Sampling is the process of taking samples from the population
    • Probability Sampling reduces selection bias by listing all members of the population and selecting them randomly, so every member has a known, non-zero chance of being chosen
      • Simple Random Sampling
      • Systematic Sampling
      • Stratified Sampling
      • Cluster Sampling
    • Non-Probability Sampling does not select at random, so some individuals have an unknown or zero chance of being selected
      • Convenience Sampling
      • Quota Sampling
      • Purposive Sampling
    • Types of Data Presentation:
      • Textual Form
      • Tabular Form
      • Graphical Form
    • Univariate Analysis includes:
      • Measure of Central Tendency
      • Measure of Position
      • Measure of Variation
      • Measure of Shape
    • Mean is the most widely used measure of central tendency for interval and ratio data and can be arithmetic, geometric, harmonic, trimmed, or root mean square
    • Median is the midpoint of values when ordered from smallest to largest, unaffected by extreme values
    • Mode is the most frequently occurring value, used for nominal data and polls
    • Quantiles are points taken at regular intervals from the cumulative distribution function of a random variable, including Quartiles, Deciles, and Percentiles
    • Measures of Variation include Range, Mean Absolute Deviation, Variance, Standard Deviation, and Coefficient of Variation
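
The measures of central tendency, position, and variation named in the cards above can be computed with Python's standard `statistics` module. The following is a minimal sketch, not part of the original deck: the `data` list is a made-up sample, and the variance and standard deviation shown are the sample versions (dividing by n - 1), which is an assumption about which convention the lesson uses.

```python
import statistics

# Hypothetical sample of eight positive measurements (not from the deck).
data = [2, 4, 4, 5, 7, 9, 10, 15]

# Measures of central tendency
mean = statistics.mean(data)                  # arithmetic mean
geometric = statistics.geometric_mean(data)   # requires positive values
harmonic = statistics.harmonic_mean(data)     # requires positive values
median = statistics.median(data)              # midpoint of the ordered values
mode = statistics.mode(data)                  # most frequently occurring value

# Measure of position: quartiles are the three cut points
# that split the ordered data into four parts.
q1, q2, q3 = statistics.quantiles(data, n=4)

# Measures of variation
data_range = max(data) - min(data)
mad = sum(abs(x - mean) for x in data) / len(data)  # mean absolute deviation
variance = statistics.variance(data)          # sample variance (divides by n - 1)
std_dev = statistics.stdev(data)              # sample standard deviation
cv = std_dev / mean                           # coefficient of variation

print(mean, median, mode, (q1, q2, q3), data_range, mad, variance, std_dev, cv)
```

For population data, `statistics.pvariance` and `statistics.pstdev` divide by n instead. Deciles and percentiles follow from the same `quantiles` call with `n=10` and `n=100`.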
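
The four probability sampling methods listed in the deck can be sketched with Python's standard `random` module. This is an illustrative sketch under stated assumptions: the population of 100 numbered members, the `stratum` and `cluster` fields, and the sample size `k` are all hypothetical and chosen only to make the differences between the methods visible.

```python
import random

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 100 members, each tagged with a stratum
# (one of four groups) and a cluster (one of ten blocks of ten).
population = [
    {"id": i, "stratum": "ABCD"[i % 4], "cluster": i // 10}
    for i in range(100)
]
k = 20  # desired sample size

# Simple random sampling: every member has the same chance of selection.
simple = rng.sample(population, k)

# Systematic sampling: random starting point, then every step-th member.
step = len(population) // k
start = rng.randrange(step)
systematic = population[start::step]

# Stratified sampling: split the population into strata,
# then draw a random sample from every stratum.
strata = {}
for member in population:
    strata.setdefault(member["stratum"], []).append(member)
per_stratum = k // len(strata)
stratified = [
    m for members in strata.values() for m in rng.sample(members, per_stratum)
]

# Cluster sampling: randomly pick whole clusters and keep all their members.
chosen_clusters = rng.sample(range(10), 2)
cluster = [m for m in population if m["cluster"] in chosen_clusters]

print(len(simple), len(systematic), len(stratified), len(cluster))
```

Stratified sampling draws from every group, whereas cluster sampling keeps only a few randomly chosen groups in full.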