Statistics deals with the collection, presentation, analysis, and use of data to make decisions, solve problems, and design products and processes
Descriptive Statistics (DS) involves describing the characteristics and properties of a group of persons, places, or things based on easily verifiable facts
Inferential Statistics (IS) draws inferences about a population based on data gathered from samples using techniques of Descriptive Statistics
Sample is a subset of the population selected for study; a numerical measure computed from a sample is called a statistic
Variables are the characteristics being studied, which can take different values from one observation to another
Qualitative Variables take non-numeric values that describe a category or quality rather than an amount
Population refers to the totality of all observations from which the data set is acquired; a numerical measure that describes a population is called a parameter
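The difference between a population value (parameter) and a sample value (statistic) can be sketched in a few lines. This is a minimal illustration only: the population of heights, its size, and the sample size of 50 are all made-up assumptions.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: heights (cm) of 1,000 individuals.
population = [random.gauss(170, 10) for _ in range(1000)]

# Parameter: a measure computed from the entire population.
population_mean = statistics.mean(population)

# Statistic: the same measure computed from a sample only.
sample = random.sample(population, 50)
sample_mean = statistics.mean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two means are usually close but not identical; estimating the first from the second is the job of Inferential Statistics.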
Discrete Data are countable quantities that take distinct, separate values, like the number of individuals
Independent Variable is the factor that the researcher deliberately changes or selects in order to observe its effect
Dependent Variable is the outcome observed in response to changes in the independent variable
Controlled Variable is kept constant so that it does not introduce external effects on the dependent variable
Quantitative Variables are countable or measurable quantities
Continuous Data are measurable quantities that can take any value within an interval, like height and weight
Scales of Measurement:
Nominal: Numbers or labels assigned to categories, with no inherent order
Ordinal: Numbers designate the rank order of data
Interval: Equal differences between consecutive values but an arbitrary zero point, addition and subtraction applicable
Ratio: All basic mathematical operations can be performed, non-arbitrary zero point
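The practical difference between the Interval and Ratio scales shows up when values are divided. The sketch below uses temperature as an assumed example (it does not come from these notes): Celsius has an arbitrary zero, Kelvin does not.

```python
# Hypothetical temperature readings in degrees Celsius (interval scale).
celsius_a, celsius_b = 10.0, 20.0

# Differences are meaningful on an interval scale.
difference = celsius_b - celsius_a

# The ratio of Celsius values is misleading because 0 degrees C is arbitrary:
# 20 degrees C is not "twice as hot" as 10 degrees C.
celsius_ratio = celsius_b / celsius_a

# Kelvin is a ratio scale (non-arbitrary zero), so this ratio is meaningful.
kelvin_a = celsius_a + 273.15
kelvin_b = celsius_b + 273.15
kelvin_ratio = kelvin_b / kelvin_a

print(difference, celsius_ratio, round(kelvin_ratio, 4))
```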
Sampling is the process of taking samples from the population
Probability Sampling gives every member of the population a known, non-zero chance of being selected by choosing units at random, which reduces selection bias
Simple Random Sampling
Systematic Sampling
Stratified Sampling
Cluster Sampling
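The four probability sampling methods above can be sketched with Python's standard `random` module. The population of 100 numbered units, the two strata, and the ten clusters are hypothetical choices made only for illustration.

```python
import random

random.seed(1)  # fixed seed so the sketch is reproducible

population = list(range(1, 101))  # hypothetical unit IDs 1..100
n = 10                            # desired sample size

# Simple Random Sampling: every unit has the same chance of selection.
simple = random.sample(population, n)

# Systematic Sampling: random starting point, then every k-th unit.
k = len(population) // n
start = random.randrange(k)
systematic = population[start::k]

# Stratified Sampling: split into strata, then sample within each stratum.
strata = {"low": population[:50], "high": population[50:]}
stratified = [unit
              for group in strata.values()
              for unit in random.sample(group, n // 2)]

# Cluster Sampling: split into clusters, then take whole clusters at random.
clusters = [population[i:i + 10] for i in range(0, len(population), 10)]
cluster = [unit for chosen in random.sample(clusters, 1) for unit in chosen]

print(simple, systematic, stratified, cluster, sep="\n")
```

Stratified sampling draws a few units from every group, while cluster sampling draws every unit from a few groups.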
Non-Probability Sampling does not use random selection, so an individual's chance of being selected is unknown and may be zero
Convenience Sampling
Quota Sampling
Purposive Sampling
Types of Data Presentation:
Textual Form
Tabular Form
Graphical Form
Univariate Analysis includes:
Measure of Central Tendency
Measure of Position
Measure of Variation
Measure of Shape
Mean is the most widely used measure of central tendency for interval and ratio data and can be arithmetic, geometric, harmonic, trimmed, or root mean square
Median is the midpoint of values when ordered from smallest to largest, unaffected by extreme values
Mode is the most frequently occurring value, used for nominal data and polls
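The measures of central tendency can be compared on a small data set using Python's `statistics` module. The data values are hypothetical, and the trimmed mean here assumes one value is dropped from each end; other trimming proportions are equally valid.

```python
import math
import statistics

data = [2, 4, 4, 8, 32]  # hypothetical values; 32 acts as an extreme value

arithmetic = statistics.mean(data)
geometric = statistics.geometric_mean(data)
harmonic = statistics.harmonic_mean(data)

# Trimmed mean (assumption: drop the single smallest and largest value).
trimmed = statistics.mean(sorted(data)[1:-1])

# Root mean square: square root of the mean of the squared values.
rms = math.sqrt(statistics.mean(x * x for x in data))

median = statistics.median(data)
mode = statistics.mode(data)

print(arithmetic, geometric, harmonic, trimmed, rms, median, mode)
```

The arithmetic mean is 10 while the median is 4: the single extreme value pulls the mean upward but leaves the median where it is.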
Quantiles are cut points that divide ordered data, or the cumulative distribution function of a random variable, into equal-sized groups, including Quartiles (4 groups), Deciles (10 groups), and Percentiles (100 groups)
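Quartiles, deciles, and percentiles can all be obtained from the same function, `statistics.quantiles`, by changing the number of groups `n`. The scores below are hypothetical, and the exact cut points depend on the interpolation method (this sketch uses the function's default); other textbooks and tools may give slightly different values.

```python
import statistics

data = list(range(1, 101))  # hypothetical scores 1..100

quartiles = statistics.quantiles(data, n=4)      # 3 cut points, 4 groups
deciles = statistics.quantiles(data, n=10)       # 9 cut points, 10 groups
percentiles = statistics.quantiles(data, n=100)  # 99 cut points, 100 groups

# The 2nd quartile, 5th decile, and 50th percentile all equal the median.
print(quartiles[1], deciles[4], percentiles[49], statistics.median(data))
```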
Measures of Variation include Range, Mean Absolute Deviation, Variance, Standard Deviation, and Coefficient of Variation
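Each measure of variation listed above can be computed in one line. This sketch assumes the data form a whole population, so it uses the population variance and standard deviation (`pvariance`, `pstdev`); for a sample, `statistics.variance` and `statistics.stdev` divide by n - 1 instead. The data values are hypothetical.

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical values with mean 5

mean = statistics.mean(data)

# Range: largest value minus smallest value.
data_range = max(data) - min(data)

# Mean Absolute Deviation: average distance from the mean.
mad = sum(abs(x - mean) for x in data) / len(data)

# Variance: average squared distance from the mean (population form).
variance = statistics.pvariance(data)

# Standard Deviation: square root of the variance.
std_dev = statistics.pstdev(data)

# Coefficient of Variation: standard deviation relative to the mean.
cv = std_dev / mean

print(data_range, mad, variance, std_dev, cv)
```

The Coefficient of Variation is unitless, which makes it useful for comparing the spread of data sets measured in different units.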