Convenience sampling is a non-probability sampling method where the researcher selects the most readily available individuals to be included in the sample.
Statistics is the science of data, dealing with collection, presentation, analysis, and use of data to make decisions, solve problems, and design products and processes
Branches of Statistics:
Descriptive Statistics: describes characteristics and properties of a group, based on easily verifiable facts; does not draw inferences
Inferential Statistics: draws inferences about a population based on data gathered from samples, leading to predictions about larger data sets
Population:
Totality of all observations from which the data set is acquired
All possible events should be considered
A value describing the population is a parameter
Sample:
Small groups taken from the population
Heterogeneous group representing the population
A value describing a sample is a statistic
Variables in statistics:
Qualitative Variables: categorical data expressed as non-numeric values
Quantitative Variables: numerical data that are countable or measurable quantities
Categories of Quantitative Data:
Continuous Data: measurable quantities that can take infinitely many values within any interval
Discrete Data: countable quantities that take separate values, with only a finite number of values in any interval
Dependent vs Independent Variable:
Independent Variable: naturally occurring phenomenon that can be altered
Dependent Variable: observed upon application of changes to the independent variable
Controlled Variable: kept constant to check for external effects
Extraneous Variable: minimal effect on the result
Scales of Measurement:
Nominal: assigning numbers to categorical data
Ordinal: assigning rank to data levels
Interval: assigning a constant difference between numeric data
Ratio: assigning a continuous range of data with a true zero point, so that ratios between values are meaningful
Sampling:
Process of taking samples from the population
Probability Sampling: eliminates biases against certain events
Simple Random Sampling: selecting members at random so that every member of the population has an equal chance of being chosen
Systematic Sampling: arranging population in order and selecting every k-th element
Stratified Sampling: grouping population into strata and performing random sampling
Cluster Sampling: identifying clusters with heterogeneous characteristics and selecting a cluster as a sample
Non-Probability Sampling: an individual has either a certain chance or no chance of being selected
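The four probability sampling methods above can be sketched in Python as follows. This is only an illustration: the population, the strata, the clusters, and every function name are made up for the example, and the random seed is fixed only so the run is repeatable.

```python
import random

def simple_random_sample(population, n, rng):
    """Every member has the same chance of being selected."""
    return rng.sample(population, n)

def systematic_sample(population, k, rng):
    """Random start among the first k items, then every k-th item."""
    start = rng.randrange(k)
    return population[start::k]

def stratified_sample(strata, n_per_stratum, rng):
    """Draw a simple random sample from each stratum."""
    return {name: rng.sample(members, n_per_stratum)
            for name, members in strata.items()}

def cluster_sample(clusters, rng):
    """Pick one whole cluster at random and keep all of its members."""
    name = rng.choice(sorted(clusters))
    return name, clusters[name]

rng = random.Random(0)                    # fixed seed: assumption for repeatability
population = list(range(1, 21))           # hypothetical population of 20 units
strata = {"low": population[:10], "high": population[10:]}
clusters = {"A": population[0:5], "B": population[5:10],
            "C": population[10:15], "D": population[15:20]}

srs = simple_random_sample(population, 5, rng)
systematic = systematic_sample(population, 4, rng)
stratified = stratified_sample(strata, 2, rng)
cluster_name, cluster_members = cluster_sample(clusters, rng)

print(srs, systematic, stratified, cluster_name, cluster_members)
```

Note the difference between the last two: stratified sampling takes some members from every group, while cluster sampling takes all members of one selected group.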
Data Presentation:
Textual Form: presentation using sentences and paragraphs
Tabular Form: presentation using tables
Graphical Form: pictorial representation
Ungrouped Data: data points treated individually
Grouped Data: data points treated and grouped according to categories
Stem and Leaf Diagram: data split into "stem" and "leaf"
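A stem and leaf diagram can be sketched in Python as below. The sketch assumes non-negative whole numbers, takes the last digit as the leaf and the remaining digits as the stem, and uses made-up scores; the function name is hypothetical.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Group each value's last digit (leaf) under its leading digits (stem)."""
    groups = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)   # e.g. 38 -> stem 3, leaf 8
        groups[stem].append(leaf)
    return dict(groups)

scores = [38, 23, 47, 31, 50, 25, 42, 38, 49]   # hypothetical data
diagram = stem_and_leaf(scores)

for stem in sorted(diagram):
    leaves = " ".join(str(leaf) for leaf in diagram[stem])
    print(f"{stem} | {leaves}")
```

Stems with no observations are simply omitted in this sketch; a hand-drawn diagram would usually still list them with an empty row.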
Frequency Distribution Table:
Class limits: smallest and largest values within the class interval
Class boundaries: more precise expression of the class interval
A class boundary is obtained as the midpoint between the upper limit of the lower class and the lower limit of the next higher class
Frequency: The number of observations falling within a particular class
Class width (class size): Numerical difference between the upper and lower class boundaries of a class interval
Class mark (class midpoint): The middle value of the class, usually symbolized by x
Cumulative Frequency Distribution: Derived from the frequency distribution by adding the class frequencies or partial sums
Types of Cumulative Frequency Distribution:
Less than cumulative frequency (<cf): Number of observations that are less than or below the upper class boundary they correspond to
Greater than cumulative frequency (>cf): Number of observations that are greater than or above the lower class boundary they correspond to
Relative Frequency: Percentage frequency of the class with respect to the total population, used for presenting pie charts
Relative Frequency Distribution: The proportion in percent of the frequency of each class to the total frequency, obtained by dividing the class frequency by the total frequency and multiplying by 100
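As a worked illustration of the terms above, the following Python sketch derives class boundaries, class width, class marks, <cf, >cf, and relative frequency from a small grouped table. The class limits and frequencies are made up, and the data are assumed to be whole numbers so that adjacent classes are separated by a gap of 1.

```python
from itertools import accumulate

# Hypothetical grouped data: (lower limit, upper limit, frequency)
classes = [(10, 19, 4), (20, 29, 7), (30, 39, 6), (40, 49, 3)]

freqs = [f for _, _, f in classes]
total = sum(freqs)

# Gap between the upper limit of one class and the lower limit of the next
gap = classes[1][0] - classes[0][1]

# Class boundaries sit halfway across that gap
boundaries = [(lo - gap / 2, hi + gap / 2) for lo, hi, _ in classes]

# Class width: upper boundary minus lower boundary
widths = [ub - lb for lb, ub in boundaries]

# Class mark: midpoint of the class
marks = [(lo + hi) / 2 for lo, hi, _ in classes]

# <cf accumulates from the first class, >cf from the last class
less_cf = list(accumulate(freqs))
greater_cf = list(accumulate(reversed(freqs)))[::-1]

# Relative frequency in percent
rel_freq = [100 * f / total for f in freqs]

for row in zip(classes, boundaries, marks, less_cf, greater_cf, rel_freq):
    print(row)
```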
Steps in Constructing a Frequency Distribution Table:
1. Get the lowest and highest value in the distribution
2. Get the value of the range
3. Determine the number of classes using Sturges' Formula or the Square Root Principle
4. Determine the size of the class interval
5. Construct the classes
6. Determine the frequency of each class by counting the number of items in each interval
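The six steps above can be sketched in Python as follows. Several choices here are assumptions rather than part of the notes: Sturges' formula is taken in its common form k = 1 + 3.322 log10(N), both the number of classes and the class size are rounded up, the data are whole numbers, and the first class starts at the lowest value. The function name and the data are hypothetical.

```python
import math

def frequency_table(data):
    """Return a list of (lower limit, upper limit, frequency) tuples."""
    lowest, highest = min(data), max(data)                 # step 1
    value_range = highest - lowest                          # step 2
    k = math.ceil(1 + 3.322 * math.log10(len(data)))        # step 3 (Sturges, rounded up)
    size = max(1, math.ceil(value_range / k))               # step 4

    classes = []                                            # step 5
    lower = lowest
    while lower <= highest:
        classes.append((lower, lower + size - 1))
        lower += size

    # step 6: count the items that fall inside each class
    return [(lo, hi, sum(lo <= x <= hi for x in data)) for lo, hi in classes]

# Hypothetical data set of 20 whole-number scores
data = [12, 15, 17, 21, 22, 22, 25, 28, 30, 31,
        33, 35, 36, 38, 40, 41, 44, 45, 47, 49]

table = frequency_table(data)
for lo, hi, f in table:
    print(f"{lo}-{hi}: {f}")
```

Because the class size is rounded up, the number of classes actually built can differ slightly from the value Sturges' formula suggests; the loop simply keeps adding classes until the highest value is covered.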
Graphical Form of Frequency Distribution:
Frequency Polygon: Line graph with points plotted at the midpoint of the classes
Histogram: Bar graph with bars drawn between the exact limits (class boundaries) of the classes
Ogive: Line graph representing the cumulative frequency distribution, where the < ogive represents <cf and the > ogive represents >cf
Steps to create an ogive graph:
1. Calculate Cumulative Frequencies
2. Choose the Scale
3. Plot Points
4. Connect Points with line segments to form the curve representing the ogive
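A minimal Python sketch of steps 1 and 3 is shown below: it computes the points of the < ogive and the > ogive without drawing them. It assumes the class boundaries are given in ascending order with one more boundary than there are classes; the function names and the data are hypothetical.

```python
from itertools import accumulate

def less_than_ogive_points(boundaries, freqs):
    """Pair each boundary with the number of observations below it."""
    cumulative = [0] + list(accumulate(freqs))
    return list(zip(boundaries, cumulative))

def greater_than_ogive_points(boundaries, freqs):
    """Pair each boundary with the number of observations above it."""
    total = sum(freqs)
    below = [0] + list(accumulate(freqs))
    return list(zip(boundaries, [total - b for b in below]))

# Hypothetical classes 10-19, 20-29, 30-39, 40-49
boundaries = [9.5, 19.5, 29.5, 39.5, 49.5]
freqs = [4, 7, 6, 3]

less_points = less_than_ogive_points(boundaries, freqs)
greater_points = greater_than_ogive_points(boundaries, freqs)

print(less_points)
print(greater_points)
```

Plotting either list with any charting tool, boundaries on the horizontal axis and cumulative frequency on the vertical axis, and joining the points in order gives the corresponding ogive.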
Measure of Central Tendency:
Mean: Most widely used parameter for describing ratio data, calculated by summing values and dividing by the number of values
Median: The midpoint of values after they have been ordered from smallest to largest
Mode: The value that appears most frequently
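The three measures can be computed with Python's standard `statistics` module, as in this short sketch on made-up ungrouped data.

```python
import statistics

values = [2, 3, 3, 5, 7, 10]   # hypothetical ungrouped data

mean = statistics.mean(values)       # sum of values divided by their count
median = statistics.median(values)   # middle of the ordered values
mode = statistics.mode(values)       # most frequent value

# A data set can have more than one mode; multimode returns all of them
modes = statistics.multimode([1, 1, 2, 2, 3])

print(mean, median, mode, modes)
```

With an even number of values, as here, the median is the average of the two middle values.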
Measure of Variation (Dispersion):
Range: The difference between the largest and smallest number in the set
Mean Absolute Deviation (MAD): The average of unsigned (absolute) deviations from the mean
Variance: The average of squared deviations from the mean
Standard Deviation: The positive square root of the variance
Coefficient of Variation (CV): The ratio of the standard deviation to the mean, expressed as a percentage
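The measures of variation can be sketched in Python as below, on made-up data. The sketch follows the definitions as worded above, so variance divides by the number of values (the population form); a sample variance would divide by one less than that, which is an alternative not covered here.

```python
import statistics

values = [2, 4, 4, 4, 5, 5, 7, 9]   # hypothetical data

mean = statistics.mean(values)

value_range = max(values) - min(values)

# Mean absolute deviation: average distance from the mean, ignoring sign
mad = sum(abs(x - mean) for x in values) / len(values)

# Population variance and standard deviation (divide by N)
variance = statistics.pvariance(values)
std_dev = statistics.pstdev(values)

# Coefficient of variation, in percent
cv = 100 * std_dev / mean

print(value_range, mad, variance, std_dev, cv)
```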
Measure of Shape:
Skewness: Degree of asymmetry of a distribution about its mean
Kurtosis: The degree of peakedness exhibited by the distribution
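The notes do not give formulas for these two measures. One common choice, assumed here, is the moment-based form: skewness as the third central moment divided by the 1.5 power of the second, and kurtosis as the fourth central moment divided by the square of the second. The function names and data below are hypothetical.

```python
def central_moment(values, order):
    """Average of (x - mean) raised to the given power."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** order for x in values) / len(values)

def skewness(values):
    """Moment-based skewness: 0 for a symmetric data set."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def kurtosis(values):
    """Moment-based kurtosis (not excess kurtosis)."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2

symmetric = [1, 2, 3, 4, 5]          # hypothetical symmetric data
right_skewed = [1, 1, 1, 2, 10]      # hypothetical data with a long right tail

print(skewness(symmetric), skewness(right_skewed), kurtosis(symmetric))
```

Under this definition a normal distribution has a kurtosis of 3, so values above 3 indicate a more peaked distribution and values below 3 a flatter one.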