A hypothesis is an educated guess or proposition that attempts to explain a set of facts or a natural phenomenon.
Hypothesis testing is another area of Inferential Statistics. It is a decision-making process for evaluating claims about a population based on the characteristics of a sample purportedly coming from that population.
The process of hypothesis testing involves making a decision between two opposing hypotheses (null and its alternative).
Two Types of Hypotheses:
NULL HYPOTHESIS, denoted by H₀, is a statement that there is NO difference between a parameter and a specific value, or that there is NO difference between two parameters.
ALTERNATIVE HYPOTHESIS, denoted by Hₐ, is a statement that there is a difference between a parameter and a specific value, or that there is a difference between two parameters.
Types of Alternative Hypothesis
A non-directional alternative hypothesis (two-tailed test) states that the null hypothesis is wrong. It does not predict whether the parameter of interest is larger or smaller than the reference value specified in the null hypothesis.
A directional alternative hypothesis states that the null hypothesis is wrong, and also specifies whether the true value of the parameter is greater than (one-tailed test, right tail) or less than (one-tailed test, left tail) the reference value specified in the null hypothesis.
The level of significance, denoted by alpha or α, is a measure of the strength of the evidence that must be present in your sample before you will reject the null hypothesis and conclude that the effect is statistically significant.
The researcher determines the significance level before conducting the experiment. To obtain the level of significance, use the formula α = 1 - confidence level.
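As a quick illustration of the formula above, the following sketch converts a confidence level into a significance level. The function name significance_level is a hypothetical helper introduced here for illustration only.

```python
def significance_level(confidence_level: float) -> float:
    """Return alpha = 1 - confidence level (both given as proportions)."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be strictly between 0 and 1")
    return 1 - confidence_level


# A 95% confidence level corresponds to a significance level of about 0.05.
alpha = significance_level(0.95)
print(round(alpha, 2))
```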
Types of Errors
Type I Error: If the null hypothesis is true and rejected, the decision is incorrect.
Type II Error: If the null hypothesis is false and accepted (that is, not rejected), the decision is incorrect.
Under the normal curve, the rejection region is the set of values of the test statistic for which we reject the null hypothesis. This region is also called the critical region.
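The idea of a rejection region can be sketched in code. The example below assumes a test statistic that follows the standard normal distribution (a z statistic) and uses Python's standard-library NormalDist to find the critical values. The function names critical_values and in_rejection_region, and the tail labels "two", "right", and "left", are hypothetical choices made for this illustration.

```python
from statistics import NormalDist


def critical_values(alpha: float, tail: str) -> tuple:
    """Return (lower, upper) z critical values; None marks an unused side."""
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "two":
        upper = z.inv_cdf(1 - alpha / 2)  # alpha is split between both tails
        return (-upper, upper)
    if tail == "right":
        return (None, z.inv_cdf(1 - alpha))
    if tail == "left":
        return (z.inv_cdf(alpha), None)
    raise ValueError("tail must be 'two', 'right', or 'left'")


def in_rejection_region(z_stat: float, alpha: float, tail: str) -> bool:
    """Return True when the test statistic falls in the rejection region."""
    lower, upper = critical_values(alpha, tail)
    below = lower is not None and z_stat <= lower
    above = upper is not None and z_stat >= upper
    return below or above


# Two-tailed test at alpha = 0.05: critical values are about -1.96 and 1.96.
print(in_rejection_region(2.10, 0.05, "two"))    # True: reject the null hypothesis
# Right-tailed test at alpha = 0.05: critical value is about 1.645.
print(in_rejection_region(1.50, 0.05, "right"))  # False: do not reject
```

Note that a one-tailed test places all of alpha in a single tail, which is why its critical value is closer to zero than the two-tailed one at the same significance level.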
Other Elements of Hypothesis Testing
Population - refers to the totality of objects, individuals, characteristics, or reactions of interest.
Sample - is a group of subjects carefully selected from a population of interest.
Parameter - is the numerical value that describes a characteristic of a population.
Statistic - is the numerical value that describes a particular sample.
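To make the distinction between a parameter and a statistic concrete, here is a minimal sketch. The population is made-up data (1,000 simulated test scores), and the sample size of 40 is an arbitrary choice for illustration.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch gives the same numbers each run

# Hypothetical population: 1,000 simulated test scores.
population = [random.gauss(75, 10) for _ in range(1000)]

# A simple random sample of 40 subjects drawn from that population.
sample = random.sample(population, 40)

parameter = mean(population)  # describes the whole population
statistic = mean(sample)      # describes only this particular sample

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```

The two values are usually close but not equal. Hypothesis testing uses the statistic to evaluate a claim about the parameter.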
Hypothesis Testing is the process of using statistics to evaluate the utility and validity of a research theory. This activity always begins with formulating a statement or expectation about a certain phenomenon.
Use "μ" for the mean (average) and "p" for a proportion.
Test Statistic - a value computed from the sample data that is used to decide whether to reject the null hypothesis. It compares your data with what is expected under the null hypothesis.
If n < 30 and the sample standard deviation s is given, use the t-test.
If n ≥ 30 and the population standard deviation σ is given, use the z-test.
The rejection region, or critical region, plays an important role in hypothesis testing. Aside from showing the area where we decide whether or not to reject the null hypothesis, it also gives us the opportunity to determine whether an error is being committed in hypothesis testing.
The Central Limit Theorem states that if you have a population with mean μ and standard deviation σ and take sufficiently large random samples from the population with replacement, then the sample means will be approximately normally distributed.
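The theorem can be checked with a small simulation. The sketch below assumes a skewed population (an exponential distribution with mean 2, whose standard deviation is also 2). Drawing independent values from a distribution plays the role of sampling with replacement. The sample size of 50 and the 2,000 repetitions are arbitrary choices. The code also relies on a standard fact not stated above: the sample means center on μ with standard deviation σ/√n.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical skewed population: exponential with mean 2 and sigma 2.
mu, sigma = 2.0, 2.0
n = 50              # size of each sample
num_samples = 2000  # number of samples drawn

# Draw many samples and record the mean of each one.
sample_means = [
    mean([random.expovariate(1 / mu) for _ in range(n)])
    for _ in range(num_samples)
]

print(f"mean of sample means: {mean(sample_means):.3f} (population mean {mu})")
print(f"std. dev. of sample means: {stdev(sample_means):.3f} "
      f"(sigma / sqrt(n) = {sigma / sqrt(n):.3f})")
```

Even though the population itself is far from normal, a histogram of sample_means would look roughly bell-shaped.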
If the population standard deviation is unknown and n < 30, then the t-test is appropriate.
If the population standard deviation is known and n ≥ 30, then the z-test is used.
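The two rules above can be sketched as a small helper, together with the test statistics themselves. The formulas used here, z = (x̄ - μ₀) / (σ / √n) and t = (x̄ - μ₀) / (s / √n), are the standard one-sample forms and are an assumption, since the text does not spell them out. All function names and the sample data are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def choose_test(n: int, sigma_known: bool) -> str:
    """Apply the two rules stated above; other cases are not covered by them."""
    if sigma_known and n >= 30:
        return "z-test"
    if not sigma_known and n < 30:
        return "t-test"
    return "not covered by the rules above"


def z_statistic(sample_mean: float, mu0: float, sigma: float, n: int) -> float:
    """One-sample z statistic using the known population standard deviation."""
    return (sample_mean - mu0) / (sigma / sqrt(n))


def t_statistic(data: list, mu0: float) -> float:
    """One-sample t statistic using the sample standard deviation s."""
    n = len(data)
    return (mean(data) - mu0) / (stdev(data) / sqrt(n))


# Known sigma = 8 and n = 64, so the z-test applies.
print(choose_test(64, sigma_known=True))
print(z_statistic(sample_mean=52, mu0=50, sigma=8, n=64))

# Unknown sigma and only 5 observations, so the t-test applies.
scores = [12, 15, 11, 14, 13]
print(choose_test(len(scores), sigma_known=False))
print(round(t_statistic(scores, mu0=12), 4))
```

The computed statistic is then compared with the critical value for the chosen significance level to decide whether it falls in the rejection region.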
To test a claim about a population proportion, we use the z-test for a population proportion.
Computing the z formula
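The formula itself is not reproduced in this text. As a sketch, the commonly used one-sample z statistic for a proportion is z = (p̂ - p₀) / √(p₀(1 - p₀) / n), where p̂ is the sample proportion, p₀ is the proportion claimed in the null hypothesis, and n is the sample size. Treat this form as an assumption about what is intended here. The function name and the example numbers are hypothetical.

```python
from math import sqrt


def z_for_proportion(successes: int, n: int, p0: float) -> float:
    """One-sample z statistic for a proportion.

    successes / n is the sample proportion (p-hat); p0 is the value of the
    population proportion claimed under the null hypothesis.
    """
    p_hat = successes / n
    standard_error = sqrt(p0 * (1 - p0) / n)
    return (p_hat - p0) / standard_error


# Hypothetical data: 116 of 200 respondents agree; the claim is p = 0.50.
z = z_for_proportion(successes=116, n=200, p0=0.50)
print(round(z, 4))
```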
Bivariate data are data that involve two variables. The purpose of analyzing this type of data is to describe the relationship between the variables in terms of strength and direction. The relationship can be displayed using a graph called a scatter plot.
The form of a scatter plot may be linear or nonlinear: the points either closely follow a straight line, or they form a curve while increasing or decreasing steadily.
Constructing a scatter plot is just the same as plotting points in the Cartesian Coordinate Plane.
In linear correlation, the measure most often used to describe the association between two variables is the Pearson product-moment coefficient of correlation, r, also known as the sample correlation coefficient. To find its value, we use the formula,
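The formula is not reproduced in this text. A widely used computational form, assumed here, is r = (nΣxy - ΣxΣy) / √([nΣx² - (Σx)²][nΣy² - (Σy)²]). The sketch below implements that form. The function names pearson_r and direction, and the data, are hypothetical.

```python
from math import sqrt


def pearson_r(xs: list, ys: list) -> float:
    """Pearson product-moment correlation coefficient (computational form)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have the same length, at least 2")
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    if denominator == 0:
        raise ValueError("r is undefined when either variable is constant")
    return numerator / denominator


def direction(r: float) -> str:
    """Describe the direction of the relationship from the sign of r."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "no linear relationship"


# Hypothetical paired data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
r = pearson_r(x, y)
print(round(r, 4), direction(r))
```

The value of r always lies between -1 and 1, with the endpoints reached only when the points fall exactly on a straight line.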
Solving Correlation Problems: The correlation coefficient is a measure of the strength of the relationship between two variables.
We analyze and interpret the computed value of r by looking at its direction and strength. The direction depends on the sign of r (positive or negative).
Positive correlation implies that high values of one variable correspond to high values of the other variable, and low values of one correspond to low values of the other.
Negative correlation implies that high values of one variable correspond to low values of the other variable, and low values of one correspond to high values of the other. The strength of the correlation, on the other hand, depends on the magnitude (absolute value) of r.
The table below is used to determine the strength of the computed r.
An independent variable is a variable that is hypothesized to have an impact on the dependent variable. It can be manipulated and is usually denoted by X.
A dependent variable is the variable being tested. Its value relies or depends on the value of the independent variable, and it is usually denoted by Y.
Remember that the dependent variable depends upon the independent variable.
The probability of committing a Type I error is represented by α (Greek letter alpha), while the probability of committing a Type II error is denoted by β (Greek letter beta).
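To see how α and β relate, the sketch below computes β for one specific setup: a right-tailed z-test of H₀: μ = μ₀ with known σ, evaluated at an assumed true mean μ₁. The setup, the function name, and the numbers are all hypothetical. β is the probability that the sample mean falls below the critical value (so the null hypothesis is not rejected) when the true mean is actually μ₁.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error_probability(mu0: float, mu1: float, sigma: float,
                              n: int, alpha: float) -> float:
    """Beta for a right-tailed z-test of H0: mu = mu0 when the true mean is mu1."""
    se = sigma / sqrt(n)  # standard error of the sample mean
    # Smallest sample mean that lands in the rejection region.
    critical_mean = mu0 + NormalDist().inv_cdf(1 - alpha) * se
    # Probability of staying below that cutoff when the true mean is mu1.
    return NormalDist(mu1, se).cdf(critical_mean)


beta = type_ii_error_probability(mu0=50, mu1=53, sigma=8, n=64, alpha=0.05)
print(f"alpha = 0.05, beta = {beta:.4f}")
```

In this setup, lowering α moves the cutoff farther into the tail, which raises β. The two error probabilities trade off against each other for a fixed sample size.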