HUH, PERFECT SCORE ON THE MMW FINALS?

Cards (69)

  • Data management involves organizing, storing, and handling data to ensure accuracy, availability, and security. It enables actionable insights, operational efficiency, competitiveness, and compliance with regulations.
  • Statistics aids in data management by providing tools to summarize, analyze, and interpret data. It ensures data quality, supports data-driven decisions, and helps extract insights from complex datasets for better management and decision-making.
  • Statistics is the science of collecting, organizing, presenting, analyzing, and interpreting quantitative or numerical data. It transforms numbers into useful information and has two main categories: descriptive and inferential.
  • Descriptive Statistics: Involves collecting, organizing, and presenting data to provide a clear overview, focusing on central tendency and variability. It is used to summarize and describe the features of a dataset.
  • Descriptive Statistics is concerned with describing the characteristics and properties of a group of persons, places, or things of interest.
  • Inferential Statistics: Goes beyond describing data by making predictions or inferences about a population based on a sample. It includes techniques like hypothesis testing, confidence intervals, and regression analysis to draw conclusions and make decisions with a known level of uncertainty.
  • Inferential Statistics consists of methods that use sample results to help make decisions or predictions about a population.
  • Advancements in technology are transforming statistics, enabling effective use of big data. This chapter explores Excel, a popular tool in statistics that is powerful for managing, analyzing, and visualizing data efficiently for statistical applications.
  • Population – consists of all elements whose characteristics are being studied. 
  • Sample – a portion of the population selected for study
  • Constant – a characteristic of an object that does not vary
  • Variable – a characteristic or condition that can change or take on different values
  • Data – the values associated with a variable
  • Parameter – a descriptive measure of a population
  • A measure of central tendency or position is a single figure that represents the general level of the values in a set of data.
  • The 3 most common measures of central tendency (see the sketch after this list):
    • Mean — refers to the sum of the value of the items divided by the number of items.
    • Median — the middle value in an ordered set of data. 
    • Mode — the most frequently occurring value in a set of data.
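A quick supplementary sketch of these three measures in Python, using only the standard-library statistics module; the dataset is made up purely for illustration (the cards themselves point to Excel as the tool).

```python
import statistics

# Made-up scores, used only to illustrate the three measures
scores = [70, 75, 75, 80, 85, 90, 95]

mean = statistics.mean(scores)      # sum of the values divided by the number of values
median = statistics.median(scores)  # middle value of the ordered data
mode = statistics.mode(scores)      # most frequently occurring value

print("Mean:", mean)      # 81.43 (approximately)
print("Median:", median)  # 80
print("Mode:", mode)      # 75
```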
  • Percentiles are values that divide a set of data into 100 equal parts.
  • Some Properties of Mean, Median, and Mode
    • Mean is unique: a dataset has only one mean.
    • Mean is affected by extremely high and extremely low values, called outliers.
    • Median is used when one must determine whether the values fall into the upper half or lower half of the distribution.
    • Mode can be used when the data are nominal.
  • Median is also a measure of location.
  • Quartiles – values that divide a set of observations into 4 equal parts.
    • 25% of data falls below Q1
    • 50% of data falls below Q2 (the median)
    • 75% of data falls below Q3
  • Deciles – values that divide a set of observations into 10 equal parts.
    • 30% of the data falls below D3
    • 80% of the data falls below D8
    • etc.
  • Percentiles – values that divide a set of observations into 100 equal parts.
    • 40% of the data falls below P40
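A minimal sketch of quartiles, deciles, and percentiles using the standard library's statistics.quantiles (Python 3.8+). The data are invented, and different textbooks and software interpolate these cut points slightly differently, so exact values may vary.

```python
import statistics

# Invented observations, used only to illustrate the cut points
data = [12, 15, 18, 20, 22, 25, 28, 30, 33, 35, 40, 45]

# statistics.quantiles returns the n-1 values that divide the data into n equal parts
quartiles = statistics.quantiles(data, n=4)      # Q1, Q2 (the median), Q3
deciles = statistics.quantiles(data, n=10)       # D1 ... D9
percentiles = statistics.quantiles(data, n=100)  # P1 ... P99

print("Q1, Q2, Q3:", quartiles)
print("D3 (30% of the data falls below it):", deciles[2])
print("P40 (40% of the data falls below it):", percentiles[39])
```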
  • A measure of variability is a value that describes the spread or dispersion of a set of data points.
  • The range is the difference between the highest and lowest values in a dataset.
  • The variance is the average of the squared deviations from the mean.
  • The standard deviation is the square root of the variance.
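A short sketch of these three measures of variability on made-up numbers; pvariance and pstdev use the population formulas (dividing by N), which match the "average of the squared deviations" wording above, while statistics.variance and statistics.stdev would give the sample versions.

```python
import statistics

# Made-up data points, used only to illustrate spread
data = [4, 8, 6, 5, 3, 7, 9]

data_range = max(data) - min(data)     # highest value minus lowest value
variance = statistics.pvariance(data)  # average of the squared deviations from the mean
std_dev = statistics.pstdev(data)      # square root of the variance

print("Range:", data_range)            # 6
print("Variance:", variance)           # 4
print("Standard deviation:", std_dev)  # 2.0
```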
  • Correlation analysis is a group of statistical techniques to measure the association between two variables.
  • Correlation coefficient is a measure of the relative strength of a linear relationship between two numerical variables.
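A worked sketch of the (Pearson) correlation coefficient computed directly from its definition, the covariance divided by the product of the standard deviations; the paired data below are invented purely for illustration.

```python
import math

# Invented paired data, e.g. hours studied (x) versus exam score (y)
x = [1, 2, 3, 4, 5]
y = [52, 60, 65, 72, 80]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Pearson's r: sum of the products of deviations divided by the
# product of the square roots of the summed squared deviations
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
sx = math.sqrt(sum((xi - mean_x) ** 2 for xi in x))
sy = math.sqrt(sum((yi - mean_y) ** 2 for yi in y))
r = sxy / (sx * sy)

print(f"r = {r:.3f}")  # close to +1, i.e. a strong positive linear relationship
```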
  • Simple linear regression is a fundamental tool in statistics for understanding the relationship between two variables and making predictions based on that relationship.
  • The dependent variable (y), also known as the response variable, is the variable being predicted or estimated.
  • The independent variable (x), also known as the predictor or explanatory variable, is the variable believed to have an impact on the dependent variable.
  • The intercept, denoted by a, is the expected mean value of the dependent variable when the independent variable is set to zero.
  • The slope, b, is the average change in the dependent variable for every unit change in the independent variable.
  • The coefficient of determination (r²) is the proportion of the total variation in the dependent variable that is explained or accounted for by the variation in the independent variable.
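A least-squares sketch, reusing the same invented data as the correlation example, that computes the intercept a, the slope b, and the coefficient of determination r² defined above, and then uses the fitted line to make a prediction.

```python
# Invented paired data: x is the independent (predictor) variable,
# y is the dependent (response) variable
x = [1, 2, 3, 4, 5]
y = [52, 60, 65, 72, 80]

n = len(x)
mean_x = sum(x) / n
mean_y = sum(y) / n

# Slope b: average change in y for every unit change in x
b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sum(
    (xi - mean_x) ** 2 for xi in x
)
# Intercept a: expected mean value of y when x is zero
a = mean_y - b * mean_x

# r^2: proportion of the total variation in y explained by the variation in x
ss_total = sum((yi - mean_y) ** 2 for yi in y)
ss_residual = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
r_squared = 1 - ss_residual / ss_total

print(f"y = {a:.2f} + {b:.2f}x, r^2 = {r_squared:.3f}")
print(f"Predicted y at x = 6: {a + b * 6:.1f}")
```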
  • Logic is the study or science of correct reasoning 
  • A proposition (p), or logic statement, is a statement that is either true or false but not both simultaneously.
  • A propositional variable, represented by a lowercase or capital letter of the English alphabet, is used to denote an arbitrary proposition with an unspecified truth value.
  • A proposition that conveys a simple idea with no connecting words and can be represented by only one propositional variable is called a simple proposition.
  • A compound proposition is a proposition formed by combining two or more simple propositions using connecting words.
  • A logical operator is a connecting word used to construct compound propositions by combining simple propositions.
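A small sketch that prints a truth table for one arbitrarily chosen compound proposition, using Python's not, and, and or as stand-ins for the logical operators; p and q are the propositional variables.

```python
import itertools

# Truth table for the example compound proposition (p and q) or (not p),
# built from the simple propositions p and q with the operators AND, OR, NOT
print("p      q      | (p and q) or (not p)")
for p, q in itertools.product([True, False], repeat=2):
    compound = (p and q) or (not p)
    print(f"{str(p):<6} {str(q):<6} | {compound}")
```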