EDA

  • Statistics is the branch of science that deals with the collection, organization, presentation, analysis, and interpretation of data
  • Examples of applications of statistics include:
    • Population census
    • Public choices/responses
    • Product advertisements
    • Teaching and instruction
    • Scientific observations and experiments
    • Engineering data collection
  • Two main branches of statistics:
    • Descriptive statistics: methods for organizing and summarizing data
    • Inferential statistics: generalizing from a sample to the population and assessing the reliability of such generalizations
  • Uncertainty:
    • Occurs when the true value of a certain quantity at a single instance is unknown
    • Derived from theoretical information and expressed in terms of probabilities
  • Variability:
    • Occurs when a quantity is measured at multiple instances with considerable differences between measurements
    • Derived from data extracted from observations and experiments, expressed in terms of frequencies
  • Sources of uncertainty:
    • Aleatory uncertainties: caused by natural randomness
    • Epistemic uncertainties: caused by an "incomplete" understanding of reality
  • Data analysis process:
    1. Understanding the nature of the problem
    2. Deciding what to measure and how to measure it
    3. Data collection
    4. Data summarization and preliminary analysis
    5. Formal data analysis
    6. Interpretation of results
  • Population vs Sample:
    • Population refers to the entire collection of individuals or objects about which information is desired
    • Sample is a subset of the population selected for study, ideally chosen so that it represents the population
  • Statistic vs Parameter:
    • Statistic: a summary measure that describes a specific characteristic of a sample
    • Parameter: a summary measure that describes a specific characteristic of a population
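
The distinction between a statistic and a parameter can be sketched in a few lines of Python. This is only an illustration: the population below is made-up data generated for the example, and the mean is just one possible summary measure.

```python
import random
import statistics

# Hypothetical population: 1,000 made-up measurements (not data from these notes).
rng = random.Random(0)
population = [rng.gauss(50, 10) for _ in range(1000)]

# Parameter: a summary measure computed from the entire population.
population_mean = statistics.fmean(population)

# Statistic: the same summary measure computed from a sample (a subset).
sample = rng.sample(population, 30)
sample_mean = statistics.fmean(sample)

print(f"parameter (population mean): {population_mean:.2f}")
print(f"statistic (sample mean):     {sample_mean:.2f}")
```

The two values are usually close but not equal; inferential statistics deals with how reliable the sample value is as an estimate of the population value.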
  • Data and Measurement:
    • Data is a collection of observations on one or more variables
    • Variable is a characteristic whose value may change from one observation to another
  • Classification of Data:
    • Categorical (Qualitative): individual observations are categorical responses
    • Numerical (Quantitative): individual observations are expressed as numbers
      • Discrete (numerical): possible values correspond to isolated points on the number line
      • Continuous (numerical): possible values form an entire interval on the number line
  • Measurement:
    • Process of determining the value (for numerical data) or label (for categorical data) of the variable based on observations
  • Levels of Measurement:
    • Ratio Level
    • Interval Level
    • Ordinal Level
    • Nominal Level
  • Data Collection Methods:
    1. Use of documented data
    2. Surveys
    3. Experiments
      • Independent variables: variables that the experimenter may manipulate directly
      • Dependent variables: variables that cannot be manipulated directly but whose values change in response to the independent variables
    4. Observations
  • Sampling:
    • Process of obtaining or selecting samples from a population related to a study
  • Sampling Bias:
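    • Selection Bias: the sample differs from the population because part of the population is systematically excluded
    • Measurement or Response Bias: the sample differs from the population because the method of observation tends to produce values that differ from the true values
    • Nonresponse Bias: the sample differs from the population because data are not obtained from every individual selected for the sample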
  • Sampling Methods:
    • Random Sampling
    • Stratified Random Sampling
    • Cluster Sampling
    • Systematic Sampling
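
The four sampling methods listed above can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the population of 100 numbered units, and the five groups of 20 are all hypothetical, and the same groups are reused as strata and as clusters only to keep the example short.

```python
import random

def simple_random_sample(population, n, rng):
    # Every unit has the same chance of being selected.
    return rng.sample(population, n)

def stratified_random_sample(strata, n_per_stratum, rng):
    # Take a separate random sample from every stratum (subgroup).
    sample = []
    for members in strata.values():
        sample.extend(rng.sample(members, n_per_stratum))
    return sample

def cluster_sample(clusters, n_clusters, rng):
    # Randomly choose whole clusters and keep every unit inside them.
    chosen = rng.sample(list(clusters), n_clusters)
    return [unit for label in chosen for unit in clusters[label]]

def systematic_sample(population, k, rng):
    # Pick a random start among the first k units, then take every k-th unit.
    start = rng.randrange(k)
    return population[start::k]

rng = random.Random(42)
population = list(range(1, 101))  # hypothetical units numbered 1..100
groups = {g: population[g * 20:(g + 1) * 20] for g in range(5)}

srs = simple_random_sample(population, 10, rng)
strat = stratified_random_sample(groups, 2, rng)
clus = cluster_sample(groups, 2, rng)
syst = systematic_sample(population, 10, rng)
print(len(srs), len(strat), len(clus), len(syst))
```

Stratified sampling draws from every group, while cluster sampling keeps all units from only a few randomly chosen groups.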
  • Introduction to Design of Experiments:
    • Experiment: method of collecting data with human intervention on conditions affecting variables
    • Explanatory Variables: independent variables controlled by the experimenter
    • Response Variables: dependent variables that are measured as outcomes and are expected to change with the explanatory variables
    • Experimental Conditions or Treatments: the particular settings of the explanatory variables under which the response is observed
  • Strategies for Design of Experiments:
    • Random Assignment
    • Blocking
    • Direct Control
    • Replication
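
Random assignment, blocking, and replication can be illustrated with a short sketch. The unit labels, the treatment names, and the two "batch" blocks are hypothetical, and the helper functions are one possible way to write this, not a standard API.

```python
import random

def random_assignment(units, treatments, rng):
    """Shuffle the units, then deal them out to treatments in turn."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

def blocked_assignment(blocks, treatments, rng):
    """Carry out random assignment separately within each block."""
    return {label: random_assignment(units, treatments, rng)
            for label, units in blocks.items()}

rng = random.Random(7)
units = [f"unit{i}" for i in range(12)]   # hypothetical experimental units
treatments = ["A", "B", "C"]              # hypothetical treatments

# Replication: each treatment is applied to several units (4 each here).
plain = random_assignment(units, treatments, rng)

# Blocking: units that share an extraneous factor (a hypothetical "batch")
# are grouped first, so every treatment appears equally often in each block.
blocks = {"batch 1": units[:6], "batch 2": units[6:]}
blocked = blocked_assignment(blocks, treatments, rng)

print(plain)
print(blocked)
```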
  • Techniques in Data Organization and Presentation:
    • Textual
    • Tabular
    • Graphical
  • Techniques and Methods Used in Data Organization and Presentation:
    1. Raw Data and Array
    2. Frequency Distribution
      • Absolute Frequency
      • Relative Frequency
      • Frequency Histogram, Frequency Polygon, Ogive
    3. Line Chart
    4. Bar Charts
    5. Pie Chart
    6. Pictograph
    7. Statistical Map
    8. Dotplot
    9. Stem-and-Leaf Display
    10. Scatterplot
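
As a small worked example of items 1 and 2 above, the sketch below turns raw data into an array (the sorted data) and then into a frequency distribution. The data values are made up for illustration. The cumulative frequency column is included because it is the quantity plotted in an ogive.

```python
from collections import Counter
from itertools import accumulate

# Hypothetical raw data: ten made-up observations of a discrete variable.
data = [2, 3, 3, 1, 2, 3, 4, 2, 3, 1]

array = sorted(data)           # array: the raw data arranged in order
n = len(array)

absolute = Counter(array)      # absolute frequency: count of each value
values = sorted(absolute)
relative = {v: absolute[v] / n for v in values}   # relative frequency: count / n
cumulative = dict(zip(values, accumulate(absolute[v] for v in values)))

print(f"{'value':>5} {'absolute':>8} {'relative':>8} {'cumulative':>10}")
for v in values:
    print(f"{v:>5} {absolute[v]:>8} {relative[v]:>8.2f} {cumulative[v]:>10}")
```

The relative frequencies always sum to 1, and the last cumulative frequency equals the number of observations.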