Stats and prob

Cards (21)

  • Data science is the discovery and practice that involves the collection, management, processing, analysis, visualization, and interpretation of vast amounts of heterogeneous data associated with a diverse array of scientific, translational, and interdisciplinary applications.
  • Data science lies at the intersection of the statistical and computational sciences and domain-specific scholarly disciplines and application areas. It incorporates the availability and diversity of quantitative information and the theory and practice of statistics and computer science that make processing and understanding possible.
  • Data mining and big data share the same concept: the use of the most powerful hardware, the most powerful programming systems, and the most efficient algorithms to solve problems in science, commerce, healthcare, government, the humanities, and many other fields of human endeavor.
  • Data science is a new interdisciplinary field that synthesizes and builds on statistics, informatics, computing, communication, management, and sociology to study data and its environments.
  • Data science is primarily used to make decisions and predictions by using predictive causal analytics, prescriptive analytics (predictive plus decision science), and machine learning.
  • The definitions above can be summarized in the following aspects:

    1. The center of data science is data, especially big data.
    2. The purpose of data science is to obtain information or knowledge from data that will help in making better decisions and in better understanding the development and change of nature or society.
    3. Data science is a multidisciplinary field that applies theories and technologies from several disciplines.
  • This professor believes that data science is statistics.
    Karl Broman, professor at the University of Wisconsin
  • Karl Broman once said
    “If you’re analyzing data, you’re doing statistics.”
  • You can call it data science or “informatics” or “analytics”
  • For applied statistician Nate Silver:
    “Data scientist” is an attractive term for a statistician.
  • He is an applied statistician.
    Nate Silver
  • He added that statistics is a “science” and that “data scientist” is “slightly redundant in some way”.
    Nate Silver
  • _ is a professor of statistics and director of the Applied Statistics Center at Columbia University.

    Andrew Gelman
  • In the early days of Statistics, people would pluck numbers out of tables to make predictions about future events.
  • He believes that statistics is “not the important part of data science or even close”
    Andrew Gelman
  • He emphasized that data science deals with databases and coding, and statistics is just an option
    Vasant Dhar
  • Moreover, Vasant Dhar, professor of information systems at New York University, believes that data science differs from the existing practice of data analysis across all disciplines, which focuses only on explaining data sets.
  • A data scientist must be able to do the following tasks:

    • Collect large amounts of messy data and transform it into a more usable format.
    • Solve business-related problems using data-driven techniques.
    • Work with a variety of programming languages (SAS, R, Python).
    • Have a solid grasp of statistics, including tests and distributions.
    • Communicate and collaborate with both IT and business.
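
    The first and fourth tasks above can be illustrated with a minimal sketch. It assumes a small, made-up list of messy records (the field names, values, and the clean_record helper are all hypothetical, not from the source) and uses only Python's standard library to tidy the data and compute basic summary statistics.

```python
import statistics

# Hypothetical messy input: inconsistent casing, stray whitespace, missing values.
raw_records = [
    {"name": "  alice ", "age": "34"},
    {"name": "BOB", "age": ""},
    {"name": "Carol", "age": " 29"},
    {"name": "dave", "age": "n/a"},
]


def clean_record(record):
    """Return a tidy copy of one record, or None if the age cannot be parsed."""
    name = record["name"].strip().title()
    age_text = record["age"].strip()
    if not age_text.isdigit():
        return None  # drop rows whose age is missing or not a whole number
    return {"name": name, "age": int(age_text)}


# Keep only the records that could be cleaned.
cleaned = [r for r in (clean_record(rec) for rec in raw_records) if r is not None]

# Basic descriptive statistics on the usable ages.
ages = [r["age"] for r in cleaned]
mean_age = statistics.mean(ages)
stdev_age = statistics.stdev(ages)  # sample standard deviation

print(cleaned)
print(mean_age, stdev_age)
```

    Dropping unparseable rows is only one possible choice; a real project might instead impute missing values or flag them for review.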
  • Is a language and environment for statistical computing and graphics, similar to the S language developed at Bell Laboratories.
    R
  • Is an object-oriented, interpreted, and interactive programming language created by Guido van Rossum.
    Python
  • The _ language is a programming language developed by Anthony James Barr as a statistical analysis tool.
    SAS