STAT 305: Introduction to Data Science, Fall 2020


Major References

  • R for Data Science by Hadley Wickham and Garrett Grolemund
  • Data Visualization: A Practical Introduction by Kieran Healy
  • R Programming for Data Science by Roger Peng
  • R Markdown: The Definitive Guide by Yihui Xie, J. J. Allaire, and Garrett Grolemund


Some Resources to Learn R

  • The official intro, "An Introduction to R", available online in HTML and PDF
  • John Verzani, "simpleR", in PDF
  • Quick-R: A well-organized list of R programming skills.
  • Patrick Burns, The R Inferno. "If you are using R and you think you're in hell, this is a map for you."
  • Thomas Lumley, "R Fundamentals and Programming Techniques", in PDF
  • Statistical Inference via Data Science: A ModernDive into R and the Tidyverse



More Resources to Learn Computing

    Try R: A good resource for learning R online
    R tutorials: Yet another collection of resources for learning R
    Exploratory Data Analysis: A wide range of statistical topics is covered on this web page, with video lectures and other supplementary materials.
    Stat Apps: Statistics apps for data visualization
    Markdown Themes: Appearance and style themes for creating HTML documents with RStudio.
    Shiny Apps: A comprehensive resource for Shiny apps

    OpenIntro: An excellent resource for introductory statistics. Apart from lecture notes, it also has well-explained examples with R code.
    Software Carpentry: A valuable resource for general scientific computing. It is not specific to R but has helpful tips on general computing techniques.



Data Repositories

    Data Search Engine: Data search engine by Google
    Machine Learning repository: The UCI Machine Learning Repository, a comprehensive web page with a variety of data sets.
    List of data: A rich collection of data
    Data Journalism: Open data sets from the British newspaper The Guardian.

    Visualization: BBC Visual and Data Journalism cookbook for R graphics
    StatSci.org: A collection of statistics apps.
    Shiny Apps: A nice collection of Shiny apps for introductory statistics.
    Tidymodels: Develop and tune statistical models
    Shiny Apps: Visualization of the Central Limit Theorem and decision trees
    Probability Theory: Learn probability theory with simulation
    Neural Network: Visualization of a neural network

    MCMC: Bayesian and frequentist computation (notes and code)

    Free eBooks (Project Gutenberg): 58,000 free eBooks

    50+ free data sets for Data Science projects: A good collection of interesting data sets.

    Data Visualization: A Practical Introduction by Kieran Healy. A hands-on introduction to using R for data visualization.

    Opportunity Insights: Publicly available data on economic inequality.
    Neural Network: Dynamic visualization of a neural network