STAT 430/530: Applied Regression Analysis, Fall 2017



Meeting Information

Instructor: Keshav P. Pokhrel, Ph.D.
Meeting Times: TR 6:00 PM-7:15 PM
Email: kpokhrel(at)umich.edu
Meeting Location: 2048CB
Office: 2087CB
Office Hours:
Monday 12:30 PM-1:30 PM
Tuesday 12:30 PM-1:30 PM
Thursday 5:00 PM-5:50 PM
and by appointment

Course Description and Objectives


Description:
We will discuss one of the most fundamental supervised statistical learning tools: regression analysis. Major topics covered are single-variable linear regression, multiple linear regression, polynomial regression, and logistic regression. We will spend ample time discussing when and how to apply statistical models. Variable selection techniques, together with analysis of residuals, will be emphasized. We will use computer software extensively to analyze data, with emphasis on interpretability and reproducibility. R is the major computing workhorse for this course.

Objectives:
One of the major objectives of this course is to develop an understanding of the theory behind the statistical methods and to develop successful applications of these models. The course seeks to blend theory and applications while avoiding the extreme of theoretical isolation. Students will learn to fit linear and nonlinear regression models and to interpret their parameters, with ample emphasis on model assumptions.

Program Goals


  1. Understand the fundamentals of probability theory
  2. Understand statistical and inferential reasoning
  3. Become proficient at statistical computing
  4. Understand the fundamentals of statistical modeling and understand its limitations
  5. Become skilled in the description, interpretation and exploratory analysis of data by graphical and other methods.
  6. Interpret the results correctly in the context of the problem.
  7. Complete a research project with a clear write-up, from descriptive statistics to statistical modeling.
Student Learning Outcomes:
  • Increase students’ command of problem-solving and facilitate using problem-solving strategies through statistical thinking.
  • Increase students’ ability to communicate and work cooperatively.
  • Increase students’ ability to use technology and to learn from the use of technology.
  • Increase students’ knowledge about the history and nature of statistics.
  • Develop an understanding of how statistics is done and learned so that students become self-reliant learners and effective users of statistics.
STAT 530 Only: Graduate students are expected to demonstrate the ability to work with more theoretical problems, including derivations and proofs, as well as other more advanced statistical techniques.

    Textbook


    Applied Linear Regression Models, 4th Edition, by Kutner, Nachtsheim, and Neter
    Major Reference Books
    1. Julian Faraway, Practical Regression and Anova using R
    2. Julian Faraway, Linear Models with R
    3. Peter Dalgaard, Introductory Statistics with R
    4. Diez, Barr, and Cetinkaya-Rundel, OpenIntro Statistics

    Homework


    At least five sets of homework problems will be assigned. Some additional homework problems will be assigned periodically during lecture. The majority of the homework problems will come from the textbook, but be prepared to solve any problem that aligns with our course content. Good news: the lowest homework grade will be dropped. We will also take advantage of a free online resource called DataCamp for coding. I will collect homework through Canvas only. For better exam results, you need to master all the homework problems. You are expected to spend an average of 4-6 hours per week on work outside of class. Late submission of homework will result in losing 30% of the assignment score per day.

    Exams


    There will be two midterm exams and a comprehensive final exam. To answer the exam questions, you are expected to understand clearly the mathematical reasoning behind the statistical methods used to solve the problems. You are allowed to use one sheet of notes in the midterms and the final. The sheet must be no larger than A4 size and must be prepared by you. You may use both sides of the sheet.

    Project


    There will be two mini-projects and a semester project. For a good project, you need to describe the data, pose reasonable hypotheses, estimate parameters, select appropriate regression model(s), and explain the results in both statistical and nontechnical language. The primary objective of these projects is to apply statistical methods to real-life situations and to give logical reasoning for, and explanations of, the methods used. Late submission of a project will result in losing 10% of the total points per day.



    In-Class Labs and Quizzes


    You will get worksheets with problems in class. Students may work with classmates and consult their notes to solve the problems. I encourage everyone to solve the problems on the whiteboard and interpret the results for the class. I urge you to find interesting problems from the areas that interest you (e.g., business, sociology, biology, sports, public health). This will help you prepare for your project, and at the same time you are highly likely to earn better scores on quizzes.

    Software


    We use a software package called R. R is a programming language for statistical computing and data visualization. It can be downloaded for free from http://www.r-project.org. We will use RStudio for regular classroom activities. RStudio is an open-source integrated development environment (IDE) for R. R is available for both Windows and Mac. After installing R, download and install RStudio.

    Your performance is measured by the weighted average of homework, exams, projects, and classwork/quizzes. If you have a grade dispute, you need to notify me within a week after grades are posted in Canvas.
    Evaluations:
    Component | Weight | Date
    Exam I | 20% | Thursday, October 12
    Exam II | 20% | Tuesday, November 21
    Mini Project I | 5% | Due October 27
    Mini Project II | 5% | Due December 5
    Homework | 10% | TBD
    Class Work and Quizzes | 5% | TBD
    Final Project | 10% | Presentation, December 12
    Final Exam | 25% | December 19 (6:30-9:30 PM)

    Grade Distribution

    Your final grade will be based on the weighted average of two midterm exams, graded homework, classwork, a semester project (with three parts), and a final exam. The lowest homework grade will be dropped.
    Letter Grade | Percentage
    E | 0-59
    D- | 60-62
    D | 63-66
    D+ | 67-69
    C- | 70-72
    C | 73-76
    C+ | 77-79
    B- | 80-82
    B | 83-86
    B+ | 87-89
    A- | 90-92
    A | 93-96
    A+ | 97-100
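
    The weighted-average arithmetic can be sketched in a few lines of code. The sketch below is written in Python for illustration only (the course itself uses R) and is not an official grade calculator. The weights and letter-grade bands are taken from the Evaluations list and the grade table above. Two things are assumptions: every component is scored on a 0-100 scale, and a fractional percentage is placed in a band by its lower bound without rounding (so 89.5 would count as B+). The function and variable names are hypothetical.

```python
# Weights (percent of the final grade) from the Evaluations list.
WEIGHTS = {
    "exam_1": 20,
    "exam_2": 20,
    "mini_project_1": 5,
    "mini_project_2": 5,
    "homework": 10,
    "classwork_quizzes": 5,
    "final_project": 10,
    "final_exam": 25,
}

# Lower bound of each letter-grade band from the grade table, highest first.
# Assumption: a fractional percentage is assigned by lower bound, no rounding.
CUTOFFS = [
    (97, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
    (0, "E"),
]


def homework_average(scores):
    """Mean homework percentage after dropping the single lowest score."""
    if len(scores) < 2:
        raise ValueError("need at least two homework scores to drop one")
    kept = sorted(scores)[1:]  # discard the lowest score
    return sum(kept) / len(kept)


def weighted_percentage(component_scores):
    """Weighted average of component percentages (each on a 0-100 scale)."""
    if set(component_scores) != set(WEIGHTS):
        raise ValueError("scores must cover exactly the graded components")
    total = sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)
    return total / sum(WEIGHTS.values())


def letter_grade(percentage):
    """Map a final percentage to a letter grade using the band lower bounds."""
    for lower_bound, letter in CUTOFFS:
        if percentage >= lower_bound:
            return letter
    raise ValueError("percentage must be non-negative")
```

    For example, with made-up homework scores of 90, 70, 85, 95, and 80, the 70 is dropped and the homework average is 87.5. Combined with made-up scores of 82 and 88 on the exams, 90 and 95 on the mini projects, 100 on classwork, 92 on the final project, and 85 on the final exam, the weighted percentage is 87.45, which falls in the B+ band.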

    Disability Statement


    The University will make reasonable accommodations for persons with documented disabilities. Students need to register with Disability Resource Services (DRS) every semester they are enrolled for classes. DRS is located in Counseling & Support Services, 2157 UC. To be assured of having services when they are needed, students should register no later than the end of the add/drop deadline of each term. Visit the DRS website at: webapps.umd.umich.edu/aim. If you have a disability that necessitates an accommodation or adjustment to the academic requirements stated in this syllabus, you must register with DRS as directed above and notify me. Upon receipt of your notification, we will make accommodations as directed by DRS.

    Academic Integrity


    The University of Michigan-Dearborn values academic honesty and integrity. Each student has a responsibility to understand, accept, and comply with the University's standards of academic conduct as set forth by the Code of Academic Conduct (mdearborn.edu/policies_st-rights), as well as policies established by each college. Cheating, collusion, misconduct, fabrication, and plagiarism are considered serious offenses, and may be monitored using tools including but not limited to TurnItIn. Violations can result in penalties up to and including expulsion from the University. At the instructor's discretion, the penalty may range from a grade of zero on the assignment up to and including a recommendation that the student be expelled from the University. It is the sole responsibility of the student to understand and follow academic guidelines regarding plagiarism. The University of Michigan-Dearborn has an online academic integrity tutorial that can be accessed at: umdearborn.edu/umemergencyalert

    Safety


    All students are encouraged to program 911 and UM-Dearborn’s University Police phone number (313) 593-5333 into personal cell phones. In case of emergency, first dial 911 and then if the situation allows call University Police. The Emergency Alert Notification (EAN) system is the official process for notifying the campus community for emergency events. All students are strongly encouraged to register in the campus EAN, for communications during an emergency. The following link includes information on registering as well as safety and emergency procedures information: .
    If you hear a fire alarm, class will be immediately suspended, and you must evacuate the building by using the nearest exit. Please proceed outdoors to the assembly area and away from the building. Do not use elevators. It is highly recommended that you do not head to your vehicle or leave campus since it is necessary to account for all persons and to ensure that first responders can access the campus.
    If the class is notified of a shelter-in-place requirement for a tornado warning or severe weather warning, your instructor will suspend class and shelter the class in the lowest level of this building away from windows and doors. If notified of an active threat (shooter) you will Run (get out), Hide (find a safe place to stay) or Fight (with anything available). Your response will be dictated by the specific circumstances of the encounter.


    Tentative Academic Calendar


    Week | Chapters/Sections | Topics covered
    Weeks 1 and 2 (Sept 7, 12, 14) | Chapter 1 (sections 1.1-1.8) | Linear Regression with One Predictor
    Weeks 3 and 4 (Sept 19, 21, 26, 28) | Chapter 2 (sections 2.1-2.10) | Inferences in Regression and Correlation
    Week 5 (Oct 3, 5) | Chapter 3 (sections 3.1-3.10) | Diagnostics and Remedial Measures
    Weeks 6 and 7 (Oct 10, 12, 19) | Review; Exam I; Chapter 3 (sections 3.1-3.10) |
    Week 8 (Oct 24, 26) | Chapter 4 (section 4.4) | Regression through Origin
    Week 9 (Oct 31, Nov 2) | Chapter 6 (sections 6.1, 6.5-6.8) | Multiple Regression Models
    Week 10 (Nov 7, 9) | Chapter 7 and Chapter 8 (sections 7.1-7.3, 8.1-8.2) |
    Weeks 11 and 12 (Nov 14, 16, 21) | Chapter 8 (section 8.3); Review; Exam II | Multicollinearity, Polynomial Regression
    Week 13 (Nov 28, 30) | Chapter 9 (sections 9.1, 9.4, 9.6) | Model Selection and Validation
    Week 14 (Dec 5, 7) | Chapter 13 (sections 13.1, 13.2, 13.6) | Nonlinear Regression
    Week 15 (Dec 12) | Chapter 14 (sections 14.1, 14.2) | Logistic Regression



    R-labs



    Description | Remarks
    Likelihood_function | Download
    Auto Data | Download
    US Baby Names | Download
    Galton's Height Data | Download
    Body Temperature | Download
    Japanese Waste Data | Download
    Amazon Tree Age | Download
    Salary Data | Download
    Smoking Data | Download
    Regression refresher | A brief review of regression


    Some Helpful Resources


    Data: Collection of textbook data.
    MovieLens: A collection of movie ratings data.
    Install R: Guidelines for downloading and installing R.
    Try R: A good resource for learning R online.
    R tutorials: Yet another collection of resources for learning R.
    Regression Models in R: An online portal for learning regression.
    OpenIntro: An excellent resource for introductory statistics. Apart from lecture notes, it also has well-explained examples with R code.
    Exploratory Data Analysis: A wide range of statistical topics is covered on this web page, with video lectures and other supplementary materials.
    StatSci.org: A good resource for a variety of data sets. These data sets are open to the public and you can use them for your own projects. If you use these data, please do not forget to cite the source.
    Data Visualization: An online textbook on data visualization.
    More Stat Apps: A wonderful collection of statistics apps for data visualization.
    Machine Learning repository: UCI Machine Learning Repository, a comprehensive web page with a variety of data sets.
    List of data: A rich collection of data.
    Data Journalism: Open data sets from the British newspaper The Guardian.
    Markdown Themes: Appearance and style themes for creating HTML documents using RStudio.
    Shiny Apps: A comprehensive resource of Shiny apps.
    KDnuggets: Data sets for data mining and machine learning.
    Pharmaceutical Datasets: Computational chemistry data sets.
    Zillow: Housing price data.
    Want to go to grad school in biostatistics?: An information session at the University of Michigan Ann Arbor Biostatistics Department.