STAT 430/530: Applied Regression Analysis, Fall 2018



Meeting Information

Instructor: Keshav P. Pokhrel, Ph.D.
Meeting Times: TR 2:00PM- 3:15PM
Email: kpokhrel(at)umich.edu
Meeting Location: 2046CB
Office: 2087CB
Office Hours:
Tuesday 11:30 AM-12:30 PM
Wednesday 12:30 PM-1:45 PM
Thursday 11:30 AM-12:30 PM
and by appointment

Course Description


We will discuss one of the most fundamental supervised statistical learning tools: linear regression analysis. The major topics are simple (single-variable) linear regression, multiple linear regression, polynomial regression, and logistic regression. Ample class time will be devoted to the use, misuse, and limitations of statistical models. Variable selection procedures and residual analysis will be emphasized. We will make extensive use of computer software to analyze data, with an emphasis on interpretability and reproducibility. R is the main computing workhorse for this course.

Course Objectives


One of the major objectives of this course is to develop an understanding of the theory behind the statistical methods and to apply these models successfully. The course seeks to blend theory and applications while avoiding the extreme of theoretical isolation. After successfully completing this course, students are expected to understand how to fit linear and nonlinear regression models, with an adequate grasp of parameter interpretation, model assumptions, and prediction accuracy.

Program Goals


  1. Understand the fundamentals of probability theory.
  2. Understand statistical and inferential reasoning.
  3. Become proficient at statistical computing.
  4. Understand the fundamentals of statistical modeling and understand its limitations.
  5. Become skilled in the description, interpretation and exploratory analysis of data by graphical and other methods.
  6. Interpret the results correctly in the context of the problem.
Student Learning Outcomes:
  • Increase students’ command of problem-solving and facilitate problem-solving strategies through statistical thinking.
  • Increase students’ ability to communicate and work cooperatively.
  • Increase students’ ability to use technology and to learn from the use of technology.
  • Increase students’ knowledge about the history and nature of statistics.
  • Develop an understanding of how statistics is done and learned so that students become self-reliant learners and effective users of statistics.
  • Complete a research project with a clear write-up, from descriptive statistics to statistical modeling.
  • STAT 530 only: Graduate students are expected to demonstrate the ability to work with more theoretical problems, including derivations and proofs.

    Textbook


    Applied Linear Regression Models, 4th Edition, by Kutner, Nachtsheim, and Neter
    Major Reference Books
    1. Julian Faraway, Practical Regression and Anova using R
    2. Julian Faraway, Linear Models with R
    3. Peter Dalgaard, Introductory Statistics with R
    4. Diez, Barr, and Cetinkaya-Rundel, OpenIntro Statistics

    Homework


    At least six sets of homework problems will be assigned. Homework will come from the textbook and other relevant resources. Graduate students are required to do some additional theoretical problems. The lowest homework grade will be dropped. We will also take advantage of DataCamp, a free online resource for statistical programming in R. I will collect homework only through Canvas. Working through the homework and other review problems is how the course material will become clear. You are expected to invest an average of 4-6 hours per week of study time (outside of class) to meet your grade expectation. Late homework loses 20% of the assignment score per day.

    Exams


    There will be two midterm exams and a comprehensive final exam. To answer the exam questions, you need a clear understanding of the mathematical reasoning behind the statistical methods. You may use one sheet of notes on the midterms and the final. The sheet must be no larger than A4 and must be prepared by you. You may use both sides of the sheet.

    Project


    There will be two mini-projects. For a good project, you need to describe the data, pose reasonable hypotheses, estimate parameters, select appropriate regression model(s), and explain the results in both statistical and nontechnical language. The primary objective of these projects is to apply statistical methods to real-life situations and arrive at reasonable inferences and explanations. A late project loses 10% of its total points per day.



    In Class Assignments and Quizzes


    In-Class Assignments: You will receive worksheets with problems in class. You may work with your classmates and consult notes, books, and online resources to solve them. One of the in-class assignments is to present a research question to the class. By the end of the third week, you are required to find an interesting problem from an area of your interest or major (e.g., business, sociology, biology, sports, public health) and present it to the class. This will help you prepare for your project, and it is also likely to improve your scores on exams and quizzes.
    Quizzes: Quizzes will be closed book and closed notes. In most cases, the structure, style, and difficulty level of the quizzes will be similar to those of the exams.

    Software


    We use software called R, a programming language for statistical computing and data visualization. It can be downloaded for free from http://www.r-project.org. We will use RStudio, an open-source integrated development environment (IDE) for R, for regular classroom activities. Download and install R (for Windows or for Mac) first; after installing R, download RStudio.

    Your performance is measured by the weighted average of homework, exams, projects, in-class assignments, and quizzes. If you have a grade dispute, you must notify me within a week after the grades are posted in Canvas.
    Evaluations:
    Exam I (20%): Thursday, October 11
    Exam II (20%): Tuesday, November 20
    Mini Project I (7.5%): due October 25
    Mini Project II (7.5%): due December 06
    Homework (10%)
    In-Class Assignments (5%): TBD
    Quizzes (5%): Sept 20, Oct 04, Oct 23, Nov 08, Nov 29, Dec 06
    Final Exam (25%): Tuesday, December 18 (3:00 PM-6:00 PM)

    Grade Distribution

    Your final grade will be based on the weighted average of the two midterm exams, graded homework, classwork, a semester project (with two parts), and a comprehensive final exam. The lowest homework grade will be dropped.
    Letter Grade  E     D-     D      D+     C-     C      C+     B-     B      B+     A-     A      A+
    Percentage    0-59  60-62  63-66  67-69  70-72  73-76  77-79  80-82  83-86  87-89  90-92  93-96  97-100
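
    As a rough illustration of how the evaluation weights and the letter-grade bands above combine, here is a minimal sketch. It is written in Python purely for illustration (the course itself uses R). The component names, helper functions, and sample scores are hypothetical. The sketch assumes every component is scored on a 0-100 scale and that the final percentage is rounded to the nearest integer before the bands are applied; the syllabus does not state a rounding rule.

```python
# Weights (in percent) taken from the Evaluations list; they sum to 100.
WEIGHTS = {
    "exam_1": 20.0,
    "exam_2": 20.0,
    "mini_project_1": 7.5,
    "mini_project_2": 7.5,
    "homework": 10.0,
    "in_class": 5.0,
    "quizzes": 5.0,
    "final_exam": 25.0,
}

# Lower bound of each letter-grade band from the grade table.
CUTOFFS = [
    (97, "A+"), (93, "A"), (90, "A-"),
    (87, "B+"), (83, "B"), (80, "B-"),
    (77, "C+"), (73, "C"), (70, "C-"),
    (67, "D+"), (63, "D"), (60, "D-"),
    (0, "E"),
]


def homework_average(scores):
    """Average the homework scores after dropping the single lowest one."""
    if len(scores) < 2:
        raise ValueError("need at least two homework scores to drop one")
    kept = sorted(scores)[1:]
    return sum(kept) / len(kept)


def weighted_percentage(component_scores):
    """Weighted average of component scores, each on a 0-100 scale."""
    total = sum(WEIGHTS[name] * component_scores[name] for name in WEIGHTS)
    return total / 100.0


def letter_grade(percentage):
    """Map a percentage to a letter grade (integer rounding is an assumption)."""
    rounded = round(percentage)
    for lower_bound, letter in CUTOFFS:
        if rounded >= lower_bound:
            return letter
    return "E"


# Hypothetical student: six homework scores, lowest (70) is dropped.
sample = {
    "exam_1": 82, "exam_2": 88,
    "mini_project_1": 90, "mini_project_2": 94,
    "homework": homework_average([70, 85, 90, 95, 80, 100]),
    "in_class": 100, "quizzes": 80, "final_exam": 85,
}
overall = weighted_percentage(sample)
print(f"{overall:.2f} -> {letter_grade(overall)}")  # prints "87.05 -> B+"
```

    Storing the lower bound of each band keeps the lookup a single pass from the highest grade down, so changing a cutoff only requires editing one entry.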

    Important Dates

  • No classes: Tuesday, October 16 (Fall Break); Thursday, November 22 (Thanksgiving).
  • Academic Deadlines: September 11 is the last day to drop with no penalty. December 11 is the last day to withdraw from the course with ‘W’.
  • Final Exam: Tuesday, December 18 from 3:00PM to 6:00PM in 2046CB.

    University Attendance Policy:


    A student enrolled in a course (lecture, laboratory, recitation, colloquium, seminar, or other university approved format) is expected to attend every scheduled session of the course. The instructor of each course will make known to the students the course attendance policy with respect to student absences. It is the student’s responsibility to be aware of this policy. The instructor makes the final decision to excuse or not to excuse an absence. Presence or participation is also expected in online courses. Participation in online courses can take various forms; it is the instructor who determines what form of presence or participation is expected. Students enrolled in online courses are responsible for being aware of that policy/expectation. An instructor is entitled to give a failing grade for excessive absences or for a student who stops participating in class at some point during the semester.

    Disability Statement


    The University will make reasonable accommodations for persons with documented disabilities. Students need to register with Disability Resource Services (DRS) every semester they are enrolled for classes. DRS is located in Counseling & Support Services, 2157 UC. To be assured of having services when they are needed, students should register no later than the end of the add/drop deadline of each term. Visit the DRS website at: webapps.umd.umich.edu/aim. If you have a disability that necessitates an accommodation or adjustment to the academic requirements stated in this syllabus, you must register with DRS as directed above and notify me. Upon receipt of your notification, we will make accommodations as directed by DRS.

    Academic Integrity


    The University of Michigan-Dearborn values academic honesty and integrity. Each student has a responsibility to understand, accept, and comply with the University's standards of academic conduct as set forth by the Code of Academic Conduct (mdearborn.edu/policies_st-rights), as well as policies established by each college. Cheating, collusion, misconduct, fabrication, and plagiarism are considered serious offenses, and may be monitored using tools including but not limited to TurnItIn. Violations can result in penalties up to and including expulsion from the University. At the instructor's discretion, the penalty may range from a grade of zero on the assignment to a recommendation that the student be expelled from the University. It is the sole responsibility of the student to understand and follow academic guidelines regarding plagiarism. The University of Michigan-Dearborn has an online academic integrity tutorial that can be accessed at: umdearborn.edu/umemergencyalert

    Safety


    All students are encouraged to program 911 and UM-Dearborn’s University Police phone number (313) 593-5333 into personal cell phones. In case of emergency, first dial 911 and then if the situation allows call University Police. The Emergency Alert Notification (EAN) system is the official process for notifying the campus community for emergency events. All students are strongly encouraged to register in the campus EAN, for communications during an emergency. The following link includes information on registering as well as safety and emergency procedures information: .
    If you hear a fire alarm, class will be immediately suspended, and you must evacuate the building by using the nearest exit. Please proceed outdoors to the assembly area and away from the building. Do not use elevators. It is highly recommended that you do not head to your vehicle or leave campus since it is necessary to account for all persons and to ensure that first responders can access the campus.
    If the class is notified of a shelter-in-place requirement for a tornado warning or severe weather warning, your instructor will suspend class and shelter the class in the lowest level of this building away from windows and doors. If notified of an active threat (shooter) you will Run (get out), Hide (find a safe place to stay) or Fight (with anything available). Your response will be dictated by the specific circumstances of the encounter.



    Harassment, Sexual Violence, Bias, and Discrimination:


    The University of Michigan-Dearborn recognizes that students have a right to study in a safe atmosphere free of sexual violence, harassment, bias, and discrimination. Should you wish to report an incident of sexual assault, harassment, bias, or discrimination, visit https://umdearborn.edu/offices/enrollment-management-student-life/incident-and-complaint-reporting.

    Tentative Academic Calendar


    Week                                 Chapters/Sections                              Topics covered                            Remarks
    Weeks 1 and 2 (Sept 6, 11, 13)       Chapter 1 (sections 1.1-1.8)                   Linear Regression with One Predictor
    Weeks 3 and 4 (Sept 18, 20, 25, 27)  Chapter 2 (sections 2.1-2.10)                  Inferences in Regression and Correlation
    Week 5 (Oct 2, 4)                    Chapter 3 (sections 3.1-3.10)                  Diagnostics and Remedial Measures
    Weeks 6 and 7 (Oct 9, 11, 18)        Review; Exam I; Chapter 3 (sections 3.1-3.10)
    Week 8 (Oct 23, 25)                  Chapter 4 (section 4.4)                        Regression through Origin
    Week 9 (Oct 30, Nov 1)               Chapter 6 (sections 6.1, 6.5-6.8)              Multiple Regression Models
    Week 10 (Nov 6, 8)                   Chapters 7 and 8 (sections 7.1-7.3, 8.1-8.2)
    Weeks 11 and 12 (Nov 13, 15, 20)     Chapter 8 (section 8.3); Review; Exam II       Multicollinearity, Polynomial Regression
    Week 13 (Nov 27, 29)                 Chapter 9 (sections 9.1, 9.4, 9.6)             Model Selection and Validation
    Week 14 (Dec 4, 6)                   Chapter 13 (sections 13.1, 13.2, 13.6)         Nonlinear Regression
    Week 15 (Dec 11)                     Chapter 14 (sections 14.1, 14.2)               Logistic Regression


    Homework



    Description Remarks


    R-labs



    Description Remarks
    Likelihood_function Download
    Auto Data Download
    US Baby Names Download
    Galton's Height Data Download
    body temperature Download
    Japanese Waste data Download
    Amazon Tree Age Download
    Salary Data Download
    Smoking Data Download
    Regression refresher A brief review of regression


    Some Helpful Resources


    Data: Collection of textbook data
    MovieLens: A collection of movie ratings data
    Install R: Guideline to download and install R
    Try R: A good resource to learn R online
    R tutorials: Yet another collection of resources to learn R
    Regression Models in R: An online portal to learn regression
    OpenIntro: An excellent resource for introductory statistics. Apart from lecture notes, it also has well-explained examples with R code.
    Exploratory Data Analysis: A wide range of statistical topics, with video lectures and other supplementary materials.
    StatSci.org: A good resource for a variety of data sets. These data sets are open to the public, and you can use them for your own projects. If you use these data, please do not forget to cite the source.
    Advanced R: Learn how to make functions
    Data Visualization: An online textbook of data visualization
    More Stat Apps: A wonderful collection of statistics apps for data visualization
    Machine Learning repository: UCI Machine Learning Repository, a comprehensive web page with a variety of data sets
    List of data: A rich collection of data
    Data Journalism: Open data sets by the British newspaper "The Guardian"
    Markdown Themes: Appearance and style themes for creating HTML documents using RStudio
    Shiny Apps: A comprehensive resource of Shiny apps
    KDnuggets: Data sets for data mining and machine learning
    Pharmaceutical Datasets: Computational chemistry data sets
    Zillow: Housing price data
    Want to go to grad school in biostatistics? An information session at the University of Michigan Ann Arbor Biostatistics Department.

    Data Search Engine: Data search engine by Google