Applied Statistics

Credits 6 credit points
Instructors Meulen, F.H. van der (Technische Universiteit Delft)
E-mail f.h.vandermeulen@tudelft.nl
Aim To obtain a broad knowledge of nonparametric methods in statistics. Many methods in statistics are parametric in nature: the probability law of the data is assumed to be parametrized by a finite-dimensional parameter. The basic idea of nonparametric methods is to drop this assumption and to make as few assumptions as possible. These methods thereby offer much more flexibility in modeling the data than classical parametric methods. The topics covered in this course form a mix of classical distribution-free methods and more modern topics. We focus on application of the methods, though we will discuss some theoretical issues as well. Examples will be illustrated using the statistical computing package R.
Description

The first 4 weeks are devoted to the following topics:

- Introduction to nonparametric inference and the empirical distribution function ([W], chapter 1 and 2.1)

- Goodness of fit tests ([N])

- Permutation tests ([N])

- Rank tests ([N])

The goodness of fit problem asks whether a given parametric statistical model is appropriate for the data. When it is hard to find a suitable parametric model, a nonparametric model offers a viable alternative. If parametric assumptions are hard to justify and/or are rejected by a goodness of fit test, classical tests (often based on the normal distribution) can perform poorly. In this case, permutation tests are a good alternative. Rank tests are permutation tests applied to the ranks of the observations. Such tests are often employed in practice and are strong competitors of classical tests. We will treat the main principles and discuss several of these tests.
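The idea of a permutation test, and of a rank test as a permutation test on ranks, can be sketched in a few lines. The course uses R; the following is only an illustrative Python sketch using the standard library. The function names (`permutation_test`, `ranks`, `rank_permutation_test`), the choice of the difference in means as test statistic, and the Monte Carlo approximation with `n_perm` random shuffles are assumptions made for illustration, not material taken from the course notes.

```python
import random


def permutation_test(x, y, n_perm=10000, seed=0):
    """Monte Carlo two-sided permutation test for a difference in means.

    Under the null hypothesis that both samples come from the same
    distribution, every relabelling of the pooled data is equally likely.
    """
    rng = random.Random(seed)
    n_x = len(x)
    pooled = list(x) + list(y)
    n_y = len(pooled) - n_x
    observed = abs(sum(x) / n_x - sum(y) / n_y)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)  # random relabelling of the pooled sample
        diff = abs(sum(pooled[:n_x]) / n_x - sum(pooled[n_x:]) / n_y)
        if diff >= observed - 1e-12:  # small tolerance for rounding
            count += 1
    # Add-one correction keeps the estimated p-value strictly positive.
    return (count + 1) / (n_perm + 1)


def ranks(values):
    """Return midranks: tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def rank_permutation_test(x, y, n_perm=10000, seed=0):
    """Same permutation test, applied to the ranks of the pooled sample."""
    r = ranks(list(x) + list(y))
    return permutation_test(r[:len(x)], r[len(x):], n_perm=n_perm, seed=seed)
```

Calling `permutation_test(x, y)` on two lists of numbers returns an estimated p-value; `rank_permutation_test` does the same after replacing the observations by their ranks, which makes the result insensitive to outliers and to monotone transformations of the data.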

Weeks 5 through 13 will be used to cover chapters 4 and 5 from [W]. These chapters are on smoothing and nonparametric regression. The simplest classical linear regression model assumes that the relation between a response variable Y and a predictor variable X can be modeled by a straight line. In practice, however, this may not be appropriate. Nonparametric regression aims to fit a curve while making as few assumptions as possible. We will discuss various approaches to this problem, such as local regression and penalized regression. Besides being practically relevant, these methods also raise mathematically interesting questions. If the outcome of an experiment is non-normal, for example binary (as is often the case in practice), the principles underlying these techniques can still be used. This leads to nonparametric logistic regression or, more generally, nonparametric generalized linear models. If time permits we will also treat the case of multiple predictors, leading to additive models.
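To give a feel for local regression, here is a minimal sketch of a local linear smoother: at each point x0 a straight line is fitted by weighted least squares, with weights that decay with the distance to x0, and the fitted intercept is taken as the estimate of the curve at x0. The course uses R; this Python sketch, the function name `local_linear`, the Gaussian kernel, and the simulated sine data are all assumptions chosen for illustration.

```python
import math
import random


def local_linear(x0, xs, ys, bandwidth):
    """Local linear regression estimate at x0 with a Gaussian kernel.

    Solves the weighted least squares problem for y ~ a + b * (x - x0)
    in closed form and returns the intercept a. The denominator can be
    numerically zero if the bandwidth is so small that essentially one
    observation carries all the weight.
    """
    w = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    s0 = sum(w)
    s1 = sum(wi * (x - x0) for wi, x in zip(w, xs))
    s2 = sum(wi * (x - x0) ** 2 for wi, x in zip(w, xs))
    t0 = sum(wi * y for wi, y in zip(w, ys))
    t1 = sum(wi * (x - x0) * y for wi, x, y in zip(w, xs, ys))
    return (s2 * t0 - s1 * t1) / (s0 * s2 - s1 ** 2)


# Simulated example: noisy observations of a sine curve (hypothetical data).
rng = random.Random(1)
xs = [i / 10 for i in range(63)]
ys = [math.sin(x) + rng.gauss(0, 0.2) for x in xs]

# Fitted curve on the observed grid; the bandwidth 0.5 is an arbitrary choice.
fitted = [local_linear(x, xs, ys, bandwidth=0.5) for x in xs]
```

The bandwidth controls the amount of smoothing: a small value follows the data closely, while a large value makes the fit approach the ordinary least squares line.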

Organization Each class consists of three 45-minute time slots. The third slot will be used to discuss exercises.
Examination

Each week there are exercises. Some of these will be theoretical, while others will be practical. The practical questions involve analyzing real datasets and performing simulations using the statistical computing package R.

Some of the weekly exercises are compulsory. At the end of the course there is a final written examination.

Exam date: June 7, 2010, 13:00-16:00, room BBL 001, Utrecht.
Resit date: August 25, 2010, 13:00-16:00, room BBL 169, Utrecht.

Literature

[W]: L. Wasserman, "All of Nonparametric Statistics", Springer, ISBN-10: 0387251456.

[N]: Notes and articles that will be handed out during the course.

Prerequisites Basic knowledge of probability and statistics.
  Last changed: 16-07-2010 15:08