Applications of Hierarchical Models in Longitudinal and Multilevel Research
Beijing, August, 2007

Instructor

Stephen W. Raudenbush (sraudenb@uchicago.edu)
Department of Sociology and Committee on Education
University of Chicago
1126 E 59th Street
Chicago, IL 60615

Pre-requisites

Applied statistics at the level of multiple regression (e.g., ED795) is required; prior experience with hierarchical models or an advanced course in linear models is useful.

Content

Many studies in education, human development, and allied fields are longitudinal or multilevel or both. In longitudinal studies, we may repeatedly observe participants, for example, to assess growth in academic achievement or change in mental health status.

Multilevel data arise because participants are clustered within social settings such as classrooms, schools, and neighborhoods. These settings form a strict hierarchy when, for example, classrooms are nested within schools that are, in turn, nested within districts. They form a cross-classified structure when, for example, schools draw students from multiple neighborhoods and neighborhoods send students to multiple schools. Nested and cross-classified organizations of these settings call for different analytic approaches.

Data that are both longitudinal and multilevel include studies of school effects on student academic growth and of neighborhood and family effects on change in mental health. In some cases the participants will migrate across social settings over time. For example, children will experience a sequence of classrooms during the primary years, and some residents will move to a new neighborhood. In other cases, the participants will stay put but the character of the neighborhood or school will shift.

This seminar will consider the issues of analysis and, to a limited extent, design that arise in longitudinal and multilevel research settings.


We begin with the axiom that a statistical model represents a tentative conceptual model about the sources of variation in an outcome. The model should reflect the measurement scale of key explanatory and outcome variables. The model not only guides the summary of quantitative evidence; it also guides the design of future research.

We shall begin by considering two-level studies in which persons (“level-1 units”) are nested within organizations (“level-2 units”) such as schools. Next, we consider two-level studies of individual change. In this case, we view time-series data (level 1) as nested within persons (level 2). The level-1 model specifies how a person is changing over time. The level-2 model describes the population distribution of the parameters of individual change. We shall briefly compare the structure of these studies to the structure of two-level cross-sectional studies.
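As a rough illustration of the two-level model of individual change described above, a linear growth model might be written as follows. The notation is an assumption chosen for this sketch (outcome Y, time variable a, individual growth parameters \pi, population means \beta); the course may use a different parameterization.

```latex
% Level 1: repeated measures t within person i (assumed linear growth)
Y_{ti} = \pi_{0i} + \pi_{1i}\, a_{ti} + e_{ti},
\qquad e_{ti} \sim N(0, \sigma^2)

% Level 2: population distribution of the individual growth parameters
\pi_{0i} = \beta_{00} + r_{0i}, \qquad
\pi_{1i} = \beta_{10} + r_{1i}, \qquad
\begin{pmatrix} r_{0i} \\ r_{1i} \end{pmatrix}
\sim N\!\left( \begin{pmatrix} 0 \\ 0 \end{pmatrix},
\begin{pmatrix} \tau_{00} & \tau_{01} \\ \tau_{01} & \tau_{11} \end{pmatrix} \right)
```

Here \pi_{0i} is person i's status when a_{ti} = 0 and \pi_{1i} is that person's rate of change; the level-2 equations let both vary randomly across persons.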

Next, we shall consider three-level models. Our initial focus will be on the case in which repeated measures (level 1) are nested within persons (level 2) who are themselves nested in organizations (level 3).

All of the studies considered so far will involve nearly continuous outcomes for which the assumption of a normal distribution is at least plausible. Our next aim is to generalize two- and three-level models to other kinds of outcomes: binary outcomes, counts, ordered outcomes, and multinomial data. All of these cases fall into the framework of the hierarchical generalized linear model.
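To make the generalization concrete, one possible two-level model for a binary outcome uses a Bernoulli sampling model with a logit link. The symbols below (predictor X, random intercept u_{0j}) are assumptions for this sketch, not notation taken from the course materials.

```latex
% Level 1: person i in organization j, binary outcome with logit link
Y_{ij} \mid \varphi_{ij} \sim \mathrm{Bernoulli}(\varphi_{ij}),
\qquad
\eta_{ij} = \log\!\left( \frac{\varphi_{ij}}{1 - \varphi_{ij}} \right)
          = \beta_{0j} + \beta_{1j} X_{ij}

% Level 2: random intercept, fixed slope
\beta_{0j} = \gamma_{00} + u_{0j}, \qquad
\beta_{1j} = \gamma_{10}, \qquad
u_{0j} \sim N(0, \tau_{00})
```

Counts, ordered outcomes, and multinomial data follow the same pattern with a different sampling model and link function at level 1.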

Latent variable models may be viewed as hierarchical models. We shall consider how such models can incorporate measurement error and missing predictors, and how they can be formulated to study direct and indirect effects via simultaneous equations.

Not all multilevel data involve a pure nesting. In many important cases, observations are cross-classified by two higher-level sources of random variation. For example, persons may be nested in "cells" defined by the cross-classification of schools and neighborhoods; time-series observations may be cross-classified by person and classroom for repeated measures data on children who change classrooms during the elementary years. We shall consider these cases and also cases that involve both nesting and crossing of random factors.

All of the models considered to this point will be estimated by means of maximum likelihood or an approximation to it. Our final topic is the advantages of an alternative, Bayesian perspective, which is especially useful when the number of higher-level units is small.

Statistical issues that cut across applications include: efficiency and robustness of inferences, empirical Bayes and Bayes shrinkage estimation of random effects, exploratory analyses and model checking, and univariate and multivariate hypothesis tests and confidence sets.
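As a hedged illustration of shrinkage estimation of random effects, the sketch below computes empirical Bayes estimates of group means in a simple two-level model with no predictors. It assumes the within-group variance (`sigma2`) and between-group variance (`tau00`) are already known; in practice they would be estimated from the data. The function name and arguments are hypothetical and are not taken from any course software.

```python
def shrinkage_estimates(group_means, group_sizes, sigma2, tau00):
    """Empirical Bayes (shrinkage) estimates of group means.

    Assumes a two-level model without predictors and treats the
    variance components sigma2 (within) and tau00 (between) as known.
    Returns (grand_mean, reliabilities, eb_estimates).
    """
    # Total variance of each observed group mean: tau00 + sigma2 / n_j
    mean_variances = [tau00 + sigma2 / n for n in group_sizes]

    # Precision-weighted estimate of the grand mean
    weights = [1.0 / v for v in mean_variances]
    grand_mean = sum(w * m for w, m in zip(weights, group_means)) / sum(weights)

    # Reliability of each group mean: share of its variance that is "true"
    reliabilities = [tau00 / v for v in mean_variances]

    # Shrink each observed mean toward the grand mean by (1 - reliability)
    eb_estimates = [
        lam * m + (1.0 - lam) * grand_mean
        for lam, m in zip(reliabilities, group_means)
    ]
    return grand_mean, reliabilities, eb_estimates


if __name__ == "__main__":
    grand, lams, eb = shrinkage_estimates(
        group_means=[10.0, 20.0], group_sizes=[5, 50], sigma2=100.0, tau00=10.0
    )
    print("grand mean:", round(grand, 3))
    print("reliabilities:", [round(x, 3) for x in lams])
    print("shrunken means:", [round(x, 3) for x in eb])
```

Groups with few observations have low reliability, so their means are pulled more strongly toward the grand mean than those of large groups.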

Format

The course will be conducted in a mixed lecture-and-discussion format. Lab sessions will enable participants to gain skill in analyzing hierarchical data.

The course will closely follow the recently published second edition of Hierarchical Linear Models (Raudenbush and Bryk, 2002, Sage Publications). That book includes many additional references that will be discussed during the course.

Student Work

There will be four assignments using data provided by the instructor. Their aim is to ensure a minimum competency in applying the basic methods. A final project can take one of several forms:

1. Re-analyses of existing data. Students may analyze their own data or may choose from among several data sets made available by the instructor.

2. Analyses of statistical power, resource allocation, or sample size for future studies. Students can test newly developed software for research design and even make changes in the software as needed.

3. Analytic or simulation studies of the statistical properties of novel approaches to estimation and hypothesis testing.

Students may work in groups of three or fewer or may work alone.

Evaluation

Four assignments (15% each): 60%
Midterm exam: 20%
Final project: 20%