Categorical Data Analysis – 4-Week Course
Instructor: Daniel A. Powers (email: dpowers@mail.la.utexas.edu)
Course Description
This course provides an overview of the major statistical methods and models for categorical dependent variables. We will cover regression-like models for binomial and binary outcomes, contingency tables, individual-level count data, ordered and nominal polytomous response variables, as well as extensions to these models.
Prerequisites
Students should have had a previous course in linear regression and should have some experience using a statistical software package. Examples will be illustrated using Stata, but SAS, SPSS, and R can produce similar results.
Course Syllabus
- [Week 1] Introduction
(Readings: Powers & Xie, Ch. 1-2)
- why categorical data analysis?
- regression-like models and categorical dependent variables
- transformational vs. latent variable approaches to CDA
- review of linear regression
- extending linear regression to categorical responses using GLS
- estimation and interpretation of parameters
- programming examples
- [Week 2] Binary Response Models
(Readings: Powers & Xie, Ch. 3, 5, Handout)
- binomial and binary data
- logit model
- motivation: transformational approach
- interpretation: odds, odds ratios, and relative risks
- probit model
- motivation: latent variable approach
- interpretation: probits, marginal effects
- estimation of binary response models
- the generalized linear model (GLM)
- programming examples
- model comparisons and model fit
- deviance and chi-square statistics
- alternative probability model
- complementary log-log
- extending binary response models (time permitting)
- multilevel models
- longitudinal models
- Rasch models
- miscellaneous topics
- censored regression models
- selection models
- bivariate outcomes
- endogenous switching regression models
- [Week 3] Models for Count Data
(Readings: Powers & Xie, Ch. 4, 6, Handout)
- frequency data and count data
- frequency data
- dissecting a contingency table
- odds ratios revisited: measuring association
- loglinear models for contingency tables
- interpreting parameters from loglinear models
- deviance, chi-square, and model fit
- programming
- models for ordinal data
- modeling various forms of association
- scaled association models
- estimation and interpretation of association parameters
- programming
- count data
- Poisson regression models for counts
- estimation, interpretation, and prediction
- extensions: zero-inflated Poisson regression, negative binomial regression
- extending models for count data (time permitting)
- loglinear models for event-history analysis (proportional hazards models)
- multilevel models for hierarchical or repeated measures data
- incorporating frailty in demographic models for rates
- loglinear models for table standardization
- miscellaneous topics
- [Week 4] Models for Ordered and Nominal Response Variables
(Readings: Powers & Xie, Ch. 7, 8, Handout)
- ordered responses
- types of logits
- cumulative logits and baseline logits
- ordered logit and ordered probit models
- specifications and parameterizations of ordered response models
- proportional odds (PO) assumption
- relaxing PO
- testing PO
- compromises via intermediate models
- estimation, interpretation, and prediction from ordered response models
- programming examples
- nominal responses
- types of logits
- baseline logits
- multinomial logit model
- estimation and interpretation of parameters
- estimation: as a loglinear model
- estimation: using standard programs
- conditional logit model
- discrete choice models
- random utility (latent variable) approach
- specification and interpretation of parameters
- general multinomial model
- specification issues
- the IIA assumption
- programming examples
- sequential responses
- types of logits
- sequential logit
- continuation ratio logit models
- continuation ratio logit model
- specification, estimation, and interpretation of parameters
- programming examples