Quantitative Data Analysis
Instructor: Professor Donald J. Treiman (UCLA) and PKU faculty partner
This is the last time I will teach this course. However, it will continue to be offered regularly. In 2008-2009 and thereafter it will be taught by Prof. Jennie Brand, who has just joined the Sociology faculty.
INTRODUCTION
This is a course in how to do theoretically informed quantitative social research. By the end of the course you should have a fair idea of how to make sociological sense out of a body of quantitative data. Toward this end, we will cover a variety of techniques, including tabular analysis, log linear models for tabular data, regression analysis in its various forms, regression diagnostics and robust regression, ways to cope with missing data, logistic regression, factor analysis and other techniques of scale construction, measurement error, and related topics. But this is not a statistics course; the emphasis will be on using these procedures to draw substantive conclusions about how the social world works. A prior statistics course—Soc. 210A/B, or the equivalent—is required. (Soc. 210C is helpful but not required. Some students first take Soc. 212A/B and then Soc. 210C. The consensus seems to be that either sequence works about equally well.) High school algebra, either remembered or relearned, will also be needed to get through the course. (For those of you whose school algebra is rusty, good reviews can be found in Helen Walker, Mathematics Essential for Elementary Statistics, and W. L. Bashaw, Mathematics for Statistics. If you are not comfortable with school algebra, please work through one of these or an equivalent book by the 3rd week of class.)
Because time is short (and two quarters is a very short time for what we have to cover) the course concentrates on data analysis and the way one links theory and data. We will focus on the quantitative analysis of data from representative samples of well-defined populations. The populations can consist of almost anything—people, formal organizations, societies, occupations, pottery shards, or whatever; the analytic problems are essentially the same. Data collection procedures will be essentially ignored—that is, mentioned only in the course of discussions of data-analytic issues. There is simply not enough time to cover both data analysis and data collection; a four-quarter course would be required to do justice to both topics. It is possible that the Sociology Department will offer a course on data collection next year, that is, in 2008-9. Prof. Linda Bourque in the School of Public Health sometimes offers a questionnaire design and administration course, which you can check out. An alternative method of learning about the practical details of data collection is to apprentice yourself (unpaid if necessary) to someone who is about to conduct a survey and insist that you get to participate in it step-by-step even when your presence is a nuisance.
The core of the course is a series of weekly exercises. The exercises are difficult and time consuming. Keeping up-to-date is crucial: you must understand the previous material in order to follow what comes next. So it is imperative that you do each set of exercises completely and on time. Exercises are due in class the week after they are assigned; they will be read and returned the following week. Late exercises will not be accepted. After the first few weeks you will be doing a good deal of analysis using a major U.S. national sample survey (NORC's General Social Survey). However, for many of the assignments you will be able to substitute a data set of your own, focusing on topics that interest you. The course will culminate in a term paper on a topic of your own choosing, in which you will carry out a quantitative analysis of some substantive issue using the technical and analytic skills developed by doing the exercises. It is not uncommon for course term papers to lead to publications or to master’s papers or Ph.D. dissertations. I think you will find the course highly enjoyable and quite demanding.
We are fortunate to have a T.A. this year, Claudia Solari, an advanced graduate student in Sociology. She will be available for advice about computing as well as analytic problems and will be mainly responsible for reading your weekly papers, although I will read them from time to time. She will announce her office hours at the first class meeting. I also will be available for consultation, via either an exchange of email or, where necessary, a face-to-face meeting. Because of the complexity of all our schedules, I have found it more convenient to arrange meetings via email or a brief word in class than to keep regular office hours. Do not hesitate to ask to meet with me if you feel it would be helpful.
PRACTICAL DETAILS
Written exercises. You are expected to prepare your exercises on a word processor. If for any reason you do not know how to type (something that is very rare these days), I strongly urge you to buy a self-instruction book and spend 20 minutes a day teaching yourself to type; it should take you only a month or so to become proficient. You will also need a word processor, or access to one. You may use the machines in the Social Sciences Computing Laboratories (Public Policy 2035), both for word processing and to do the computing work of the course. Check the SSC web page (http://computing.sscnet.ucla.edu/public/labs/labhours.aspx) for open hours and also for the location and hours for other labs maintained by SSC.
Calculator. You will find a calculator a necessity for this course, unless you particularly enjoy doing tedious arithmetic computations by hand. Calculators powerful enough to handle all of your needs are now available for relatively little money. Unfortunately, I am not up-to-date on what is available, but somebody at the ASUCLA Store probably can help you decide what is optimal. I find it somewhat useful to have a calculator with lots of memory and not at all useful to have built-in statistical functions (e.g., simple correlation and regression). A reasonable alternative is to use the “display” function in Stata, the software we will be using for the course. But this only makes sense if you have Stata installed on your own computer.
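As a rough illustration of using Stata's “display” command as a calculator, the lines below can be typed one at a time in the Stata Command window or run from a do-file. The numbers are arbitrary examples chosen for illustration, not values taken from the course exercises.

```stata
* Simple arithmetic: multiply and add
display 2.5 * 4 + 3

* Square root and natural logarithm
display sqrt(16)
display ln(2)

* Formatted output: show a quotient to four decimal places
display %9.4f 7/3

* Convert a logit of 0.5 to a probability
display 1/(1 + exp(-0.5))
```

Each line prints a single result, so no data set needs to be loaded. The optional format (here %9.4f) controls how many decimal places are shown.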
Computing. Starting in the 3rd week of the first quarter all assignments will require doing data analysis of one or more sample surveys, using the statistical package Stata, Version 10.0. (While I once taught this course using SPSS, more than 10 years ago I switched to Stata because it is a fast and efficient package that includes most of the statistical procedures of interest to social scientists. As software, it clearly is superior to SPSS; it is faster, more accurate, and includes a much larger range of applications. My judgement in this matter is widely shared, and Stata has become the statistical package of choice within the UCLA Departments of Sociology and Economics and elsewhere as well.) One of the important advantages of Stata is that Stata data sets are fully transportable across platforms. Thus, you may use the same data set to do analysis on a PC or on a UNIX machine, which will give you much greater flexibility in carrying out your work in the course. The 3rd lecture will be devoted to a full-fledged introduction to Stata.
You may carry out your computing work in one of two ways: using one of the very fast PCs in the Social Sciences Computing (SSC) Laboratories (2035 PPB); or installing Stata on your own computer or some computer to which you have individual access. It also is possible to dial in to a Social Science Computing server; Claudia Solari will provide the details. Computing procedures will be discussed in class, and the lecture for the 3rd week will be devoted to an introduction to computing and the common data set, the 2004 GSS (about which more below). In the event you do not have access to a computer, you also may use the SSC Lab computers to prepare your weekly exercises. Note that there is a printing charge for graduate students. Check the SSC web page (http://computing.sscnet.ucla.edu/public/labs/labhours.aspx) for open hours and also for the location and hours for other labs maintained by SSC.
Unfortunately, having established a secure market niche and a loyal user base, the Stata Corporation has succumbed to the profit motive. The software is still relatively inexpensive but they get you on the manuals, knowing that you can’t get your work done without them. Unfortunately for you, they are correct about this. You may wish to purchase the software but whether or not you do this, you must purchase the Stata manuals since you will never be able to become facile at Stata without frequently consulting the documentation. Do not succumb to the temptation to save money by consulting the manuals in the computing labs! This having been said, let me suggest several possibilities from among the many offered by Stata Corp. (To do your own “mix and match” among options, go to the Stata Corp web site: http://www.stata.com). There are two separable issues: software and documentation.
Software. If you have a reasonably fast computer with reasonably large memory (almost all desktop or notebook computers purchased within the past four years or so would meet this criterion; many older computers would not), you should consider purchasing the software because of the huge convenience of being able to work at home or in your own office rather than in the SSC Labs. You have three choices. Users of typical sample survey data sets will do fine with “Intercooled Stata,” which is available under the UCLA “GradPlan” (more below) for $155 for a perpetual license and for $95 for a one-year license. (Unless you are absolutely committed to some other software and are only learning Stata in order to get through the course, I recommend spending the extra $60 for the perpetual license; even if you do not become a Stata convert, it is convenient to have the software available.) If you expect to work with very large data sets, such as census files, you should purchase a perpetual license for Stata/SE, for $335 (under the GradPlan). You can, in fact, go to still more expensive options, presuming you have a multiple-processor computer. If this is your situation, consult the Stata web site directly.
Documentation. Many of you will be able to manage with the basic documentation—the Base Reference Manual (3 volumes), Data Management Reference Manual, User's Guide, and Quick Reference and Index, available as a package under the GradPlan for $179—plus the Graphics Reference Manual, sold separately at $65, and the Survey Data Reference Manual ($40), for a total of $284. There are six additional subject-specific manuals. If you want all these, you can purchase complete documentation (15 volumes in all) for $545. Shipping will also cost an additional $20-$30. For those of you for whom Stata will become your primary software (which certainly should include all Sociology students, and perhaps others), purchasing all the documentation now is a good investment, since you may well want to make use of the specialized manuals as your statistical training and research efforts advance beyond this course. (However, you should be warned that Stata historically has come out with a new version about once every 18-24 months and that Stata 10.0 was released in spring 2007; so Stata 11.0 is likely to appear in late 2008 or early 2009. But Stata 10.0 is so powerful that you could simply forgo updating until you have a real job and can afford it.)
To summarize, you can manage for as little as $284 (for the basic manuals plus Graphics plus Survey Data) or you can spend as much as $880 (for Stata/SE plus full documentation). In addition, you may be interested in Stat/Transfer 9 (available from Stata Corp for $65 under the GradPlan), which is an extremely good conversion package that enables you to convert SAS or SPSS or any of a wide number of other file types to Stata files, and vice versa. Stat/Transfer is available at the SSC Labs, but it is convenient to have it on your own computer.
If you have a previous version of Stata or of Stat/Transfer, you should check the Stata Corp web page for upgrade pricing.
Note: Even if you own a previous version of Stata, you must upgrade to Stata 10.0 since it contains many new commands and is the version from which I teach.
Here’s how to make these purchases. Go to the Stata Corp web site (http://www.stata.com), click on “Order Stata” or “Upgrade Now,” click on “Educational,” click on “Place an order” under “GradPlans,” click to get to UCLA, and complete your order, paying via credit card. I suggest ground shipping as the cheapest option; allow 3-4 business days. To be safe, I suggest ordering Stata by the beginning of the 2nd week of the course, that is, on Oct. 1 or 2. This will give you time to get and look over the manuals before the introductory lecture the 3rd week of class and, if you have purchased the software, to install it and try it out. If you have questions about the purchase, call Stata Corp (800-782-8272), and ask to speak to someone about the GradPlan. They are quite friendly.
The course web page. I have established a CourseWeb page for the course, which contains this syllabus and links to the data sets we will use and the documentation for these data sets. From time to time I will add other materials, which you may then download or print for your own use. Each Thursday evening (or Friday morning) I will put up an “Illustrative Answer” to the exercise you turn in that day, which you may also download or print. I urge you to check the web site frequently since it will contain the most up-to-date information regarding the course. The URL (Internet address) is: http://www.sscnet.ucla.edu/07F/soc212a-1/ . You should make a bookmark for this site on your computer.
Books. Apart from the Stata documentation, there is only one required purchase each quarter—a course pack. The Course Pack for the first quarter contains the weekly exercises plus nine chapters I have written to accompany the lectures; these chapters constitute a partial text book, which students generally have found very helpful. The Course Pack can be purchased at Westwood Copies, 1001 Gayley Ave. (phone: 310/208-3233). It will be available sometime before the first class; call to check. Please read Chs. 1 and 2 before the first class meeting. There will be a second course pack to purchase for the Spring quarter. In addition, one book is very strongly recommended. This is:
• Becker, Howard S. 1986. Writing for Social Scientists: How to Start and Finish Your Thesis, Book, or Article. Chicago: University of Chicago Press.
This is a wonderful book, which you should read through just for the pleasure of it, and reread whenever you discover you are hung up on your writing. As you read through it the first time, make a point of identifying your greatest writing sin—you will probably suffer from more than one, so identify the worst one. Then fix it, of course! Read it as soon as you get it.
Four more books are also very strongly recommended, not so much for this course as for advanced study and future reference. If you want to improve your understanding of OLS regression, work through the Fox book (which also includes some material on logistic regression). There are now two very good books for sociologists and other social scientists on logistic regression and allied topics, by Scott Long and by Daniel Powers and Yu Xie. In addition, there is a book by Long and Jeremy Freese on how to do logistic regression analysis using Stata. All five authors are sociologists; all four books were published fairly recently; and all four are fairly demanding. Of the three logistic regression texts, I have some preference for that by Powers and Xie because Long tends to emphasize standardized coefficients, which I generally find more confusing than helpful; but that may be my own limitation. Here are the bibliographic references, in chronological order. You can buy all but the Fox book through the Stata Bookstore, among other places. For some reason, the Fox book is no longer listed on the Stata Bookstore web page.
• Fox, John. 1997. Applied Regression Analysis, Linear Models, and Related Methods. Thousand Oaks: Sage. • Long, J. Scott. 1997. Regression Models for Categorical and Limited Dependent Variables. Thousand Oaks: Sage. • Powers, Daniel A., and Yu Xie. 2000. Statistical Methods for Categorical Data Analysis. Orlando: Academic Press. • Long, J. Scott, and Jeremy Freese. 2006. Regression Models for Categorical Dependent Variables Using Stata. 2nd edition. College Station, TX: Stata Press.
If you are new to Stata, I recommend that you work through the tutorial in the Getting Started with Stata for Windows manual. For any of you who continue to have difficulty making sense of the material in the manuals, I recommend
• Rabe-Hesketh, Sophia, and Brian Everitt. 2007. A Handbook of Statistical Analyses Using Stata, 4th edition. Boca Raton, FL: Chapman & Hall/CRC.
This is designed as a companion to the Stata manuals. It is fairly elementary, but is a good overview of what can be done with Stata. It covers a lot of ground quickly, so it is best as a review of rather than an introduction to statistics. You can order the book from Stata Corp. There are several other guides to using Stata, which you can check out on the Stata Bookstore web page.
Finally, the classic text on logistic regression, by Hosmer and Lemeshow, first published in 1989, has been updated fairly recently. Hosmer and Lemeshow are biostatisticians and their examples reflect this orientation. In the second edition, many of the examples were generated using Stata. A particular strength of the book is the attention it pays to complex samples that require survey estimation techniques (implemented in the Stata svylogit command).
• Hosmer, David W., and Stanley Lemeshow. 2000. Applied Logistic Regression. 2nd edition. New York: Wiley.
Additional readings. The following will prove helpful to many of you. Note that publications identified as “(Sage No. ##)” are Sage University Papers, in the Quantitative Applications in the Social Sciences Series, published by Sage Publications, Thousand Oaks, CA. These are quite short, less than 100 pages each, and are intended as introductions to various topics. I have found them generally useful, although some are better than others.
On the General Social Survey
Davis, James A., and Tom W. Smith. 1992. The NORC General Social Survey: a User’s Guide. (Guides to Major Social Science Data Bases 1.) Thousand Oaks: Sage. This gives a history and, more important, the logic of the GSS. It is a very useful document for any of you who expect to make professional research use of the GSS.
Davis, James Allan, Tom W. Smith, and Peter V. Marsden. 2005. General Social Surveys, 1972-2004: Cumulative Codebook. Chicago: National Opinion Research Center. This is the codebook for the GSS. While you can access the codebook via the Internet, it actually is substantially more convenient to have a local copy. The easiest way to obtain a copy is to ask Libbie Stephenson, the UCLA Data Archivist, to burn a CD for you, which will cost you all of one dollar. You can then put the codebook on your computer, which makes it easy to search or, if you really, really want a paper copy, you can print it. However, since it is about 1,500 pages in length, this will be an expensive proposition.
On cross-tabulations
Davis, James A. 1985. The Logic of Causal Order. (Sage No. 55.) This publication encompasses more than cross-tabulations but is included here because the logic of causal order is for many students most confusing with respect to crosstabulations.
Zeisel, Hans. 1985. Say it with figures. 6th ed. New York : Harper & Row. This is a classic, first published in 1947 and continuously updated through 1985. You should know about it just for historical reasons. But you also are likely to find it useful as a guide to presenting clear and informative tables.
On regression analysis
Achen, Christopher H. 1982. Interpreting and Using Regression. (Sage No. 29.)
Asher, Herbert B. 1983. Causal Modeling. Second Edition. (Sage No. 3.)
Berry, William D., 1993. Understanding Regression Assumptions. (Sage No. 92.)
Berry, William D., and Stanley Feldman. 1985. Multiple Regression in Practice. (Sage No. 50.)
Cohen, Jacob, and Patricia Cohen. 1975. Applied Multiple Regression/Correlation Analysis for the Behavioral Sciences. Hillsdale, NJ: Lawrence Erlbaum Associates. An old, but still useful, text. Lots of practical tricks, all based on school algebra.
Fox, John. 1991. Regression Diagnostics. (Sage No. 79.)
Hamilton, Lawrence C. 1992. Regression with Graphics: a Second Course in Applied Statistics. Belmont: Duxbury Press. This is a graphically oriented book on regression by the author of Statistics with Stata 7.0. Although it can be used with any statistical package, it is clear that most of the graphics were created using Stata. It thus is a useful advanced applied statistics text for Stata users.
Hardy, Melissa. 1993. Regression with Dummy Variables. (Sage No. 93.)
Hargens, Lowell L. 1976. “A Note on Standardized Coefficients as Structural Parameters.” Sociological Methods and Research 5:247-56. Makes a case for comparing standardized coefficients across samples. Read in conjunction with Kim and Mueller.
Jaccard, James, and Robert Turrisi. 2003. Interaction Effects in Multiple Regression. 2nd ed. (Sage No. 72.)
Kim, Jae-On, and Charles W. Mueller. 1976. “Standardized and Unstandardized Coefficients in Causal Analysis: An Expository Note.” Sociological Methods and Research 4:423-38. Makes the case against comparing standardized coefficients across samples (the conventional view). Read in conjunction with Hargens.
Marsh, Lawrence C., and David R. Cormier. 2001. Spline Regression Models. (Sage No. 137.)
Stolzenberg, Ross. 1974. “Estimating an Equation with Multiplicative and Additive Terms.” Sociological Methods and Research 2:313-31.
Wildt, Albert R., and Olli Ahtola. 1978. Analysis of Covariance. (Sage No. 12.)
On logistic regression and allied procedures
Aldrich, John H., and Forrest D. Nelson. 1984. Linear Probability, Logit, And Probit Models. (Sage No. 45.)
Borooah, Vani Kant. 2001. Logit and Probit: Ordered and Multinomial Models. (Sage No. 138.)
Conroy, Ronán M. 2002. “Choosing an Appropriate Real-life Measure of Effect Size: the Case of a Continuous Predictor and a Binary Outcome.” Stata Journal 2:290-295.
DeMaris, Alfred. 1992. Logit Modeling. (Sage No. 86.) Includes a discussion of log linear models.
DeMaris, Alfred. 2002. “Explained Variance in Logistic Regression: A Monte Carlo Study of Proposed Measures.” Sociological Methods and Research 31:27-74.
Heiss, Florian. 2002. “Structural Choice Analysis with Nested Logit Models.” Stata Journal 2:227-252.
Liao, Tim Futing. 1994. Interpreting Probability Models: Logit, Probit, and Other Generalized Linear Models. (Sage No. 101.)
Menard, Scott. 2001. Applied Logistic Regression Analysis. 2nd ed. (Sage No. 106.) A non-technical introduction to logistic regression, treated as an extension of OLS regression.
Pampel, Fred C. 2000. Logistic Regression: A Primer. (Sage No. 132.)
On log-linear and log-multiplicative analysis
Goodman, Leo A., and Michael Hout. 1998. “Statistical Methods and Graphical Displays for Analyzing How the Association between Two Qualitative Variables Differs among Countries, among Groups, or over Time: A Modified Regression-Type Approach.” Sociological Methodology 28:175-230.
Goodman, Leo A., and Michael Hout. 2001. “Statistical Methods and Graphical Displays for Analyzing How the Association between Two Qualitative Variables Differs among Countries, among Groups, or over Time - Part II: Some Exploratory Techniques, Simple Models, and Simple Examples.” Sociological Methodology 31:189-221. [These two articles represent the current state of the art. See also the extensive literature cited, particularly the papers by Xie and by Yamaguchi.]
Hagenaars, Jacques A. 1993. Loglinear Models with Latent Variables. (Sage No. 94.) An advanced topic: structural equation models for categorical variables.
Hauser, Robert M. 1978. “A Structural Model of the Mobility Table.” Social Forces 56:919-43.
Hauser, Robert M. 1980. “Some Exploratory Methods for Modeling Mobility Tables and Other Cross-Classified Data.” Sociological Methodology 1980.
Hout, Michael. 1982. Mobility Tables. (Sage No. 31.) Consideration of an important class of log linear models. Read after Knoke and Burke.
Ishii-Kuntz, Masako. 1994. Ordinal Log-linear Models. (Sage No. 97.)
Kaufman, Robert L., and Paul G. Schervish. 1986. “Using Adjusted Crosstabulations to Interpret Log-linear Relationships.” American Sociological Review 51:717-33.
Knoke, David, and Peter Burke. 1980. Log-Linear Models. (Sage No. 20.) An introduction to log linear models.
McCutcheon, Allan L. 1987. Latent Class Analysis. (Sage No. 64.) Extension of log linear analysis to create an analog to factor analysis for categorical variables.
Rudas, Tamás. 1998. Odds Ratios in the Analysis of Contingency Tables. (Sage No. 119.)
On factor analysis
Dunteman, George H. 198x. Principal Components Analysis. (Sage No. 69.)
Kim, Jae-On, and Charles W. Mueller. 1978. Introduction to Factor Analysis: What It Is and How to Do It. (Sage No. 13.)
Kim, Jae-On, and Charles W. Mueller. 1978. Factor Analysis: Statistical Methods and Practical Issues. (Sage No. 14.)
McCutcheon, Allan L. 1987. Latent Class Analysis. (Sage No. 64.)
Analog to factor analysis for categorical variables.
On sample selection bias, matching, and allied topics
Becker, Sascha O., and Andrea Ichino. 2002. “Estimation of Average Treatment Effects Based on Propensity Scores.” Stata Journal 2:358-377.
Breen, Richard. 1996. Regression Models: Censored, Sample Selected, or Truncated Data. (Sage No. 111.)
Berk, Richard A. 1983. “An Introduction to Sample Selection Bias in Sociological Data.” American Sociological Review 48:386-98.
Berk, Richard A., and Subhash C. Ray. 1982. “Selection Biases in Sociological Data.” Social Science Research 11:352-98.
Smith, Herbert L. 1997. “Matching with Multiple Controls to Estimate Treatment Effects in Observational Studies.” Sociological Methodology 27:325-353.
Winship, Christopher, and Robert D. Mare. 1992. “Models for Sample Selection Bias.” Annual Review of Sociology 18:327-50.
On estimation, statistical inference, and related topics
Eliason, Scott R. 1993. Maximum Likelihood Estimation: Logic and Practice. (Sage No. 96).
Gould, William, and William Sribney. 1999. Maximum Likelihood Estimation with Stata. College Station: Stata Press.
Henkel, Ramon E. 1976. Tests of Significance. (Sage No. 4.) Read in conjunction with Sage No. 43, Bayesian Statistical Inference.
Iverson, Gudmund R. 1984. Bayesian Statistical Inference. (Sage No. 43.) Read in conjunction with Sage No. 4, Tests of Significance.
Mohr, Lawrence B. 1993. Understanding Significance Testing. (Sage No. 73.) A coherent review of what you supposedly learned in introductory statistics but perhaps didn’t quite get, with an emphasis on practical applications.
Mooney, Christopher Z., and Robert D. Duval. 1993. Bootstrapping: A Nonparametric Approach to Statistical Inference. (Sage No. 95.)
Mooney, Christopher Z. 1997. Monte Carlo Simulation. (Sage No. 116.)
Raftery, Adrian. 1995. “Bayesian Model Selection in Social Research.” Sociological Methodology 1995 25:111-63.
Definitive discussion of BIC in the sociological literature. See also the Special Issue on the Bayesian Information Criterion, Sociological Methods and Research, Vol. 27, No. 3 (February 1999) which includes a critical evaluation of BIC by David Weakliem, a defense by Raftery, and discussions by several others.
Smithson, Michael. 2002. Confidence Intervals. (Sage No. 140.)
On coping with missing data
Allison, Paul D. 2001. Missing Data. (Sage No. 136.)
Brick, J. Michael, and Graham Kalton. 1996. “Handling Missing Data in Survey Research.” Statistical Methods in Medical Research 5:215-238.
Landerman, Lawrence R., Kenneth C. Land, and Carl F. Pieper. 1997. “An Empirical Evaluation of the Predictive Mean Matching Method for Imputing Missing Values.” Sociological Methods and Research 26:3-33.
Little, Roderick J. A., and Donald B. Rubin. 2002. Statistical Analysis with Missing Data, 2nd edition. New York: John Wiley & Sons. The definitive treatment, by the creators of multiple-imputation.
Paul, Christopher, William M. Mason, Daniel McCaffrey, and Sarah A. Fox. 2003. “What Should We Do About Missing Data? (A Case Study Using Logistic Regression with Missing Data on a Single Covariate).” Los Angeles: California Center for Population Research, On-Line Working Paper Series CCPR-028-03. A good overview of the strengths and weaknesses of various methods, with some useful skepticism about multiple imputation as the gold standard.
Schafer, Joseph L. 1997. Analysis of Incomplete Multivariate Data. London: Chapman and Hall. An accessible overview.
Schafer, Joseph L. 1999. “Multiple Imputation: A Primer.” Statistical Methods in Medical Research 8:3-15. A short version of Schafer 1997.
On other topics
Carmines, Edward G., and Richard A. Zeller. 1979. Reliability and Validity Assessment. (Sage No. 17.)
Firebaugh, Glenn. 1997. Analyzing Repeated Surveys. (Sage No. 115.)
Fox, James Alan, and Paul E. Tracy. 1986. Randomized Response: A Method for Sensitive Surveys. (Sage No. 58.) A specialized technique for getting estimates of rates for sensitive topics.
Glenn, Norval D. 2004. Cohort Analysis. 2nd ed. (Sage No. 5.)
Jacoby, William G. 1997. Statistical Graphics for Univariate and Bivariate Data. (Sage No. 117.)
Kalton, Graham. 1983. Introduction to Survey Sampling. (Sage No. 35.)
Lee, Eun Sul, Ronald N. Forthofer, and Ronald J. Lorimor. 1989. Analyzing Complex Survey Data. (Sage No. 71.)
The analysis of stratified and clustered samples.
Luke, Douglas A. 2004. Multilevel Modeling. (Sage No. 143.)
Moore, Kristin Anderson, Tamara G. Halle, Sharon Vandivere, and Carrie L. Mariner. 2002. “Scaling Back Survey Scales: How Short is Too Short?” Sociological Methods and Research 30:530-567.
McIver, John P., and Edward G. Carmines. 1984. Unidimensional Scaling. (Sage No. 24.)
Namboodiri, Krishnan. 1984. Matrix Algebra. (Sage No. 38.)
Rudas, Tamas. 2004. Probability Theory: A Primer. (Sage No. 142.)
Sullivan, John L., and Stanley Feldman. 1979. Multiple Indicators. (Sage No. 15.)
Use of multiple indicators to assess validity and reliability.
Vanleeuwen, Dawn M., and Keith H. Mandabach. 2002. “A Note on the Reliability of Ranked Items.” Sociological Methods and Research 31:87-105.
Reading. Since this is not an undergraduate course, no reading is “required”—except for the course reader. But there is much that is valuable, especially when read in conjunction with your analysis. I strongly suggest that you try to read from the above materials in conjunction with each week’s lecture. Also, I will make reading suggestions in class from time to time.