Paul W. Holland - Books
Showing all books by the author Paul W. Holland. Shop with free shipping and fast delivery.
9 products
538 kr
Temporarily out of stock
459 kr
Temporarily out of stock
The issues surrounding the comparability of various tests used to assess performance in schools received broad public attention during congressional debate over the Voluntary National Tests proposed by President Clinton in his 1997 State of the Union Address. Proponents of Voluntary National Tests argue that there is no widely understood, challenging benchmark of individual student performance in 4th-grade reading and 8th-grade mathematics, hence the need for a new test. Opponents argue that a statistical linkage among tests already used by states and districts might provide the sort of comparability called for by the president's proposal. Public Law 105-78 requested that the National Research Council study whether an equivalency scale could be developed that would allow test scores from existing commercial tests and state assessments to be compared with each other and with the National Assessment of Educational Progress. In this book, the committee reviewed research literature on the statistical and technical aspects of creating valid links between tests and on how the content, use, and purposes of education testing in the United States influence the quality and meaning of those links.
The book summarizes relevant prior linkage studies and presents a picture of the diversity of state testing programs. It also looks at the unique characteristics of the National Assessment of Educational Progress. Uncommon Measures provides an answer to the question posed by Congress in Public Law 105-78, suggests criteria for evaluating the quality of linkages, and calls for further research to determine the level of precision needed to make inferences about linked tests. In arriving at its conclusions, the committee acknowledged that ultimately policymakers and educators must take responsibility for determining the degree of imprecision they are willing to tolerate in testing and linking. This book provides science-based information with which to make those decisions.
1 578 kr
Ships within 10-15 business days
Kernel Equating (KE) is a powerful, modern and unified approach to test equating. It is based on a flexible family of equipercentile-like equating functions and contains the linear equating function as a special case. Any equipercentile equating method has five steps or parts. They are: 1) pre-smoothing; 2) estimation of the score-probabilities on the target population; 3) continuization; 4) computing and diagnosing the equating function; 5) computing the standard error of equating and related accuracy measures. KE brings these steps together in an organized whole rather than treating them as disparate problems.
KE exploits pre-smoothing by fitting log-linear models to score data, and incorporates it into step 5) above. KE provides new tools for diagnosing a given equating function, and for comparing two or more equating functions in order to choose between them. In this book, KE is applied to the four major equating designs and to both Chain Equating and Post-Stratification Equating for the Non-Equivalent groups with Anchor Test Design.
This book will be an important reference for several groups: (a) Statisticians and others interested in the theory behind equating methods and the use of model-based statistical methods for data smoothing in applied work; (b) Practitioners who need to equate tests, including those with these responsibilities in testing companies, state testing agencies and school districts; and (c) Instructors in psychometric and measurement programs. The authors assume some familiarity with linear and equipercentile test equating, and with matrix algebra.
Alina von Davier is an Associate Research Scientist in the Center for Statistical Theory and Practice, at Educational Testing Service. She has been a research collaborator at the Universities of Trier, Magdeburg, and Kiel, an assistant professor at the Politechnical University of Bucharest and a research scientist at the Institute for Psychology in Bucharest.
Paul Holland holds the Frederic M. Lord Chair in Measurement and Statistics at Educational Testing Service. He held faculty positions in the Graduate School of Education, University of California, Berkeley and the Harvard Department of Statistics. He is a Fellow of the American Statistical Association, the Institute of Mathematical Statistics, and the American Association for the Advancement of Science. He is an elected Member of the International Statistical Institute and a past president of the Psychometric Society. He was awarded the (AERA/ACT) E. F. Lindquist Award in 2000, and was designated a National Associate of the National Academies of Science in 2002.
Dorothy Thayer is currently a consultant in the Center for Statistical Theory and Practice, at Educational Testing Service. Her research interests include computational and statistical methodology, empirical Bayes techniques, missing data procedures and exploratory data analysis techniques.
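As a rough illustration of steps 3) and 4) in the description above, the sketch below continuizes two discrete score distributions with a Gaussian kernel and then composes them into an equipercentile-like equating function. It is a minimal sketch under stated assumptions, not the book's implementation: the score probabilities are taken as already pre-smoothed and estimated (steps 1 and 2), the bandwidth default of 0.6 is an arbitrary placeholder, and the function names `continuized_cdf` and `kernel_equate` are hypothetical.

```python
import math


def normal_cdf(z):
    """Standard normal CDF, computed from the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def continuized_cdf(x, scores, probs, h):
    """Gaussian-kernel continuization of a discrete score distribution.

    Assumes `probs` sums to 1 and the distribution has positive variance.
    The shrinkage factor `a` keeps the mean and variance of the continuized
    distribution equal to those of the discrete one.
    """
    mean = sum(p * xj for xj, p in zip(scores, probs))
    var = sum(p * (xj - mean) ** 2 for xj, p in zip(scores, probs))
    a = math.sqrt(var / (var + h * h))
    return sum(
        p * normal_cdf((x - a * xj - (1.0 - a) * mean) / (a * h))
        for xj, p in zip(scores, probs)
    )


def kernel_equate(x, x_scores, r, y_scores, s, h_x=0.6, h_y=0.6):
    """Map a score x on form X to the scale of form Y.

    Computes G^{-1}(F(x)), where F and G are the continuized CDFs of X and Y.
    The inverse is found by bisection, which works because G is strictly
    increasing. The bandwidths h_x and h_y are illustrative placeholders.
    """
    target = continuized_cdf(x, x_scores, r, h_x)
    span = (max(y_scores) - min(y_scores)) + 10.0 * h_y
    lo, hi = min(y_scores) - span, max(y_scores) + span
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if continuized_cdf(mid, y_scores, s, h_y) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A simple sanity check on the sketch: when form Y has the same score probabilities as form X but every score is shifted up by one point, and both bandwidths are equal, the equated value of x should be x + 1.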
1 064 kr
Ships within 10-15 business days
In their preface to the second edition of Test Equating, Scaling, and Linking, Mike Kolen and Bob Brennan (2004) made the following observation: “Prior to 1980, the subject of equating was ignored by most people in the measurement community except for psychometricians, who had responsibility for equating” (p. vii). The authors went on to say that considerably more attention is now paid to equating, indeed to all forms of linkages between tests, and that this increased attention can be attributed to several factors:
1. An increase in the number and variety of testing programs that use multiple forms and the recognition among professionals that these multiple forms need to be linked.
2. Test developers and publishers, in response to critics, often refer to the role of linking in reporting scores.
3. The accountability movement and fairness issues related to assessment have become much more visible.
Those of us who work in this field know that ensuring comparability of scores is not an easy thing to do. Nonetheless, our customers (the test takers and score users) either assume that scores on different forms of an assessment can be used interchangeably or, like the critics above, ask us to justify our comparability assumptions. And they are right to do this. After all, the test scores that we provide have an impact on decisions that affect people’s choices and their future plans. From an ethical point of view, we are obligated to get it right.
1 381 kr
Ships within 10-15 business days
6.2 The Two-Sample Capture-Recapture Problem
6.3 Conditional Maximum Likelihood Estimation of N
6.4 The Three-Sample Census
6.5 The General Multiple Recapture Problem
6.6 Discussion
7 MODELS FOR MEASURING CHANGE
7.1 Introduction
7.2 First-Order Markov Models
7.3 Higher-Order Markov Models
7.4 Markov Models with a Single Sequence of Transitions
7.5 Other Models
8 ANALYSIS OF SQUARE TABLES: SYMMETRY AND MARGINAL HOMOGENEITY
8.1 Introduction
8.2 Two-Dimensional Tables
8.3 Three-Dimensional Tables
8.4 Summary
9 MODEL SELECTION AND ASSESSING CLOSENESS OF FIT: PRACTICAL ASPECTS
9.1 Introduction
9.2 Simplicity in Model Building
9.3 Searching for Sampling Models
9.4 Fitting and Testing Using the Same Data
9.5 Too Good a Fit
9.6 Large Sample Sizes and Chi Square When the Null Model is False
9.7 Data Anomalies and Suppressing Parameters
9.8 Frequency of Frequencies Distribution
10 OTHER METHODS FOR ESTIMATION AND TESTING IN CROSS-CLASSIFICATIONS
10.1 Introduction
10.2 The Information-Theoretic Approach
10.3 Minimizing Chi Square, Modified Chi Square, and Logit Chi Square
10.4 The Logistic Model and How to Use It
10.5 Testing via Partitioning of Chi Square
10.6 Exact Theory for Tests Based on Conditional Distributions
10.7 Analyses Based on Transformed Proportions
2 490 kr
Ships within 10-15 business days
Test fairness is a moral imperative for both the makers and the users of tests. This book focuses on methods for detecting test items that function differently for different groups of examinees and on using this information to improve tests. Of interest to all testing and measurement specialists, it examines modern techniques used routinely to ensure test fairness. Three of these relevant to the book's contents are:
* detailed reviews of test items by subject matter experts and members of the major subgroups in society (gender, ethnic, and linguistic) that will be represented in the examinee population
* comparisons of the predictive validity of the test done separately for each one of the major subgroups of examinees
* extensive statistical analyses of the relative performance of major subgroups of examinees on individual test items.
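The third technique, statistical comparison of subgroup performance on a single item, can be sketched with the Mantel-Haenszel procedure. The description above does not name a specific statistic, so this choice is an assumption made for illustration; the function names and the tuple layout of the input are hypothetical. Examinees are matched on total score, and each score level contributes a 2x2 table of (reference right, reference wrong, focal right, focal wrong).

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio for one item across matched score levels.

    `strata` is a list of (ref_right, ref_wrong, focal_right, focal_wrong)
    counts, one tuple per matched score level. Empty levels are skipped.
    Assumes at least one level has nonzero ref_wrong * focal_right.
    """
    numerator = 0.0
    denominator = 0.0
    for ref_right, ref_wrong, focal_right, focal_wrong in strata:
        total = ref_right + ref_wrong + focal_right + focal_wrong
        if total == 0:
            continue
        numerator += ref_right * focal_wrong / total
        denominator += ref_wrong * focal_right / total
    return numerator / denominator


def mh_d_dif(strata):
    """Odds ratio re-expressed on a delta-like scale as -2.35 * ln(alpha).

    Negative values indicate the item is relatively harder for the focal group.
    """
    return -2.35 * math.log(mantel_haenszel_odds_ratio(strata))
```

An odds ratio of 1 (a value of 0 on the delta-like scale) means that, among examinees with matched total scores, the two groups have the same odds of answering the item correctly.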
665 kr
Ships within 10-15 business days
Test fairness is a moral imperative for both the makers and the users of tests. This book focuses on methods for detecting test items that function differently for different groups of examinees and on using this information to improve tests. Of interest to all testing and measurement specialists, it examines modern techniques used routinely to ensure test fairness. Three of these relevant to the book's contents are:
* detailed reviews of test items by subject matter experts and members of the major subgroups in society (gender, ethnic, and linguistic) that will be represented in the examinee population
* comparisons of the predictive validity of the test done separately for each one of the major subgroups of examinees
* extensive statistical analyses of the relative performance of major subgroups of examinees on individual test items.
1 137 kr
Ships within 10-15 business days
In their preface to the second edition of Test Equating, Scaling, and Linking, Mike Kolen and Bob Brennan (2004) made the following observation: “Prior to 1980, the subject of equating was ignored by most people in the measurement community except for psychometricians, who had responsibility for equating” (p. vii). The authors went on to say that considerably more attention is now paid to equating, indeed to all forms of linkages between tests, and that this increased attention can be attributed to several factors:
1. An increase in the number and variety of testing programs that use multiple forms and the recognition among professionals that these multiple forms need to be linked.
2. Test developers and publishers, in response to critics, often refer to the role of linking in reporting scores.
3. The accountability movement and fairness issues related to assessment have become much more visible.
Those of us who work in this field know that ensuring comparability of scores is not an easy thing to do. Nonetheless, our customers (the test takers and score users) either assume that scores on different forms of an assessment can be used interchangeably or, like the critics above, ask us to justify our comparability assumptions. And they are right to do this. After all, the test scores that we provide have an impact on decisions that affect people’s choices and their future plans. From an ethical point of view, we are obligated to get it right.
1 034 kr
Ships within 10-15 business days
Kernel Equating (KE) is a powerful, modern and unified approach to test equating. It is based on a flexible family of equipercentile-like equating functions and contains the linear equating function as a special case. Any equipercentile equating method has five steps or parts. They are: 1) pre-smoothing; 2) estimation of the score-probabilities on the target population; 3) continuization; 4) computing and diagnosing the equating function; 5) computing the standard error of equating and related accuracy measures. KE brings these steps together in an organized whole rather than treating them as disparate problems.
KE exploits pre-smoothing by fitting log-linear models to score data, and incorporates it into step 5) above. KE provides new tools for diagnosing a given equating function, and for comparing two or more equating functions in order to choose between them. In this book, KE is applied to the four major equating designs and to both Chain Equating and Post-Stratification Equating for the Non-Equivalent groups with Anchor Test Design.
This book will be an important reference for several groups: (a) Statisticians and others interested in the theory behind equating methods and the use of model-based statistical methods for data smoothing in applied work; (b) Practitioners who need to equate tests, including those with these responsibilities in testing companies, state testing agencies and school districts; and (c) Instructors in psychometric and measurement programs. The authors assume some familiarity with linear and equipercentile test equating, and with matrix algebra.
Alina von Davier is an Associate Research Scientist in the Center for Statistical Theory and Practice, at Educational Testing Service. She has been a research collaborator at the Universities of Trier, Magdeburg, and Kiel, an assistant professor at the Politechnical University of Bucharest and a research scientist at the Institute for Psychology in Bucharest.
Paul Holland holds the Frederic M. Lord Chair in Measurement and Statistics at Educational Testing Service. He held faculty positions in the Graduate School of Education, University of California, Berkeley and the Harvard Department of Statistics. He is a Fellow of the American Statistical Association, the Institute of Mathematical Statistics, and the American Association for the Advancement of Science. He is an elected Member of the International Statistical Institute and a past president of the Psychometric Society. He was awarded the (AERA/ACT) E. F. Lindquist Award in 2000, and was designated a National Associate of the National Academies of Science in 2002.
Dorothy Thayer is currently a consultant in the Center for Statistical Theory and Practice, at Educational Testing Service. Her research interests include computational and statistical methodology, empirical Bayes techniques, missing data procedures and exploratory data analysis techniques.
From the reviews:
"The book is nicely laid out, is extremely well written, and is an excellent text for a semester course or a short course…The book is highly recommended." Short Book Reviews of the International Statistical Institute, December 2004
"This book is well-written and the presentation is clear, rigorous, and concise...A rich set of applications is used to illustrate the methods...This book is a gem! I highly recommend it to any statistician or psychometrician who has even a passing interest in test equating." Psychometrika, March 2006
"This is a great book, and it is the first to focus on the kernel method of test equating." Applied Psychological Measurement, September 2005