'Must Haves' for a Next Generation Comprehensive Assessment System
By Jason K. Feld, Ph.D.
The Common Core education movement has been accompanied by a rapid and widespread interest in the use of standards-aligned, research-supported comprehensive assessment systems to help guide local educational decision-making. This interest has created a pressing need for the development and implementation of a new generation of technological tools capable of integrating standards-based assessment with instructional decision-making. In turn, these tools can provide data required to inform the many types of decisions confronting educators and administrators in today’s schools.
Because different types of decisions require different types of data, a technology-based comprehensive assessment system must be composed of different types of assessments. Each type serves a different central purpose, and each type may require variations in test characteristics and assessment procedures in order to serve its central purpose.
For example, within a comprehensive system, benchmark and formative assessments aligned to district pacing guides and curriculum can inform differentiated instruction on an ongoing basis. Benchmarks can also be designed and analyzed in ways that allow the data to be used both in tracking student progress toward and mastery of standards and in forecasting performance on statewide tests.
- Pre- and post-tests are useful in measuring academic progress and instructional effectiveness over an extended period of time, and can also be used to identify classes and schools that are highly successful as well as those needing assistance.
- Screening instruments are useful in identifying students at risk for learning problems.
- Placement tests inform grade level placements and advanced course placements.
- Computerized adaptive tests provide efficient measures of individual academic proficiency and can help support individualized intervention planning.
- Observational and rating-scale assessments provide authentic measures of competencies in the environment in which those competencies are used while providing immediate feedback to guide instruction.
Data Essentials of a Comprehensive System
- Data that Addresses District Assessment Needs
One essential data characteristic of a comprehensive system is that it helps to ensure that all of the district’s assessment needs are met. This should be accomplished by including the various required types of assessment within one seamless system, rather than in multiple systems housed in separate “silos.” However, some types of assessment results, such as statewide test results, may be imported into the system. Although each type of assessment has a unique central purpose, there is considerable overlap in the subordinate purposes that the various types of assessment may serve. This overlap can increase assessment efficiency, because a given type of assessment may fulfill more than one assessment need or may supply additional information that enhances the effectiveness of educational decisions.
- Data that Provides Student Mastery Information in Credible and Actionable Ways
A second essential data characteristic of a comprehensive assessment system is that the measurement approach used to document student growth provides actionable information that can be used to guide educational decision-making and to credibly track student progress toward mastery of standards. In this regard, implementation of Item Response Theory (IRT) analyses following the administration of district-wide assessments is indispensable. This is because IRT makes it possible to place scores from multiple assessments on a common scale. As a direct benefit, results from different assessments can be compared directly, and the progression of student scores across assessments over time becomes a direct measure of growth. This offers a more complete picture of student growth and achievement than would otherwise be possible with simple raw scores (i.e., percent or number correct). IRT analyses also make it possible to forecast students’ likely performance on statewide tests. This gives educators the data necessary to identify students or groups of students at various risk levels (e.g., low, moderate, high) and to engage in differentiated instruction that increases the students’ likelihood of mastering specific standards and passing the statewide assessment. One of the most important benefits of IRT analyses is that they make it possible for school districts to continually evaluate the psychometric rigor of assessments and the characteristics of the items comprising them (i.e., item parameter estimates, including difficulty, guessing and discrimination). Continually refreshing item parameter estimates is essential to ensure that item characteristics are kept current as student populations and instructional content change over time. This refreshing process becomes even more critical when standards change, leading to the need to realign current items and develop new ones.
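The three item parameters named above (difficulty, guessing and discrimination) are those of the three-parameter logistic (3PL) IRT model. The article does not say which IRT model is used, so the following is only the standard textbook form, offered under that assumption; the symbols are chosen here for illustration.

```latex
P_i(\theta) \;=\; c_i + (1 - c_i)\,\frac{1}{1 + e^{-a_i(\theta - b_i)}}
```

Here $P_i(\theta)$ is the probability that a student with ability $\theta$ answers item $i$ correctly, $a_i$ is the item's discrimination, $b_i$ its difficulty, and $c_i$ its guessing parameter (the lower asymptote). Placing assessments on a common scale amounts to expressing $\theta$ and every $b_i$ in the same units across tests.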
To further understand why IRT is an indispensable part of a comprehensive assessment system, let’s take a look at some hypothetical student scores on a series of district-wide benchmark assessments. If a class of students received an average IRT score of 1100 on an initial assessment and an average IRT score of 1150 on a subsequent assessment, the district would know that the students showed significant growth. By contrast, if IRT were not used and the class received an average score of 70 percent correct on the initial assessment and 80 percent correct on the subsequent assessment, no conclusions about growth would be warranted. This is because the difference in scores between the two tests could be attributed to differences in the difficulty of the tests — i.e., the second test could be easier or even harder — or to changes in the achievement of the students, or to both. We just wouldn’t know.
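The raw-score ambiguity described above can be illustrated with a small sketch. It assumes a three-parameter logistic (3PL) item model, which the article does not name explicitly, and every item parameter and the ability value below is invented for illustration. The student's ability is held fixed, yet the expected percent correct differs between the two tests purely because the items differ in difficulty.

```python
import math


def p_correct(theta, a, b, c):
    """Assumed 3PL model: chance that a student of ability theta answers
    an item with discrimination a, difficulty b and guessing c correctly."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


def expected_percent_correct(theta, items):
    """Average probability of a correct answer over a test, as a percent."""
    total = sum(p_correct(theta, a, b, c) for a, b, c in items)
    return 100.0 * total / len(items)


# Hypothetical items as (discrimination, difficulty, guessing) tuples.
harder_test = [(1.0, 0.5, 0.2), (1.2, 1.0, 0.2), (0.8, 0.0, 0.2)]
# Same items with every difficulty lowered by one unit.
easier_test = [(a, b - 1.0, c) for a, b, c in harder_test]

theta = 0.0  # one fixed, hypothetical ability level: no growth at all
first = expected_percent_correct(theta, harder_test)
second = expected_percent_correct(theta, easier_test)

print(f"Expected percent correct, first test:  {first:.1f}")
print(f"Expected percent correct, second test: {second:.1f}")
```

Because ability never changes in this sketch, the gap between the two printed percentages comes entirely from test difficulty. That is the confound a common IRT scale is meant to remove.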
- Data that Facilitates Multiple Forms of Differentiated Instruction
A third essential data characteristic of a comprehensive assessment system is its capacity to provide the necessary data and tools for implementing differentiated instruction. Utilization of IRT analyses in addition to traditional raw data measures of student performance makes it possible for educators to engage in three interrelated forms of differentiated instruction. The most basic type, which uses raw data, involves assistance to the learner related to the performance of specific tasks, such as responding to a set of test items.
The second type involves interventions provided for students identified as being at varying levels of risk for not meeting standards on a statewide test. This type involves the use of IRT to identify next instructional steps for students at different risk levels. In this regard, IRT provides the capability to estimate the probability that students at a given ability level will be able to respond correctly to items reflecting standards targeted for instruction during a given time period. This information can be used to recommend instructional interventions.
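A minimal sketch of this kind of estimate follows. It again assumes a 3PL item model, and the standards, item parameters, and the ability value assigned to each risk group are hypothetical placeholders rather than values from any real assessment. For each risk group it computes the expected proportion of items answered correctly within each standard.

```python
import math


def prob_correct(theta, a, b, c):
    """Assumed 3PL model: probability of a correct response at ability theta."""
    return c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))


# Hypothetical items grouped by the standard they measure.
# Each item is (discrimination, difficulty, guessing).
items_by_standard = {
    "Standard A": [(1.0, -1.0, 0.2), (1.1, -0.5, 0.2)],
    "Standard B": [(1.2, 0.5, 0.2), (0.9, 1.0, 0.2)],
}

# Hypothetical ability levels standing in for each risk group.
risk_group_ability = {"high risk": -1.0, "moderate risk": 0.0, "low risk": 1.0}


def expected_proportion_correct(theta):
    """Expected proportion correct on each standard's items at ability theta."""
    return {
        standard: sum(prob_correct(theta, *item) for item in items) / len(items)
        for standard, items in items_by_standard.items()
    }


for group, theta in risk_group_ability.items():
    by_standard = expected_proportion_correct(theta)
    summary = ", ".join(f"{name}: {p:.2f}" for name, p in by_standard.items())
    print(f"{group} (theta={theta:+.1f}) -> {summary}")
```

A standard with a low expected proportion for a given risk group would be a candidate for intervention. Where to set that cutoff is a local decision that this sketch deliberately leaves open.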
The third type of differentiated instruction involves re-teaching for students displaying different patterns of progress. For example, students who perform at high levels at the beginning of instruction but make little progress pose a different instructional challenge from those who start out performing poorly but show significant progress. An example of a report used by school districts is shown here. It displays the extent to which students fail to maintain, sustain, or exceed growth expectations, along with risk levels and categorical growth scores, all of which can be used to implement multiple forms of differentiated instruction.
- Data that Facilitates Efficiency and Effectiveness throughout System Implementation
A fourth essential data characteristic of a comprehensive assessment system is that it increases efficiency and effectiveness for a school district. For these gains to occur, the technology platform used to design, administer, analyze and report assessment results should be common across assessment types. Creating this commonality requires that the technology platform include artificial intelligence routines that guide test construction, review, publication, security, scheduling, administration, scoring, psychometric analyses and reporting. Efficiency and effectiveness are also created when the technology platform includes system monitoring tools that make it possible to guide system implementation. These tools should provide data regarding when tests have been constructed, reviewed, revised and delivered. They should also generate basic data on test administration, such as the number of students scheduled to take a test, the number who have actually done so, and when testing has been completed.
Where Do We Go From Here?
Perhaps the most important characteristic of a truly “next generation” comprehensive assessment system is its ability to adapt to continuous change in response to advances in research, technology, public policy and local school district needs. A checklist to keep in mind while searching for a comprehensive assessment system should include:
- Technology capable of accommodating multiple forms/types of change
- Ability to accommodate continuously changing standards
- Capability to rapidly align items, curriculum, and instructional tools to those standards
- Dynamic item banks and instructional content that expand continuously
- Capability to generate innovative item types and innovative approaches to curriculum development and delivery that will be required as the transition to online assessment and instruction accelerates
- Capability to incorporate new types of assessments, reports, curriculum, and instructional tools to meet changing needs
A comprehensive assessment system should always be and will always be a work in progress. Nothing less will meet the educational challenges faced by educators in today’s rapidly changing world.
Jason Feld received an M.A. in Psychology from New York University and a Ph.D. in Educational Psychology from the University of Arizona. His research and professional activities in Pre-K and K-12 education, assessment, policy and practice span 30 years. Dr. Feld has published in books, scholarly journals, technical reports, and early childhood journals and has served on editorial advisory boards.