For too long, education has relied on backward-looking data—annual test scores and surveys—that tell us where we have been, not where we need to go. This is like trying to navigate a winding road by only checking the rearview mirror. Educators and policymakers need real-time insight to adjust course, not just a post-mortem of past performance.
The key is Practical Measurement for Improvement: a system designed to guide instruction, not simply label outcomes. The current approach often feels like an autopsy rather than a resource for growth. To get better at getting better, we must recognize that yearly results arrive too late to be useful.
The Three Purposes of Measurement – and Why They Get Confused
The problem isn’t a lack of data, but a failure to distinguish between its purposes. There are three fundamental ways we measure, each requiring a different design:
- Accountability (The Scoreboard): Focused on past performance (“Did we meet our goals?”). It’s infrequent, high-stakes, and designed for judgment.
- Research (The Laboratory): Focused on generalizable truths (“Is this theory correct?”). It prioritizes precision, often at the expense of real-world application.
- Improvement (The Steering Wheel): Focused on immediate change (“What works here, and why?”). This prioritizes learning for both students and educators.
Purpose drives design. If you want to improve instruction next week, you need leading indicators—data that can inform daily adjustments, not just annual reports.
The “Tuesday Test”: Is Your Data Actually Actionable?
As researchers have argued, the crucial question is: What will educators do with this data on Tuesday? If it is too abstract, too delayed, or too aggregated to prompt an immediate change in practice, it fails the test.
Effective measures must meet five criteria (see the sketch after this list):
- Theory-aligned: Grounded in educational principles.
- Meaningful: Relevant to classroom practice.
- Actionable: Capable of triggering specific adjustments.
- Low-burden: Easy to collect without overwhelming educators.
- Timely: Available fast enough to inform PDSA (Plan-Do-Study-Act) learning cycles.
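As a purely illustrative sketch (not part of any formal framework), a team might screen a candidate measure against these five criteria with something this simple; the criterion keys, the failed_criteria helper, and the example ratings are all assumptions for the example:

```python
# Hypothetical sketch: screening a candidate measure against the five
# practical-measurement criteria. Names and ratings are illustrative only.

TUESDAY_TEST_CRITERIA = [
    "theory_aligned",  # grounded in the team's working theory of improvement
    "meaningful",      # relevant to classroom practice
    "actionable",      # can trigger a specific instructional adjustment
    "low_burden",      # cheap for educators to collect
    "timely",          # arrives fast enough to inform a PDSA cycle
]

def failed_criteria(ratings):
    """Return the criteria a candidate measure fails; an empty list means it passes."""
    return [c for c in TUESDAY_TEST_CRITERIA if not ratings.get(c, False)]

# Example: an annual state test serves accountability, but it is neither
# timely nor actionable for next week's instruction.
annual_state_test = {
    "theory_aligned": True,
    "meaningful": True,
    "actionable": False,
    "low_burden": True,
    "timely": False,
}
print(failed_criteria(annual_state_test))  # ['actionable', 'timely']
```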
But a single measure isn’t enough. Schools are systems, and changes in one area (like math fluency) will impact others (like student motivation).
Systems Thinking: The “Family of Measures”
To understand these interconnected effects, we need a “Family of Measures” (sketched after this list) that tracks:
- Outcomes (Aim): What we want to achieve.
- Drivers (Key Markers): The factors influencing outcomes.
- Process (Workflow): How things are done.
- Balancing (Unintended Consequences): Both positive and negative side effects.
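As a concrete but hypothetical illustration, a team pursuing an assumed grade-5 math-fluency aim might record its family of measures in a lightweight structure like the one below; the Measure class, measure names, and cadences are all invented for the example:

```python
# Hypothetical family of measures for an assumed aim: improving grade-5
# math fluency. All measure names and cadences are illustrative assumptions.
from dataclasses import dataclass

@dataclass
class Measure:
    name: str     # what is measured
    kind: str     # "outcome" | "driver" | "process" | "balancing"
    cadence: str  # how often the team reviews it

family_of_measures = [
    Measure("End-of-unit fluency assessment", "outcome", "every 6 weeks"),
    Measure("Weekly three-item fluency exit ticket", "driver", "weekly"),
    Measure("Share of lessons using the new number-talk routine", "process", "weekly"),
    Measure("Brief student survey on math confidence and motivation", "balancing", "monthly"),
]

for m in family_of_measures:
    print(f"[{m.kind:>9}] {m.name} ({m.cadence})")
```

The point is not the tooling but the discipline: each kind of measure answers a different question, and the balancing measure watches for the motivational side effects noted above.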
Rigor with Relevance: Beyond “Validity-for-Use”
Practical measurement isn’t about shortcuts. It demands rigor, but our definition of rigor must expand. Traditionally, psychometrics asks whether a measure accurately reflects what it claims to measure (“validity-for-use”). We also need “validity-in-use”: assurance that measures are embedded in routines, culture, and technical infrastructure that encourage constructive inquiry, not just compliance.
If a measure leads educators to blame students instead of improving their own practice, it has failed.
Equity and Variation: The Power of Granular Data
Traditional accountability reports reduce each subgroup to an average. But the real story of efficacy lives in the variation. Practical measurement demands that we ask: “What works, for whom, under what conditions?” Frequent, low-stakes measurement lets teams see how different students respond to a new strategy, so they can pivot this week, not next year.
This shifts the focus from deficit framing (“What’s wrong with these students?”) to systems thinking (“How is our system failing them?”), empowering educators to act.
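To make that concrete, here is a minimal sketch with invented numbers showing how a team might disaggregate weekly exit-ticket scores by student group and instructional routine instead of reporting a single average; the column names and values are purely illustrative:

```python
# Illustrative sketch with invented numbers: disaggregating weekly exit-ticket
# scores by student group and by the instructional routine tried that week.
import pandas as pd

records = pd.DataFrame({
    "student_group": ["EL", "EL", "non-EL", "non-EL", "IEP", "IEP"],
    "routine": ["number talks", "worked examples"] * 3,
    "score": [0.62, 0.81, 0.78, 0.80, 0.55, 0.74],
})

# A single overall average hides the story ...
print("Overall mean:", round(records["score"].mean(), 2))

# ... while disaggregation asks "what works, for whom, under what conditions?"
print(records.groupby(["student_group", "routine"])["score"].mean().round(2))
```

In this invented example, the overall mean looks respectable while masking that one routine serves some groups far better than the other; seeing that variation this week is what makes a timely pivot possible.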
From Implementers to Co-Inquirers: Restoring Agency
Practical measurement shifts power dynamics, inviting teachers to be co-inquirers rather than just implementers of mandates. By including practitioners in the design of measures—asking what constitutes meaningful evidence of learning—we build agency. This approach aligns with the Assessment in the Service of Learning (AISL) movement, transforming assessment from an external audit into an internal engine for improvement.
Leaders face a choice: continue driving by the rearview mirror, or invest in the capacities to measure what matters, when it matters. Measurement can fuel disciplined inquiry, but only if designed for learning. The real question isn’t “Did we implement with fidelity?” It’s “Did we improve, with integrity, in this context?” Stop tracing the map of past roads and start accelerating toward efficacy, one Tuesday at a time.
