Wednesday, 14 May 2014

In defence of baseline assessments

Earlier this year the DfE announced new proposals for holding primary schools accountable. These include a "baseline assessment" for pupils in reception. Primary schools that opt in to using this assessment will then be measured on the progress pupils make over the course of their time in school rather than on the raw results of Key Stage tests.
It's fair to say the idea hasn't been universally welcomed. While the NAHT have made some positive noises, the NUT have voted to investigate boycotting these assessments. And I suspect their position is held by the majority of early years teachers.
I don't think the DfE proposal as it stands is perfect - for one thing the suggestion that schools could pick from a range of assessments seems unhelpfully complex. But, given we have high-stakes accountability for primaries, and that this isn't going to change any time soon, the principle seems sensible to me.
However, opponents of the tests have raised some reasonable concerns, particularly that the assessments could be used to "label" children from a young age. I recently received an email from a correspondent (who doesn't wish to be named) which shows how labelling could be avoided while still allowing primaries to be measured on the progress they were making rather than their raw scores, regardless of intake. I've reposted the email in full as, I think, it shows how the benefits of the assessments could be secured without the negatives opponents are worried about.
"My starting point on baseline assessments is that a teacher's focus for ages 4-7 should mostly be about absolutes rather than relatives. As an absolute bottom line, every 7yo should have completed learning to decode (including the complex code, not just the simplified initial code) and thus to read with reasonable fluency, to write properly, to spell (though not full spelling code mastery by 7), and have had opportunities to practise their new skills in worthwhile activities; and similarly for maths. These aspirations should be there for all children (with the perennial exception of true heavy-duty special needs), not just the brighter ones. KS1 assessment ought to be showing us whether these aspirations are met.

But this creates a problem, in that children arrive at primary school with very different levels of development and (though many hate the idea) variable capacities to learn. School intakes are far from homogeneous, and the accountability system will persistently punish some schools if we simply compare KS1 outcomes and don't recognise this. In a high-accountability world, this creates disincentives to work in and run these schools, which over time will tend to lead to differences in teacher and curriculum quality, creating a vicious circle.

I therefore think it is important to have a measure in the system that provides a primary education baseline, so from the first term of Reception. I also favour a test over teacher assessment: teachers are too conflicted otherwise. But I would explicitly make this a measure of schools, not pupils. I might send schools information about cohort performance: average score vs national average, range from highest to lowest, probably no more than this: really just enough for schools to see that there is a fair external perspective on their intake, and to have a sense of what overall level of performance at KS1 ought to be expected. But I would definitely NOT give them individual child scores, nor would I give these to parents. (This sounds shocking to many ears, but it is in fact absolutely normal - eg schools administer all sorts of tests for internal purposes whose results don't go to students or parents.) So children would not be labelled, and schools could not set differentiated child level targets explicitly designed to meet specific Ofsted progress expectations. The child level data would sit in the NPD until needed for KS1 progress/VS calculations for all matched children.

This would allow proper assessment of progress and value-added from YR to Y2 at school (and perhaps classroom) level, but without individual labelling with all its negative consequences and without refocusing lower primary teachers away from absolute expectations. And I really do think that this early stage accountability is necessary, as we all tend to judge the lower end of our children's primary schools by how nice the people are, and only realise what they haven't been taught when it is already getting rather late to do something about it. (My older child was in Y2 before I realised that the school's reading and spelling teaching was lamentable, and I am a fairly well-informed parent who recognised quickly that the problem was with the school and not the child. I know many parents lamenting their children's dyslexia who still don't realise that it was probably avoidable.)

Administration of tests to 4/5 yos is of course a challenge. But

(a) modern computer-based tests are quite accessible to the vast majority of children who will already have seen (and often played) tablet/PC/phone games

(b) they can be adaptive, using quite complex algorithms to determine which questions they use to refine the measure, so that even a teacher watching a child take the test cannot deduce their precise score

(c) the incentive to teachers is to under-report baselines, but it would take a degree of nastiness that I hope not too many are capable of to nudge a child away from the right answer towards a wrong answer

(d) I suspect that screening algorithms will be capable of picking up anomalous patterns of answers if teachers impersonate children and try to replicate their mistakes.

So I think it will be possible to establish a worthwhile baseline test if these technical issues can be dealt with and if the temptation to use this as an accountability test for nursery classes can be resisted, as this would infallibly lead to nursery classes starting to teach to typical test items, thus undermining the value of the baseline."


  1. This appears sensible. I would still prefer it if the profession were working on convincing the government that there would be a huge benefit to primary education if the high stakes were removed, and replaced with a high expectation, enabling culture instead......

  2. I think this makes a lot of sense, particularly the point that no individual targets would be derived from the data. If designed well the test could be presented as 'a game we're going to play' rather than 'a test we're going to take' as a further insurance against placing exam stress on such young children. Now I'm suggesting the govt deliver a complex, well-designed IT project, so I might have gone completely mad. Has anyone seen what the new tests look like? Are they done on computer?

  3. Any test done on a computer can only test things that can be tested by a computer, not the complex set of skills and dispositions that need to be in place for a child to settle into school and learn well (see EYFS Development Matters to better understand the complexity of early development). Performing a test for gov't data, and then keeping this data from parents, would seem to me to be completely unethical (not to mention data protection issues). The vast majority of children of this age cannot read, so the test could not involve written instructions or questions. For any children with EAL, a spoken test would be unfair. Children do not always arrive in Reception in one cohort; those who are not yet 5 might not arrive until the term after their birthday. A child of this age might 'perform' well on one day, but 'perform' badly on another; at this age physical and environmental factors can have a huge impact. These are just a tiny handful of the flaws with the notion of testing children by computer at the age of 4.
