
Wednesday, 14 May 2014

In defence of baseline assessments


Earlier this year the DfE announced new proposals for holding primary schools accountable. These include a "baseline assessment" for pupils in reception. Primary schools that opt in to using this assessment will then be measured on the progress pupils make over the course of their time in school rather than on the raw results of Key Stage tests.
 
It's fair to say the idea hasn't been universally welcomed. While the NAHT have made some positive noises, the NUT have voted to investigate boycotting these assessments. And I suspect the NUT's position is held by the majority of early years teachers.
 
I don't think the DfE proposal as it stands is perfect - for one thing the suggestion that schools could pick from a range of assessments seems unhelpfully complex. But, given we have high-stakes accountability for primaries, and that this isn't going to change any time soon, the principle seems sensible to me.
 
However, opponents of the tests have raised some reasonable concerns, particularly that the assessments could be used to "label" children from a young age. I recently received an email from a correspondent (who doesn't wish to be named) which shows how labelling could be avoided while still allowing primaries to be measured on the progress they make rather than their raw scores, regardless of intake. I've reposted the email in full as, I think, it shows how the benefits of the assessments could be secured without the negatives opponents are worried about.
 
 
"My starting point on baseline assessments is that a teacher's focus for ages 4-7 should mostly be about absolutes rather than relatives. As an absolute bottom line, every 7yo should have completed learning to decode (including the complex code, not just the simplified initial code) and thus to read with reasonable fluency, to write properly, to spell (though not full spelling code mastery by 7), and have had opportunities to practice their new skills in worthwhile activities; and similarly for maths. These aspirations should be there for all children (with the perennial exception of true heavy-duty special needs), not just the brighter ones. KS1 assessment ought to be showing us whether these aspirations are met.

But this creates a problem, in that children arrive at primary school with very different levels of development and (though many hate the idea) variable capacities to learn. School intakes are far from homogeneous, and the accountability system will persistently punish some schools if we simply compare KS1 outcomes and don't recognise this. In a high-accountability world, this creates disincentives to work in and run these schools, which over time will tend to lead to differences in teacher and curriculum quality, creating a vicious circle.

I therefore think it is important to have a measure in the system that provides a primary education baseline, so from the first term of Reception. I also favour a test over teacher assessment: teachers are too conflicted otherwise. But I would explicitly make this a measure of schools, not pupils. I might send schools information about cohort performance: average score vs national average, range from highest to lowest, probably no more than this: really just enough for schools to see that there is a fair external perspective on their intake, and to have a sense of what overall level of performance at KS1 ought to be expected. But I would definitely NOT give them individual child scores, nor would I give these to parents. (This sounds shocking to many ears, but it is in fact absolutely normal - eg schools administer all sorts of tests for internal purposes whose results don't go to students or parents.) So children would not be labelled, and schools could not set differentiated child level targets explicitly designed to meet specific Ofsted progress expectations. The child level data would sit in the NPD until needed for KS1 progress/VA calculations for all matched children.

This would allow proper assessment of progress and value-added from YR to Y2 at school (and perhaps classroom) level, but without individual labelling with all its negative consequences and without refocusing lower primary teachers away from absolute expectations. And I really do think that this early stage accountability is necessary, as we all tend to judge the lower end of our children's primary schools by how nice the people are, and only realise what they haven't been taught when it is already getting rather late to do something about it. (My older child was in Y2 before I realised that the school's reading and spelling teaching was lamentable, and I am a fairly well-informed parent who recognised quickly that the problem was with the school and not the child. I know many parents lamenting their children's dyslexia who still don't realise that it was probably avoidable.)

Administration of tests to 4/5 yos is of course a challenge. But

(a) modern computer-based tests are quite accessible to the vast majority of children who will already have seen (and often played) tablet/PC/phone games

(b) they can be adaptive, using quite complex algorithms to determine which questions they use to refine the measure, so that even a teacher watching a child take the test cannot deduce their precise score

(c) the incentive to teachers is to under-report baselines, but it would take a degree of nastiness that I hope not too many are capable of to nudge a child away from the right answer towards a wrong answer

(d) I suspect that screening algorithms will be capable of picking up anomalous patterns of answers if teachers impersonate children and try to replicate their mistakes.

So I think it will be possible to establish a worthwhile baseline test if these technical issues can be dealt with and if the temptation to use this as an accountability test for nursery classes can be resisted, as this would infallibly lead to nursery classes starting to teach to typical test items, thus undermining the value of the baseline."
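For anyone wondering what the "KS1 progress/VA calculations" mentioned at the end of the email might actually involve, here is a minimal sketch assuming the simplest possible value-added model: fit one national line predicting KS1 outcomes from the Reception baseline, then score each school by how far its pupils sit, on average, above or below that line. The data, field names and model below are my own illustrative assumptions, not the DfE's proposed methodology.

from collections import defaultdict

# Hypothetical matched pupil records: (school, Reception baseline score, KS1 score).
pupils = [
    ("school_A", 42, 16), ("school_A", 55, 19), ("school_A", 61, 22),
    ("school_B", 38, 15), ("school_B", 47, 15), ("school_B", 52, 18),
]

# Fit a single national regression line: predicted KS1 = a + b * baseline.
n = len(pupils)
mean_x = sum(p[1] for p in pupils) / n
mean_y = sum(p[2] for p in pupils) / n
b = (sum((p[1] - mean_x) * (p[2] - mean_y) for p in pupils)
     / sum((p[1] - mean_x) ** 2 for p in pupils))
a = mean_y - b * mean_x

# A school's value-added is the mean residual (actual KS1 minus predicted KS1)
# across its pupils; positive means more progress than schools with similar intakes.
residuals = defaultdict(list)
for school, baseline, ks1 in pupils:
    residuals[school].append(ks1 - (a + b * baseline))

for school, res in sorted(residuals.items()):
    print(f"{school}: value-added {sum(res) / len(res):+.2f}")

Any real model would of course adjust for more than a single prior-attainment score and handle pupil mobility, but the principle is the one the email describes: schools are compared on the progress their pupils make from their own starting points, not on raw outcomes.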

Friday, 11 April 2014

Some thoughts on grief


Until it happened it didn't occur to me that our daughter would be stillborn. I'd worried about a difficult birth; 4-day labours; emergency c-sections; brain damage and so on. It didn't cross my mind that when we arrived at the hospital there'd simply be no heartbeat.

Stillbirth turns out to be relatively common - around 1 in every 200 births in the UK. The rate hasn't fallen for over 20 years despite significant improvements in other aspects of maternity care. As 90% of stillbirths have no congenital abnormality it should be possible to reduce the rate significantly with better screening.

*******

I suspect part of the reason for the lack of investment in stillbirth research - and lack of media attention - is that it's very hard to talk about. The absence of the paraphernalia that usually accompanies a death reduces the number of opportunities to engage with friends and relatives - leaving instead an almost complete lack of activity in a household prepared for the exhaustions of a newborn. And unlike other deaths, where you can share stories about the deceased from happier times, there's no hook for positive conversations.

Which is why so many of the messages we've received contain words along the lines of "words are futile at this time" or "there are no words" or "I know nothing I can say can make anything better". Of course this isn't true. For me at least the hundreds of messages we've received have been very helpful in processing what's happened. Without them we'd have had almost no communication at all outside of our immediate families. And the box of cards we now have is pretty much the only thing we have to remember her by.

*******

I've found the process of grieving much as one would expect - it comes in waves and there are increasingly long periods - hours at a time now - where I feel pretty normal (and then feel guilty for feeling normal). But everyone's grief is individual and there are some odd quirks which I think are less common. After the horror of the initial few days I've held it together pretty well. The only times I've really felt myself going to pieces have been after someone has done me a significant and unexpected kindness. I don't know why - perhaps because it reminds me of the enormity of what's happened?

The most noticeable thing has been the difference in the way my wife and I think about her. Because I never had the chance to meet her I think of her in terms of lost possibility; the girl - and woman - she could have been. My wife, though, had a physical relationship with her over many months - making the loss much more visceral. She thinks of her by the name we chose. For some reason I can't.

*******

At some point in the future we'll be holding a fundraising event for Sands, the UK's stillbirth charity, in honour of our daughter and to help pay for research that will hopefully stop other families going through this.





Thursday, 3 April 2014

The worst few days of my life


As many of you know my wife, Linda, and I have been expecting our third child. On Monday afternoon Linda had a 38 week check-up and was told everything was fine. Later that evening she went into a normal labour and early on Tuesday morning we arrived at the hospital. When the midwife did the initial check she was unable to find the baby's heartbeat. After some Casualty-like scenes of panic a doctor confirmed the bad news. Shortly after, our daughter was delivered stillborn. As yet the doctors have been unable to establish a reason why this happened, and in most cases like this they never do.

Needless to say we are heartbroken. The last few days have been the hardest of our lives. But we're very lucky to have our wonderful twins as well as an incredibly supportive network of family and friends. They will see us through this.

I'm writing this public note so that I don't have to tell everyone individually and so that people understand why I'm not returning calls, texts, emails and DMs at the moment. But I am very grateful to everyone who has already offered condolences and support. I'll be back in action soon.

9 things you should know about the new PISA "creative problem-solving" test


Today sees the launch of the first international test of "creative problem-solving". It is the latest addition to the suite of PISA tests run by the OECD which have become hugely influential in global education policy-making.

This test was taken by pupils in late 2012 at the same time as PISA tests in maths, science and reading but the results were held back for a separate launch. I was invited to a pre-embargo briefing yesterday and the information here is taken from a mix of the published reports and answers given by OECD experts at the briefing.

1. The purpose of the test was to measure students' ability to solve problems which do not require technical knowledge. The PISA subject tests in maths, science and reading are also based around problem-solving but they do require knowledge in these subject areas (e.g. mathematical concepts and mental arithmetic). Examples of questions include working out which ticket to buy at a vending machine, given a list of constraints, or finding the most efficient place for three people to meet. Unlike the subject PISA tests it was completed on computers, which allowed for more sophisticated interactive assessment.

2. Overall the results correlated fairly closely with the PISA subject tests. Unsurprisingly students who are good at maths problems are also good at ones involving general reasoning. The correlation with maths results was 0.8 and with reading was 0.75.

3. But England was one of the countries that did significantly better in this test than in the subject ones. It came 11th overall but the individual rankings are misleading. It makes more sense to think of clusters of countries that did about as well as each other. The leading group of seven consists entirely of Far East countries and jurisdictions. England is in the second group with countries that traditionally do well in PISA like Australia, Canada, Finland and Estonia. Then comes a third group made up of other larger European countries and the United States. The countries below the OECD average are primarily smaller European countries and developing nations.

4. This is unhelpful for a number of the big narratives in English education policy. It undercuts the "England is falling behind in the world" narrative so beloved of right-wing newspapers. On a test of intellectual reasoning (which is what this is) our 15-year-olds do as well as any other nation bar a small group of Far East jurisdictions (only two of which - Japan and Korea - are not cities or city states).

5. But it's also perhaps unhelpful for those who argue that our education system is dominated by an obsession with tests and narrow curriculum knowledge. It turns out we're actually pretty good at "21st century skills" already. Our students performed better in this test than you would expect based on their maths, science and reading ability. Likewise all the employers arguing that our system isn't delivering the kind of problem-solving skills they need should reflect on these results.

6. The reason England outperformed its subject PISA scores is that students at the top end did better on the problem-solving test than on the subject ones. Students at the bottom end did no better. This suggests that we're doing something with our more gifted students that we're not doing with our weaker ones. In other countries - e.g. Japan - the opposite was true: weaker students did better in problem-solving than in subject tests but the strongest ones didn't.

7. In England there was no statistically significant gender difference in performance on this test (in maths and science boys do better; in reading girls do). Interestingly immigrants scored below non-immigrants which is a change from the maths and reading tests where there is no significant difference.

8. The domination of Far East countries puts paid to the notion that their success in PISA subject tests is somehow down to rote-learning or fact-cramming. It also puts paid to the idea that all Far East systems are the same. While Shanghai and Hong Kong are still in the top group they did much worse on this test than would be expected given their stellar scores in maths, science and reading. Conversely Korea, Japan and Singapore all did better than would be expected.

9. While the test results are interesting they don't tell us why some countries do better than others. Singapore and Korea - who come top - have both tried over the past few years to add "21st century competencies" to their curricula to make them less purely focused on academic knowledge. But it's unclear whether their high scores in this test are due to that or because their traditional strength in the academic basics transfers to "creative problem-solving" tests of this type. The OECD presenters were clear that they thought it was impossible to teach problem-solving skills in the abstract without content, but they also felt it was possible to embed them in a knowledge-based curriculum.




Saturday, 29 March 2014

Weekly Update 29/3/14

News:

DfE published the final proposals for a new primary accountability framework. Michael Tidd summarised the main changes and gave his take on them (spoiler: he's not impressed).

The NUT went on strike. Michael Tidd didn't think that was a good idea either. Nor did John Blake.

DfE also published plans for a new 16-19 accountability regime.

And plans to cut £200m from LAs and academies.


Best Blogs/Articles:

Why literacy is knowledge by Robert Pondiscio

David Didau on the importance of school behaviour policies (it regularly amazes me how many schools still don't apply one consistently)

Deevybee on whether Dyslexia is an appropriate label

Cherryl KD on training teachers with a SEN specialism

Alex Quigley with some tips for new bloggers

A sample of Daisy Christodoulou's book in American Educator

Harry Webb on the future of education research

Michael Tidd (again) on seven questions you should ask about your post-levels assessment system

Shaun Allison on why some of his school's departments are so successful

Annie Murphy Paul on the importance of analogies


New Research:

Fascinating report from HEFCE on differences in degree outcomes for different groups. The main focus has been on state school pupils doing better than private ones but there's a lot of interesting/worrying stuff in there.

Dan Willingham on a new study showing readability levels may well be inaccurate

Big new Gates Foundation funded report on Khan Academy - which still leaves us unsure as to whether it has any benefit.

Saturday, 22 March 2014

Weekly Update 22/3/14

News:

Another week dominated by Ofsted. On Monday Policy Exchange published their eagerly awaited report with some radical recommendations. It was blogged about by David Didau, Tom Bennett, Joe Kirby, Stuart Lock, Robert Peal and me.

On Friday we heard Ofsted's response from Sir Michael Wilshaw. He promised a shift (over the next 18 months) towards shorter inspections for good schools and a review of the framework.

Which seems to fit with the conclusion to my blog on Policy Exchange's report: "Under the current regime I suspect we will see incremental shifts in the right direction but no big bang reset."

NAHT published a really interesting draft manifesto which I hope others engage with.

Tristram Hunt is backing Future Leaders' campaign to stop discrimination against women in headteacher appointments.

Oh, and the Varkey-GEMS Foundation announced a $1 million prize for the world's best teacher. Good luck everyone.


Best Blogs/Articles:

Harry Webb on the many weaknesses of the "nothing can be known about education" viewpoint

Tom Sherrington argues for a symbiosis between traditional and progressive pedagogy

Daisy Christodoulou has collated a variety of alternatives to NC levels

Chris Hall on the lessons from the first batch of EEF randomised controlled trials

Jo Facer on a wonderful sounding assembly in which she explained the importance of reading

Fascinating piece from Rob Webster on his research showing that getting a statement for a pupil with SEN can actually lead to worse outcomes.

Classroom routines from Elissa Miller who sounds like the most organised teacher in the world

The anonymous Heather F on her really bad teaching

Gifted Phoenix with more info on FSM admissions to Oxbridge than you'll ever need


New Research:

If you're a teacher and have an innovative idea you can win £15k to pay for a year-long research pilot

Sunday Times on new research showing that state school pupils get better degrees than those from private schools with the same qualifications (unfortunately paywalled + the full research is not yet published)

A new Sutton Trust report on parenting and attachment

Monday, 17 March 2014

My take on Policy Exchange's Ofsted report



First thoughts

This is one of the best think-tank reports I've read in a very long time. It's timely and pragmatic, while not being too safe. It's also well written (rarer than you might think).

And importantly it's the first report I've seen that makes real use of social media expertise. The authors acknowledge that they've built on the ideas emerging from Twitter and the blogosphere and the final product is much stronger as a result:

"We would like to thank all the teachers and other educationalists who have continued to debate the role of Ofsted on blogs and on Twitter and in doing so, influenced our work - even if they didn’t know they were! Social media is a democratic phenomenon which offers a tremendous opportunity for closing the gap between practitioners and policymakers. If ideas are good and arguments are compelling, then it has never been as easy as now to shape what politicians and policymakers are thinking."

How I wish that social media had been in full flow when I was running the Policy Exchange education department back in 2008 - it would have significantly improved my thinking.

The full report is here.


The key recommendations

The report sets out a new design for inspections with a shift to regular short inspections based primarily on data and self-evaluation. Only schools where inspectors had concerns (or couldn't tell) would get a longer "tailored inspection". This seems eminently sensible and is in line with Ofsted's slow shift towards risk-based assessment over the past decade.

There would be no teacher observations in these short inspections. Again I strongly agree, and I set out my reasons why this would be an important shift here.

Longer tailored inspections would include teacher observations - but inspectors engaged in these visits would have to be trained to a high standard. This feels like a bit of a fudge. Obviously if we are going to have observations then inspectors must be trained, but there's no reason given for why observations are necessary in the first place.

The problem is that even with the best training available observations are not hugely reliable. The report acknowledges that the gold standard models of observation can achieve 61% agreement between 1st and 2nd observers (p.19). That's still an awful lot of teachers getting the wrong grade for their teaching - with potentially significant knock-on effects for their career. And to achieve that 61% could require up to six separate observations by different people (p.20), which is phenomenally time consuming and expensive.

Of course inspectors, as part of a longer visit, would want to spend time in classrooms but there would need to be a really clear added value to formalising these observations to justify the cost both in resources and to individuals.

I remain of the view that the purpose of even a longer inspection should be to understand whether senior and middle leaders understand their school and not to make potentially invalid judgements about individuals' teaching. As I've said previously:

"Inspections should focus on systems. Essentially Ofsted should be looking at what the school is doing to ensure consistent good teaching. They should be inspecting the school's quality assurance not trying to do the quality assurance themselves. In their classroom visits they should be checking the leadership know their teachers and understand how best to support their future development. They should be checking that they have thought about professional development and about performance management. They should be seeing if the behaviour policy is being enforced; and if the school curriculum is actually being used."


Other recommendations

The full list of recommendations can be found in this blog by Joe Kirby. I agree with nearly all of them - particularly a new requirement that inspectors take a data interpretation test and the suggestion that Ofsted end the practice of having thousands of part-time, contracted, additional inspectors.

I have an issue with the suggestion that schools should only be considered outstanding if they "engage in a serious and meaningful way in some form of school to school improvement with other schools - as chosen by the school itself". This is laudable but very hard to inspect without visiting the other schools, adding cost and complexity. It could also lead to quite a lot of fake collaboration. I'd rather have an additional category of "system leader" for those schools that were indisputably playing that role.

I also remain unconvinced that we need a separate system for inspecting academy chains. Ofsted are already doing inspections of multiple schools within a chain - which led directly to the recent reduction in the number of schools run by EACT. It's not clear what another framework would add.


Will any of it happen?

There's no question Ofsted have woken up - in recent months - to the extent of the public relations challenge they have. The social media engagement of their Director of Schools Mike Cladingbowl has been welcome and extremely encouraging. The reforms he has proposed in recent months fit with the direction of travel of the Policy Exchange report - shorter, more risk-based assessments, emphasising that individual teachers shouldn't be graded - but they are much less radical.

Under the current regime I suspect this will continue - with incremental shifts in the right direction but no big bang reset. Whether we see the Policy Exchange recommendations implemented in full (or even the end of lesson observations altogether) will probably depend on who gets to choose the next Chief Inspector and who that is.