Know what you’re testing – lessons from the national curriculum SATs

It's the time of year when the national curriculum SATs results are released. This never fails to prompt discussion about various aspects of testing our children and the appropriateness of the tests themselves. Test marking always generates controversy, with this year being no exception. Points picked up in a BBC article highlight concerns with the marking of English papers. One relates to a question where children were asked to insert a semi-colon correctly in a sentence, and the other to labelling main and subordinate clauses. Whilst we can argue about whether knowing the intricacies of using semi-colons makes our children better communicators, it is the marking of these questions that raises serious problems with the reliability and validity of the tests.

Reliability - the accuracy or consistency of the test - and validity - the extent to which the test measures what it is supposed to - are recognised as the cornerstones of any adequate form of measurement. In the case of large scale assessment, where many different markers are needed, obtaining adequate consistency between markers is a major challenge. Good training of markers and detailed guidance on what to accept as correct or incorrect help, but only go so far. Inevitably, noise or 'error' creeps into the system. We can never remove error completely but it should be minimised as far as possible. However, pressures of time and cost may mean that insufficient checks are put in place, with reliability suffering as a consequence.

What the current SATs more worryingly highlight are concerns over validity. Take the question about correctly punctuating a sentence. Markers were given guidance on exactly where the top of the semi-colon should be, the direction of the bottom comma and other features needed for a response to be acceptable. So, a child could have put a mark that was clearly a semi-colon and in the correct place, yet still not be considered to have answered the question correctly. Unfair? Almost certainly yes.

The key point to consider is the purpose of this question - its validity. If we want to check whether a child can correctly write a semi-colon, then ask them to do this. Give them a blank space and ask them to fill it with a semi-colon. However, if the task - as must be assumed - is to see whether a child knows where a semi-colon goes in a sentence, then remove these artificial constraints that penalise many.

The sentence children had to punctuate did not include extra spaces where punctuation was missing. Quite rightly, as this would be too much of a clue to the answer. But when we put punctuation in our writing that's just what we do, as the punctuation mark takes up space on the page. As we know, when you try to squeeze in a mark or letter you've missed out, the result is often not that great. Children's handwriting also naturally differs in size, but here is constrained by the font used on the question paper. Again, we create an artificial situation which completely misses the point of the question.

These simple examples illustrate fundamental points. Test constructors need to be clear about what they want to measure and stick to this, removing all 'construct irrelevant variance'. If this is not done, tests are unfair and biased. Tests are also less useful to teachers who might use them to inform children's learning needs, as the reasons for a child getting a question wrong are obscured. High-stakes assessment will always be scrutinised and through this the limitations of measurement highlighted. Public understanding of these limitations is a necessary part of an accountable test system, but sometimes you look at tests and think they just should have been better.
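To make the point about consistency between markers concrete, here is a minimal sketch of one common way to quantify it: Cohen's kappa, which compares how often two markers agree with how often they would agree by chance. The function name and the marks below are hypothetical illustrations, not data from the SATs, and nothing here reflects how the SATs are actually quality-checked.

```python
from collections import Counter


def cohens_kappa(marks_a, marks_b):
    """Chance-corrected agreement between two markers (Cohen's kappa).

    marks_a and marks_b are equal-length sequences holding the mark each
    marker gave to the same set of responses (e.g. 1 = correct, 0 = incorrect).
    """
    if len(marks_a) != len(marks_b) or len(marks_a) == 0:
        raise ValueError("Both markers must mark the same non-empty set of responses")

    n = len(marks_a)
    # Proportion of responses on which the two markers gave the same mark.
    observed = sum(a == b for a, b in zip(marks_a, marks_b)) / n

    # Agreement expected by chance, from how often each marker uses each mark.
    counts_a = Counter(marks_a)
    counts_b = Counter(marks_b)
    expected = sum(counts_a[mark] * counts_b[mark] for mark in counts_a) / (n * n)

    if expected == 1:
        # Both markers gave every response the same single mark.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical marks for ten responses to the semi-colon question.
marker_a = [1, 1, 1, 0, 0, 1, 0, 1, 1, 0]
marker_b = [1, 1, 0, 0, 0, 1, 1, 1, 1, 0]
print(round(cohens_kappa(marker_a, marker_b), 2))  # prints 0.58
```

In this made-up example the markers agree on eight responses out of ten, yet the chance-corrected figure is noticeably lower, which is one reason raw agreement rates can flatter a marking scheme.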

The hidden (un)reliability of school exams

The annual furore over school league tables arrived recently. League tables indicate the proportion of pupils in a school who attain 5 or more good GCSE exam results, including English and maths. Though this year the main controversy was around which exams to include in the league tables and which not - resulting in prestigious schools such as Eton and Harrow being left at the bottom - it again raises the issue of how accurate and meaningful such results are. Challenges to the validity of league tables typically centre around them being too narrow a measure of school performance; surely schools are not there solely to produce children who have 5 or more GCSEs? Given the focus on exam attainment, there's a crucial question that's consistently overlooked - how accurate are the exam results such tables are based on?

Here's a thought experiment for you. Imagine taking an object such as a table and asking 20 people to measure how long it is to the nearest millimetre. What would you expect to see? Would each of the 20 people come up with the same result or would there be some variation?

When I've asked people to consider this or even do it, there is always some variation in the measurements. Judgements tend to cluster together - presumably indicating the approximate length of the table - but they are not exactly the same. There's error in any measurement, as captured in the saying 'measure twice, cut once'.

If there are errors or inconsistencies in measuring a physical object such as a table, what about when we are measuring aspects of knowledge and abilities using exams or other forms of assessment? Such assessments work by asking a series of questions designed to tap into relevant knowledge and to show understanding and application of principles. From the responses given we make judgements about where respondents stand on constructs we are interested in measuring; in the case of GCSEs the constructs reflect the amount of knowledge and understanding retained from instruction and learning.

This process necessarily involves a degree of judgement and inference. Markers have to evaluate the adequacy of responses given to questions and, given that an exam only covers a limited area of any syllabus, from this make an inference about the overall level of attainment in the subject area. Understanding the degree of error in any assessment is one of the cornerstones of psychometrics. Error is captured in the concept of 'reliability', which describes how accurate our assessment tools are. Effective use of tests should acknowledge error, make it explicit and treat test scores accordingly. Acknowledging error means that any test score should be treated as an indicator and not an absolute measure. This is a clear strength of the psychometric approach to measurement, which is open and honest about limitations in measurement technology, incorporating these limitations into any consideration of outcomes. Educational exams, however, consistently fail to publicly acknowledge error.

Though exam results do not come with an error warning, issues in their accuracy are regularly highlighted. Each year after exam results are announced the media picks up on 'failings' in the exam system. Usually these centre around re-marking, where parents have appealed against their child's marks and the child has been awarded a different grade after the exam paper has been reviewed. Such stories clearly highlight the issue of reliability, but quickly disappear only to be dusted off and resurface the following year. The exam system invariably takes these criticisms on the chin, wheels out the usual defences, then goes back to business as usual.

Returning to exam league tables, individual results contain error and so do groups of results. The degree of error varies according to the size of the group - that is, the number of children in a school - but this is another factor conveniently ignored in league tables. As consumers of league tables we are misled. Regardless of our views on whether the emphasis on exam results over other factors in assessing school performance is appropriate, until league tables acknowledge error in exam results they cannot be fit for purpose.
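Both ideas in this post, the table-measuring thought experiment and error depending on the size of the group, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a model of how GCSE results behave: the measurement errors are assumed to be normally distributed, each pupil's result is treated as an independent pass or fail, and all the figures (a 1,500 mm table, a 60 per cent pass rate, the cohort sizes) are hypothetical. The standard error used here reflects only chance variation in a proportion and says nothing about marking error itself.

```python
import math
import random
import statistics


def simulate_measurements(true_length_mm, n_people, error_sd_mm, seed=0):
    """Simulate n_people measuring the same table to the nearest millimetre.

    Assumes each person's error is drawn from a normal distribution with
    standard deviation error_sd_mm (an assumption made for illustration).
    """
    rng = random.Random(seed)
    return [round(true_length_mm + rng.gauss(0, error_sd_mm)) for _ in range(n_people)]


def proportion_standard_error(proportion, n_pupils):
    """Standard error of a proportion based on n_pupils independent results."""
    return math.sqrt(proportion * (1 - proportion) / n_pupils)


def approx_95_interval(proportion, n_pupils):
    """Approximate 95% interval around a school's headline proportion."""
    margin = 1.96 * proportion_standard_error(proportion, n_pupils)
    return max(0.0, proportion - margin), min(1.0, proportion + margin)


# Twenty people measure a (hypothetical) 1,500 mm table.
measurements = simulate_measurements(1500, n_people=20, error_sd_mm=2)
print("Measurements:", sorted(measurements))
print("Mean:", statistics.mean(measurements), "Spread (SD):", statistics.stdev(measurements))

# The same headline figure is less certain for a small school than a large one.
for cohort_size in (30, 100, 250):
    low, high = approx_95_interval(0.60, cohort_size)
    print(f"{cohort_size} pupils: 60% could plausibly be {low:.0%} to {high:.0%}")
```

Under these assumptions the interval narrows as the cohort grows, so two schools reporting the same percentage can differ considerably in how much weight that figure will bear.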

The pervasive effects of cognitive ability

"If you only assess one thing, use an assessment of cognitive ability." In this post I highlight the effectiveness of ability tests when used within the recruitment process, in order to determine candidate aptitude.

Online behaviour predicts personality… no big surprise

Notable pieces of academic research too often fail to receive the public attention they deserve. Not so with a recent article entitled 'Computer-based personality judgments are more accurate than those made by humans', which has attracted a flurry of media interest.

Online behaviour is being increasingly mined by psychologists, as vast amounts of information gathered through our everyday use of the internet are readily available to test out their ideas. The team in this study used data from Facebook to predict users' personality according to the Big Five dimensions (Openness, Conscientiousness, Extraversion, Agreeableness and Neuroticism). Using a sample of over 80,000 Facebook users, they first asked them to complete a Big Five personality questionnaire, then looked at how their responses related to their Facebook 'likes'. Having established links between the Big Five and Facebook likes in part of the sample, they then used this information to predict personality in further samples of users.

There is much existing research looking at how our judgments of our own personality compare to judgments made by others such as friends, colleagues and partners. This provided a useful benchmark against which the predictions generated from Facebook likes could be compared. The research team found that computer-based predictions of users' personality were significantly more accurate than human judgments. The accuracy of prediction varied according to the number of likes a user had, but it took only around 227 likes for the computer to predict a user's personality as well as a spouse - the best human judge. It took only 10 likes for prediction to be better than the average work colleague.

We're constantly told that our online behaviour reveals much about us: be careful with passwords and bank account details, and don't post anything too offensive, as you never know who's looking. But what about something as personal as our preferences and characteristics, an important part of what makes us unique individuals?

Well, the answer is that Facebook behaviour revealing something about our personality should come as no surprise at all if we stop and think about how personality assessments work. The areas of personality assessed by questionnaires - often referred to as 'constructs' or 'factors' - cannot be directly observed as they lie deep within our minds. However, where we stand on these constructs, whether we are high, low or somewhere in the middle, influences our behaviour. Anyone who has completed a personality questionnaire will probably remember being faced with a long list of questions asking about their typical behaviours, preferences and similar. Each question aims to get at a specific behaviour and tells us a little more about the respondent. Together they build an accurate picture of the respondent's personality. If personality assessment relies on knowing about behaviour, it should be of little surprise that our behaviour tells us something about our personality. If it didn't, personality assessment wouldn't work.

Like any behaviour, how we interact with technology such as Facebook reveals something about our personality. It's no great surprise that if you observe enough of a person's behaviour, whether through spending time with them, analysing their likes on Facebook or other online activity, you come to understand their characteristics. This is, in fact, one of the main reasons that personality assessments are useful; by knowing a person's characteristics you are able to predict with moderate certainty how they are likely to behave across a range of situations. Without this element of prediction, personality assessment would be no more useful than a parlour game.

The ongoing war for talent

Recruitment may be down but the war for talent remains as fierce as ever. Indeed, some employers see these difficult times as an opportunity to target competitors' employees who have been made redundant. That is one of the many interesting findings from a recent survey commissioned by StepStone recruitment (see here for the full article). The survey goes on to identify the vital role that HR can play in identifying and nurturing talent. However, with the focus ever more on the bottom line, functions such as HR are increasingly challenged to demonstrate their return on investment (ROI).

Demonstrating ROI is an ongoing challenge. Though it is relatively simple to look at the bottom line or other indicators of performance at an organisational level, identifying the impact of individuals is much harder. Consider this question: What constitutes 'success' for particular job roles in your organisation? And now this one: How well do you measure individuals against these 'success' criteria? These can be challenging questions, but it is exactly these types of question that our expertise in assessment and measuring people performance can help you answer. If you are interested in finding out how you can identify talent and demonstrate your return on investment, please contact us.

The importance of objective assessment in difficult times

Whether you prefer to call it ‘down sizing’, ‘right sizing’, ‘restructuring’ or simply making redundancies, organisations continue to shed staff in an attempt to deal with the economic slowdown. For those who have to make the difficult decision of who to keep and who to let go, the importance of objective information on which to base these decisions will be all too apparent. But how many organisations have this information readily available? And even if they do, it can be times like this that really lead us to question its quality.

We can readily rely on the usefulness of psychometric tests during selection, but they should not be used when we have information on actual job performance. In the case of redundancies we have (or at least should have) information on how employees perform and need to base our decisions on this. For our decisions to be defensible, however, they need to be based on sound objective measurement. Faced with these difficult decisions, our annual appraisal information is all too often not up to the task.

Realise Potential have experience in all aspects of employee evaluation and can provide the information you need to make robust and fair decisions about your employees for any purpose. To find out how we can help you establish rigorous performance evaluations and provide objective information to support all employee decisions, please contact us.

Slowdown in graduate recruitment

The effects of the economic slowdown on graduate recruitment are highlighted in a recent feature by the BBC, 'Half graduate recruiters cut jobs'. With vacancies down 5.4 per cent and redundancies up significantly, this means a considerable reduction in many companies' workforces. Whilst such reductions are made to ensure companies' profitability and survival, they also pose often unrecognised threats to their long-term viability.

In any pool of employees, there will be a distribution of performance: some will be stars, the majority will be around average and some will perform well below average. If recruitment practices remain static, so will the distribution of performance, meaning that a smaller workforce will contain fewer stars to drive success. As recruitment is reduced, this also has the effect of shrinking companies' talent pools. This is a long-term and potentially serious threat, as it is from these talent pools that the leaders of tomorrow are grown.

To overcome these threats, recruitment needs to get smarter. If we continue to use the same methods, we just replicate the distribution of performance in the current workforce. Here are some of our tips for tackling the recession through great recruitment:

- Review your recruitment practices to ensure they are as good as you can make them
- Identifying the right people is a science - ensure your people are trained in its methods
- Don't be afraid to invest in getting the right people in place - this is the best investment your organisation can make

To find out how our services can help you beat the slowdown, please contact us.