Progress Tests... Are They Reliable?

I recently had a small administrative problem with an institution I sometimes teach for, which ultimately made me realize for the first time why I feel uncomfortable with the term “Progress Test”.

For some time now, I have generally avoided using the term Progress Test, favoring instead something more general like “Midterm” or “Module/Unit Assessment”, but I have never really known why. I am a keen proponent of regular assessment, though I prefer practical rather than formal/standardized examinations. Of course, we take assessments primarily to gauge progress, so why the hesitation?

It was only this week that I understood my apprehension. I realized that a progress test is used to show progress made at some point during a programme of learning, be that a semester, a course or an academic year. This comes with expectations of gradually increasing scores from one test to the next. To give a specific example, the institution I was teaching in produces a chart at the end of their courses to show progress made from the “entry test” to the “progress test,” to the final test at the end of the course.

This chart was supposed to show students attaining a certain score on their entry test, a higher score on their progress test and a higher score again on their final test, proving that they had improved along the way. The conflict arose when some of my scores did not meet their expectations, with some students getting a lower score, for example, on the final test than they did on the progress test.

What do progress tests test?

The fact that some of my students got a lower score on their second test than on their first did not indicate that their ability had deteriorated over the course and there is a very simple reason for this. The second test did not test the same things as the first did.

For those progress charts to show what they are meant to show, you would have to assume that at each point along the course (at the beginning, in the middle and at the end) the students were given a general test of English and their performance was recorded. Of course, the scores one would expect to get on each of these tests, though each slightly higher than the last, would in practice be very low. A (CEFR) B1 student taking a general test of English would probably get something like 10%, and if they later took the test again at the mid and end points of a 12-week course, they might expect to reach something between 12 and 15%.

Of course, these are not the sort of results students or clients are looking for. They want to see high scores in the 80s and 90s. This means that assessments must be limited to what the students have been taught, which is a sensible approach to testing. Then it seems obvious that each test is going to differ from the others. A placement test should be somewhat general, as it is with this that we decide what level our students are suited to, so low scores here should not be seen as a problem but rather as a diagnostic. The mid-point assessment cannot test anything that has not yet been taught, as that would be unfair, so it should focus on the first half of the course. The end-point test should then similarly focus on the second half of the course—though it is a good idea to include some review of earlier concepts as well in order to check for retention.

What would the charts show?

As such, you could expect students to get a very low score on the placement test unless they were already advanced users of English. One would hope this would be followed by consistently high scores on the following tests. High scores here would demonstrate that students had successfully learned what they had been taught. By this reasoning, a lower score on the mid test followed by a higher score on the end test does not indicate progress; it indicates that the student did not fully understand the material from the first half but did better in the second half.

Accordingly, when I submitted my reports to go with the scores, I noted that in future learning, several students would benefit from a review of some of the materials taught towards the end of the course, as they had struggled with these in their assessment. Scores were mostly high in the middle of the course, and that learning later proved to be retained quite well, but several students scored lower at the end.

However logical all of this might appear, I understand full well how clients analyse these reports. Anybody looking at a sequence of percentages laid out over a given time period expects to see those percentages rising or falling in line with their targets. When this is presented to the client, it is no wonder they are disappointed to see scores trending downwards. Explaining to them what I have explained here is not the practical solution because a) it is inefficient, b) it is counterintuitive and c) the disappointment they feel when they see the charts will be hard to override no matter how well they understand the explanation. That’s just how emotions work.

So how do we solve this problem?

Ultimately, I do not endorse these charts as a valid way of summarising a course or training programme. I also do not favour percentage scores as a useful representation of learning. I find that it is far more beneficial for students if they are given practical and direct feedback after an assessment that tells them clearly what they have done well and what they need to improve upon.

Charts and numbers are visually attractive but lack substance. Instead, we should present clients and students with practical reports that respond directly to the tasks that students have completed and the knowledge and skills that they have demonstrated. Any assessment, at any stage of the course, should be based specifically on the material that has been taught; results from that assessment should show how well the student has learned said material.

The idea of steadily increasing scores from regular progress tests is a fallacy. It is implemented to give business people with business minds a pleasing visual, on the assumption that they will not understand or be satisfied with more educationally valuable feedback. I say, give them more credit. If they don’t appreciate it straight away, then help them see the benefits, but do not resort to useless figures and charts that benefit nobody.

For more on specific approaches to scoring, see my other post on Fractus Learning, ‘Acquisition and the Teacher’.

Feature image courtesy of Flickr, quinn.anya.
