
Bullock (1975)

Notes on the text
Preliminary pages Foreword, Membership, Contents, Introduction

Part 1 Attitudes and standards
Chapter 1 Attitudes to the teaching of English
Chapter 2 Standards of reading
Chapter 3 Monitoring

Part 2 Language in the early years
Chapter 4 Language and learning
Chapter 5 Language in the early years

Part 3 Reading
Chapter 6 The reading process
Chapter 7 Reading in the early years
Chapter 8 Reading: the later stages
Chapter 9 Literature

Part 4 Language in the middle and secondary years
Chapter 10 Oral language
Chapter 11 Written language
Chapter 12 Language across the curriculum

Part 5 Organisation
Chapter 13 The primary and middle years
Chapter 14 Continuity between schools
Chapter 15 The secondary school
Chapter 16 LEA advisory services

Part 6 Reading and language difficulties
Chapter 17 Screening, diagnosis and recording
Chapter 18 Children with reading difficulties
Chapter 19 Adult literacy
Chapter 20 Children from families of overseas origin

Part 7 Resources
Chapter 21 Books
Chapter 22 Technological aids and broadcasting

Part 8 Teacher education and training
Chapter 23 Initial training
Chapter 24 In-service education

Part 9 The survey
Chapter 25: I Introduction
Chapter 25: II Primary commentary
Chapter 25: III Secondary commentary
Chapter 25: IV The questionnaire forms (not online)
Chapter 25: V Technical notes (not online)

Part 10 Summary of conclusions and recommendations
Chapter 26 Conclusions and recommendations

Appendix A Witnesses and sources of evidence
Appendix B Visits made
Glossary
Index

The Bullock Report (1975)
A language for life

Report of the Committee of Enquiry appointed by the Secretary of State for Education and Science under the Chairmanship of Sir Alan Bullock FBA

London: Her Majesty's Stationery Office 1975
© Crown copyright material is reproduced with the permission of the Controller of HMSO and the Queen's Printer for Scotland.

Chapter 2 Standards of reading
[pages 10 - 35]

2.1 We have been discussing the general context of English and come now to the particular issue of standards of reading, about which a good deal of concern has been expressed. Many people who wrote to us took as their starting point the belief that standards of literacy had fallen. In the course of this chapter we shall examine the basis for this assumption by considering such objective evidence as we have been able to discover. An immediate difficulty is in arriving at a universally acceptable definition of the terms 'literacy', 'semiliteracy', and 'illiteracy', for the uncertainty surrounding them makes objective discussion far from simple. There is a good deal of emotion adhering to the terms, which too often robs them of the benefit of a true perspective. For example, in response to a survey on students' reading 52 university lecturers said all their students were 'illiterate to some degree'. One lecturer in medicine is reported to have said: 'All my students are illiterate when they come to university - the best are literate when they leave'. The 1950 Ministry of Education pamphlet Reading Ability put it succinctly: 'In truth most definitions of illiteracy amount to this - "that he is illiterate who is not as literate as someone else thinks he ought to be".'

2.2 The same document defined 'literate' as 'able to read and write for practical purposes of daily life'. And in the following year UNESCO proposed the criterion 'A person is literate who can, with understanding, both read and write a short, simple statement on his everyday life'. It is a feature of definitions of literacy that they progressively demand more of the person who is to be defined as literate. Thus, a decade later UNESCO had modified its criterion to 'A person is literate when he has acquired the essential knowledge and skills which enable him to engage in all those activities in which literacy is required for effective functioning in his group and community'. The term 'functional literacy' was used by Gray in his 1956 international survey (1) to describe the minimal level of efficiency acceptable to the society in which the individual lived. He saw this in the case of the USA as the standard that would usually be achieved by pupils in grade IV (10 year olds). But in recent years it has been argued that the threshold should be raised to at least grade IX (15 year olds), since much of the reading material to which the adult in society is exposed is written at levels of difficulty far beyond the understanding of one who can merely render print into spoken language. A telling illustration of this was provided by the 'Survival Literacy Study', conducted for the US National Reading Council in 1970. The purpose of the study was to determine the percentage of Americans lacking the functional reading skills to 'survive' as participants in the social and economic life of the country. The reading material consisted of five application forms in common use in daily life, ranked in an ascending order of difficulty. 
The results showed that 3 per cent of all Americans were unable to read adequately the form of application for public assistance, 7 per cent a simple identification form, 8 per cent a request for a driving licence, 11 per cent an application for a personal bank loan, and 34 per cent an application for medical aid. In our own country it has been suggested (2) that there are at least a million adults with a reading age (3) of below 9.0 who cannot read simple recipes, 'social pamphlets', tax return guides, claims for industrial injuries, national insurance guides for married women, and most of the Highway Code. Indeed, in the study which analysed these reading tasks it was suggested that such material, and the writing in the simplest daily newspaper, required a reading age of 13 for 'a reasonable level of comprehension'. The ability to read a newspaper is obviously one of the most basic and important purposes of the achievement of literacy. In the USA one researcher (4) took a fairly representative sample of eight articles from news publications, applied a readability test to each, and administered the tests to pupils aged 9 to 18. He calculated that readers who could not answer at least 35 per cent of the items could gain little or no information from material at that level of difficulty. Only 33 per cent of the 12 year olds and 65 per cent of the 18 year olds reached this 35 per cent level, though the pupils were drawn from middle class homes in a residential suburb. In other words, one third of all his 18 year olds were unable to read and comprehend news publications they would be likely to encounter in everyday life. Many American government bureaux, publishers and trade unions have retained consultants to help them simplify prose. But it has often been found that even when this has been done the lowest grade of difficulty at which complex subject matter can be written approximates to a reading age of about 15. 
In other words, the level required for participation in the affairs of modern society is far above that implied in earlier definitions. It is obvious that as society becomes more complex and makes higher demands in awareness and understanding of its members the criteria of literacy will rise.

2.3 It would clearly have been beyond our resources to study in depth the question of comparability with other countries, nor would it have been profitable. If definitions of literacy present difficulties it is obvious that attempts to compare standards between countries present bigger ones. In the present state of research there is little to be gained from speculation on whether any one nation has the advantage over any other. Downing (5) makes this clear when he refers to claims made in Japan, Germany and Finland that the rate of illiteracy there is exceptionally low. He points out that the very low validity of comparative statistics on literacy rates casts grave doubts on the evidence. Moreover, literacy rates expressed as percentages do not indicate actual performance levels. Two countries may claim to have illiteracy levels as low as 1 per cent, yet the actual level of reading achievement in one of the two countries may far surpass that of the other. We are faced again with the question of relativity, for one country's concept of literacy may be very different from another's. Even weaker is the subjective anecdotal evidence about the achievements of a country's children. In a recent survey (6) reading comprehension was tested across fifteen countries. It was found that 'the differences among developed countries are of rather modest dimensions ... the variations do not seem very important or readily interpretable'. We can only conclude with Downing that 'league tables' of literacy levels based on current evidence can have little validity. Nevertheless, though it is difficult to compare standards objectively between the developed nations there is no doubt that some feel a sense of urgency about their own conditions. The USA is a notable example. In 1969 the then US Commissioner of Education announced that one in four American pupils had 'significant reading deficiencies'. 
He believed there should be a major effort to ensure that by the end of the 1970s no one would be leaving school 'without the skill and the desire necessary to read to the full limits of his capability'. In the following year the Right to Read Effort was established with the purpose of ensuring that by 1980 99 per cent of all Americans under 16 and 90 per cent of all over 16 would have functional literacy. Financed by the US Office of Education, it reflects the anxiety felt in the USA about the 18 million people who have been estimated as unable to read effectively.

2.4 Various figures have been suggested for the probable total of such people in England and Wales. We referred earlier to the figure of a million as one estimate, but some people have put it at twice that number, or even higher. It is, of course, impossible to be certain. In The Trend of Reading Standards (7) 3.18 per cent of the 15 year olds in England were found to be semiliterate by the definition given in the 1950 Ministry of Education pamphlet. This defined a semiliterate as a person whose reading age was 7.0 years or more but less than 9.0 on the Watts-Vernon test. An illiterate was given as one with a reading age of less than 7.0. The figure of 3.18 per cent represents nearly 15,000 young people on the basis of the known number of 15 year olds in school in 1970. The corresponding percentage of 'semiliterates' in 1948 was 4.3. Thus for the past 23 years or so between 3 and 4 per cent of the pupil population has been leaving school with this level of attainment. Given the total of 15 year olds in each year a simple multiplication sum would produce an indication of all the semiliterate adults who have left school since 1948. The result would, of course, leave many unanswered questions. All it would tell us is that when these people left school they had obtained a low score on a particular reading test. We cannot know what has happened to them since, though it is a reasonable assumption that their reading ability has remained poor. Extrapolation to discover the numbers of those who left with a similar attainment during or before the war would obviously be very unreliable. All estimates of the number of illiterates and semiliterates in the population must therefore be hedged about with reservations. Nevertheless, it is obvious that although they represent a small percentage of the total population their numbers are considerable. 
Strictly speaking, adult illiteracy does not fall within our terms of reference; but the more closely we have examined the evidence the more certain we have become that attention to it should be included in our recommendations. For it represents in human terms the consequences for those children whom a national survey shows as a statistic.
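
The 'simple multiplication sum' mentioned in paragraph 2.4 can be sketched as follows. This is only an illustration, reading the quoted percentage as 3.18 per cent and assuming, unrealistically, that the size of the 15 year old cohort and the rate stayed constant in every year since 1948. The function names are hypothetical, and the result carries all the reservations the paragraph itself sets out.

```python
# Figures quoted in paragraph 2.4 (reading the percentage as 3.18 per cent).
SEMILITERATE_RATE_1970 = 0.0318    # share of 15 year olds classed as semiliterate
SEMILITERATE_COUNT_1970 = 15_000   # "nearly 15,000", so treat as a slight overestimate
YEARS_SINCE_1948 = 23              # "the past 23 years or so"
RATE_LOW, RATE_HIGH = 0.03, 0.04   # "between 3 and 4 per cent"


def implied_cohort_size(count: float, rate: float) -> float:
    """Number of 15 year olds implied by a head count and a rate."""
    return count / rate


def cumulative_leavers(cohort_size: float, rate: float, years: int) -> float:
    """Cohort size x rate x number of years.

    ASSUMPTION: the same cohort size and rate apply in every year,
    which was not true of the real school population.
    """
    return cohort_size * rate * years


cohort = implied_cohort_size(SEMILITERATE_COUNT_1970, SEMILITERATE_RATE_1970)
low = cumulative_leavers(cohort, RATE_LOW, YEARS_SINCE_1948)
high = cumulative_leavers(cohort, RATE_HIGH, YEARS_SINCE_1948)

print(f"Implied 1970 cohort of 15 year olds: about {cohort:,.0f}")
print(f"Leavers since 1948 at this level: roughly {low:,.0f} to {high:,.0f}")
```

The range says nothing about what has happened to these people since leaving school, nor about those who left before 1948.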

2.5 Before turning to the empirical evidence in detail we must mention another aspect of literacy which attracts a good deal of public attention. This is the influence of television, to which many references were made in the evidence we received. Some witnesses felt that the growth of visual methods of expression and communication has led to a decline in the use of language, and that this had contributed to an increase in reading difficulties. It was suggested to us that children today are more accustomed to watching television in their leisure time than to using the library for information, and that they are less likely than in the past to see their parents reading. One large education authority, itself a pioneer in educational television services, said that the hours children spent in watching television reduced their felt need to read and write. There is certainly evidence to suggest that children of school age are spending an increasing amount of time before the television set. According to a recent issue of Social Trends (8) children between the ages of 5 and 14 watched an average of 21 hours of television a week in February 1969. By February of 1973 this had risen to 25 hours. Judging from the many letters we received, there is a widespread and strongly held view that this tendency is a growing threat to the development of literacy. Such opinions are essentially subjective, and there is, in fact, very little empirical evidence to show whether television has had any effects on standards of reading or on the amount that children read. In a major study (9) carried out in this country in the 1950s Himmelweit and her colleagues studied the effect of television upon children of two age groups, 10-11 and 13-14. They had the advantage of a control group, matched by ability, social background, etc, which had virtually no access to television, a condition which can no longer be reproduced. 
The study showed that when children were first exposed to television they read fewer books, but by the time it had been in their homes for 3 years they read at least as many as before. Only the reading of comics seems to have been lastingly affected. One of the most significant conclusions was that comparatively little reading was taking place anyway. 'It's not that they used to read a great deal and then television came and destroyed the ability; they always read extremely little.' The 10+ children were found to be reading on average 2.7 books in a month, and the 13+ children 2.5.

2.6 These figures approximate to those revealed by a study of children's reading by Whitehead (10) and his colleagues. In 1971, some 16 years after Himmelweit's research, they found that the 10+ children in their sample read on average 3.0 books per month, the 12+ children 2.2, and the 14+ children 1.9. This very recent study gives the firmest available indication of a relationship between the amounts of time children spend reading and watching television: '... it is clear that the amount of television viewing accomplished by most children cannot but restrict the amount of time available for other leisure activities, including reading, and we have in fact found an inverse relationship between amount of television viewing and amount of reading'. The authors point out, however, that behind this generalisation lie many individual variations. There exists a substantial number of light viewers (defined as those who watch television less than 3 hours per weekday evening) who read little or nothing; and of heavy viewers (more than 3 hours) who read a great deal. In interviewing the children Whitehead formed the impression that they could be ranged along a continuum. At one end would be those active or hyperactive children who participate in many activities, including reading, sport, hobbies, and television watching. At the other would be the rather inert and apathetic children with few discernible interests. Looked at in these terms the amount of reading a child does is one manifestation of his or her general temperament, personality, situation, and lifestyle. This having been said, it is still possible for there to have been a general 'displacement' of one form of occupation by another. It seems likely that each child has a fixed amount of time, appetite and energy available for leisure activities in general or 'media contact' in particular. 
If new activities or new media capture part of the available time and energy they will 'displace' by the same amount those which formerly played an equivalent part in the child's life. As children grow older they read less and watch television less, in proportion to the extent their social activities take them out of the home. Social Trends No. 4 (1973) shows that among the 15-19 year olds television viewing time had dropped to an average of 17 hours a week.

2.7 Though the experimental evidence is limited we believe that the general effect of television watching has been to reduce the amount of time spent in private reading. Enough has been said to show that such a conclusion must be qualified in certain particulars. For one thing it cannot be taken for granted that if there were no television, books would automatically be the magnet. The Himmelweit study showed that television had not produced a sudden aversion from books. The Whitehead survey suggested that many children give little time to either. What cannot be known is whether there would have been a steady increase in book reading over the years if television had not intruded. It is a reasonable assumption, however, that at least a proportion of those 25 hours of weekly viewing would be spent in reading.

2.8 Another charge laid against television is that it develops a mass culture, sometimes dreary to the point of mindlessness. Many of our correspondents claimed that it not only reduced interest in the written language but debased the spoken language as well. Radio was held to be at least as guilty in this. Between them radio and television spread the catch phrase, the advertising jingle, and the frenetic trivia of the disc jockey. This is a large question, involving a discussion of cultural change which would take us beyond our terms of reference, but it is clear that the content and form of much radio and television utterance makes the teacher's job a great deal more difficult. On the other hand, both media have effects on children's language growth which are by no means always negative, a point which is developed in Chapter 6. Moreover, there is no doubt that television and radio are sources of material of the highest quality, which can and should be put to good use in schools at all age levels. Recommendations to this effect are made in Chapter 22.

2.9 There is one issue that has to be faced squarely. Nothing this Committee can say in isolation will change the viewing pattern of those evening hours which children spend in front of a set. The control of programmes lies with the broadcasters and the bodies to which they are responsible; and the control of viewing lies with the family. We share the opinion of many of our witnesses about some of the material the children see, but the problem is a difficult one. The broadcasters will say that they have to offer adult entertainment and that the parent is responsible for what he allows his child to watch. The parent may see the issue in much less simple terms; he experiences influences and pressures which make domestic censorship hard to introduce or maintain. We have to take care not to go beyond our terms of reference in suggesting that much more serious thought should be given to the influence of television on children. This is a large issue, and we are confining ourselves to that part of it which relates to the use of language. Broadcasting still contains within it a vigorous tradition of public service, and it is not insensitive to constructive comment. The professional educator is particularly well placed to offer that comment. The child who is watching 25 hours of television a week is spending almost as much time in front of a set as he spends in a classroom. That fact alone makes it a part of his experience so influential as to generate serious obligations on the part of those who provide it.

2.10 In trying to reach conclusions about standards of literacy we had access to two sources of information. One was the testimony of expert witnesses; the other the empirical evidence of surveys. Though these witnesses were not unanimous they showed a common tendency to be cautious in stating an opinion on standards. The following are from the submissions of prominent researchers, who were among the few witnesses who commented directly on standards from a study of the statistical evidence:

'There is no convincing evidence that there has been a reduction in standards. Nowadays, more people have wider needs for literacy in different contexts in everyday life, and where limited abilities occur they are brought before our attention'.

'The most that can probably be said about the movement of reading standards in the last third of a century is that there was a considerable downward movement during the war years followed by an upward movement in the 20 years after the war which may have levelled out in the last few years. Whether pre-war standards were caught up and overtaken is more difficult to say, but other evidence suggests that standards of older children are rather higher and those of younger children lower than those prevailing in the 1930s'.

'Though the NFER Report showed that the improvement in reading standards appears to have ceased, the improved standard of 1960 has been maintained. Nevertheless, more and more children are leaving infant school unable to read, and fewer teachers in junior schools seem to be equipped to teach the basic reading skills'.

Most witnesses did not commit themselves to a view, or where they did they acknowledged that it was essentially based on their own personal impressions. The unequivocal expressions of opinion were contained in the general correspondence we received, and here there was a majority view that standards had declined, or at any rate were at a standstill.

2.11 Later in the chapter we shall consider some of the reasons offered for this suggested lack of progress; but first we must examine the research evidence available to us. This is derived in the main from the series of national surveys carried out by the NFER for the Ministry of Education and later for the Department of Education and Science. The first of these took place in 1948 and has provided the basis for comparison against which subsequent results have been measured. This and later surveys were summed up in the report Progress in Reading 1948-1964 (11), prepared by GF Peaker HMI for the Department of Education and Science. It was claimed that during the 16 years of the surveys there had been an advance of 17 months of reading age for 11 year olds, and 20-30 months for 15 year olds. Not all reviewers have agreed that this represents what the report described as a 'remarkable improvement'. Several pointed out that the 1948 test scores were naturally depressed as a result of the war and that they therefore presented a low baseline which would flatter subsequent results. A fundamental reference point in the thinking of some witnesses was the pre-war situation. But it is very much open to question whether it is possible to relate present day standards to those of before the war. In 1948, when the Watts-Vernon was calibrated against those tests employed pre-war, the sample used for the comparison was a judgemental one arrived at on a local basis. The pre-war tests themselves were not standardised by means of a national sample. Thus there is no firm statistical base for comparison, and in terms of tackling today's problems it is questionable whether there is anything to be gained from attempting it.

2.12 The most recent surveys in the series, and the first since the 1966 summary, were The Trend of Reading Standards (1972) and The Reading Standards of Children in Wales (1973) (12). It should be said at once that it is not easy to make an accurate assessment of the results of such surveys without studying them in depth. Indeed, we found in taking evidence that informed people have interpreted the NFER researches in different ways. We accept both publications as responsible and accurate research reports. The limitations of their research, which we shall describe below, are fully discussed in the publications by the authors themselves. It is to the English report that this section will be largely devoted, since it has generated a good deal of concern about the reading standards of today as compared with those revealed by the 1964 survey.

2.13 The best point at which to begin is to consider the tests from which the results have been derived. The two tests, the Watts-Vernon and the National Survey Form Six (NS6), are narrowly conceived. The first was devised in 1947 as a silent reading test of the incomplete sentence type. It has 35 items and lasts 10 minutes. The second was developed in 1954 along similar lines but with more items (60) and a longer duration (20 minutes). We do not regard these tests as adequate measures of reading ability. What they measure is a narrow aspect of silent reading comprehension. This is not the place to define reading ability, which is analysed in detail in a later chapter, but we must record here our view that the tests in question are able to assess only a limited aspect of it. Both tests are technically reliable in the sense that they measure the same features to the same degree on different occasions. But their doubtful validity is now apparent, in that they measure only in part what they purport to measure*.

*'... the format of the tests does set limits on what aspects of the ability to read with understanding they can and do measure. Since the largest unit of language in these tests is the sentence, they do not measure, at least not to any significant degree, what might be called the inferential aspects of reading, such aspects as the ability to follow an argument or extract a theme'. (The Trend of Reading Standards, Start and Wells, p. 17).

2.14 The problem is therefore one of attempting to assess the product of a variety of contemporary aims and methods with instruments constructed many years ago. The Watts-Vernon test was 23 years old and the NS6 16 years old at the time of the last survey. The report gave examples from them to illustrate how they had aged: the use of such words as 'mannequin parade' and 'wheelwright' in the NS6 and 'haberdashers' in the Watts-Vernon. It pointed out that children of today were less likely to use the term 'bathing' for 'swimming', and may be unfamiliar with such expressions as 'four rules of arithmetic' and 'pacific settlement of disputes'. This is a more telling limitation than might at first sight appear. If with the passage of time even two or three of the items become less familiar the effect upon the test results could be important. The comparable mean scores obtained over 23 years differ so slightly that this kind of increase in the difficulty of a few items can have a disproportionate influence upon the result. In other words, if in these items the pupils are at an artificial disadvantage compared with their predecessors then there is an underestimate of their ability. Where changes in mean scores are extremely small from one survey to the next, every item counts.
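
The arithmetic behind 'every item counts' can be illustrated with a minimal sketch. It rests on the fact that the expected mean score on a test marked one point per item is the sum of the proportions of pupils answering each item correctly. All the proportions below are invented for illustration and are not taken from the surveys; only the test length of 35 items comes from the description of the Watts-Vernon test.

```python
N_ITEMS = 35  # length of the Watts-Vernon test as described in the report


def expected_mean_score(facilities):
    """Expected mean score: the sum of the proportions answering each item correctly."""
    return sum(facilities)


def age_items(facilities, aged_items, drop):
    """Return a copy in which the listed items have become harder by `drop`."""
    aged = set(aged_items)
    return [max(0.0, f - drop) if i in aged else f
            for i, f in enumerate(facilities)]


# HYPOTHETICAL: every item answered correctly by 60 per cent of pupils.
facilities = [0.6] * N_ITEMS

# HYPOTHETICAL: three items become dated and lose 20 percentage points each.
aged = age_items(facilities, aged_items=[4, 17, 30], drop=0.2)

shift = expected_mean_score(facilities) - expected_mean_score(aged)
print(f"Fall in mean score from three aged items: {shift:.1f} points")
```

Under these invented figures three dated items lower the mean by about 0.6 of a point with no change in the pupils' underlying ability, which matters when the differences between surveys are themselves very small.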

2.15 Another serious limitation of these tests is that they do not provide adequate discrimination for the more able 15 year old pupils. In other words, many of these pupils are capable of dealing with more difficult items than the tests contain. A fuller account of this 'ceiling effect', as it is called, is given as an annex. The 'ceiling effect' was noted as a defect of the Watts-Vernon test as long ago as 1956, and indeed the introduction of the NS6 was a response to it. There is now evidence to show that the 'ceiling effect' of the NS6 itself is causing problems, since at the senior level in particular many of the items are too easy. Indeed, according to the Welsh report it is perhaps now more serious than for the Watts-Vernon. The NS6 test was used alone in the 1972 survey (13) of reading standards in Northern Ireland. Because of the closeness of the achieved sample to the design sample it is possible to assess the special features of NS6 with somewhat more confidence than is possible from the English and Welsh surveys. And the most significant feature to emerge is the extent to which NS6 fails to allow able 15 year olds to score at a level which reflects their ability.

2.16 A histogram from the Northern Ireland report has been reproduced as Diagram 7 in the annex. This represents the scores for all the 11 and 15 year olds. The difference in the distribution is remarkable. It will be noted that the scores of the 11 year olds are well spread out but that those of the 15 year olds are 'piled up' towards the top end of the scale. This is a clear indication that the more able 15 year olds found the test too easy. The author of the report concludes that 'scores for such pupils may be artificially depressed by the low ceiling of the test. It also follows that the test discriminates adequately only among the pupils whose reading scores are in the lower half of the range'. Some idea of what this suggests about the performance of the more able readers can be gained from a scrutiny of the later items in the test. Quite apart from making demands upon vocabulary they require the reader to handle complex abstractions with some confidence. And yet questions of this calibre have proved too easy to stretch the older and brighter children. When these children are achieving near maximum scores there is little scope for them to improve their performances and thus to affect the mean score. If in their case the kind of ability measured by the tests were improving, the tests themselves would probably be incapable of detecting it. This fact, and the ageing of the tests, would be sufficient to produce a levelling-off in the rate of increase in scores. The principle of extrapolation cannot be applied indiscriminately. It is not necessarily the case that a well-established trend in a certain direction must continue almost indefinitely. Thus, the roughly uniform increase of mean scores on the tests through the 1950s and into the early 1960s encouraged expectations of continued increase at the same rate. This clearly could not happen. 
Improvements made by the poorer performers can raise the mean score for the age group, but the increase over the years can hardly be expected to go on at the same rate. For these reasons it is important to consider not only the mean scores but also the distributions of scores on tests which are suspected of having these limitations.
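
The way a low ceiling hides improvement among the ablest pupils can be shown with a small sketch. The 'latent' scores are hypothetical figures standing for what pupils might score on a test with no upper limit; only the maximum of 60 reflects the number of items in the NS6. The point is the mechanism, not the numbers.

```python
CEILING = 60  # the NS6 has 60 items, so no pupil can score higher


def observed(latent_score, ceiling=CEILING):
    """Score actually recorded: the latent score, cut off at the test maximum."""
    return min(latent_score, ceiling)


def mean(values):
    return sum(values) / len(values)


# HYPOTHETICAL latent scores for eight pupils, the ablest above the ceiling.
latent_before = [20, 30, 40, 50, 58, 60, 62, 65]

# HYPOTHETICAL: every pupil improves by the same amount.
TRUE_GAIN = 5
latent_after = [s + TRUE_GAIN for s in latent_before]

observed_gain = (mean([observed(s) for s in latent_after])
                 - mean([observed(s) for s in latent_before]))

print(f"True gain in every pupil's ability: {TRUE_GAIN} points")
print(f"Gain visible in the mean score:     {observed_gain:.2f} points")
```

With these invented figures a uniform gain of 5 points shows up as a rise of only 2.75 in the mean, all of it contributed by the pupils below the ceiling. This is why the distribution of scores has to be examined alongside the mean.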

2.17 In both the English and Welsh surveys the sampling was inadequate in a number of respects. This was not the fault of the NFER researchers - a postal strike played havoc with their plans - but the reports make clear the reservations of their authors. In the English survey only 60 per cent of the secondary schools were able to reply before the strike began, and 7 per cent of these declined to take part. The result was a sample of secondary schools numbering just over a half of those selected for inclusion. In such circumstances was the achieved sample still representative and random? Moreover, it was further affected by a high degree of pupil absenteeism, due largely to the fact that the testing took place in the last fortnight of the Lent term. There was some evidence to show that the proportion of Easter leavers absent for the tests was much greater than the corresponding proportion of pupils who were not leaving at Easter. This could have the effect of reducing the number of less able pupils taking part. The 1960 and 1961 senior surveys took place in the middle of the autumn term, a time when the absentee rate of the less able was probably lower. The obvious inference is that the 1971 sample estimates for the 15 year olds could be artificially high, since if the absent pupils had in fact participated their scores might have lowered the mean. Start and Wells express it thus:

'... we must suspect that the less able are under-represented in our 1971 samples to a greater extent than in those of the last two surveys. In this case, the sample estimates for the present survey would be spuriously high in relation to those of the 1960 and 1961 surveys. Unfortunately it is not possible to estimate the extent to which the 1971 sample estimates may be spuriously high ... The extent may not be that great ... the Watts-Vernon test sample mean for all seniors would have been, at a very rough guess, 0.2 points of score less than the mean actually obtained. The corresponding figure for the NS6 would be 0.3 points of score lower'.
There is one further factor which may have affected the estimate of standards. This is the possibility that the 11 year olds were handicapped by lack of familiarity with objective-style tests. Start and Wells observe that the children's 'test sophistication' may have been lower in 1970 than six or ten years earlier, when the 11+ examination was widespread and they were more accustomed to meeting tests of this kind.

2.18 In view of all these doubts and caveats it might be wondered whether any firm conclusions can be drawn from the most recent NFER survey, and whether any safe comparisons can therefore be made with earlier surveys. There is, however, a degree of independent confirmation which has been little remarked. In the first place, the movement of scores obtained in England is comparable with those obtained in Wales, and this is evident at both senior and junior levels. In the second place, there is a degree of independence in the results obtained from the Watts-Vernon and NS6 testing of juniors. This is because separate but parallel samples of schools were drawn at the junior level in both the English and Welsh surveys. One sample was given the Watts-Vernon test, the other the NS6. When the trends are repeated in separate samples they give more grounds for confidence in them.

2.19 We have said enough about the limitations of the results derived from the national surveys, and we must add that it is not the fault of the authors that many people have ignored their reservations. Having expressed our own reservations about the tests and sampling we now turn to the tables of results. Reproduced below are the most important tables from the English report:

15 YEAR OLDS

Comparable mean scores with standard errors (14) for pupils aged 15.0 years. Watts-Vernon Test (Maintained schools and direct grant grammar schools).

Table 1

Date of Survey    1948     1952     1956     1961*                 1971
Mean score        20.79    21.52    21.71    (a) 23.6, (b) 24.1    23.46
Standard error    0.37     0.20     0.26     0.14                  0.26

*Although only secondary modern and comprehensive children were tested in 1961, these figures are estimates of total school populations: (a) taking other schools at the 1956 level, (b) supposing other schools made the same advance as secondary modern schools between 1956 and 1961.

(Table 3.3 from The Trend of Reading Standards NFER).

Comparable mean scores with standard errors for pupils aged 15.0 years. NS6 test (Maintained schools only).

Table 2

Date of Survey    1955*    1960     1971
Mean score        42.18    44.57    44.65
Standard error    0.64     0.73     0.83

*England and Wales - scores would probably be slightly higher (0.20?) if England only were taken.

(Table 3.4 from The Trend of Reading Standards NFER).

11 YEAR OLDS

Comparable mean scores with standard errors for pupils aged 11.0 years, since 1948. Watts-Vernon test (Maintained schools only).

Table 3

Date of Survey    1948     1952     1956     1964     1970
Mean score        11.59    12.42    13.30    15.00    14.19
Standard error    0.59     0.30     0.32     0.21     0.38

(Table 3.1 from The Trend of Reading Standards NFER).

Comparable mean scores with standard errors for pupils aged 11 years 2 months, since 1955. NS6 test (Maintained schools only).

Table 4

Date of Survey    1955     1960     1970
Mean score        28.71    29.48    29.38
Standard error    0.55     0.52     0.92

(Table 3.2 from The Trend of Reading Standards NFER).

Our considered view is that the results of the 15 year olds, presented in Tables 1 and 2 above (Tables 3.3 and 3.4 of The Trend of Reading Standards), are not disturbing in themselves, having regard to the limitations to be found in the tests. Scores on NS6 continue their slight increase, while those on Watts-Vernon increase until 1961 and then decrease slightly by 1971. The point to be emphasised is that the changes in the scores on both tests in the last decade are not large enough to be statistically significant; and the most reasonable conclusion is that the standards of 15 year olds have remained the same over the period 1960-71. The authors of the 1970/71 survey report made the point that 'there seems to have been a steady increase in the weak tail of the 15 year olds over the past 23 years'. This statement is easily misinterpreted, for it refers to the shape of the frequency distribution and not to the actual mean scores of the poorer readers. In the earlier surveys the range of scores which separated the best from the average reader was much the same as that separating the average reader from the weakest. But in later surveys the 'ceiling' of the maximum score has restricted the range of scores available to the above average and average readers. This has resulted in the spread of scores covered by the poorer readers, 'the weak tail', becoming relatively more pronounced. We therefore cannot accept the suggestion made to us in evidence that 'at the bottom of the scale a group approximating to a sixth of the school population has been showing a slight but steady decline in attainment for the last 23 years'. Inspection of Table 3.6 and Figure 3.4 of The Trend of Reading Standards shows that the lowest standards have clearly risen during this period. Similarly, an examination of the difference in scores between the 10th and 90th percentiles at the age of 15 shows that this difference has decreased from 18.8 points of score in 1948 to 17.5 points in 1971. 
This apparent narrowing of the gap between the poor and the good reader emphasises the restrictions of the maximum score on the test used. The 'ceiling' has pegged back the top sixth, and the bottom sixth have been catching up as their scores improved*.

*This increase in the scores of the lowest achievers must not be confused with the issue of whether there is a rising proportion of poor readers among the children of unskilled and semi-skilled workers. This important question is considered in detail later in the chapter.
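
The way in which a test ceiling can narrow the gap between the 10th and 90th percentiles, even when every reader improves by the same amount, can be illustrated by a small simulation. It is a sketch only. The normal shape of the distribution, the means of 22 and 28, the standard deviation of 7 and the sample size are all invented for the purpose and are not survey figures; only the maximum score of 35 is taken from the Watts-Vernon test as described in the annex.

```python
import random
import statistics

CEILING = 35  # maximum possible score on the Watts-Vernon test


def simulate_scores(true_mean, sd=7.0, n=20000, seed=0):
    """Draw hypothetical 'true' attainments and hold each within the test's range."""
    rng = random.Random(seed)
    return [min(max(rng.gauss(true_mean, sd), 0.0), CEILING) for _ in range(n)]


def percentile_gap(scores):
    """Difference between the 90th and 10th percentile scores."""
    deciles = statistics.quantiles(scores, n=10)  # nine cut points
    return deciles[8] - deciles[0]


# Same seed, so the later group is the earlier group with 6 points added
# to every pupil's attainment before the ceiling is applied.
earlier = simulate_scores(true_mean=22.0)  # hypothetical earlier survey
later = simulate_scores(true_mean=28.0)    # hypothetical later survey

gap_earlier = percentile_gap(earlier)
gap_later = percentile_gap(later)
print(f"10th-90th gap, earlier survey: {gap_earlier:.1f}")
print(f"10th-90th gap, later survey:   {gap_later:.1f}")
```

In this invented case the underlying spread of attainment is identical in both years, yet the measured gap is smaller in the later one because the 90th percentile is held at the maximum score while the 10th percentile is free to rise.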
2.20 At junior level the following results are to be found:
(a) A steady increase of 3.41 points on Watts-Vernon from 1948 to 1964, then a decrease of 0.81 points between 1964 and 1970.

(b) On NS6 an increase of 0.77 points between 1955 and 1960, then a decrease of 0.10 points between 1960 and 1970.

The picture here is thus one of steadily rising scores through the 1950s to a peak in the early 1960s, and a slight decline in scores (0.81 and 0.10) by 1970. We have also had access to the findings of other surveys of reading standards, but before discussing these we must add a word about how their results should be evaluated. If national standards are under discussion, then one needs either a national sample or a collection of a large number of local surveys which taken together represent accurately the national population. Such a collection does not exist, and it would in any case be extremely difficult to construct an adequate national sample from a series of local studies. One of the reasons why it is so easy to form a distorted view of the national picture is the fact that local surveys are often carried out in large urban areas not typical of the national situation.

2.21 Furthermore, as any survey obtains its results on the basis of sampling from a given population, it is very important to bear in mind any changes over time in the characteristics of the population from which the sample is drawn. If this population alters between surveys the trends they identify will be partly, and even mainly, sociologically determined. It will thus be extremely difficult to isolate valid interpretations of changes in standards of reading, or for that matter of other cognitive abilities. One good example of a changing population is to be found in recruitment to the army, an organisation which monitors its intake so that year by year it is possible to record the mean score of each year's entrants. But little is known about the population of potential army recruits of which these intakes are samples. Indeed it is highly probable that the nature of the intake is decided by economic factors and other causes which fluctuate considerably in the short term. In times of economic boom the pool of potential army recruits is likely to be greatly reduced. Thus the mean scores recorded each year are derived from samples of a changing population, so that one cannot interpret trends with any certainty. A similar argument would apply to business concerns, or even individual schools, which draw their intake from an unspecified population subject to changes wrought by economics, the movement of social groups, or other influencing factors. It is clear, then, that local surveys have severe disadvantages for determining trends in national standards and that the basic difficulty is the well known one of estimating the standards of a definable population from a sample. National trends can only be properly evaluated by national surveys properly designed.

2.22 That having been said, the converse must be recognised. A national survey does not tell us anything about the circumstances of any given area. Local surveys have the considerable virtue that they allow the study of those psychological and sociological differences which might be ignored, avoided, or simply masked by national averages. They can complement the broad generalisations derived from a national survey and identify important local problems which might have equally important consequences for the nation as a whole. The survey carried out by the Inner London Education Authority is a case in point. Inner London is particularly atypical of the country at large. Totally urban, the population is skewed in social class towards the lower income groups, and the proportion of immigrants at the time of the survey was 17 per cent. The 1971 ILEA investigation was a follow-up of children who had been tested in 1968 when aged 8+. At the age of eleven 26,202 children were retested on two parallel versions of an NFER sentence completion test. (The results of these cannot, of course, be directly compared with those derived from the Watts-Vernon and NS6). It was found that the standardised scores of the children at 8+ and 11+ did not alter significantly; the change was from 94.6 to 94.9. It will be noted that the scores were markedly below the average (100) which was obtained by children of the same age when these tests were standardised on a national basis. Within the apparent stability from 8 to 11, however, the proportion of poor readers in the semi-skilled and unskilled groups increased from 17.9 per cent and 25.9 per cent respectively in 1968 to 22.0 per cent and 28.8 per cent in 1971.

2.23 In Aberdeen (15) in 1972 over 2,500 children were tested, representing 99 per cent of the two year groups concerned: 8 and 11. This was a repeat of an assessment carried out 10 years earlier in 1962, when the same coverage of 99 per cent was obtained and the same tests used: NFER (Sentence) Reading Test AD and Test NS6. The results of this survey were of particular interest to us in two respects. Firstly, they are very much in line with those of the English survey, a fact to which the authors themselves draw attention: 'The NFER findings on recent trends in reading standards (i.e. in England) are confirmed by this study. While at the age of 8 years, the standard of performance in reading comprehension (in Scotland) is relatively unchanged, at age 11 there has been a slight decline in average standard. The difference between the 1962 and 1972 averages, however, is small - only two thirds of one point of score in a test with 60 items. It would be reasonable to conclude therefore that standards are essentially unchanged over the past ten years'. Secondly, the results provide an interesting analysis by social class. It was found that among children with fathers in professional or managerial jobs the average standard had improved, or at least been maintained. But among those with fathers in semi-skilled or unskilled jobs the average performance at 11 was seriously below the standard of the equivalent social group 10 years earlier.

2.24 As the national surveys in England and Wales did not record the social class of the pupils such information could not be obtained from them. However, there is further evidence to be obtained from other sources which confirms this relationship. For example, the National Child Development Study (16) revealed that 48 per cent of the children from social class V were poor readers at 7, compared with 8 per cent in social class I. Several studies have shown that the position worsens as the children grow older, there being a progressive decline in the performance of children of lower socio-economic groups between the ages of 7 and 11. The Educational Priority study (17) was applied to four educational priority areas, three in inner-city areas and the fourth in two small economically depressed mining towns. The children were given an NFER sentence completion test, and a score of 80 was taken to distinguish non-readers or virtual non-readers. Excluding the immigrant children it was found that the proportions of children in this category in the four areas were 19 per cent, 35.8 per cent, 21.7 per cent and 17.7 per cent. The researchers concluded that the overall performance in EPAs [Educational Priority Areas] is not pulled down by a very low set of scores from a small group in an otherwise normal population. On the contrary, the results display a much more general pattern of low attainment, with very few children falling in the higher scoring groups.

2.25 We are aware that the effects of social class are not specific to reading. Several studies have shown that the correlation is with attainment in general. For example, in research carried out for the Plowden Committee (18), the most powerful variable was found to be the School Handicap Score (SHS), a weighted sum of Father's Occupation, Father's Education, Mother's Education, Number of Books in the Home, and (minus) the number of siblings. Nevertheless, the relation with reading is a reality, and it seems to be universal. In a study in Swedish elementary schools Malmquist (19) found a distinct association between reading ability and social group; and evidence of a wider significance emerges from the comparative study of reading comprehension in 15 countries, in which the SHS was again used. Differences in achievement among the developed countries were of modest dimensions, but they all had in common this important feature: that a child's family background gives a clear prediction of his achievement in reading at age 10 and age 14. There is evidence to suggest that the slight decline in scores at 11 years of age in the 1970/71 survey may well be linked to a rising proportion of poor readers among children of semi-skilled and unskilled workers. Such an influence does not, of course, conflict with our earlier observation that there has been no decline in the attainment of the bottom sixth of the school population. It is perfectly possible to find an increase in the proportion of children from a low socio-economic group and at the same time a narrowing of the gap between the best and the weakest readers. The two ideas are really unrelated, since one refers to proportions, the other to differences in the range of scores. 
What appears to be happening is that while reading standards at the lower end of the ability range have improved in most socio-economic groups, the poor readers among the children of the unskilled and semi-skilled have not improved their standards commensurately. The result is that the lower end of the ability range has an increased proportion of these children.

2.26 There remains the question of the national reading standards of 7 year olds. It is not, of course, the practice to carry out national surveys of the reading attainment of children of this age, and there is therefore no comparable evidence about their standards from which to draw conclusions. Nevertheless, there is much conjecture about the reading ability of 7 year olds today as compared to that in previous years. It is clear that for many people this is the age which causes the greatest concern. One recurring point in the correspondence we received was the belief that there is an increasing tendency for children to pass from the infant to the junior stage without a good grounding in reading. The research study (20) most often quoted is one carried out in the West Midlands in which it was found that between 1961 and 1967 the percentage who had not started to learn to read had risen sharply. In the earlier year it was true of 25 per cent of children in the sample of 2,000; by the end of the period the figure had risen to 40 per cent. There is a similar indication in the National Child Development Study (21), which said that in 1965: '... some 10 per cent of 7 year olds in the final term of their infant schooling had barely made a start with reading. A further 37 per cent had progressed beyond this stage but continued to need specific help'. Both these studies were based on substantial samples and they can be taken as good indicators of a general situation. Their results are compatible with those of Morris (22) in Kent, where 45 per cent of first year juniors still required teaching help of the kind normally given in the infant school, and 19 per cent were virtually non-readers. Other local studies appear to substantiate these findings. Bookbinder (23) refers to three such studies - in Brighton, Salford, and Bristol - and suggests that children now start to read later but then make more rapid progress than in former times. 
From the evidence available there seems to be a prima facie case for saying that children of 7 are not as advanced as formerly in those aspects of reading ability which are measured by tests. It can be put no more strongly than that, for the evidence has obvious limitations for purposes of generalisation. If there are doubts about the fact, there are even greater doubts about the putative cause. Much public comment is quite categorical in ascribing it to poor teacher training, or to a neglect of reading in favour of creative activities, which is cited as one of the effects of 'progressivism'. These propositions are examined in the appropriate chapters, but it is worth anticipating briefly in the case of the second, since this is the one most commonly advanced. In our survey we tested the hypothesis that infant schools neglect reading practice, and the relevant tables of results are reproduced in Chapter 13. A questionnaire of this kind obviously cannot assess quality, but it will be seen that in quantitative terms reading featured prominently in the infant schools in our sample, however they were organised. Our own visits to schools and our discussions with HM Inspectors confirm us in the belief that infant schools take seriously their responsibility for teaching children to read, though there are, of course, considerable variations in the extent to which they are successful.

2.27 There is a strong belief, reflected in much of the testimony we received, that if children surge ahead in their first or second year in the junior school they have not lost by starting late. This argument can be examined against a comparison of the West Midlands Study and National Child Development Study findings with those of The Trend of Reading Standards. The samples are by no means a perfect fit, and any comparisons must be treated with appropriate caution. Allowing for this, however, it might be expected that the results of the first two would be reflected in the third. That is to say, a poor showing at 7 at the time of the earlier studies should predict a decline in standards at 11 at the time of the next national survey. To an extent, the prediction has proved true, in the sense that the progress shown in earlier national surveys has not been maintained. On the other hand, it could be argued that the performance of the 11 year olds is very much better than might have been expected from the prognostications, and that a good deal of productive learning has taken place in the junior years. Generalisations from this kind of comparison must not be taken too far. Between 1964 and 1970/71 the mean score of the 11 year olds decreased by 0.81 of an answer on the Watts-Vernon test, and between 1960 and 1970/71 by 0.1 of an answer on the NS6 test. It would clearly be impossible to evaluate this degree of decline in terms of the levels of achievement at 7 and the learning experiences of the children in the intervening years. The most that can be done is to consider the strength of alternative interpretations. As we have seen, one view would have it that there is no cause for alarm; that the decline at 11 is so slight as to be of little consequence, and that it does not justify increasing the pressure in the infant school. 
The other would say that anything less than a continued rise at 11 is evidence of falling standards, and that the junior schools are not properly equipped to make good the deficiencies of children who come to them as non-readers.

2.28 In Morris's study, cited above, it was found that 76 per cent of the junior school teachers in the sample had received no training in infant methods. 52 per cent lacked any infant school experience, and 18 per cent had no knowledge of how to teach children to read. The ILEA Literacy Survey showed that only 1 in 8 of the junior teachers had received specific training in reading techniques. Smaller-scale studies have pointed to similar conclusions and have suggested that as a general rule the junior school teacher is not equipped to cope in an expert fashion with children who have not made a start in reading. In evidence to the Committee many junior school heads have agreed that this is a difficulty with which they are faced. Their teachers have not received the training to enable them to assess when a child should have acquired a particular 'learning set' in reading, and how to contrive that he does. What is of particular concern to us is that the child who has not started to read might come to be regarded from an early stage as a 'remedial case'. Where the majority of children have made a good start to reading, and where the class teacher makes no claim to be able to teach the beginning stages, it is all too likely that the remainder will be 'withdrawn' for 'remedial' treatment very early in their junior school life. We would not be so unrealistic as to believe that every child should be a competent reader on leaving the infant school. But we would certainly be unhappy with a situation where the foundations of reading were not thoroughly laid there.

2.29 In summing up our conclusions about standards it is necessary to return first to the national survey. Despite all the reservations about the tests, absenteeism, and the size and nature of the achieved sample, the results of the NFER survey still provide the best estimate of the country's reading standards. The remainder of the evidence has, on the whole, given us confidence in our interpretation of this estimate.

At the age of 11 no significant change in reading standards over the decade 1960-1970 emerges from the NS6 survey. But the movement in Watts-Vernon scores from 1964 to 1970 just achieves significance (at the 5 per cent level), so that such movement as did occur was in all probability downwards. The indications are that there may now be a growing proportion of poor readers among the children of unskilled and semi-skilled workers. Moreover, the national averages almost certainly mask falling reading standards in areas with severe social and educational problems.
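
The size of these movements relative to their sampling error can be checked roughly from the published means and standard errors. The sketch below assumes that the two surveys being compared are independent samples and that a normal approximation is adequate; neither assumption is stated in the survey report as quoted here, and the function name is illustrative only.

```python
import math


def z_for_change(mean_then, se_then, mean_now, se_now):
    """z statistic for the change between two independent sample means."""
    se_diff = math.sqrt(se_then ** 2 + se_now ** 2)  # standard error of the difference
    return (mean_now - mean_then) / se_diff


# Means and standard errors as given in Tables 3 and 4 (11 year olds).
z_wv = z_for_change(15.00, 0.21, 14.19, 0.38)   # Watts-Vernon, 1964 to 1970
z_ns6 = z_for_change(29.48, 0.52, 29.38, 0.92)  # NS6, 1960 to 1970

print(f"Watts-Vernon: z = {z_wv:.2f}")
print(f"NS6:          z = {z_ns6:.2f}")
```

On these assumptions the Watts-Vernon change gives a z of about -1.87, which in magnitude exceeds the one-tailed 5 per cent critical value of 1.645 but falls short of the two-tailed value of 1.96, while the NS6 change gives a z of about -0.09. Which convention the authors of the survey applied is not stated in the text above.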

At 15 years of age reading standards, as measured by the tests, have remained approximately the same over the period 1960-1971, again after an earlier period of steady increase from the 1948 baseline. We believe that the ceiling effect of the tests is causing such distortion in the score distributions that the computed mean scores must be viewed with considerable suspicion. The statistical results from the survey at both age points are not greatly disturbing, but neither do they leave room for complacency. We do not believe it is sufficient to rely on a 1948 baseline for measuring the movement in reading standards; nor are we satisfied that the present methods of monitoring them are adequate. We accordingly recommend that a new system of monitoring should be introduced.

2.30 In the chapters which follow we advocate those measures which we believe will lead to the development of the complex of skills that go to make up literacy. Reading must not be thought of as an uncomplicated skill like walking, acquired when young then left to look after itself. Reading, writing, talking and listening are associated abilities which the school should go on developing throughout a pupil's educational life. Teachers can do this only if they understand these abilities, and that means recognising them as an area of learning which demands expert knowledge. In the secondary school it means an end to the ill-informed view of English that because anyone can speak it anyone can teach it. And it means that all teachers should be made aware in their training of the complex role that language plays in their work, whatever they are teaching. Literacy is a corporate responsibility, in which the leadership should be provided by teachers with specialist knowledge but in which every other teacher shares. Standards will not be raised if the responsibility is seen as falling to a small part of the teaching population. To blame the infant teacher for every 'failed' reader is to misunderstand what reading is all about. To blame the English teacher for every mistake a pupil makes is to misunderstand how language and learning interact. Literacy demands a continuity and community of endeavour.

ANNEX: THE 'CEILING EFFECT' OF READING TESTS

Copyright
We are grateful to the following copyright holders for permission to reproduce diagrams: the Book Publishing Division, National Foundation for Educational Research for The Trend of Reading Standards, and the Northern Ireland Council for Educational Research for Reading Standards in Northern Ireland.

2.31 Suppose we want to measure attainment trends in reading over a long period of time. We do not know the shape of the distribution in the population, but for simplicity we assume it to be at least roughly normal (bell shaped). What are the requirements of the measuring test?

The most fundamental is that it should be valid, in which case the test scores will be distributed over the full range of attainment. Should attainment improve, the range will either be extended upwards or shift bodily upwards, or both. Therefore there should be sufficient room upon the measurement scale for the whole distribution to move along it if improvement should occur in all parts of the range.

Diagram 1 illustrates this, exaggerating the movement.

Diagram 1 Movement of distributions of scores showing no ceiling effect

M1 = First Mean; M2 = Second Mean

This may be represented by means of percentile curves which are typically S-shaped when reflecting a normal distribution. See Diagram 2 below:

Diagram 2 Movement of cumulative distribution of scores showing no ceiling effect

Now suppose we do not have a scale which has sufficient headroom for the whole distribution to move upward in score, since the more able pupils are capable of scoring more than the maximum number of items provided. What would be the effects?

2.32 The effects would be noticeable in the distribution of scores as a 'piling up' of scores at the ceiling of the test, shown here in exaggerated form:

Diagram 3 Movement of distribution of scores showing a ceiling effect

Those pupils who would have been capable of scoring in region x would be held to a score below the ceiling limit. In tests of the multiple choice form and/or with guessing corrections such pupils would not simply all score the maximum. (In a large sample some fluctuation could be expected if the test were retaken, as those who guessed right on one occasion might be wrong on another.)

In the NS6 a guessing correction is employed that terminates scoring if seven successive responses are incorrect. Only correct scores before the incorrect sequence of seven are counted, and those occurring after it are discounted as being probably guesses. Now it may happen that a pupil who answers correctly up to a number less than seven from the maximum could well continue to score beyond the maximum if additional items were available. Thus a small proportion who score as low as 54 on the NS6, which has a 60 maximum, are potential scorers in region x. The proportion is likely to be small because the NS6, like the W-V, is a highly reliable test; that is to say, the fluctuation of scores on retesting would probably be low. In terms of the shape of the percentile score curve the effect would be as shown below:

Diagram 4 Movement of cumulative distribution of scores showing a ceiling effect

The top of the 'S' vanishes and there is 'bellying' of the lower portion of the curve. The mean would lie below the median and not on it as it would have done in an undistorted distribution.
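
The NS6 guessing correction described above can be sketched as a simple scoring rule. This is an illustration under stated assumptions and not the published marking procedure: the function name is hypothetical, and an omitted response is assumed to count as incorrect.

```python
def ns6_score(responses, stop_after=7):
    """Score a list of responses given in item order (True = correct).

    Scoring terminates once `stop_after` successive responses are incorrect;
    correct responses after that run are discounted as probable guesses.
    Assumption: an omitted item is recorded as False (incorrect).
    """
    score = 0
    wrong_run = 0
    for correct in responses:
        if correct:
            score += 1
            wrong_run = 0
        else:
            wrong_run += 1
            if wrong_run == stop_after:
                break  # everything after the run of seven is ignored
    return score


# A pupil correct on the first 54 of the 60 items and wrong on the last 6:
# the run of errors never reaches seven, so the test ends before the rule
# can operate.
near_ceiling = [True] * 54 + [False] * 6

# A pupil with seven successive errors after item 10: later correct
# responses are discounted.
stopped_early = [True] * 10 + [False] * 7 + [True] * 43

print(ns6_score(near_ceiling))
print(ns6_score(stopped_early))
```

The first pupil illustrates the point made in the text: with fewer than seven items remaining, nothing in the record shows whether that pupil would have gone on scoring had more items been available.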

What effects would the distortion produced by the ceiling have on the estimates of population parameters?

(a) As the distribution was no longer normal in form the sample standard deviation would not denote the same idea of measuring a symmetrical spread about the mean.

(b) As the sample standard deviation would be suspect, so too would the estimate of Standard Error which gives confidence in the probable position of the estimated mean for the population.

(c) The mean itself, though still computable, may be seen no longer to reflect the true attainment of the sample and hence no longer to provide an accurate estimate of the value within the population from which the sample is drawn.
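
Points (a) to (c) can be illustrated numerically. The sketch below assumes a normally distributed 'true' attainment with a mean of 50 and a standard deviation of 10; these figures and the sample size are invented for illustration, and only the maximum of 60 is taken from the NS6 as described above.

```python
import random
import statistics

CEILING = 60  # maximum score on the NS6

rng = random.Random(1)
# Hypothetical 'true' attainments, unrestricted by the length of the test.
true_scores = [rng.gauss(50.0, 10.0) for _ in range(20000)]
# Scores as the test would record them: held between 0 and the ceiling.
observed = [min(max(s, 0.0), CEILING) for s in true_scores]

true_mean = statistics.mean(true_scores)
obs_mean = statistics.mean(observed)
obs_median = statistics.median(observed)
true_sd = statistics.stdev(true_scores)
obs_sd = statistics.stdev(observed)

print(f"true mean {true_mean:.2f}, observed mean {obs_mean:.2f}")
print(f"observed median {obs_median:.2f}")
print(f"true SD {true_sd:.2f}, observed SD {obs_sd:.2f}")
```

In this invented case the observed mean falls below both the true mean and the observed median, and the observed standard deviation is smaller than the true one, so that a standard error computed from it would also be affected.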

2.33 Let us now examine how this exposition throws light on the results obtained with the W-V and NS6 tests in the recent national surveys.

Figs 3.3 and 3.4 from page 38 of The Trend of Reading Standards show the situation on the Watts-Vernon test for 11 year old and 15 year old children respectively. Diagrams 5 and 6 reproduce these figures. Diagram 5 below (Fig 3.3) shows a parallel set of S-curves moving across the scale as standards rise. There is no ceiling effect with these 11 year olds, because even in 1964 the highest scoring pupil scored only 27, the maximum possible being 35.

Diagram 5 Scores in the reading tests of 1948-70 inclusive (11 year old pupils in maintained schools)

(Figure 3.3 from The Trend of Reading Standards NFER)

Diagram 6 (Fig 3.4), on the other hand, shows that even in 1948 some 15 year old pupils were scoring the maximum. In succeeding years the 'top of the S' has almost disappeared and 'bellying' has increased as the distortion of the distribution of scores has become more marked. There is thus a definite ceiling on the W-V at age 15, a fact which has been known since the early 1950s.

Diagram 6 Scores in the reading tests of 1948-71 inclusive (15 year old pupils in maintained schools)

(Figure 3.4 from The Trend of Reading Standards NFER)

Start and Wells give no comparable graph which presents a similar family of curves for the NS6, but they do give data for 1971 (see diagram 8).

The Northern Ireland survey of 1972 supplies a distribution of scores on NS6 at 15. Fig 3 from page 9 of that survey is reproduced here as diagram 7. There is a clear difference between the roughly normal distribution of reading attainment at age 11 and the distorted and curtailed distribution at age 15. The bell-shaped distribution of 11 year old scores is seen best if the page is turned sideways.

Diagram 7 NS6 score distributions for 11 and 15 year olds in Northern Ireland

(Figure 3 from Reading Standards in Northern Ireland NFER)

The following diagram depicts the percentile distribution for 15 year olds on both the NS6 and W-V in the English survey of 1971. On the larger scales used in this diagram one can see that there is still some 'top to the S' left on the W-V but none at all on the NS6. The ceiling effect of the NS6 is more severe at 15 than that of the W-V.

Diagram 8 Percentile distribution of scores for 15 year olds in all maintained schools in England, 1971

(Based on figures from Tables 4.3 and 4.4, Start and Wells, page 59)

This is a surprising result. Many people expect the reverse to be true, because the introduction of NS6 in 1953 was in part an attempt to provide a test of higher ceiling than the W-V, whose ceiling had been recognised. It must be acknowledged that the attempt has failed. However, we cannot simply conclude on the basis of the ceiling effect alone that one test is better than the other, for while the Watts-Vernon test appears to be less affected by 'ceiling effects', the question of validity should be kept in mind, a point discussed in paragraph 2.14.

2.34 What are the implications of the test ceiling for the interpretation of results from the national surveys of reading standards?

Firstly, the meaning of the computed standard error of the mean is in doubt. When there is no longer a symmetrical normal distribution, computing the standard deviation of the sample scores in effect averages out the asymmetry. As it is the standard deviation which is used to estimate the standard error of the mean, the confidence limits set on the position of the mean must be interpreted with great care. More seriously, the computed mean itself may be seen to underestimate the probable true mean reading attainment of 15 year olds. We must conclude that neither the W-V nor the NS6 can be used to give accurate estimates of reading ability at age 15. Sentence completion tests of this type do not appear capable of discriminating between the performances of the most able of these children. To this extent, reading ability has outstripped the available tests.
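
The underestimation described in paragraph 2.34 can be illustrated with a small simulation. The sketch below is illustrative only. It assumes that underlying attainment is normally distributed with a hypothetical mean of 30 and standard deviation of 6, and it imposes the W-V maximum of 35 as the ceiling. The simulated values are not survey data, and the function name `simulate_ceiling` is hypothetical.

```python
import random
import statistics


def simulate_ceiling(true_mean, true_sd, max_score, n=10_000, seed=0):
    """Compare underlying attainment with scores capped at a test maximum.

    All inputs are illustrative assumptions, not figures from the surveys.
    Scores are capped at max_score only; no floor at zero is applied.
    """
    rng = random.Random(seed)
    # Underlying ("true") attainment, which the test cannot fully register.
    latent = [rng.gauss(true_mean, true_sd) for _ in range(n)]
    # Observed scores: nobody can score above the maximum on the paper.
    observed = [min(score, max_score) for score in latent]
    return {
        "latent_mean": statistics.fmean(latent),
        "observed_mean": statistics.fmean(observed),
        "latent_sd": statistics.stdev(latent),
        "observed_sd": statistics.stdev(observed),
        "share_at_ceiling": sum(s >= max_score for s in latent) / n,
    }


# Hypothetical 15 year old population measured on a test whose maximum is 35.
result = simulate_ceiling(true_mean=30.0, true_sd=6.0, max_score=35)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

Under these assumptions the observed mean falls below the underlying mean and the observed standard deviation is smaller than the underlying one. Both the estimate of the mean and the standard error derived from the standard deviation are therefore affected, which is the difficulty the paragraph describes.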

References and notes

1. Gray WS The Teaching of Reading and Writing UNESCO: 1956.

2. Moyle D et al Readability of Newspapers Edge Hill College of Education: 1973.

3. The term 'reading age' is used throughout this chapter, since it is the measure most commonly employed when standards of reading are being discussed. A reading age is obtained by transposing test scores on to a scale expressed in terms of years of development. We consider it in many ways a misleading concept which can obscure more than it reveals. Its use assumes that progress in reading can be equated with certain arbitrary units of time. In other words, learning to read is looked upon as consisting of equal steps which can be placed alongside another scale of equal steps, namely months and years. But there are no grounds whatever for supposing that reading progress is a linear process of this kind, and indeed there is evidence to the contrary. Nor is it reasonable to believe that the difference between reading ages of 6.6 years and 8.6 years is the same as the difference between those of 10.6 and 12.6. Even if these facts are disregarded, the concept of reading age is of limited practical value for teachers. If a statement like 'a reading age of 7.0 years' is to have any real meaning, then the characteristics of '7 year old reading' must be known and defined. This would be difficult to achieve. The average 7 year old reader exists only as a statistical abstraction, and unless one can ascribe to reading ages attributes which have real meaning the term is highly misleading. It simply cannot be assumed that children having the same reading age read in the same way, require identical teaching, and will profit from similar books and materials.

4. Bormuth JR An operational definition of comprehension instruction in Psycholinguistics and The Teaching of Reading International Reading Association: 1969.

Bormuth JR Development of Standards of Readability University of Chicago: 1971.

5. Downing J (ed) Comparative Reading Macmillan: New York: 1973.

6. Thorndike RL Reading Comprehension Education in Fifteen Countries: International Studies in Evaluation III International Association for the Evaluation of Educational Achievement: 1973.

7. Start KB and Wells BK The Trend of Reading Standards 1970-71 NFER: 1972.

8. Social Trends No. 4 Government Statistical Service: HMSO: 1973. See also Children as Viewers and Listeners B.B.C. 1974, which gives an interesting analysis of the viewing and listening preferences of children aged 5 to 14.

9. Himmelweit HT, Oppenheim AN & Vince P Television and the Child OUP: 1955.

10. Whitehead F, Capey AC, Maddren W Children's Reading Interests Schools Council: 1974.

11. Progress in Reading 1948-1964 Education Pamphlet No. 50: HMSO: 1966.

12. Horton TR The Reading Standards of Children in Wales NFER: 1973.

13. Wilson JA Reading Standards in Northern Ireland NICER: 1973.

14. Standard Error. When a mean score is shown in the tables it has to be understood that this is an estimate of some unknown true mean score which would have been obtained if every child in the relevant population of 11 and 15 year olds had been tested. Obviously, it is impossible to test on this scale, so each survey has tested only a sample of children from each of the two age groups. Thus, the mean value quoted for 11 year olds in 1970 on the W-V test (Table 3) is an estimate of the true mean score for all the children of that age in that year. Many different samples could be chosen without testing the same children, and if this were to happen one would get several different estimates of the true mean score of the whole age group. The chances that any one sample would give an estimated value precisely the same as the true mean score are, on common sense expectations, small. We are in just that situation. We have one estimate of the true mean score for the whole population at that age. How close is it likely to be to the true value? The standard error quoted with each sample mean score gives us an answer to this question, but the answer has to acknowledge that the estimated mean score values one might obtain by repeated sampling would be sometimes higher and sometimes lower than the true value in the whole population. For instance, in 1964 (from Table 3) the sample of 11 year olds in maintained schools obtained a mean score of 15 on the Watts-Vernon test. The standard error was 0.21. This information tells us that the true mean score of the whole population of such pupils was close to 15. How close? If we said the true mean score for the age group was between 14.79 (15.00 - 0.21) and 15.21 we should be likely to be right 68 times out of 100 or roughly two thirds of the time. It may not be thought adequate to be right two times out of three, but if we wish to be more certain of including the true population score we must widen the limits. 
We could thus be confident that in claiming the true mean score to lie between 14.58 and 15.42 (15 ± 2 × 0.21) we should be right 95 times out of 100. The combination of a mean score and a standard error derived from a sample must, therefore, be regarded as indicating the range of possible values of the true mean score. If, therefore, we are to be sure that from one year to any other year there has been a definite movement of the mean score, we must allow for the range of possibilities on each testing occasion. In drawing conclusions from the results it is not enough merely to note any difference in the mean scores. There must be no significant chance of an overlap due to either this sample or the one in the earlier test producing an estimate greatly removed from the true value in the age group at the time. And this means that the difference must be sufficiently large for it to be clear that no such overlap has occurred.

15. Nisbet J, Watt J, Welsh J Reading Standards in Aberdeen 1962-1972 University of Aberdeen: 1972.

16. Davie R, Butler N, Goldstein H From Birth to Seven Longman: 1972.

17. Halsey AH Educational Priority Vol. 1. HMSO: 1972.

18. Children and Their Primary Schools HMSO: 1967.

19. Malmquist E Factors relating to Reading Disabilities in the First Grade of Elementary Schools Stockholm: 1958.

20. Gardner K The State of Reading in Crisis in the Classroom ed. N Smart: IPC: 1968.

21. Pringle MLK, Butler NR, Davie R 11,000 Seven Year Olds Longman: 1966.

22. Morris J Reading in the Primary School NFER: 1959: and Standards and Progress in Reading NFER: 1966.

23. Bookbinder GE Variations in Reading Test Norms Educational Research 12,2: 1970.
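
The arithmetic of note 14 can be set out as a short sketch. It uses the 1964 figures quoted there (a mean score of 15 and a standard error of 0.21) and the note's multipliers of 1 and 2, which give limits likely to be right roughly 68 and 95 times out of 100. The two later surveys in the comparison are hypothetical, invented only to show the overlap check, and the function names are illustrative.

```python
def confidence_limits(mean, standard_error, multiplier):
    """Return (lower, upper) limits: mean plus or minus multiplier x standard error."""
    margin = multiplier * standard_error
    return (mean - margin, mean + margin)


def limits_overlap(first, second):
    """Return True if two (lower, upper) ranges share any value."""
    return first[0] <= second[1] and second[0] <= first[1]


# 1964 figures for 11 year olds on the Watts-Vernon test, as quoted in note 14.
limits_68 = confidence_limits(15.0, 0.21, 1)
limits_95 = confidence_limits(15.0, 0.21, 2)

# Hypothetical later surveys (NOT from the report), used only for comparison.
later_close = confidence_limits(15.5, 0.20, 2)
later_clear = confidence_limits(16.0, 0.20, 2)

print("68 in 100 limits:", [round(x, 2) for x in limits_68])
print("95 in 100 limits:", [round(x, 2) for x in limits_95])
print("close result overlaps 1964:", limits_overlap(limits_95, later_close))
print("clear result overlaps 1964:", limits_overlap(limits_95, later_clear))
```

The overlap check follows the reasoning of the note and is not a formal significance test. A difference is treated as a definite movement only when the two ranges do not overlap at all.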
