The Hadow Report (1924)
Psychological tests of educable capacity and their possible use in the public system of education
Chapter 2 General summary of the available evidence bearing on the problems connected with the various types of psychological tests of educable capacity
THE VARIOUS TYPES OF PSYCHOLOGICAL TESTS OF EDUCABLE CAPACITY 50. After careful examination of the evidence, and after consultation with several trained psychologists regarding the proper definition of the expression 'psychological test of educable capacity', we came to the conclusion that the phrase might, for purposes of convenience, be properly interpreted as including the following types of tests: (i) Tests of 'intelligence', i.e. tests designed to measure that general ability which is held by many psychologists to underlie the various special activities of the mind. (ii) Standardised scholastic tests based on average performance, i.e. tests of attainments in particular school subjects, such as reading and arithmetic, elaborated by actual experiment and statistical evaluation. (See footnote at the end of Chapter 3 regarding Dr Adami's objection). (iii) Such vocational tests, including tests of manual ability (1) as are adapted for use in schools and educational institutions. (iv) Tests of mental activities of a specialised kind, e.g. tests of memory, perception, attention, imagery and association. (v) Certain physical tests which have been suggested as a means of assessing educable capacity. (vi) Tests of such aspects of temperament and character as bear directly on educable capacity. We desire, however, to point out that in our opinion the terminology employed in any discussion of psychological tests of educable capacity in their present state of development is necessarily provisional, and that the distinctions involved in the above classification are themselves founded on hypotheses, and however convenient for the purposes of analysis, should not be interpreted as if they were in any sense finally valid. In other words, we consider that these distinctions are probably best regarded as first approximations to the truth, and as such are of considerable value for working purposes, but possess only provisional validity. 
OBSERVED DISCREPANCIES BETWEEN ABILITY AND ATTAINMENT; THE DESIRABILITY OF RECOURSE TO SOME MEANS OF DISCOVERING INNATE ABILITY APART FROM EXAMINATIONS OF THE ORDINARY TYPE 51. It is obvious, as was pointed out to us by several of our witnesses, that some simple, uniform and trustworthy device for gauging the educable capacity of children is urgently needed. The method at present so widely employed of appraising the abilities of children chiefly on the results of aptitude displayed in acquiring knowledge of the ordinary school subjects is not altogether satisfactory. It is never quite safe, at any rate in young children, to attempt to assess educable capacity by mere attainment as disclosed in the ordinary written or oral examinations, or even by the application of standardised scholastic tests. (2) We discuss in Chapter 3 the general question of observable defects in existing arrangements for examinations designed primarily to discover potential educable capacity or general ability and promise rather than attainments; so at the present stage we will merely call attention to the incongruities which have been noted in many instances between the undoubted innate abilities of a child and his actual achievements in examinations. Meagre attainments in school knowledge may be due to a plurality of causes - to chronic ill health, or temporary illness involving irregular attendance at school, to unfavourable conditions in the home, including insufficient clothing and inadequate or unsuitable nutrition, to lack of interest due possibly to failure on the part of teachers to appeal to some emotional element calculated to stimulate enthusiasm for a specific subject or group of subjects, to positive absence of the will to learn, and so forth. A high level of ability does not therefore invariably entail a good standard of school work. It has often been pointed out that many highly gifted men and women were regarded in their school days as dull or incompetent. 
Such dullness or incompetence may have been generally present owing to a failure of interest in the pupil or to an arrested development of ability which was subsequently released. In so far as it is the fault of the school it may not be so much the conventional methods of teaching that are inadequate as the ordinary means of diagnosis, consisting chiefly of written, oral and practical examinations. The school should be criticised not so much for failing to adapt itself to such exceptional personalities as for failing to discover them. THE APPLICATION OF 'INTELLIGENCE' TESTS TO SUBNORMAL AND SUPERNORMAL CHILDREN 52. It is claimed on behalf of 'intelligence' tests, whether individual tests or group tests, that if properly applied and evaluated they afford a more objective, more systematic and more trustworthy means of discovering the existence of inborn intelligence and educable capacity in pupils than the ordinary written and oral examinations. We may point out at this stage that, just as there are three types of ordinary examinations, the written, the oral (viva voce) and the practical (whether in laboratory work or clinical work), so there are three methods of applying 'intelligence' tests and vocational tests. Group tests are set in the form of papers like the ordinary written examinations; individual tests are applied orally in a systematic viva voce examination; while performance tests and tests of manual ability or of special vocational aptitude are analogous to the practical examination. The essential feature of such psychological tests seems to be that, unlike ordinary examinations, they only postulate knowledge and experience of a very general kind, and are graded or standardised usually on the basis of the age of the children tested or in terms of mental age. 
'Intelligence' tests claim to measure general inborn intellectual ability or 'intelligence', which is envisaged as a purely abstract potentiality - a hypothetical quantity postulated and defined, like most other scientific concepts, for the convenience of separate measurement. It is claimed that these tests of intelligence render it possible to predict from quite an early age what will be the probable intellectual level of a child when he is grown up. In fact most of the psychologists who gave evidence before us assured us that within reasonable limits such forecasts could be made, and that the ratio between a child's mental age and his chronological age, which was known as the 'mental ratio', or intelligence quotient, appeared to remain tolerably constant during the years of growth. The results obtained from the application of the tests are said to show that beyond the stage of puberty - say 16 years of age - inborn intelligence, i.e. general ability, does not develop to any appreciable extent. The claim, accordingly, is made that it is safe to predict, for example, that a child aged 5 with a mental age of 2 who thus has a mental ratio of 2/5 (= 40 per cent), will probably attain a mental age of 4 at the age of 10 and a mental age of 6 at the age of 15, on the ground that inborn intelligence does not appear to develop much after 16, and further, on the ground that such a person would never rise above the six year level and would probably remain mentally defective for the rest of his life. 
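The arithmetic behind this claimed forecast can be sketched as follows. This is only an illustration of the calculation described by the witnesses, not a procedure given in the Report: the function names are hypothetical, and the cut-off at 16 years is an assumption that simply encodes the claim that inborn intelligence does not develop appreciably after that age.

```python
from fractions import Fraction

# Assumption: the age after which, on the witnesses' claim, inborn
# intelligence does not develop to any appreciable extent.
GROWTH_CEILING_AGE = 16


def mental_ratio(mental_age, chronological_age):
    """Return the 'mental ratio' (mental age / chronological age) as an exact fraction."""
    return Fraction(mental_age) / Fraction(chronological_age)


def projected_mental_age(ratio, chronological_age, ceiling=GROWTH_CEILING_AGE):
    """Project a mental age, assuming the ratio stays constant up to the ceiling age.

    Beyond the ceiling the projection is held at its value for the ceiling age.
    """
    return ratio * min(chronological_age, ceiling)


# The Report's example: a child aged 5 with a mental age of 2.
ratio = mental_ratio(2, 5)
print(float(ratio * 100))  # the ratio expressed as a percentage

for age in (10, 15):
    print(age, projected_mental_age(ratio, age))
```

Under these assumptions the child in the example has a ratio of 2/5, and the projected mental ages at 10 and 15 are 4 and 6, as in the text. With the ceiling placed at exactly 16 the projection levels off at 6.4, which is broadly in keeping with the statement that such a person would remain at about the six year level.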
The advocates of 'intelligence' tests further contend that this form of examination has great advantages over ordinary examinations; in the first place, that when properly prepared, applied and marked, the tests neutralise to a very considerable extent the effects of bad teaching and unfavourable surroundings; in the second place that they furnish a measure not of what the child has learned, but of what he can learn, provided that he possesses the temperament and character requisite to enable him to make proper use of the educational opportunities afforded. In other words, that the tests, unlike ordinary examinations, not merely assess the degree of intelligence, but when applied at an early age detect with a fair measure of accuracy the degree of educable capacity and hence the extent of probable attainment. The specific claims made on behalf of mental tests by our various witnesses may be summarised as follows: (a) That they provided a method of comparing children in respect of their inborn capacity and thus of selecting the best candidates for higher instruction, and sifting out defective and dull children for treatment by special educational methods.
It was stated that the evidence so far afforded for the value of tests as indexes of educable capacity fell mainly under three heads: (i) Comparison of the results of the tests with the observations of teachers and others;
WHAT IS INTELLIGENCE? 53. The purpose and function of tests of intelligence is to determine the general ability and consequently the general educable capacity of the individual pupil. The ordinary connotation of the word 'intelligence' differs considerably from the meaning commonly assigned to it by psychologists. It is therefore important at the outset to define so far as possible the significance which is attached to the term by psychologists. 
Though the word is generally employed both in ordinary use and by psychological writers in the English, French, Italian and German languages, its connotation is vague and elusive. It was originally closely akin to intellect, the sole difference between the two being that intellect, which was defined by some of the schoolmen as the power to conceive universal ideas, meant the faculty, while intelligence denoted its actual exercise. (5) Both words became degraded in popular usage, and were frequently understood as covering other cognitive processes, including memory and even perception. In course of time intellect became the more exact term, being reserved for a power supposed to be peculiar to man, while intelligence was regarded as being shared by man with many of the lower animals. It is not therefore surprising to find that very divergent theoretical definitions of intelligence (6) have been given by the various psychologists who have aimed at measuring it. The diversity of opinion among psychologists regarding the nature of intelligence or general ability may be illustrated by a recent discussion on the matter by fourteen prominent American specialists on mental testing. (7) Terman, for example, defines intelligence as the power of abstract thinking; Thorndike as the power of good responses from the point of view of truth, Buckingham as the ability to act effectively under given conditions. Ruml pessimistically states that the nature of intelligence can at present hardly be discussed at all owing to the vagueness of the terms involved and our paucity of information about the facts. Indeed, the replies disclose a remarkable variety of opinions. Nor do European psychologists exhibit any greater unanimity. THE VARIOUS HYPOTHESES REGARDING THE NATURE OF THE FACTORS INVOLVED IN GENERAL ABILITY OR GENERAL INTELLIGENCE 54. 
It is possible, however, to arrange the various hypotheses about the nature of 'intelligence' under three heads; those that envisage it as the effect of a few highly generalised 'faculties' or 'functions'; those that regard it merely as a convenient term for the average of innumerable abilities, all highly specific, and those that view it as a single central intellective factor common to all intellectual processes. (a) The first theory which for purposes of convenience may be termed the 'faculty' theory was held by Binet. In his earlier articles Binet speaks of tests for perception, attention, memory, and reasoning - all terms of the traditional school of 'faculty psychology'. Elsewhere in his writings he approximates to the so-called 'two factor' hypothesis, and declares that 'in intelligence there is a fundamental faculty ... This faculty is judgement, otherwise called good sense or practical sense, the faculty of adapting one's self to circumstances ... Under cover of a test of memory we shall have an appreciation of judgement'. (8) In an elaborate article on 'The Intelligence of the Feeble Minded', published in 1909, (9) he appears to regard general intelligence as a complex mental quality involving at least three faculties: (i) The appreciation of a problem and the due direction of the mind towards its execution;
The second theory according to which the several traits that go to make up the mind are practically independent is held, with modifications, by Professor EL Thorndike, (10) who regards intelligence as a multiplicity of innate abilities that are related in varying degrees. He admits that there is a positive relation between desirable single traits in a single individual. 'Having a large measure of one good quality increases the probability that one will have more than the average of any other good quality'. 
The fact that a child has pronounced inborn ability in arithmetic indicates that he will have more than the average innate ability in geography, and even that he will be above the average in his moral qualities, but it is not certain that he will be. Thorndike has since suggested that there are three main types of inborn intelligence, viz. intelligence for words and abstract ideas, motor intelligence or skill with the use of the hands, and social intelligence or the ability to get on well with one's fellows. These three types are positively related, but not necessarily in a high degree. A similar belief in the relative independence of intellectual abilities appears to be held in a modified form by Professor Godfrey Thomson, (11) who informed us that he did not believe in the existence of a factor called general ability or general intelligence, and considered that the statistical work of those who supported that theory was of doubtful validity. The third hypothesis which appears to be at present the most widely accepted in a modified form is that of a central intellective factor. Many of the supporters of this view would probably accept Stern's definition of intelligence as 'general adaptability to the new problems and conditions of life'. (12) Dr Cyril Burt, in his evidence, urged that general intelligence was by far the most important of the factors involved in general educational ability. 'Intelligence' manifested itself in a number of different ways, but he regarded it more as a single complex quality rather than as a group of independent elements. It was best measured by tasks requiring the voluntary maintenance of attention; quick and accurate learning (in the broader sense of the word, namely, adaptation to relatively novel conditions), and on the higher mental levels, reasoning. These should, perhaps, be regarded rather as modes of general ability than as elements entering into general ability as component factors. 
Professor Spearman, the first and foremost supporter of this unitary theory, prefers to call the general factor 'g', without committing himself to the view that 'g' is precisely identical with intelligence as popularly understood. (13) He writes that 'possibly this central factor is some general fund of mental energy, which again depends upon a general fund of some sort of brain energy'. (14) It is obviously impossible for us in this Report to discuss at length the very difficult and technical questions involved in the various theories regarding the nature of intelligence or general ability and its relation to the various specific manifestations of ability. We have, however, quoted in Appendix IX extracts bearing on this problem from the evidence submitted to us by various psychologists. In spite of the divergence and apparent inconsistency of the various theories regarding the nature of 'general intelligence', many of the individual views seem merely to lay stress on one or other aspect of a complex whole. As is pointed out in Chapter 1 (footnote 36) the differences of view are by no means so complete or irreconcilable as they once appeared. Probably in any one intellectual act, factors of at least three orders play an essential part - the 'central factor' of 'intelligence', the 'group factors', which an older psychology would have termed faculties, and 'specific factors' entirely limited and independent. The different views seem merely to overemphasise one particular type of factor, and to ignore or underrate the part played by the other two. Most psychologists would probably agree that 'intelligence' is a general mental ability operating in many different ways, given as part of the child's natural endowment, as distinct from knowledge or skill acquired through teaching or experience, and more concerned with analysing and coordinating the data of experience than with mere passive reception of them. 
As, however, this term 'intelligence' is employed by psychologists in a technical sense, we have placed it in inverted commas to indicate that it is used with a special meaning. WHAT DO TESTS OF 'INTELLIGENCE' MEASURE? 55. Though, however, we could find no general consensus as to what intelligence is, almost all our witnesses, teachers and psychologists alike, were unanimous in their definitions of what it was not. All are agreed that intelligence does not cover temperament or character; and that, therefore, the important personal qualities of will, feeling, and emotion are not dealt with by tests of intelligence. Secondly, they were agreed that it does not cover acquired attainments; hence, tests of intelligence give no indication of what a pupil has learnt in reading, spelling, arithmetic, or in any of the higher school subjects. Thirdly, it seems generally agreed that any narrow or limited talent, available for only one type of intellectual work, is not to be named intelligence in this sense. Intelligence is regarded by the majority as a common component entering more or less into all intellectual activities, or, as some would prefer to phrase it, as the common level of all particular intellectual performances. What tests of 'intelligence' measure, therefore, is inborn, all-round, intellectual ability, using the word 'intellectual' in a loose sense to include practical activities as well as theoretical, but to exclude processes of emotion and qualities of character. These tests aim at measuring this ability through different processes according to the mental level of the child. With the youngest infants little more than simple sense perception and simple movement can be tested. With the older and brighter children, the commoner tests aim at measuring 'intelligence' through the perception of more abstract relations, such as analogy, immediate inference, and the like. 
But at nearly every level, except perhaps the lowest and the highest of all, the capacity for learning by experience, and (particularly when a time limit is imposed) for learning quickly by a brief experience, is one of the commonest manifestations of 'intelligence' to be exercised in these tests. It should be emphasised that these tests do not pretend to discover all the intellectual qualities that appear to be present in great artists, great musicians, great poets, and masters of literary form. Such specialised forms of genius imply not only the possession of a high level of general intelligence, but also the possession of a specialised talent in an unusually high degree. Without high intelligence no poet could be great. But without the special poetical qualities - imagination, visualisation, intensity of feeling, or whatever they may be - no great man could be a poet. Further, as psychologists as well as teachers have constantly pointed out, there are children of first-rate 'intelligence' who cannot express themselves clearly at any length on paper, and who fail to do themselves justice in verbal answers to verbal questions. Such children are likely to do ill with such oral tests as those of Binet or Simon, or with such group tests as those in the first series (15) used with the American army and since copied so freely in this country. On the other hand, such children may have very considerable aptitudes in practical or mechanical directions; and with these, if their ability is to reveal itself through any measurable test, some test of a manual or so-called 'performance' type must of necessity be used. A short series of psychological tests cannot, of course, pretend to greater accuracy than the considered judgement of an experienced master or mistress, observing the individual pupils of the class over a period of weeks or terms. 
They afford only one kind of evidence; and, though even the competent teacher may gain from them something which he could not otherwise obtain - namely, some degree of standardisation or comparability for his judgements - no tests of 'intelligence' can completely take the place of observation and insight. They are supplements, not substitutes. It cannot be too frequently emphasised that even psychologists themselves do not trust solely for their conclusions to the mere quantitative marks from a mechanical set of tests. In the diagnosis of mental deficiency, of backwardness, or of supernormal intelligence, they endeavour always to obtain data and reports dealing with the general conduct and character of the child. There are, it is true, psychological tests which aim at measuring or discovering other qualities besides the intellectual - namely the emotional or quasi-emotional aspects of personality, and the many important traits of character which bear on educable capacity. But, whatever may be thought of the efficiency of such temperamental tests when applied under laboratory conditions, they are in their present state of development unsuited for practical use in school. We are accordingly of opinion that the data afforded by the application of group tests or individual tests of 'intelligence', or both, should always be considered in association with the information regarding individual pupils available from other sources, e.g. school records, personal estimates of teachers, achievements in ordinary examinations, medical data, if any, and information regarding home conditions and parentage. It must always be remembered that character as well as intelligence is, in actual school life, a factor of the highest importance in determining the response of any individual pupil to instruction in any branch of study. 
Furthermore, there will probably be some pupils who, though they show little aptitude for the ordinary school studies, have a decided bent, say, in an artistic or mechanical direction. If, therefore, we agree that what 'intelligence' tests measure may safely be regarded as a factor operative in all mental operations that can by any latitude be called intelligent, yet at the same time it must be remembered that such a factor is not the sole factor, and that in every distinct kind of mental operation special factors are nearly always involved. In some kinds of mental operation the general factor is of dominant importance, while in others, the special factor is more prominent. For operations of the latter kind the intelligence tests alone would have little diagnostic value. Furthermore, if satisfactory and trustworthy tests of memory, perception, attention, and of such aspects of temperament as affect educable capacity could be devised, it seems certain that the data thus afforded would modify appreciably the data yielded by 'intelligence' tests. THE MAIN PRESUPPOSITIONS UNDERLYING THE USE OF TESTS OF 'INTELLIGENCE' AND THE INDISPENSABLE CONDITIONS WHICH MUST BE COMPLIED WITH IN ORDER TO ENSURE THE VALIDITY OF ANY SET OF 'INTELLIGENCE' TESTS 56. There was general agreement that the main presuppositions underlying the use of 'intelligence' tests were as follows: (i) That there were certain mental factors which remained more or less constant during the lifetime of individual human beings.
It would seem from the evidence submitted to us that the educable capacity of a child at any period of his or her life may be assumed to depend on mental factors of two kinds: (a) inborn psychological abilities of a relatively elementary and general nature; and (b) acquired capacities of a more complex and specific character, chiefly memories and habits, such as particular items of knowledge and particular forms of skill. 
There has been much discussion in recent years as to whether nature (inherited capacity) or nurture (training and environment) is the more important. It is evident, indeed, as Binet frequently pointed out in his articles, that even a child of the greatest potential intelligence can never become highly intelligent in surroundings affording scant opportunity to learn. On the other hand, it is almost certain that a feeble-minded child can never become really intelligent however favourable his environment and however skilled and patient his teachers may be. This applies still more to children with special defects. For example, a deaf mute, whatever may be his inborn ability, will probably grow up apparently feeble-minded unless special methods of instruction are employed to educe and develop his native intelligence. In spite of the hopes that were once entertained regarding results obtainable from special instruction in schools for the mentally defective, a feeble-minded child can seldom, if ever, be converted into a normal child. It is clear that when a group of individuals are brought up in a similar environment, with almost equal opportunities, the differences disclosed among those individuals by means of intelligence tests are due to differences in native ability. Conversely difference of environment may be so wide as to obscure or even traverse the difference of ability which these tests are intended to elicit. For example, the results of intelligence tests applied to children coming from different environments and living under different conditions, some of whom were underfed and ailing, while others were well nourished and healthy, would necessarily afford untrustworthy evidence of inborn capacity and the value of such results would be pro tanto [to such an extent] diminished. 
It follows that the results would be to some extent invalidated, if the persons were drawn from environments that were widely dissimilar or had been subjected to widely different conditions of life. On a first view it might seem impossible to attempt to measure and appraise the amount or character of an inborn capacity, or group of capacities that manifest themselves solely through learning, as such capacities can be measured only indirectly through what has been acquired. It should, however, be borne in mind, as was pointed out by several of the most eminent psychologists who gave evidence before us, that the distinction between inborn and acquired capacities is mainly a matter of inference. Innate capacity may be indirectly gauged with considerable success by measuring the acquired capacities in a group of children with substantially the same experience. Psychologists infer from differences in acquired intelligence differences in native endowment when they compare individuals in a group of children who have had common experiences and note the differences in the performances and attainments of those individuals. It follows, therefore, that a valid test of intelligence must be based on elements appealing to the common interests and within the common experiences of the group of persons tested. For example, a careful examination of the Binet tests in their original form shows that the separate tests were arranged on the basis of the common experiences of urban children of varying ages. In no instance was a test employed that was based on peculiar conditions or unusual facilities for learning. Tests for any given age were applied on the assumption that all normal children might have been expected to learn the things with which they were ordinarily in contact. 
Further, no test of intelligence can properly be regarded as valid unless the individual tested has had reasonable opportunities to learn about the various elements involved in the test and has also been interested in learning. For instance, the American army 'Alpha' tests, which during the last few years have been extensively applied in the educational institutions of the United States, are reported to show in nearly every instance higher average marks for men and boys than for women and girls. The apparent inference that the intelligence of men is on the whole rather superior to that of women is obviously invalid because the tests in question were originally designed for application to soldiers, and refer largely to matters in which women were not likely to be interested. It was accordingly pointed out that whenever a test was used for diagnostic purposes certain assumptions were made. In the first place, it was assumed that the child had passed his life in a certain environment. It followed, therefore, that the simpler the environment assumed by the test, that was to say, the more it resembled the environment in which the great mass of the child population lives, the more reliable was the test for application to different age groups, different social classes, different parts of the country, and so forth. For example, the Stanford Revision of the Binet Simon Scale assumed that every child tested had come into contact with twentieth century civilisation and had since the age of not less than seven years spent a certain portion of each year within the walls of an elementary school classroom. If he had not done so, and to the extent to which he had not done so, the test might yield a result which was not wholly valid. Hence a pupil's prolonged absences from school must to some extent be taken into consideration by the 'intelligence' tester. 
Secondly, the test assumed that the child had made as much use as his 'intelligence' had permitted of the stimuli afforded by such an environment. He might not have done so. His low rating on the 'intelligence' scale might then be due in part to an abnormality of temperament. Our witnesses also emphasised the fact that in order to secure valid results it was most important that the tests should be applied and marked in a rigidly uniform manner. The inconsistent and paradoxical results not infrequently obtained by teachers and others who had applied the Binet tests, or other types of individual tests in a rather unsystematic way, were due to failure or neglect to observe uniformity in application or marking, or in both. Binet repeatedly calls attention to the importance of careful and uniform administration and marking in the application of his own tests, (16) and American psychologists in their writings frequently point out that the significance of results derived from tests may be gravely impaired by lack of uniformity and care in application and scoring. There was general agreement among the psychologists who gave evidence before us that tests, especially individual tests, are more satisfactorily administered if applied by a person specially trained for the work. THE MERITS AND DISADVANTAGES OF THE BINET-SIMON TESTS 57. As we have already indicated, the children for whom tests of 'intelligence' are especially useful are those who are definitely above or below the average. The celebrated Binet-Simon Scale, as finally revised by Binet himself in 1911, or in its later modifications, especially those known as the Stanford Revision, the Yerkes Point Scale, and Burt's adapted Scale for English children, (17) may be regarded as the model of all individual tests of 'intelligence' which have been devised up to the present. 
It has been extensively employed in this country since about 1910 as an aid to the discovery and special treatment of mentally defective and subnormal children, and to a less degree of supernormal children. It has also been used as an aid for internal classification in elementary and special schools, and to a very much smaller extent in secondary schools. The Binet tests have also been employed for the purpose of checking and supplementing the results obtained by the ordinary examinations for free places and scholarships in secondary schools, and for admission to central schools. In fact the scale is now widely recognised and used as a convenient mental foot-rule. Bearing this in mind, we took special care to ascertain in some detail the opinions of our witnesses on the merits and defects of the scale and its variants, and we give below a full summary of their views. There appears to be general agreement that the revised Binet-Simon tests are of real use for the following: (1) As an aid to the discovery and classification of subnormal and mentally defective children.
While most of our witnesses drew attention to certain defects in the scale, which are set out below, there was, on the other hand, general agreement as to its merits, which may be stated as follows: (1) It was claimed that the scale was comparatively simple, and that the technique of applying it could be acquired after some training and practice by teachers and school doctors, provided it were explicitly recognised that no final validity could attach to the diagnosis of a layman who had not received a thorough training in psychological method. Some witnesses, indeed, claimed that with a relatively simple scientific invention, such as the Binet tests of intelligence, a layman might accomplish much in the way of rough preliminary diagnosis, which was formerly possible to none but the expert. 
It was, however, pointed out that the use of such a device by persons who had not had much psychological training must necessarily be somewhat mechanical, and confined and limited in its scope. The defects attributed to the Binet Scale may be summarised as follows: (1) It was pointed out that these tests had been originally devised as a scale of measurement to assist the administrative authorities in Paris in examining school children suspected of mental deficiency, and recommended for transfer from the ordinary elementary schools to special classes. Indeed Binet himself had expressed the opinion that his scale would be chiefly useful for application to mentally defective children. (19) Many of our witnesses, therefore, while admitting that the scale was of great service as applied to defective children, were disposed to doubt whether it was really suitable for general application to normal and more especially to supernormal children. The scale was admittedly uneven, and it was more difficult to obtain extra points as the higher end was approached. There was also the disadvantage that the mental processes tested were not the same from year to year. On the whole it was generally agreed that while the scale was tolerably accurate for the diagnosis of low grades of ability, it was, on the other hand, relatively useless for the diagnosis of high grades of ability among older children. One witness, who had conducted extensive experiments in the use of the Stanford revision of the scale, pointed out that he found that it ceased to measure accurately the 'intelligence' of the brighter children after the age of about 14 years, though it continued to measure the 'intelligence' of inferior children with sufficient accuracy. (2) Those witnesses who had made the most extensive use of the scale were agreed that it was too linguistic for English children, having been originally designed for French children who received a more elaborate training in the use of language. 
Tests for later ages involved to a large extent considerable range of verbal imagery. In consequence the somewhat rare type of defective child sometimes described as the subnormal verbalist, who could talk at considerable length though often irrelevantly but could do little, possessed a distinct advantage over children who showed little aptitude for the use of written language and whose vocabulary was limited, but who, nevertheless, might possess some mechanical aptitude; and even more frequently the child who was weak in the comprehension and use of language (if only through some temporary nervousness) might, unless special precautions are taken, be severely handicapped. (3) Several witnesses drew attention to the fact that a child's proficiency in the Binet tests represented the complex result of numerous intermingling factors. In addition to the two essential items - his inherited 'intelligence' and his chronological age - his performance in the tests would be affected by numerous subsidiary conditions, such as industry, goodwill, keenness, emotional stability, information acquired at school, social environment, and sex, which would inevitably impair or improve the result. (4) Most of our witnesses thought that the most potent of these subsidiary conditions was educational opportunity. Many of the tests - some of which were withdrawn by Binet in his final revision of 1911 (20) - were pure tests of school attainments. For example reading, writing and dictation were learnt in English lessons; counting, addition and subtraction of money in arithmetic lessons; drawing from the copy and drawing from memory in drawing lessons; if, therefore, a child's apparent capacity were assessed solely by the Binet Scale, it must depend in no small degree upon his class in school and his educational opportunities in the past. 
Conversely, it was urged that, where children's educational opportunities had been normal and equal, a child's school class must depend upon his apparent capacity, as, in theory at all events, the individual pupil was classified on entrance to the elementary school, and promoted year by year in accordance with what he had already learnt and by what he seemed likely to learn in future. (5) Several witnesses maintained that the results of the Binet tests were affected by social conditions, whether due to environment or heredity. For example, children from favourable homes were said to prove more responsive to tests than children from less favourable homes. (6) Several medical witnesses pointed out that mental fatigue was a factor which entered into the test examination in the case of nervous, mentally defective and epileptic children. (7) A few witnesses had found that children became familiar with the Binet tests, and in certain cases were specially prepared for them. This applied even to mentally defective children. (8) It was pointed out that a relatively large amount of time was needed in order to apply Binet tests adequately to individual children, as compared with the time required for the collective examination of a whole class in ordinary school subjects. (9) Several witnesses drew attention to the fact that some of the original Binet tests, having been devised for Parisian children, were not particularly suited for application to rural children in England, as they presupposed a knowledge of facts peculiar to urban life. (10) Several medical witnesses had found that the language of the Binet Scale required to be adapted to meet the needs of provincial dialects and colloquialisms in various parts of England and Wales. THE OPINIONS OF MEDICAL EXPERTS ON THE VALUE OF THE BINET-SIMON SCALE AND ITS MODIFICATIONS AS AN AID IN THE DIAGNOSIS OF MENTAL DEFICIENCY IN CHILDREN 58. 
Our medical witnesses informed us that the Binet-Simon tests had been extensively used as an ancillary method of examining children suspected of mental deficiency since about 1911, more especially in the London area. It was pointed out that these tests were recommended for use by school medical officers in making inquiries into the mental condition of feeble-minded children in the Reports of the Chief Medical Officer of the Board of Education for 1912 and 1913, where it was suggested that the teacher in the ordinary public elementary school should select retarded and backward children and in cooperation with the school medical officer should determine by careful examination which children needed subsequent examination by the Binet tests. (21) The actual data obtained by the application of the Binet tests to children suspected of mental deficiency formed only a part of the evidence on which the official diagnosis of cases under the Acts was based, (22) physical condition, educational attainments, family and personal history and environment being all taken into account in the consideration of such cases, the aim being to ascertain whether the child was fitted to his normal surroundings in life. At the earlier chronological ages the scale both in Binet's own final edition of 1911 and in the later editions of other psychologists had been found to be a useful guide, especially when supplemented by standardised scholastic tests in reading and in simple arithmetical processes. Our medical witnesses stated that in view of the inadequacy of the Binet tests for children at later ages it was the practice to utilise other tests in addition, for example, Burt's reasoning tests, association tests, and performance tests of various types. 
(23) On the whole, however, they were of opinion that the Binet Scale had proved of great value as a supplement to the medical and pedagogical methods hitherto employed in the diagnosis of supposed defective children, and that its general adoption for these purposes had as it were stabilised the intelligence tests applied by certifying officers and by teachers in special schools. MERITS AND DEFECTS ATTRIBUTED TO THE AMERICAN MODIFICATIONS OF THE BINET SCALE 59. Yerkes' Point Scale. Professor Yerkes' Point Scale (23) consists of 20 exercises, of which 19 are taken from the Binet-Simon series; the remaining one is an analogies test of the type first devised by Dr Burt. Most of Binet's information tests are omitted. Partial credit in marking is given to the various achievements of the children tested according to their merit, and not invariably by the pass or failure method of Binet. The maximum in the Point Scale is 100, and, for differences in standards, ages, sex and social status, norms have been calculated in terms of the average number of points scored. These standards, not having the fixity of an age scale, can be readily readjusted and quickly revised if necessary. Dr Burt and other witnesses regarded this as an undoubted advantage, though they pointed out that the same principle could be applied to the Binet Scale in its usual form by merely counting the number of tests passed, either actually or by implication, and, if necessary, assigning fractions for partial success. By using for partial performances entire points or marks in place of fractions certain specific tests in the Point Scale are assigned a maximum larger than unity and thus carry greater weight than others. Dr Burt pointed out that, if this emphasising of certain tests were determined by their diagnostic significance, the modification might be of great value, but that for English children the value and maximum mark, as suggested by Yerkes, did not at all correspond. 
Brevity is secured by the omission of most of the Binet tests that depend, like reading and writing, upon instruction in school, or, like the coin tests, on special experience. On the other hand, several rather inferior tests are still retained, e.g. suggestion, and the comparison of faces, weights and lines. Dr Burt, who discussed the Point Scale method in some detail, was of opinion that, while of much theoretical interest, this particular revision of the Binet Scale on the whole represented for English children no great advance on the original either for the purpose of examining borderline cases of suspected deficiency at the usual age of entrance to special schools, or for testing supernormal and scholarship children at later ages. The Stanford Revision and Extension (by Professor LM Terman). (25) The outstanding merit of this revision is the inclusion of many well thought out tests designed for children of higher ages. Other improvements effected in the original scale are the addition of better or more numerous examples of certain types of test; the allowance of more numerous trials; and the provision of a more definite method of marking, for example, partial credits for partial success in certain tests. Binet had expressed the child's intellectual power by giving his mental age in relation to his chronological age. Yerkes, in his Point Scale, indicated the same facts by giving the total points scored by the individual in comparison with the average points scored by normal children of the age of the child tested. Terman, following Stern and others, uses a somewhat similar method, in stating capacity in terms of the 'mental ratio', or in American phraseology, the 'intelligence quotient', which is obtained by dividing the child's mental age by his chronological age in order to eliminate the actual chronological age. This method of indication has its advantages, but is also open to certain objections. 
The chief value claimed for the mental ratio is that it seems to express the child's inborn intelligence in a more or less absolute fashion. It is in fact intended to indicate his actual intellectual capacity irrespective of his age. Terman and most other psychologists, as the result of their own experiments, maintain that the intelligence quotient, with possible slight changes, remains constant throughout the life of an individual, at any rate up to the period of puberty. Some of our witnesses who had made a careful study of the Stanford Revision and its application, seemed to think that on the whole this contention was substantiated by the facts, though it was probable in some instances, at least, that the child's intelligence quotient might vary from year to year, and that at times it might have a tendency to increase and at times to diminish. It was pointed out that while the intelligence quotient served a useful purpose in indicating to the teacher the probable intelligence of an individual pupil at each successive stage of his school career, and was important in helping to forecast the extent and character of his success in school studies, it should never be employed for classifying pupils without also taking into consideration their actual mental and chronological ages. It was obvious that children who had the same intelligence quotient might be far apart in actual acquired knowledge because of differences in mental and chronological ages. Furthermore, it was a highly abstract and succinct method of expressing the results obtained from tests and should, for practical purposes in schools, be supplemented by notes on the manner in which each individual pupil had 'attacked' the tests. It should always be borne in mind that in the present state of development of the tests the value of such 'intelligence quotients' was purely relative and in no sense final. 
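The arithmetic behind the mental ratio, and the caution that pupils with the same quotient may stand far apart in mental age, can be illustrated with a short sketch. The two pupils and their ages are invented for illustration and are not taken from the evidence; the function name `mental_ratio` is likewise hypothetical. The ratio is left as a plain fraction, as in the description above, without any further scaling.

```python
from fractions import Fraction


def mental_ratio(mental_age_months: int, chronological_age_months: int) -> Fraction:
    """Return mental age divided by chronological age (both given in months)."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return Fraction(mental_age_months, chronological_age_months)


# Two hypothetical pupils (invented figures, for illustration only).
younger = {"mental": 6 * 12, "chronological": 8 * 12}    # mental age 6, actual age 8
older = {"mental": 9 * 12, "chronological": 12 * 12}     # mental age 9, actual age 12

ratio_younger = mental_ratio(younger["mental"], younger["chronological"])
ratio_older = mental_ratio(older["mental"], older["chronological"])

# The quotients coincide although the pupils differ by three years of mental age.
print(ratio_younger, ratio_older)
print("gap in mental age (years):", (older["mental"] - younger["mental"]) // 12)
```

On these assumed figures the two quotients are identical, yet the pupils are three years apart in mental age, which is why the witnesses held that the quotient should not be used for classification without the mental and chronological ages beside it.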
Several witnesses maintained that for purposes of instruction children should be grouped partly on the basis of their acquired attainments and, to a lesser degree, on the basis of their chronological age. It should be added that certain considerations render the Stanford Revision tests less suitable for English than for American children. The principal shortcomings of the Revision from this standpoint may be summarised as follows: (1) The older children on whose performances this Revision of the Binet Scale was based by Terman appear to have been of a somewhat higher intellectual level than the average child in ordinary public elementary schools in England. ADVANTAGES AND DEFECTS ATTRIBUTED TO INDIVIDUAL TESTS, OTHER THAN THE BINET SCALE AND ITS MODIFICATIONS 60. Our witnesses informed us that on the whole the tests devised by Professor De Sanctis, of Rome, (26) though occasionally of use as supplementary tests in the diagnosis of mental deficiency, were hardly applicable to normal children, for whom indeed they were not intended, though the general method of the tests was of value for children who could not read or write. Several witnesses were of opinion that the only set of individual tests other than the Binet Scale and its modifications which were suitable for application to English children was Dr Burt's reasoning tests for children of 7 to 14 years of age. It was stated that as compared with the Binet tests these required more particularly the exercise of the higher intellectual processes. On the other hand, they failed to gauge certain aspects of intelligence and were in that respect inferior to the Binet tests. It seemed accordingly desirable that they should be used in conjunction with the tests of other aspects of intelligence, e.g. tests of a performance type. THE GENERAL CHARACTER OF GROUP TESTS OF INTELLIGENCE AND THEIR VALUE FOR DETERMINING EDUCABLE CAPACITY 61. 
There is an important distinction, in respect of their method of application, between individual tests and group tests. Individual tests are of such a character that they can be administered only to one individual at a time by one investigator. The Binet tests, or tests for a special capacity like pitch discrimination, may serve as types. Group tests are of such a character that they can be administered simultaneously to groups of individuals, larger or smaller according to the circumstances under which the testing is carried out, the size of the group being, however, immaterial as far as the character of the test is concerned. Group tests in this sense (27) are obviously highly desirable for testing on any extensive scale, and therefore raise very important practical problems. As has been shown in Chapter 1 the development of group tests is of very recent date. (28) They were in the first instance based on materials of the verbal rather than the performance type, and they are still, on the whole, chiefly linguistic though by no means exclusively so. Naturally the group tests applied to very young children are largely of the performance type such as the Porteus Maze. The Alpha tests (29) employed in the American army, which were published after the end of the [First World] War, have been very extensively used in the United States for students in universities, colleges and high schools. The first of these is a directions test to determine ability to execute commands. The second is an arithmetical problem; the third consists in selecting from three possibilities the best reason for a statement; the fourth represents a list of words associated in pairs, and the examinee has to determine whether these words are associated by the principle of opposition or of likeness. 
The fifth is a disarranged sentence, and the examinee is required to put the words in their proper order so as to make sense. The sixth is a number completion test in which a series of numbers has to be continued according to the number indicated in the part of the series given. The seventh deals with analogies, or mixed relations; the eighth with range of information. One of the most important tests for revealing individual differences is the completion test devised by Ebbinghaus in 1905 for the purpose of investigating the fatigue of a school day in Breslau. (30) The original test consisted of a paragraph in which words with syllables omitted were presented to the examinee, who was required to fill in the omissions. Terman has a high opinion of this type of test, which, he says, discloses fundamental differences in the thought processes. The analogies or mixed relations test first used by Dr Cyril Burt in 1911 consists in presenting three words in a series, the first and second of which bear a certain relationship. The examinee is required to supply a fourth word that bears the same relationship to the third word as the second does to the first. It is claimed on behalf of this test that it is suited for discovering some of the more complex forms of intelligence. (31) This test is typical of a large number classified under the general name of association. Controlled association tests include, beside the analogies, associations of part with whole or conversely; the genus with the species or the reverse; a word with its opposite and so forth. This test admits a variation by the substitution of pictures or designs for words. The substitution test which determines quickness and accuracy of learning by substituting for one set of characters another according to a key is also included in group 'intelligence' tests. 
Among group tests which are extensively used may be noted those which deal with vocabulary, and which are really included under range of information, and those which elicit response to verbal orders. The latter have now been modified so that they can be used with pencil and paper. Another type of group test commonly employed at present is an exercise in the simple processes of arithmetic. This type involves concentrated attention, mental alertness and, in some instances, a considerable amount of reasoning power. The marks obtained usually exhibit an appreciable degree of relationship to general intelligence. Obviously, however, it must depend in part at least upon instruction and practice in arithmetic, and probably also upon a special arithmetical ability which is partly independent of general intelligence. It has also been claimed that some of the reading tests, particularly those constructed by Professor EL Thorndike, are able to measure some of the higher mental abilities. The examinees are required to read a paragraph, and then answer certain questions concerning it with the paragraph still before them. It will be seen from this brief description of the character of some of the more important elements in the group tests at present in use what degree of intelligence is required in order to answer them. (32) Several of our witnesses were of opinion that, on the whole, the more complex factors of judgement, inference and of logical analysis were not extensively involved. It would appear that most of the existing group tests are dependent on verbal material and that very few of them have as yet been adequately standardised. Several witnesses, however, thought that it would not be difficult to devise and standardise tests with the available non-linguistic material and yet call into play the same higher mental functions as the verbal tests. The Porteus Maze and the Healy Picture Completion were probably the best efforts in that direction. 
It was repeatedly pointed out that group tests, though they did not afford the same insight into the child's mind as did individual tests, were the only type that could in practice be applied when large numbers of children had to be examined. As we note in the next section, very few group tests have as yet been devised which are suitable for application to children under 10 years of age. In practice, therefore, the only types of test which are suited to young children under 10 are individual tests of intelligence and standardised scholastic tests in simple subjects such as reading and elementary arithmetic applied orally to each individual pupil. On the whole, our witnesses were of opinion that the data afforded by the use of group tests, though of very considerable value, gave only general evidence, and that in all doubtful and borderline cases they should be supplemented by individual tests, which usually produced more accurate results. THE RELATIVE MERITS AND DISADVANTAGES OF GROUP TESTS AND INDIVIDUAL TESTS OF INTELLIGENCE 62. It was generally agreed that one of the chief advantages of group or collective tests, which were set in the form of written papers, was economy of time, while the principal merit of individual tests, which were applied at an oral interview, was the insight afforded into the child's mind by personal observation of his answers and general attitude. Moreover, group tests were on the whole more finely graded than individual, and if they had been properly elaborated and standardised beforehand their application demanded relatively little special training, whereas for the satisfactory application and interpretation of individual tests, a training in experimental psychology, in the technique of application, and in the use of statistical methods, was indispensable. 
It was pointed out that the proper application of individual tests imposed a considerable nervous strain on the examiner, and that they were frequently of a type which might admit of special preparation beforehand, whereas in group tests the human factor, as represented by the examiner, was more likely to be kept constant. Up to the present, very few group tests had been devised which were suitable for application to children under 10 years of age. (33) In any case they were best suited for older and brighter children, and were not well adapted for very young children who had but recently acquired the art of writing. Conversely, the individual tests hitherto constructed were described as being especially useful for young children under about 10 years of age. Those devised up to the present were, however, of comparatively little use for older children, particularly above the age of 15 or 16, while, on the other hand, group tests had been constructed which were suitable for children of 16 and upwards, and even for adults. Again, group tests implied to a considerable extent the use of pen or pencil in writing, drawing, underlining, etc. When speed of performance was used as a measure of efficiency an error in rating might creep in, for all pupils did not handle pen or pencil with the same ease and facility. It followed that the results obtained were generally of the nature of statistics, and in order that the group records might be of diagnostic value, it was necessary in most instances to test pupils individually. In other words, the data obtained from group tests were probably best regarded as first aids towards mental diagnosis. 
We should mention, however, that one distinguished psychologist assured us that he had found the results of written group tests to be almost as efficient as those obtained from individual oral tests above the level of Standard II in elementary schools, though it was necessary to call attention to the dangers of expecting the same norms from group tests as from individual tests. The general opinion of our witnesses was that, for most practical purposes, group tests should be applied first, and should be followed up, when the services of persons competent to administer individual tests were available, by the individual testing of children whose performances in the group papers had been noticeably abnormal, whether below or above the line. Several witnesses directed attention to the fact that none of the group tests at present in use served to gauge with sufficient precision the following aspects of intelligence: (a) ability to control and concentrate attention; (b) ingenuity and inventiveness; (c) practical judgement; (d) ability to manipulate mental imagery. The great advantage of the individual tests was that they furnished the expert with concentrated material for observation. It was pointed out that it was of the highest importance that an examination should call forth maximum effort on the part of every pupil. A group or collective test (in this respect resembling the ordinary school examination) sometimes failed to do this. However carefully the examiner arranged his question it was inevitable that a few pupils should fall short of their best, and the most accurate instructions would sometimes be misunderstood. In individual testing, on the other hand, while the examiner rigidly adhered to standardised procedure he yet found it possible to vary the 'attunement' of the pupil in such a way that he secured real effort and trustworthy results. THE PLACE OF THE INTERVIEW IN THE APPLICATION OF INDIVIDUAL TESTS AND ITS TECHNIQUE 63. 
Several witnesses thought that the interview, so far as it related to children of school age, was, as at present conducted, frequently rather a test of social opportunity and upbringing than of intellectual capacity. The technique of interviewing had in the past been too empirical and had not been sufficiently studied for its own sake. Indeed, there was little doubt that the psychology of the interview was relatively an unexplored field. In regard to the possibility of teaching the technique of interviewing in connection with psychological tests, several witnesses thought that it should be possible to obtain help from the principles of modern psychotherapy. In general it was urged that more attention should be devoted to the place of the interview both in ordinary examinations and in the application of psychological tests with a view to drawing up a clear statement of the data which could and should be obtained by its use. One witness went so far as to say that the interview as ordinarily conducted was likely to lead to injustice because of the fluctuations in the examiner's powers as well as in those of the candidates. He himself to some extent had found it possible to render judgements arrived at in the result of an interview more objective by methods similar to those employed in the American army rating scale, combined with information given to those conducting such interviews in regard to the fluctuations which were to be expected by reason of 'sampling', or chance. In general, our witnesses were of opinion that the interview would always have its uses, as the most satisfactory results could only be obtained by combining written group tests with oral individual tests. 
It was pointed out that group tests were at present of little value for ascertaining qualities of temperament and character, but that the expert might in an interview glean useful information, though for a really safe estimate it would be sounder to rely on the judgement of experienced observers who had been in contact with the child during longer periods. One witness thought that it would be good policy to aim in paper examinations at measuring intellectual ability and in interviews at appraising the force and quality of the interests and pursuits of individual candidates. The interview might be employed with advantage in further scrutiny of doubtful and borderline cases which had already been picked out by means of the written answers. Another witness suggested that manual tests might in some instances be introduced into the interview, as they could seldom be applied simultaneously to a large number of children. It should be mentioned that Binet attached great importance to the technique of the interview in the application of his individual tests, and gave detailed suggestions regarding the general conditions of the oral examination. Dr Cyril Burt thought that the technique of the interview was susceptible of great improvement by the application of simple scientific principles. Something had already been done to this end by drawing up questionnaires of facts to be noted and observed and by devising rating scales for the registration of such facts in such a way as to make them comparable. HOW FAR DO TESTS OF INTELLIGENCE THROW LIGHT ON CHARACTER OR TEMPERAMENT, AND HOW FAR ARE THEY AFFECTED BY THEM? 64. Our witnesses have admitted that tests of 'intelligence', while measuring in some degree intellectual ability, are also dependent on the opportunity which the child has to learn and his interest in learning. 
There are several other considerations involved in the ability to perform them, the principal of which is perseverance and concentration or, in other words, the capacity to hold the mind down to a task and keep the attention alert and concentrated. The tests were not, indeed, primarily designed to reveal character, as Binet pointed out in the following passage of an article written in 1908: 'Our examination of intelligence cannot take account of all those qualities, attention, will, regularity, perseverance, teachableness and courage, which play so important a part in school work, and also in after life; for life is not so much a conflict of intelligences as a struggle between characters, and we must in fact expect that those children whom we consider to be the most intelligent will not always be those who are the most advanced in their studies'. (34) On the other hand, the exercise of will power, implying a certain measure of concentration and persistence, was involved in the performance of an intelligence test, so that, if applied by a skilful and observant tester, tests of intelligence might incidentally throw light on certain aspects of character, but of course only to a slight and limited degree. But, as Binet pointed out in several of his articles, the tests may yield unsatisfactory results owing to peculiarities in the character or temperament of those to whom they are applied. For example, it is a difficult matter to apply them satisfactorily to highly nervous children, or to clever children who are disposed to suspect that there is some catch underlying the apparently simple questions put to them by the tester. Various attempts were made by Binet to devise as adjuncts temperamental tests for the purpose of assessing conscientiousness, suggestibility, and accuracy of reproduction. (35) In the last few years systematic efforts have been made, more especially by American psychologists, to test the feelings and the will. 
Pressey has endeavoured to detect repulsion and forms of fear by asking the subject to select from a prearranged list of words those that have for him a special meaning, or suggest special dislike or irritation. It is claimed the Porteus Maze affords to some extent a means of measuring recklessness and impulsiveness and, as Dr Burt has pointed out, variability in repeated tests of almost any simple type, as indicated for example by the standard deviation, appears to be partially correlated with instability. On the whole, however, the psychologists who gave evidence assured us that no tests of temperament could at present claim to have passed beyond the stage of tentative experiment. Several witnesses of wide teaching experience assured us that it was improbable that gifts of character, such as earnestness of effort and the will to succeed, would ever in school work triumph over the insuperable obstacle of an extremely low intelligence quotient. Clever children would learn if they wanted; the intellectual attainment of dull children, however much they tried, would never develop beyond a certain point. On the other hand brilliant children frequently had grave faults of character which, if not corrected, limited effectively their educable capacity. Thus, the development of satisfactory quantitative tests of temperamental characteristics would provide a subsidiary estimate which would have to be taken into account in ambiguous and borderline cases. Some of our witnesses suggested that teachers should observe more systematically weakness of character as well as weakness of intellect. Often what seemed an intellectual failing was really due to some temperamental defect. It was urged, also, that there was a definite need for teachers to be able to judge the more elusive characteristics of character and temperament. 
In the playground, as well as in the classroom, they had excellent opportunities for such observation; and many of them, owing to their special gifts and special experience, were quick and penetrating judges of youthful character. But they sometimes seemed to find difficulty in setting down what they had noticed in clear and accurate terms. Clearly formulated upon sound psychological principles, such judgements, it was asserted, would be of much greater value than the data afforded by any existing tests of temperament or morality. CONNECTION BETWEEN EMOTION AND GENERAL EDUCABLE CAPACITY 65. In a memorandum on the question of tests of educable capacity in relation to the emotions, Mrs SS Isaacs, basing her views upon recent psychological research, held that the emotional factors in mental development must be taken into account on the ground that the level of scholastic or practical achievement might be affected by temporary or permanent emotional conditions. She pointed out that the same test might have very different effects on different children according to the 'complexes' which are aroused in the unconscious self. She held that certain mental tests of general capacity should be revised and that it was desirable that those who applied such tests should receive some training in the observation of emotional reactions, since such training yielded a fuller understanding of the emotional life and afforded a means of investigating those causes of disability and failure which had their origin in unconscious conflict. These views have been placed before several of our witnesses, both psychologists and teachers, and it has been pointed out: (1) That if emotional disturbances or aberrations are liable to affect the results of mental tests they are equally liable to affect the results of the ordinary scholastic tests. It is quite true that 'inhibition' may deleteriously affect the answer to an individual group test. 
For example, the use of a particular word in a vocabulary test may prevent the child from answering that particular question. It must, however, be remembered that the psychological examiner employs several tests, so that inhibitions of this type are unlikely to persist over the whole paper of group tests. THE VIEWS OF OUR WITNESSES ON THE VALUE OF STANDARDISED SCHOLASTIC TESTS BASED ON AGE PERFORMANCE FOR GAUGING EDUCABLE CAPACITY 66. Several of our witnesses were disposed to believe in the principle underlying standardised (36) scholastic tests and pointed out the desirability of drawing up a standard series of such tests and obtaining norms (36) of performance therewith at successive ages more especially for children over 11. Dr Cyril Burt thought that it would be a much simpler task to draw up a standard series of scholastic tests and to obtain norms of performance with them at every age than to construct tests of 'intelligence'. He considered that the value of such scales would be very great provided their purpose and significance were rightly understood. There was general agreement regarding the importance of obtaining for such tests not only a norm of average performance, but also some measure of the degree of individual variation. Dr Drever was of opinion that a trustworthy series of tests of attainment in the various school subjects was eminently desirable, for while it was true that the more essential and subtle effects of education did not lend themselves to direct testing, it was nonetheless certain that there were important results which did so lend themselves and that those results furnished a good, though not an infallible guide, towards an estimate of the success of education as a whole. 
In some instances there was not much difficulty in devising satisfactory tests, as, for example, in the fundamental processes of arithmetic: in other subjects, such as in reading or composition, the task of constructing satisfactory tests was harder, and scarcely any of those hitherto devised could be regarded as more than tentative. Some work done by a research committee of the Scottish Educational Institute indicated the possibility of satisfactory testing with respect to composition, but much research was still necessary. He thought that Dr Burt's work on the subject of scholastic tests (37) had furnished tests and norms which must supersede previous work for this country. Caution, however, must be exercised in using Burt's norms which were designed for London children. This was particularly observable in certain subjects, such as spelling for example, where other circumstances might exert a noticeable influence on the ability of the children. Thus one observer had found in an Edinburgh school an advance of a year, or a year and a half on Burt's norms for spelling, due probably to the fact that the spelling problems were easier for the Edinburgh children, owing to the more phonetic character of Scottish pronunciation. Professor Percy Nunn regarded favourably the efforts which were being made to establish norms of performance for pupils of different ages in the examinable parts of the school curriculum, but he strongly deprecated any proposal to impose their use upon the schools by external authority. If used voluntarily by a teacher as a means of finding out how his pupils stood in a given subject, they might often have great value, but their compulsory use would lead to undesirable narrowing of the curriculum in poorer schools, to unhealthy pressure and to the cramping of initiative. He also pointed out that, if standard norms of performance were established it would be necessary to vary them from time to time as the curriculum and teaching methods changed. 
Other witnesses were of opinion that standardised scholastic tests based on age performance would serve the same function in the realm of attainment as was served by the Binet tests in the realm of intelligence, namely, the ordering of children of a given age on the basis of comparison with the average child of that age. Furthermore a comparison of the results derived from the application of the Binet tests and their modification and the scholastic tests respectively in the case of individual children would afford material from which interesting conclusions might be drawn. We incline to the opinion that such tests even if brought to a high stage of development would not in themselves be sufficient to discover the existence of innate capacity as distinct from acquired knowledge, but would require to be supplemented by tests of intelligence. Furthermore it seems probable that pupils could be prepared for standardised educational tests with greater ease than for 'intelligence' tests, so that it would be very necessary to have numerous sets of standardised tests in each general school subject. THE VALUE OF VOCATIONAL TESTS (INCLUDING TESTS OF MANUAL ABILITY) IN DETERMINING EDUCABLE CAPACITY 67. It was pointed out that the vocational end was one of the earliest purposes with reference to which the desirability of some adequate system of testing was recognised, and that historically, therefore, vocational testing should be regarded as prior both to mental and educational testing, though scarcely any real progress had been made with vocational testing before the development of individual or differential psychology towards the end of the nineteenth century. Vocational tests might be roughly classified in accordance with the degree of congruity between the actual problem set to the examinee and the subsequent situations in view: (a) Vocational tests frequently had reference to situations of a definitely limited range, e.g. 
those involved in printing, or in driving an electric train. In such cases fitness for a particular vocation might be determined by presenting the candidates with precisely or approximately the same type of situation which the vocation in question required and estimating quantitatively their capacity for dealing with such a situation; subject to the condition that it must not presuppose technical skill and knowledge which had yet to be acquired. It was evident that on the whole tests of this type were more suited for application in the workshop, factory, office, or in the laboratory, than in an educational institution. Some witnesses, after pointing out the limited possibility of vocational tests designed to anticipate the type of situation which the vocation presented, recognised a growing tendency to rely on the second method of testing candidates for those special qualities which were, or were supposed to be, requisite or desirable in the particular vocation. Other witnesses, however, emphasised the greater value of the first class of tests. Incidentally, several witnesses expressed the opinion that the most pressing problem of vocational testing at the present time was an adequate psychological analysis of the various requirements of the industries under consideration. The psychologist could probably select or devise suitable tests, if only he had a precise knowledge of the processes involved in a particular vocational task and of the special capacities on which ability in that vocation depended. Unfortunately, however, such knowledge was in many instances lacking. When psychological tests had been made more satisfactory, they could be used to discover persons having a relatively high degree of the capacities required for a given occupation. Such persons were more educable in the directions indicated by the tests in the sense that they could attain a higher degree of proficiency after training than those less highly endowed. 
One witness thought it was desirable to devise particular tests for each trade that was at all specialised. It might be possible to have the same tests of general 'intelligence' for all, but after that various groups of trades would require tests for special capacities. It would be necessary to apply a series of such special vocational tests in order to ascertain the particular occupation for which a child was best fitted, but it was important, in the first place, to discover the general level of 'intelligence'. The opinion was expressed that, whereas in occupations which were not mainly of a routine nature the more 'intelligence' a person possessed the greater the probability of success; on the other hand, a positive lack of general 'intelligence' might be an advantage for some kinds of mechanical work. Data in support of this conclusion have actually been published. (38) Vocational tests are at present being employed in this country: (a) in chocolate factories,

Several witnesses pointed out how great would be the value of trustworthy tests of vocational tendencies for entrants to trade schools and to central schools. If, for example, such tests could be used to assign pupils to the industrial or commercial type of central school respectively they would be of great service. We were informed that vocational guidance by the aid of vocational tests was already being attempted in certain elementary and trade schools in the London area. (39) THE NATURE AND VALUE OF PHYSICAL TESTS AS SUPPLEMENTARY TO THE PRECEDING 68. The connotation of the expression 'physical tests' is somewhat ambiguous, but our subcommittee defined it as covering the ordinary medical examination of the individual together with the measurement of certain bodily reactions and traits which might possibly be closely connected with - and therefore might throw light upon - mental activities. 
It is generally agreed that the state of health affects capacity for mental work, and, therefore, that tests of general health and vitality do measure something connected with the intelligence, that any obvious departure from normal health in a candidate should be duly noted, and that the medical reports upon candidates should be available for reference in determining the action to be taken upon the results of tests of intelligence. In this connection physical tests include the following: (a) The measurement of certain physical dimensions such as the length and breadth of the cranium, the blood pressure, lung capacity, etc.

Of our witnesses Dr William Brown considered that these physical tests had some slight value; there was, according to his experience, a certain correlation between the results of physical and mental tests, although admittedly the correlation was low. The general opinion of our witnesses and of psychologists as a body is that physical tests are of little service in estimating educable capacity but are possibly of higher value in determining certain orders of vocational capacity. The aesthesiometer has been extensively used to measure the sensitory discrimination of the skin. The test most frequently applied to detect muscular control is the tapping machine, but the results up to the present have proved unsatisfactory. McDougall's plunger apparatus is sometimes employed to measure accuracy of aim as well as speed of performance. One witness who employed both these pieces of apparatus found that neither of them afforded a trustworthy measure of either practical ability or general ability. In Dr Cyril Burt's opinion these tests in general had been primarily designed to measure physical qualities, and as a consequence were of subordinate value in the determination of educable capacity. 
It might be possible and useful to establish a series of tests specially devised to determine both general intelligence and physical capacity with particular reference to determining the aptitude of pupils for definite vocations and for special forms of education in preparation for these vocations. The only carefully thought out series of tests known to us along these lines are the interesting researches of Dr Mumford, of Manchester, upon the boys of the Manchester Grammar School, and on a limited number of university students. These researches are still in progress. So far as they have gone they indicate a definite correlation between breathing capacity and place in school during the earlier ages and in the lower forms, where boys with a wider scope of breathing, and so, presumably, with a more rapid absorption of oxygen and discharge of carbon dioxide, were, in general, found taking the higher places in class. This relationship between respiratory capacity and mental ability became, however, less noticeable in later years, from sixteen upwards. Another series of observations directed to correlate the marks awarded to some 250 candidates at the School Certificate Examination with their physical growth and breathing capacity has been made and is in process of evaluation by Dr Caradoc Jones, lecturer in mathematics at the University of Manchester, which, when completed, will be presented before the Medical Research Committee which has aided in these researches. The application of breathing tests to a number of second year medical students appeared to show a similar correlationship between higher breathing capacity and high standing in university work. These observations of Dr Mumford (40) need to be repeated on a larger scale with various orders of boys and girls from public schools, and secondary and elementary schools situated both in towns and in the country before any generalisation can safely be made. 
THE DANGER OF SPECIAL PREPARATION OR 'COACHING' FOR THE TESTS 69. Several witnesses admitted that they had found in practice that children sometimes became familiar with individual tests, and in some instances had prepared for them beforehand. Such previous preparation might be the result either of communication from one child to another, or of special 'coaching' beforehand by teachers or other persons. There was little to be feared in regard to the possibility of communication from one child to another. When individual tests were applied, the pupil would generally be given, in the space of about half an hour 30 to 40 different tests, most of which comprised 3 or 4 parts. Consequently his after recollection of those tests would probably be very vague, and in practice it was found that children could give only very confused accounts of such tests afterwards. As regards group tests of 'intelligence', it was claimed that there was little danger of special preparation beforehand, since there was ample scope for indefinite variation, both in the form and in the actual matter of such tests. It was further pointed out that at present teachers seemed, as a rule, to take the keenest interest in the tests, and it was consequently improbable that any very serious attempt would be made to spoil them by 'coaching'. It could not, however, be assumed that this state of affairs would continue indefinitely, but in any case it was considered probable that coaching for group tests would defeat its own end, in view of the large number and variety of tests of that type. One witness had found in administering the tests that, when children had been practised beforehand, the correlation was much more trustworthy and that a higher average was obtained. Several other witnesses recommended that every child should receive a little preliminary exercise, when either individual tests or group tests of 'intelligence' were applied, in order to put them at their ease. 
Another witness suggested that all candidates might be put on a level in the matter of familiarity with tests, by being given a preliminary test just before the actual examination. On the whole, our witnesses were of opinion that the dangers of coaching were not very great, and that a little preparation beforehand might even have good results. It was pointed out that a child who had received special preparation beforehand on one test could not do another well unless he possessed the capacity tested, as routine procedure was impossible in such group tests as 'analogies', 'completions', and 'directions'. In any case, a much more trustworthy result could be obtained by applying the tests several times than by applying them once only. If this procedure were adopted, the existence of special preparation beforehand could probably be detected, and the tester would be in a position to observe the actual effects of practice and appraise the results accordingly. It was further pointed out that in testing for general ability the investigator was frequently more concerned with the character of the mental process than with the result, and that when teachers realised this they would recognise the relative futility of any special preparation beforehand. At the same time, most of our witnesses were agreed that, if in the future individual tests were brought into more general use, it would be of great importance to construct several sets of properly standardised tests as well as to provide alternative tests in the same scale. As regards standardised scholastic tests in specific school subjects, such as reading and arithmetic, it was pointed out that teachers in elementary schools were hardly likely to attempt to prepare pupils, entering at 7 or 8 years of age, beforehand for such tests. 
On the other hand, it was admitted that, if in the future standardised scholastic tests were devised for children in the top forms of elementary schools and in secondary schools, it would be necessary to construct a considerable number of standardised tests for the several ages in order to reduce the danger of coaching. On the whole, our witnesses seemed to be of opinion that it was more probable that older children could be successfully coached for standardised scholastic tests than for tests of 'intelligence'. SUGGESTIONS BY WITNESSES IN REGARD TO THE TRAINING REQUIRED IN ORDER TO ENABLE TEACHERS AND OTHERS TO APPLY AND MARK TESTS OF 'INTELLIGENCE' AND STANDARDISED SCHOLASTIC TESTS 70. We adopt the suggestions made to us by several witnesses that the work in connection with psychological tests of educable capacity may conveniently be classified under four heads: (a) The construction of the tests;

The psychologists who gave evidence before us were agreed that the devising and standardising of all types of psychological tests should be entrusted only to trained psychologists. It was, however, pointed out that it was most desirable that psychologists should keep closely in touch with school work and school conditions when constructing such tests. In regard to individual tests of intelligence there was general agreement that for the satisfactory application of such tests and for the accurate interpretation of the data obtained by their use, careful training in experimental psychology, in the technique of applying the tests and in the use of statistical methods was indispensable. It was repeatedly pointed out that results obtained from the application of individual tests by untrained persons should always be received with the utmost caution and were in fact as a rule almost devoid of scientific value. 
Some of our witnesses thought that experienced teachers, provided they possessed the necessary gifts of personality, could be trained sufficiently in a short time, say two months, to apply the Binet-Simon tests, whether in their original form or in one of the later revisions. On the other hand, there was general agreement that the correct interpretation of the results obtained by the use of individual tests could only be satisfactorily carried out by persons who had received scientific training in a psychological laboratory for at least two years. As regards group tests of intelligence, our witnesses were of opinion that their satisfactory application, when they had once been properly elaborated and standardised, demanded relatively little special instruction, though it was important that the instructions given to the person who actually conducted them should be carried out with precision. The technique involved, however, was relatively simple and many witnesses thought that a teacher should be able to acquire the necessary knowledge by attending a short course in educational psychology with special reference to the significance of group tests and the technique of administering and marking them. On the other hand it was pointed out that the interpretation of the results of group tests was quite as difficult as the interpretation of the data obtained by the use of individual tests, and that it should only be undertaken by persons who had had the necessary special training in a psychological laboratory. In regard to standardised scholastic tests our witnesses were of opinion that the technique of applying them was not difficult and could as a rule be acquired by teachers after a little instruction and practice. As regards vocational tests there was general agreement that they should only be administered and interpreted by trained specialists. 
RECOMMENDATIONS BY WITNESSES ON THE DESIRABILITY OF ESTABLISHING A CENTRAL ORGANISATION TO TRY AND DIRECT NEW TESTS AND TO COLLATE EXPERIENCE 71. There was general agreement among the psychological experts who gave evidence before us in regard to the desirability of establishing a central committee, or clearing house, which could arrange for continuous investigation in various fields over a number of years. Furthermore, now that the first broad outlines of the tests had been worked out, the process of further refinement and standardisation could only be effected by the combined efforts of a large body of cooperating investigators, including specialists in every relevant subject, and of independent workers engaged in collecting data upon a scale sufficiently extensive for statistical analysis. Several witnesses suggested that to this end the Board of Education might set up a central committee comprising administrative officers and inspectors (both from the Board and from local education authorities), teachers, school doctors, psychologists with a knowledge of school children and trained statisticians. The work of a central committee, organised on these lines, would be not unlike that of the National Physical Laboratory in its own sphere. The central committee could keep in close touch with bodies such as the Education Section of the British Psychological Society, and the National Institute of Industrial Psychology, and the psychological departments of the various English and Welsh universities. In this connection special attention was drawn to the important work, in trying and directing new tests and in collating experience, which is being carried on by the Education Section of the British Psychological Society, and by the National Institute of Industrial Psychology. 
Dr Drever further suggested an additional plan of state and local intervention, namely that mental testing should be undertaken as supplementary to the medical inspection of school children, and that every educational area should have its own psychological clinic, but that on the practical side these clinics should be in close connection with the schools, with employment bureaux, appointments committees, and the like, while on the research side they should be in touch with the psychological departments of universities, and from the vocational standpoint with the National Institute of Industrial Psychology.

Footnotes
(1) From the strictly psychological standpoint tests of manual ability should be classified with tests of special mental activities, but as in practice they are closely associated with certain kinds of vocational tests, we have classed them here with such tests.
(2) See Section 86.
(3) See Mental and Scholastic Tests among Retarded Children, by Mr Hugh Gordon, HMI, Board of Education, Educational Pamphlet No. 44 (1923).
(4) Several of our witnesses indicated that the considerations stated above seemed to follow from the fact that the technique of mental tests of intelligence apparently consisted in a sampling of the relatively simpler processes which enter into all intellectual functions. The ordinary examinations on the other hand sampled more complicated forms of the simpler processes. The latter processes were partly the product of innate factors, but the particular complications were largely the result of particular training.
(5) e.g. St. Thomas Aquinas, cf. Alagona, S. Thomae Aquinatis Theol. Summae Compendium I. 79 (10). 'An intellegentia sit distincta potentia ab intellectu. R. Non, sed est actus intellectus'.
(6) It was Sir Francis Galton who first brought the concept prominently into notice in England. cf. his two articles on Hereditary Talent and Character in Macmillan's Magazine for 1865.
(7) The Journal of Educational Psychology for March, April and May 1921, Vol. XII, Nos. 3, 4 and 5.
(8) L'Annee Psychologique (1905, transl. Kite, pp. 42-44).
(9) A. Binet and Th. Simon, L'Intelligence des imbeciles in L'Annee Psychologique (1909), pp. 1-147, summarised by Terman in The Measurement of Intelligence, p. 45. In a book published in 1909, Binet defined intelligence succinctly as follows: 'Comprehension, invention, direction and power of criticism ('censure'), intelligence lies in these four words'. Les Idees Modernes sur les Enfants, p. 118.
(10) Journal of Educational Psychology, XII, 126.
(11) See Appendix IX. cf. Prof. Thomson's statement of his sampling theory of ability in Essentials of Mental Measurement, by Brown and Thomson, 1921, pp. 188-192.
(12) Journal of Educational Psychology, XII, 127.
(13) British Journal of Psychology, V, pp. 51-84.
(14) See also his recent work on The Nature of Intelligence and the Principles of Cognition. Macmillan & Co., 1923.
(15) The 'Alpha' tests: the second, or 'Beta', series contains tests of a more practical, non-verbal type.
(16) L'Annee Psychologique (1911), pp. 166-168.
(17) Burt, Mental and Scholastic Tests, pp. 24-68.
(18) WB Drummond, A Binet Scale for the Blind (reprinted from the Edinburgh Medical Journal). There is also an American form of the Binet scale adapted for deaf children.
(19) L'Annee Psychologique (1908), p. 85.
(20) L'Annee Psychologique (1911), p. 146.
(21) Report for 1912 (Cd. 7184), pp. 372 foll., and Report for 1913 (Cd. 7730), p. 321 foll. A salutary warning was added against the use of the tests by anyone not fully understanding the proper method, and the conditions and appliances necessary for trustworthy results.
(22) Education Act 1921, Section 55, and Mental Deficiency Act 1913, Section 2.
(23) See Annual Report of the Chief Medical Officer of the Board of Education, 1920 (Cd. 1522), p. 100 foll., and The Health of the School Child, being the Annual Report of the Chief Medical Officer of the Board of Education, 1922, pp. 111-112. See also the interesting historical account of the use of such tests up to 1910 in Dr FC Shrubsall's Report on methods of testing mental deficiency, printed in Report of 81st Meeting of the British Association, 1911, pp. 195-214. cf. also Dr Shrubsall's article in School Hygiene for August 1921.
(24) Yerkes, Bridges, and Hardwick, A Point Scale for measuring ability, 1915.
(25) LM Terman, The Measurement of Intelligence, Harrap & Co., 1919.
(26) See Chapter 1, Section 30.
(27) The expression 'group test' has occasionally been used by some writers in the sense of a test for 'group' abilities as opposed to 'special' abilities. It is never employed, however, in this sense in the present Report.
(28) Chapter 1, Section 31.
(29) See Appendix VIII for examples.
(30) The theory underlying this 'combination' test was expounded by Ebbinghaus in an article published in 1897. Zeitschr. f. Psych. und Phys. d. Sinnesorg., xiii., p. 401.
(31) See examples in Appendix VIII.
(32) One of the best known sets of group tests devised by English psychologists is the Northumberland tests, constructed by Professor Godfrey Thomson. See Appendix VIII.
(33) cf. Burt, Mental and Scholastic Tests, p. 221, 'With children under the age of 10 and below Standard IV the results (of group tests) will correlate less highly with intelligence'.
(34) L'Annee Psychologique (1908), p. 77.
(35) See Section 46. cf. also A. Binet, L'Etude Experimentale de l'Intelligence, Paris, 1903.
(36) See Appendix IV (Note on Standardisation and Norms).
(37) Burt, Mental and Scholastic Tests, pp. 257 foll.
(38) cf. Otis AS, 'The Selection of Mill Workers by Mental Tests', J. of Applied Psychology, 1920, IV, 339-341.
(39) See Sections 41 and 42.
(40) Mumford, Alfred A, The Relation between Mental and Physical Efficiency of Boys at the Manchester Grammar School, Trans. Manchester Statistical Soc., 1921 (Dec. 14), pp. 23-45. Estimation of Physique and Stamina for School Purposes, Lancet, 1915, I, p. 115, and Estimation of Physical Fitness in Terms of Respiratory Movements of the Several Regions of the Chest, Lancet, 1922, I, p. 478. Other papers on the same subject have been published by Dr Mumford, in conjunction with Mr Mathew Young, in Biometrika, 1923, and in the Journal of Scientific Physical Training, Vol. 15, No. 43, p. 12 and its extension, the Journal of School Hygiene and Physical Education, Vol. 15, No. 44, p. 45.