The Hadow Report (1924)
Psychological tests of educable capacity and their possible use in the public system of education
Appendix V
NOTE BY Dr CYRIL BURT ON CORRELATION AS APPLIED TO MENTAL TESTING
The statistical device known as correlation is widely used by educational psychologists for measuring the validity of their methods. To test a given test, it is necessary to compare the results obtained from large and representative groups of children, first with later applications of the same test, and secondly with an independent criterion, such as the estimate of a competent teacher who knows intimately all the members of the group. A coefficient of correlation is an index number intended to measure, on a scale of 0 to 1, the amount of agreement between two such series of estimates. Where the agreement is perfect, the figure is unity and positive (+1.00); where there is no agreement whatever, the correlation is zero (0.00); where disagreement is at a maximum, one estimate exactly reversing the order of the other, the coefficient is negative (-1.00). Where agreement is more or less imperfect, the coefficient is a fraction, ranging between these extremes.

The following hypothetical instances illustrate both the nature of correlation generally and what is perhaps the most precise and practicable formula for calculating it in a given case. (See Table 1.)

Table 1: The nature and calculation of coefficients of correlation (Rank Method)

Suppose, for simplicity, there is a small class of ten children (named AB, CD, etc., see col. 1), ranked in order of intelligence by the teacher's estimate from 1st to 10th (col. 2). (In what follows this order will be assumed to be the final criterion.) Let us further suppose that to these children some single test is applied - for example a scholarship examination - which gives for the same class a second ranking in order of intelligence. The problem is: how closely does the second order correspond with the first, and how can we measure different degrees of such correspondence? Clearly, the lack of correspondence can readily be gauged by counting up the total number of discrepancies between the two rankings.
The fewer the discrepancies, the higher will be the measure of agreement. The absolute number of differences in ranking must tend, of course, to increase with the size of the group. We can allow for this by first determining for a group of this size the maximum number of differences in ranking obtainable, or (more simply) the probable number of discrepancies to be expected by pure chance. The proportion between the actual number of discrepancies and the expected number of discrepancies measures the amount of disagreement; and, by subtracting this proportion from unity, we obtain a measure of positive agreement. (See formula at foot of table.) Professor Spearman's simple 'foot rule' formula does, in fact, consist in this simple calculation.

(i) Suppose, first, (example A) that the ranking supplied by the test (col. 3) is identical with that supplied by the teacher (col. 2). There will be no discrepancies (col. 4). The sum total of differences in ranking will be zero. It is clear that the formula proposed will at once yield a coefficient of +1.00. The correlation is positive and complete.

(ii) Suppose now (example B) that the test result bears no relation whatever to the result of the teacher's estimate. It is a purely random order (col. 5). Here, for instance, the order given was obtained by shaking up ten dominoes in a dice box, and drawing them one by one, as it were, by lot. Clearly, the discrepancies so obtained should be almost identical with the number of discrepancies that we might expect between any two chance arrangements. It will be seen that the nature of the formula yields a coefficient of approximately zero.

(iii) Thirdly, suppose (example C) that the test exactly reverses the order of the teacher, so that the child whom the teacher put at the top of the list is placed by the test at the bottom, and the child whom the teacher placed last is placed first by the test. It is clear that the discrepancies will here be a maximum.
And the application of the formula necessarily yields a coefficient of -1.00.

In the formula proposed the only difficult portion is to calculate what, for a group of a given size, is the total number of differences in ranking to be expected by pure chance. This is deduced from the mathematics of probability. (It is best, first of all, to square all the rank-differences - a well-known algebraical device for eliminating the differences of plus and minus signs (cols. 7, 10, 13). This squaring constitutes the main difference between the present formula and the simpler 'foot rule' of Professor Spearman.) The non-mathematical can readily understand the calculation by looking more closely at the extreme case of negative correlation (example C). He will perceive that the maximum number of differences obtainable is, by the very nature of the computation to be made, exactly equal to twice the sum of the alternate numbers from 0 to the number just before the total number in the group, i.e. since the group contains ten children, the sum of the odd numbers from 1 to 9 (col. 9).

(iv) Coefficients of exactly unity or exactly zero are rare. The kind of ranking actually obtained in such an examination is illustrated by the ranking given in example D (col. 11). Here it will be seen that the children marked 1½ and the children marked 5 are 'ties', and have therefore been given the midway ranking. Applying the formula just deduced, the coefficient proves to be about .7. With an actual test this would be considered rather a high degree of correlation. As a matter of fact, however, the ranking given in this illustration (col. 11) was obtained by averaging the teacher's estimate (col. 2) with the chance order (col. 5), thus giving both an equal weight. The distribution of the share of each is, as it were, fifty-fifty.
The uninstructed teacher is apt to assume that, with a correlation of .7, the part played by chance is much smaller, in the ratio of (1-.7) to .7, that is 3 parts out of 10. As a matter of fact, the coefficient must first be squared; and the proportion of chance and of intelligence is as (1.00-.71²) to .71², that is (since .71²=.5 approx.) 5 parts out of 10. Thus, if the correlation coefficient is the best measure of agreement between two series, the square of the coefficient is the best measure of the part contributed by the factor common to both.
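
The calculation described in this note can be sketched in a few lines of Python. The formula at the foot of Table 1 is not reproduced in this text, so the sketch assumes the usual rank-difference form, 1 - 6Σd²/(n(n²-1)), in which n(n²-1)/6 is the sum of squared rank-differences expected by pure chance (165 for ten children, half the maximum of 330 reached when one order exactly reverses the other). The function names and the tied scores are illustrative only and are not taken from the report.

```python
def chance_sum_sq(n):
    """Sum of squared rank-differences expected by pure chance for n children
    (assumed form: n(n^2 - 1) / 6)."""
    return n * (n * n - 1) / 6


def rank_correlation(ranks_a, ranks_b):
    """One minus the ratio of actual to chance-expected squared rank-differences."""
    if len(ranks_a) != len(ranks_b):
        raise ValueError("both rankings must cover the same children")
    actual = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    return 1 - actual / chance_sum_sq(len(ranks_a))


def midway_ranks(scores):
    """Rank scores from highest (rank 1) downwards; tied scores share the
    midway rank, i.e. the mean of the places they jointly occupy."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next child has the same score.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        shared = (pos + 1 + end + 1) / 2  # mean of places pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks


def common_factor_share(r):
    """Share attributed to the factor common to both series: the coefficient squared."""
    return r * r


teacher = list(range(1, 11))            # teacher's order, 1st to 10th
identical = list(teacher)               # example A: test agrees exactly
reversed_order = list(reversed(teacher))  # example C: test reverses the order

print(rank_correlation(teacher, identical))
print(rank_correlation(teacher, reversed_order))

# Hypothetical test marks with ties (not the data of example D).
hypothetical_marks = [88, 88, 75, 70, 70, 70, 60, 55, 50, 40]
print(rank_correlation(teacher, midway_ranks(hypothetical_marks)))

# The naive reading of r = .71 against the squared reading.
print(1 - 0.71, 1 - common_factor_share(0.71))
```

The last line contrasts the naive estimate of the part played by chance (1 - r) with the estimate obtained after squaring the coefficient (1 - r²), which is the point of the closing paragraph. With tied ranks the assumed formula is only an approximation.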