S and G in Italian regions: Re-analysis of Lynn's data and new data

Abstract

I analyze the S factor in Italian regions by reanalyzing data published by Lynn (2010) as well as new data compiled from the Italian statistics agency (7 and 10 socioeconomic variables, respectively). The S factors from the two datasets are highly correlated (.92) and both are strongly correlated with a G factor from PISA scores (.93 and .88).

Introduction

One can study a given human trait at many levels. Probably the most common is the individual level. The next-most common is the inter-national, and the least common is perhaps the intra-national. This last one can be done at various levels too, e.g. state, region, commune, and city. These divisions usually vary by country.

The study of general intelligence (GI) at these higher levels has been called the ecology of intelligence by Richard Lynn (1979, 1980) and the sociology of intelligence by Gottfredson (1998). Lynn’s two old papers cited above actually contain quite a bit of data that can be re-analyzed too. I will do so in a future post. I have also decided that this series of posts will have to turn into one big paper with a review and meta-analysis. There are strong patterns in the data not previously explored or synthesized by researchers.

Lynn has published a number of papers on the regions of Italy (2010a, 2010b, 2012, and Piffer and Lynn 2014) and it is this topic I turn to in this post.

Lynn’s 2010 data

True to his style, Lynn (2010a) includes the raw data used for his analysis. This is fortunate because it means anyone can re-analyze them. His paper contains the following variables:

  1. 3x PISA subtests: reading, math, science
  2. An average of these PISA scores
  3. An IQ derived from the average
  4. Stature for 1855, 1910, 1927 and 1980
  5. Per capita income for 1970 and 2003
  6. Infant mortality for 1955 and 1999
  7. Literacy 1880
  8. Years of education 1951, 1971 and 2001
  9. Latitude.

These data are given for 12 Italian regions.

Lynn himself merely did correlational analysis and discussed the results. The data however can be usefully factor analyzed to extract a G (from the three PISA subtests) and S factor (from all the socioeconomic variables). I imported the data into R.

Lynn’s choice of variables is quite odd. They are not all from the same years, presumably because he picked them from various other papers instead of going to the Italian statistics website to fetch some himself. This raises the question of how to analyze them. I ran a factor analysis (MinRes, default settings for fa() from the psych package) on the newer socioeconomic data only, on the older data only, and on all of it. The two analyses of the limited datasets did not reveal anything of interest that is not also shown in the full analysis, so I only show results from the full analysis. Note that by doing this, I broke the rule of thumb about the number of cases per variable (at least 2), because there are 7 variables in my analysis but only 13 cases with full data. The loading plot is:

[Figure: S_loadings]

This plot reveals no surprises.
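To make the extraction step concrete, here is a minimal sketch in Python rather than R. It is not the MinRes method of fa() used for the analysis: it substitutes the first principal component of the correlation matrix, found by power iteration, as a simple stand-in for a general factor. The three variables and their values are invented for illustration and are not Lynn's data, and all function names are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(columns):
    """Correlation matrix for a list of variables (one list per variable)."""
    return [[pearson(a, b) for b in columns] for a in columns]


def first_component_loadings(corr, iterations=500):
    """Loadings on the first principal component, via power iteration.

    This is a stand-in for factor extraction, NOT the MinRes algorithm.
    """
    k = len(corr)
    v = [1.0] * k
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient gives the eigenvalue for the unit-length vector v.
    eigenvalue = sum(
        v[i] * sum(corr[i][j] * v[j] for j in range(k)) for i in range(k)
    )
    loadings = [x * math.sqrt(eigenvalue) for x in v]
    # Arbitrary sign convention: make the first variable load positively.
    if loadings[0] < 0:
        loadings = [-x for x in loadings]
    return loadings


# Invented values for six hypothetical regions (not real data).
income = [10, 12, 14, 16, 18, 20]
education = [8, 9, 9, 11, 12, 12]
infant_mortality = [30, 27, 26, 22, 21, 18]

corr = correlation_matrix([income, education, infant_mortality])
loadings = first_component_loadings(corr)
for name, loading in zip(["income", "education", "infant mortality"], loadings):
    print(f"{name}: {loading:.2f}")
```

With indicators this strongly intercorrelated, the loadings come out close to 1 in absolute value, and the sign shows the direction of each indicator.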

The loadings for the G factor with the PISA subtests were all .99, so it is pointless to post a plot. The scatter plot for G and S is:

[Figure: S_G]

And the method of correlated vectors (MCV) with reversing:

[Figure: MCV_r]
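As a rough illustration of what MCV with reversing computes, the sketch below correlates the vector of S loadings with the vector of each indicator's correlation with G. It assumes that "reversing" means flipping the sign of indicators with negative S loadings, so that the result is not driven merely by the direction of loading. The numbers are invented and the function names are hypothetical; this is not the R code used for the plot.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mcv_with_reversing(s_loadings, correlations_with_g):
    """Correlate S loadings with indicator-by-G correlations.

    Indicators with a negative S loading are reversed first: both their
    loading and their correlation with G are multiplied by -1.
    """
    reversed_loadings, reversed_correlations = [], []
    for loading, r in zip(s_loadings, correlations_with_g):
        sign = -1.0 if loading < 0 else 1.0
        reversed_loadings.append(sign * loading)
        reversed_correlations.append(sign * r)
    return pearson(reversed_loadings, reversed_correlations)


# Invented values for five hypothetical indicators (not real results).
s_loadings = [0.9, 0.7, -0.8, 0.3, -0.5]
correlations_with_g = [0.85, 0.6, -0.75, 0.2, -0.4]

mcv_r = mcv_with_reversing(s_loadings, correlations_with_g)
print(f"MCV correlation after reversing: {mcv_r:.2f}")
```

Without reversing, a mix of strongly positive and strongly negative loadings can produce a high correlation simply because the signs line up, which is why the reversed version is the more conservative check.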

New data

Being dissatisfied with the data Lynn reported, I decided to collect more data. The PISA 2012 results have PISA scores for more regions than before which allows for an analysis with more cases. This also means that one can use more variables in the factor analysis. The new PISA data has 22 regions, so one can use about 11 variables. However, due to some missing data, only 21 regions were available for analysis (Südtirol had some missing data). So I settled on using 10 variables.
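The rule of thumb applied here (at least 2 cases per variable) is simple integer arithmetic. A small sketch, with a hypothetical helper name:

```python
def max_variables(n_cases, cases_per_variable=2):
    """Largest number of variables allowed by a cases-per-variable rule of thumb."""
    return n_cases // cases_per_variable


print(max_variables(22))  # 11: all regions in the new PISA data
print(max_variables(21))  # 10: regions left after dropping the one with missing data
```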

To get data for the analysis, I followed the approach taken in the previous post on the S factor in US states. I went to the official statistics bank, IStat, and fetched data for the regions. Like before, for MCV to work well, one needs a diverse selection of variables, so that there is diversity in their S loadings (not just direction of loading). I settled on the following 10 variables:

  1. Political participation index, 9 years
  2. Percent with normal weight, 9 years
  3. Percent smokers, 10 years
  4. Intentional homicide rate, 4 years
  5. Total crime rate, 4 years
  6. Unemployment, 10 years
  7. Life expectancy males, 10 years
  8. Total fertility rate, 10 years
  9. Interpersonal trust index, 5 years
  10. No savings percent, 10 years

I fetched the last 10 years of data for each variable when available, and then calculated the mean across all available years.
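The averaging step can be sketched as follows, assuming missing years are simply skipped (the text does not say how gaps were handled). The region names and values are placeholders, not Istat data.

```python
def mean_of_available(values):
    """Mean of the non-missing yearly values; None if every year is missing."""
    present = [v for v in values if v is not None]
    if not present:
        return None
    return sum(present) / len(present)


# Placeholder yearly unemployment rates; None marks a missing year.
unemployment_by_year = {
    "Region A": [6.0, 7.0, 8.0, None],
    "Region B": [12.0, None, 14.0, 16.0],
}

regional_means = {
    region: mean_of_available(values)
    for region, values in unemployment_by_year.items()
}
print(regional_means)
```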

For cognitive data, I fetched the regional scores for reading, mathematics and science subtests from PISA 2012, Annex B2.

Factor analysis

I proceeded like above. The loadings plot is:

[Figure: S2_loadings]

There are two odd results. The first is that total crime rate has a slight positive loading (.16) while intentional homicide rate has a strong negative loading (-.72). Lynn reported a similar finding in his 1979 paper on the British Isles. He explained it as being due to urbanization, which increases population density, which in turn increases crime rates (more opportunities, more interpersonal conflicts). An alternative hypothesis is that the total crime rate is being increased by immigrants, who live mostly in the north. Perhaps one can get crime rates for natives only to test this. A third hypothesis is that it has to do with differences in the legal system, for instance, prosecutor practice in determining which actions to pull into the legal system.

The second odd result is that fertility has a positive loading. Generally, it has been found that fertility has a slight negative correlation with GI and the s factor at the individual level, see e.g. Lynn (2011). It has also been found that internationally, GI has a strong negative relationship to fertility, -.5 to -.7 depending on the measure (Shatz, 2008; Lynn and Harvey, 2008). I also found something similar, -.5, when I examined Danish immigrant groups by country of origin (Kirkegaard, 2014). However, if one examines European countries only, one sees that fertility is relatively ‘high’ (a bit below 2) in the northern countries (Nordic countries, UK), and low in the southern and eastern countries. This means that the correlation between fertility and IQ (e.g. PISA) across European countries is positive. Maybe this has some relevance to the current finding. Maybe immigrants are pulling the fertility up in the northern regions.

There is little to report from the factor analysis of PISA results. All loadings were between .98 and .99.

Scatter plot of S and G

[Figure: S2_G2]

MCV with reversing

[Figure: MCV2_r]

Inter-dataset scatter plots

To examine the inter-dataset stability of factor scores:

[Figures: S_S2, G_G2]

For one case, the Lynn dataset had data for a merged region. I merged the two regions in the new dataset to match it up against the one from Lynn’s. This is the conservative choice. One could have used Lynn’s data for both regions instead which would have increased the sample size by 1.
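A minimal sketch of the merging step, assuming the merged value is the unweighted mean of the two regions (the text does not state whether a simple or population-weighted mean was used, so this is only one possible choice). Region names and values are placeholders, and the helper name is hypothetical.

```python
def merge_regions(values, parts, merged_name):
    """Replace the regions listed in `parts` with one region holding their unweighted mean."""
    merged = {name: v for name, v in values.items() if name not in parts}
    merged[merged_name] = sum(values[p] for p in parts) / len(parts)
    return merged


# Placeholder values for one variable in the new dataset.
new_values = {"Region A": 0.5, "Region B": 1.5, "Region C": -1.0}

matched = merge_regions(new_values, ["Region A", "Region B"], "Region A+B")
print(matched)
```

A population-weighted mean would arguably be a better match to how the merged region was measured, but it requires population counts that are not part of this sketch.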

Discussion

The relationship between regional G and S in Italy is especially strong. It rivals even that of the international S factor in its correlation with the G estimates. Italy really is a very divided country. Stability across datasets was very strong too, so Lynn’s odd choice of data was not inflating the results.

MCV worked better in the dataset with more, and more diverse, indicator variables for S, as would be expected if the correlation was artificially low in the first dataset due to restriction of range in the S loadings.

Supplementary material

All project files (R source code, data files, plots) are available on the Open Science Framework repository.

Thanks to Davide Piffer for catching an error and for help in matching up the regions from the two datasets.

References

  • Gottfredson, L. S. (1998). Jensen, Jensenism, and the sociology of intelligence. Intelligence, 26(3), 291-299.
  • Kirkegaard, E. O. (2014). Criminality and fertility among Danish immigrant populations. Open Differential Psychology.
  • Lynn, R. (1979). The social ecology of intelligence in the British Isles. British Journal of Social and Clinical Psychology, 18(1), 1-12.
  • Lynn, R. (1980). The social ecology of intelligence in France. British Journal of Social and Clinical Psychology, 19(4), 325-331.
  • Lynn, R., & Harvey, J. (2008). The decline of the world’s IQ. Intelligence, 36(2), 112-120.
  • Lynn, R. (2010a). In Italy, north–south differences in IQ predict differences in income, education, infant mortality, stature, and literacy. Intelligence, 38(1), 93-100.
  • Lynn, R. (2010b). IQ differences between the north and south of Italy: A reply to Beraldo and Cornoldi, Belacchi, Giofre, Martini, and Tressoldi. Intelligence, 38(5), 451-455.
  • Lynn, R. (2011). Dysgenics: Genetic deterioration in modern populations. Second edition. Westport CT.
  • Lynn, R. (2012). IQs in Italy are higher in the north: A reply to Felice and Giugliano. Intelligence, 40(3), 255-259.
  • Piffer, D., & Lynn, R. (2014). New evidence for differences in fluid intelligence between north and south Italy and against school resources as an explanation for the north–south IQ differential. Intelligence, 46, 246-249.
  • Shatz, S. M. (2008). IQ and fertility: A cross-national study. Intelligence, 36(2), 109-111.

Reviews

License

This article and its reviews are distributed under the terms of the Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and redistribution in any medium, provided that the original author and source are credited.