Eight types of schizophrenia? Not so fast…

Editor’s note: this guest post was contributed by ten leading psychiatric geneticists (see author list at the end of the post) in response to the headline-grabbing claims of a recent paper that reported eight genetic sub-types of schizophrenia. Similar text has also been posted on PubMed Commons and elsewhere. [DM]

In a study published on September 15, Arnedo et al. asserted that schizophrenia is a heterogeneous group of disorders underpinned by different genetic networks mapping to differing sets of clinical symptoms. As a result of their analyses, Arnedo et al. have made remarkable and perhaps unprecedented claims regarding their capacity to subtype schizophrenia. This paper has received considerable media attention, and one claim features in many media reports: that schizophrenia can be delineated into “8 types”. If these claims were replicable and consistent, the work reported in this paper would constitute an important advance in our knowledge of the etiology of schizophrenia.

Unfortunately, these extraordinary claims are not justified by the data and analyses presented. Their claims are based upon complex (and we believe flawed) analyses that are said to reveal links between clusters of clinical data points and patterns of data generated by looking at millions of genetic data points. Instead of the complexities favored by Arnedo et al., there are far simpler alternative explanations for the patterns they observed. We believe that the authors have not excluded important alternative explanations – if we are correct, then the major conclusions of this paper are invalidated.

Analyses such as these rely on independence in many ways: independence among the variables used in prediction, the absence of artifactual relationships between genotypes and clinical variables, and independence between the methods of assessing significance and replication. Below we identify five specific areas of concern that are not adequately addressed in the manuscript, each of which calls into question the conclusions of this study.

Ancestry/population stratification

Two of the three samples the authors studied (MGS and CATIE) have substantial proportions of subjects of European and African ancestry. The third sample is from southern Europe. Ancestry is an extremely well-known confounder in genetic studies with a great capacity to yield false associations. Correct inference from genomic data in samples like these requires exceptional care. In the analyses they present, there is almost no mention of how this known bias was addressed and no evaluation of its impact on their results. In the samples they used, the sets of SNPs that track together, to which they repeatedly refer, are essentially the definition of uncorrected population structure/stratification. Indeed, a central component of their statistical methodology, nonnegative matrix factorization, has previously been employed as a method for ancestry inference in the population genetics literature.

We were unsuccessful in attempts to obtain the full list of SNPs that Arnedo et al. analyzed. Instead, we evaluated the SNPs listed in Table S3 (448 SNP entries; 245 unique SNPs, as a SNP could be listed more than once; and 237 SNPs with valid allele frequencies in HapMap3). We computed the absolute value of the difference in allele frequencies between the CEU (northwest European) and YRI (Yorubas from Nigeria) groups for all HapMap3 SNPs passing basic quality control (688K SNPs genotyped using Affymetrix 6.0 arrays to match the MGS sample). We then contrasted the SNPs used by Arnedo et al. with all other Affymetrix 6.0 SNPs. The Table S3 SNPs had markedly larger differences between a European and an African group. The mean for the absolute difference in allele frequency was 0.27 for the Table S3 SNPs used by Arnedo et al. versus 0.19 for all other SNPs. These highly significant differences underscore our concerns about population stratification bias.
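The comparison described above can be sketched in a few lines. This is only an illustration of the calculation, not the code we ran: the SNP names and allele frequencies below are made up (they are not HapMap3 values), and the function name mean_abs_freq_diff is our own.

```python
from statistics import mean

def mean_abs_freq_diff(freq_a, freq_b, snp_ids):
    """Mean absolute allele-frequency difference over the given SNP ids."""
    return mean(abs(freq_a[s] - freq_b[s]) for s in snp_ids)

# Toy allele frequencies, invented for illustration (NOT real HapMap3 data).
ceu = {"rs1": 0.10, "rs2": 0.50, "rs3": 0.80, "rs4": 0.30}
yri = {"rs1": 0.60, "rs2": 0.55, "rs3": 0.40, "rs4": 0.25}

selected = {"rs1", "rs3"}          # stand-in for the Table S3 SNPs
background = set(ceu) - selected   # stand-in for all other array SNPs

sel_diff = mean_abs_freq_diff(ceu, yri, selected)
bg_diff = mean_abs_freq_diff(ceu, yri, background)
print(f"selected: {sel_diff:.2f}  background: {bg_diff:.2f}")
```

A selected set whose mean difference is clearly larger than the background, as in the real comparison (0.27 versus 0.19), is enriched for SNPs that distinguish the two ancestry groups.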

The X chromosome

We noted that 15 of the 237 SNPs in Table S3 were on chrX (again, Table S3 contains a fraction of the SNPs used in the modeling). Inclusion of chrX SNPs will partly reflect the sex of participants. Arnedo et al. say in their supplement that they include sex as a covariate in their regressions, but they do not describe how they account for sex in their matrix factorization. For example, since males have only one copy of chrX, genotypes for males will be either 0 or 1 whereas chrX genotypes for females will be 0, 1 or 2. This difference will be salient to clustering algorithms such as those employed by the authors, so it seems likely that some component of the clusters of individuals identified by Arnedo et al. simply reflects genotype differences between sexes rather than clinical features of schizophrenia. It is well known in statistical genetics that the sex chromosomes require special handling, but this issue is not addressed by Arnedo et al.
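To make the coding issue concrete, here is a small simulation sketch. It assumes a single hypothetical chrX SNP with the same allele frequency (0.3, an arbitrary choice) in both sexes, and the 0/1 versus 0/1/2 coding described above; the function name and sample sizes are ours.

```python
import random

def simulate_chrx_genotypes(n, p, copies, rng):
    """Allele counts for n people who each carry `copies` X chromosomes."""
    return [sum(rng.random() < p for _ in range(copies)) for _ in range(n)]

rng = random.Random(0)
p = 0.3  # assumed allele frequency, identical in males and females

males = simulate_chrx_genotypes(2000, p, copies=1, rng=rng)    # coded 0 or 1
females = simulate_chrx_genotypes(2000, p, copies=2, rng=rng)  # coded 0, 1 or 2

male_mean = sum(males) / len(males)
female_mean = sum(females) / len(females)
print(f"mean allele count: males {male_mean:.2f}, females {female_mean:.2f}")
```

Even though the allele frequency is identical, the mean coded genotype in females is roughly twice that in males, so any method that groups individuals by genotype similarity can pick up sex unless chrX is handled separately.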

Linkage disequilibrium

Pairs of SNPs that are physically close in the genome are often correlated with one another, a phenomenon known as linkage disequilibrium (LD). Furthermore, in samples containing individuals with different ancestry, SNPs on different chromosomes whose allele frequencies differ between populations will appear to be correlated. These are both well-known phenomena from population genetics.
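The second phenomenon is easy to reproduce. The sketch below simulates two unlinked SNPs that are independent within each of two hypothetical populations, and then measures their squared correlation within one population and in the pooled sample. The allele frequencies (0.1 and 0.9) are deliberately extreme and chosen only for illustration.

```python
import random

def pearson_r2(x, y):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

def genotypes(n, p, rng):
    """Diploid allele counts (0, 1 or 2) for n people at allele frequency p."""
    return [sum(rng.random() < p for _ in range(2)) for _ in range(n)]

rng = random.Random(1)
n = 2000
# Two SNPs on different chromosomes, drawn independently within each population.
a1, a2 = genotypes(n, 0.1, rng), genotypes(n, 0.1, rng)  # population A
b1, b2 = genotypes(n, 0.9, rng), genotypes(n, 0.9, rng)  # population B

r2_within_a = pearson_r2(a1, a2)
r2_pooled = pearson_r2(a1 + b1, a2 + b2)
print(f"r2 within A: {r2_within_a:.3f}  r2 pooled: {r2_pooled:.3f}")
```

Within a single population the two SNPs are essentially uncorrelated, but in the pooled sample they appear strongly correlated purely because both track ancestry.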

The typical size of blocks defined by high LD is on the order of 20,000 bases, but LD is far from uniform across the genome. Using a large European sample genotyped with Affymetrix 6.0 arrays, we had previously computed the locations of particularly large blocks of LD (defined using SNPs with r² > 0.5). The first step in the statistical methodology described by Arnedo et al. is to identify so-called “SNP sets” – sets of SNPs that travel together – which the authors believe contain some information about clinical subtypes of schizophrenia: “we first identified sets of interacting … SNPs that cluster within subgroups of individuals … regardless of clinical status” (no LD limitations were imposed). Of the 237 SNPs in Table S3 from Arnedo et al., 153 (65%) mapped to exceptionally large LD blocks larger than 100,000 bases (median 275 kb, interquartile range 165-653 kb, maximum 1.2 Mb).

Arnedo et al. claim repeatedly that sets of SNPs that travel together are informative about clinical subtypes of schizophrenia. A more parsimonious interpretation of the SNP clusters identified by Arnedo et al. is that these SNPs represent a combination of (1) SNPs in large LD blocks and (2) SNPs whose allele frequencies differ substantially between European and African sample subsets. Indeed, matrix factorization algorithms similar to the methods employed by Arnedo et al. have been used to identify regions with long-range LD.

SNP selection

Arnedo et al. conducted genetic clustering analyses on 2,891 SNPs, selected from a total of ~700,000 SNPs on the basis of in-sample P-values from analysis of association with case-control status. It is therefore expected that linear or non-linear combinations of these SNPs will be associated with case-control status in the same sample (their risk statistic); this is true even if the selected SNPs are not truly associated. A permutation test is used to assess the significance of the observed phenotype/genotype clustering. In this permutation test, subjects are randomly allocated to “SNP sets” but, since the SNPs were selected because they differ in allele frequency between cases and controls, this procedure does not generate a valid null distribution. As a result, the reported P-values are incorrect.

The strategy used by Arnedo et al. is an example of estimation and selection of effects in a dataset and then testing (or re-estimating) them in the same data, a common pitfall of prediction analyses. To construct a valid permutation test, the authors should have randomized case-control status in the association analysis step, selected a new set of ~3,000 SNPs and generated a distribution of their coincident test index under a truly null distribution.
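The pitfall can be illustrated with pure noise. The sketch below is a toy analogue, not a reimplementation of the procedure of Arnedo et al.: genotypes are random, case-control labels carry no signal, and the "risk statistic" (the summed absolute case-control difference over the top-k SNPs) is our own stand-in. It compares a permutation test that keeps the selected SNPs fixed with one that repeats the selection step after every permutation.

```python
import random

def assoc_stat(col, status):
    """Mean allele count in cases minus mean allele count in controls."""
    cases = [g for g, s in zip(col, status) if s]
    ctrls = [g for g, s in zip(col, status) if not s]
    return sum(cases) / len(cases) - sum(ctrls) / len(ctrls)

def fixed_score(geno, status, idx):
    """Toy risk statistic on a pre-chosen set of SNP indices."""
    return sum(abs(assoc_stat(geno[j], status)) for j in idx)

def select_and_score(geno, status, k):
    """Select the top-k SNPs for these labels, then score them."""
    stats = sorted((abs(assoc_stat(c, status)) for c in geno), reverse=True)
    return sum(stats[:k])

rng = random.Random(2)
n, m, k, n_perm = 100, 200, 10, 50                 # arbitrary toy sizes
geno = [[rng.randint(0, 2) for _ in range(n)] for _ in range(m)]  # pure noise
status = [i < n // 2 for i in range(n)]            # no true association

# In-sample selection of the k most "associated" SNPs.
order = sorted(range(m), key=lambda j: abs(assoc_stat(geno[j], status)),
               reverse=True)
top = order[:k]
observed = fixed_score(geno, status, top)

naive_null, valid_null = [], []
for _ in range(n_perm):
    perm = status[:]
    rng.shuffle(perm)
    naive_null.append(fixed_score(geno, perm, top))      # selection not redone
    valid_null.append(select_and_score(geno, perm, k))   # selection redone

naive_p = (1 + sum(v >= observed for v in naive_null)) / (n_perm + 1)
valid_p = (1 + sum(v >= observed for v in valid_null)) / (n_perm + 1)
print(f"naive P = {naive_p:.3f}, valid P = {valid_p:.3f}")
```

Because the SNPs were chosen for their in-sample differences, the naive test reports a small P-value even though the data contain no signal. Only the test that repeats the selection inside each permutation compares like with like.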

Replication

Replication of results is a well-acknowledged strategy for generating confidence in reported findings. Arnedo et al. state that they replicated their findings in two samples but, upon closer examination, it is unclear precisely what replicated, exactly how this was done, and whether the degree of “replication” deviated from that expected by chance. It is also unclear whether the replication control samples were independent of the discovery sample. Such non-independence is another common pitfall in prediction or validation analysis.

Conclusions

Given the remarkable claims made by Arnedo et al., it is essential that alternative explanations be excluded. Unfortunately, the authors do not provide the necessary evidence. As presented, their methodology is opaque (even to experts), meaning that their results cannot be independently validated. Arnedo et al. do not consider alternative explanations for the phenomena that they observe, such as confounding from ancestry and LD, even though these are well-known issues for the statistical methods that they employ and have been studied extensively in the statistical and population genetics literature. In addition, their multistep analysis approach is subject to multiple issues as noted above.

We believe that it is highly likely that the results of Arnedo et al. are not relevant for schizophrenia. We urge great caution in the interpretation of the results of this study.

Authors
Gerome Breen, PhD (Institute of Psychiatry, King’s College London, London, UK)
Brendan Bulik-Sullivan (Broad Institute, Cambridge, MA, USA)
Mark Daly, PhD (Broad Institute, Cambridge, MA, USA)
Sarah Medland, PhD (QIMR Berghofer, Brisbane, Australia)
Benjamin Neale, PhD (Broad Institute, Cambridge, MA, USA)
Michael O’Donovan, MD PhD (Cardiff University, Cardiff, UK)
Stephan Ripke, PhD (Broad Institute, Cambridge, MA, USA)
Patrick Sullivan, MD (Karolinska Institutet, Stockholm, Sweden)
Peter Visscher, PhD (University of Queensland, Brisbane, Australia)
Naomi Wray, PhD (University of Queensland, Brisbane, Australia)

(All authors contributed equally to this work)