JOURNAL TRANSCRIPT
Diabetologia (2014) 57:1528–1541 DOI 10.1007/s00125-014-3270-4
REVIEW
Genetic susceptibility to type 2 diabetes and obesity: from genome-wide association studies to rare variants and beyond Niels Grarup & Camilla H. Sandholt & Torben Hansen & Oluf Pedersen
Received: 13 December 2013 / Accepted: 22 April 2014 / Published online: 24 May 2014
© Springer-Verlag Berlin Heidelberg 2014
Abstract During the past 7 years, genome-wide association studies have shed light on the contribution of common genomic variants to the genetic architecture of type 2 diabetes, obesity and related intermediate phenotypes. The discoveries have firmly established more than 175 genomic loci associated with these phenotypes. Despite the tight correlation between type 2 diabetes and obesity, these conditions do not appear to share a common genetic background, since they have few genetic risk loci in common. The recent genetic discoveries do, however, highlight specific details of the interplay between the pathogenesis of type 2 diabetes, insulin resistance and obesity. The focus is currently shifting towards investigations of data from targeted array-based genotyping and exome and genome sequencing to study the individual and combined effect of low-frequency and rare variants in metabolic disease. Here we review recent progress as regards the concepts, methodologies and derived outcomes of studies of the genetics of type 2 diabetes and obesity, and discuss avenues to be investigated in the future within this research field.
Keywords Adiposity · Body mass index · Genetics · Genome-wide association · Glycaemic traits · Obesity · Review · Sequencing · Type 2 diabetes
N. Grarup (*) · C. H. Sandholt · T. Hansen · O. Pedersen
The Novo Nordisk Foundation Center for Basic Metabolic Research, Faculty of Health and Medical Sciences, University of Copenhagen, DIKU Building, Universitetsparken 1, 2100 Copenhagen Ø, Denmark
e-mail: [email protected]
T. Hansen
Faculty of Health Sciences, University of Southern Denmark, Odense, Denmark
Abbreviations
DIAGRAM: DIAbetes Genetics Replication And Meta-analysis
GIANT: Genetic Investigation of ANthropometric Traits
GWAS: Genome-wide association study
MAF: Minor allele frequency
MAGIC: Meta-Analyses of Glucose and Insulin-related traits Consortium
SNP: Single-nucleotide polymorphism
Introduction On a global scale, the prevalence of obesity and type 2 diabetes is increasing dramatically, and WHO reports that more than 500 million people are obese [1] and 346 million have diabetes, 90% of whom have been diagnosed with type 2 diabetes [2]. Lifestyle and environmental factors are crucially important in the development of obesity and type 2 diabetes. Important risk factors for obesity are physical inactivity, excessive energy intake, depression, sleep disorders and low socioeconomic status, while major risk factors for type 2 diabetes include obesity, especially visceral fat deposition, physical inactivity, smoking, male sex, high age, sleep deprivation, urbanisation, low socioeconomic status and ethnicity [3–6]. In addition, 40–70% of BMI variation is explained by genetic factors [7, 8] and, similarly, the increase in type 2 diabetes risk associated with having a sibling with type 2 diabetes is two- to threefold [9]. Family studies have shown similar heritability estimates of 50–60% for BMI and type 2 diabetes [10]. Since 2007, an explosion in our knowledge of specific genetic risk factors for obesity, type 2 diabetes and related phenotypes has taken place, mainly brought about by genome-wide association studies (GWASs). Here we review recent progress in concepts, methodologies and derived outcomes of studies of the genetics of type 2 diabetes and obesity, and we predict some of the directions this research field could take in the near future.
GWASs to discover the genetic basis of type 2 diabetes and obesity Until 2007, genetic mapping of complex diseases such as type 2 diabetes and obesity was primarily achieved by genetic linkage analyses or candidate gene association studies, both of which have implicit shortcomings related to their design, which limit their application. At the same time, study sample sizes were generally too low to reach sufficient statistical power. However, progress in identifying common variants associated with type 2 diabetes and obesity has since been rapid, primarily as a consequence of technological advances in array-based genotyping, which paved the way for GWASs, together with increased sample sizes from international collaborations. These advances have led to the discovery of a wealth of genomic loci convincingly associated with complex metabolic traits [5–7].
Genetics of type 2 diabetes and glucose homeostasis: what is known? As of the beginning of 2014, 90 genetic loci have been firmly established as type 2 diabetes risk loci (Fig. 1) [11–25]. The risk variant in the TCF7L2 locus, which was discovered in 2006 by a positional linkage strategy in the Icelandic population [26], remains the most influential common type 2 diabetes variant (allelic OR ~1.46) [27]. While GWASs of type 2 diabetes have been highly successful, other type 2 diabetes-associated loci have been identified through studies of quantitative diabetes-related traits. These efforts have discovered 72 loci associated with quantitative traits reflecting glucose homeostasis, i.e. fasting glucose, fasting insulin, 2 h glucose during an OGTT and HbA1c [16, 28–31]. Many of these loci are also associated with type 2 diabetes, yet the overlap between loci for these traits is not extensive (Fig. 1). In recently published reports, a mere 13 of 37 variants associated with fasting glucose were also associated with type 2 diabetes at what are considered statistically significant levels for GWASs [13, 29]. These findings indicate that some genetic variants may exert general modifying effects on fasting glucose levels in the population, while others have specific thresholds at which the genetic effect sets in, thereby conferring risk of type 2 diabetes without modifying levels of fasting glucose at the population level.
Genetic variants associated with type 2 diabetes and glucose homeostasis can shed light on the relationship between genetically induced defects in insulin secretion and insulin action in the pathogenesis of type 2 diabetes. Most of the genetic variants found in the first GWASs were demonstrated to primarily cause a decrease in glucose-stimulated insulin response [32, 33]. A more detailed picture of beta cell pathogenesis is now emerging, showing the specific impact of individual risk variants through studies of more exact physiological phenotypes and functional molecular genetic studies. Specific defects in glucagon-like peptide 1-stimulated insulin secretion, glucose-stimulated insulin secretion, insulin exocytosis, insulin granule docking or post-transcriptional processing of insulin have been demonstrated to be associated with different variants, supporting the notion that a range of biological processes are involved in the pathogenesis of type 2 diabetes [33–38]. As for other epidemiological studies of insulin secretion, these efforts are impeded by the difficulty of accurately quantifying insulin secretion in an epidemiological setting. The insulin response is generally assessed as the secretion of insulin in response to a number of different secretagogues, all of which provide different physiological information [39]. Therefore, studies of this trait tend to be small and statistically underpowered to detect the modest effect of single SNPs.
In the first GWASs of type 2 diabetes and quantitative glucose homeostasis traits, few variants were shown to have an effect on insulin sensitivity [16, 40]. Of interest, the number of SNPs associated with insulin sensitivity has recently increased as a result of larger sample sizes, by the inclusion of concomitant levels of obesity as a covariate in regression models and by implementation of a joint test investigating the main effect while allowing for an interaction effect [29, 30]. Thus, many primary genetic defects in insulin sensitivity may not be mediated by levels of obesity.
Fig. 1 Venn diagram of intersection between loci associated at genome-wide significance with type 2 diabetes, measures of adiposity and glucose homeostasis. Genome-wide significant associations for six metabolic traits are shown. Gene symbols shown in the plot are by convention the closest gene and not necessarily the functional gene
[Figure content: six overlapping sets labelled Type 2 diabetes, BMI, Waist circumference, WHR, Fasting glucose and Fasting insulin, each containing gene symbols; the set layout is not recoverable from the extracted text.]
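To make the allelic OR of ~1.46 reported for the TCF7L2 risk variant earlier in this section more concrete, the following minimal sketch converts a per-allele odds ratio into genotype-specific odds ratios. It assumes a multiplicative (log-additive) model, which is a common reporting convention but is an assumption here and not something stated in the text; the function name is hypothetical.

```python
def genotype_odds_ratios(allelic_or: float) -> dict:
    """Odds ratios relative to non-carriers for 0, 1 and 2 risk alleles.

    Assumption: a multiplicative (log-additive) model, in which each extra
    copy of the risk allele multiplies the odds of disease by the same factor.
    """
    return {n_alleles: allelic_or ** n_alleles for n_alleles in (0, 1, 2)}


if __name__ == "__main__":
    # 1.46 is the approximate allelic OR quoted in the text for TCF7L2.
    for n_alleles, odds_ratio in genotype_odds_ratios(1.46).items():
        print(f"{n_alleles} risk allele(s): OR = {odds_ratio:.2f}")
```

Under this assumed model, a carrier of two risk alleles would have roughly 2.1-fold higher odds than a non-carrier; a locus that deviates from log-additivity would not follow this pattern.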
Genetics of obesity and measures of body fat distribution The discovery of variants associated with measures of adiposity through GWAS follows much the same story as identification of the type 2 diabetes risk variants. The discoveries have predominately involved individual studies and meta-analyses using BMI as a quantitative measure of adiposity. The first studies, which included relatively few individuals, identified two loci, FTO and MC4R [41, 42]. The necessity to increase statistical power led to larger studies that included ~32,000 individuals, and as a result an increased number of loci were identified [43, 44]. The largest meta-analysis performed using BMI as a measure of obesity included ~250,000 individuals and increased the number of identified BMI loci to 32 [45]. FTO, the first GWAS-identified obesity locus, remains the one with the largest effect, imposing an allelic 0.39 kg/m² increase in BMI [41].
In parallel with the studies of BMI, a crude measure of overall adiposity, GWASs of quantitative measures attempting to capture abdominal obesity and specific elements of fat distribution, such as waist circumference and WHR, have also been performed. Studies including up to ~77,000 individuals have identified 19 loci associating with measures of body composition [46–48], the vast majority associating with BMI-adjusted WHR [48]. Since most of the body composition loci have been identified by analysing WHR adjusted for BMI, no overlap exists between these and the 39 BMI loci (Fig. 1).
Within the obesity-GWAS framework, case–control approaches in children, adolescents and adults have also been widely used [49–53]. The earliest studies were relatively small and substantial overlap with both BMI and body composition loci was detected. Two recent studies have increased the number of loci to 19, identifying 15 non-overlapping loci associated with clinical obesity among children and adults [54, 55].
There are several possible explanations for the partial genetic overlap of BMI and clinical definitions of obesity. While GWASs of BMI have focused on SNPs associated with mean BMI in populations, studies indicate that effects for many loci are not uniform across the BMI distribution. In studies of childhood BMI, the effect of several loci, including FTO, was stronger in the upper tail of the BMI distribution [56]. Along the same lines, a recent study showed that FTO genotype, which has been shown to be convincingly associated with mean BMI, is associated with variance in BMI [57], and since variance in BMI increases with BMI, this observation may explain the association of FTO with both mean BMI and severe clinical obesity. In addition, these studies point to interaction between genotype and measured or unmeasured environmental factors.
Overall, GWASs have to date successfully identified more than 80 different loci associated with adiposity phenotypes. These results point to the existence of aetiologically distinct subsets of extreme phenotypes.
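As a rough illustration of what the allelic 0.39 kg/m² effect quoted for FTO means in everyday units, the sketch below converts a per-allele BMI difference into a body weight difference at a fixed height. The additive (per-allele) model and the example height of 1.75 m are assumptions made for illustration, and the function names are hypothetical.

```python
FTO_BMI_PER_ALLELE = 0.39  # kg/m^2 per risk allele, as quoted in the text


def expected_bmi_shift(n_risk_alleles: int,
                       per_allele: float = FTO_BMI_PER_ALLELE) -> float:
    """Expected BMI difference versus non-carriers, assuming an additive model."""
    if n_risk_alleles not in (0, 1, 2):
        raise ValueError("a diploid genotype carries 0, 1 or 2 risk alleles")
    return per_allele * n_risk_alleles


def bmi_shift_to_weight(bmi_shift: float, height_m: float) -> float:
    """Convert a BMI difference (kg/m^2) into a weight difference (kg) at a fixed height."""
    return bmi_shift * height_m ** 2


if __name__ == "__main__":
    height_m = 1.75  # hypothetical example height
    for n in (0, 1, 2):
        shift = expected_bmi_shift(n)
        print(f"{n} allele(s): +{shift:.2f} kg/m^2, "
              f"about +{bmi_shift_to_weight(shift, height_m):.2f} kg")
```

At the assumed height, one risk allele corresponds to a little over 1 kg, which shows why even the strongest common obesity locus has a modest effect on an individual.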
Transferability of genetic loci across ancestry groups Although thus far the majority of GWASs have been performed on European individuals, a number of important studies of other ethnicities are emerging. These studies have reported novel loci such as KCNQ1 and C2CD4A associated with type 2 diabetes in Japanese individuals [17, 58, 59] and a number of loci for type 2 diabetes in East Asians [20, 22]. For obesity, studies have identified risk variants in PCSK1, GP2 and GALNT10 loci in Asian or African populations [60, 61]. Of interest, studies comparing associations in individuals of different ethnicities can shed light on the shared genetic vulnerability across ethnic groups and possibly add to fine-mapping efforts in associated loci. For type 2 diabetes, studies have found directionally consistent effects for known loci across ancestry groups [62, 63]. Comparative studies across ancestries in a recent GWAS of four ancestry groups showed that the effects of the many common variants not reaching statistical significance at a genome-wide level are homogeneous across ancestry groups, and a trans-ancestry meta-analysis revealed seven novel genome-wide significant loci [25]. In addition, the study showed that fine-mapping associated loci can be improved by taking advantage of ancestry differences in linkage disequilibrium. Similarly, studies of obesity have shown highly comparable effects of common variants across major ancestry groups, strongly supporting shared common BMI and obesity loci across populations [60, 61, 64], although ancestry-specific loci have also been shown, such as KLHL32 in Africans and KLF9 in Asians [61, 64].
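The idea of pooling per-ancestry effect estimates and checking whether they are homogeneous can be sketched with a standard inverse-variance fixed-effect meta-analysis and Cochran's Q. This is a generic textbook approach, not necessarily the method used in the trans-ancestry study cited above, and the input numbers in the usage example are hypothetical.

```python
import math
from typing import List, Tuple


def fixed_effect_meta(estimates: List[Tuple[float, float]]) -> dict:
    """Inverse-variance fixed-effect meta-analysis of (beta, standard error) pairs."""
    if not estimates:
        raise ValueError("need at least one estimate")
    weights = [1.0 / se ** 2 for _, se in estimates]
    total_weight = sum(weights)
    pooled = sum(w * beta for w, (beta, _) in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    # Cochran's Q: weighted squared deviations from the pooled estimate.
    q = sum(w * (beta - pooled) ** 2 for w, (beta, _) in zip(weights, estimates))
    df = len(estimates) - 1
    # I^2: share of total variation attributed to heterogeneity, floored at zero.
    i_squared = max(0.0, (q - df) / q) if q > 0 else 0.0
    return {"beta": pooled, "se": pooled_se, "q": q, "df": df, "i_squared": i_squared}


if __name__ == "__main__":
    # Hypothetical per-ancestry log-odds estimates (beta, SE) for one variant.
    per_ancestry = [(0.10, 0.02), (0.08, 0.03), (0.12, 0.04), (0.09, 0.05)]
    print(fixed_effect_meta(per_ancestry))
```

A Q value close to its degrees of freedom (and an I² near zero) is what one would expect when effects are homogeneous across groups; large values would instead suggest ancestry-specific effects.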
Glossary
1000 Genomes Project: The 1000 Genomes Project, launched in January 2008, is an international research effort to establish a detailed catalogue of human genetic variation. Scientists planned to sequence the genomes of 2,500 participants from a number of different ethnic groups
Allele: One of a number of alternative forms of the same gene or same genetic locus
De novo mutation: An alteration in a gene that is present for the first time in one family member as a result of a mutation in a germ cell (egg or sperm) of one of the parents or in the fertilised egg itself
Epistasis: When the effect of one gene depends on the presence of one or more ‘modifier genes’ (genetic background). Also referred to as gene–gene interaction
Exome: The protein coding part of the human genome. The exome of the human genome consists of roughly 180,000 exons, constituting about 1% of the total genome, or about 30 megabases of DNA
Heritability: The proportion of phenotypic variation of a trait that is due to underlying genetic variation
Imputation: In genetics, imputation refers to the statistical inference of unobserved genotypes. It is achieved by using known haplotypes in a reference population, such as the 1000 Genomes Project, thereby allowing non-genotyped genetic variants to be tested for association with a trait of interest
Linkage disequilibrium: A non-random association between alleles at different loci
Minor allele frequency: Ranging from 0% to 50%, this is the proportion of alleles at a locus that contain the less frequent allele
Private variants: Variants restricted to probands and immediate relatives
Sequencing depth: In DNA sequencing, depth refers to the number of times a nucleotide is read during the sequencing process. Deep sequencing indicates that the depth of the process is many times larger than the length of the sequence under study
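Two of the glossary terms, minor allele frequency and linkage disequilibrium, have simple numerical definitions. The sketch below computes the MAF from genotype counts at a biallelic locus and the common r² measure of linkage disequilibrium from haplotype and allele frequencies. These are standard population-genetic formulas and not taken from this review; the function names and example counts are hypothetical.

```python
def minor_allele_frequency(n_aa: int, n_ab: int, n_bb: int) -> float:
    """MAF from genotype counts (AA, AB, BB) at a biallelic locus."""
    n_individuals = n_aa + n_ab + n_bb
    if n_individuals == 0:
        raise ValueError("no genotypes supplied")
    # Each individual carries two alleles; heterozygotes contribute one A.
    freq_a = (2 * n_aa + n_ab) / (2 * n_individuals)
    return min(freq_a, 1.0 - freq_a)


def ld_r_squared(p_ab: float, p_a: float, p_b: float) -> float:
    """r^2 between two biallelic loci.

    p_ab is the frequency of the haplotype carrying allele A at the first
    locus and allele B at the second; p_a and p_b are the allele frequencies.
    """
    d = p_ab - p_a * p_b  # D: deviation from independent assortment
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when either locus is monomorphic")
    return d * d / denom


if __name__ == "__main__":
    # Hypothetical genotype counts for 1,000 individuals.
    print(minor_allele_frequency(640, 320, 40))
    # Hypothetical haplotype and allele frequencies.
    print(ld_r_squared(0.30, 0.40, 0.50))
```

An r² of 0 means the alleles at the two loci are independent, while an r² of 1 means one locus perfectly predicts the other, which is the situation that makes causal variants hard to separate from their neighbours.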
Common features of the genetic associations with type 2 diabetes, glycaemia and obesity Implicit in the initial design, GWAS-identified variants in type 2 diabetes, glycaemia and obesity are common (minor allele frequency [MAF] >5% in the population). Risk variants exert modest effect sizes on disease risk and variation in phenotype, and for the majority of loci, the causative variant and gene are unknown. For the majority of loci, the most strongly associated variant is not a coding variant but instead resides in an intron or in a non-coding sequence between genes. In addition, high correlation (i.e. linkage disequilibrium) between physically closely located markers makes it difficult to prove causality for associated variants. However, the causative variant and the molecular mechanism of action have been identified for some loci. In the GCKR locus an intronic variant was shown to associate with type 2 diabetes and fasting glucose [65], while subsequent studies detected an amino acid-changing variant (GCKR p.P446L) that was demonstrated to be the causative variant influencing hepatic glucose uptake [66, 67]. Similarly, follow-up studies of TCF7L2 have shown that the originally identified intronic rs7903146 variant is probably the causative SNP, which presumably regulates expression of alternative TCF7L2 isoforms in several target tissues [68–71]. Many of the loci do not contain genes with known biological relevance to obesity or type 2 diabetes, providing an opportunity for novel biological investigations. Finally, common risk variants have been found in a number of genes known to be mutated in monogenic subsets of non-autoimmune diabetes (GCK, HNF1A, HNF1B, HNF4A, PPARG, KCNJ11, GLIS3 and WFS1) [11, 12, 16, 72–80] or obesity (MC4R, POMC, LEPR, BDNF, SH2B1, PCSK1 and NTRK2) [42, 45, 81–87].
Genetic overlap of obesity and type 2 diabetes: epidemiological vs genetic correlation Although type 2 diabetes and obesity are highly interrelated from both epidemiological and pathophysiological viewpoints, the shared genetic aetiology imposed by hitherto identified common variants is limited (Fig. 1). Of 90 loci associated with type 2 diabetes and 56 loci associated with standard measures of adiposity, merely five loci are shared (FTO, MC4R, ADAMTS9, GRB14/COBLL1 and QPCTL/GIPR). Furthermore, at two of these loci (ADAMTS9 and GRB14/COBLL1), different and only partially correlated genetic variants are responsible for the associations, which calls into question whether they share functional disease mechanisms [11, 13, 14, 48].
There are many possible reasons for this apparent lack of genetic overlap. For example, the associations between genetic loci and traits shown in Fig. 1 are for associations at genome-wide statistical significance and therefore do not include shared associations below the level of significance, which may still be genuine. Some light can be shed on the relationship between SNPs associated with type 2 diabetes or glycaemic traits and associations with measures of obesity using online large-scale databases of GWAS results. Figures 2 and 3 illustrate the correlated effects of SNPs associated with type 2 diabetes or BMI for a specific set of GWAS SNPs for a particular trait, comparing the effect on the primary trait with other correlated metabolic traits. For SNPs primarily associated with BMI, there seems to be a positive correlation between the effect size on BMI and the effect of the same SNP on type 2 diabetes (Fig. 2a). This finding indicates that the major reason why so few BMI-associated SNPs have been shown to associate with type 2 diabetes at genome-wide significance is a lack of statistical power to detect the minute derived type 2 diabetes risk increments imposed by BMI-associated variants. Similarly, there is a positive correlation between the effect of BMI SNPs on BMI and on the diabetes-related quantitative traits, fasting glucose and fasting insulin (Fig. 2b, c). In contrast, when looking at the effect sizes on BMI of SNPs associated with type 2 diabetes, there is no obvious correlation between effects; rather, it seems that most of the type 2 diabetes-associated variants have no impact on BMI per se (Fig. 3a). Similar observations are evident when comparing the effects of type 2 diabetes-associated SNPs on type 2 diabetes and
[Fig. 2 plot area: four scatter panels (a–d), each with x-axis "Effect on BMI (SD)"; y-axes are the effect on (a) type 2 diabetes (OR), (b) fasting glucose (mmol/l), (c) fasting insulin (%) and (d) WHR (SD). Loci labelled in the plots include FTO, MC4R, QPCTL/GIPR, TMEM18, LRRN6C, RPL27A, SH2B1 and GNPDA2.]
Fig. 2 Correlation of effects of BMI-associated loci on BMI in relation to effects on type 2 diabetes and metabolic traits. Each dot shows the effect of a BMI-associated variant on BMI in relation to the effect on (a) type 2 diabetes, (b) fasting glucose, (c) fasting insulin and (d) WHR adjusted for BMI. Dots coloured red indicate an association ( p