Published ahead of print on April 7, 2005, doi:10.1165/rcmb.2005-0073OC
© 2005 American Thoracic Society
Attempted Replication of Reported Chronic Obstructive Pulmonary Disease Candidate Gene Associations
Channing Laboratory, Pulmonary and Critical Care Division, and Hematology/Oncology Division, Department of Medicine, Brigham and Women's Hospital; Harvard Medical School; Department of Biostatistics, Harvard School of Public Health; and Veterans Affairs Medical Center, Boston, Massachusetts
Correspondence and requests for reprints should be addressed to Edwin K. Silverman, M.D., Ph.D., Channing Laboratory, Brigham and Women's Hospital, 181 Longwood Avenue, Boston, MA 02115. E-mail: ed.silverman{at}channing.harvard.edu
Case-control studies have successfully identified many significant genetic associations for complex diseases, but lack of replication has been a criticism of case-control genetic association studies in general. We selected 12 candidate genes with reported associations to chronic obstructive pulmonary disease (COPD) and genotyped 29 polymorphisms in a family-based study and in a case-control study. In the Boston Early-Onset COPD Study families, significant associations with quantitative and/or qualitative COPD-related phenotypes were found for the tumor necrosis factor (TNF)-α −308G>A promoter polymorphism (P < 0.02), a coding variant in surfactant protein B (SFTPB Thr131Ile) (P = 0.03), and the (GT)31 allele of the heme oxygenase (HMOX1) promoter short tandem repeat (P = 0.02). In the case-control study, the SFTPB Thr131Ile polymorphism was associated with COPD, but only in the presence of a gene-by-environment interaction term (P = 0.01 for both main effect and interaction). The 30-repeat, but not the 31-repeat, allele of HMOX1 was associated (P = 0.04). The TNF −308G>A polymorphism was not significant. In addition, the microsomal epoxide hydrolase "fast" allele (EPHX1 His139Arg) was significantly associated in the case-control study (P = 0.03). Although some evidence for replication was found for SFTPB and HMOX1, none of the previously published COPD genetic associations was convincingly replicated across both study designs.
Key Words: association studies; case-control studies; emphysema; genetics; single nucleotide polymorphism
Case-control association analysis is a commonly used study design in the field of complex trait genetics. Susceptibility genes successfully identified using this approach include the associations between Factor V Leiden and venous thromboembolism (1), Apolipoprotein E4 and Alzheimer's disease (2), and peroxisome proliferator-activated receptor-γ (PPARγ) and type 2 diabetes mellitus (3). However, candidate gene case-control association studies have been criticized because of a lack of replication (4, 5). Often the first published study reports a significant association between a candidate gene polymorphism and a disease of interest, but subsequent studies are unable to confirm the association (6).
Candidate genes can be selected based on previous genetic linkage analysis results (positional candidate genes), but they are more commonly chosen based on known or presumed mechanisms in disease pathophysiology or based on the results of previous association studies. In the example of PPARγ
In this study of chronic obstructive pulmonary disease (COPD), we used a study design similar to that of Altshuler and colleagues in their study of diabetes. We sought to determine whether any of the published COPD candidate gene associations would withstand the test of replication and whether a convincing COPD genetic association could be found using this approach. Twelve candidate genes with reported significant associations to COPD were identified in the published literature (Table 1 and Table E1 in the online supplement). A total of 29 polymorphisms (24 single nucleotide polymorphisms [SNPs], 1 insertion/deletion [indel], 3 short tandem repeats [STRs], and 1 null deletion) were genotyped in both a family-based COPD study and a case-control COPD study. Results from this study have been previously reported as an abstract (7).
Study Populations
Details of subject recruitment and phenotyping in the Boston Early-Onset COPD Study have been reported previously (8). Severe, early-onset COPD probands had an FEV1 < 40% predicted, age < 53 yr, and did not have severe α1-antitrypsin deficiency. First-degree relatives, older second-degree relatives, spouses, and other family members with COPD were invited to participate. This analysis included 949 individuals from 127 extended pedigrees. Ninety-eight percent of the Boston Early-Onset COPD Study participants were white.
The case-control study identified cases from the National Emphysema Treatment Trial (NETT) (9, 10). Subjects participating in NETT had an FEV1
All studies were approved by the appropriate institutional review boards. Participants in the Boston Early-Onset COPD Study and the NETT Genetics Ancillary Study gave written informed consent. Anonymized data were used for the Normative Aging Study participants, as approved by the IRBs of Partners Healthcare System and of the Boston VA.
Candidate Genes and Genotyping
Four additional SNPs in TNF (Table 2) were selected using a linkage disequilibrium tagging algorithm (http://www.innateimmunity.net), based on genotype data available from SeattleSNPs (http://pga.mbt.washington.edu/). Tag SNPs were selected using an LD threshold defined by r² ≥ 0.9 and a minor allele frequency ≥ 0.05. Six additional SNPs in EPHX1 (Table 2) were selected from public databases (http://snpper.chip.org/, http://snp500cancer.nci.nih.gov/).
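The tagging step can be illustrated with a minimal greedy binning sketch. This is not the implementation behind the web tool cited above. It only shows the general idea of grouping SNPs that exceed an r² threshold with a chosen tag, after a minor allele frequency filter. The function name, the SNP labels, and all frequencies and r² values below are hypothetical.

```python
def select_tag_snps(maf, r2, r2_min=0.9, maf_min=0.05):
    """Greedy LD binning (illustrative sketch, not the cited tool's algorithm).

    maf: dict mapping SNP name -> minor allele frequency
    r2:  dict mapping frozenset({snp_a, snp_b}) -> pairwise r^2
    Returns a list of (tag_snp, sorted_bin_members) tuples.
    """
    # Keep only SNPs that pass the minor allele frequency filter.
    remaining = {snp for snp, freq in maf.items() if freq >= maf_min}
    bins = []

    def partners(snp):
        # Unbinned SNPs in strong LD with `snp`.
        return {other for other in remaining
                if other != snp
                and r2.get(frozenset((snp, other)), 0.0) >= r2_min}

    while remaining:
        # Pick the SNP that captures the most others; ties go to the
        # alphabetically first name so the result is deterministic.
        tag = max(sorted(remaining), key=lambda snp: len(partners(snp)))
        members = partners(tag) | {tag}
        bins.append((tag, sorted(members)))
        remaining -= members
    return bins


# Hypothetical inputs, for illustration only.
example_maf = {"snpA": 0.20, "snpB": 0.21, "snpC": 0.19,
               "snpD": 0.10, "snpE": 0.02}
example_r2 = {
    frozenset(("snpA", "snpB")): 0.95,
    frozenset(("snpB", "snpC")): 0.92,
    frozenset(("snpA", "snpC")): 0.70,
    frozenset(("snpC", "snpD")): 0.30,
}
example_bins = select_tag_snps(example_maf, example_r2)
```

In this toy input, snpE is dropped by the frequency filter, snpB tags a bin containing snpA and snpC, and snpD forms its own bin.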
Statistical Analysis
The case-control data were analyzed in SAS/Genetics (SAS Institute, Cary, NC). Hardy-Weinberg equilibrium was assessed in control subjects using an exact test. Odds ratios and chi-square statistics were calculated from 2 × 2 tables of allele frequencies. Logistic regression was used to control for age and pack-years in both additive and dominant genetic models. Gene-by-smoking and gene-by-gene interactions were tested by including appropriate cross-product terms in the regression models. Haplotype analysis was performed using the expectation-maximization algorithm and score tests, implemented in haplo.stats (20).
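As a rough illustration of the allele-based test, the sketch below computes an odds ratio, a 95% confidence interval, and a Pearson chi-square statistic from a 2 × 2 table of allele counts. The analyses in this study were run in SAS/Genetics. The Python function here, the Woolf (log odds ratio) interval, and the allele counts are assumptions made for illustration only.

```python
import math


def allele_test(case_minor, case_major, control_minor, control_major):
    """Odds ratio, Woolf 95% CI, chi-square (1 df), and P value
    for a 2 x 2 table of allele counts (all cells must be > 0)."""
    a, b, c, d = case_minor, case_major, control_minor, control_major
    n = a + b + c + d

    odds_ratio = (a * d) / (b * c)

    # Woolf interval: normal approximation on the log odds ratio scale.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    ci = (math.exp(math.log(odds_ratio) - 1.96 * se_log_or),
          math.exp(math.log(odds_ratio) + 1.96 * se_log_or))

    # Pearson chi-square without continuity correction.
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return odds_ratio, ci, chi2, p_value


# Hypothetical allele counts (two alleles per subject), not study data.
example_or, example_ci, example_chi2, example_p = allele_test(120, 480, 150, 450)
```

An allele-count table treats the two alleles of each subject as independent, which presumes Hardy-Weinberg proportions; the adjusted genotype-based models were fit by logistic regression rather than by this table.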
In the case-control study, we had previously genotyped a panel of 44 unlinked SNPs to test for population stratification using the method of Pritchard and Rosenberg (21). There was no compelling evidence for overt population stratification (
Study Subjects
Characteristics of participants in the Boston Early-Onset COPD Study are shown in Table 3A. The probands were predominantly female, as has been previously reported (8, 23). Probands had severe airflow obstruction (mean FEV1 = 19.2% predicted). Details of the participants in NETT and NAS are found in Table 3B. The majority of NETT subjects were men (63.8%), while the NAS control subjects were all men. Ages of cases and control subjects were similar, but the NETT cases had a significantly greater smoking history (67.4 versus 38.5 pack-years, P < 0.0001). The 304 NETT cases had severe COPD (mean FEV1 = 24.8% predicted).
Family-Based Study
In the Boston Early-Onset COPD Study families, all markers were tested for association with the postbronchodilator phenotypes, including FEV1, FEV1/FVC, mild-to-severe airflow obstruction (FEV1 < 80% predicted, with FEV1/FVC < 90% predicted), and moderate-to-severe airflow obstruction (FEV1 < 60% predicted, with FEV1/FVC < 90% predicted). Significant results from the extended pedigree family-based association test are shown in Table 4. The strongest associations were found with the TNF promoter −308G>A SNP for both quantitative and qualitative COPD-related phenotypes (postbronchodilator), in additive genetic models, adjusting for age, sex, height, smoking status (ever versus never), and pack-years. Similar results were obtained using prebronchodilator spirometry phenotypes (data not shown). A coding variant in surfactant protein B (SFTPB), Thr131Ile, was associated with the qualitative phenotype moderate-to-severe airflow obstruction, but an STR near SFTPB (D2S388) was not associated.
In addition, the 31-repeat (137-bp) allele (allele frequency in the extended pedigrees = 0.07) of the heme oxygenase (HMOX1) promoter STR was also found to be associated with postbronchodilator FEV1 and FEV1/FVC. None of the other variants (two additional STRs, the MMP1 indel, and the other 12 SNPs) were found to be associated with qualitative or quantitative COPD-related traits in the early-onset COPD families. The GSTM1 deletion could not be analyzed in the family-based study design, as heterozygotes for the deletion could not be distinguished consistently from the wild-type. One marker, the α1-antichymotrypsin Pro229Ala SNP (Bonn-1), was found to be monomorphic in the extended pedigrees and was not genotyped in the case-control study.
To assess for gene-by-environment interactions, the TNF −308G>A and SFTPB Thr131Ile variants were examined in an analysis stratified by smoking status as well as in a model including an interaction term for pack-years. These analyses did not show evidence of gene-by-environment interaction effects.
In the analysis of the additional TNF SNPs in the early-onset COPD families, one SNP in the 3' untranslated region (UTR), rs3091257, had a high rate of Mendelian errors. In addition, this SNP deviated from Hardy-Weinberg proportions in the NAS control subjects and was not analyzed further in either cohort. None of the three additional TNF SNPs were associated with COPD-related phenotypes in the family-based study.
In the early-onset COPD families, none of the eight SNPs in EPHX1 were found to be associated in the primary analysis. In a secondary analysis using a dominant genetic model, one intronic SNP (rs1877724) was marginally associated with postbronchodilator values for both FEV1 (P = 0.02) and moderate-to-severe airflow obstruction (P = 0.02).
Case-Control Study
The matrix metalloproteinase-9 (MMP9) STR (D20S838) was initially analyzed by grouping alleles into "Small" and "Large" repeat numbers, according to the method of Joos and colleagues (24); alleles ≤ 110 bp (≤ 16 repeats in the previous report) were classified as small, and those ≥ 112 bp (≥ 17 repeats) were considered large. The HMOX1 promoter polymorphism was analyzed as per Yamada and coworkers (16). Alleles were classified as Small (< 129 bp, < 27 repeats), Medium (129–139 bp, 27–32 repeats) and Large (≥ 141 bp, ≥ 33 repeats). Neither STR marker was significantly associated in these analyses. The MMP9, HMOX1, and SFTPB STRs were then analyzed by comparing each allele with a frequency of at least 0.05 to all other alleles, using both additive and dominant genetic models (Table 5B). The 30-repeat allele (135 bp, allele frequency in NETT cases = 0.43) of the HMOX1 STR was significantly associated in the case-control analysis (adjusted P = 0.04, additive genetic model), whereas the 31-repeat allele (137 bp, allele frequency in Boston Early-Onset COPD Study families = 0.07) had been significant in the family-based study. The 135-bp allele was the most common repeat size and was underrepresented in the cases (allele test, odds ratio = 0.84; 95% CI, 0.68–1.04).
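The HMOX1 allele grouping described above can be sketched as a small classifier. The size classes follow the cutoffs quoted in the text (Small below 27 repeats, Medium 27 to 32 repeats, Large 33 repeats or more). The 75-bp offset between fragment size and repeat number is an assumption inferred from the size and repeat pairs quoted in the text (129 bp = 27 repeats, 135 bp = 30, 137 bp = 31), and the function names are hypothetical.

```python
# Assumption: inferred from the fragment size / repeat count pairs in the
# text (e.g. 135 bp = 30 repeats); it is specific to this genotyping assay.
HMOX1_FLANK_BP = 75


def hmox1_repeats(fragment_bp):
    """Convert an HMOX1 STR fragment size (bp) to a dinucleotide repeat count."""
    return (fragment_bp - HMOX1_FLANK_BP) // 2


def hmox1_size_class(fragment_bp):
    """Assign the Small / Medium / Large class used in the grouped analysis."""
    repeats = hmox1_repeats(fragment_bp)
    if repeats < 27:
        return "Small"
    if repeats <= 32:
        return "Medium"
    return "Large"


def target_allele_count(genotype_bp, target_bp):
    """Copies (0, 1, or 2) of one allele in a genotype, pooling all other
    alleles, as in the allele-versus-rest comparison."""
    return sum(1 for allele in genotype_bp if allele == target_bp)


# A hypothetical subject carrying one 135-bp and one 141-bp allele.
example_classes = [hmox1_size_class(bp) for bp in (135, 141)]
example_copies = target_allele_count((135, 141), 135)
```

The copy count from `target_allele_count` is the additive coding of the allele; a dominant coding would collapse 1 and 2 copies into a single carrier category.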
None of the three additional SNPs in TNF were significant in the case-control study; no evidence of gene-by-smoking interaction was found in this analysis. In the case-control study, none of the additional SNPs in EPHX1 was found to be associated with COPD. However, in a model that included a gene-by-smoking (pack-years) interaction, both the main effect (P = 0.02) and the interaction (P = 0.03) were significant for a silent coding variant in exon 3 (rs2292566). This SNP was not in linkage disequilibrium with the fast allele in exon 4 that was found to be associated with COPD (r2 < 0.1 in NAS control subjects).
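To make the interaction model concrete, the sketch below builds one row of a design matrix for a logistic regression with a genotype term, the covariates age and pack-years, and a genotype-by-pack-years cross-product term. It shows only how the additive and dominant codings and the cross-product are formed. Model fitting is omitted, and the function names and example values are hypothetical rather than taken from the study's SAS code.

```python
def genotype_code(minor_allele_count, model="additive"):
    """Code a genotype from its number of minor alleles (0, 1, or 2)."""
    if minor_allele_count not in (0, 1, 2):
        raise ValueError("minor_allele_count must be 0, 1, or 2")
    if model == "additive":
        return minor_allele_count          # 0, 1, 2 copies
    if model == "dominant":
        return 1 if minor_allele_count > 0 else 0  # carrier versus non-carrier
    raise ValueError("model must be 'additive' or 'dominant'")


def design_row(minor_allele_count, age, pack_years, model="additive"):
    """Row: intercept, genotype, age, pack-years, genotype x pack-years."""
    g = float(genotype_code(minor_allele_count, model))
    return [1.0, g, float(age), float(pack_years), g * float(pack_years)]


# Hypothetical subject: homozygous for the minor allele, 65 yr, 40 pack-years.
example_additive = design_row(2, 65, 40, model="additive")
example_dominant = design_row(2, 65, 40, model="dominant")
```

In such a model, the coefficient on the last column tests whether the genotype effect changes with cumulative smoking, while the genotype coefficient is the effect at zero pack-years.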
Gene-by-Gene Interactions
We also tested for interactions between the EPHX1 fast polymorphism, TNF −308G>A, and SFTPB Thr131Ile, the SNPs that had been significant in either the family-based or case-control study. Neither the two-way interactions nor the three-way interaction were significant.
Haplotype Analysis
In this study, we selected 29 polymorphisms in 12 genes that had been reported to be associated with COPD in the published literature and genotyped these variants in a family-based study of early-onset COPD and a case-control COPD study. The most significant association in the family-based study (TNF −308G>A) was not replicated in the case-control study, and the strongest association in the case-control study (EPHX1 fast allele) was not found in the family-based analysis. Two variants showed modest evidence for replication across both study designs. A coding SNP in surfactant protein B (Thr131Ile) that was marginally associated (P = 0.03) with one qualitative trait (moderate-to-severe airflow obstruction) in the Boston Early-Onset COPD families was not associated in the primary case-control analysis, but did show association (P = 0.01) when an SNP-by-smoking interaction was included. An STR in HMOX1 was significant in both studies, though different alleles were associated in each cohort. The associations with SFTPB and HMOX1 merit further investigation. The different effects of gene-by-smoking interaction in the analyses of SFTPB Thr131Ile in the family-based and case-control studies and the different alleles of the HMOX1 repeat driving the associations in the two study designs suggest that these polymorphisms are not the functional variants affecting COPD susceptibility. The effects that we detected may be due to linkage disequilibrium with nearby functional variants. Analysis of additional SNPs in these genes will be required to confirm these genetic associations. Despite our positive results, we cannot exclude that these may be spurious associations due to the multiple comparisons performed.
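The multiple-comparisons concern can be illustrated with a simple Bonferroni calculation. This adjustment was not reported in the study, and it is used here only as a worked example. The P values are the nominal ones quoted in the text (with P < 0.02 entered at its upper bound), and the count of 29 tests assumes one test per polymorphism, which understates the number of phenotypes and genetic models actually examined.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# Nominal P values quoted in the text; labels are for illustration only.
reported_p = {
    "TNF -308G>A (family-based)": 0.02,          # reported as P < 0.02
    "SFTPB Thr131Ile (family-based)": 0.03,
    "HMOX1 31-repeat (family-based)": 0.02,
    "SFTPB Thr131Ile with interaction (case-control)": 0.01,
    "HMOX1 30-repeat (case-control)": 0.04,
    "EPHX1 His139Arg (case-control)": 0.03,
}

# Assumption: one test for each of the 29 genotyped polymorphisms.
threshold = bonferroni_threshold(0.05, 29)
survivors = [name for name, p in reported_p.items() if p <= threshold]
```

Under these assumptions the per-test threshold is about 0.0017, and none of the nominal P values falls below it. Bonferroni correction is conservative when tests are correlated, so this sketch bounds the concern rather than settling it.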
Several explanations have been proposed for the lack of replication that is commonly seen in case-control association studies in complex trait genetics (5, 6). Small sample sizes may lead to inadequate power to detect an association in the initial study or to replicate true associations in subsequent studies. In fact, the majority of the COPD candidate gene association studies listed in Table 1 enrolled fewer than 100 cases and 100 control subjects. Insufficient power is not likely to explain the lack of replication seen in our study. Using the example of TNF −308G>A with a 17% minor allele frequency in the NAS control subjects, our case-control study had 90% power (
Spurious associations may result from multiple testing in studies that assess many genes, markers, and phenotypes (29). No consensus exists on the optimal method to adjust for multiple testing in case-control genetic association studies, though replication in an independent study may provide the strongest evidence for true association. Multiple testing was a potential problem in our family-based study, given the multiple genes and phenotypes tested, though the independent case-control sample provided an opportunity to confirm the findings from the family-based study.
Genotyping errors usually bias toward no association, though systematic errors may lead to false positive results. Deviation from Hardy-Weinberg equilibrium (HWE) in the control group may be a sign of genotyping error (29). We found that only one of the markers tested deviated from HWE, and that SNP was excluded from the association analyses. Departure from Mendelian transmission of alleles is another indication of genotyping error that is only applicable to family-based studies; besides the excluded SNP above, only a small number of Mendelian inconsistencies were found in our study.
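For readers unfamiliar with exact tests of HWE, the sketch below shows one commonly used formulation. It enumerates the probability of every possible heterozygote count given the observed allele counts, then sums the probabilities of all outcomes no more likely than the observed one. The source states only that an exact test was applied in control subjects, so this particular recurrence, the function name, and the genotype counts are assumptions for illustration.

```python
def hwe_exact_p(n_aa, n_ab, n_bb):
    """Exact Hardy-Weinberg test P value for a biallelic marker."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("at least one genotype is required")
    rare = 2 * min(n_aa, n_bb) + n_ab  # copies of the rarer allele

    # Start near the expected heterozygote count; the heterozygote count
    # must have the same parity as the rare allele count.
    mid = rare * (2 * n - rare) // (2 * n)
    if mid % 2 != rare % 2:
        mid += 1
    probs = {mid: 1.0}  # unnormalized probabilities, keyed by het count

    # Recurrence toward fewer heterozygotes.
    hets, hom_rare = mid, (rare - mid) // 2
    hom_common = n - hets - hom_rare
    while hets >= 2:
        probs[hets - 2] = (probs[hets] * hets * (hets - 1)
                           / (4.0 * (hom_rare + 1) * (hom_common + 1)))
        hets, hom_rare, hom_common = hets - 2, hom_rare + 1, hom_common + 1

    # Recurrence toward more heterozygotes.
    hets, hom_rare = mid, (rare - mid) // 2
    hom_common = n - hets - hom_rare
    while hets <= rare - 2:
        probs[hets + 2] = (probs[hets] * 4.0 * hom_rare * hom_common
                           / ((hets + 2.0) * (hets + 1.0)))
        hets, hom_rare, hom_common = hets + 2, hom_rare - 1, hom_common - 1

    total = sum(probs.values())
    observed = probs[n_ab]
    # Small tolerance so ties with the observed outcome are included.
    tail = sum(p for p in probs.values() if p <= observed * (1 + 1e-9))
    return min(1.0, tail / total)


# Hypothetical genotype counts, not study data.
p_in_equilibrium = hwe_exact_p(25, 50, 25)
p_no_heterozygotes = hwe_exact_p(50, 0, 50)
```

A marker with genotype counts at their expected proportions yields a P value of 1, while a complete absence of heterozygotes at a common marker yields a very small P value, the pattern that would prompt a check for genotyping error.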
Failure to demonstrate HWE may also be a sign of population stratification, which refers to differences in allele frequency between cases and controls due to ethnic differences and not due to disease status (30). Population stratification can lead to spurious association in case-control studies (21). Careful matching of cases and control subjects on ethnicity provides some protection against population stratification. Several statistical methods, based on genotype data from additional unlinked markers elsewhere in the genome, are available to test for stratification and control for its effects if present (21, 31, 32). None of the published COPD genetic association studies have employed these formal tests. We tested a modest sized panel of SNPs in our case-control study and found little evidence for stratification. The issues described may lead to false positive (multiple testing, population stratification) or false negative (small sample size, genotyping error) results. However, true differences may lead to inconsistent results. COPD is a heterogeneous disease and published association studies have used different phenotype definitions. For example, studies of TNF have defined cases on the basis of airflow obstruction (28), emphysema (33), decline in lung function (34), or chronic bronchitis (27). It is possible that a given genetic variant may confer susceptibility to a specific COPD-related phenotype. In our case-control study, the NETT cases all had emphysema confirmed by chest CT scan. Radiographic evidence of emphysema was not a requirement for entry into the Boston Early-Onset COPD Study, though many probands did have chest CT scans showing emphysema (8). In our family-based study we analyzed quantitative and qualitative traits, based on spirometry, but the case-control study used COPD diagnosis as a binary outcome. However, we used strict spirometric criteria to define cases and controls, so the overall conclusions should not be affected. 
The power may be greater using quantitative versus qualitative traits, however. For the majority of the genes studied, we genotyped only one or two markers per gene, as has been done in most of the previously reported studies. This method relies on the assumption that the variants tested have functional effects on COPD susceptibility. If another variant in or near the gene were the causal variant, then the true association could be easily missed. Different linkage disequilibrium patterns with the functional variant may lead to variable results in different populations. In two genes, TNF and EPHX1, we tested additional SNPs and used haplotype analysis to study these genes more thoroughly. However, this did not strengthen our findings.
Genetic heterogeneity may also explain the varying results among case-control association studies, especially those done in different ethnic groups. Many of the COPD association studies in Table 1 have shown inconsistent results in white and Asian populations. True differences may be the result of different genetic determinants of disease in diverse populations, variation in gene–environment interaction due to specific environmental exposures, or different patterns of linkage disequilibrium between the tested marker and the causal variant (35). Though most of our study subjects were whites from the United States, it is possible that severe early-onset COPD represents a unique disease subtype with different genetic determinants than the usually seen, later-onset COPD. However, many of the family members in the Boston Early-Onset COPD Study had less severe airflow obstruction, consistent with more usual forms of COPD. Nevertheless, variants in several genes studied, including TNF-
This study highlights the major difficulty with using a candidate gene approach to uncover susceptibility genes for COPD, namely the lack of replication commonly seen in candidate gene studies. Future candidate gene association studies need to employ rigorous genetic epidemiology methods, including adequate sample sizes, control for multiple testing, and testing for population stratification. A more systematic approach to COPD genetics, starting with genome-wide linkage analysis followed by positional candidate gene association testing and/or SNP-based fine mapping, may lead to more consistent results in the search for genetic determinants of COPD.
The authors thank Salvatore Mazza, Michael Hagar, Molly Brown, Alison Brown, and Maura Regan for their genotyping work and Robert Welch (National Cancer Institute) for providing details of the GSTM1 assay. Co-investigators in the NETT Genetics Ancillary Study include: Marcia Katz, Rob McKenna, Malcolm DeCamp, Mark Ginsburg, Neil MacIntyre, James Utz, Barry Make, Philip Diaz, Gerard Criner, Andrew Ries, Mark Krasna, Fernando Martinez, Larry Kaiser, Frank Sciurba, Zab Mosenifar, and Joshua Benditt.
This work was funded by NIH grants HL61575, HL71393, and HL075478 (E.K.S.) and by an American Lung Association Career Investigator Award (E.K.S.). C.P.H. is supported by T32-HL07427. The National Emphysema Treatment Trial (NETT) was supported by the US National Heart, Lung, and Blood Institute (contracts N01HR76101, N01HR76102, N01HR76103, N01HR76104, N01HR76105, N01HR76106, N01HR76107, N01HR76108, N01HR76109, N01HR76110, N01HR76111, N01HR76112, N01HR76113, N01HR76114, N01HR76115, N01HR76116, N01HR76118, N01HR76119), the Centers for Medicare and Medicaid Services, and the Agency for Healthcare Research and Quality. The Normative Aging Study is supported by the Cooperative Studies Program/ERIC of the US Department of Veterans Affairs and is a component of the Massachusetts Veterans Epidemiology Research and Information Center (MAVERIC).
This article has an online supplement, which is accessible from this issue's table of contents at www.atsjournals.org
Conflict of Interest Statement: C.P.H. has no declared conflicts of interest; D.L.D. has no declared conflicts of interest; C.L. has no declared conflicts of interest; A.A.L. has no declared conflicts of interest; J.J.R. has no declared conflicts of interest; D.K. has no declared conflicts of interest; N.L. has no declared conflicts of interest; J.S.S. has no declared conflicts of interest; D.S. has no declared conflicts of interest; F.E.S. has no declared conflicts of interest; S.T.W. received a grant for $900,065, Asthma Policy Modeling Study, from AstraZeneca from 1997–2003. He has been a co-investigator on a grant from Boehringer Ingelheim to investigate a COPD natural history model which began in 2003. He has received no funds for his involvement in this project. He has been an advisor to the TENOR Study for Genentech and has received $5,000 for 2003–2004. He received a grant from GlaxoWellcome for $500,000 for genomic equipment from 2000–2003. He was a consultant for Roche Pharmaceuticals in 2000 and received no financial remuneration for this consultancy; E.K.S. received grant support and honoraria from GlaxoSmithKline for a study of COPD genetics and received a $500 Speaker Fee from Wyeth for a talk on COPD genetics.
Received in original form February 18, 2005
Received in final form March 22, 2005