Published ahead of print on April 13, 2006, doi:10.1165/rcmb.2005-0359OC
© 2006 American Thoracic Society
Respiratory Epithelial Gene Expression in Patients with Mild and Severe Cystic Fibrosis Lung Disease
Department of Physiology, Department of Medicine, and Department of Pediatrics, The Johns Hopkins University School of Medicine, Baltimore, Maryland; and Department of Medicine, University of Chicago Pritzker School of Medicine, Chicago, Illinois
Correspondence and requests for reprints should be addressed to Michael P. Boyle, M.D., F.C.C.P., The Johns Hopkins University School of Medicine, 1830 E. Monument Street, 5th floor, Baltimore, MD 21205. E-mail: mboyle{at}jhmi.edu
Despite having identical cystic fibrosis transmembrane conductance regulator genotypes, individuals with ΔF508 homozygous cystic fibrosis (CF) demonstrate significant variability in severity of pulmonary disease. This investigation used high-density oligonucleotide microarray analysis of nasal respiratory epithelium to investigate the molecular basis of phenotypic differences in CF by (1) identifying differences in gene expression between ΔF508 homozygotes in the most severe 20th percentile of lung disease by forced expiratory volume in 1 s and those in the mildest 20th percentile of lung disease and (2) identifying differences in gene expression between ΔF508 homozygotes and age-matched non-CF control subjects. Microarray results from 23 participants (12 CF, 11 non-CF) met the strict quality control guidelines and were used for final data analysis. A total of 652 of the 11,867 genes identified as present in at least 75% of the samples were significantly differentially expressed in one of the three disease phenotypes: 30 in non-CF, 53 in mild CF, and 569 in severe CF. An analysis of genes differentially expressed by severity of CF lung disease demonstrated significant upregulation in severe CF of genes involved in protein ubiquitination (P < 0.04), mitochondrial oxidoreductase activity (P < 0.01), and lipid metabolism (P < 0.03). Analysis of genes with decreased expression in patients with CF compared with control subjects demonstrated significant downregulation of genes involved in airway defense (P < 0.047) and protein metabolism (P < 0.048). This study suggests that differences in CF lung phenotype are associated with differences in expression of genes involving airway defense, protein ubiquitination, and mitochondrial oxidoreductase activity and identifies specific new candidate modifiers of the CF phenotype.
Key Words: cystic fibrosis gene expression phenotype respiratory epithelium
Cystic fibrosis (CF) is the most common lethal autosomal recessive disorder in the white population. Although CF affects multiple organ systems, it is the severe lung disease that leads to the shortened life expectancy of 35.1 yr (1). The gene responsible for the disease, the cystic fibrosis transmembrane conductance regulator (CFTR), was first identified in 1989, and since then over 1,000 different mutations have been reported. ΔF508 is the most common mutation, and over 50% of individuals with CF are homozygous for it (the ΔF508/ΔF508 genotype) (1).
Although certain aspects of CF phenotype, such as pancreatic insufficiency, are determined by CFTR genotype, most other aspects are not. Within the group of
Several obstacles have made investigation into the molecular basis of this variability difficult. First has been the lack of an animal model of CF lung disease, making human studies necessary. Second has been the challenge of accurately identifying the mildest and most severe pulmonary phenotypes while controlling for external confounders, such as B. cepacia, resistant P. aeruginosa, allergic bronchopulmonary aspergillosis (ABPA), and poor nutrition. Third has been the challenge of identifying molecular differences that lead to a variation in phenotype and are not just a response to chronic infection and inflammation.
The aim of this investigation was to use high-density oligonucleotide microarray studies of in vivo nasal respiratory epithelium to investigate the molecular basis of differences in the CF phenotype by (1) identifying differences in gene expression between
We attempted to address each of the previously described obstacles by matching participants with CF for
The results of this study demonstrate that with the exception of a few potentially interesting genes, there is less difference in nasal respiratory epithelial gene expression profiles between
Participant Selection
The study was conducted at the Johns Hopkins Cystic Fibrosis Center and was approved by the Johns Hopkins and Western Institutional Review Boards (protocol #1033335). Each participant voluntarily consented to participate. To be eligible for the study, individuals with CF had to be homozygous for the ΔF508 mutation and meet FEV1 criteria that placed them in the top or bottom 20th percentile for FEV1 for their genotype and age (Table 1). These criteria were based on work published by Schluchter and coworkers (7) identifying FEV1 cutoffs that stratify severity of lung disease in ΔF508 homozygotes. Individuals with ABPA, B. cepacia, atypical mycobacteria, methicillin-resistant Staphylococcus aureus, history of significant reactive airway disease, recent viral infection, or active CF exacerbation were excluded. Non-CF control subjects were recruited from age-matched healthy volunteers. If potential participants demonstrated obvious turbinate inflammation or hemorrhage on initial visual inspection, the brushing was rescheduled for a later date.
Nasal Respiratory Epithelial Cell Collection
Bilateral nasal mucosal brushing to collect respiratory epithelium was performed on each subject with a Cytosoft cytology brush (Medical Packaging Corp., Camarillo, CA) under direct visualization of the inferior turbinate using a nasal telescope (Karl Storz Endoscopy America, Culver City, CA). The nasal mucosa was anesthetized with two sprays of 4% tetracaine. The cytology brush was then gently rotated on the mucosal surface of the inferior turbinate, removed from the nose, and agitated in a DNase/RNase-free microcentrifuge tube containing 1 ml of sterile, chilled PBS. After a 40-µl aliquot was set aside for cytologic evaluation, the sample was centrifuged, the supernatant was removed, and 500 µl of TRIzol Reagent (Invitrogen, Carlsbad, CA) was added to the tube. The sample was immediately snap frozen using liquid nitrogen and stored at −80°C.
Cytopathologic Evaluation
RNA Extraction and Preparation for Microarray Analysis
Performance of Microarrays
Quality Control Measures
Microarray Data Analysis
Probe sets identified as significantly differentially expressed by disease phenotype underwent an intensive search to identify biologic function. Probe set sequences from the Affymetrix web site were "blasted" against the University of California, Santa Cruz genome database to verify identity and update annotation. For individual genes with multiple probe set sequences specific to different regions of the gene, each probe set was checked separately. The resulting list of genes was submitted to PathwayAssist 3.0 (Stratagene, La Jolla, CA) for automated literature search. Gene Ontology classifications using GOMiner (9), conserved protein family domains, and reference literature were used to construct functional groupings of genes. GOMiner was then used to perform a two-sided Fisher's exact test to determine if a significantly greater number than expected of differentially expressed genes occurred in a category (8).
All original array data images and files are available at http://pepr.cnmcresearch.org/browse.do?action=list_prj_expandprojectId=97. This site includes data on all arrays performed. The online supplement identifies which specific arrays were used for analysis and which had unacceptable glyceraldehyde-3-phosphate dehydrogenase ratios (see Tables E1 and E2 in the online supplement).
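The category test described above can be illustrated with a minimal sketch of a two-sided Fisher's exact test on a 2x2 table (gene in the category or not, differentially expressed or not). This is not GOMiner's implementation; the function name, argument names, and the example counts are hypothetical and chosen only for illustration.

```python
from math import comb


def fisher_exact_two_sided(in_cat_de, in_cat_not_de, out_cat_de, out_cat_not_de):
    """Two-sided Fisher's exact test P value for a 2x2 table.

    Rows: gene is in the category / not in the category.
    Columns: gene is differentially expressed (DE) / not DE.
    """
    row1 = in_cat_de + in_cat_not_de   # genes in the category
    col1 = in_cat_de + out_cat_de      # DE genes overall
    total = row1 + out_cat_de + out_cat_not_de

    def weight(k):
        # Unnormalised hypergeometric probability of k DE genes in the category.
        return comb(col1, k) * comb(total - col1, row1 - k)

    k_min = max(0, row1 - (total - col1))
    k_max = min(row1, col1)
    observed = weight(in_cat_de)
    # Sum every table that is no more likely than the observed one.
    extreme = sum(weight(k) for k in range(k_min, k_max + 1) if weight(k) <= observed)
    return extreme / comb(total, row1)


# Hypothetical counts (not taken from the study): a category holding 10 genes,
# 8 of them DE, against a background of 6 genes with 1 DE.
p_value = fisher_exact_two_sided(8, 2, 1, 5)
print(f"two-sided P = {p_value:.4f}")
```

The weights are compared as exact integers, so no floating-point tolerance is needed when deciding which tables are at least as extreme as the observed one.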
LightCycler Quantitative RT-PCR Confirmation of Results
Patients and Samples
A total of 48 individuals (30 with CF, 18 non-CF) underwent nasal brushing to acquire nasal respiratory epithelial cells. Eleven of the brushings (eight CF, three non-CF) did not provide RNA of sufficient quantity or quality after purification to permit microarray analysis. The remaining 37 samples were hybridized to Affymetrix Human Genome HG-U133 A and B microarrays. Twenty-three of the U133A arrays and 21 of the matching U133B arrays passed the stringent quality control guidelines described in MATERIALS AND METHODS and were used for the final data analysis (12 CF, 11 non-CF). The characteristics of the 23 individuals used in the final data analysis are summarized in Table 2. All of the individuals with severe CF and all but two of the individuals with mild CF were infected with mucoid Pseudomonas.
Nasal Cytology
To ensure that the cells collected and analyzed were predominantly respiratory epithelial cells, the nasal brushing smears were read in a blinded fashion by the Johns Hopkins cytopathology department (Table 3). This analysis demonstrated an overall mean of 86.7 ± 7.1% respiratory epithelial cells (median, 90%; range, 73–98%). Other cell types identified were squamous cells (5.4 ± 5.6%) and inflammatory cells (7.8 ± 4.1%). There was not a significant difference in percentage of inflammatory cells between severe CF, mild CF, and non-CF samples, although the study was not powered to detect small differences. A mean of 6.6 ± 3.3% of cells were inflammatory in the non-CF group (n = 11), 9.0 ± 6.5% in the mild CF group (n = 5), and 8.9 ± 3.2% in the severe CF group (n = 7) (P = not significant).
Overview of Expression Profiling Results
A total of 11,867 of the 44,760 probe sets on the U133 A and B chips were identified as being present or marginal in at least 75% of the nasal respiratory epithelium samples. Using the GeneSpring t test with cross-gene error modeling to determine significance, 709 of the 11,867 were identified as significantly differentially expressed in one of the three disease phenotypes: 32 in non-CF, 69 in mild CF, and 608 in severe CF (Figure 1, supplemental file 1). Combining multiple probe sets for single genes and eliminating expressed sequence tags reduced the numbers to 30 differentially expressed genes in non-CF, 53 in mild CF, and 569 in severe CF.
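The presence filter described above (a probe set is kept if it is called present or marginal in at least 75% of samples) can be sketched as follows. This is a simplified illustration, not the GeneSpring procedure; the function name, the single-letter detection calls ("P", "M", "A"), and the example probe sets are assumptions made for the sketch. Only the per-phenotype probe set counts come from the text.

```python
def passes_presence_filter(calls, min_fraction=0.75):
    """Return True if a probe set is called present ('P') or marginal ('M')
    in at least `min_fraction` of the samples."""
    if not calls:
        return False
    detected = sum(1 for call in calls if call in ("P", "M"))
    return detected / len(calls) >= min_fraction


# Hypothetical detection calls for three probe sets across eight samples.
example_calls = {
    "probe_a": ["P", "P", "M", "P", "P", "A", "P", "P"],  # 7 of 8 detected
    "probe_b": ["P", "A", "A", "M", "P", "A", "P", "P"],  # 5 of 8 detected
    "probe_c": ["P", "P", "P", "M", "M", "P", "A", "A"],  # 6 of 8 detected
}
retained = [name for name, calls in example_calls.items()
            if passes_presence_filter(calls)]

# Consistency check on the counts reported in the text: 32 + 69 + 608 = 709.
probe_set_counts = {"non-CF": 32, "mild CF": 69, "severe CF": 608}
total_differential = sum(probe_set_counts.values())
```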
K-means clustering analysis divided the differentially expressed probe sets into three main groups: genes with increased expression only in non-CF (Figure 1B), genes with increased expression only in mild CF (Figure 1C), and genes with increased expression only in severe CF (Figure 1D). Few probe sets demonstrated significantly decreased expression only in mild or severe CF. One notable exception was STAT1, which demonstrated significantly decreased expression in mild CF (Figure 1B).
Genes Differentially Expressed between Non-CF and CF
Genes Differentially Expressed in Mild CF
There were 69 probe sets representing 53 genes that were significantly differentially expressed in individuals with mild CF lung disease compared with those with severe CF and non-CF control subjects. Fifty-two of the 53 genes demonstrated increased expression in mild CF. STAT1, an inducible transcription factor mediating response to IFN (12) and represented by two probe sets (Figure 1C), was the only gene significantly decreased in expression in mild CF. An analysis of the upregulated list demonstrated significant over-representation in several GO categories: lipid metabolism (P < 0.032), G protein-coupled receptors (P < 0.024), and ion transport (P < 0.03) (Table 5). In addition to the genes in these categories, two other genes of interest were found to be upregulated in individuals with mild CF lung disease: statherin (STATH) and adiponectin (ADIPOQ). STATH is a calcium-binding protein found in saliva and is primarily known for its regulation of calcium deposition (13). STATH is also well documented to have significant antibacterial properties (14) and is produced in submucosal glands of the nasal cavity and upper airway (15). ADIPOQ is a potent anti-inflammatory cytokine and inducer of IL-10 and is a modulator of insulin sensitivity (16, 17).
Genes Differentially Expressed in Severe CF
A total of 569 genes demonstrated significant upregulation in individuals with severe CF lung disease compared with those with mild disease and non-CF control subjects. Analysis of these genes by gene ontology categories revealed a striking over-representation of upregulated genes involved in oxidoreductase activity (P = 0.01), the ubiquitin cycle (P = 0.04), and lipid metabolism (P = 0.04) (Table 6). One particular cluster of upregulated oxidoreductase genes was that involved in NADH dehydrogenase:ubiquinone complex I, a mitochondrial subunit essential for electron transfer (NDUFS1, NDUFS7, NDUFB3, NDUFB5, NDUFAB1, NDUFA3). Numerous ubiquitin-conjugating enzymes were also significantly upregulated in individuals with severe CF (UBE2A, UBE2B, UBE2E1, UBE2E3, FBXW2, HIP2, and NEDD8) along with two ubiquitin-activating enzymes (UBA2 and UBE1C). Other upregulated genes of interest in severe CF included glutamate-cysteine ligase, the rate-limiting enzyme of glutathione synthesis (18), and activating transcription factor 1, a transcription factor involved in increasing IL-8 inflammatory response (19). A full list of all genes differentially expressed in individuals with severe CF is available in the online supplement (Table E3).
IL-8
The inflammatory chemokine IL-8 has previously been identified as being characteristically elevated in CF (20). IL-8 transcript levels were noted to be significantly elevated in the nasal epithelium of the patients with severe CF compared with non-CF control subjects (Figure 2). In contrast, a wide range of IL-8 transcript expression was exhibited in the patients with mild CF; although the two highest expression levels of IL-8 occurred in individuals with mild CF, the remainder had IL-8 levels similar to non-CF control subjects. Overall, the median IL-8 level in individuals with mild CF lung disease was not significantly different from that seen in non-CF (Figure 2).
STATH and DUOX2 RT-PCR Results
Because of its potential importance, we sought to verify the significantly increased expression of STATH in mild CF by quantitative RT-PCR in the samples used for microarray analysis and in an additional 12 mild and severe patient samples collected after the microarray experiments were complete. This separate analysis confirmed a significant difference in STATH expression between those with mild CF lung disease (n = 12) and those with severe CF lung disease (n = 11) (Figure 3; Kruskal-Wallis rank sum test, P = 0.042).
Decreased expression of DUOX2 in individuals with CF was also confirmed by RT-PCR in an independent larger group. DUOX2 expression was significantly lower in individuals with CF (n = 22) compared with those without CF (n = 13) (Figure 4; Kruskal-Wallis rank sum test, P = 0.047).
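For readers unfamiliar with the Kruskal-Wallis rank sum test used in the two RT-PCR comparisons above, the following is a minimal sketch for the two-group case. It is not the authors' analysis code; the function name and the expression values are hypothetical, and the P value relies on the standard chi-square approximation with 1 degree of freedom, which can be rough for very small groups.

```python
from math import erfc, sqrt


def kruskal_wallis_two_groups(group_a, group_b):
    """Kruskal-Wallis H statistic and approximate P value for two groups.

    Ties receive average ranks and H is divided by the usual tie correction.
    With two groups, H is referred to a chi-square distribution with 1 degree
    of freedom, whose upper tail equals erfc(sqrt(H / 2)).
    Assumes the pooled values are not all identical.
    """
    pooled = sorted(list(group_a) + list(group_b))
    n = len(pooled)

    # Average rank for each distinct value, plus the tie-correction term.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # mean of ranks i+1 .. j
        t = j - i
        tie_term += t**3 - t
        i = j

    h = 0.0
    for group in (group_a, group_b):
        rank_sum = sum(rank_of[value] for value in group)
        h += rank_sum**2 / len(group)
    h = 12.0 / (n * (n + 1)) * h - 3 * (n + 1)
    h /= 1 - tie_term / (n**3 - n)
    h = max(h, 0.0)  # guard against tiny negative values from rounding
    return h, erfc(sqrt(h / 2))


# Hypothetical relative expression values, for illustration only.
mild = [4.1, 5.3, 6.0]
severe = [1.2, 2.4, 3.3]
h_stat, p_value = kruskal_wallis_two_groups(mild, severe)
print(f"H = {h_stat:.3f}, P = {p_value:.4f}")
```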
Despite having identical CFTR genotypes, ΔF508 homozygous CF individuals demonstrate a full range of pulmonary disease. Although several environmental factors influencing severity of lung disease have been identified, there is growing evidence that genetic and molecular differences contribute to the significant variability seen in the CF phenotype. This study used microarray analysis of nasal respiratory epithelium to investigate the molecular basis of variability in CF phenotype by identifying differences in gene expression between ΔF508 homozygotes in the most severe 20th percentile of lung disease and those in the mildest 20th percentile and identifying differences in gene expression between ΔF508 homozygotes and age-matched non-CF control subjects. The results suggest that the most significant differences in gene expression between those with CF and those without include those involved with airway defense, antigen presentation, and protein metabolism. There are also differences in gene expression between those with mild CF lung disease and those with severe lung disease, even in nasal respiratory epithelium without evidence of significant differences in inflammation. These include differential expression of genes involving the ubiquitin cycle, oxidoreductase activity, and lipid metabolism.
A previous comparison of gene expression in CF and non-CF respiratory epithelium has been performed in mice and identified differential expression of multiple gene classes, including those involved in transcription, inflammation, intracellular trafficking, signal transduction, and ion transport (21). Because of the lack of pulmonary pathology in these CF mice, no analysis could be performed to evaluate differences in gene expression associated with severity of lung disease. Another recent study of One challenge to this study was identified from the onset: Could we identify differences in mild and severe CF respiratory epithelium that were not due just to response to chronic infection? This challenge was made evident in initial attempts to study lower airway cells from patients with severe CF by the amount of purulence in cell samples obtained by bronchoscopy in patients with CF just before lung transplantation. To address this concern, we studied gene expression in nasal ciliated respiratory epithelial cells, a commonly used surrogate in CF for lower airway respiratory cells. Just as in CF lower airway respiratory epithelium, a markedly decreased amount of CFTR reaches the apical surface membrane of nasal epithelial cells (23). Electrolyte transport characteristics have also been shown to be nearly identical (24). We tried to minimize the potential influence of active local infection by excluding individuals with symptoms of sinus or pulmonary exacerbation, with the hope that collected cells would be more likely to reflect intrinsic differences in gene expression not due solely to response to local infection.
CF versus Non-CF Differential Gene Expression
One gene ontology category demonstrating clear decreased expression in CF was that involving genes responding to biotic stimulus. These genes included Also significantly decreased in CF was insulin-like growth factor binding protein-3 (IGFBP3), a protein known to be a key modulator of the effects of insulin-like growth factor. Serum levels of IGFBP3 have been repeatedly demonstrated to be decreased in individuals with CF and in some cases correlate with lung function and nutritional status (30). Microarray analysis of CFTR null mice lung tissue has also previously demonstrated a significant decrease in insulin-like growth factor binding proteins (21). Although decreases in IGFBP3 can be related to chronic malnutrition, the significant decrease even in well nourished individuals with mild CF suggests an association between the loss of CFTR function and the decrease in insulin-like growth factor binding proteins. Six of the 30 genes downregulated in CF were involved in lipid metabolism (PIGB, PIGF, PITPNB, SC4MOL, SLC27A2, and UGCG) and are located in the endoplasmic reticulum according to GO. None of these genes has been identified as potentially involved in CF, although abnormalities in lipid metabolism in CF are well known. Despite the large number of differentially expressed genes located in the endoplasmic reticulum, there was no indication of an ER overload response. Inflammatory chemokine IL-8 has been identified as being characteristically elevated in CF (20). Although IL-8 expression was significantly elevated in the severe CF group compared with non-CF control subjects, there was much more variability in IL-8 expression in individuals with the mild CF group. The highest absolute IL-8 values in the study group were found in two individuals with mild CF lung disease and minimal sinus disease. This suggests that although IL-8 levels are usually elevated in CF, marked elevation is not necessarily associated with more aggressive disease. 
Alternatively, CF nasal respiratory epithelial cells may demonstrate IL-8 characteristics independent of lower airway respiratory epithelium.
There were no differences in CFTR transcript levels between
Mild versus Severe CF Differential Gene Expression
STATH, a calcium-binding, 43-amino-acid phosphopeptide known to have antibacterial properties, is found in saliva, nasal secretions, and the upper airway. STATH was clearly upregulated in individuals with mild CF lung disease. This increased expression was confirmed in an additional 12 mild and severe patient samples collected after the microarray experiments were complete. STATH plays a key role in the development of the oral cavity biofilm by mediating adhesion of bacteria and was recently identified as being the most prominent protein in the saliva–air interface (31). It is known to have bacterial binding epitopes that promote the growth and adhesion in the oral cavity of some organisms (Porphyromonas gingivalis and Fusobacterium nucleatum) while inhibiting the growth of others (Peptostreptococci and S. aureus) (14, 32, 33). Its antimicrobial effect on P. aeruginosa has not been investigated. Given that colonization with mucoid Pseudomonas is known to accelerate a decline in lung function in CF (3), a protein acting as a key determinant of bacterial adhesion in the oral cavity and upper airway is of significant interest.
ADIPOQ, a protein usually produced in adipocytes that potently inhibits inflammation and modulates insulin sensitivity, was also significantly upregulated in ADIPOQ is classically identified as being produced only by adipocytes, although array studies have identified ADIPOQ expression in trachea, skin, adrenal gland, thymus, and thyroid (36). Separate RT-PCR studies confirmed the presence of ADIPOQ transcripts in CF and control nasal brushings, although further studies are needed to determine whether this expression is from respiratory epithelial cells or other cell types present in the sample. Signal transducer and activator of transcription 1 (STAT1), represented by two probe sets, was the only gene identified to be significantly decreased in expression in mild CF. In mucosal T cells, STAT1 is activated by CD2 receptors (37), which are also identified by the microarray data as being decreased in CF. In epithelial cells, STAT1 is essential for cellular antiviral defense and is central in activating the transcription of IFN-induced genes, particularly nitric oxide synthase-2 (38). STAT1 activates transcription by binding directly to regulatory DNA elements (38). STAT1-deficient mice display an absence of responsiveness to IFN and are highly sensitive to infection by virus (39). It has previously been shown that STAT1 induction and activation are impaired in CF, and STAT1 has been proposed as a potential modifier of the CF phenotype (40). Although the microarray data corroborate the decrease of STAT1 in CF, it is unclear why this would be more apparent in individuals with mild CF because low levels might be expected to result in more susceptibility to infection. One possibility is the recently identified increased antiapoptotic effect of IFN in the absence of STAT1 (41). An alternative explanation is that STAT1 is usually low in all CF, but the studied severe group had more active inflammation leading to STAT1 induction. 
Individuals with severe CF demonstrated the largest number of differentially expressed genes. A total of 569 genes demonstrated significant upregulation in individuals with severe CF lung disease compared with those with mild CF lung disease and non-CF control subjects. Although the respiratory epithelial cells sampled were not acutely infected or exposed to chronic purulence, the increased number of differentially expressed genes may not only reflect intrinsic differences but also cellular exposure to elevated serum levels of circulating inflammatory mediators present in individuals with severe CF lung disease.
Genes involved in the ubiquitin cycle, oxidoreductase activity, and lipid metabolism were those most strongly upregulated in severe CF. Of the 569 upregulated genes identified, nine were ubiquitin-activating and ubiquitin-conjugating enzymes. This strongly suggests a significant increase in the activity of the ubiquitin system in individuals with severe CF. Among the numerous upregulated ubiquitin cycle genes were the specific ubiquitin-activating enzyme UBA2 and its ubiquitin-like protein target NEDD8 (42). Also upregulated was ubiquitin-conjugating enzyme HIP2 (E225K), which by its covalent attachment of ubiquitin identifies proteins for intracellular proteolysis by the 26S proteasome (43). Six of the 569 upregulated genes in severe CF were subunits of the NADH:ubiquinone oxidoreductase complex I, the initial enzyme complex in the electron transport chain of mitochondria. Complex I catalyses the first step in the respiratory electron transport chain in mitochondria, the reduction of ubiquinone by NADH (44). It also produces superoxide in the mitochondrial matrix, which is converted by superoxide dismutase into hydrogen peroxide (45). Abnormalities in NADH dehydrogenase have been identified in CF (46). Further investigation is required to determine if the upregulation of complex I seen in individuals with severe CF is due to increased oxidative stress or is a primary contributor to pathology. One challenge that we had to address in this study was the risk of type I error (false-positive differentially expressed genes) due to the multiple comparisons present in microarray analysis. There are several classic approaches to controlling for multiple comparisons; however, the normal physiologic variability and overlap between CF and non-CF gene expression in vivo, even in genes known to be differentially expressed, such as IL-8 (47), makes the stringent P values required for significance by these classic methods nearly impossible to obtain. 
This was noted recently in a microarray study of CF versus non-CF epithelium grown in cell culture by Zabner and coworkers (22). Even when they analyzed numerous CF and non-CF epithelial cell samples grown under tightly controlled conditions, their initial correction for multiple comparisons resulted in 0 of 22,238 tested genes being identified as significantly differentially expressed. Our goal was to use available statistical tools to minimize the possibility of type I error as much as possible without being so stringent that we would eliminate all leads to potentially important differentially expressed genes. We did this by first minimizing variability before conducting ANOVA by using a two-component Rocke-Lorenzato model normalization, which corrects for the absolute error that dominates at low expression and the relative errors present at high expression levels (8). Second, we used a deviation from median correction to minimize the effect of outliers. These adjustments resulted in a reduction of 70.1% in the number of genes identified as significantly differentially expressed compared with ANOVA alone. Next, we used gene ontology group analysis to further reduce type I error because false-positive, differentially expressed genes should be randomly distributed across ontology groups. Finally, we confirmed that our key findings in individual genes were not due to false discovery by verifying the STATH and DUOX2 data by RT-PCR in a larger, separate CF population.
There are other potential limitations to using an array approach to identify candidate modifiers of CF phenotype. First is the lack of tight correlation between transcript levels and functional protein expression. All of the candidate genes identified here require further study at the protein level. Second is an inability to detect meaningful modifiers that occur in only a small percentage of the population.
An example of this is the nonfunctional variant of the mannose-binding lectin gene that is present in 5–10% of the population and has been suggested as being associated with severe CF lung disease (48). The statistics of microarray analysis do not identify as significant a marked decrease in expression in only 5–10% of the samples. Finally, although using nasal respiratory epithelial cells for analysis is well accepted and provides some significant advantages, these cells may not fully represent the characteristics of lower airway cells.
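To make the multiple-comparisons problem discussed above concrete, the sketch below works through the arithmetic of a Bonferroni correction, one classic approach. The text does not name which classic methods were considered, and the nominal level of 0.05 is an assumption; only the test counts (11,867 probe sets here and 22,238 genes in the study by Zabner and coworkers) come from the text.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test P value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


def expected_false_positives(alpha, n_tests):
    """Expected number of false positives if every null hypothesis were true
    and each test were judged at the uncorrected level alpha."""
    return alpha * n_tests


ALPHA = 0.05  # assumed nominal level; not stated in the text

for label, n_tests in [("this study (probe sets)", 11_867),
                       ("Zabner and coworkers (genes)", 22_238)]:
    cutoff = bonferroni_threshold(ALPHA, n_tests)
    chance_hits = expected_false_positives(ALPHA, n_tests)
    print(f"{label}: Bonferroni cutoff {cutoff:.2e}, "
          f"about {chance_hits:.0f} chance hits if uncorrected")
```

Under these assumptions a single gene would need a P value on the order of a few parts per million to survive correction, which illustrates why such cutoffs are hard to reach with in vivo samples.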
Overall, the number of genes differentially expressed in the nasal respiratory epithelial cells of individuals with CF compared with non-CF control subjects is less than might be expected. This is particularly true for cells from individuals with mild CF lung disease, with only 69 of the 44,760 assessed probe sets being clearly differentially expressed. This finding is consistent with the recent study by Zabner and colleagues of
In summary, this study provides the first in vivo comparison of respiratory epithelial cell gene expression profiles in
The authors acknowledge the work of the Children's National Medical Microarray Center in processing the samples.
This work was supported by grants K23-HL071847, U01-HL66618, R025-CR02, CFFMerlo00Q0, R01-HL68927, and BAA HL 0204 from the NHLBI and the Cystic Fibrosis Foundation.
This article has an online supplement, which is accessible from this issue's table of contents at www.atsjournals.org.
Conflict of Interest Statement: None of the authors has a financial relationship with a commercial entity that has an interest in the subject of this manuscript.
Received in original form September 21, 2005. Accepted in final form April 3, 2006.