help button home button
AJRCMB
HOME HELP FEEDBACK SUBSCRIPTIONS ARCHIVE SEARCH TABLE OF CONTENTS

Published ahead of print on August 28, 2003, doi:10.1165/rcmb.2003-0214OC
This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)
Right arrow All Versions of this Article:
2003-0214OCv1
30/3/296    most recent
Right arrow Alert me when this article is cited
Right arrow Alert me if a correction is posted
Services
Right arrow Similar articles in this journal
Right arrow Similar articles in PubMed
Right arrow Alert me to new issues of the journal
Right arrow Download to citation manager
Right arrow reprints & permissions
Citing Articles
Right arrow Citing Articles via Google Scholar
Google Scholar
Right arrow Articles by Shultz, M. A.
Right arrow Articles by Buckpitt, A. R.
Right arrow Search for Related Content
PubMed
Right arrow PubMed Citation
Right arrow Articles by Shultz, M. A.
Right arrow Articles by Buckpitt, A. R.
American Journal of Respiratory Cell and Molecular Biology. Vol. 30, pp. 296-310, 2004
© 2004 American Thoracic Society
DOI: 10.1165/rcmb.2003-0214OC

Gene Expression Analysis in Response to Lung Toxicants

I. Sequencing and Microarray Development

Michael A. Shultz, Lu Zhang, Yi-Zhong Gu, Gregory L. Baker, Michelle V. Fannuchi, Allan M. Padua, William A. Gurske, Dexter Morin, Sharron G. Penn, Stevan B. Jovanovich, Charles G. Plopper and Alan R. Buckpitt

Global Research, Amersham Biosciences (SV) Corp., Sunnyvale; Department of Anatomy, Physiology, and Cell Biology, and Department of Molecular Biosciences, School of Veterinary Medicine, University of California Davis, Davis, California

Address correspondence to: Michael A. Shultz, Dept. of Molecular Biosciences, School of Veterinary Medicine, University of California Davis, Davis, CA 95616. E-mail: mashultz{at}ucdavis.edu


    Abstract
 Top
 Abstract
 Introduction
 Materials and Methods
 Results
 Discussion
 References
 
A key challenge in measuring gene expression changes in the lung in response to site-selective toxicants is differentiating between target and nontarget areas. The toxicity for the cytotoxicant 1-nitronaphthalene is highly localized in the airway epithelium. Target cells comprise but a fraction of the total lung cell mass; measurements from whole lung homogenates are not likely to reflect what occurs at the target site. Additionally, the use of generic microarrays to measure expression in airway epithelium may not provide a good representation of transcripts present at the site of toxic action. cDNA libraries from airway and alveolar subcompartments of rat lung were sequenced for the development of a custom microarray representative of these lung regions. We identified 7,460 nonredundant rat lung sequences. Nearly 30% of the sequences on this array are not present on the Affymetrix Rat GeneChip 230. A 20,000-element microarray was developed that delineates differences in gene expression between subcompartments. This is the first in a series of articles employing this microarray for detecting gene expression changes during acute injury produced by 1-nitronaphthalene and subsequent repair.

Abbreviations: 1-nitronaphthalene, 1-NN • ampicillin, amp • Basic Local Alignment Search Tool, BLAST • contiguous sequences, contigs • coefficient of variation, CV • expressed sequence tag, EST • glyceraldehyde-3-phosphate dehydrogenase, GAPDH • Luria Broth, LB • polymerase chain reaction, PCR • relative fluorescent unit, RFU • sodium dodecyl sulfate, SDS • saline-sodium citrate, SSC


    Introduction
 Top
 Abstract
 Introduction
 Materials and Methods
 Results
 Discussion
 References
 
The lung is a major target organ for numerous compounds that reach respiratory tissue both via the circulation and by inhalation. In some cases, selective injury to the lung is a consequence of unique exposure. For example, oxidant gases such as ozone result in selective injury to centriacinar regions of the lung (1). In other cases, the lung is selectively targeted because of its unique biochemical and physiologic functions. The selective lung injury caused by paraquat is related to the active transport of this chemical from the bloodstream resulting in high levels in lung tissue (2). Likewise, a number of compounds cause selective injury to pulmonary epithelial cells irrespective of the route of administration because they undergo metabolic activation by the cytochrome P450 monooxygenases, enzymes which are selectively localized in three of the more than 40 cell types found in the lung—Clara cells, type II cells, and macrophages (3).

1-Nitronaphthalene (1-NN) and various methylnitronaphthalenes are generated by gas phase atmospheric reactions which occur between N2O5 and the parent naphthalene/methylnaphthalene moiety (4). Parenteral administration of 1-NN to rats produces dose-dependent injury to epithelial cells of conducting airways (5) which appears to be related to the generation of electrophilic intermediates by cytochrome P450 monooxygenases (6). Although extensive literature detailing many of the steps involved in the initial metabolic activation and detoxication of bioactivated cytotoxicants including 1-NN exists (7), little is known about the downstream response of the lung to toxicants that produce injury to airway epithelial cells.

The advent and refinement of approaches to analyze expression levels of thousands of genes has provided an important tool for identifying key pathways associated with various disease processes (8): for use as a diagnostic tool to determine tumor type, drug sensitivity, and prognosis (9); for monitoring changes in gene expression during the process of lung development (10); and for interrogating alterations associated with stressors including toxic chemicals (11). A considerable number of studies in toxicology have focused on using gene expression profiles to categorize hepatotoxicants. In general, these studies have demonstrated that necrotic agents can be differentiated from cholestatic agents as well as peroxisome proliferators based on their expression profiles (12).

Far less attention, however, has focused on changes in gene expression in the lung associated with toxicant exposure, and much of this work has been conducted with in vitro models. Studies in isolated human bronchial cells have shown upregulation of genes associated with apoptosis at early times after exposure to cigarette smoke or H2O2 followed by induction of stress proteins, ubiquitin, and genes responsible for the deactivation of reactive oxygen species such as glutaredoxin and dihydrodiol dehydrogenase (13). Likewise, in vitro studies in malignant and nonmalignant human lung cell lines have shown that 2,3,7,8-tetrachlorodibenzo-p-dioxin exposure alters the expression of numerous genes associated with cell signaling, cell cycle, and xenobiotic metabolism (14). More recent work has used gene expression analysis to show that the NO-synthase inhibitor, N(G)-nitro-L-arginine methyl ester, attenuates many of the alterations in expression of cytokines and restores surfactant protein expression associated with nickel exposure in whole animals (15). The correlations of changes in gene expression with the severity of toxicity noted in this work clearly demonstrates the power of such studies in identifying key cellular changes associated with toxicity. However, none of the studies cited above provide a comprehensive view of the changes in gene expression that occur in target sites, such as conducting airways, at times which correspond to both the initial injury and repair phases associated with exposure to a lung toxicant. In the few cases where in vivo models have been used to study the response of the lung to external stimuli or to define changes occurring with development, target areas of the lung have not been separated from nontarget areas.

The work reported here has focused on the development and validation of a custom microarray capable of monitoring changes in expression in both airways and parenchyma of rats exposed to 1-NN, a pulmonary cytotoxicant selective for airway epithelial cells. This array was developed by sequencing expressed genes obtained from airways and parenchyma of control and toxicant-treated rats. This approach ensured that low copy number transcripts from lower abundance cell types and from specific areas of the lung known to be susceptible to environmental toxicants were represented on the microarray. This work clearly demonstrates substantial differences in genes expressed in airways compared with parenchyma. This work also reports a method for identifying tissue-specific transcripts from less sequenced species, and demonstrates the value of building custom tissue-specific arrays for interrogating gene expression changes in the lung or other tissues. These studies are the first in a series designed to comprehensively evaluate changes in gene expression at early times after toxicant exposure where injury is developing as well as later time points where repair is occurring.


    Materials and Methods
 Top
 Abstract
 Introduction
 Materials and Methods
 Results
 Discussion
 References
 
Animals
Male Sprague-Dawley rats (150–200 g) were purchased from Harlan (Indianapolis, IN). Animals were provided food (Purina Lab Chow) and water ad libitum and housed in HEPA-filtered cage racks in an AAALAC-accredited facility at the University of California, Davis for at least 5 d before use in an experiment.

cDNA and Plasmid Library Construction
1-NN–treated (100 mg/kg, administered intraperitoneally in 0.2 ml corn oil/100 g body weight) and control rats (no treatment) were killed 2 h after dosing by administration of lethal doses of sodium pentobarbital. Airways and parenchyma were obtained by dissection using previously described methods (16). Briefly, the trachea was exposed, cannulated, and lungs were removed and inflated with 1% agarose in Waymouth's medium. Total RNA was isolated using TRIzol (Invitrogen, Carlsbad, CA). Samples were DNase digested and mRNA was isolated using OligoTex resin (Qiagen, Valencia, CA). cDNA was synthesized using the TimeSaver cDNA Synthesis Kit with a Directional Cloning Toolbox (Amersham Biosciences, Piscataway, NJ) following the manufacturer's protocols. This incorporated flanking Eco RI and Not I sites for the directional cloning of cDNA during the development of plasmid libraries.

The quality of cDNA for all libraries was assessed after agarose gel electrophoresis. Lengths of synthesized cDNA ranged from 100–5,000 bp. cDNA was size-selected for products > 400 bp by gel purification using the QIAEX II Gel Extraction System (Qiagen).

Plasmid libraries were generated from each size-selected cDNA library by directional cloning (Eco RI/Not I) of cDNA into the multiple cloning site of pGEM-11Zf(-) (Promega, Madison, WI). Plasmid libraries were plated onto Luria Broth (LB) Agar containing ampicillin (amp), 5-bromo-4-chloro-3-indolyl-ß-D-galactoside, and isopropyl-ß-D-thiogalactopyranoside. White bacterial colonies (9,200 per library = 100 x 96-well microtiter plates with 92 colonies plus 4 blank negative controls) were picked and grown overnight at 37°C in 96-deep well culture plates containing 0.5 ml LB-amp. Glycerol stocks for 36,800 cDNA clones were prepared in duplicate for each 96-well plate grown and stored at -80°C.

cDNA Amplification and Sequencing
Each cDNA clone was polymerase chain reaction (PCR) amplified directly from glycerol stock stabs in 20-µl reactions containing T7/SP6 primers (0.125 µM each), 0.25 mM dNTPs, and 0.05 U Taq Polymerase. Reaction conditions consisted of 1 x 95°C for 5 min followed by 35 x (95°C for 30 s, 55°C for 45 s, 72°C for 60 s). PCR products were treated with Exonuclease I and Shrimp Alkaline Phosphatase (37°C for 30 min followed by 85°C for 15 min) in preparation for sequencing. Representative PCR reaction products were sampled for product confirmation by gel electrophoresis. Representative plates typically showed an 85% success rate (one PCR product per lane).

PCR products were sequenced from the 5' end in 10 µl reactions containing 5 µl (1:10 dilution) PCR product, 1 µl T7 primer (5 pmoles/µl), and 4 µl DYEnamic ET Dye Terminator Reaction Mix (Amersham Biosciences). Mixtures were cycle sequenced 30 x (95°C for 15 s, 54°C for 15 s, 60°C for 60 s). Sequencing products were purified by ethanol precipitation and were separated by capillary electrophoresis using a MegaBACE 1000 DNA Sequencing System (Amersham Biosciences) and data analyzed using MegaBACE Sequence Analyzer software (Amersham Biosciences).

Sequence Data Analysis
Sequence processing was performed using Phred, Phrap, and Cross-match software (University of Washington, Seattle, WA). Raw sequencing electropherograms were processed using Phred, generating base-called sequences and corresponding quality scores for each base. Phred quality scores of 10, 20, 30, 40, and 50 represent an accuracy of the base call at 90, 99, 99.9, 99.99, and 99.999%, respectively. Any vector sequence flanking the cDNA insert sequence was identified using Cross-match. Sequences were end-trimmed to eliminate poor quality bases, vector sequence, and polyA/T signals using custom Perl scripts. Processed sequences were assembled into contiguous sequences (contigs) using Phrap for the removal of as many redundant sequences as possible. Sequences not incorporated into any assembled contigs remained as singlets.

Sequence similarity searches for all contigs and singlets were performed against public GenBank databases (http://www.ncbi.nlm.nih.gov) using the Basic Local Alignment Search Tool (BLAST) (BLASTN 2.2.4, Aug-26–2002, NCBI) for identification of cDNA clones. Databases searched included those from the functionally annotated NCBI Reference Sequence project (RefSeq) databases for rat (rat.fna, Nov-19–2002, 4,023 sequences), mouse (mouse.fna, Nov-21–2002, 12,683 sequences), and human (human.fna, Nov-19–2002, 18,232 sequences) as well as the nt database (May-30–2002, 1,218,445 sequences).

cDNA Microarray Construction
A cDNA clone set was identified (as above) from a reduced expressed sequence tag (EST) set for the development of a rat lung microarray. Glycerol stocks corresponding to these minimally redundant ESTs were re-arrayed into new 96-well microtiter plates, and stabs were grown in 0.5 ml LB-amp overnight. Glycerol stocks from this cDNA clone set were created in duplicate and stored at -80°C. Purified plasmids were prepared and stored at -20°C. Each cDNA clone was PCR amplified from the purified plasmid preparations in a 100-µl reaction containing T7/SP6 primers (0.125 µM each), 0.25 mM dNTPs, and 0.05 U Taq Polymerase. Reaction conditions consisted of 1 x 95°C for 5 min, 37 x (95°C for 30 s, 62°C for 60 s, 72°C for 60 s), and 72°C for 5 min. PCR products were purified using MultiScreen-FB filter plates (Millipore, Billerica, MA). PCR products were bound to filter plates in 3.6 M guanidine-HCl/100 mM potassium acetate (pH 4.8), washed four times with 80% ethanol, and eluted in 60 µl 10 mM Tris-HCl (pH 8.0). All PCR reaction products were separated by agarose gel electrophoresis for product confirmation. Representative 96-well plates were sequence confirmed.

PCR products (1:1 mixture with dimethyl sulfoxide) were arrayed into 384-well microarray spotting plates. PCR products and 1 control plate (see below) were spotted using a Lucidea Array Spotter (Amersham Biosciences) in normal mode (2 replicates; 200 µ row and column pitch; 0.9 – 1.1 mm slide thickness setting; 65% humidity) onto Type 7*reflective amino-silane coated glass slides (Amersham Biosciences). Resulting microarrays contained ~ 20,000 distinct spots grouped into a 4 x 12 grid pattern (400 spots/grid). Spotted slides were allowed to sit at 65% humidity for 1 h, and were ultraviolet cross-linked and stored in a desiccator at room temperature until use.

A control plate containing positive controls (rat ß-actin and glyceraldehyde-3-phosphate dehydrogenase [GAPDH]), negative controls (various yeast and Escherichia coli transcripts), and water blanks was incorporated into each spotted microarray. Positive controls were amplified using rat ß-actin and GAPDH control amplimer sets (Clontech, Palo Alto, CA). Spotting concentrations for ß-actin and GAPDH positive controls were 100, 50, 25, 12.5, 6.25, and 3.125 ng/µl. Spots at each concentration were represented 12 times per array. Spotting concentrations for negative controls (5 yeast and 15 E. coli transcripts) were 50 ng/µl, and each was represented 12 times per array. There were 384 water blanks spotted per array.

Probe Synthesis and Microarray Hybridization
To determine the concentrations of spotting material generated from cDNA clones, microarrays were hybridized with Panomer 9 random-sequence oligodeoxynucleotides conjugated with Alexa Fluor 633 (Panomer-633; Molecular Probes, Eugene, OR). Data generated from these hybridizations were used to determine the initial concentrations of spotting material. Representative microarrays from four separate spotting sessions were hybridized with 40 pmoles Panomer-633 in 40 µl under a glass coverslip (10 min at room temp). Following hybridization, cover slips were floated from each slide in 2x saline-sodium citrate (SSC)/0.2% sodium dodecyl sulfate (SDS, room temp), and slides were washed consecutively in 2x SSC/0.2% SDS (room temp, 5 min with gentle shaking), 0.1x SSC (5 dips, room temp), rinsed in water (5 dips), and dried under an N2 stream. Slides were scanned using a GenePix 4000B microarray scanner (PMT 600 V at 635 nm; Axon Instruments, Union City, CA), and images were processed using Lucidea Automated Spotfinder 1.0 software (Amersham Biosciences).

For airway and parenchyma probes, RNA was isolated from microdissected subcompartments obtained from individual lungs of five untreated rats (n = 5). Detailed methods and validation of the quality of mRNA have been presented separately (17). Briefly, rats were exsanguinated, the trachea was exposed and cannulated, and lungs were removed from the body and inflated with RNAlater (Ambion, Austin, TX). Lungs were stored submerged in RNAlater at 4°C until airways and parenchyma were obtained by dissection. Total RNA was isolated using TRIzol (Invitrogen) followed by an RNeasy protocol with on-column DNase digestion (Qiagen). mRNA was isolated using OligoTex (Qiagen). Cy3- and Cy5-labeled cDNA probes were prepared from 1 µg mRNA using the CyScribe cDNA Post-Labeling Kit (Amersham Biosciences) following the manufacturer's protocol. Briefly, first-strand cDNA was synthesized (42°C, 2 h) incorporating amino-allyl dUTP, and mRNA template was removed by degradation under basic conditions. Single-stranded cDNA was purified using a QIAquick PCR Purification Kit (Qiagen). Equal aliquots of eluants were split (~ 0.5 µg single-stranded cDNA each) for labeling with either Cy3- or Cy5-NHS esters, and probes were purified using QIAquick. Spectral properties of each probe were determined at 260 nm, 550 nm (Cy3), and 650 nm (Cy5). Amount of probe (ng), total dye content (pmoles), and specific activity (number of Cy molecules incorporated/number of bases) were calculated.

Purified probes were matched (airway versus parenchyma from individual animals) based on total dye content (30–50 pmoles of each), combined and evaporated in a vacuum centrifuge. Evaporated probes were resuspended in 30 µl hybridization solution (50% formamide, 50 ng/µl mouse Cot-1 DNA, 50 ng/µl polyA, 1x V2 Amersham hybridization buffer), placed on a microarray under a glass coverslip (pre-washed with ethanol and dried under an N2 stream), and hybridized 16–18 h at 42°C in a humidified chamber. Following hybridization, coverslips were floated off the slide in 2x SSC/0.1% SDS (room temp), and slides were washed consecutively (with gentle shaking) in 2x SSC/0.1% SDS (55°C, 5 min), 1x SSC/0.1% SDS (55°C, 5 min), 0.1x SSC (3 x 20 dips, room temp), rinsed in water (20 dips), and dried under an N2 stream. Slides were scanned using a GenePix 4000B microarray scanner (Axon Instruments) (PMT 750 V for Cy3 and 775 V for Cy5), and images were processed using Lucidea Automated Spotfinder 1.0 software (Amersham Biosciences).

Microarray Data Analysis
Expression differences were assessed using mRNA isolated from airway and parenchymal tissue from five rat lungs. Hybridizations were performed comparing subcompartments from individual animals, with Cy3 representing the airway and Cy5 representing the parenchyma. Additionally, dye-swap hybridizations were performed for each animal (Cy5, airway; Cy3, parenchyma) to minimize any effect from dye bias.

Normalization factors were calculated for each dye channel from each array by finding the median for all 50 and 100 ng/µl ß-actin positive control spots (24 spots total per dye channel). Signal intensities from either 50 or 100 ng/µl spotted ß-actin DNA did not differ significantly from each other. Normalization factors were applied to relative fluorescent unit (RFU) signals for all spots from each slide (Cy3 to Cy5 normalization). The normalization factors were also applied to RFU signals to normalize across slides for both dyes. For each animal, four normalized RFU signals (two duplicates/slide x two replicate slides) for either subcompartment were obtained per transcript, and their geometric means were calculated to yield one value per transcript per animal. Means and standard deviations for each lung subcompartment were calculated per transcript from the five individual animal geometric means of RFU signals.

The method for determining significant gene expression differences between the airway and parenchyma RFU signals (n = 5) involved a two-sample unpaired t test with a two-tailed distribution assuming unequal variance. The conversion of RFU to % actin level was calculated using the mean RFU signal for all normalized ß-actin spots at 50 and 100 ng/µl for 10 hybridizations.

Signal ratios (airway/parenchyma) also were calculated using the normalized RFU signals. Four ratios per animal (duplicates and inverse replicates) were obtained per transcript. Their geometric means were calculated to yield one ratio value per transcript per animal. Means and standard deviations were calculated from the geometric means of ratios obtained from five individual animals.

To determine whether there was any dye bias effect introduced from using the secondary labeling procedure described in these studies, normalized hybridization data from airway samples labeled with Cy3 and Cy5 were compared with each other (Airway-Cy3 versus Airway-Cy5). Similarly, Parenchyma-Cy3 data were compared with Parenchyma-Cy5. Additionally, ratios generated from Airway-Cy3/Parenchyma-Cy5 were compared with ratios generated from Airway-Cy5/Parenchyma-Cy3. Data for such comparisons were normalized as described above and generated as follows: two normalized RFU signals (two duplicates/slide) for either subcompartment/dye were obtained per transcript; their means were calculated; and means and standard deviations for each lung subcompartment/dye were calculated per transcript from each of the five individual animal RFU signals. Signal ratios (airway/parenchyma) also were calculated using the normalized RFU signals. Two ratios per animal (duplicates) were obtained per slide, and their means calculated. Means and standard deviations were calculated from five individual animal means.


    Results
 Top
 Abstract
 Introduction
 Materials and Methods
 Results
 Discussion
 References
 
Rat Lung cDNA Library Sequencing
A final set of 22,940 high quality sequences (accuracy of base call = 99.95%, average sequence length 516 bases) was available for analysis (Table 1). All high quality sequences were assembled into contigs to remove redundant data, and this resulted in a reduced rat lung sequence set of 8,746. Roughly 80% of the sequences assembled into 3,956 contigs, whereas 20% of the sequences remained as unincorporated singlets. The 80:20 ratio of contigs to singlets was consistent across all four libraries (airway and parenchyma) from control and 1-NN–treated rats.


View this table:
[in this window]
[in a new window]
 
TABLE 1 Rat lung sequence set data reduction after removal of poor quality data, contiguous sequence assembly with Phrap, and after removal of further redundants identified after BLAST

 
For EST identification, alignments were performed with this reduced rat lung sequence set against currently available NCBI sequence databases including the nt database and the RefSeq databases from rat, mouse, and human. ESTs were further analyzed to identify any other unincorporated redundant sequences. BLAST results were searched for replicates by comparing GenBank accession numbers and also by manually searching for replicates in gene identities (for ESTs not sharing accession numbers as for homologous sequences from different species). This resulted in a final set of 7,460 nonredundant ESTs (Table 1).

The data in Figure 1A represent the best BLAST alignments for each of the 7,460 ESTs. Results are classified by the source species supplying the best alignment as well as by its e-value score derived from the BLAST search. An e-value of 1 x 10-5 was used as the lowest confidence threshold for positive identification, and all ESTs with e-values higher than this threshold value were classified as unidentified and possibly novel. Of the sequences searched, 82% (6,130) showed some degree of homology to sequences in the public databases (Table 2). Of the nonredundant rat lung library sequences, 19% were best identified by a rat sequence, 49% by a mouse sequence, 14% by a human sequence, and 0.4% by sequences from other species. Of the 6,130 positively identified ESTs, only 3,414 had known biological function according to their best BLAST alignment (Table 2). The remaining 2,716 sequences, largely derived from EST or bacterial artificial chromosome clone sequencing projects, had no known function.



View larger version (27K):
[in this window]
[in a new window]
 
Figure 1. (A) Representation of 7,460 rat lung sequences identified by BLAST against sequences in the NCBI public databases. All sequences were aligned against the rat, mouse, and human RefSeq databases (Nov. 2002) and the nt database (May 2002). Data are classified by the source species supplying the best alignment from all database sequences. Identified sequences were rated by their BLAST e-value scores. The confidence level of the match is color-coded by the e-value score. Confidence level of the alignment is inversely proportional to the e-value with 0 representing the highest quality match. An e-value of 1 x 10-5 was used as the lowest confidence threshold, and all sequences with this score or greater were considered novel. (B) Venn diagram representing the overlap in content between the rat lung sequence set and the Affymetrix GeneChip Rat Expression Set 230. Rat lung sequences were aligned against the Affymetrix rat sequences using BLAST. Overlapping sequences were color-coded by their BLAST e-value scores.

 

View this table:
[in this window]
[in a new window]
 
TABLE 2 Sequence identity breakdown for unique rat lung sequence set after BLAST alignments with public databases

 
Slightly less than 18% (1,330) of the ESTs from the rat lung sequences were determined to be novel (Table 2). Of these 1,330 ESTs, 606 and 471 were singlet sequences obtained from airway and parenchyma libraries, respectively. The remaining 253 ESTs were contig sequences condensed from 563 individual sequences—319 from airway libraries and 244 from parenchyma libraries. Comparisons of the length and quality scores for positively identified versus novel sequences showed that the average lengths were similar (known: 555 ± 138 bases and novel: 491 ± 170 bases), and the average Phred quality score per base was also similar (known: 36 ± 6 and novel: 34 ± 6).

An alternative approach to monitoring alterations in gene expression to lung toxicants would have been to use arrays that are commercially available. The current Affymetrix Rat GeneChip (Set 230, April 2,003) consists of 31,142 sequences designed from GenBank, dbEST, RefSeq, the UniGene database (Build 99, June 2002) and refined by analysis and comparison with a draft assembly (June 2002) of the rat genome (Baylor College of Medicine Human Genome Sequencing Center, Houston, TX). Of the ESTs identified through the current studies, nearly one-third are not represented on the current Affymetrix array (Figure 1B). Of the sequences which overlapped with those on the Affymetrix array (71%), e-value scores indicated that more than 58% were either identical or excellent matches to rat sequences in the public databases.

The most highly sequenced transcripts from all four libraries were determined from the nonredundant EST set, and the frequencies (as a percentage of sequences obtained from each library) for the five most abundant sequences ranged from 0.7–4.6% (control airway), 0.7–1.8% (1-NN–treated airway), 0.7–1.8% (control parenchyma), and 1.3–2.8% (1-NN–treated parenchyma). The five most highly sequenced transcripts from the control airway library (6,865 sequences) were dimethylarginine dimethylamino hydrolase (313 copies), Clara cell secretory protein (155 copies), synaptic vesicle glycoprotein 2b (61 copies), ß-carotene 15, 15-dioxygenase (60 copies), and thymosin ß-4 (45 copies). The top five transcripts sequenced from the airway after 1-NN treatment (5,152 sequences) were Clara cell secretory protein (92 copies); dimethylarginine dimethylamino hydrolase (71 copies); palate, lung, and nasal epithelium carcinoma–associated mRNA (71 copies); tyrosine aminotransferase (49 copies); and synaptic vesicle glycoprotein 2b (37 copies).

The five most highly sequenced transcripts from the control parenchyma library (4,997 sequences) were surfactant-associated protein 1 (91 copies), eukaryotic translation elongation factor 1 {alpha} 2 (71 copies), synaptic vesicle glycoprotein 2b (54 copies), selenoprotein P (39 copies), and surfactant-associated protein C (38 copies). After treatment with 1-NN, the top five most sequenced transcripts from the parenchyma (5,926 sequences) were surfactant-associated protein C (167 copies), surfactant-associated protein 1 (136 copies), Clara cell secretory protein (118 copies), lysozyme (110 copies), and dimethylarginine dimethylamino hydrolase (76 copies).

Rat Lung cDNA Microarray Development and Quality Control
A minimally redundant cDNA clone set derived from this rat lung sequencing project was used to generate a rat lung cDNA microarray (as described in MATERIALS AND METHODS) including blanks and positive and negative controls. The resulting microarrays contained ~ 20,000 distinct elements with each PCR product spotted twice. The concentrations of spotting material generated from cDNA clones were determined after spotting from hybridizations with fluorescently-labeled random-sequence oligo-deoxynucleotides. The average spotting concentration of the PCR products (as determined from the control standard curves) was 50.2 ± 23.7 ng/µl (mean ± SD, n = 8,887).

The data obtained after hybridizations with rat airway and parenchyma probes were analyzed to determine the spotting concentrations that would allow for minimal spot saturation. A standard curve generated from ß-actin–positive controls was used to determine that spotting concentrations at or above 30 ng/µl were sufficient to avoid saturation of the target on the slides for highly expressed transcripts such as ß-actin. Over 80% of the PCR products were spotted at concentrations at or above 30 ng/µl. Spot signal intensities for ß-actin decreased at a linear rate below a spotting concentration of 25 ng/µl. Approximately 2% of the spots on these microarrays were derived from concentrations of spotting material at or below 5 ng/µl.

The housekeeping genes (ß-actin and GAPDH) spotted on each array as positive controls were used to determine the variability in the data collected from airway versus parenchyma hybridizations (n = 10, see MATERIALS AND METHODS). Actin transcript levels showed a coefficient of variation (CV) between 11 and 18% in the airways and between 12 and 25% in the parenchyma for DNA spotted at 50 or 100 ng/µl. GAPDH transcript levels for DNA spotted at these concentrations showed a CV between 17 and 23% in the airways and between 14 and 24% in the parenchyma. CVs from all normalized expression data averaged 26.6 ± 13.4% (airways) and 25.4 ± 12.8% (parenchyma). The signal intensities from negative controls (PCR products from E. coli and yeast genes spotted at 50 ng/µl) averaged between 0.4 and 0.9% actin level with CVs ranging from 67–89% for either airway or parenchyma. For selected transcripts, PCR products were deposited in several quadrants of each slide allowing for replicate analysis within an individual microarray. Signal ratios generated from such replicate data points (3–19 replicates) correlated well with CVs averaging < 20% regardless of spotting concentration.

Studies to determine whether there was significant dye bias introduced by the secondary labeling procedure used in this investigation showed that, after normalization of signal intensities for all elements on slides hybridized with Airway-Cy3 versus Airway-Cy5 or Parenchyma-Cy3 versus Parenchyma-Cy5, data were virtually identical regardless of the dye used. When comparing ratio data (Airway-Cy3/Parenchyma-Cy5 or Airway-Cy5/Parenchyma-Cy3), signal ratios for over 80% of the spots were within 30% of each other (~ 90% of spots within 40% of each other). Only 2% of the spots showed very poor matches between dye-swap data—varying more than 100% of each other. These data were derived from spots yielding much lower transcript signals (< 1% actin level) and significantly higher CVs than data with better correlations.

Transcript Expression Differences Between Rat Lung Airways and Parenchyma by Microarray
Before investigations using these microarrays for assessing gene expression changes in response to pulmonary cytotoxicants, the effectiveness of these custom arrays was demonstrated by interrogating gene expression differences in highly defined lung subcompartments. Relative gene expression levels were measured in rat airways and parenchyma, and the data obtained from these arrays were validated by comparison to selective gene or protein expression reported in the literature. Fluorescently labeled cDNA probes were generated from lung subcompartments of five rats and compared with each other on duplicate arrays (dye-swap hybridizations). Data obtained from these 10 hybridizations were normalized within individual slides (channel to channel) and across multiple slides based on ß-actin levels as described in MATERIALS AND METHODS.

Airway and parenchymal epithelial cells of the lung serve very distinct functions. Consistent with the different functions, several thousand genes (a total of 25.6% of the transcripts represented on this array) were expressed at significantly different levels (P <= 0.05) in the airway compared with the parenchyma (Figure 2). A total of 492 genes showed significantly higher levels of expression in the airway, whereas 1,782 showed significantly higher levels of expression in the parenchyma. Of the novel transcripts included on this microarray, 1,287 (> 95%) were detectable on the microarray, showing expression levels above 2x negative controls. Of these, 283 (~ 22%) showed significant differential expression (P <= 0.05) between airway and parenchymal subcompartments. More than 97% of all transcript signals measured from either lung subcompartment were above 2x negative controls (~ 2% actin level). Genes showing the greatest microarray expression differences between the airways and parenchyma are shown in Figure 3 and Figure 4 along with the copy numbers for these transcripts as assessed from the sequencing experiments. The gene identities and cellular function corresponding to those in Figures 3 and 4, where known, are listed in Tables 3 and 4, respectively. The percentage of these genes falling into any of the ascribed functional classification groups are summarized in Table 5.



View larger version (37K):
[in this window]
[in a new window]
 
Figure 2. Scatter plot of mRNA expression data obtained from airway versus parenchyma hybridizations on the rat lung 20K microarray. RNA was collected from individual rat lung airways and parenchyma, and probes fluorescently labeled. Duplicate arrays (dye-swaps) were analyzed for five rats (10 hybridizations), and data normalized based on ß-actin (see MATERIALS AND METHODS) within individual arrays and across the ten arrays. Data represented are the means (n = 5) from individual rats and are shown as expression levels relative to ß-actin (% actin level). Data on both axes are plotted on a log10 scale. Highlighted data (green and red) show significant differences (P <= 0.05) between expression levels in airways compared with parenchyma.

 


View larger version (46K):
[in this window]
[in a new window]
 
Figure 3. Airway transcripts (50) showing the highest fold expression differences by microarray between the airway and parenchyma subcompartments. All microarray data presented are statistically significant (P <= 0.05). The bar graph shows the normalized expression levels in the airways and parenchyma relative to ß-actin (% actin level). All data were normalized to ß-actin between subcompartments and across multiple slides. Data are the means ± SD (n = 5) from individual rats as described in MATERIALS AND METHODS. In tabular format are the mean fold differences between airways and parenchyma obtained from microarray expression data as well as the number of times the transcript was sequenced from either the Control Airway (CA) or Control Parenchyma (CP) cDNA libraries. Gene identities and their associated functional categories are listed in Table 3.

 

View this table:
[in this window]
[in a new window]
 
TABLE 3 Gene identities for the 50 highest fold differentially expressed transcripts in the airways corresponding to Figure 3

 


View larger version (50K):
[in this window]
[in a new window]
 
Figure 4. Parenchymal transcripts (50) showing the highest fold expression differences by microarray between the airway and parenchyma subcompartments. All microarray data presented are statistically significant (P <= 0.05). The bar graph shows the normalized expression levels in the airways and parenchyma relative to ß-actin (% actin level). All data were normalized to ß-actin between subcompartments and across multiple slides. Data are the means ± SD (n = 5) from individual rats as described in MATERIALS AND METHODS. Mean fold differences between airways and parenchyma obtained from microarray expression data as well as the number of times the transcript was sequenced from either the Control Airway (CA) or Control Parenchyma (CP) cDNA libraries. Gene identities and their associated functional categories are listed in Table 4.

 

View this table:
[in this window]
[in a new window]
 
TABLE 4 Gene identities for the 50 highest fold differentially expressed transcripts in the parenchyma corresponding to Figure 4

 

View this table:
[in this window]
[in a new window]
 
TABLE 5 Summary of functional classifications for the 50 highest fold differentially expressed transcripts in the airways and parenchyma

 
As would be expected, a number of marker genes show significant differential expression between these lung subcompartments. For example, airway nonciliated bronchiolar epithelial (Clara) cells produce a variety of secretory proteins including Clara cell secretory protein (Ugb) and the secretoglobins (Scgb3a1 and Scgb3a2). All of these genes are expressed at substantially higher levels in the airway compared with parenchyma (7.4, 23.6, and 8.2 fold, respectively). Likewise, the surfactant pulmonary-associated proteins (Sftpc and Sftpa1) are products responsible for stabilizing the alveolar airspace. These transcripts were expressed at much lower levels in the airway than in the parenchyma (-3.2 and -2.3 fold, respectively).

The availability of custom microarrays generated from extensive sequencing of airway and parenchymal cDNA libraries allowed a comprehensive comparison of the relative expression levels of genes in both lung compartments. Accordingly, transcript expression levels were sorted by signal intensities to determine the relative transcript abundance in airway and parenchyma (Figure 5). Three distinct populations were present in either subcompartment which were categorized as "low abundance" (< 10% actin level), "moderate abundance" (10–99% actin level), and "high abundance" (>= 100% actin level). Although relative expression levels were similar between subcompartments, the airways appear to have slightly more low abundance mRNA transcripts and slightly less moderately abundant transcripts than the parenchyma (Figure 5A). Thirty-eight percent of airway transcripts were expressed at low abundance, whereas slightly fewer (33%) parenchymal transcripts were expressed at that level. Fifty-five percent of airway transcripts were expressed at moderate abundance levels, whereas slightly more (60%) were expressed at those levels in the parenchyma. The number of highly expressed transcripts (expressed at or above 100% actin level) was similar between the airways (7.1%) and the parenchyma (6.8%). When comparing relative abundance between transcripts that were identified by BLAST or found to be novel, it was clear that more novel genes were expressed at low levels in both the airways and parenchyma (Figures 5B and 5C). These data are presented as percentages of either the novel or identified transcript populations to facilitate comparisons. Over half of the genes found to be novel were expressed at low levels (55% in airways; 51% in parenchyma) compared with genes that had already been identified in public databases (34% in airways; 29% in parenchyma).



View larger version (26K):
[in this window]
[in a new window]
 
Figure 5. Distribution of transcript mRNA expression levels (expressed as % actin level) from rat airways or parenchyma obtained on the rat lung 20K microarray. Distribution bins on the x-axis are plotted on a log10 scale. Data presented in A show the relative abundance distributions for nonredundant transcripts (7,460) in either the airway (Air) or parenchyma (Par). The remaining data (B and C) show the relative abundance distributions for transcripts which were either identified (Id) or novel (Nov) based on BLAST e-values (see Figure 1A). Data in B (showing airways) and C (showing parenchyma) are presented as the percent of transcripts in the identified transcript population (6,130 transcripts) or the novel transcript population (1,330 transcripts) to facilitate the comparison of abundance classes.

 
The five most highly expressed transcripts in the airway were hyaluronan-mediated motility receptor (2,084% actin level), a transcript encoding HDCMA18P protein (1,893% actin level), Clara cell secretory protein (1,635% actin level), a transcript encoded by the mitochondrial genome (1,593% actin level), and a transcript encoding hypothetical protein MGC14961 (1,584% actin level). In the parenchyma, the most highly expressed transcripts included a novel sequence (2,642% actin level), lysozyme (1,761% actin level), leucine-rich repeat LGI family, member 4 (1,725% actin level), a RIKEN cDNA 4833421E05 (1,674% actin level), and surfactant-associated protein C (1,547% actin level).


    Discussion
 Top
 Abstract
 Introduction
 Materials and Methods
 Results
 Discussion
 References
 
The lung is an important target for a number of cytotoxic and carcinogenic chemicals that reach the respiratory system via both inhalation and parenteral routes of exposure. The overall goal of the research program outlined in this article, and to be presented in subsequent articles, is to define the response of target lung cells to cytotoxicant exposure during phases of acute injury and repair using global gene expression analysis. Although microarrays have been used to evaluate changes in gene expression during lung development (10) and to define alterations in expression profiles associated with fibrosis (8), this approach has not been widely used to understand changes that occur in the lung in response to cytotoxic agents.

One of the challenges in elucidating the response of the lung to external stress is related to the fact that the respiratory system is composed of over 40 different cell types with highly specialized functions specific to the unique physiology and biochemistry of this tissue. Accordingly, the methods used to sample the lung for delineating alterations in homeostasis associated with toxic exposures must differentiate between target and nontarget areas of this organ system. Although earlier studies have been able to demonstrate temporal relationships in gene transcript levels for extracellular proteins during development, the use of whole lung homogenates in the studies may not accurately reflect important changes occurring in specialized sites within the lung (10). Significant progress in the preservation and isolation of mRNA from subcompartments of the lung has been reported (17).

Likewise, the specialized functions of the respiratory system suggest that use of commercially available arrays may not provide adequate coverage of a number of genes specifically linked to the unique physiology of this tissue. This point is fully supported by the data shown in Figure 1B indicating that nearly 30% of the elements present on the custom arrays described in this study were not represented in the most recent version of the rat gene array (Set 230) from Affymetrix. Similarly, when compared with the public databases, nearly 20% of the genes sequenced in this study did not align to sequences from rat, mouse, or human sequence databases at the time these studies were conducted (Figure 1A, Table 2). Eight of these novel genes (out of the 100 represented in Figures 3 and 4) were highly differentially expressed in airway or parenchyma, suggesting that their gene products may have very specific roles in the maintenance of homeostasis in these tissue subcompartments.

Although there have been large sequencing efforts for human and mouse genetic material, data from rat is less abundant. For example, the RefSeq database for rat contains only 4,023 fully annotated sequences compared with databases for mouse and human (which contain 12,683 and 18,232 sequences, respectively). This work documents 7,460 nonredundant cDNA clones identified after the analysis of 22,940 high quality sequences from rat airway and parenchyma subcompartments. The number of airway sequences which may have been overlooked by sequencing a whole lung cDNA library is estimated to exceed 650 airway transcripts (close to 10% of this cDNA clone set) unless the whole lung library had been sequenced in far greater depth than has been achieved so far. From the sequences analyzed, a set consisting of 8,789 sequences was contributed to the rat sequencing effort (GenBank accession nos.: CF106750, CF115538).

In addition to the uncharacterized sequences obtained during the course of this work, the development of spotting material for use in glass slide arrays offers the ability to generate arrays at a cost that allows studies to be conducted in several animals at each time point and at multiple doses. In a number of published studies (10), the significant cost of commercial chips resulted in the use of pooled samples. Although there is ample experimental data to show that the chips properly interrogate the levels of expression in that pool, this approach precludes any assessment of the variance between animals. Statistically valid conclusions on alterations in gene expression in response to stressors can only be derived from data sets that contain multiple animals (or multiple pools of animals) measured individually.

The sequencing project and microarray studies presented here generated two unique data sets representing differential expression of transcripts from the airways and parenchyma of rat lung. Although for many of the higher abundance transcripts there was a good correlation between the two data sets, for much of the data, it was difficult to make comparisons between the expression levels on the arrays with the number of transcripts sequenced. Better comparisons would have required a much larger sequencing data set, such as those for SAGE analysis, which can involve as many as 300,000 sequence tags (18).

Although all of the microarray work described in this article included the use of dye reversal hybridizations, the data show that when using the indirect labeling procedure, use of a single dye provides good quantitative measurements. Ratio data generated from airway-Cy3 versus parenchyma-Cy5 and airway-Cy5 versus parenchyma-Cy3 hybridizations were within 30% of each other for over 80% of the transcripts, and the majority were within 10%. As would be expected, more variability was noted with genes which were not highly expressed. Significant differences in signal ratios (> 2-fold difference) were observed in < 2% of the data.

One of the possible disadvantages associated with the development of custom arrays that use spotted PCR products is related to cross-hybridization for genes with closely related sequences. Although in the vast majority of cases the arrays produced in this study appeared to properly interrogate differences in expression between the airway and parenchymal subcompartments, some transcripts, such as those coding for some of the cytochrome P450 monooxygenases, were expected at much higher levels in airway compared with parenchyma. For example, a transcript expected to be present at significantly higher levels in the airway compared with parenchyma was CYP2F4 (unpublished observations, Baldwin and colleagues). Significant differences in expression between the airway and parenchyma were not observed for any of the CYP genes, including CYP2F4. Although many of the CYPs are highly abundant in airway Clara cells, several have also been demonstrated in alveolar type II cells, macrophages, and endothelial cells of the parenchyma (19). Lack of differential expression between the subcompartments may be due to more equal distribution of these CYPs than previously expected. However, some measure of cross-hybridization between the different CYP homologs is likely because they belong to a highly homologous family of enzymes. Some level of cross-hybridization may have decreased the ability of the current arrays to discriminate among the various CYPs. These findings underscore the importance of confirming differentially expressed genes by secondary techniques, especially when using cDNA microarrays, to avoid any artifacts introduced by such possible cross-hybridization effects.

Spot intensities were used to compare relative abundance of transcripts from the airway and parenchymal subcompartments. Although mRNA populations differ between cell types, those from typical mammalian cells are categorized into three distinct abundance classes (20). Similarly, transcripts in the lung appear to fall into three fairly distinct abundance classes, but with variations between airway and parenchyma in the "low" and "moderate" abundance classes. Roughly 7% of all transcripts measured had abundance levels at or above 100% actin level and thus would be characterized as "highly abundant." In the "typical" mammalian cell, < 1% of all transcripts fall into this highly abundant category. The data presented in Figure 5 indicate that cells in the lung produce more highly abundant transcripts than observed in other tissues. This is not surprising considering the diversity of cell types in the lung, each with highly specialized functions.

It is important to note that the measurements of abundance levels presented here are meant only as observations of relative abundance. The accuracy of using fluorescence data from cDNA microarrays such as those used in these studies as a measure of abundance is limited and should be viewed with caution. Variation in spotting material concentration, variation in length of spotted PCR products, and differential incorporation of fluorescent molecules (Cy3 or Cy5) into labeled cDNA probes are a few of the factors that may potentially lead to artifacts in these measurements. There are several pieces of data that provide confidence that the relative abundance data presented contains minimal artifacts introduced as a result of low concentrations of spotting material. These include the following: (i) over 80% of microarray spots were deposited at sufficient concentrations (> 30 ng/µl) to measure highly abundant transcripts; (ii) when comparing only those data obtained from PCR products at concentrations > 30 ng/µl to the complete data set, expression abundance patterns (low:moderate:high abundance classes) are not altered significantly; (iii) control data indicate that transcripts expressed at 10% actin level should be measured accurately at spotting concentrations as low as 5 ng/µl; and (iv) over 98% of the microarray spots were deposited at or above 5 ng/µl, and this concentration should be sufficient for over 90% of transcripts (low and moderate abundance). A further issue relates to the fact that the lengths of PCR products spotted varied thus potentially affecting probe-target binding dynamics. These factors are less of a concern when using the short oligonucleotides (50- to 70-mers) available commercially for spotting glass slide arrays. Partial sets have become available for the rat within the past months.

In addition, differential incorporation of fluorescent molecules during probe labeling may introduce bias that could affect the measurement of abundance. The microarray work described here included the use of dye reversal hybridizations to minimize dye-biased data. As discussed earlier, using an indirect labeling method provides relatively equal incorporation of dye to the cDNA probe regardless of the dye incorporated (Cy3 or Cy5), and therefore, we feel confident that dye-biased effects were minimized.

One of the primary goals of the work described here has been to validate the use of this array for interrogating changes in gene expression in response to pulmonary toxicants. Thus, the relative expression levels for genes in the airway versus parenchyma should correspond with reports from in situ hybridization and immunocytochemistry studies on the localization of various genes and gene products in the lung. Transcript populations that are expressed primarily in either parenchyma or airway should show deviations from transcripts that are expressed at relatively equal levels in either lung subcompartment (such as housekeeping genes). The data presented in Figure 2 clearly show three distinct populations that fit this paradigm.

Among the most selectively expressed transcripts in the airways were Clara cell secretory protein (Ugb) and the secretoglobins (Scgb3a1 and Scgb3a2). These transcripts are expressed primarily by the Clara cells of the airway epithelium (21). Other transcripts known to be expressed in the airways included DNCL2B which is likely from ciliated cells (22), Plunc from the trachea and bronchi (23), and Dcn which has been shown in the extracellular matrix of the airway (24).

Similarly in the parenchyma there was selective expression of the pulmonary surfactant proteins (Sftpc and Sftpa1) which are known to be expressed highly by alveolar epithelial Type II cells (21). Other genes which were more highly expressed in parenchyma than in airways included Cldn5, Anxa8, Ca2, Lyz, and Kdap, and again, these have been shown to be expressed in alveolar epithelial Type II cells (21, 2528). In addition, Ca4, Ace, and Defb2 are expressed in alveolar macrophages or capillary endothelial cells (2931). Substantial differences in the expression of several hundred genes between conducting airway and alveolar lung regions supports the contention that proper understanding of toxicant-induced changes in gene expression requires the use of methods capable of selecting target and nontarget areas of the lung as well as analysis with a lung-derived microarray.

Other transcripts which have been reported in whole lung but whose subcompartment localization is unknown included MLL3, S100C, Hpgd, Men1, and Egfl6 (3236). The current studies indicate that MLL3 and S100C are more highly expressed in the airways than the parenchyma while Hpgd, Men1, and Egfl6 are more highly expressed in the parenchyma (Tables 3 and 4). The genes listed over the last few paragraphs only comprise 20% of the top 100 most differentially expressed genes between these subcompartments. The remaining 80% of transcripts were either previously uncharacterized in the lung or of unknown biological function in any tissue. Although many of the sequences have been reported recently from the rat genome sequencing project, this study shows that these genes are expressed in either airway or parenchyma. Interestingly, many of the genes listed in Tables 3 and 4 involved in intermediary metabolism or in cell/organism defense and homeostasis were expressed more highly in the parenchyma compared with the airways (Table 5). This is consistent with the nature of this region of the lung: (i) it requires specific conditions conducive for efficient gas exchange, and (ii) cells are in a very oxidative environment requiring vigorous intermediary metabolism. Conversely, in the airways, a higher percentage of genes could be classified in cell structure/motility and this is consistent with the mucociliary function of the cells lining the airways.

Of the top 100 mRNA transcripts showing the highest fold expression differences between the two lung subcompartments, over 50% had no known biological function. This indicates that many uncharacterized genes identified through the approach described here may contribute significantly to the specialized functions of the lung. Over 40% of the transcripts showing significant expression differences between the airways and parenchyma had no known biological function associated with them. It was also interesting to note that the parenchymal subcompartment showed over three times the number of significantly different transcripts than did the airways, implying that a greater heterogeneity of the mRNA transcripts may be required for homeostasis in the parenchyma. These findings underscore the need to establish fundamental baseline knowledge of the lung such as that provided by this gene expression data set to aid in the functional characterization of specific regions of the lung. Accordingly, the gene expression data generated from these studies will be available for other investigators to query through NCBI's Gene Expression Omnibus public gene expression and hybridization array data repository (http://www.ncbi.nlm.nih.gov/geo/) and through our web site (http://faculty.vetmed.ucdavis.edu/faculty/arbuckpitt/BuckpittLab).

In summary, this sequencing effort has allowed for the construction of a focused microarray containing 7,460 nonredundant transcripts derived from pulmonary tissue and supplies valuable baseline information of transcripts represented in the respiratory system. Nearly one-third of the elements present on this custom microarray are not currently represented on Affymetrix rat arrays; thus, this microarray represents a valuable and unique resource for delineating gene expression differences between lung subcompartments and in response to toxicants. The microarray data presented here validates the capability of this array to delineate gene expression differences in lung tissues and provides prima facie gene expression data from different lung subcompartments. We have used these arrays to monitor gene expression changes in target and nontarget areas of the lung in an effort to gain insight into some of the fundamental responses of pulmonary epithelial cells during injury and repair from a metabolically activated cytotoxicant. Those studies will be presented in subsequent papers. Future studies will be directed at identifying the functions of the gene products for those genes whose expression levels are altered by toxicant exposure.

Note from Authors: The field of genomics and microarrays is rapidly and continuously evolving. This has made it difficult to present these sequencing data in a fashion that is as "up-to-date" as possible while trying to keep pace with the changes in publicly available sequence data. In the time the data for this manuscript were acquired, analyzed, and submitted for publication, the Rat Genome Project has made significant progress in its sequencing of the rat genome. As of June 30, 2003, 90% of the rat genome (v.3.1) has become available. A recent comparison of the rat lung sequences to the updated rat genome shows that roughly 90% of the "novel" sequences presented here have now been identified by that project. Many of these sequences, though, remain uncharacterized and their biological functions are still unknown. When the 7,460 lung sequences were compared with the latest annotated sequence databases (RefSeq, July 2003) for rat, mouse, and human (4,549, 16,117, and 20,365 sequences, respectively), the number of sequences in this rat lung data set with known biological function had only increased from ~ 46% with earlier databases to 51% with the latest databases.

Although the new rat genome sequences help confirm the quality of the lung sequences generated through this approach, the authors have kept the identity nomenclature from the original BLAST comparisons with the databases described in MATERIALS AND METHODS solely to emphasize the potential of this approach in identifying unique tissue-specific sequences from less-sequenced species. Although the rat genome will be fully sequenced in the near future, sequence information for many other species remains incomplete, and application of the approach described in this article would be highly useful. Regardless of the sequence nomenclature used, the information presented in this article provides prima facie evidence for the expression of 7,460 genes in respiratory tissue.

Furthermore, this does not affect the validity of the microarray developed using this approach or the subsequent gene expression data generated. The comparison made to the Affymetrix Rat GeneChip (Set 230, April 2003) is as up to date as possible (no new releases have been made available), and the noted differences undoubtedly reflect the lag time required to "mine" sequence data and develop the necessary oligonucleotide sets for inclusion on such microarrays. The approach described in this manuscript has allowed for the completion of the toxicology studies initially proposed before the availability of a suitable commercially available rat array containing many of these lung-specific genes.


    Acknowledgments
 
This study was supported by National Institutes of Environmental Health Sciences ES 09681, ES 04699, and Amersham Biosciences. UC Davis is a Center for Environmental Health Sciences and support for core facilities is acknowledged. M.A.S. was supported during the preparation of this manuscript by T32HL07013. The authors acknowledge the research team at Aeomica for their support provided for the sequencing project. They also gratefully acknowledge Dr. David Rank and Dr. Russell Thomas for informatics advice and David Jenkins for support in the construction of the microarray.

Received in original form June 10, 2003

Received in final form August 13, 2003


    References
 Top
 Abstract
 Introduction
 Materials and Methods
 Results
 Discussion
 References
 

  1. Pinkerton, K. E., R. R. Mercer, C. G. Plopper, and J. D. Crapo. 1992. Distribution of injury and microdosimetry of ozone in the ventilatory unit of the rat. J. Appl. Physiol. 73:817–824.[Abstract/Free Full Text]
  2. Smith, L. L. 1982. Young Scientists Award lecture 1981: The identification of an accumulation system for diamines and polyamines into the lung and its relevance to paraquat toxicity. Arch. Toxicol. Suppl. 5:1–14.[Medline]
  3. Gram, T. 1997. Chemically reactive intermediates and pulmonary xenobiotic toxicity. Pharmacol. Rev. 49:297–341.[Abstract/Free Full Text]
  4. Gupta, P., W. Harger, and J. Arey. 1996. The contribution of nitro and methylnitro-naphthalenes to the vapor phase mutagenicity of ambient air samples. Atmos. Environ. 30:3157–3160.[CrossRef]
  5. Paige, R., V. Wong, and C. G. Plopper. 1997. Dose-related airway specific epithelial toxicity of 1-nitronaphthalene in rats. Toxicol. Appl. Pharmacol. 147:224–233.[CrossRef][Medline]
  6. Verschoyle, R. D., P. Carthew, C. R. Wolf, and D. Dinsdale. 1993. 1-Nitronaphthalene toxicity in rat lung and liver: effects of inhibiting and inducing cytochrome P450 activity. Toxicol. Appl. Pharmacol. 122:208–213.[CrossRef][Medline]
  7. Watt, K., D. Morin, M. J. Kurth, R. S. Mercer, C. G. Plopper, and A. Buckpitt. 1999. Glutathione conjugation of electrophilic metabolites of 1-nitronaphthalene in rat tracheobronchial airways and liver: identification by mass spectrometry and proton nuclear magnetic resonance spectroscopy. Chem. Res. Toxicol. 12:831–839.[CrossRef][Medline]
  8. Kaminski, N., J. D. Allard, J. F. Pittet, F. Zuo, M. J. Griffiths, D. Morris, X. Huang, D. Sheppard, and R. A. Heller. 2000. Global analysis of gene expression in pulmonary fibrosis reveals distinct programs regulating lung inflammation and fibrosis. Proc. Natl. Acad. Sci. USA 97:1778–1783.[Abstract/Free Full Text]
  9. Guo, Q. M. 2003. DNA microarray and cancer. Curr. Opin. Oncol. 15:36–43.[CrossRef][Medline]
  10. Mariani, T. J., J. J. Reed, and S. D. Shapiro. 2002. Expression profiling of the developing mouse lung: insights into the establishment of the extracellular matrix. Am. J. Respir. Cell Mol. Biol. 26:541–548.[Abstract/Free Full Text]
  11. Hamadeh, H. K., P. R. Bushel, S. Jayadev, K. Martin, O. DiSorbo, S. Sieber, L. Bennett, R. Tennant, R. Stoll, J. C. Barrett, K. Blanchard, R. S. Paules, and C. A. Afshari. 2002. Gene expression analysis reveals chemical-specific profiles. Toxicol. Sci. 67:219–231.[Abstract/Free Full Text]
  12. Waring, J. F., R. A. Jolly, R. Ciurlionis, P. Y. Lum, J. T. Praestgaard, D. C. Morfitt, B. Buratto, C. Roberts, E. Schadt, and R. G. Ulrich. 2001. Clustering of hepatotoxins based on mechanism of toxicity using gene expression profiles. Toxicol. Appl. Pharmacol. 175:28–42.[CrossRef][Medline]
  13. Yoneda, K., K. Peck, M. Chang, K. Chmiel, Y. P. Sher, J. Chen, P. C. Yang, Y. Chen, and R. Wu. 2001. Development of high-density DNA microarray membrane for profiling smoke- and hydrogen peroxide-induced genes in a human bronchial epithelial cell line. Am. J. Respir. Crit. Care Med. 164:S85–S89.[Abstract/Free Full Text]
  14. Martinez, J. M., C. A. Afshari, P. R. Bushel, A. Masuda, T. Takahashi, and N. J. Walker. 2002. Differential toxicogenomic responses to 2,3,7,8-tetrachlorodibenzo-p-dioxin in malignant and nonmalignant human airway epithelial cells. Toxicol. Sci. 69:409–423.[Abstract/Free Full Text]
  15. McDowell, S. A., K. Gammon, B. Zingarelli, C. J. Bachurski, B. J. Aronow, D. R. Prows, and G. D. Leikauf. 2003. Inhibition of nitric oxide restores surfactant gene expression following nickel-induced acute lung injury. Am. J. Respir. Cell Mol. Biol. 28:188–198.[Abstract/Free Full Text]
  16. Plopper, C., A. Chang, A. Pang, and A. Buckpitt. 1991. Use of microdissected airways to better define metabolism and cytotoxicity in murine bronchiolar epithelium. Exp. Lung Res. 17:181–196.[Medline]
  17. Baker, G. L., M. A. Shultz, M. V. Fanucchi, D. Morin, A. R. Buckpitt, and C. G. Plopper. 2003. Lung subcompartment gene expression utilizing in situ RNA preservation and microdissection. Toxicol. Sci. (In press).
  18. Zhang, L., Z. Wei, V. E. Velculescu, S. E. Kern, R. H. Hruban, S. R. Hamilton, B. Vogelsetin, and K. W. Kinzler. 1997. Gene expression profiles in normal and cancer cells. Science 276:1268–1272.[Abstract/Free Full Text]
  19. Buckpitt, A., and M. Cruikshank. 1997. Biochemical function of the lungs: metabolism of xenobiotic agents. In Toxicology of the Respiratory System. R. Roth, editor, Elsevier, New York. 159–186.
  20. Alberts, B., D. Bray, J. Lewis, M. Raff, K. Roberts, and J. D. Watson, eds. 1994. Molecular Biology of the Cell. Garland, New York.
  21. Singh, G., and S. L. Katyal. 1992. Secretory proteins of Clara cells and Type II cells. In Comparative Biology of the Normal Lung. Volume 1. Treatise on Pulmonary Toxicology. R. A. Parent, editor. CRC Press, Boca Raton. 93–108.
  22. Woolley, D. 2000. The molecular motors of cilia and eukaryotic flagella. Essays Biochem. 35:103–115.[Medline]
  23. Weston, W. M., E. E. LeClair, W. Trzyna, K. M. McHugh, P. Nugent, C. M. Lafferty, L. Ma, R. S. Tuan, and R. M. Greene. 1999. Differential display identification of plunk, a novel gene expressed in embryonic palate, nasal epithelium, and adult lung. J. Biol. Chem. 274:13698–13703.[Abstract/Free Full Text]
  24. Wang, Y., K. Sakamoto, J. Khosla, and P. L. Sannes. 2002. Detection of chrondroitin sulfates and decorin in developing fetal and neonatal rat lung. Am. J. Physiol. Lung Cell. Mol. Physiol. 282:L484–L490.[Abstract/Free Full Text]
  25. Wang, F., B. Daugherty, L. L. Keise, Z. Wei, J. P. Foley, R. C. Savani, and M. Koval. 2003. Heterogeneity of claudin expression by alveolar epithelial cells. Am. J. Respir. Cell Mol. Biol. 29:62–70.[Abstract/Free Full Text]
  26. Liu, L., J. Q. Tao, H. L. Li, and U. J. Zimmerman. 1997. Inhibition of lung surfactant secretion from alveolar type II cells and annexin II tetramer-mediated membrane fusion by phenothiazines. Arch. Biochem. Biophys. 342:322–328.[CrossRef][Medline]
  27. Fleming, R. E., M. A. Moxley, A. Waheed, E. C. Crouch, W. S. Sly, and W. J. Longmore. 1994. Carbonic anhydrase II expression in rat type II pneumocytes. Am. J. Respir. Cell Mol. Biol. 10:499–505.[Abstract]
  28. Mori, K., Y. Kon, A. Konno, and T. Iwanaga. 2001. Cellular distribution of napsin (kidney-derived aspartic protease-like protein, KAP) mRNA in the kidney, lung and lymphatic organs of adult and developing mice. Arch. Histol. Cytol. 64:319–327.[CrossRef][Medline]
  29. Fleming, R. E., E. C. Crouch, C. A. Ruzicka, and W. S. Sly. 1993. Pulmonary carbonic anhydrase IV: developmental regulation and cell-specific expression in the capillary endothelium. Am. J. Physiol. 265:L627–L635.
  30. Mendelsohn, F. A., V. Pupic, B. Jackson, R. Cubela, and C. I. Johnston. 1991. Acute and chronic effects of perindopril on tissue angiotensin converting enzyme activity. Am. J. Hypertens. 4:220S–225S.[Medline]
  31. Ryan, L. K., J. Rhodes, M. Bhat, and G. Diamond. 1998. Expression of beta-defensin genes in bovine alveolar macrophages. Infect. Immun. 66:878–881.[Abstract/Free Full Text]
  32. Ruault, M., M. E. Brun, M. Ventura, G. Roizes, and A. De Sario. 2002. MLL3, a new human member of the TRX/MLL gene family, maps to 7q26, a chromosome region frequently deleted in myeloid leukaemia. Gene 284:73–81.[CrossRef][Medline]
  33. Ohta, H., T. Sasaki, M. Naka, O. Hiraoka, C. Miyamoto, Y. Furuichi, and T. Tanaka. 1991. Molecular cloning and expression of the cDNA coding for a new member of the S100 protein family from porcine cardiac muscle. FEBS Lett. 295:93–96.[CrossRef][Medline]
  34. Matsuo, M., C. M. Ensor, and H. H. Tai. 1996. Cloning and expression of the cDNA for mouse NAD(+)-dependent 15-hydroxyprostaglandin dehydrogenase. Biochim. Biophys. Acta 1309:21–24.[Medline]
  35. Walch, A. K., H. F. Zitzelsberger, M. M. Aubele, A. E. Mattis, M. Bauchinger, S. Candidus, H. W. Prauer, M. Werner, and H. Hofler. 1998. Typical and atypical carcinoid tumors of the lung are characterized by 11q deletions as detected by comparative genomic hybridization. Am. J. Pathol. 153:1089–1098.[Abstract/Free Full Text]
  36. Yeung, G., J. J. Mulero, R. P. Berntsen, D. B. Loeb, R. Drmana, and J. E. Ford. 1999. Cloning of a novel epidermal growth factor repeat containing gene EGFL6: expressed in tumor and fetal tissues. Genomics 62:304–307.[CrossRef][Medline]
  37. Adams, M. D., A. R. Kerlavage, R. D. Fleischmann, R. A. Fuldner, C. J. Bult, N. H. Lee, E. F. Kirkness, K. G. Weinstock, J. D. Gocayne, O. White, G. Sutton, J. A. Blake, R. C. Brandon, M. W. Chiu, R. A. Clayton, R. T. Cline, M. D. Cotton, J. Earle-Hughes, L. D. Fine, L. M. FitzGerald, W. M. FitzHugh, J. L. Fritchman, N. S. M. Geoghagen, A. Glodek, C. L. Gnehm, M. C. Hanna, E. Hedblom, P. S. Hinkle Jr., J. M. Kelley, K. M. Klimek, J. C. Kelley, L.-I. Liu, S. M. Marmaros, J. M. Merrick, R. F. Moreno-Palanques, L. A. McDonald, D. T. Nguyen, S. M. Pellegrino, C. A. Phillips, S. E. Ryder, J. L. Scott, D. M. Saudek, R. Shirley, K. V. Small, T. A. Spriggs, T. R. Utterback, J. F. Weidman, Y. Li, R. Barthlow, D. P. Bednarik, L. Cao, M. A. Cepeda, T. A. Coleman, E.-J. Collins, D. Dimke, P. Feng, A. Ferrie, C. Fischer, G. A. Hastings, W.-W. He, J.-S. Hu, K. A. Huddleston, J. M. Greene, J. Gruber, P. Hudson, A. Kim, D. L. Kozak, C. Kunsch, H. Ji, H. Li, P. S. Meissner, H. Olsen, L. Raymond, Y.-F. Wei, J. Wing, C. Xu, G.-L. Yu, S. M. Ruben, P. J. Dillon, M. R. Fannon, C. A. Rosen, W. A. Haseltine, C. Fields, C. M. Fraser, and J. C. Venter. 1995. Initial assessment of human gene diversity and expression patterns based upon 83 million nucleotides of cDNA sequences. Nature 377:3–17.[Medline]




This Article
Right arrow Abstract Freely available
Right arrow Full Text (PDF)