| |
Abstract |
|---|
|
|
|---|
Lung surfactant protein (SP)-D belongs to the family of soluble collagenous C-type lectins, named collectins. SP-D participates in the local innate immune defense of the lung, eliciting various effector functions by acting as a pattern recognition receptor for the carbohydrate structures on inhaled microorganisms and particulate matter. This work describes the isolation and characterization of the mouse SP-D gene (Sftpd), which spans 8 exons over 14 kb of sequence and shows an overall organization similar to other collectin genes. The complete 5' untranslated region of the messenger RNA, absent from the published complementary DNA for mouse SP-D, was also cloned and is shown to be encoded by a single exon. Analysis of 3.5 kb of 5' flanking nucleotide sequence for Sftpd is described and reveals positional conservation of a number of transcription factor binding sites on comparison of Sftpd with the human SP-D gene and the bovine conglutinin gene. In addition, a single copy SP-D-like gene has been shown to be present in mammals, birds, and amphibians but is absent in fish. An atypical, rodent-specific, long terminal repeat of retroviral origin containing a minisatellite that has become inserted in Sftpd is described. Three new polymorphic microsatellites are also described, one of which is just 160 base pairs upstream of Sftpd. This microsatellite was used to map the gene to the central region of chromosome 14; fine-scale mapping indicates that it lies in a 5.64-centimorgan area between D14Mit45 and D14Mit60. This will allow the easy identification of the collectin gene cluster and aid in the construction of a physical map over this region.
| |
Introduction |
|---|
|
|
|---|
Pulmonary surfactant is a complex mixture of lipids and proteins important for both normal respiratory function (1) and immune defense in the lung (2). Two of the surfactant proteins (SPs), SP-A and SP-D, belong to a subgroup of mammalian lectins termed collectins (3). The collectins have collagenous stalks and carboxyl-terminal, calcium-dependent lectin domains, with an overall quaternary structure similar to the first component of complement C1q (3), and are implicated as pattern recognition receptors for non-self carbohydrate structures in innate immunity (3, 4). Other members of the collectins include the serum protein mannan-binding lectin (MBL), which activates complement after binding to carbohydrate structures in a manner analogous to C1q binding to immune complexes (5). Bovidae have two additional serum collectins: conglutinin and CL-43 (3).
Evidence for the role of SP-D as a pattern recognition receptor within the innate immune system stems from both functional and expression data. The functions of SP-D have been investigated at three levels, with respect to the interaction between effector cells and pathogens. First, SP-D binds to carbohydrate structures on pathogens, such as influenza A virus, Escherichia coli (via lipopolysaccharides), and Cryptococcus neoformans. Second, SP-D binds two different cell types: leukocytes and alveolar macrophages (6, 7). Third, SP-D can elicit cellular responses, such as acting as an opsonin in the phagocytosis of gram-negative bacteria (8).
The expression pattern of SP-A and SP-D shows predominant localization to the lung; at the cellular level, the two are expressed by alveolar type II and Clara cells (9, 10). Their secretion is mediated via a pathway distinct from that of the other surfactant proteins and lipids (9). Unlike the other, more hydrophobic SPs (SP-B and SP-C), which are localized exclusively to the lung, SP-A and SP-D have been shown to have minor sites of synthesis in the gastrointestinal tract. The tissue, cellular, and subcellular localization and expression of SP-A and SP-D differ profoundly from those of the other surfactant-associated proteins, implying a role independent of surfactant function. Indeed, clear evidence for the role of SP-A has emerged from SP-A-deficient transgenic mice, which display normal pulmonary compliance but show enhanced susceptibility to infection (11).
Lung gene transcription has been well characterized for other lung-associated proteins (12). Initial analysis of the human SP-D promoter has demarcated regions for cell type-specific expression and hormonal regulation (13). Hormonal control regions do not correlate with the locations of typical hormonal response units, consistent with an indirect, late-acting hormonal regulation of SP-D gene transcription (14). Likewise, the promoter regions responsible for cell-specific expression do not account for the expression profile of SP-D.
The human gene arrangement of the collectins has been established by radiation hybrid mapping (15). The surfactant collectin genes (the two SP-A genes, one SP-A pseudogene, and the SP-D gene) form a tight cluster on chromosome 10 (10q22.2-q23.1). The SP-A gene cluster lies telomeric of the SP-D gene, whereas the MBL gene lies toward the centromere, at 10q21 (16). From genetic mapping data, a less-dispersed collectin locus probably exists in mice, consisting of the single copy SP-A and SP-D genes with one of the two MBL genes present in mice (MBL-A), lying in a 5-centimorgan (cM) region of chromosome 14, syntenic with the human collectin gene cluster (17).
This paper describes the characterization and partial sequencing of the mouse SP-D gene, and identification of the 5' extent of the mouse SP-D complementary DNA (cDNA). Furthermore, the mouse SP-D promoter region is analyzed in relation to the conservation seen between the human SP-D and the bovine conglutinin genes (CGN1s) to identify regions that may be important for transcriptional control, thus forming the basis for further investigation. Zooblot analysis has also been performed to indicate the emergence of SP-D. Also, three short sequence length polymorphisms have been described that place the collectin locus in the context of a linkage map as a simple and widely available marker, which will aid in physical mapping and thus allow detailed analysis of the collectin loci.
| |
Materials and Methods |
|---|
|
|
|---|
Generation of a Mouse SP-D cDNA Probe
The cDNA probe used to screen for the mouse SP-D gene was generated by reverse transcriptase-polymerase chain reaction (RT-PCR) on mouse lung RNA. First-strand cDNA was generated using Moloney murine leukemia virus RT (MLV-RT) (50 U; Promega, Southampton, UK) on 10 µg of total mouse BALB/c lung RNA (purified using RNAzolB; Ambion, Austin, TX) primed with a species-conserved oligonucleotide D4 (CCGGAATTCAAGATCTCCACACAGTCCTC). MLV-RT was heat-inactivated, and 1 µl of the single-stranded product (20 µl) was used as a template in five 50-µl standard PCRs (1.5 mM MgCl2, 25 mM KCl, 1 U Taq DNA polymerase, 200 µM of each deoxynucleotide triphosphate [dNTP], and 0.2 µM of each primer) with primer D4 and a species-conserved sense oligonucleotide D3 (CCGGAATTCCTGGAAGCAGAAATGAAGAC). The PCR program consisted of a 5-min denaturing step at 95°C followed by 35 cycles of 94°C for 45 s, 53°C for 1 min, and 72°C for 2 min. The 1.1-kb PCR product was subcloned into pBluescript (Stratagene Ltd., Cambridge, UK) and sequenced by fluorescent dye-terminator cycle sequencing using primer walking (20).
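As a quick check on the two primers quoted above, the following sketch computes their length and GC content and looks for a restriction site in the shared 5' extension. It is illustrative only: the function names are hypothetical, and the observation that the common CCGGAATTC prefix contains GAATTC (the EcoRI recognition sequence) is read off the printed sequences rather than stated in the methods.

```python
# Illustrative helpers (hypothetical names); primer sequences are copied from the text.
PRIMERS = {
    "D3": "CCGGAATTCCTGGAAGCAGAAATGAAGAC",  # sense primer
    "D4": "CCGGAATTCAAGATCTCCACACAGTCCTC",  # antisense primer, also used to prime the RT step
}
ECORI_SITE = "GAATTC"  # EcoRI recognition sequence


def gc_fraction(seq: str) -> float:
    """Fraction of G or C bases in a DNA sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def site_position(seq: str, site: str = ECORI_SITE) -> int:
    """0-based index of the first occurrence of `site`, or -1 if absent."""
    return seq.upper().find(site)


if __name__ == "__main__":
    for name, seq in PRIMERS.items():
        print(name, len(seq), f"GC={gc_fraction(seq):.2f}", "EcoRI site at", site_position(seq))
```

Both primers are 29 nucleotides long as printed, with the GAATTC hexamer starting at the fourth base.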
Library Screening
A custom-made 129/SV murine embryonic D3 stem cell genomic library constructed in λFIX-II (Stratagene; kindly donated by Prof. J. Heath, School of Biochemistry, University of Birmingham, Birmingham, UK) was screened with the 1.1-kb [α-32P]-labeled cDNA of mouse SP-D. From approximately 0.5 × 10⁶ plaques, seven positive clones were identified that hybridized to the probe under high-stringency conditions (0.2× saline sodium citrate [SSC] [150 mM NaCl and 15 mM Na3C6H5O7 · 2H2O, pH 7.0] and 0.1% [wt/vol] sodium dodecyl sulfate [SDS] at 68°C), with a 3-d exposure to X-ray film (X-OMAT AR; Kodak, Cambridge, UK). The clones were purified to homogeneity after rescreening three times.
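The wash stringency above is given as a multiple of SSC. Assuming that the bracketed composition (150 mM NaCl, 15 mM trisodium citrate) describes 1× SSC, as is conventional, the working salt concentrations at any dilution follow by simple scaling. The helper below is a hypothetical illustration of that arithmetic and is not part of the original protocol.

```python
# Assumption: the bracketed recipe in the text is the 1x SSC composition.
SSC_1X_MM = {"NaCl": 150.0, "trisodium citrate": 15.0}  # millimolar


def ssc_concentrations(fold: float) -> dict:
    """Return salt concentrations (mM) for a given SSC strength, e.g. 0.2 for 0.2x."""
    return {salt: conc * fold for salt, conc in SSC_1X_MM.items()}


if __name__ == "__main__":
    for fold in (0.2, 0.1):  # strengths used for the library screen and the zooblot wash
        print(f"{fold}x SSC:", ssc_concentrations(fold))
```

Under this assumption, the 0.2× wash corresponds to roughly 30 mM NaCl and 3 mM citrate.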
Characterization of Positive Clones by PCR
To determine which positive clones contained a full-length gene for mouse SP-D, primers were designed at the 5' and 3' ends of each exon based on the human intron-exon structure and the mouse SP-D cDNA (17, 21). PCR was performed on each phage clone using various primer pairs to ascertain which exons were represented. Four of the seven positive clones contained all of the translated exons. The phage clone containing all coding exons for mouse SP-D and the longest 5' region (λmSPD-II) was subcloned into pBluescript to produce the genomic subclone pBmSPD-II.
Restriction Mapping
Restriction digestion was performed on the λ clones and the plasmid subclone pBmSPD-II. The 3' end of the gene was further characterized by using a 10-kb NotI-EcoRI subclone of pBmSPD-II. Restriction fragments were separated on a 0.8% (wt/vol) agarose gel, Southern-blotted (22) onto nylon membranes (Hybond-N; Amersham Life Sciences, Little Chalfont, Buckinghamshire, UK), and [α-32P]-labeled exons (exons 2 and 8) or exon-intron-exon PCR products (exons 3-5 and exons 6-7, inclusive) were used as probes.
Sequencing of the Gene
Intron-exon boundaries were sequenced by primer walking using fluorescent dye-terminator cycle sequencing with AmpliTaq FS (ABI Prism; Perkin-Elmer, Warrington, Cheshire, UK) on pBmSPD-II. A 4.5-kb HindIII fragment containing the entire 5' region of the gene was shotgun-cloned into M13mp18. A total of 48 clones was sequenced using fluorescent dye-labeled M13 reverse primer by cycle sequencing. Sequence gaps were filled by a combination of fluorescent dye-terminator sequencing on PCR products generated from suitable M13 clones and primer walking on pBmSPD-II.
Rapid Amplification of 5' cDNA Ends
Rapid amplification of cDNA ends (RACE) at the 5' end (23) was performed to obtain the sequence of the 5' untranslated region (UTR) of mouse SP-D. A single-stranded DNA template was synthesized using 50 U of MLV-RT on mouse total lung RNA with an oligonucleotide specific for exon 7 of SP-D (GTGGGAGAAGGCAACCTC). After heat inactivation and ethanol precipitation, the single-stranded DNAs were tailed with deoxycytidine using terminal deoxynucleotidyltransferase (Boehringer Mannheim, Lewes, East Sussex, UK). PCR was performed for 30 cycles on the single-stranded DNA template using an "anchor" primer (GACTCGAGTCGACATCGATG17H) and a primer specific to SP-D exon 3 (CACGTTCTCCCTTTGGTC). A nested PCR was performed using one-fiftieth of the first PCR product as a template with an antisense primer specific for the 3' end of exon 2 (CACCCTTCTCACCCCGT) and an anchor primer (GACTCGAGTCGACATCGAT). The 5' RACE product was subcloned into pBluescript and analyzed by restriction digestion. A total of 10 independent clones were cycle-sequenced with fluorescent dye-labeled M13 primers.
Polymorphism Analysis
Two dinucleotide (CA) repeats and a polyadenosine stretch in the 3' UTR (see Figure 1) were investigated for simple-length polymorphisms. Primers were designed (Table 1) to generate a PCR product of approximately 150 base pairs (bp). PCR was performed on genomic DNA samples from different inbred mice (provided by Yvonne Boyd, MRC Mammalian Genetics Unit, Harwell, UK). PCR was carried out for 30 cycles of amplification: 94°C denaturation for 30 s, 55°C or 60°C for 30 s (see Table 1), and 72°C for 1 min with 25 ng of genomic DNA, in a 25-µl reaction volume (16 mM [NH4]2SO4, 1.5 mM MgCl2, 67 mM Tris-HCl [pH 8.8], 0.01% [vol/vol] Tween-20, 1 U Taq DNA polymerase, 200 µM of each dNTP, and 0.25 µM of each primer). Reaction product, 5 µl, was run on a 6% (wt/vol) nondenaturing polyacrylamide sequencing gel and visualized by silver staining.
Chromosomal Localization
A random selection of 50 mouse genomic DNA samples was obtained from the European Collaborative Interspecific Backcross (EUCIB) panel (Human Genome Mapping Project Resource Centre, Hinxton, Cambridge, UK) (24). All the animals used were from a C57BL/6 × Mus spretus F1 generation backcrossed with either parental strain. For fine-scale mapping, a panel of informative recombinant mice for D14Mit45-D14Mit5 was analyzed. The inheritance of the simple sequence length polymorphism (SSLP) flanking the mouse SP-D gene (Sftpd), assessed by PCR, was used to determine the chromosomal location and fine-scale position of the gene with respect to known EUCIB markers on the mouse genome.
Zooblot
Genomic DNA was extracted from several different animals (Genomic DNA extraction kit; Qiagen, Hilden, Germany). Genomic DNA, 10 µg, was digested with BamHI, separated on a 1% (wt/vol) agarose gel, then blotted onto nylon membrane. The membrane was probed with an [α-32P]-labeled mouse SP-D probe (the carbohydrate recognition domain [CRD]-encoding part of exon 8). The membrane was washed under medium stringency conditions (0.1× SSC and 0.1% [wt/vol] SDS at 50°C) and exposed to X-ray film for 32 h.
| |
Results |
|---|
|
|
|---|
Characterization of the Mouse SP-D Genomic Clones
Restriction mapping, sequencing, and PCR analysis showed that the λ clone, λmSPD-II, contained all of the protein coding regions for mouse SP-D and the longest 5' region. Two contiguous stretches of sequence were constructed using primer walking on the plasmid subclone (pBmSPD-II) and shotgun sequencing of the 5' region of the gene (accession numbers AF047741 and AF047742).
The gene is approximately 14 kb long (Figure 1). The first exon encodes 39 bp of the complete 5' UTR (see below). Exon 2 includes the remaining 3 bp of the 5' UTR, the signal peptide, N-terminal region, and the first seven triplets of collagen-like sequence. The four subsequent exons, 3-6, each of 117 bp, encode the rest of the collagen-like region. The final two exons, 7 and 8, encode the helical coiled-coil region and the C-type lectin domain plus the 3' UTR, respectively. The predicted messenger RNA (mRNA) from the genomic sequence differs from the published cDNA B6/CBAF1J sequence for mouse SP-D (17) in the 3' UTR, where a polyadenosine tract is 2 bp shorter in the 129/SV genomic sequence. This represents a polymorphism within the 3' UTR.
The gene size and the intron-exon organization of Sftpd are similar to both the human SP-D gene and CGN1. All exons are in phase I with introns interrupting glycine codons of the collagen-like sequence (Table 2), consistent with the genomic structures of all the collectin genes sequenced to date.
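The phase bookkeeping behind this observation can be sketched in a few lines. Using the usual convention that a phase 1 (phase I) intron falls after the first nucleotide of a codon, the phase of the intron following an exon is (incoming phase + exon length) mod 3, so an exon whose length is a multiple of 3, such as the 117-bp collagen exons, passes the phase on unchanged. The function name is a hypothetical one chosen for illustration.

```python
def next_intron_phase(incoming_phase: int, coding_exon_length: int) -> int:
    """Phase (0, 1 or 2) of the intron that follows an internal coding exon.

    The phase is the number of nucleotides of a codon lying upstream of the intron.
    """
    return (incoming_phase + coding_exon_length) % 3


if __name__ == "__main__":
    phase = 1  # phase 1: the intron sits after the first codon position
    for exon in (3, 4, 5, 6):  # the four 117-bp collagen exons
        phase = next_intron_phase(phase, 117)
        print(f"intron after exon {exon}: phase {phase}")
    # 117 bp is equivalent to 39 codons, or 13 Gly-Xxx-Yyy repeats
    print(117 // 3, 117 // 9)
```

Because each 117-bp exon leaves the phase at 1, every intron in the collagen-like region interrupts a codon at the same position, which is what allows the glycine codons to be split in register.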
Zooblot
To confirm the integrity of the cloned gene, its restriction pattern, obtained with several different enzymes, was compared with that of 129/SV genomic DNA, using the exon encoding the CRD of mouse SP-D as a probe; the patterns were identical (results not shown). This probe was also cross-hybridized with the genomic DNA of different species. At mild stringency, cross hybridization was seen across all the mammals (human, mouse, cattle, sheep, and rabbit), birds (goose), and amphibians (frog) tested, and in each case the pattern indicates a single copy SP-D-like gene (Figure 2). Cross hybridization was not seen with fish.
Dispersed Repetitive Elements and Polymorphisms
The increase in the size of the murine SP-D intron between exons 5 and 6 compared with the analogous human intron is a result of an insertion of a 360-bp dispersed repetitive element (Figure 1). The element is an ORR-1, a subfamily of the rodent-specific mammalian apparent long terminal repeat (LTR) retrotransposons (MaLRs) (25). The Sftpd ORR-1 shows the highest degree of homology with the consensus sequence for ORR-1A (accession no. U17093) and includes the typical hallmarks for solitary LTRs (shown in Figure 3).
Two dinucleotide repeats (CA) and a polyadenosine tract were identified within the gene for mouse SP-D (Figure 1). All were shown to be polymorphic between different inbred mouse strains by PCR (results submitted to the Mouse Genome Database, accession no. J:48340). The polymorphic CA repeat within the promoter region (CA1) showed three different alleles across seven strains, with arrays of 21 to 23 dinucleotides. This repeat is also present in the rat SP-D promoter region but absent from both the human SP-D (13) and bovine conglutinin promoter regions (26). The second CA repeat (CA2), within intron 7, showed a higher degree of variation than did CA1, with five different alleles across seven inbred mouse strains, ranging from 16 to 25 dinucleotides. It is unknown whether this repeat is present in the rat or human genes for SP-D. The polyadenosine tract within the 3' UTR of mouse SP-D displayed three alleles across 10 strains of inbred mice, with a tract length varying from 19 to 27 bp.
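A repeat tract of this kind can be located and sized with a simple pattern search. The sketch below finds uninterrupted (CA)n runs and reports the number of repeat units; the input sequence and the minimum-length threshold are made-up placeholders rather than Sftpd data, and the function name is hypothetical. Alleles that differ by one dinucleotide unit differ by 2 bp in PCR product length.

```python
import re


def find_ca_repeats(seq: str, min_units: int = 5):
    """Return (start, units) for each uninterrupted (CA)n run with n >= min_units."""
    pattern = re.compile(r"(?:CA){%d,}" % min_units)
    return [(m.start(), len(m.group()) // 2) for m in pattern.finditer(seq.upper())]


if __name__ == "__main__":
    # Hypothetical flanking sequence around a 21-unit CA repeat (not real Sftpd sequence).
    toy = "GGTACCTTG" + "CA" * 21 + "TTGGATCC"
    print(find_ca_repeats(toy))
```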
Chromosomal Localization
The SSLP CA1 was used to determine the chromosomal location of Sftpd relative to other genetic markers, to aid in physical mapping of the collectin locus. Initially, genotyping was carried out on 50 DNA samples randomly selected from the EUCIB (24), followed by analysis of an informative recombinant panel for the region D14Mit45-D14Mit5. Gene ordering and linkage to the EUCIB anchor loci were carried out on the MBx database and by visual inspection of the data (24). The EUCIB data confirm an earlier report that Sftpd lies in the central region of mouse chromosome 14 (17) and is linked to D14Mit45 (χ² value of 71.14) and D14Mit60 (χ² value of 27.14) (Figure 4B).
Further analysis of critical recombinants, while minimizing the number of double recombinants, established the locus order D14Mit45, D14Mit212, D14Mit56 and Sftpd, D14Mit141, D14Mit60 (Figure 4C). Of the 27 scored recombinant mice between D14Mit45 and D14Mit60, 16 were recombinant between D14Mit45 and Sftpd, whereas 11 were recombinant between Sftpd and D14Mit60, showing tighter linkage of Sftpd to D14Mit60 on the genetic map and cosegregation with D14Mit56.
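The recombinant counts allow a rough placement of Sftpd within the interval. As a back-of-the-envelope sketch, assuming that map distance is proportional to the number of recombinants on each side of the gene and that the 27 recombinants span the whole 5.64-cM D14Mit45-D14Mit60 interval, the interval can be apportioned as follows. The proportionality assumption and the function name are introduced here for illustration; the resulting sub-interval figures are not reported in the paper.

```python
def apportion_interval(total_cm: float, left_recombinants: int, right_recombinants: int):
    """Split a map interval in proportion to recombinant counts on each side of a locus."""
    total = left_recombinants + right_recombinants
    left_cm = total_cm * left_recombinants / total
    return left_cm, total_cm - left_cm


if __name__ == "__main__":
    # 16 recombinants between D14Mit45 and Sftpd, 11 between Sftpd and D14Mit60
    to_mit45, to_mit60 = apportion_interval(5.64, 16, 11)
    print(f"D14Mit45-Sftpd ~{to_mit45:.2f} cM, Sftpd-D14Mit60 ~{to_mit60:.2f} cM")
```

Under these assumptions Sftpd lies somewhat closer to D14Mit60 than to D14Mit45, in line with the tighter linkage noted above.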
Use of 5' RACE to Identify the 5' UTR of the Mouse SP-D cDNA
RACE at the 5' end was performed on total mouse lung RNA, using nested PCR on a single-stranded DNA template that was primed with a mouse SP-D-specific oligonucleotide, yielding a product of approximately 250 bp. The sequences of seven independent clones of the 5' RACE product were in agreement with the cDNA sequence of mouse SP-D. An additional 37 bp of 5' UTR sequence was identified in the clones, identical to a continuous stretch of sequence in the 5' region of the mouse SP-D gene (exon 1; see Figure 5). This is consistent with the 5' extent of the mouse SP-D gene transcribed in the lung, as shown by primer extension (results not shown). Three truncated 5' RACE products were also identified: two initiated at position 63 and a third at position 70 of the full-length cDNA.
Analysis of the Promoter Region
Analysis of the sequence around the first nucleotide of exon 1 of Sftpd showed a strong similarity to the consensus sequence for the initiation of transcription (YNNNYAYYYYY) (27), consistent with the transcription start site identified by 5' RACE and that shown for humans and rats (13).
The repetitive DNA elements divide the 5' sequence flanking the transcription start site into two areas. The upstream half, -3671 to -1809, is rich in dispersed repetitive elements, which occupy 55% of the sequence and comprise three short interspersed element (SINE)-like sequences lying upstream of three long interspersed element (LINE)-like sequences. The sequence immediately upstream of the transcription start site is devoid of known dispersed repetitive elements (see Figure 1).
Comparative sequence analysis between the human and mouse promoters for SP-D, using dot-plot analysis, revealed a conserved promoter that spans 700 bp upstream from exon 1 (results not shown). A similar area of identity was seen with bovine conglutinin, although fewer clusters of identity are immediately adjacent to the transcription start site (results not shown). This is consistent with the idea that the proximal region of a promoter confers cell-type specificity, and that more distal regions are important for modulation of transcription.
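For readers unfamiliar with the method, a dot plot marks every pair of positions at which a short window of one sequence matches the other, so that conserved regions show up as diagonal runs. The following is a minimal sketch with an arbitrary window size and toy sequences; it is not the program or the parameters used for the analysis above, and the function name is hypothetical.

```python
def dot_plot(seq_a: str, seq_b: str, window: int = 4):
    """Return (i, j) pairs where seq_a[i:i+window] == seq_b[j:j+window]."""
    # Index every window of seq_b so that each window of seq_a is looked up once.
    index = {}
    for j in range(len(seq_b) - window + 1):
        index.setdefault(seq_b[j:j + window], []).append(j)
    hits = []
    for i in range(len(seq_a) - window + 1):
        for j in index.get(seq_a[i:i + window], []):
            hits.append((i, j))
    return hits


if __name__ == "__main__":
    # Toy sequences sharing the block "CATAAATGC" at different offsets.
    a = "GGGCATAAATGCTT"
    b = "TTCATAAATGCAAA"
    for i, j in dot_plot(a, b):
        print(i, j)  # hits with the same i - j lie on one diagonal
```

Plotting the returned pairs as points gives the familiar picture; filtering for long runs along a single diagonal is one simple way to pick out blocks of identity.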
An array of transcription factor binding sites was identified within the 5' region flanking the mouse SP-D gene (Figure 5). However, the sites positionally conserved between the murine and human SP-D genes or conglutinin are probably more functionally significant (listed in Table 3). Two blocks of sequence identity are interesting because they both contain conserved binding sites for cell-specific transcription factors. A degenerate TATA box (CATAAAT) at -30 is conserved, and differs by one nucleotide from the consensus TATA box (TATAWAW). Also found within this conserved block of sequence are transcription factor binding sites for Cdx (MTTTATR), which is involved in intestinal-specific gene transcription (28), and a core recognition sequence (RTAAAYA) for the forkhead (fkh) family of transcription factors (29), which is involved in organ-specific gene expression along the foregut axis (30). Upstream, within a larger block of conserved sequence at -512 to -459, lie a conserved binding site for the transcription factor AP-2 (CCCMNSSS) at -509 and a conserved central core motif for thyroid transcription factor (TTF)-1 (31) at -489. Positional conservation of SP-1, H-AP-1, and nuclear factor-interleukin-6 regulatory elements, as described for the human SP-D gene (SFTPD) and the rat gene (13), could not be seen in Sftpd.
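The consensus sequences quoted above use IUPAC ambiguity codes (W = A or T, M = A or C, R = A or G, Y = C or T, S = C or G, N = any base). A minimal sketch of how a candidate site can be scored against such a consensus, on either strand, is given below. The helper names and the toy sequence are hypothetical, and this is not the search software used in the study.

```python
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "N": "ACGT",
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def mismatches(site: str, consensus: str) -> int:
    """Number of positions where `site` is not allowed by the IUPAC `consensus`."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have equal length")
    return sum(base not in IUPAC[code] for base, code in zip(site.upper(), consensus.upper()))


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def scan(seq: str, consensus: str, max_mismatch: int = 0):
    """Yield (position, strand, mismatches) for windows matching the consensus."""
    k = len(consensus)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        for strand, candidate in (("+", window), ("-", reverse_complement(window))):
            mm = mismatches(candidate, consensus)
            if mm <= max_mismatch:
                yield i, strand, mm


if __name__ == "__main__":
    # The degenerate TATA box from the text against the consensus TATAWAW.
    print(mismatches("CATAAAT", "TATAWAW"))  # 1, as stated in the text
    # Scan a short hypothetical stretch for the Cdx and fkh motifs.
    toy = "GGCATAAATACC"
    for name, motif in (("Cdx", "MTTTATR"), ("fkh", "RTAAAYA")):
        print(name, list(scan(toy, motif)))
```

Scanning both strands matters because a binding site may be annotated on either strand; allowing one or two mismatches through `max_mismatch` is one way to pick up degenerate sites such as the TATA-like box.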
It is interesting to note that the rodent and human genes for SP-D do not contain CAAT boxes within the vicinity of the TATA box. Indeed, alignment between Sftpd and CGN1 shows that the analogous region of Sftpd has been disrupted by the polymorphic microsatellite (CA1).
| |
Discussion |
|---|
|
|
|---|
This study describes the molecular cloning and a detailed characterization of the mouse gene Sftpd, including the identification of the start of transcription and three new polymorphic markers for Sftpd. The gene extends over 14 kb and consists of 8 exons (Figure 1). All introns begin with a 5' GT and terminate with a 3' AG (Table 2), conforming to the GT-AG rule. The positions of introns are conserved between the mouse and human SP-D genes (21), with all the protein coding exons in phase I. The mouse SP-D gene was partially sequenced and two individual contiguous stretches of sequence were constructed (Figure 1). The first contig includes 3.5 kb of the 5' flanking region of the gene, exon 1, and exon 2 (accession no. AF047741). The second contig encompasses exon 3 to exon 8 (accession no. AF047742). The 5' UTR, absent from the published mouse cDNA sequence (17), was identified by 5' RACE, and the majority is shown to be encoded by a single exon of 39 bp (exon 1). This exon shows 78% identity to the first exon of human SP-D and 80% identity to the first exon of rat SP-D (13). Exons 2 to 8 of Sftpd show complete identity to the published mouse cDNA sequence (17), with the exception of a polyadenosine tract in the 3' UTR, which is polymorphic. Exon 2 codes for the last 3 bp of 5' UTR, the signal peptide, and the N-terminal region, important for the quaternary protein structure (32), and 120 bp of the collagen region. This is followed by four 117-bp exons encoding uninterrupted collagen sequence. The neck region, which initiates trimerization of the polypeptide chain (33), and the CRD are encoded by separate exons (exons 7 and 8, respectively). In addition to the CRD, exon 8 also codes for a 126-bp 3' UTR.
The intron-exon structure of mouse SP-D is consistent with that of other members of the collectin gene family. Glycine codons of the collagen motif (Gly-Xxx-Yyy) are split in phase I by introns, resembling the organization of the nonfibrous vertebrate and invertebrate collagens (34). Split glycine codons have also been identified in the type IV collagens, the complement C1q genes, and the macrophage scavenger receptor gene, as noted by Crouch and colleagues (21), and, more recently, have been seen in the human ficolin genes (35). Phase I interrupted glycine codons are believed to be the most ancestral collagen exons (36). Phase I introns are also the most prevalent type and hence are compatible with exon shuffling (37). An outstanding feature of SP-D is that the collagen region is encoded by four identically sized 117-bp exons, representative of collectin exon duplications. Despite the presence of these four genomic interruptions, the Gly-Xxx-Yyy collagen-like motif is not perturbed, unlike that of other collagen-containing proteins such as C1q and MBL (38).
Two polymorphic dinucleotide repeats and one polyadenosine tract have been identified in Sftpd that represent the only SSLP for the collectin gene loci. Because SSLP is a more easily applied technique than restriction fragment length polymorphism, these markers will be useful for positional cloning and disease linkage studies. The polymorphic nature of these markers permits their use across many different backcross and recombinant inbred panels. A polyadenosine tract length polymorphism was detected in the 3' UTR. The allele seen in the B6/CBAF1J cDNA sequence for SP-D (17) does not represent the parental alleles (C57BL/6 and CBA). This may reflect allelic differences between laboratory inbred strains. The significance of a second polyadenosine tract close to the polyadenylated tail of an mRNA is not known. The dinucleotide repeat within the promoter region of SP-D is present in rodents but not in the human promoter. This repeat may prove to be interesting at the protein level because dinucleotide repeats within the promoter region are thought to be responsible for intraspecies phenotypic variation (39). It would be interesting to determine whether variation in repeat length within the promoter or the 3' UTR gives quantitative variations in the expression levels of SP-D in rodent populations.
One of the polymorphic repeats, CA1, was used in backcross analysis, mapping the mouse gene for SP-D to chromosome 14 (Figure 4); this is in agreement with an earlier report (17). In addition, fine-scale mapping of Sftpd shows that it lies within a 5.64-cM region between D14Mit60 and D14Mit45 (Figure 4B). This information will aid physical mapping of the mouse collectin locus and makes available a simple method for identification of the collectin locus for genome scanning and positional cloning experiments.
The genomic sequence surrounding the 3' UTR of the mouse SP-D gene does not share sequence similarity with human Alu repetitive elements or the human hypervariable minisatellite (D1S8), as described for the 3' UTR of the mouse SP-D cDNA (17). However, there are two points of interest regarding dispersed repetitive elements. First, the promoter region of Sftpd is essentially defined by the absence of repetitive elements proximal to the start of transcription, in contrast to the distal region flanking the gene that contains three LINE-like and three SINE-like sequences (see Figure 1). Second, a dispersed repetitive element, a solitary LTR (Figure 3), is present in intron 5 of the mouse SP-D gene and belongs to the rodent-specific ORR-1A subfamily of MaLRs (25). This second element is atypical because it contains an inserted minisatellite; such hybrid elements have the potential to be hypervariable in both murine and human genomes and are regarded as hot spots of recombination (40).
The cDNA and protein sequences of SP-D and SP-A have so far been described only for mammals (humans, rodents, cows, and guinea pigs). Recently, an SP-A-like RNA and protein have been detected in fish (41). We used an SP-D CRD probe to identify a single copy SP-D-like gene in all animals with lungs: mammals, avians, and amphibians (Figure 2). SP-D expression in mammals is not restricted to the lung since SP-D mRNA has been detected at a variety of different nonpulmonary sites. However, in fish, an SP-D-like gene does not appear to be present. Therefore, SP-D probably evolved for lung-related functions, and transcriptional control of the gene was adapted for expression at secondary sites, which is reflected by the low expression levels at these secondary sites of expression. SP-D appears as a single copy gene across the mammals, birds, and amphibians tested, unlike MBL and SP-A, in which the copy number of each gene varies between different species.
Comparison of syntenic regions of different genomes is useful in identifying genes and regulatory regions. Dot-plot analysis and sequence alignment of the human and murine 5' flanking regions of SP-D genes suggest that the regulatory region of the SP-D gene extends 700 bp upstream of the transcription start site. This agrees with transcription analysis performed on the human SP-D promoter (13). Numerous stretches of sequence identity are revealed by alignment of the mouse and human SP-D promoter, some clearly showing importance as they coincide with consensus transcription factor binding sites (Table 3) (42).
The minimal promoter for human SP-D that confers cell-type specificity stretches 161 bp upstream from the transcription start site (13). Within this region, putative transcription factor binding sites are positionally conserved in murine and human SP-D promoters, including AP-1, H-APF-1, Cdx, and the core motif for the fkh family of transcription factors. The AP-1 site has been shown to be functional (13), and is also conserved in the bovine conglutinin promoter. The only conserved lung-specific transcription factor binding site within the minimal promoter is for the fkh family of transcription factors (29). This is intriguing because members of this family also contribute to lung-specific expression of Clara cell 10-kD protein (CC-10) and SP-B (43).
TTF-1 plays important roles in lung-specific expression of SP-A, SP-B, SP-C, and CC-10 (30, 44). The transcription factor binding site core motifs of TTF-1 (31) were used to find two positionally conserved sites in the human and murine SP-D promoters; these sites were situated within larger blocks of sequence identity and warrant further investigation.
Two E-boxes are positionally conserved in the human and murine SP-D promoters; E-box motifs are known to be present in the enhancer regions of the SP-A promoter (47). Another conserved transcription factor binding site is the H-APF-1 site. H-APF-1 is commonly found in acute phase response genes and is induced by lipopolysaccharides, which upregulate rat SP-D expression in vivo (48).
Analysis of the transcriptional control of SP-D will help elucidate the role of this gene within the immune system and address questions of what immune signals interact with SP-D. Consequently, understanding the effects that deficiency or different levels of SP-D have upon the host, whether in animal models or by defining functional polymorphisms of SP-D in the human population, is also an essential route of investigation.
| |
Footnotes |
|---|
Abbreviations: base pair(s), bp; complementary DNA, cDNA; bovine conglutinin gene, CGN1; centimorgan(s), cM; carbohydrate recognition domain, CRD; forkhead, fkh; long interspersed element, LINE; long terminal repeat, LTR; mammalian apparent LTR, MaLR; mannan-binding lectin, MBL; Moloney murine leukemia virus reverse transcriptase, MLV-RT; messenger RNA, mRNA; polymerase chain reaction, PCR; rapid amplification of cDNA ends, RACE; mouse SP-D gene, Sftpd; short interspersed element, SINE; surfactant protein, SP; simple sequence length polymorphism, SSLP; thyroid transcription factor, TTF; untranslated region, UTR.
(Received in original form February 19, 1998 and in revised form October 9, 1998).
Data deposition: The sequences reported in this publication have been submitted to the GenBank and EMBL databases (accession numbers AF047741 and AF047742). The complete linkage data are available from the EUCIB homepage (http://www.hgmp.mrc.ac.uk/MBx/MBxHomepage.html). Information regarding polymorphisms has been submitted to the Mouse Genome Database (http://www.informatics.jax.org), accession number J:48340.
Acknowledgments: The authors thank Dr. Yvonne Boyd for both helpful discussion and sound advice on the mouse backcross analysis; Drs. Duncan Campbell and Kurt Drickamer for critical discussion of this manuscript; the rapid and free EUCIB service provided by the MRC-funded HGMP resource center at Hinxton, UK; and Drs. John Broxholme, Karsten Skjødt, and Lars Vitved for technical advice and guidance. One author (U.H.) is supported by the Benzon Foundation.
| |
References |
|---|
|
|
|---|
1. Rooney, S. A., S. L. Young, and C. R. Mendelson. 1994. Molecular and cellular processing of lung surfactant. FASEB J. 8: 957-967.
2. Wright, J. R. 1997. Immunomodulatory functions of surfactant. Physiol. Rev. 77: 931-962.
3. Holmskov, U., R. Malhotra, R. B. Sim, and J. C. Jensenius. 1994. Collectins: collagenous C-type lectins of the innate immune defense system. Immunol. Today 15: 67-74.
4. Medzhitov, R., and C. A. Janeway, Jr. 1997. Innate immunity: the virtues of a nonclonal system of recognition. Cell 91: 295-298.
5. Thiel, S., T. Vorup-Jensen, C. M. Stover, W. Schwaeble, S. B. Laursen, K. Poulsen, A. C. Willis, P. Eggleton, S. Hansen, U. Holmskov, K. B. Reid, and J. C. Jensenius. 1997. A second serine protease associated with mannan-binding lectin that activates complement. Nature 386: 506-510.
6. Crouch, E. C., A. Persson, G. L. Griffin, D. Chang, and R. M. Senior. 1995. Interactions of pulmonary surfactant protein D (SP-D) with human blood leukocytes. Am. J. Respir. Cell Mol. Biol. 12: 410-415.
7. Holmskov, U., P. Lawson, B. Teisner, I. Tornoe, A. C. Willis, C. Morgan, C. Koch, and K. B. Reid. 1997. Isolation and characterization of a new member of the scavenger receptor superfamily, glycoprotein-340 (gp-340), as a lung surfactant protein-D binding molecule. J. Biol. Chem. 272: 13743-13749.
8. Pikaar, J. C., W. F. Voorhout, L. M. van Golde, J. Verhoef, J. A. Van Strijp, and J. F. van Iwaarden. 1995. Opsonic activities of surfactant proteins A and D in phagocytosis of gram-negative bacteria by alveolar macrophages. J. Infect. Dis. 172: 481-489.
9. Voorhout, W. F., T. Veenendaal, Y. Kuroki, Y. Ogasawara, L. M. van Golde, and H. J. Geuze. 1992. Immunocytochemical localization of surfactant protein D (SP-D) in type II cells, Clara cells, and alveolar macrophages of rat lung. J. Histochem. Cytochem. 40: 1589-1597.
10. Crouch, E., D. Parghi, S. F. Kuan, and A. Persson. 1992. Surfactant protein D: subcellular localization in nonciliated bronchiolar epithelial cells. Am. J. Physiol. 263: L60-L66.
11. LeVine, A. M., M. D. Bruno, K. M. Huelsman, G. F. Ross, J. A. Whitsett, and T. R. Korfhagen. 1997. Surfactant protein A-deficient mice are susceptible to group B streptococcal infection. J. Immunol. 158: 4336-4340.
12. Korfhagen, T. R., S. W. Glasser, and B. R. Stripp. 1994. Regulation of gene expression in the lung. Curr. Opin. Pediatr. 6: 255-261.
13. Rust, K., L. Bingle, W. Mariencheck, A. Persson, and E. C. Crouch. 1996. Characterization of the human surfactant protein D promoter: transcriptional regulation of SP-D gene expression by glucocorticoids. Am. J. Respir. Cell Mol. Biol. 14: 121-130.
14. Mariencheck, W., and E. Crouch. 1994. Modulation of surfactant protein D expression by glucocorticoids in fetal rat lung. Am. J. Respir. Cell Mol. Biol. 10: 419-429.
15. Hoover, R. R., and J. Floros. 1998. Organization of the human SP-A and SP-D loci at 10q22-q23: physical and radiation hybrid mapping reveal gene order and orientation. Am. J. Respir. Cell Mol. Biol. 18: 353-362.
16. Sastry, K., G. A. Herman, L. Day, E. Deignan, G. Bruns, C. C. Morton, and R. A. Ezekowitz. 1989. The human mannose-binding protein gene: exon structure reveals its evolutionary relationship to a human pulmonary surfactant gene and localization to chromosome 10. J. Exp. Med. 170: 1175-1189.
17. Motwani, M., R. A. White, N. Guo, L. L. Dowler, A. I. Tauber, and K. N. Sastry. 1995. Mouse surfactant protein-D: cDNA cloning, characterization, and gene localization to chromosome 14. J. Immunol. 155: 5671-5677.
18. Sastry, R., J. S. Wang, D. C. Brown, R. A. Ezekowitz, A. I. Tauber, and K. N. Sastry. 1995. Characterization of murine mannose-binding protein genes Mbl1 and Mbl2 reveals features common to other collectin genes. Mamm. Genome 6: 103-110.
19. Moore, K. J., M. A. D'Amore-Bruno, T. R. Korfhagen, S. W. Glasser, J. A. Whitsett, N. A. Jenkins, and N. G. Copeland. 1992. Chromosomal localization of three pulmonary surfactant protein genes in the mouse. Genomics 12: 388-393.
20. Sanger, F., S. Nicklen, and A. R. Coulson. 1977. DNA sequencing with chain-terminating inhibitors. Proc. Natl. Acad. Sci. USA 74: 5463-5467.
21. Crouch, E., K. Rust, R. Veile, H. Donis-Keller, and L. Grosso. 1993. Genomic organization of human surfactant protein D (SP-D): SP-D is encoded on chromosome 10q22.2-23.1. J. Biol. Chem. 268: 2976-2983.
22. Southern, E. M. 1975. Detection of specific sequences among DNA fragments separated by gel electrophoresis. J. Mol. Biol. 98: 503-517.
23. Frohman, M. A., M. K. Dush, and G. R. Martin. 1988. Rapid production of full-length cDNAs from rare transcripts: amplification using a single gene-specific oligonucleotide primer. Proc. Natl. Acad. Sci. USA 85: 8998-9002.
24. Breen, M., L. Deakin, B. Macdonald, S. Miller, R. Sibson, E. Tarttelin, P. Avner, F. Bourgade, J. L. Guenet, X. Montagutelli, C. Poirier, D. Simon, D. Tailor, M. Bishop, M. Kelly, F. Rysavy, S. Rastan, D. Norris, D. Shepherd, C. Abbott, A. Pilz, S. Hodge, I. Jackson, Y. Boyd, H. Blair, G. Maslen, J. A. Todd, P. W. Reed, J. Stoye, A. Ashworth, L. McCarthy, R. Cox, L. Schalkwyk, H. Lehrach, J. Klose, U. Gangadharan, and S. Brown (European Backcross Collaborative Group). 1994. Towards high resolution maps of the mouse and human genomes: a facility for ordering markers to 0.1 cM resolution. Hum. Mol. Genet. 3: 621-627.
25. Smit, A. F. 1993. Identification of a new, abundant superfamily of mammalian LTR-transposons. Nucleic Acids Res. 21: 1863-1872.
26. Kawasaki, N., N. Itoh, and T. Kawasaki. 1994. Gene organization and 5'-flanking region sequence of conglutinin: a C-type mammalian lectin containing a collagen-like domain. Biochem. Biophys. Res. Commun. 198: 597-604.
27. Breathnach, R., and P. Chambon. 1981. Organization and expression of eucaryote split genes coding for proteins. Annu. Rev. Biochem. 50: 349-383.
28. Suh, E., L. Chen, J. Taylor, and P. G. Traber. 1994. A homeodomain protein related to caudal regulates intestine-specific gene transcription. Mol. Cell. Biol. 14: 7340-7351.
29. Overdier, D. G., A. Porcella, and R. H. Costa. 1994. The DNA-binding specificity of the hepatocyte nuclear factor 3/forkhead domain is influenced by amino-acid residues adjacent to the recognition helix. Mol. Cell. Biol. 14: 2755-2766.
30. Bohinski, R. J., R. Di Lauro, and J. A. Whitsett. 1994. The lung-specific surfactant protein B gene promoter is a target for thyroid transcription factor 1 and hepatocyte nuclear factor 3, indicating common factors for organ-specific gene expression along the foregut axis. Mol. Cell. Biol. 14: 5671-5681.
31. Yan, C., Z. Sever, and J. A. Whitsett. 1995. Upstream enhancer activity in the human surfactant protein B gene is mediated by thyroid transcription factor 1. J. Biol. Chem. 270: 24852-24857.
32. Brown-Augsburger, P., D. Chang, K. Rust, and E. C. Crouch. 1996. Biosynthesis of surfactant protein D. Contributions of conserved NH2-terminal cysteine residues and collagen helix formation to assembly and secretion. J. Biol. Chem. 271: 18912-18919.
33. Hoppe, H. J., P. N. Barlow, and K. B. Reid. 1994. A parallel three stranded alpha-helical bundle at the nucleation site of collagen triple-helix formation. FEBS Lett. 344: 191-195.
34. Exposito, J. Y., and R. Garrone. 1990. Characterization of a fibrillar collagen gene in sponges reveals the early evolutionary appearance of two collagen gene families. Proc. Natl. Acad. Sci. USA 87: 6669-6673.
35. Endo, Y., Y. Sato, M. Matsushita, and T. Fujita. 1996. Cloning and characterization of the human lectin P35 gene and its related gene. Genomics 36: 515-521.
36. Exposito, J. Y., D. Le Guellec, Q. Lu, and R. Garrone. 1991. Short chain collagens in sponges are encoded by a family of closely related genes. J. Biol. Chem. 266: 21923-21928.
37. Patthy, L. 1987. Intron-dependent evolution: preferred types of exons and introns. FEBS Lett. 214: 1-7.
38. Drickamer, K., and V. McCreary. 1987. Exon structure of a mannose-binding protein gene reflects its evolutionary relationship to the asialoglycoprotein receptor and nonfibrillar collagens. J. Biol. Chem. 262: 2582-2589.
39. Kashi, Y., D. King, and M. Soller. 1997. Simple sequence repeats as a source of quantitative genetic variation. Trends Genet. 13: 74-78.
40. Kelly, R., M. Gibbs, A. Collick, and A. J. Jeffreys. 1991. Spontaneous mutation at the hypervariable mouse minisatellite locus Ms6-hm: flanking DNA sequence and analysis of germline and early somatic mutation events. Proc. R. Soc. Lond. B. Biol. Sci. 245: 235-245.
41. Sullivan, L. C., C. B. Daniels, I. D. Phillips, S. Orgeig, and J. A. Whitsett. 1998. Conservation of surfactant protein A: evidence for a single origin for vertebrate pulmonary surfactant. J. Mol. Evol. 46: 131-138.
42. Faisst, S., and S. Meyer. 1992. Compilation of vertebrate-encoded transcription factors. Nucleic Acids Res. 20: 3-26.
43. Hellqvist, M., M. Mahlapuu, L. Samuelsson, S. Enerback, and P. Carlsson. 1996. Differential activation of lung-specific genes by two forkhead proteins, FREAC-1 and FREAC-2. J. Biol. Chem. 271: 4482-4490.
44. Bruno, M. D., R. J. Bohinski, K. M. Huelsman, J. A. Whitsett, and T. R. Korfhagen. 1995. Lung cell-specific expression of the murine surfactant protein A (SP-A) gene is mediated by interactions between the SP-A promoter and thyroid transcription factor-1. J. Biol. Chem. 270: 6531-6536 [published erratum appears in J. Biol. Chem. 270(27): 16482].
45. Kelly, S. E., C. J. Bachurski, M. S. Burhans, and S. W. Glasser. 1996. Transcription of the lung-specific surfactant protein C gene is mediated by thyroid transcription factor 1. J. Biol. Chem. 271: 6881-6888.
46. Toonen, R. F., S. Gowan, and C. D. Bingle. 1996. The lung enriched transcription factor TTF-1 and the ubiquitously expressed proteins Sp1 and Sp3 interact with elements located in the minimal promoter of the rat Clara cell secretory protein gene. Biochem. J. 316: 467-473.
47. Gao, E., J. L. Alcorn, and C. R. Mendelson. 1993. Identification of enhancers in the 5'-flanking region of the rabbit surfactant protein A (SP-A) gene and characterization of their binding proteins. J. Biol. Chem. 268: 19697-19709.
48. McIntosh, J. C., A. H. Swyers, J. H. Fisher, and J. R. Wright. 1996. Surfactant proteins A and D increase in response to intratracheal lipopolysaccharide. Am. J. Respir. Cell Mol. Biol. 15: 509-519.