Gart | GeneID:14450 | Mus musculus
Gene Summary
NCBI Entrez Gene
| Field | Value |
|---|---|
| Gene ID | 14450 |
| Official Symbol | Gart |
| Locus | N/A |
| Gene Type | protein-coding |
| Synonyms | Gaps; Prgs |
| Full Name | phosphoribosylglycinamide formyltransferase |
| Description | phosphoribosylglycinamide formyltransferase |
| Chromosome | 16 C3-C4; 16 63.0 cM |
| Also Known As | |
| Summary | N/A |
Orthologs and Paralogs
Homologs - NCBI's HomoloGene Group: 637
| ID | Symbol | Protein | Species |
|---|---|---|---|
| GeneID:2618 | GART | NP_000810.1 | Homo sapiens |
| GeneID:14450 | Gart | NP_034386.2 | Mus musculus |
| GeneID:33986 | ade3 | NP_523497.2 | Drosophila melanogaster |
| GeneID:58141 | gart | NP_571692.1 | Danio rerio |
| GeneID:180935 | GARS/AIRS/GART | NP_509122.1 | Caenorhabditis elegans |
| GeneID:281183 | GART | NP_001035563.1 | Bos taurus |
| GeneID:288259 | Gart | XP_573258.1 | Rattus norvegicus |
| GeneID:395315 | GART | NP_001001469.1 | Gallus gallus |
| GeneID:458518 | GART | XP_514869.2 | Pan troglodytes |
| GeneID:837515 | AT1G09830 | NP_172454.1 | Arabidopsis thaliana |
| GeneID:852617 | ADE5,7 | NP_011280.1 | Saccharomyces cerevisiae |
| GeneID:1279199 | AgaP_AGAP009786 | XP_318881.2 | Anopheles gambiae |
| GeneID:2541034 | ade1 | NP_596304.1 | Schizosaccharomyces pombe |
| GeneID:2704394 | NCU00177.1 | XP_322263.1 | Neurospora crassa |
| GeneID:2896609 | KLLA0A00957g | XP_451041.1 | Kluyveromyces lactis |
| GeneID:4344854 | Os08g0191200 | NP_001061170.1 | Oryza sativa |
| GeneID:4351721 | Os12g0197100 | NP_001066357.1 | Oryza sativa |
| GeneID:4622064 | AGOS_AFR254C | NP_985801.1 | Eremothecium gossypii |
| GeneID:5051303 | MGG_11343 | XP_001413666.1 | Magnaporthe grisea |
Gene Classification
Gene Ontology
| ID | Category | GO Term |
|---|---|---|
| GO:0005737 | Component | cytoplasm |
| GO:0005524 | Function | ATP binding |
| GO:0003824 | Function | catalytic activity |
| GO:0016742 | Function | hydroxymethyl-, formyl- and related transferase activity |
| GO:0016874 | Function | ligase activity |
| GO:0030145 | Function | manganese ion binding |
| GO:0046872 | Function | metal ion binding |
| GO:0008168 | Function | methyltransferase activity |
| GO:0000166 | Function | nucleotide binding |
| GO:0004637 | Function | phosphoribosylamine-glycine ligase activity |
| GO:0004641 | Function | phosphoribosylformylglycinamidine cyclo-ligase activity |
| GO:0004644 | Function | phosphoribosylglycinamide formyltransferase activity |
| GO:0016740 | Function | transferase activity |
| GO:0009058 | Process | biosynthetic process |
| GO:0006189 | Process | 'de novo' IMP biosynthetic process |
| GO:0009113 | Process | purine base biosynthetic process |
| GO:0006164 | Process | purine nucleotide biosynthetic process |
MicroRNA and Targets
MicroRNA Sequences and Transcript Targets from miRBase at Sanger
| RNA Target | miRBase Precursor Accession | Mature miRNA | Mature miRNA Sequence |
|---|---|---|---|
| ENSMUST00000023684 | MI0004998 | gga-miR-460 | CCUGCAUUGUACACACUGUGUG |
| ENSMUST00000023684 | MI0003186 | hsa-miR-502-3p | AAUGCACCUGGGCAAGGAUUCA |
| ENSMUST00000023684 | MI0003193 | hsa-miR-506 | UAAGGCACCCUUCUGAGUAGA |
| ENSMUST00000023684 | MI0003140 | hsa-miR-512-5p | CACUCAGCCUUGAGGGCACUUUC |
| ENSMUST00000023684 | MI0003141 | hsa-miR-512-5p | CACUCAGCCUUGAGGGCACUUUC |
| ENSMUST00000023684 | MI0003177 | hsa-miR-522 | AAAAUGGUUCCCUUUAGAGUGU |
| ENSMUST00000023684 | MI0003160 | hsa-miR-524-3p | GAAGGCGCUUCCCUUUGGAGU |
| ENSMUST00000023684 | MI0003152 | hsa-miR-525-3p | GAAGGCGCUUCCCUUUAGAGCG |
| ENSMUST00000023684 | MI0003561 | hsa-miR-555 | AGGGUAAGCUGAACCUCUGAU |
| ENSMUST00000023684 | MI0003562 | hsa-miR-556-5p | GAUGAGCUCAUUGUAAUAUGAG |
| ENSMUST00000023684 | MI0003664 | hsa-miR-649 | AAACCUGUGUUGUUCAAGAGUC |
| ENSMUST00000023684 | MI0005560 | hsa-miR-885-3p | AGGCAGCGGGGUGUAGUGGAUA |
| ENSMUST00000023684 | MI0005527 | hsa-miR-886-3p | CGCGGGUGCUUACUGACCCUU |
| ENSMUST00000023684 | MI0005562 | hsa-miR-887 | GUGAACGGGCGCCAUCCCGAGG |
| ENSMUST00000023684 | MI0005763 | hsa-miR-941 | CACCCGGCUGUGUGCACAUGUGC |
| ENSMUST00000023684 | MI0005764 | hsa-miR-941 | CACCCGGCUGUGUGCACAUGUGC |
| ENSMUST00000023684 | MI0005765 | hsa-miR-941 | CACCCGGCUGUGUGCACAUGUGC |
| ENSMUST00000023684 | MI0005766 | hsa-miR-941 | CACCCGGCUGUGUGCACAUGUGC |
| ENSMUST00000023684 | MI0000407 | mmu-miR-106b* | CCGCACUGUGGGUACUUGCUGC |
| ENSMUST00000023684 | MI0000150 | mmu-miR-124 | UAAGGCACGCGGUGAAUGCC |
| ENSMUST00000023684 | MI0000716 | mmu-miR-124 | UAAGGCACGCGGUGAAUGCC |
| ENSMUST00000023684 | MI0000717 | mmu-miR-124 | UAAGGCACGCGGUGAAUGCC |
| ENSMUST00000023684 | MI0000222 | mmu-miR-129-5p | CUUUUUGCGGUCUGGGCUUGC |
| ENSMUST00000023684 | MI0000585 | mmu-miR-129-5p | CUUUUUGCGGUCUGGGCUUGC |
| ENSMUST00000023684 | MI0000565 | mmu-miR-16* | CCAGUAUUGACUGUGCUGCUGA |
| ENSMUST00000023684 | MI0000244 | mmu-miR-201 | UACUCAGUAAGGCAUUGUUCUU |
| ENSMUST00000023684 | MI0000568 | mmu-miR-20a* | ACUGCAUUACGAGCACUUAAAG |
| ENSMUST00000023684 | MI0000698 | mmu-miR-214 | ACAGCAGGCACAGACAGGCAGU |
| ENSMUST00000023684 | MI0000974 | mmu-miR-215 | AUGACCUAUGAUUUGACAGAC |
| ENSMUST00000023684 | MI0000731 | mmu-miR-217 | UACUGCAUCAGGAACUGACUGGA |
| ENSMUST00000023684 | MI0000603 | mmu-miR-328 | CUGGCCCUCUCUGCCCUUCCGU |
| ENSMUST00000023684 | MI0000584 | mmu-miR-34a | UGGCAGUGUCUUAGCUGGUUGU |
| ENSMUST00000023684 | MI0001525 | mmu-miR-433 | AUCAUGAUGGGCUCCUCGGUGU |
| ENSMUST00000023684 | MI0002400 | mmu-miR-465a-3p | GAUCAGGGCCUUUCUAAGUAGA |
| ENSMUST00000023684 | MI0002400 | mmu-miR-465a-5p | UAUUUAGAAUGGCACUGAUGUGA |
| ENSMUST00000023684 | MI0005498 | mmu-miR-465b-5p | UAUUUAGAAUGGUGCUGAUCUG |
| ENSMUST00000023684 | MI0005499 | mmu-miR-465b-5p | UAUUUAGAAUGGUGCUGAUCUG |
| ENSMUST00000023684 | MI0005500 | mmu-miR-465c-5p | UAUUUAGAAUGGCGCUGAUCUG |
| ENSMUST00000023684 | MI0005501 | mmu-miR-465c-5p | UAUUUAGAAUGGCGCUGAUCUG |
| ENSMUST00000023684 | MI0002404 | mmu-miR-469 | UGCCUCUUUCAUUGAUCUUGGUGUCC |
| ENSMUST00000023684 | MI0003484 | mmu-miR-483* | UCACUCCUCCCCUCCCGUCUU |
| ENSMUST00000023684 | MI0004703 | mmu-miR-501-3p | AAUGCACCCGGGCAAGGAUUUG |
| ENSMUST00000023684 | MI0003538 | mmu-miR-503 | UAGCAGCGGGAACAGUACUGCAG |
| ENSMUST00000023684 | MI0003538 | mmu-miR-503* | GAGUAUUGUUUCCACUGCCUGG |
| ENSMUST00000023684 | MI0005554 | mmu-miR-511 | AUGCCUUUUGCUCUGCACUCA |
| ENSMUST00000023684 | MI0005519 | mmu-miR-590-5p | GAGCUUAUUCAUAAAAGUGCAG |
| ENSMUST00000023684 | MI0004965 | mmu-miR-652 | AAUGGCGCCACUAGGGUUGUG |
| ENSMUST00000023684 | MI0004611 | mmu-miR-674 | GCACUGAGAUGGGAGUGGUGUA |
| ENSMUST00000023684 | MI0004649 | mmu-miR-685 | UCAAUGGCUGAGGUGAGGCAC |
| ENSMUST00000023684 | MI0004660 | mmu-miR-692 | AUCUCUUUGAGCGCCUCACUC |
| ENSMUST00000023684 | MI0004661 | mmu-miR-692 | AUCUCUUUGAGCGCCUCACUC |
| ENSMUST00000023684 | MI0004698 | mmu-miR-713 | UGCACUGAAGGCACACAGC |
| ENSMUST00000023684 | MI0004651 | mmu-miR-719 | AUCUCGGCUACAGAAAAAUGUU |
| ENSMUST00000023684 | MI0004306 | mmu-miR-761 | GCAGCAGGGUGAAACUGACACA |
| ENSMUST00000023684 | MI0000583 | mmu-miR-96 | UUUGGCACUAGCACAUUUUUGCU |
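Several rows in the table above repeat the same mature miRNA name and sequence and differ only in the second column (the `MI…` accessions, which are miRBase identifiers for precursor hairpins). The following is a minimal sketch, assuming the table is available as pipe-delimited text, of how such rows could be grouped so that each mature sequence is listed once with all of its precursors. The function name `group_by_mature` and the `ROWS` sample are illustrative only; the sample rows are copied from the table.

```python
from collections import defaultdict

# A few rows copied verbatim from the table above (pipe-delimited).
ROWS = """\
| ENSMUST00000023684 | MI0005763 | hsa-miR-941 | CACCCGGCUGUGUGCACAUGUGC |
| ENSMUST00000023684 | MI0005764 | hsa-miR-941 | CACCCGGCUGUGUGCACAUGUGC |
| ENSMUST00000023684 | MI0000150 | mmu-miR-124 | UAAGGCACGCGGUGAAUGCC |
| ENSMUST00000023684 | MI0000716 | mmu-miR-124 | UAAGGCACGCGGUGAAUGCC |
| ENSMUST00000023684 | MI0000717 | mmu-miR-124 | UAAGGCACGCGGUGAAUGCC |
| ENSMUST00000023684 | MI0000583 | mmu-miR-96 | UUUGGCACUAGCACAUUUUUGCU |
"""


def group_by_mature(table_text):
    """Map (mature miRNA name, sequence) to a list of precursor accessions."""
    groups = defaultdict(list)
    for line in table_text.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip header, separator, and malformed lines.
        if len(cells) != 4 or not cells[1].startswith("MI"):
            continue
        _target, accession, name, sequence = cells
        groups[(name, sequence)].append(accession)
    return dict(groups)


groups = group_by_mature(ROWS)
for (name, sequence), accessions in groups.items():
    print(name, sequence, accessions)
```

Grouping on the (name, sequence) pair rather than on the name alone keeps entries apart if two rows ever share a name but carry different sequences.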
Selected Publications
Gene-related publications indexed at PubMed
- Evsikov AV, et al. (2006) "Cracking the egg: molecular dynamics and evolutionary aspects of the transition from the fully grown oocyte to embryo." Genes Dev. 20(19):2713-2727. PMID:17015433
- Carninci P, et al. (2005) "The transcriptional landscape of the mammalian genome." Science. 309(5740):1559-1563. PMID:16141072
- Katayama S, et al. (2005) "Antisense transcription in the mammalian transcriptome." Science. 309(5740):1564-1566. PMID:16141073
- Gerhard DS, et al. (2004) "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)." Genome Res. 14(10B):2121-2127. PMID:15489334
- Watahiki A, et al. (2004) "Libraries enriched for alternatively spliced exons reveal splicing patterns in melanocytes and melanomas." Nat Methods. 1(3):233-239. PMID:15782199
- Okazaki Y, et al. (2002) "Analysis of the mouse transcriptome based on functional annotation of 60,770 full-length cDNAs." Nature. 420(6915):563-573. PMID:12466851
- Reymond A, et al. (2002) "Human chromosome 21 gene expression atlas in the mouse." Nature. 420(6915):582-586. PMID:12466854
- Gitton Y, et al. (2002) "A gene expression map of human chromosome 21 orthologues in the mouse." Nature. 420(6915):586-590. PMID:12466855
- Strausberg RL, et al. (2002) "Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences." Proc Natl Acad Sci U S A. 99(26):16899-16903. PMID:12477932
- Kawai J, et al. (2001) "Functional annotation of a full-length mouse cDNA collection." Nature. 409(6821):685-690. PMID:11217851
- Tanaka TS, et al. (2000) "Genome-wide expression profiling of mid-gestation placenta and embryo using a 15,000 mouse developmental cDNA microarray." Proc Natl Acad Sci U S A. 97(16):9127-9132. PMID:10922068
- Carninci P, et al. (2000) "Normalization and subtraction of cap-trapper-selected cDNAs to prepare full-length cDNA libraries for rapid discovery of new genes." Genome Res. 10(10):1617-1630. PMID:11042159
- Shibata K, et al. (2000) "RIKEN integrated sequence analysis (RISA) system--384-format sequencing pipeline with 384 multicapillary sequencer." Genome Res. 10(11):1757-1771. PMID:11076861
- Carninci P, et al. (1999) "High-efficiency full-length cDNA cloning." Methods Enzymol. 303:19-44. PMID:10349636
- Reeves RH, et al. (1997) "High-resolution recombinational map of mouse chromosome 16." Genomics. 43(2):202-208. PMID:9244437
- Sanghani SP, et al. (1997) "Tight binding of folate substrates and inhibitors to recombinant mouse glycinamide ribonucleotide formyltransferase." Biochemistry. 36(34):10506-10516. PMID:9265631
- Kan JL, et al. (1995) "Analysis of a mouse gene encoding three steps of purine synthesis reveals use of an intronic polyadenylation signal without alternative exon usage." J Biol Chem. 270(4):1823-1832. PMID:7829519
- Kan JL, et al. (1993) "Mouse cDNAs encoding a trifunctional protein of de novo purine synthesis and a related single-domain glycinamide ribonucleotide synthetase." Gene. 137(2):195-202. PMID:8299947
- Cheng S, et al. (1993) "GART, SON, IFNAR, and CRF2-4 genes cluster on human chromosome 21 and mouse chromosome 16." Mamm Genome. 4(6):338-342. PMID:8318737
- Tsirka SE, et al. (1993) "Multiple active conformers of mouse ornithine decarboxylase." Biochem J. 293(Pt 1):289-295. PMID:8328969
- Mjaatvedt AE, et al. (1993) "High-resolution mapping of D16led-1, Gart, Gas-4, Cbr, Pcp-4, and Erg on distal mouse chromosome 16." Genomics. 17(2):382-386. PMID:8406490
- Avraham S, et al. (1992) "Negative and positive cis-acting elements in the promoter of the mouse gene that encodes the serine/glycine-rich peptide core of secretory granule proteoglycans." J Biol Chem. 267(1):610-617. PMID:1730621
- Threadgill DS, et al. (1991) "Mapping HSA 3 loci in cattle: additional support for the ancestral synteny of HSA 3 and 21." Genomics. 11(4):1143-1148. PMID:1783381
- Cox DR, et al. (1985) "Comparative gene mapping of human chromosome 21 and mouse chromosome 16." Ann N Y Acad Sci. 450:169-177. PMID:3160288
Fully grown oocytes (FGOs) contain all the necessary transcripts to activate molecular pathways underlying the oocyte-to-embryo transition (OET). To elucidate this critical period of development, an extensive survey of the FGO transcriptome was performed by analyzing 19,000 expressed sequence tags of the Mus musculus FGO cDNA library. Expression of 5400 genes and transposable elements is reported. For a majority of genes expressed in mouse FGOs, homologs transcribed in eggs of Xenopus laevis or Ciona intestinalis were found, pinpointing evolutionary conservation of most regulatory cascades underlying the OET in chordates. A large proportion of identified genes belongs to several gene families with oocyte-restricted expression, a likely result of lineage-specific genomic duplications. Gene loss by mutation and expression in female germline of retrotransposed genes specific to M. musculus is documented. These findings indicate rapid diversification of genes involved in female reproduction. Comparison of the FGO and two-cell embryo transcriptomes demarcated the processes important for oogenesis from those involved in OET and identified novel motifs in maternal mRNAs associated with transcript stability. Discovery of oocyte-specific eukaryotic translation initiation factor 4E distinguishes a novel system of translational regulation. These results implicate conserved pathways underlying transition from oogenesis to initiation of development and illustrate how genes acquire and lose reproductive functions during evolution, a potential mechanism for reproductive isolation.
This study describes comprehensive polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome. We identify the 5' and 3' boundaries of 181,047 transcripts with extensive variation in transcripts arising from alternative promoter usage, splicing, and polyadenylation. There are 16,247 new mouse protein-coding transcripts, including 5154 encoding previously unidentified proteins. Genomic mapping of the transcriptome reveals transcriptional forests, with overlapping transcription on both strands, separated by deserts in which few transcripts are observed. The data provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development.
Antisense transcription (transcription from the opposite strand to a protein-coding or sense strand) has been ascribed roles in gene regulation involving degradation of the corresponding sense transcripts (RNA interference), as well as gene silencing at the chromatin level. Global transcriptome analysis provides evidence that a large proportion of the genome can produce transcripts from both strands, and that antisense transcripts commonly link neighboring "genes" in complex loci into chains of linked transcriptional units. Expression profiling reveals frequent concordant regulation of sense/antisense pairs. We present experimental evidence that perturbation of an antisense RNA can alter the expression of sense messenger RNAs, suggesting that antisense transcription contributes to control of transcriptional outputs in mammals.
The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5'-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline.
It is becoming increasingly clear that alternative splicing enables the complex development and homeostasis of higher organisms. To gain a better understanding of how splicing contributes to regulatory pathways, we have developed an alternative splicing library approach for the identification of alternatively spliced exons and their flanking regions by alternative splicing sequence enriched tags sequencing. Here, we have applied our approach to mouse melan-c melanocyte and B16-F10Y melanoma cell lines, in which 5,401 genes were found to be alternatively spliced. These genes include those encoding important regulatory factors such as cyclin D2, Ilk, MAPK12, MAPK14, RAB4, melastatin 1 and previously unidentified splicing events for 436 genes. Real-time PCR further identified cell line-specific exons for Tmc6, Abi1, Sorbs1, Ndel1 and Snx16. Thus, the ASL approach proved effective in identifying splicing events, which suggest that alternative splicing is important in melanoma development.
Only a small proportion of the mouse genome is transcribed into mature messenger RNA transcripts. There is an international collaborative effort to identify all full-length mRNA transcripts from the mouse, and to ensure that each is represented in a physical collection of clones. Here we report the manual annotation of 60,770 full-length mouse complementary DNA sequences. These are clustered into 33,409 'transcriptional units', contributing 90.1% of a newly established mouse transcriptome database. Of these transcriptional units, 4,258 are new protein-coding and 11,665 are new non-coding messages, indicating that non-coding RNA is a major component of the transcriptome. 41% of all transcriptional units showed evidence of alternative splicing. In protein-coding transcripts, 79% of splice variations altered the protein product. Whole-transcriptome analyses resulted in the identification of 2,431 sense-antisense pairs. The present work, completely supported by physical clones, provides the most comprehensive survey of a mammalian transcriptome so far, and is a valuable resource for functional genomics.
Genome-wide expression analyses have a crucial role in functional genomics. High resolution methods, such as RNA in situ hybridization provide an accurate description of the spatiotemporal distribution of transcripts as well as a three-dimensional 'in vivo' gene expression overview. We set out to analyse systematically the expression patterns of genes from an entire chromosome. We chose human chromosome 21 because of the medical relevance of trisomy 21 (Down's syndrome). Here we show the expression analysis of all identifiable murine orthologues of human chromosome 21 genes (161 out of 178 confirmed human genes) by RNA in situ hybridization on whole mounts and tissue sections, and by polymerase chain reaction with reverse transcription on adult tissues. We observed patterned expression in several tissues including those affected in trisomy 21 phenotypes (that is, central nervous system, heart, gastrointestinal tract, and limbs). Furthermore, statistical analysis suggests the presence of some regions of the chromosome with genes showing either lack of expression or, to a lesser extent, co-expression in specific tissues. This high resolution expression 'atlas' of an entire human chromosome is an important step towards the understanding of gene function and of the pathogenetic mechanisms in Down's syndrome.
The DNA sequence of human chromosome 21 (HSA21) has opened the route for a systematic molecular characterization of all of its genes. Trisomy 21 is associated with Down's syndrome, the most common genetic cause of mental retardation in humans. The phenotype includes various organ dysmorphies, stereotypic craniofacial anomalies and brain malformations. Molecular analysis of congenital aneuploidies poses a particular challenge because the aneuploid region contains many protein-coding genes whose function is unknown. One essential step towards understanding their function is to analyse mRNA expression patterns at key stages of organism development. Seminal works in flies, frogs and mice showed that genes whose expression is restricted spatially and/or temporally are often linked with specific ontogenic processes. Here we describe expression profiles of mouse orthologues to HSA21 genes by a combination of large-scale mRNA in situ hybridization at critical stages of embryonic and brain development and in silico (computed) mining of expressed sequence tags. This chromosome-scale expression annotation associates many of the genes tested with a potential biological role and suggests candidates for the pathogenesis of Down's syndrome.
The National Institutes of Health Mammalian Gene Collection (MGC) Program is a multiinstitutional effort to identify and sequence a cDNA clone containing a complete ORF for each human and mouse gene. ESTs were generated from libraries enriched for full-length cDNAs and analyzed to identify candidate full-ORF clones, which then were sequenced to high accuracy. The MGC has currently sequenced and verified the full ORF for a nonredundant set of >9,000 human and >6,000 mouse genes. Candidate full-ORF clones for an additional 7,800 human and 3,500 mouse genes also have been identified. All MGC sequences and clones are available without restriction through public databases and clone distribution networks (see http://mgc.nci.nih.gov).
The RIKEN Mouse Gene Encyclopaedia Project, a systematic approach to determining the full coding potential of the mouse genome, involves collection and sequencing of full-length complementary DNAs and physical mapping of the corresponding genes to the mouse genome. We organized an international functional annotation meeting (FANTOM) to annotate the first 21,076 cDNAs to be analysed in this project. Here we describe the first RIKEN clone collection, which is one of the largest described for any organism. Analysis of these cDNAs extends known gene families and identifies new ones.
cDNA microarray technology has been increasingly used to monitor global gene expression patterns in various tissues and cell types. However, applications to mammalian development have been hampered by the lack of appropriate cDNA collections, particularly for early developmental stages. To overcome this problem, a PCR-based cDNA library construction method was used to derive 52,374 expressed sequence tags from pre- and peri-implantation embryos, embryonic day (E) 12.5 female gonad/mesonephros, and newborn ovary. From these cDNA collections, a microarray representing 15,264 unique genes (78% novel and 22% known) was assembled. In initial applications, the divergence of placental and embryonic gene expression profiles was assessed. At stage E12.5 of development, based on triplicate experiments, 720 genes (6.5%) displayed statistically significant differences in expression between placenta and embryo. Among 289 more highly expressed in placenta, 61 placenta-specific genes encoded, for example, a novel prolactin-like protein. The number of genes highly expressed (and frequently specific) for placenta has thereby been increased 5-fold over the total previously reported, illustrating the potential of the microarrays for tissue-specific gene discovery and analysis of mammalian developmental programs.
In the effort to prepare the mouse full-length cDNA encyclopedia, we previously developed several techniques to prepare and select full-length cDNAs. To increase the number of different cDNAs, we introduce here a strategy to prepare normalized and subtracted cDNA libraries in a single step. The method is based on hybridization of the first-strand, full-length cDNA with several RNA drivers, including starting mRNA as the normalizing driver and run-off transcripts from minilibraries containing highly expressed genes, rearrayed clones, and previously sequenced cDNAs as subtracting drivers. Our method keeps the proportion of full-length cDNAs in the subtracted/normalized library high. Moreover, our method dramatically enhances the discovery of new genes as compared to results obtained by using standard, full-length cDNA libraries. This procedure can be extended to the preparation of full-length cDNA encyclopedias from other organisms.
The RIKEN high-throughput 384-format sequencing pipeline (RISA system) including a 384-multicapillary sequencer (the so-called RISA sequencer) was developed for the RIKEN mouse encyclopedia project. The RISA system consists of colony picking, template preparation, sequencing reaction, and the sequencing process. A novel high-throughput 384-format capillary sequencer system (RISA sequencer system) was developed for the sequencing process. This system consists of a 384-multicapillary auto sequencer (RISA sequencer), a 384-multicapillary array assembler (CAS), and a 384-multicapillary casting device. The RISA sequencer can simultaneously analyze 384 independent sequencing products. The optical system is a scanning system chosen after careful comparison with an image detection system for the simultaneous detection of the 384-capillary array. This scanning system can be used with any fluorescent-labeled sequencing reaction (chain termination reaction), including transcriptional sequencing based on RNA polymerase, which was originally developed by us, and cycle sequencing based on thermostable DNA polymerase. For long-read sequencing, 380 out of 384 sequences (99.2%) were successfully analyzed and the average read length, with more than 99% accuracy, was 654.4 bp. A single RISA sequencer can analyze 216 kb with >99% accuracy in 2.7 h (90 kb/h). For short-read sequencing to cluster the 3' end and 5' end sequencing by reading 350 bp, 384 samples can be analyzed in 1.5 h. We have also developed a RISA inoculator, RISA filtrator and densitometer, RISA plasmid preparator which can handle throughput of 40,000 samples in 17.5 h, and a high-throughput RISA thermal cycler which has four 384-well sites. The combination of these technologies allowed us to construct the RISA system consisting of 16 RISA sequencers, which can process 50,000 DNA samples per day. One haploid genome shotgun sequence of a higher organism, such as human, mouse, rat, domestic animals, and plants, can be revealed by seven RISA systems within one month.
Five intersubspecific backcrosses and an intercross were used to establish a sex-averaged recombinational map spanning 56 cM across most of mouse Chromosome 16 (Chr 16). A total of 123 markers were ordered using an interval mapping approach to identify 425 recombination sites in a collection of 1154 meioses from 1155 progeny generated in the six crosses. The markers include the 10 "classic" Chr 16 reference markers, 26 additional genes or transcripts including two phenotypic markers (Pit1dw and Kcnj6wv), and 87 simple sequence length polymorphisms (SSLPs). One set of monozygotic twins was detected among the 304 meioses mapped to highest resolution. The reference markers and SSLPs allow the map to be well integrated with existing maps of Chr 16. The average distance between crossover sites is less than 500 kb for most chromosomes, making this collection of recombinant chromosomes useful as a binning and ordering resource for YAC-based physical map assembly on Chr 16.
The binding of the prototypical folate inhibitor of de novo purine synthesis, 5,10-dideazatetrahydrofolate (DDATHF), and its hexaglutamate to recombinant trifunctional mouse glycinamide ribonucleotide formyltransferase (rmGARFT) was studied by equilibrium dialysis and by steady-state kinetics using sensitive assays that allowed initial rate calculations. rmGARFT was expressed in insect cells infected with a recombinant baculovirus and purified by a two-step procedure that allowed production of about 25 mg of pure protein/L of culture. The binding of DDATHF to GARFT was approximately 50-fold tighter than previously reported, with Kd and Ki values of 2-9 nM, making the parent form of this antifolate a tight-binding inhibitor. The binding of the hexaglutamate of DDATHF to rmGARFT had Kd and Ki values of 0.1-0.3 nM, consistent with the view that polyglutamation enhances binding of antifolates to GARFT. Kinetic analyses using either mono- or hexaglutamate substrate did not yield different values for the Ki for the hexaglutamate form of DDATHF, in contradiction with previous reports. Both the folate substrate commonly used to study GARFT, 10-formyl-5,8-dideazafolate, and its hexaglutamate were found to have very low Km values, namely, 75 and 7.4 nM, respectively, and the folate reaction products for these substrates were equally potent inhibitors, results which modify the interpretation of previous kinetic experiments. The product analog DDATHF and beta-glycinamide ribonucleotide bound to enzyme equally well in the presence and absence of the other, an observation at variance with the concept that GARFT obeys an ordered sequential binding of the substrates. We conclude that the kinetics of mouse GARFT are most consistent with a random order of substrate binding, that both the inhibitor DDATHF and the folate substrate are tight-binding ligands, and that polyglutamate forms enhance the affinity of both substrate and inhibitor by an order of magnitude.
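The closing claim of the abstract above, that polyglutamation raises affinity by roughly an order of magnitude, can be checked against the values it reports. The sketch below is simple arithmetic on those numbers and is not the authors' analysis. The variable and function names are illustrative, and treating Km as a rough stand-in for apparent substrate affinity is an assumption made here for the comparison.

```python
# All values in nM, copied from the abstract above.
KM_MONO = 75.0        # Km, 10-formyl-5,8-dideazafolate
KM_HEXA = 7.4         # Km, its hexaglutamate
KI_MONO = (2.0, 9.0)  # reported Kd/Ki range, DDATHF
KI_HEXA = (0.1, 0.3)  # reported Kd/Ki range, DDATHF hexaglutamate


def fold_change(mono, hexa):
    """Ratio of the monoglutamate constant to the hexaglutamate constant."""
    return mono / hexa


# Substrate: a single ratio from the two reported Km values.
substrate_fold = fold_change(KM_MONO, KM_HEXA)

# Inhibitor: the ranges only bound the ratio, so compute both extremes.
inhibitor_fold_low = fold_change(KI_MONO[0], KI_HEXA[1])   # 2 / 0.3
inhibitor_fold_high = fold_change(KI_MONO[1], KI_HEXA[0])  # 9 / 0.1

print(f"substrate: ~{substrate_fold:.1f}-fold")
print(f"inhibitor: ~{inhibitor_fold_low:.1f}- to {inhibitor_fold_high:.0f}-fold")
```

The substrate ratio comes out close to tenfold, and the inhibitor ratio is bounded between roughly 7-fold and 90-fold, so both are compatible with the "order of magnitude" wording.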
A single mouse genomic locus encodes proteins catalyzing three steps of purine synthesis, glycinamide ribonucleotide synthetase (GARS), aminoimidazole ribonucleotide synthetase (AIRS), and glycinamide ribonucleotide formyltransferase (GART). This gene has 22 exons and spans 28 kilobases. The existence of a second genetic locus and closely related pseudogenes was ruled out by Southern analysis. Mouse tissues express two related classes of messages encoded by this single locus: a trifunctional GARS-AIRS-GART mRNA and a monofunctional GARS mRNA. These transcripts used the same set of multiple transcriptional start sites, and both used the same first 10 exons. CCAAT and TATA elements were not found for this locus. Exon 11, which represented the last coding sequence of the GARS domain, was differentially utilized for the two messages. The trifunctional mRNA was generated by splicing exon 11 to exon 12, the first coding sequence for the AIRS domain with subsequent use of a polyadenylation signal at the end of exon 22. Genomic sequence corresponding to the 3'-UTR of the monofunctional GARS mRNA was contiguous with exon 11, so that the smaller message arose from the recognition of one of the multiple polyadenylation signals present within the intron between exons 11 and 12. Hence, polyadenylation of the primary transcript at a position corresponding to an intron of the genomic locus was responsible for the generation of the monofunctional GARS class of mRNAs. This utilization of an intronic polyadenylation site without alternative exon usage is comparable to the mechanism whereby both secreted and membrane-bound forms of the immunoglobulin mu heavy chain are made from a single genetic locus.
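The exon usage described in the abstract above can be summarized as a small data model. This is an illustrative sketch only: `Transcript` and `build_transcript` are hypothetical names, and the exon ranges simply restate the abstract (22 exons in total; the trifunctional message is polyadenylated at the end of exon 22; the monofunctional GARS message runs through exon 11 and is polyadenylated within the following intron).

```python
from dataclasses import dataclass

TOTAL_EXONS = 22  # stated in the abstract


@dataclass(frozen=True)
class Transcript:
    name: str
    exons: tuple       # exon numbers contributing to the message
    polya_site: str    # where the polyadenylation signal is used
    domains: tuple     # enzymatic domains encoded


def build_transcript(kind):
    """Return the exon layout for one of the two message classes."""
    if kind == "trifunctional":
        return Transcript(
            name="GARS-AIRS-GART",
            exons=tuple(range(1, TOTAL_EXONS + 1)),
            polya_site="end of exon 22",
            domains=("GARS", "AIRS", "GART"),
        )
    if kind == "monofunctional":
        # Exon 11 is not spliced to exon 12 here; its 3'-UTR continues
        # into the adjacent intron, where an intronic signal is used.
        return Transcript(
            name="GARS",
            exons=tuple(range(1, 12)),
            polya_site="intron between exons 11 and 12",
            domains=("GARS",),
        )
    raise ValueError(f"unknown transcript kind: {kind!r}")


tri = build_transcript("trifunctional")
mono = build_transcript("monofunctional")
print(tri.name, len(tri.exons), tri.polya_site)
print(mono.name, len(mono.exons), mono.polya_site)
```

Both messages share exons 1 to 10 unchanged; only the handling of exon 11 (spliced onward versus read through into the intron) distinguishes them, which is why no alternative exon is needed.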
Three of the enzymatic activities of de novo purine synthesis, glycinamide ribonucleotide synthetase (GARS), aminoimidazole ribonucleotide synthetase (AIRS) and glycinamide ribonucleotide formyltransferase (GART), can be catalyzed by a single 110-kDa protein in mouse cells. Western blots using a polyclonal antibody (Ab) to this protein identified two species, the trifunctional 110-kDa protein and a 50-kDa cytosolic protein with GARS, but not GART activity. We used Ab and, subsequently, oligodeoxyribonucleotide screens to isolate cDNAs corresponding to these two proteins from mouse T-cell cDNA expression libraries. The sequence of one class of these cDNAs and the partial sequence of a corresponding genomic clone defined an open reading frame (ORF) encoding a 1010-amino-acid (aa) protein, individual domains of which showed high homology to each of the monofunctional bacterial GARS, AIRS and GART proteins, and to each domain of chicken and human trifunctional GARS-AIRS-GARTs. cDNAs corresponding to the smaller protein contained a 1.3-kb ORF with complete identity to the GARS domain of, but with a 3' untranslated region different from, the trifunctional cDNAs. Hence, both cDNAs appear to derive from the same gene due to either differential splicing or use of an intronic polyadenylation signal. The functional requirement for the expression of both trifunctional protein with GARS activity and monofunctional, catalytically active GARS is unknown.
Purified recombinant mouse ornithine decarboxylase (ODC) was denatured with urea or with guanidinium chloride. Enzymic activity was efficiently recovered upon dilution of the denaturing agent. ODC renatured after urea treatment was further characterized. Kinetics of decarboxylation of the natural substrate ornithine or of the suicide substrate alpha-difluoromethylornithine (DFMO) were not significantly changed by denaturation/renaturation. Surprisingly, the renatured enzyme was not stably labelled with radioactive DFMO, in contrast with the native enzyme not subjected to denaturation. Native and renatured ODC did not differ in their circular dichroism (c.d.) spectra, but the former contained four reactive cysteine residues and the latter seven. These data indicate that a conformational change results from denaturation/renaturation that does not alter decarboxylation of substrates, but does change the accessibility or stability of the cysteine-360 residue modified by decarboxylated DFMO.
More than 500 backcross progeny from four intersubspecific backcrosses were typed for six markers on distal mouse chromosome 16. Five of these represented genes that mapped within the Sod-1 to Ets-2 interval, which was shown previously to contain the weaver (wv) gene. The map order, including previously mapped reference markers, is (cen)-D16H21S16-D16Led-1-App-Sod-1-Gart-Gas-4-Cbr-wv-Pcp-4-Erg-Ets-2. This gene order recapitulates the order of the genes on human chromosome 21 where known. Two of these markers further define the region containing the weaver gene to a 3.9-cM segment between Cbr and Pcp-4. In addition, Pcp-4 was localized to human chromosome 21 by the presence of a human-specific restriction fragment in WAV-17, a mouse-human somatic cell hybrid with human chromosome 21 as the only human contribution.
The gene that encodes a proteoglycan peptide core rich in serine and glycine (SG-PG) is selectively expressed by hematopoietic cells that store in their cytoplasmic granules negatively charged proteoglycans bound ionically to numerous positively charged proteins. With deletion analysis, a negative transcription regulatory element was located between residues -250 and -190 of the 5'-flanking region of the mouse SG-PG gene, and a positive regulatory element was located between residues -118 and -81. The negative regulatory element was dominantly active in fibroblasts that do not express the SG-PG gene whereas the positive regulatory element was dominantly active in hematopoietic cells that do express the SG-PG gene. Site-directed mutagenesis was used to demonstrate that the proximal element within the gene's atypical promoter resided between residues -40 and -20. As assessed by gel mobility shift analyses, the nuclei of rat basophilic leukemia-1 cells and rat-1 fibroblasts contain a number of trans-acting factors that interact with the positive and negative cis-acting regulatory elements of the SG-PG gene. Furthermore, some of these trans-acting factors appear to be different for the two cell types. These studies on cell types that do and do not express the SG-PG gene indicate that transcription of this proteoglycan peptide core gene is regulated constitutively by both positive and negative cis-acting elements located 5' of an atypical promoter.
Homologs to genes residing on human chromosome 3 (HSA 3) map to four mouse chromosomes (MMU) 3, 6, 9, and 16. In the bovine, two syntenic groups that contain HSA 3 homologs, unassigned syntenic groups 10 (U10) and 12 (U12), have been defined. U10 also contains HSA 21 genes, which is similar to the situation seen on MMU 16, whereas U12 apparently contains only HSA 3 homologs. The syntenic arrangement of other HSA 3 homologs in the bovine was investigated by physically mapping five genes through segregation analysis of a bovine-hamster hybrid somatic cell panel. The genes mapped include Friend-murine leukemia virus integration site 3 homolog (FIM3; HSA 3/MMU 3), sucrase-isomaltase (SI) and glutathione peroxidase 1 (GPX1) (HSA 3/MMU ?), murine leukemia viral (v-raf-1) oncogene homolog 1 (RAF1; HSA 3/MMU 6), and ceruloplasmin (CP; HSA 3/MMU 9). FIM3, SI, and CP mapped to bovine syntenic group U10, while RAF1 and GPX1 mapped to U12.