EIF4E1B | GeneID:253314 | Homo sapiens
NCBI Entrez Gene
|Gene ID||253314||Official Symbol||EIF4E1B|
|Full Name||eukaryotic translation initiation factor 4E family member 1B|
|Description||eukaryotic translation initiation factor 4E family member 1B|
|Also Known As||Eukaryotic translation initiation factor 4E type 1B|
Orthologs and Paralogs
|GeneID:489099||EIF4E1B||XP_546215.2||Canis lupus familiaris|
- Kimura K, et al. (2006) "Diversification of transcriptional modulation: large-scale identification and characterization of putative alternative promoters of human genes." Genome Res. 16(1):55-65. PMID:16344560
- Joshi B, et al. (2005) "Phylogenetic analysis of eIF4E-family members." BMC Evol Biol. 5:48. PMID:16191198
- Ota T, et al. (2004) "Complete sequencing and characterization of 21,243 full-length human cDNAs." Nat Genet. 36(1):40-45. PMID:14702039
- Schmutz J, et al. (2004) "The DNA sequence and comparative analysis of human chromosome 5." Nature. 431(7006):268-274. PMID:15372022
By analyzing 1,780,295 5'-end sequences of human full-length cDNAs derived from 164 kinds of oligo-cap cDNA libraries, we identified 269,774 independent positions of transcriptional start sites (TSSs) for 14,628 human RefSeq genes. These TSSs were clustered into 30,964 clusters that were separated from each other by more than 500 bp and thus are very likely to constitute mutually distinct alternative promoters. To our surprise, at least 7674 (52%) human RefSeq genes were subject to regulation by putative alternative promoters (PAPs). On average, there were 3.1 PAPs per gene, with the composition of one CpG-island-containing promoter per 2.6 CpG-less promoters. In 17% of the PAP-containing loci, tissue-specific use of the PAPs was observed. The richest tissue sources of the tissue-specific PAPs were testis and brain. It was also intriguing that the PAP-containing promoters were enriched in the genes encoding signal transduction-related proteins and were rarer in the genes encoding extracellular proteins, possibly reflecting the varied functional requirement for and the restricted expression of those categories of genes, respectively. The patterns of the first exons were highly diverse as well. On average, there were 7.7 different splicing types of first exons per locus partly produced by the PAPs, suggesting that a wide variety of transcripts can be achieved by this mechanism. Our findings suggest that use of alternate promoters and consequent alternative use of first exons should play a pivotal role in generating the complexity required for the highly elaborated molecular systems in humans.
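The clustering step described in the abstract above (TSS positions grouped so that clusters lie more than 500 bp apart) can be illustrated with a minimal sketch. It assumes simple gap-based grouping of sorted positions on a single strand of a single chromosome; the function name `cluster_tss` and the `min_gap` parameter are hypothetical, and the procedure actually used by Kimura et al. may differ in detail.

```python
def cluster_tss(positions, min_gap=500):
    """Group TSS positions into clusters separated by more than min_gap bp.

    Illustrative assumption: a new cluster starts whenever the distance to
    the previous position exceeds min_gap. Not the authors' published code.
    """
    clusters = []
    for pos in sorted(positions):
        if clusters and pos - clusters[-1][-1] <= min_gap:
            # Close enough to the previous TSS: same putative promoter.
            clusters[-1].append(pos)
        else:
            # Gap larger than min_gap: start a new putative promoter.
            clusters.append([pos])
    return clusters


example = cluster_tss([100, 150, 700, 1300, 1350])
print(example)
```

Under this assumption, each resulting cluster corresponds to one putative alternative promoter (PAP) in the sense used in the abstract.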
BACKGROUND: Translation initiation in eukaryotes involves the recruitment of mRNA to the ribosome, which is controlled by the translation factor eIF4E. eIF4E binds to the 5'-m7Gppp cap-structure of mRNA. Three-dimensional structures of eIF4Es bound to cap-analogues resemble 'cupped-hands' in which the cap-structure is sandwiched between two conserved Trp residues (Trp-56 and Trp-102 of H. sapiens eIF4E). A third conserved Trp residue (Trp-166 of H. sapiens eIF4E) recognizes the 7-methyl moiety of the cap-structure. Assessment of GenBank NR and dbEST databases reveals that many organisms encode a number of proteins with homology to eIF4E. Little is understood about the relationships of these structurally related proteins to each other. RESULTS: By combining sequence data deposited in the GenBank databases, we have identified sequences encoding 411 eIF4E-family members from 230 species. These sequences have been deposited into an internet-accessible database designed for sequence comparisons of eIF4E-family members. Most members can be grouped into one of three classes. Class I members carry Trp residues equivalent to Trp-43 and Trp-56 of H. sapiens eIF4E and appear to be present in all eukaryotes. Class II members possess Trp-->Tyr/Phe/Leu and Trp-->Tyr/Phe substitutions relative to Trp-43 and Trp-56 of H. sapiens eIF4E, respectively, and can be identified in Metazoa, Viridiplantae, and Fungi. Class III members possess a Trp residue equivalent to Trp-43 of H. sapiens eIF4E but carry a Trp-->Cys/Tyr substitution relative to Trp-56 of H. sapiens eIF4E, and can be identified in Coelomata and Cnidaria. Some eIF4E-family members from Protista show extension or compaction relative to prototypical eIF4E-family members. CONCLUSION: The expansion of sequenced cDNAs and genomic DNAs from all eukaryotic kingdoms has revealed a variety of proteins related in structure to eIF4E. Evolutionarily it seems that a single early eIF4E gene has undergone multiple gene duplications generating multiple structural classes, such that it is no longer possible to predict function from the primary amino acid sequence of an eIF4E-family member. The variety of eIF4E-family members provides a source of alternatives on the eIF4E structural theme that will benefit structure/function analyses and therapeutic drug design.
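The three-class scheme in the abstract above reduces to a rule on two residues. The sketch below encodes that rule, assuming the caller has already aligned the sequence to H. sapiens eIF4E and extracted the one-letter residues at the positions equivalent to Trp-43 and Trp-56. The function name `classify_eif4e` and the "unclassified" fallback are hypothetical and are not part of the published database.

```python
def classify_eif4e(res43: str, res56: str) -> str:
    """Assign an eIF4E-family class from two aligned residues.

    res43, res56: one-letter amino acid codes at the positions equivalent
    to Trp-43 and Trp-56 of H. sapiens eIF4E (alignment is assumed to have
    been done elsewhere). Rules follow the abstract of Joshi et al. (2005).
    """
    res43, res56 = res43.upper(), res56.upper()
    if res43 == "W" and res56 == "W":
        return "Class I"      # both Trp residues conserved
    if res43 in {"Y", "F", "L"} and res56 in {"Y", "F"}:
        return "Class II"     # Trp-43 -> Tyr/Phe/Leu, Trp-56 -> Tyr/Phe
    if res43 == "W" and res56 in {"C", "Y"}:
        return "Class III"    # Trp-43 kept, Trp-56 -> Cys/Tyr
    return "unclassified"     # hypothetical fallback for other patterns


print(classify_eif4e("W", "W"))
print(classify_eif4e("Y", "F"))
print(classify_eif4e("W", "C"))
```

The fallback reflects the abstract's wording that most, not all, members fit one of the three classes.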
As a base for human transcriptome and functional genomics, we created the "full-length long Japan" (FLJ) collection of sequenced human cDNAs. We determined the entire sequence of 21,243 selected clones and found that 14,490 cDNAs (10,897 clusters) were unique to the FLJ collection. About half of them (5,416) seemed to be protein-coding. Of those, 1,999 clusters had not been predicted by computational methods. The distribution of GC content of nonpredicted cDNAs had a peak at approximately 58% compared with a peak at approximately 42% for predicted cDNAs. Thus, there seems to be a slight bias against GC-rich transcripts in current gene prediction procedures. The rest of the cDNAs unique to the FLJ collection (5,481) contained no obvious open reading frames (ORFs) and thus are candidate noncoding RNAs. About one-fourth of them (1,378) showed a clear pattern of splicing. The distribution of GC content of noncoding cDNAs was narrow and had a peak at approximately 42%, relatively low compared with that of protein-coding cDNAs.
Chromosome 5 is one of the largest human chromosomes and contains numerous intrachromosomal duplications, yet it has one of the lowest gene densities. This is partially explained by numerous gene-poor regions that display a remarkable degree of noncoding conservation with non-mammalian vertebrates, suggesting that they are functionally constrained. In total, we compiled 177.7 million base pairs of highly accurate finished sequence containing 923 manually curated protein-coding genes including the protocadherin and interleukin gene families. We also completely sequenced versions of the large chromosome-5-specific internal duplications. These duplications are very recent evolutionary events and probably have a mechanistic role in human physiological variation, as deletions in these regions are the cause of debilitating disorders including spinal muscular atrophy.