C16orf42 | GeneID:115939 | Homo sapiens
Gene Summary
NCBI Entrez Gene
| Field | Value |
|---|---|
| Gene ID | 115939 |
| Official Symbol | C16orf42 |
| Locus | N/A |
| Gene Type | protein-coding |
| Synonyms | MGC24381 |
| Full Name | chromosome 16 open reading frame 42 |
| Description | chromosome 16 open reading frame 42 |
| Chromosome | 16p13.3 |
| Also Known As | hypothetical protein LOC115939 |
| Summary | N/A |
Orthologs and Paralogs
Homologs - NCBI's HomoloGene Group: 6922
| ID | Symbol | Protein | Species |
|---|---|---|---|
| GeneID:41849 | CG4338 | NP_650441.1 | Drosophila melanogaster |
| GeneID:68327 | 0610007P22Rik | NP_080952.1 | Mus musculus |
| GeneID:115939 | C16orf42 | NP_001001410.1 | Homo sapiens |
| GeneID:176995 | F52C12.2 | NP_741332.1 | Caenorhabditis elegans |
| GeneID:360494 | RGD1565744 | XP_340769.3 | Rattus norvegicus |
| GeneID:416590 | C16orf42 | XP_414890.1 | Gallus gallus |
| GeneID:436643 | zgc:92086 | NP_001002370.1 | Danio rerio |
| GeneID:479890 | LOC479890 | XP_537015.1 | Canis lupus familiaris |
| GeneID:508714 | C25H16ORF42 | XP_585533.1 | Bos taurus |
| GeneID:830871 | AT5G10070 | NP_974761.1 | Arabidopsis thaliana |
| GeneID:854167 | TSR3 | NP_014648.1 | Saccharomyces cerevisiae |
| GeneID:1277553 | AgaP_AGAP008424 | XP_317021.2 | Anopheles gambiae |
| GeneID:2542142 | SPAC1F3.04c | NP_593007.1 | Schizosaccharomyces pombe |
| GeneID:2674813 | MGG_00851 | XP_368393.1 | Magnaporthe grisea |
| GeneID:2707891 | NCU06104.1 | XP_325959.1 | Neurospora crassa |
| GeneID:2892859 | KLLA0D01111g | XP_453120.1 | Kluyveromyces lactis |
| GeneID:4331931 | Os03g0195200 | NP_001049256.1 | Oryza sativa |
| GeneID:4619535 | AGOS_ACR007W | NP_983410.1 | Eremothecium gossypii |
Gene Classification
Gene Ontology
| ID | Category | GO Term |
|---|---|---|
| GO:0016021 | Component | integral to membrane |
| GO:0016020 | Component | membrane |
RefSeq Isoforms
RefSeq Annotation and UniProt Database
| No. | RefSeq RNA | RefSeq Protein | UniProt Equivalent |
|---|---|---|---|
| 1 | NM_001001410 | NP_001001410 | |
MicroRNA and Targets
MicroRNA Sequences and Transcript Targets from miRBase at Sanger
| RNA Target | miRBase Accession | Mature miRNA | Mature miRNA Sequence |
|---|---|---|---|
| ENST00000007390 | MI0000060 | hsa-let-7a | UGAGGUAGUAGGUUGUAUAGUU |
| ENST00000007390 | MI0000061 | hsa-let-7a | UGAGGUAGUAGGUUGUAUAGUU |
| ENST00000007390 | MI0000062 | hsa-let-7a | UGAGGUAGUAGGUUGUAUAGUU |
| ENST00000007390 | MI0000064 | hsa-let-7c | UGAGGUAGUAGGUUGUAUGGUU |
| ENST00000007390 | MI0000065 | hsa-let-7d | AGAGGUAGUAGGUUGCAUAGUU |
| ENST00000007390 | MI0000066 | hsa-let-7e | UGAGGUAGGAGGUUGUAUAGUU |
| ENST00000007390 | MI0000067 | hsa-let-7f | UGAGGUAGUAGAUUGUAUAGUU |
| ENST00000007390 | MI0000068 | hsa-let-7f | UGAGGUAGUAGAUUGUAUAGUU |
| ENST00000007390 | MI0000433 | hsa-let-7g | UGAGGUAGUAGUUUGUACAGUU |
| ENST00000007390 | MI0000450 | hsa-miR-133a | UUUGGUCCCCUUCAACCAGCUG |
| ENST00000007390 | MI0000451 | hsa-miR-133a | UUUGGUCCCCUUCAACCAGCUG |
| ENST00000007390 | MI0003129 | hsa-miR-146b-5p | UGAGAACUGAAUUCCAUAGGCU |
| ENST00000007390 | MI0000285 | hsa-miR-205 | UCCUUCAUUCCACCGGAGUCUG |
| ENST00000007390 | MI0000296 | hsa-miR-219-1-3p | AGAGUUGAGUCUGGACGUCCCG |
| ENST00000007390 | MI0005536 | hsa-miR-220c | ACACAGGGCUGUUGUGAAGACU |
| ENST00000007390 | MI0000299 | hsa-miR-222* | CUCAGUAGCCAGUGUAGAUCCU |
| ENST00000007390 | MI0000080 | hsa-miR-24 | UGGCUCAGUUCAGCAGGAACAG |
| ENST00000007390 | MI0000081 | hsa-miR-24 | UGGCUCAGUUCAGCAGGAACAG |
| ENST00000007390 | MI0000089 | hsa-miR-31 | AGGCAAGAUGCUGGCAUAGCU |
| ENST00000007390 | MI0000815 | hsa-miR-339-3p | UGAGCGCCUCGACGACAGAGCCG |
| ENST00000007390 | MI0000781 | hsa-miR-373* | ACUCAAAAUGGGGGCGCUUUCC |
| ENST00000007390 | MI0001445 | hsa-miR-423-3p | AGCUCGGUCUGAGGCCCCUCAGU |
| ENST00000007390 | MI0001721 | hsa-miR-431 | UGUCUUGCAGGCCGUCAUGCA |
| ENST00000007390 | MI0001652 | hsa-miR-450a | UUUUGCGAUGUGUUCCUAAUAU |
| ENST00000007390 | MI0003187 | hsa-miR-450a | UUUUGCGAUGUGUUCCUAAUAU |
| ENST00000007390 | MI0002468 | hsa-miR-484 | UCAGGCUCAGUCCCCUCCCGAU |
| ENST00000007390 | MI0002469 | hsa-miR-485-5p | AGAGGCUGGCCGUGAUGAAUUC |
| ENST00000007390 | MI0002470 | hsa-miR-486-3p | CGGGGCAGCUCAGUACAGGAU |
| ENST00000007390 | MI0003125 | hsa-miR-490-3p | CAACCUGGAGGACUCCAUGCUG |
| ENST00000007390 | MI0003126 | hsa-miR-491-5p | AGUGGGGAACCCUUCCAUGAGG |
| ENST00000007390 | MI0003127 | hsa-miR-511 | GUGUCUUUUGCUCUGCAGUCA |
| ENST00000007390 | MI0003128 | hsa-miR-511 | GUGUCUUUUGCUCUGCAGUCA |
| ENST00000007390 | MI0003140 | hsa-miR-512-5p | CACUCAGCCUUGAGGGCACUUUC |
| ENST00000007390 | MI0003141 | hsa-miR-512-5p | CACUCAGCCUUGAGGGCACUUUC |
| ENST00000007390 | MI0003161 | hsa-miR-517* | CCUCUAGAUGGAAGCACUGUCU |
| ENST00000007390 | MI0003165 | hsa-miR-517* | CCUCUAGAUGGAAGCACUGUCU |
| ENST00000007390 | MI0003174 | hsa-miR-517* | CCUCUAGAUGGAAGCACUGUCU |
| ENST00000007390 | MI0003170 | hsa-miR-518a-5p | CUGCAAAGGGAAGCCCUUUC |
| ENST00000007390 | MI0003173 | hsa-miR-518a-5p | CUGCAAAGGGAAGCCCUUUC |
| ENST00000007390 | MI0003596 | hsa-miR-548b-3p | CAAGAACCUCAGUUGCUUUUGU |
| ENST00000007390 | MI0003585 | hsa-miR-578 | CUUCUUGUGCUCUAGGAUUGU |
| ENST00000007390 | MI0003586 | hsa-miR-579 | UUCAUUUGGUAUAAACCGCGAUU |
| ENST00000007390 | MI0003588 | hsa-miR-581 | UCUUGUGUUCUCUAGAUCAGU |
| ENST00000007390 | MI0003599 | hsa-miR-589 | UGAGAACCACGUCUGCUCUGAG |
| ENST00000007390 | MI0003605 | hsa-miR-593 | UGUCUCUGCUGGGGUUUCU |
| ENST00000007390 | MI0003607 | hsa-miR-595 | GAAGUGUGCCGUGGUGUGUCU |
| ENST00000007390 | MI0003615 | hsa-miR-602 | GACACGGGCGACAGCUGCGGCCC |
| ENST00000007390 | MI0003629 | hsa-miR-616* | ACUCAAAACCCUUCAGUGACUU |
| ENST00000007390 | MI0003633 | hsa-miR-619 | GACCUGGACAUGUUUGUGCCCAGU |
| ENST00000007390 | MI0003635 | hsa-miR-621 | GGCUAGCAACAGCGCUUACCU |
| ENST00000007390 | MI0003647 | hsa-miR-632 | GUGUCUGCUUCCUGUGGGA |
| ENST00000007390 | MI0003665 | hsa-miR-650 | AGGAGGCAGCGCUCUCAGGAC |
| ENST00000007390 | MI0003672 | hsa-miR-663 | AGGCGGGGCGCCGCGGGACCGC |
| ENST00000007390 | MI0003761 | hsa-miR-668 | UGUCACUCGGCUCGGCCCACUAC |
| ENST00000007390 | MI0005541 | hsa-miR-875-5p | UAUACCUCAGUUUUAUCAGGUG |
| ENST00000007390 | MI0005537 | hsa-miR-888 | UACUCAAAAAGCUGUCAGUCA |
| ENST00000007390 | MI0005715 | hsa-miR-923 | GUCAGCGGAGGAAAAGAAACU |
| ENST00000007390 | MI0005762 | hsa-miR-940 | AAGGCAGGGCCCCCGCUCCCC |
| ENST00000007390 | MI0005763 | hsa-miR-941 | CACCCGGCUGUGUGCACAUGUGC |
| ENST00000007390 | MI0005764 | hsa-miR-941 | CACCCGGCUGUGUGCACAUGUGC |
| ENST00000007390 | MI0005765 | hsa-miR-941 | CACCCGGCUGUGUGCACAUGUGC |
| ENST00000007390 | MI0005766 | hsa-miR-941 | CACCCGGCUGUGUGCACAUGUGC |
| ENST00000007390 | MI0000101 | hsa-miR-99a | AACCCGUAGAUCCGAUCUUGUG |
| ENST00000007390 | MI0000746 | hsa-miR-99b | CACCCGUAGAACCGACCUUGCG |
| ENST00000007390 | MI0002401 | mmu-miR-466a-5p | UAUGUGUGUGUACAUGUACAUA |
| ENST00000007390 | MI0005502 | mmu-miR-466b-5p | GAUGUGUGUGUACAUGUACAUG |
| ENST00000007390 | MI0005503 | mmu-miR-466b-5p | GAUGUGUGUGUACAUGUACAUG |
| ENST00000007390 | MI0005504 | mmu-miR-466b-5p | GAUGUGUGUGUACAUGUACAUG |
| ENST00000007390 | MI0005505 | mmu-miR-466c-5p | GAUGUGUGUGUGCAUGUACAUA |
| ENST00000007390 | MI0005506 | mmu-miR-466e-5p | GAUGUGUGUGUACAUGUACAUA |
| ENST00000007390 | MI0004673 | mmu-miR-669c | AUAGUUGUGUGUGGAUGUGUGU |
| ENST00000007390 | MI0004601 | mmu-miR-673-3p | UCCGGGGCUGAGUUCUGUGCACC |
| ENST00000007390 | MI0004682 | mmu-miR-698 | CAUUCUCGUUUCCUUCCCU |
| ENST00000007390 | MI0004685 | mmu-miR-701 | UUAGCCGCUGAAAUAGAUGGA |
| ENST00000007390 | MI0004693 | mmu-miR-709 | GGAGGCAGAGGCAGGAGGA |
| ENST00000007390 | MI0004695 | mmu-miR-711 | GGGACCCGGGGAGAGAUGUAAG |
| ENST00000007390 | MI0004651 | mmu-miR-719 | AUCUCGGCUACAGAAAAAUGUU |
| ENST00000007390 | MI0000635 | rno-miR-347 | UGUCCCUCUGGGUCGCCCA |
Selected Publications
Gene-related publications indexed at PubMed
- Gerhard DS, et al. (2004) "The status, quality, and expansion of the NIH full-length cDNA project: the Mammalian Gene Collection (MGC)." Genome Res. 14(10B):2121-2127. PMID:15489334

  Abstract: The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5'-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline.

- Martin J, et al. (2004) "The sequence and analysis of duplication-rich human chromosome 16." Nature. 432(7020):988-994. PMID:15616553

  Abstract: Human chromosome 16 features one of the highest levels of segmentally duplicated sequence among the human autosomes. We report here the 78,884,754 base pairs of finished chromosome 16 sequence, representing over 99.9% of its euchromatin. Manual annotation revealed 880 protein-coding genes confirmed by 1,670 aligned transcripts, 19 transfer RNA genes, 341 pseudogenes and three RNA pseudogenes. These genes include metallothionein, cadherin and iroquois gene families, as well as the disease genes for polycystic kidney disease and acute myelomonocytic leukaemia. Several large-scale structural polymorphisms spanning hundreds of kilobase pairs were identified and result in gene content differences among humans. Whereas the segmental duplications of chromosome 16 are enriched in the relatively gene-poor pericentromere of the p arm, some are involved in recent gene duplication and conversion events that are likely to have had an impact on the evolution of primates and human disease susceptibility.

- Strausberg RL, et al. (2002) "Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences." Proc Natl Acad Sci U S A. 99(26):16899-16903. PMID:12477932

  Abstract: The National Institutes of Health Mammalian Gene Collection (MGC) Program is a multiinstitutional effort to identify and sequence a cDNA clone containing a complete ORF for each human and mouse gene. ESTs were generated from libraries enriched for full-length cDNAs and analyzed to identify candidate full-ORF clones, which then were sequenced to high accuracy. The MGC has currently sequenced and verified the full ORF for a nonredundant set of >9,000 human and >6,000 mouse genes. Candidate full-ORF clones for an additional 7,800 human and 3,500 mouse genes also have been identified. All MGC sequences and clones are available without restriction through public databases and clone distribution networks (see http://mgc.nci.nih.gov).

- Daniels RJ, et al. (2001) "Sequence, structure and pathology of the fully annotated terminal 2 Mb of the short arm of human chromosome 16." Hum Mol Genet. 10(4):339-352. PMID:11157797

  Abstract: We have sequenced 1949 kb from the terminal Giemsa light band of human chromosome 16p, enabling us to fully annotate the region extending from the telomeric repeats to the previously published tuberous sclerosis disease 2 (TSC2) and polycystic kidney disease 1 (PKD1) genes. This region can be subdivided into two GC-rich, Alu-rich domains and one GC-rich, Alu-poor domain. The entire region is extremely gene rich, containing 100 confirmed genes and 20 predicted genes. Many of the genes encode widely expressed proteins orchestrating basic cellular processes (e.g. DNA recombination, repair, transcription, RNA processing, signal transduction, intracellular signalling and mRNA translation). Others, such as the alpha globin genes (HBA1 and HBA2), PDIP and BAIAP3, are specialized tissue-restricted genes. Some of the genes have been previously implicated in the pathophysiology of important human genetic diseases (e.g. asthma, cataracts and the ATR-16 syndrome). Others are known disease genes for alpha thalassaemia, adult polycystic kidney disease and tuberous sclerosis. There is also linkage evidence for bipolar affective disorder, epilepsy and autism in this region. Sixty-three chromosomal deletions reported here and elsewhere allow us to interpret the results of removing progressively larger numbers of genes from this well defined human telomeric region.

