FAM166C | |
---|---|
Aliases | FAM166C, chromosome 2 open reading frame 70, family with sequence similarity 166 member C, C2orf70 |
External IDs | MGI: 1922684; HomoloGene: 49920; GeneCards: FAM166C; OMA: FAM166C - orthologs |
Family with sequence similarity 166, member C (FAM166C), is a protein encoded by the FAM166C gene. The protein (aliases C2orf70, LOC339778) is localized in the nucleus and has a calculated molecular weight of 23.29 kDa. It contains DUF2475, a domain of unknown function spanning amino acids 19–85. FAM166C is expressed at low levels overall, most prominently in the testis, stomach, and thyroid.
Gene
The FAM166C gene, also known as C2orf70, is located on the plus (positive-sense) strand at locus 2p23.3. It has 9 exons; however, due to overlap, only 4 are distinguishable in the human genome. FAM166C spans positions 26,562,565 to 26,581,166, for a total length of 18.6 kb.
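The quoted length follows directly from the two coordinates. A minimal sketch of that arithmetic, assuming the span is taken as the simple difference between the end and start positions (the variable names are illustrative):

```python
# Genomic coordinates quoted in the text (chromosome 2).
start = 26_562_565
end = 26_581_166

# Simple difference between the two positions, converted to kilobases.
span_bp = end - start
span_kb = span_bp / 1000

print(f"{span_bp} bp = {span_kb:.1f} kb")  # prints "18601 bp = 18.6 kb"
```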
Gene neighborhood
The gene neighborhood of FAM166C consists of DRC1, LOC112840921, OTOF, CIB4 and LOC122756675. LOC112840921 and LOC122756675 are both predicted transcriptional regulatory regions. DRC1 (dynein regulatory complex subunit 1) encodes a central component of the nexin-dynein complex, a regulator of ciliary dynein. Mutations in this gene can lead to ciliary dyskinesia. OTOF encodes the protein otoferlin, which has been suggested to be involved in vesicle membrane fusion. Mutations in OTOF can lead to neurosensory nonsyndromic recessive deafness, DFNB9. CIB4 (calcium and integrin binding family member 4) encodes the CIB4 protein, which regulates activation of the integrin alphaIIb subunit.
Transcripts
FAM166C has 2 different transcript variants. The most abundant variant is FAM166C transcript variant 1, which is 718 nucleotides in length.
Accession Number | Transcript Length (nt) | Number of Exons | Protein Length (aa) | Isoform |
---|---|---|---|---|
NM_001105519.3 | 718 | 4 | 201 | 1 |
NM_001322426.2 | 754 | 5 | 184 | 2 |
Protein
The FAM166C protein is 201 amino acids in length, with a predicted molecular weight of 23 kDa and an isoelectric point of 10. It has higher than normal levels of tyrosine and proline and lower than normal levels of isoleucine.
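As a rough sanity check on the predicted mass, a length-only estimate can be sketched as follows. It assumes an average residue mass of about 110 Da, a common rule of thumb rather than the method behind the figure quoted above; the function name `approx_mass_kda` is illustrative.

```python
# Rule-of-thumb average mass of one amino acid residue, in daltons.
# This is an assumption for illustration, not the value used by the source.
AVG_RESIDUE_MASS_DA = 110.0


def approx_mass_kda(n_residues: int) -> float:
    """Estimate protein mass in kDa from its length alone."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# Peptide lengths of the two isoforms, as given in the article.
isoform_lengths = {"isoform 1": 201, "isoform 2": 184}

for name, length in isoform_lengths.items():
    print(f"{name}: {length} aa, roughly {approx_mass_kda(length):.1f} kDa")
```

For isoform 1 this gives roughly 22 kDa, in the same range as the sequence-based prediction. A composition-aware tool sums the masses of the actual residues, so its result can differ somewhat from a length-only estimate.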
Domains and structure
The FAM166C protein has one domain of unknown function, DUF2475, spanning amino acids 19–85. The secondary structure of FAM166C isoform 1 appears to be primarily alpha helical, with only short segments predicted to form beta sheets. Tertiary structure predictions show 5 distinct alpha helices with high confidence.
Isoforms
FAM166C has 2 different splice isoforms. The most abundant isoform is FAM166C protein isoform 1 which is 201 amino acids in length.
Name | Transcript variant | Peptide length | Domains present |
---|---|---|---|
Isoform 1 | 1 | 201 aa | DUF2475 |
Isoform 2 | 2 | 184 aa | DUF2475 |
Regulation
Gene level regulation
Promoter
FAM166C has 3 possible promoters that produce complete protein isoforms; however, isoform 1 is produced only from promoter GXP_1493451, which also produces isoform 2.
Transcription Factor Binding Sites
GXP_1493451 contains over 250 transcription factor binding sites. The most conserved sites, and those most likely to be bound, include sites for a forkhead box protein factor (V$FOXP2.01), a collagen krox domain factor (V$CKROX.01) and an E2F transcription factor (V$E2F3.01).
Expression pattern
FAM166C has low overall expression compared with other proteins, but among the tissues where it is expressed, it is most prominent in the testes, stomach and thyroid. Within the cell, FAM166C is localized to the nucleus and contains 2 nuclear localization signals. Antibody staining strongly indicates localization to the nuclear membrane specifically.
Transcript level regulation
The 5' UTR of FAM166C transcript variant 1 is 29 bp in length. Analysis of potential 3D structures identifies one hairpin structure; however, the 5' UTR differs heavily among orthologs, indicating that it is unlikely to be an important region for transcriptional regulation.
The 3' UTR is 89 bp in length and contains one polyadenylation signal at 699 bp. It is conserved among human transcript variants, but only small segments are well conserved among orthologs. It contains 2 predicted miRNA binding sites in areas of moderate conservation, at 631 bp (hsa-miR-3184-3p) and at 641 bp (hsa-miR-4539, hsa-miR-12113). 3D predictions identify two stem-loop structures.
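Polyadenylation signals of this kind are usually located by scanning the transcript for short hexamer motifs. A minimal sketch, assuming the canonical AATAAA hexamer and its common ATTAAA variant; the helper `find_polya_signals` and the toy sequence are hypothetical and are not the FAM166C transcript:

```python
import re

# Canonical poly(A) signal hexamer and its most common variant
# (an assumption about which motifs to scan for).
POLYA_SIGNALS = ("AATAAA", "ATTAAA")


def find_polya_signals(seq: str) -> list[tuple[int, str]]:
    """Return (1-based start position, motif) for each poly(A) signal hit."""
    seq = seq.upper().replace("U", "T")  # accept RNA or DNA alphabets
    hits = []
    for motif in POLYA_SIGNALS:
        # A lookahead lets overlapping occurrences be reported as well.
        for match in re.finditer(f"(?={motif})", seq):
            hits.append((match.start() + 1, motif))
    return sorted(hits)


# Made-up sequence for illustration only.
toy_utr = "GCTTGACCAATAAAGTTCTGAC"
print(find_polya_signals(toy_utr))  # [(9, 'AATAAA')]
```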
Protein level regulation
FAM166C is predicted to have 7 phosphorylation sites, 2 acetylation sites and one O-GlcNAc site, which are well conserved among orthologs.
Figure: Conceptual translation of FAM166C transcript variant 1 / protein isoform 1. Phosphorylation sites are highlighted in green, N-linked acetylation sites in indigo, internal acetylation sites in pink, O-β-GlcNAc sites in yellow, nuclear localization signals in light blue, and the poly(A) signal in red. The start and stop of transcription are marked in green and red text, respectively. DUF2475 is marked with brackets, and amino acids conserved among all known orthologs are in bold.
Homology and evolution
Paralogs
The human FAM166C gene has two paralogs called FAM166A and FAM166B. They are located at 9q34.3 and 9p13.3 respectively. The function of both proteins is not currently well understood.
Orthologs
FAM166C has orthologs in species as distant as insects. Mammalian orthologs are moderately similar to human FAM166C, with sequence identity of roughly 70% or more. Orthologs in reptiles, birds and amphibians range from roughly 65% down to 40%. In fish and invertebrates, identity ranges from roughly 40% down to 20%. No orthologs were found in fungi, bacteria or plants.
Class | Genus/Species | Common Name | Taxonomic Order | Estimated Date of Divergence (MYA) | Accession number | Sequence length (aa) | Sequence identity (%) | Sequence similarity (%) |
---|---|---|---|---|---|---|---|---|
Mammalia | Homo sapiens | Human | Primates | 0 | NP_001098989.1 | 201 | 100 | 100 |
Mammalia | Mus musculus | Mouse | Rodentia | 90 | NP_083561.1 | 200 | 68.2 | 80.6 |
Mammalia | Canis lupus familiaris | Dog | Carnivora | 96 | XP_038546893.1 | 201 | 78.6 | 90 |
Mammalia | Camelus ferus | Wild Bactrian camel | Artiodactyla | 96 | XP_006189760.1 | 201 | 82.1 | 89.1 |
Reptilia | Podarcis muralis | Common wall lizard | Squamata | 312 | XP_028581109.1 | 201 | 66.2 | 80.1 |
Reptilia | Thamnophis elegans | Western terrestrial garter snake | Squamata | 312 | XP_032071055.1 | 201 | 65.7 | 79.6 |
Reptilia | Gopherus evgoodei | Goode's thornscrub tortoise | Testudines | 312 | XP_030410948. | 201 | 64.7 | 78.1 |
Aves | Gallus gallus | Chicken | Galliformes | 312 | XP_420014.2 | 128 | 58.5 | 76.4 |
Aves | Apteryx rowi | Okarito brown kiwi | Apterygiformes | 312 | XP_025911147.1 | 199 | 40.8 | 52.9 |
Amphibia | Ranitomeya imitator | Mimic poison frog | Anura | 351.8 | CAF5191195. | 200 | 67.2 | 80.6 |
Amphibia | Bufo bufo | Common toad | Anura | 351.8 | XP_040286881.1 | 155 | 51 | 71 |
Amphibia | Microcaecilia unicolor | Tiny cayenne caecilian | Gymnophiona | 351.8 | XP_030051403.1 | 173 | 46.8 | 69.3 |
Amphibia | Geotrypetes seraphini | Gaboon caecilian | Gymnophiona | 351.8 | XP_033791776.1 | 173 | 49.0 | 67.9 |
Fish | Alosa sapidissima | American shad | Clupeiformes | 435 | XP_041950360.1 | 201 | 41.1 | 62.4 |
Fish | Salmo trutta | Brown trout | Salmoniformes | 435 | XP_029626333.1 | 192 | 4.21 | 60.9 |
Fish | Carcharodon carcharias | Great white shark | Chondrichthyes | 473 | XP_041043370.1 | 200 | 41.0 | 60.8 |
Invertebrata | Ciona intestinalis | Vase tunicate | Enterogona | 676 | XP_002130039.1 | 206 | 43.9 | 62.3 |
Invertebrata | Anneissia japonica | Sea lily | Crinoidea | 684 | XP_033123630.1 | 203 | 38.9 | 58.8 |
Invertebrata | Saccoglossus kowalevskii | Acorn worm | Enteropneusta | 684 | XP_002733424.1 | 197 | 41.5 | 58.5 |
Invertebrata | Photinus pyralis | Common eastern firefly | Coleoptera | 797 | XP_031329322.1 | 200 | 23.7 | 39.9 |
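The identity column reports the share of aligned positions at which the ortholog carries the same residue as the human protein. A minimal sketch of that calculation, assuming two already-aligned sequences of equal length with `-` marking gaps and counting only columns where both sequences have a residue; alignment tools differ in the denominator they use, so this is one convention among several. The function name and the toy sequences are hypothetical:

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of gap-free alignment columns with identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    compared = 0  # columns where both sequences have a residue
    matches = 0   # columns where those residues are identical
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1

    return 100.0 * matches / compared if compared else 0.0


# Made-up aligned fragments for illustration only.
human_fragment = "MKT-AYLP"
other_fragment = "MRTQAYLP"
print(f"{percent_identity(human_fragment, other_fragment):.1f}% identity")
```

Here 6 of the 7 gap-free columns match. Sequence similarity is computed the same way, except that substitutions between chemically similar residues also count as matches, which requires a substitution matrix.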
Evolution
The FAM166C gene appears most distantly in insects, which diverged from the human lineage approximately 797 million years ago. Orthologs of FAM166A and FAM166B also occur in insects. FAM166C evolves at a moderate rate: a 1% change in amino acid sequence takes around 10 million years. Based on the sequence similarity of orthologs, its rate of evolution falls between those of cytochrome c and fibrinogen alpha.
Clinical significance
Disease association
Colorectal cancer
Several studies have evaluated FAM166C as a potential target for colorectal cancer treatment. In one study, researchers assessed the viability of FAM166C as a drug target in KRAS G12A colorectal cancer. FAM166C was one of 11 genes showing a significant twofold change between colorectal cancer patients carrying a KRAS G12 mutation (a mutated oncogene) and colorectal cancer patients with wild-type KRAS. Another study identified FAM166C as one of four potential targets of cyclovirobuxine D (CVB-D), an inducer of autophagic cell death in colorectal cancer cells, based on its overexpression in colon adenocarcinoma.
Mutations (SNPs of interest)
Using GWAS, a FAM166C SNP was identified as being correlated with high levels of bacterial colonization, a trait that may be associated with periodontitis.
Using whole exome sequencing with the human reference genome as a comparison, a novel FAM166C SNP was identified as the only variant with a PolyPhen score of 0.954, indicating that it is likely deleterious and may be involved in one patient's bilateral cleft lip and palate.