Monarch geneset OGS2.0

DPOGS203391
Transcript        DPOGS203391-TA  918 bp
Protein           DPOGS203391-PA  305 aa
Genomic position  DPSCF300003 + 845650-846567
RNAseq coverage   16x (Rank: top 81%)
Annotation
Heliconius      HMEL008815         6e-94  55.56%
Bombyx          BGIBMGA002052-TA   2e-88  48.20%
Drosophila      CG3731-PB          7e-64  38.51%
EBI UniRef50    UniRef50_P31930    4e-62  37.99%  Cytochrome b-c1 complex subunit 1, mitochondrial n=91 Tax=Eukaryota RepID=QCR1_HUMAN
NCBI RefSeq     XP_309120.3        2e-65  38.93%  AGAP000935-PA [Anopheles gambiae str. PEST]
NCBI nr blastp  gi|147902934       9e-66  38.64%  ubiquinol-cytochrome c reductase core protein I [Xenopus laevis]
NCBI nr blastx  gi|260809835       4e-63  40.94%  hypothetical protein BRAFLDRAFT_287788 [Branchiostoma floridae]
Group
Gene Ontology   GO:0046872  2.2e-53  metal ion binding
                GO:0003824  2.2e-53  catalytic activity
                GO:0006508  6.1e-20  proteolysis
                GO:0004222  6.1e-20  metalloendopeptidase activity
                GO:0008270  6.1e-20  zinc ion binding
KEGG pathway    xla:379401  1e-66
                K00414 (QCR1, UQCRC1) maps->
                    Huntington's disease
                    Oxidative phosphorylation
                    Alzheimer's disease
                    Cardiac muscle contraction
                    Parkinson's disease
InterPro domain  [92-304]  IPR011237  2.2e-53  Peptidase M16, core
                 [93-305]  IPR011249  4.4e-30  Metalloenzyme, LuxS/M16 peptidase-like, metal-binding
                 [44-218]  IPR007863  6.1e-20  Peptidase M16, C-terminal
Orthology group  MCL34498 Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203391-TA
ATGCTTAAGCATGATGAATCAACAGTAAATATATTATATGACTATTTACATTCAACAGCGTTCCAAGGTACCCAGTTAGGACAAACTATTATGGGAACTAGTAACAACTTGTGTAAATTTCAAATGCCTTCTGTTAGAAATCTTGTAAAGAAAATGTTTATTCCACAAAGAACTGTATTAGCTGCCGTTGGTGGTGTCACTCACGATACGATGGTGAACATTGGGAATAAATATTTTAGGAAAACTAAAGACCCCAAATGTATCTTTTTAGGCCCTTGTAGATATACAGGATCCGAAATATCCTACAGAGATGATTCAATGCCCATGGGGCATGTAGCGATAGCTGTGGAAGGCCCTCCCTTTTCCAGTAAAGATAAAATCTTTATGGATCTAGCTGCCTCATACATAGGGGGTTGGGACACCTCTCAACCTGGCGGAACAAACCACGGAACTTACACCGCATTGATGGGCTCAGCAGGACGTAATTGCGAGTCTTATAAAACTTTCCAATTTGTCTATAACGACACCAGTCTATGGGGAGCACAATTTATATCACCGAGAATTGATTTAGACGACATGTTATACATCATTCAGGACACTTGGATGAGATTATGTGACTTGATTACTGACGGTGAACTAATAAGGCCCAAAAGTGAATTAAAATCAAAGATTTTGATGCAAAATCAAAGTACAGAAAAAGCTTGTCACGATATCGGACAACACCTACTGCGAACTGGCAATCGACCGACTATTGCTGATAGATTTCGTGAAATTGATAATATAACTGCTAAGCAGTTAAAAAAGGTTTGCGATAAATACATATATGATAGATGTCCAGTGGTAGTGGGAATTGGTTCCATAGAATGCTTATATCCGTATACAAATGTAAGAGATGCCATGCGTTGGCTAAGAGTGTAA

Protein sequence:

>DPOGS203391-PA
MLKHDESTVNILYDYLHSTAFQGTQLGQTIMGTSNNLCKFQMPSVRNLVKKMFIPQRTVLAAVGGVTHDTMVNIGNKYFRKTKDPKCIFLGPCRYTGSEISYRDDSMPMGHVAIAVEGPPFSSKDKIFMDLAASYIGGWDTSQPGGTNHGTYTALMGSAGRNCESYKTFQFVYNDTSLWGAQFISPRIDLDDMLYIIQDTWMRLCDLITDGELIRPKSELKSKILMQNQSTEKACHDIGQHLLRTGNRPTIADRFREIDNITAKQLKKVCDKYIYDRCPVVVGIGSIECLYPYTNVRDAMRWLRV-
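
The lengths and coordinates in this record can be cross-checked with simple arithmetic. The sketch below is illustrative only and is not part of the OGS2.0 pipeline; the function name `check_record` and its return keys are hypothetical. It assumes the genomic coordinates are 1-based and inclusive, and that the trailing "-" in the protein sequence marks the stop codon.

```python
def check_record(transcript_bp, protein_aa, start, end):
    """Cross-check the lengths reported in a gene record.

    Assumes 1-based inclusive genomic coordinates and a coding
    sequence that runs from the start codon through the stop codon.
    """
    span = end - start + 1                    # genomic bases covered
    codons, remainder = divmod(transcript_bp, 3)
    return {
        "genomic_span_bp": span,
        # Span equal to transcript length is consistent with no introns.
        "span_equals_transcript": span == transcript_bp,
        "in_frame": remainder == 0,           # length is a multiple of 3
        "codons": codons,
        # One codon is the stop, so codons = residues + 1.
        "codons_match_protein_plus_stop": codons == protein_aa + 1,
    }


# Values copied from the record: 918 bp, 305 aa, DPSCF300003 845650-846567.
report = check_record(918, 305, 845650, 846567)
for key, value in report.items():
    print(f"{key}: {value}")
```

Under these assumptions the genomic span equals the transcript length (918 bp), which is consistent with a single uninterrupted coding block, and 918 / 3 = 306 codons accounts for 305 residues plus one stop codon.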