Monarch geneset OGS2.0

DPOGS207441
Transcript: DPOGS207441-TA  3135 bp
Protein: DPOGS207441-PA  1044 aa
Genomic position: DPSCF300051 - 750927-756285
RNAseq coverage: 4106x (Rank: top 3%)
Annotation
Heliconius: HMEL012328  0.0  89.75%
Bombyx: BGIBMGA009911-TA  0.0  84.39%
Drosophila: Uba1-PA  0.0  65.20%
EBI UniRef50: UniRef50_Q8T0L3  0.0  65.20%  GH24511p n=24 Tax=Opisthokonta RepID=Q8T0L3_DROME
NCBI RefSeq: XP_966352.1  0.0  73.24%  PREDICTED: similar to ubiquitin-activating enzyme E1 [Tribolium castaneum]
NCBI nr blastp: gi|270014908  0.0  73.12%  hypothetical protein TcasGA2_TC011512 [Tribolium castaneum]
NCBI nr blastx: gi|91094331  0.0  73.24%  PREDICTED: similar to ubiquitin-activating enzyme E1 [Tribolium castaneum]
Group
Gene Ontology:
  GO:0005488  1.6e-87  binding
  GO:0008641  5.1e-66  small protein activating enzyme activity
  GO:0006464  5.1e-66  protein modification process
  GO:0003824  2.6e-34  catalytic activity
  GO:0005524  3e-27  ATP binding
KEGG pathway: tca:657621  0.0
  K03178 (UBE1, UBA1) maps->
    Ubiquitin mediated proteolysis
    Parkinson's disease
InterPro domain:
  [42-1042]  IPR018075  0  Ubiquitin-activating enzyme, E1
  [438-932]  IPR009036  1.4e-146  Molybdenum cofactor biosynthesis, MoeB
  [287-426]  IPR016040  1.6e-87  NAD(P)-binding domain
  [69-93]  IPR000011  5.1e-66  Ubiquitin/SUMO-activating enzyme E1
  [620-881]  IPR023280  3.7e-62  Ubiquitin-like 1 activating enzyme, catalytic cysteine domain
  [914-1039]  IPR018965  1.9e-50  Ubiquitin-activating enzyme e1, C-terminal
  [463-601]  IPR000594  2.6e-34  UBA/THIF-type NAD/FAD binding fold
  [840-906]  IPR000127  3e-27  Ubiquitin-activating enzyme repeat
  [607-650]  IPR019572  9.5e-22  Ubiquitin-activating enzyme
Orthology group: MCL10386  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS207441-TA
ATGTCTAGTGCTGAAGTCGCCGATAATTCCGTTGACCCCCCGGCGAAAAAGCGGAAGCTAAACACAGGAGAGGCGAGTTGCAAATCCTCAGCAATGGCGAACAATGGAACGCGTGTAGAGGATGAAATTGACGAGAGCTTATATTCACGACAGTTATACGTTCTCGGTCACGATGCTATGCGCCGAATGGCAAATTCGGATGTTTTGATTTCTGGTCTCGGAGGTCTTGGTGTAGAAATTGCCAAAAATGTGATACTTGGCGGAGTCAAGTCAGTAACACTTCATGATGCAAAAACCTGCACCATTGCTGATTTATCATCTCAGTTTTACCTCTCCGAGGCAGATATTGGTAAAAACAGAGCAGAAGCATCCTGTGAACAGCTTTCAGAACTGAATCGCTATGTGCCGACTACATCATATACCGGACCACTTACTGAGGAGTTTCTGAAGAAGTACCGTGTTGTAGTATTGACTGGCGCTTCTTGGGAACAACAGGAGCAAGTTGCTGCTATAACACACGCTAACAATATAGCCTTAATCATTGCGGACACCCGGGGTCTGTTTTCTCAGGTTTTCTGTGATTTCGGACCCGAGTTCACGGTGCTAGATGTGACTGGAGAGAACCCAGTATCAGCCATGATAGCTGACATTACCCATGAATATGAAGCTGTGGTGACATGCTTGGATGATACTCGTCATGGGCTGGAAGATGGAGATTATGTTACATTTAGCGAGATTCAAGGTATGTCTGAGTTAAACGGCTGTGAACCACGTAAGATTAAGGTGCTGGGACCATACACCTTCAGTATTGGAGACACAACAAACTGCTCTAAGTATGTCAGAGGCGGCATCGTCACCCAAGTGAAAATGCCCAAAAAACTTAGCTTCAAACCTCTGAAAGAATCCATCAAGAATCCAGAGTTCCTGATTACTGATTTTGGTAAGATGGATTATCCTCAACAACTGCATGTAGGGTTTGCAGCCCTCCACAAGTTCCAAGCAGCTGAGGGTCGACTCCCCAAACCTTGGTGTGACGCTGATGTCAGCAAGTTCATGGGTGTCGTGGAGAGTATTGTCCAAGGCGAGGAATTGTTTAAAAAGGGTGAAATTGACATTAATAAGGAACTACTAGAAACATTCTGCAAGGTCTCAGCTGGAGATCTTAATCCCATGAATGCTGCAATAGGAGGAGTGGTCGCTCAGGAAGTAATGAAGGCCAGCTCGGGCAAGTTCCATCCTATAGTTCAGTGGCTGTACCTTGATGCTATCGAGTGTCTTCCAAAAGACAGATCGGGTCTCAACGAGGAGTACTGTAAACCCATTGGCTGCAGATATGATGGCCAGATAGCAGTATTTGGACAGAATATCCAAAAGAAGATTGGGGAGCTGAAGTATTTCATTGTGGGCGCGGGCGCCATCGGTTGTGAGTTGCTGAAGAACTTTGCCATGATGGGTGTGGGCGCTGCCGGCGGCGCCGTCACCGTTACGGATATGGATCTCATTGAGAAGTCTAACCTCAACCGCCAGTTCCTCTTCCGACCTCAAGACGTTCAGAAACCCAAGTCCAGTACAGCTGCCAGGGTTATCAAACAAATGAATCCATCAATGAACGTAATAGCCCAGGAGCACCGCGTGTGTCCCGAGACGGAGTGTGTATATGACGACGCGTTCTTTGAGGCCCTGGACGGAGTGGCCAACGCCTTGGACAACGTGGACGCCAGGATATACATGGACCGGCGCTGTGTGTACTACAGGAAACCCTTGTTGGAGAGTGGCACCCTCGGCACCAAGGGCAACACTCAGGTGGTGGTTCCCTTCCTGACCGAGTCCTACAGCTCATCTCAAGACCCGCCTGAGAAGAGCATCCCGATCTGTACCCTTAAGAACTTCCCCAACGCCATCGAGCACACTCTGCAGTGGGCTCGGGACGAGTTCGAGGGTCTGTTCCGTCAGGCCGCGGAGCACGCCGCACAATACTTGCGCGACCCACACTTCCTCGA
GAGAACTATGAATCTACCGGGCAGCCAGCCGCTCGACGCTCTGGAGAGTGTTCAGAACGCGATCGTGGACCGCCCCATGAACTTCGACGACTGCGTGACCTGGGCCCGCATGCACTGGGAGGCTCAGTATTCCAACCAGATCAAACAGTTGCTATACAACTTCCCGCCCAAGCAGGTCACTTTACTGGGCGCCCCCTTCTGGTCTGGACCCAAACGGTGTCCCTCACCTCTAGAATTCGACCCCGAAGATGAACTGCACATGGACTACATCGTGGCCGCCGCCAACCTGAAGGCTCAGGTGTATGGCATACCGACGTGTGTGGACAGAGAGAGGATCGCTAAAGTCGCCATGACTGTAGAGGTGCCTAAATTCAAACCGAAGTCGGGCGTCAAAATCGCAGTAACGGATGCTCAGCTGCAACAGAGCGACGACAAAATGGACCAGGATAAGGTGGAGACCATAGTGGACAACTTGCCCCCGCCGAACAAACTCGGCAACCTTAAAATAACCCCGCTGGAGTTCGAGAAAGATGACGACACCAACTTCCACATGGACTTCATCGTGGCCGCGTCCAACCTGCGCGCCGCCAACTACAAGATCCCGCCCGCCGACAGACACCGCTCCAAGCTCATCGCCGGCAAGATCATCCCCGCCATCGCCACCACCACATCCGTGGTCGCCGGCCTCGTCTGCCTCGAGCTGTACAAGCTCGCCCAGGGCTTCAACACTCTAGAAGTCTTCAAGAACGGCTTCGTCAACTTGGCCTTACCGTTCTTCGGGTTCTCCGAGCCGATCGCCGCGCCCACCAACACGTACTACGACAAAAAATGGACGCTCTGGGACAGGTTCGAGGTGAAGGGGGAGATCACGTTACAGGAGTTCATAGATTACTTCAAAAACGAGCACAAACTGGATATCACGATGCTGTCCCAGGGCGTGTGCATGCTGTACTCGTTCTTCATGCTGAAAGCCAAACGCCAGGAGCGCCTCAACCTGCCGATGTCCGAAGTGGTCATGAAGGTGTCCAAGAAGAAGCTTGAGCCGCACGTGAAGGCGCTGGTGTTCGAGCTGTGCTGCAACGACGAGGACGACAACGACATCGAGGTGCCGTACGTCAAGTACACGCTGCCCTAA

Protein sequence:

>DPOGS207441-PA
MSSAEVADNSVDPPAKKRKLNTGEASCKSSAMANNGTRVEDEIDESLYSRQLYVLGHDAMRRMANSDVLISGLGGLGVEIAKNVILGGVKSVTLHDAKTCTIADLSSQFYLSEADIGKNRAEASCEQLSELNRYVPTTSYTGPLTEEFLKKYRVVVLTGASWEQQEQVAAITHANNIALIIADTRGLFSQVFCDFGPEFTVLDVTGENPVSAMIADITHEYEAVVTCLDDTRHGLEDGDYVTFSEIQGMSELNGCEPRKIKVLGPYTFSIGDTTNCSKYVRGGIVTQVKMPKKLSFKPLKESIKNPEFLITDFGKMDYPQQLHVGFAALHKFQAAEGRLPKPWCDADVSKFMGVVESIVQGEELFKKGEIDINKELLETFCKVSAGDLNPMNAAIGGVVAQEVMKASSGKFHPIVQWLYLDAIECLPKDRSGLNEEYCKPIGCRYDGQIAVFGQNIQKKIGELKYFIVGAGAIGCELLKNFAMMGVGAAGGAVTVTDMDLIEKSNLNRQFLFRPQDVQKPKSSTAARVIKQMNPSMNVIAQEHRVCPETECVYDDAFFEALDGVANALDNVDARIYMDRRCVYYRKPLLESGTLGTKGNTQVVVPFLTESYSSSQDPPEKSIPICTLKNFPNAIEHTLQWARDEFEGLFRQAAEHAAQYLRDPHFLERTMNLPGSQPLDALESVQNAIVDRPMNFDDCVTWARMHWEAQYSNQIKQLLYNFPPKQVTLLGAPFWSGPKRCPSPLEFDPEDELHMDYIVAAANLKAQVYGIPTCVDRERIAKVAMTVEVPKFKPKSGVKIAVTDAQLQQSDDKMDQDKVETIVDNLPPPNKLGNLKITPLEFEKDDDTNFHMDFIVAASNLRAANYKIPPADRHRSKLIAGKIIPAIATTTSVVAGLVCLELYKLAQGFNTLEVFKNGFVNLALPFFGFSEPIAAPTNTYYDKKWTLWDRFEVKGEITLQEFIDYFKNEHKLDITMLSQGVCMLYSFFMLKAKRQERLNLPMSEVVMKVSKKKLEPHVKALVFELCCNDEDDNDIEVPYVKYTLP-