Monarch geneset OGS2.0

DPOGS200311
Transcript: DPOGS200311-TA (3267 bp)
Protein: DPOGS200311-PA (1088 aa)
Genomic position: DPSCF300026, 53126-60168
RNAseq coverage: 760x (Rank: top 17%)
Annotation
Heliconius      HMEL010508              0.0    85.40%
Bombyx          BGIBMGA005589-TA        0.0    77.77%
Drosophila      gw-PC                   4e-98  41.35%
EBI UniRef50    UniRef50_UPI00021A83AC  0.0    41.16%  UPI00021A83AC related cluster n=2 Tax=unknown RepID=UPI00021A83AC
NCBI RefSeq     XP_395115.3             0.0    41.42%  PREDICTED: similar to CG31992-PA, isoform A, partial [Apis mellifera]
NCBI nr blastp  gi|332026373            0.0    42.92%  Trinucleotide repeat-containing gene 6A protein [Acromyrmex echinatior]
NCBI nr blastx  gi|332026373            0.0    43.27%  Trinucleotide repeat-containing gene 6A protein [Acromyrmex echinatior]
Group
Gene Ontology    GO:0000166  7.9e-17  nucleotide binding
                 GO:0005515  1.1e-08  protein binding
KEGG pathway
InterPro domain  [936-1022]  IPR012677  7.9e-17  Nucleotide-binding, alpha-beta plait
                 [517-579]   IPR009060  1.1e-08  UBA-like
Orthology group  MCL11151  Insect specific

Nucleotide sequence:

>DPOGS200311-TA
ATGGATGCTATGCTGGAGTTATTACAAAATGAAGGAAAAAAGATCAAGCAAATGTATAAGAGTGACATGGATTCATGGGGAGTCCCACATCAATTAAGGCTTCGAGGAGGAGGGGAAACCTCACTTAACGCTGCAAGTGGCTGGGGTAGCCCACCTGCTTCCAATACTGCTGGAAACTGGAGTGGTAATTCCAGTGGCCAATCCAATAATGGAACAAACAATACTCAACAGTGGAACAATACCAACAGGCCACCCACCTCTCAAGCTGACATGAACAGCAACAAGGGTAATGGTAGTCAAATTCCACCAACCAGTGCAGCATCTCAAAGTTGGCAGAGCTCACCACCTAATATGCCAAATGCACAAACCAATAATAATGGTGCACCTACAAATAATGGTACAAATAATGCTGTCAATGGGAACAATGGAAGCAACCCATCAAATAATGTTACAAGTTCCAATGGCCCCAGTGGTGGAAATGGAAATGGTAATAGTACCAACAATACATCTTCAGCAAAAATTGAGCAATTGAACTCGATGAGGGAAGCTTTATTTAGCCAGGATGGTTGGGGAGGTCAACATGTCAACCAAGATACAAATTGGGATGTACCTGGATCACCAGAACCAGGTTCGAAAGTGGAACCAGCAGCTTCCGGGCAACCTGCTTGGAAACCAAATGTCAATAATGGTGCTTTCCTCGGAACTGATTTATGGGAAGCCAACTTAAGAAATGGAGGTCAGCCTCCACCACCTCCGGCGACGAAGACCCCATGGGGTCACACTCCGACAACGAATATCGGAGGTACATGGGGAGAAGACGACGATGCATCAGATTCTGCTAATGTATGGAACGGTCCACCGCCCGCCCAGCAATGGCCGGCTGGTCCTCCACAACACACCCAGCAGTGGGGGGTTCCCAAGAAGGATGATTGGAACGCCTGGGGTGAGCCTCATAGGCCGGCAGATCCACGTCTGGATCCGCACCGCACACCAGACCCACGTCAACAGCCGTCGGACCCCCGACATGACATTCGTGGAGGCATATCCGGTCGTCTGAACGGCGACATGTGGAGCCAACACCATCAACACCCTGCTGGACCCAATAAAATGATGCCCAGCGGAGGCGTTAGTCAGTGGGGCGGACAGGGTCCTAAGGAAGCGATTAAATCTGCCGGCTGGGAGGAGCCGTCTCCACCAGCGGCTCGTCGTGGAGGCACCGGGTTTGACGACGGCACGTCTCTCTGGGCCCAGCGTGGAATGGGTGGTATGTCTCGCGGAGCGCCTCAGGGCCCGCCGCCTTCTCAGCGCATGGCTCCAGCTCCTTCTAAGCCTGACGGTGTGTGGGCCGCCCACGCCCAACGCAACGGTCCGTGGGAAGAATCACACGGCTGGGGCGAACGCGACGTTCACTGCTCCTGGCAAGACCCCAGCGGGCCGACACTTTGGCCCGTACCAAAACCGAAGCCCAGCGGCCCGGCGGCAGCCTGGCCTGACGACATCGGCGAGTGGGGAGGCCCAAAGCCACCACCGAGCGGAGCTCTGGGTAAACAACTGCCGAAGGAAGTGGTGTGGAATAGTAAGCAGTTCAGATATTTAGTCGAGATGGGCTACAAGAAAGAGGAGGCCGAAGCTGCACTCCGCAGTCGCGACATGAACGCGGAAGAGGCACTGGACATGTTGGCGGCGGCGCGAGGCGACTGCTGGAGGCGAGATGACAATTTCCACTCGGGTGGTTTCCCGCCGCAGCCCTCCGTGCCGGCCGTCTCACCAGCTGTGGTACAGAAGCTGCTCAACCAGCAGCCGCCGCCAGCAGTTCAGCATCACCCTTCGTACAACCCCAATAGTGGAACGGGCAGCAGCGGCCAACCGAGCACTGCTCAACTGCGCATGCTGGTTCAGCAGATCCAAATGGCAGTGTCTGCGGGTTACTTAAACCATCAGATATTGAACCAGCCGTTGGCACCACAGACACTGGTGCTGCTCAACCAGCTATTGCAGCA
AATAAAAGTCCTGCAGCAACTTGTCAACCAGCACCAGCTGGCCATACCGAAAGCGAACTCCACTTTGACGTTACAATATTCCATGCAAATAACTAAAGCTAAACAACAGATTCAAGCCTTACAGAACCAAATCACAACTCAACAAGCGCTGTACATGAAGCAGCAAGCTGCAGCATCAGATCTGTTCAAGCAGCCCCATGACCATTTGTCAAACATGCAGCAGAACTTTAGCGAGATGGCTATCTCTAAGGATTCTCAGTCTGGTTTTGGTAGTAGCGGTAACCAGCAGTCCCGCTTGAACCAGTGGAAGCTTCCATCGTTGGATAAAGACGGTGAGGGTACCGACTTCAGCCGCGCCCCCGGCACCGCCAAGTCTACCACGAGCCCGCAACTCAATCAGATTAGTTTGCAGCCTGATACTACCTGGGGTATGGGTCGCTCGGAAGGTTGGGGCGATGGTTCAGATGTGGCAGACGGAAAAGACGCCTGGCCCTCACACCCCTCGCATCCGCCCGCATACGACCTGGTCCCCGAGTTCGAACCTGGGAAGCCTTGGAAGGGCAATCAAATGAAGAATGTAGAAGACGACCCAGCCATGACACCTGGATCAGTCGTACGTTCACCGATATCTTTGACCAATATCAAGGACAGCGACATGCTCGGAGGAAAGACTTCGCCGCCCGGCGGCGAGAGGACGCTATCCTCCGCCACGTGGAGCTACGCCCCGCCAGCCACCAGCGCCGGCGGACTCAAGCCGCTAGACGTTTGGGGGGCTAAGCCCCGTCCCGCCCCGCCTGGACTCAATAAGTGGCCACAGCACCACGTCAACTCCCGTGCTGCCCCCTCGTGGCAGACATCTACCTGGCTGCTTCTGAGAAATCTCACTGCTCAGATCGACGGATCAACTTTGAAGACTTTATGTGTGCAACACGGCCCGTTGCAAAACTTCCACCTCTACCTCAACCAGGGACTCGCTCTCGCTCGCTACTCGACTCGCGAAGAGGCGGCTAAGGCCCAGATGGCATTAAACAACTGCGTTCTAAGCAACACGACTATCTTCGCGGAGTCGCCGGCCGAGTCTGACGTGCAGCTGATACTGCAACACCTGGGCTCTGGCGGTGGCGGCGCCTGGCGCGGAGGAGCCTCTAAGGACGGCTGGAACGGCGCCTTCCCCGGCCTGTGGCAGGAGCAGCACGAGCAGCGCGCGACTCCGTCGTCGCTGAATTCGTTCCTGCCGCCGGACCTGCTCGGCGGCGAGTCCATCTAA

Protein sequence:

>DPOGS200311-PA
MDAMLELLQNEGKKIKQMYKSDMDSWGVPHQLRLRGGGETSLNAASGWGSPPASNTAGNWSGNSSGQSNNGTNNTQQWNNTNRPPTSQADMNSNKGNGSQIPPTSAASQSWQSSPPNMPNAQTNNNGAPTNNGTNNAVNGNNGSNPSNNVTSSNGPSGGNGNGNSTNNTSSAKIEQLNSMREALFSQDGWGGQHVNQDTNWDVPGSPEPGSKVEPAASGQPAWKPNVNNGAFLGTDLWEANLRNGGQPPPPPATKTPWGHTPTTNIGGTWGEDDDASDSANVWNGPPPAQQWPAGPPQHTQQWGVPKKDDWNAWGEPHRPADPRLDPHRTPDPRQQPSDPRHDIRGGISGRLNGDMWSQHHQHPAGPNKMMPSGGVSQWGGQGPKEAIKSAGWEEPSPPAARRGGTGFDDGTSLWAQRGMGGMSRGAPQGPPPSQRMAPAPSKPDGVWAAHAQRNGPWEESHGWGERDVHCSWQDPSGPTLWPVPKPKPSGPAAAWPDDIGEWGGPKPPPSGALGKQLPKEVVWNSKQFRYLVEMGYKKEEAEAALRSRDMNAEEALDMLAAARGDCWRRDDNFHSGGFPPQPSVPAVSPAVVQKLLNQQPPPAVQHHPSYNPNSGTGSSGQPSTAQLRMLVQQIQMAVSAGYLNHQILNQPLAPQTLVLLNQLLQQIKVLQQLVNQHQLAIPKANSTLTLQYSMQITKAKQQIQALQNQITTQQALYMKQQAAASDLFKQPHDHLSNMQQNFSEMAISKDSQSGFGSSGNQQSRLNQWKLPSLDKDGEGTDFSRAPGTAKSTTSPQLNQISLQPDTTWGMGRSEGWGDGSDVADGKDAWPSHPSHPPAYDLVPEFEPGKPWKGNQMKNVEDDPAMTPGSVVRSPISLTNIKDSDMLGGKTSPPGGERTLSSATWSYAPPATSAGGLKPLDVWGAKPRPAPPGLNKWPQHHVNSRAAPSWQTSTWLLLRNLTAQIDGSTLKTLCVQHGPLQNFHLYLNQGLALARYSTREEAAKAQMALNNCVLSNTTIFAESPAESDVQLILQHLGSGGGGAWRGGASKDGWNGAFPGLWQEQHEQRATPSSLNSFLPPDLLGGESI-
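
The lengths and coordinates quoted in this record can be cross-checked with a little arithmetic. The sketch below is only an illustration, not part of the database record. It assumes that the transcript is entirely coding sequence (it starts with ATG and ends with a TAA stop codon, and no UTRs are listed), and that the genomic and InterPro coordinates are 1-based and inclusive. The function names `coding_length_bp` and `span_length` are hypothetical helpers introduced here.

```python
# Values copied from the record above.
TRANSCRIPT_BP = 3267
PROTEIN_AA = 1088
GENOMIC_START, GENOMIC_END = 53126, 60168
DOMAINS = {"IPR012677": (936, 1022), "IPR009060": (517, 579)}


def coding_length_bp(protein_aa: int) -> int:
    """Length of an ORF encoding protein_aa residues plus one stop codon."""
    return 3 * (protein_aa + 1)


def span_length(start: int, end: int) -> int:
    """Length of an interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Does the protein length account for the whole transcript?
orf_matches = coding_length_bp(PROTEIN_AA) == TRANSCRIPT_BP

# Genomic footprint, and the part of it not present in the transcript
# (presumably introns, under the assumptions stated above).
genomic_bp = span_length(GENOMIC_START, GENOMIC_END)
non_transcript_bp = genomic_bp - TRANSCRIPT_BP

# Length in residues of each annotated InterPro domain.
domain_lengths = {name: span_length(*coords) for name, coords in DOMAINS.items()}

print(orf_matches, genomic_bp, non_transcript_bp, domain_lengths)
```

Under these assumptions the 1088 aa protein plus one stop codon accounts for all 3267 bp of the transcript, and the gene spans more genomic sequence than the transcript contains.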