Monarch geneset OGS2.0

DPOGS210192
Transcript: DPOGS210192-TA, 3171 bp
Protein: DPOGS210192-PA, 1056 aa
Genomic position: DPSCF300283 - 33473-61568
RNAseq coverage: 632x (Rank: top 20%)
Annotation
Heliconius       HMEL009515        7e-68  37.38%
Bombyx           BGIBMGA013794-TA  4e-71  70.37%
Drosophila       enc-PE            2e-39  31.47%
EBI UniRef50     UniRef50_Q16WL5   3e-52  35.23%  Encore protein n=1 Tax=Aedes aegypti RepID=Q16WL5_AEDAE
NCBI RefSeq      XP_001653717.1    6e-53  35.23%  encore protein [Aedes aegypti]
NCBI nr blastp   gi|157120944      1e-51  35.23%  encore protein [Aedes aegypti]
NCBI nr blastx   gi|189238891      1e-53  30.34%  PREDICTED: similar to encore protein [Tribolium castaneum]
Group
KEGG pathway     aag:AaeL_AAEL009189  2e-52
                 K02360 (ENC) maps-> Dorso-ventral axis formation
InterPro domain  [30-124]  IPR013783  1.1e-08  Immunoglobulin-like fold
                 [34-136]  IPR013098  9.2e-06  Immunoglobulin I-set
Orthology group 
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210192-TA
ATGGTCGGACGGAATATGTTGTTGTGCATTTATCTGAGAGAGAAAAGAAAAATATATTATGAATGTAATATACAAAGAGAAAAAGGACCACCATACATAAGAACATTACCCCCGATAAAAGTACAAAGCGGTGATTCTTTGAAATTAAGATGTCCCTTCTATGGATTCCCAATAAGTAAATTAGAATGGGAACACAGAGGCAAAAAGATCATTAGCACAATGCTGCCACAGCACGCGAGATATAAAAGGACGGAAAACAGATTCTCGGATAAACATAACATAATAGAACGCAGGGAGAGACAGGTGATTGAGACTAGTGAGGATGGATTGCTTAATATACAGAGGGTCGTCAAAGAGGAGAACGGGGAAATGTATACATGTATAGTTTACAGTCCTTCTGGAGAGATGGCTAGGAGATCCTTTGAGATCCAGGTGGTTGAAGCCCCGGAACTGGATGAACTGAGAGTTGGCTCCGGCTTGAAGGAGGGTCAAATAGTACAAATAACCTGCAACATTATAGGTGGAGATCCCCCTATATTCTTCTCCTGGTTGAAAGACGGTATGAAAATACCCGCCAGTTTGAAGATTAGTACTGGAATGGAGCAACCGGAAGTCGAGAAGGGCTGCGGCAGCTCGGAGTACGGAGCACAACTTACAAGAAATAGAAGTTTCAAATCGAAGCAGCTAGTCCGCAGTCAGGCGATACGAGAAGCAACGTCGCCGCCACGAACAGCGTCACCGTTGGCGAGCGACAAACCAAACGTGACCGGAAACGCGGTCGAGAAGAGGTCGAACAGCGAGACCTCCAGTCACAGCGACGGAGGTTCCGTAGTCAGCAATGACAGCAAACAACCCGTTGAAATACAGATAACATCCAACTCCTGGGAGGACGGTAACAGTGACCAACGCCAGCGGACCCGCCGCTGGCTGGGTCACCACCACTCAGACTCCGCGAGGGACCTTCTGACCCACGGGCCGCGGGTGGCCTGCGTCTGTGGGGCCTGCGAGTGTCCGCACTGCAGAAGTTACTATGATGTGATATCGTCTTATGTTCTGCTCACGTCTTCAGAGGGCGCAGAAGAAAACACGCGTGTCCGACGAAACAGGACTCTGGCATCGTTTGTTCTGACGATTGTCCTGATTGCACCGACTCAGAACATGCGGGTCCTAACGGTGACGGCGCTAGGATGTCCGGGAGTCTGGACAGCGACGATCAGGCGTATTATTGCAGGTATTAAAAGAACTCCATCTGGATGTTTCCAGCACCTGGTCCACGTTGTAACCGGTCATCCTTATATTTCCAGATGCATCGATCGCAAGGACAAGCCGAAGAGTATGTGCATAGGTGGCAATGAATTTGAGGATAAGACGGATCTATCAGGTCCGGAGCTGGTGGCCTTCATCAAAGAAACCCTGAACAAGAACCCAAGAGACCGCGCCACTCTGTTGAGGATTGAAAAGGAACTGCATGGTCTTGTAACAGACAACAGTGACCGAGTCACGGCGGCGTCTGTCCTCACGTGCCCTGGTCACCAGTCCATCACTAACCATCATCATCTCATCATCCTTATTATGACTACTAGCGCCCTCCCCAGTCGTTGCATCGTCCGCTTCCCCGTGATGACGTCATACGGCCGTATGCTGGTCCACCGCTGCGCAGCTCTGTTCCAACTGGCGCACCACCTCGACCACTCGAACAAGAACTCCGTGCTGGTGTCCAAAAGCGGCTCTTATCACACTGTGACAGGCACGTGCGGTGGCCGCCTGCCCTGTACCTCCTTCCGTGAGTGGTGCACCACCGTGTTCCCGAGGTCACCCACACACGAGGACACGCTCGCCAAGTCCATTCTAAAGCGTTGTTCGGGACCACCTGGTACAGCTAGCTCGGCAGCTGGCAGGAGCAAGTCGTTGGAACAACGAGAGAGGGAATACGAGAGAGTCAGGAGAAGAATTTTCAGCACGGATAACTGCACCCAGGACGAGACGCAATGGCCCTGGCT
GACTTCCGGACCCGTTAAGCTGCTGACACCGGACACTGGCAGGAACAAGTTATTGAAGGTGCACTCCCTGGAGGCGAAGTCCCCGGGCCGGGGGGTGGTGTCGAAGAGTCACAGCTTCGGGGGATACACGGACCCCCAGCAGAGAGTGCTCAGCAGACAGGGTGACCTGGCGTCATCCAGCTGGCGTCTCTCTCCGTCCAGCTCCGGGTACAAGACCCTCAGTTTGCGGAGCACGGATTCAGTCACACCATCACCCACAGGCGGTGCGAGTCCTGAGCCGGGGCCCCCTTCCCTGTGCGTGCCGGGGACCTCGGGGGCCCTCGTGTGGGCTGTGACCGACATGGCCGCGGTCCCGCCCGGGGCACTAGTCATACATCCGCAGACAGGCCGGCCGCTCACCAACCCGGACGGCAGCCTGTACCACTTCGACCCCGAGAACCCGCCGCGCCTGTACGCGGACCGGGGGGAGGTAACCTGCTCGCCCCAGCTACACCACGCGAGGGTCGACGGCAGCGCGGAGAAGAGACGCGGGAAACTGGAAAAACAGAACTCCTTCATAGATAACGAATGCGACTTCGATTCCAAGCGGGACAAGCGCTGTGACTGCGCTCCAGACAACGACGGAGGCCAGCGGAAACCGAAGACGCCGGCAGCGGCGAGCCCGAACAACACGCGGACCGCGCACGACGAGCAGGCGGCGCCGGCCGCGCCCGCGCCCAACGGAGACGTGGACCAAAGCGAAGTCGCTGAAATAAAACAGGCTCTCGAGAATATTAAAATAACACAGAAATCACCGGTCAAAGAAAAGAAAGACGTCCAAATCGAACCCGTCAATCAGATTCAGTCTCCGAGGTACGATGCGGCCAATCAGGTCGCGTCTTCGCCGAGGTTCGAGTCGCCGGCCAGTCAGACGGCCTCGAACCAGCCGCAGAGGTTCGAAACGGCCAATCAGATGCAGCAGATTCAGAGATTCGATTCACCGGCCAATCACGTGCAGGCTGTCCAGAGATTCGATTCCCCCGCGAATAACAGACAGTACGACAGATATGACGTCCCCAGTAAGGCTTTGGAGAATAGGAATTTTGACAATCAGAGGAAGTTTCTAGAAGAGGCTTATCACGAGAGCTACGTCCCGTATAAGAGTGAGGAGGTGAGCTCGTGTGCGCTCTAA

Protein sequence:

>DPOGS210192-PA
MVGRNMLLCIYLREKRKIYYECNIQREKGPPYIRTLPPIKVQSGDSLKLRCPFYGFPISKLEWEHRGKKIISTMLPQHARYKRTENRFSDKHNIIERRERQVIETSEDGLLNIQRVVKEENGEMYTCIVYSPSGEMARRSFEIQVVEAPELDELRVGSGLKEGQIVQITCNIIGGDPPIFFSWLKDGMKIPASLKISTGMEQPEVEKGCGSSEYGAQLTRNRSFKSKQLVRSQAIREATSPPRTASPLASDKPNVTGNAVEKRSNSETSSHSDGGSVVSNDSKQPVEIQITSNSWEDGNSDQRQRTRRWLGHHHSDSARDLLTHGPRVACVCGACECPHCRSYYDVISSYVLLTSSEGAEENTRVRRNRTLASFVLTIVLIAPTQNMRVLTVTALGCPGVWTATIRRIIAGIKRTPSGCFQHLVHVVTGHPYISRCIDRKDKPKSMCIGGNEFEDKTDLSGPELVAFIKETLNKNPRDRATLLRIEKELHGLVTDNSDRVTAASVLTCPGHQSITNHHHLIILIMTTSALPSRCIVRFPVMTSYGRMLVHRCAALFQLAHHLDHSNKNSVLVSKSGSYHTVTGTCGGRLPCTSFREWCTTVFPRSPTHEDTLAKSILKRCSGPPGTASSAAGRSKSLEQREREYERVRRRIFSTDNCTQDETQWPWLTSGPVKLLTPDTGRNKLLKVHSLEAKSPGRGVVSKSHSFGGYTDPQQRVLSRQGDLASSSWRLSPSSSGYKTLSLRSTDSVTPSPTGGASPEPGPPSLCVPGTSGALVWAVTDMAAVPPGALVIHPQTGRPLTNPDGSLYHFDPENPPRLYADRGEVTCSPQLHHARVDGSAEKRRGKLEKQNSFIDNECDFDSKRDKRCDCAPDNDGGQRKPKTPAAASPNNTRTAHDEQAAPAAPAPNGDVDQSEVAEIKQALENIKITQKSPVKEKKDVQIEPVNQIQSPRYDAANQVASSPRFESPASQTASNQPQRFETANQMQQIQRFDSPANHVQAVQRFDSPANNRQYDRYDVPSKALENRNFDNQRKFLEEAYHESYVPYKSEEVSSCAL-
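
The transcript and protein lengths listed for this entry (3171 bp and 1056 aa) can be cross-checked with a little arithmetic: a complete coding sequence should be three times the protein length plus one stop codon. The sketch below is only an illustration of that check. The helper names `fasta_length` and `cds_matches_protein` are hypothetical and not part of any database tooling, and it assumes the trailing "-" in the protein record marks the stop codon.

```python
def fasta_length(record: str) -> int:
    """Count residues in a single FASTA record.

    Header lines (starting with '>') and line breaks are ignored.
    A trailing '-' or '*' is assumed to mark the stop codon and is not counted.
    """
    seq = "".join(
        line.strip() for line in record.splitlines() if not line.startswith(">")
    )
    return len(seq.rstrip("-*"))


def cds_matches_protein(transcript_bp: int, protein_aa: int) -> bool:
    """Return True if the transcript length equals 3 * (protein length + 1 stop codon)."""
    return transcript_bp == 3 * (protein_aa + 1)


# Lengths as listed in this entry: 3171 bp transcript, 1056 aa protein.
print(cds_matches_protein(3171, 1056))
```

To apply the same check to the sequences themselves, pass each FASTA record above to `fasta_length` and feed the two results to `cds_matches_protein`.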