Monarch geneset OGS2.0

DPOGS214351
Transcript: DPOGS214351-TA, 1869 bp
Protein: DPOGS214351-PA, 622 aa
Genomic position: DPSCF300020 + 440405-445465
RNAseq coverage: 557x (Rank: top 23%)
Annotation:
  Heliconius       HMEL020041        0.0  80.81%
  Bombyx           BGIBMGA003971-TA  0.0  97.27%
  Drosophila       Hil-PA            0.0  77.13%
  EBI UniRef50     UniRef50_Q0E908   0.0  77.13%  Hillarin, isoform A n=30 Tax=Pancrustacea RepID=Q0E908_DROME
  NCBI RefSeq      XP_976023.1       0.0  82.32%  PREDICTED: similar to AGAP005020-PA isoform 2 [Tribolium castaneum]
  NCBI nr blastp   gi|91086663       0.0  82.32%  PREDICTED: similar to AGAP005020-PA isoform 2 [Tribolium castaneum]
  NCBI nr blastx   gi|270009747      0.0  82.32%  hypothetical protein TcasGA2_TC009044 [Tribolium castaneum]
Group:
KEGG pathway:
InterPro domain:
  [1-622]    IPR013998  4.8e-186  Nebulin
  [228-296]  IPR002931  8.7e-09   Transglutaminase-like
Orthology group: MCL13441, Single-copy universal gene

Nucleotide sequence:

>DPOGS214351-TA
ATGAAACACAGGCAGGAAGAAGATGATTTGTATAGAAAGTTTTCTAAACACAGAGAGGAAGAAAATCGTAGAATACGAGAAGAAATACAGGACGAGTGGGAGAGGGAGTTAGAAAGATTAACAAACCGATTCCAACAAGAGATGCAAGTGAAGAAACGAAGACCAGAATCTGAGATAGGAGCCCTGACACTTCGACATCAACAGGAGAGAGCTGATCTGGAGAAGAATATGACTCTCCGGAGAGACAAGAAGAAGGAGAGCTTGACTAGAAAGATGTTAGAACACGAGAGGGCTGCTACTGCAGCACTGGTTGAAAAGCAAAGTCACGAGATGATGGAACTGATCCAGGAGCGTAGATCTGAATACATGGCAGCATCCTCCATATTCCTGGACGGAGAAGAAGCACCCCCTTATCCTTCTCGTGCTCCGCCCCCTTTGCCACCGCTTGTATCCAAATTCCACATATACACAGATCCTGCGGAATTCGCGGATGTTGATAAGATTGCTATTTCCGTAGCGCAAGAGGATCAAAAAACTTTTACCGATTTGGTCCGACAACTCGTGGGTAGATGTGCGAGTGATGTCGAGAAAGCAAGAACCATTTTCCGCTGGATAACTGTGAAGAACCTCAACAACATACAGTTTGACGAGAACCTCCGAGGGGATTCCCCCCTGGGATTACTTAGAGGCATCAAGCACGGCACCGAGAGTTATCACGTCCTGTTTAAGAGACTGTGCAGTTATGCTGGTCTCCACTGCGTGGTAATCAAGGGGTACAGTAAATCAGCTGGCTACCAGCCTGGAGTACGTTTCGAAGACAATCGCTTCCGCAACTCTTGGAACGCGGTGTACGTGGCCGGGGCCTGGCGCTTTGTGCAATGCAACTGGGGGGCGAGACACCTTGTTAACGCTAAAGATGCTCCCAAGCCAGGAAACAGAGGAAAGAGCGACAGCTTGAGATATGAATACGACGATCACTATTTCCTGACGGATCCTCGCGAGTTCATCTACGAGTTCTACCCGCTTCAGCCTGACTGGCAGCTGTTGAAGACGCCCATCACTCTACACGATTTCGAGGAACTTCCCTTCGTGAGGTCGCTGTTCTTTAGATACGGACTCTACTTCAGCGATCCCAACACCAAAGCTGTTATGTACACCGACTCTACTGGTGCGGCGACTATGCGTATAGCCATGCCGGCACACATGCAGAGCTCGTTGATCTTCCACTATAACCTTAAGTTCTACGACACGGAGGGCGACGGTTTTGACGGGGTCAGCCTTAAGCGGTTCGTCATGCAGTCTGTGGTTGGTAATGTTGTTTCGTTCCGTGTACACGCGCCCTGTTCCGGGGCCTTTCTCCTGGACATTTTCGCGAACGCCGTCACACCCAGGGAATACCTCACCGGCGAGCCCATGAAATTCAAAAGCGTTTGCAAATTTAAGATTTGCTGCGCCGAACTACAAACAGTAATGGTGCCGCTACCAGATTGTGCTAGCGGTGAGTGGGGGCCGACTAAAGCGACCAGACTCTTCGGCCTCGTCCCCATCACGCACCAGGAAGCACTTGTATTCGCCGGCAGAGAACTAGAGATTCAGTTCCGAATGTCGCGCCCTCTAGCGGACTTTATGGCGACTTTACACAAAAATGGCATCGATGAGAAACGGCTGTCCAAATACGTGCAACAAAACGTCTCGGACGATATCGTCAGCTTTTACATAACATTCCCAGAGGAAGGTCAATACGGTTTGGACATATACACTCGCGAGCGCGGGGGACCCACGGCCATACACAACGGCTCCAGCGAGAAGGAGAAACACCTACTTACACACTGCTGCAAATATCTCATCAACAGCAGTAAACGGAACTAA

Protein sequence:

>DPOGS214351-PA
MKHRQEEDDLYRKFSKHREEENRRIREEIQDEWERELERLTNRFQQEMQVKKRRPESEIGALTLRHQQERADLEKNMTLRRDKKKESLTRKMLEHERAATAALVEKQSHEMMELIQERRSEYMAASSIFLDGEEAPPYPSRAPPPLPPLVSKFHIYTDPAEFADVDKIAISVAQEDQKTFTDLVRQLVGRCASDVEKARTIFRWITVKNLNNIQFDENLRGDSPLGLLRGIKHGTESYHVLFKRLCSYAGLHCVVIKGYSKSAGYQPGVRFEDNRFRNSWNAVYVAGAWRFVQCNWGARHLVNAKDAPKPGNRGKSDSLRYEYDDHYFLTDPREFIYEFYPLQPDWQLLKTPITLHDFEELPFVRSLFFRYGLYFSDPNTKAVMYTDSTGAATMRIAMPAHMQSSLIFHYNLKFYDTEGDGFDGVSLKRFVMQSVVGNVVSFRVHAPCSGAFLLDIFANAVTPREYLTGEPMKFKSVCKFKICCAELQTVMVPLPDCASGEWGPTKATRLFGLVPITHQEALVFAGRELEIQFRMSRPLADFMATLHKNGIDEKRLSKYVQQNVSDDIVSFYITFPEEGQYGLDIYTRERGGPTAIHNGSSEKEKHLLTHCCKYLINSSKRN-