Monarch geneset OGS2.0

DPOGS213726
Transcript: DPOGS213726-TA (3372 bp)
Protein: DPOGS213726-PA (1123 aa)
Genomic position: DPSCF300310 + 149316-157343
RNAseq coverage: 512x (Rank: top 24%)
Annotation
Heliconius: HMEL017660 | 0.0 | 74.67%
Bombyx: BGIBMGA011631-TA | 0.0 | 68.93%
Drosophila: CG32479-PA | 2e-103 | 46.42%
EBI UniRef50: UniRef50_Q17EP1 | 3e-115 | 51.84% | Putative uncharacterized protein n=1 Tax=Aedes aegypti RepID=Q17EP1_AEDAE
NCBI RefSeq: XP_001664058.1 | 6e-116 | 51.84% | hypothetical protein AaeL_AAEL003733 [Aedes aegypti]
NCBI nr blastp: gi|350408371 | 3e-115 | 51.76% | PREDICTED: hypothetical protein LOC100746491 [Bombus impatiens]
NCBI nr blastx: gi|157137881 | 4e-130 | 34.21% | hypothetical protein AaeL_AAEL003733 [Aedes aegypti]
Group
Gene Ontology: GO:0006511 | 1.6e-41 | ubiquitin-dependent protein catabolic process
               GO:0004221 | 1.6e-41 | ubiquitin thiolesterase activity
KEGG pathway: (none listed)
InterPro domain: [745-1107] IPR001394 | 1.6e-41 | Peptidase C19, ubiquitin carboxyl-terminal hydrolase 2
Orthology group: MCL15193 | Single-copy universal gene

Nucleotide sequence:

>DPOGS213726-TA
ATGGATTTAATGAAAACGGAATATGAGTTCCTAGACTTGTCAGATGTGAAAGAAGCGGAACTAAGCAGCCTGCAGTGCGCTCTCTTCGAAAAAAAGCCCAGAGTTGCGTCCACCGCACCCAACGGTTGGAACGATCCTACTGTAGACGTTTCATCTTGTTCATCGAGTACGGCGGGAGCACCGACGGTGTCGAGCAGCCTGGACAGCCTGTCCGGCGGCGAGTGCGAAATGGCGCCACACTCGCCACAGCGAGGCGCGGGTGGGCCACCACACGCGCCGCCGCCATACGTCCAGCCGCCGACATACCCGCCACACCAATGGCCCGTCGCCCCGCCAAACGTATACGTCAGCCAAGTCACAGCCAATGTGAACGTTCACGGCTACATGGGCCAGTACTACCAACCTCCGCAGCCACAATACATCCCACCTCCACAAGTTGAACGCCCGGCGAGGAATCAGAGGAGAGAGCGACGAAATAAAAGAGCCCCATCTCCTCCACCGCCGCAACCGCCTCCCTACTATGTGCCGTATTCTCAATACTATCCTGCGGCTCAAGCTCAAGGCGCGCCGCTGTACCATCTGCCTATGTATCAGCCCTTGATGTATGGGCCATATGCATATCCACCTTATTACCCTGAGTACCCTATACCAGTTGAAGGCGACGCTGGTGATAAGGGGCCCGATGAATATCAGCAAGAAGTTGTCATGGAACAAGAAGCGGTAGATGCTTATTATGCAAGCGCTCATTACGCCGCTCCGCCATACGGACCGCCAGTCGATGGAGGTGTAGAATACATGCCGCCTTTGTATCTGCCTCCACCGCATCATCCAGCCCAAATGCACATACCGCAACAACAGTTACATCAACCCGTCCCTCATCAATTCAATGTACATGCCAAAAATTTTGTACAAGGGCAAAATCAAATTAAAAACTACACACCTGACAAGACTCAGGAACCAAAACCACCTGTTGTTGCACCTTCATCTACTCCGCCAACAACGGTATCATCAACGACAGTCAGTCCAGTTGAATCGCTGCCAATCAAAGATCTCAAGATTAACAAAGGACCCGGAAGTCCAAACCAGGAAAGGTCTCCAGAAATTCCAAAAGACGCCACATCTAACTCTAAAATTTCTCCAACTCTTAAAACCGATCCATCAAAACCAGCTTGGACATCTGATAATAAACCTCAAGAACCGAGTGTAGCTCAAAATACACCGAAAACTTTTACGCCAACGCCAACTACTAATGTTCCGTCAGCAAGCGCAAAAGTCCCCCCAGTTCCTACAAAAGCTCCAAAGGGACCGACAGCACCATTCTCAGCTAACAAACAGCTTCCTAAACCACCGCTACCATCAGCTGTTCCGGTACAGCAATCGGTTACCACACCAAAAGCGCCGTTTGGTAACAGACAAAAGCGTGAAGGAAATTCAAATCGTTCACCATCCACGGAAATGCCAGAAATGGATAAACCTGCACCGATAGAGCATACCAAGCGCGAGCCTCCCCTACCACCTAGCAAGGCGCCGATGCCTATATCTATTACACTACACGCTCAAGGACCGCCGGTGATTGTGACAAACAAATCTCCTTTTGCACACTCAAGGAAGGTCGTTCCCGTTCCGGAACTACCTCCAGTACCACAGCCTCCACCTCCAGCGCCGACAGCATCAGATTTTCCTCCACCCCCCACGCCCAGAAATAGGGGAGAACCCGTTCCCCCACCAGTGGTACAACCGCAACCGCAACCAGCACCAGGAAAGTCTTGGGCTAGTCTTTTCTCAAACAAACCCAGCAGTATAACTACGACAATCGCACAAACAACTGTTGCTCCTACAGAAGAACCGTCAAGCCCAACAACTTTGACGCCGCCGGCTGCTACAAACATTCAGAAACCCGTAGCAAAAGTCCCTCCATATGATGCTTCACCGTTACAAACGAATTCAGTAGAAAAACAAATTGCACCAAGGCCTATACCGACGCCTGCACCCACTCTGTC
GTATTCAGAAAAGACTTCAGTGAATGCTGTGAGCAATGTTACTACTACTATGCCTCCGGCTAAAACAGCGACATCCCCTACTACAGAGGTTCGGGAAATGCCAATACAAAAGGAAGCTACTACACCAGCTTTACCACTACCACCTTCACCATTCAGTGATGATCCCAATTCATACAGGATGGGAGAGTTTTTGTCTAAATACACGCTGGACAATAGGCCAGTTTCTTTAACACCTCGCGGCCTTACAAATCATTCAAACTACTGCTATGTGAACGCTATACTTCAGGCTTTGATAGCTTGTCCGCCATTCTACAATATGTTAAAGGCGCTGCCTTACCAAACTAGGCGTGGGAAGTCCAGTACTCCAGTTATCGATTCTATGGTCGAGCTATGTTACGCTTTCGGTCCATTACCGAGCGCAAACCGAAGAGGCCGTGGTGAATCTGGCGCGTCGGGAGCTCCGGCCGTGCCCGCCATGTCGCCGCTAGATGGCTCGGCGGGTCTCCGAGTTTTGAGAGCGTTGCGACCCTTCCCCGGCTCACAAGAAGGTCGCCAGGAAGACGCCGAGGAATTCCTTGGATGCTTACTAAACTCGCTCAATGATGAAATGCTCGAGTTAATAAAATTAGTTGAACCTGAAGAGCCAAAAGATTTGAATGGAAAGCCAAATGGCATTGTAGCACAAGAACAACCCCCAGACGAGGACAATGATGATGACGAGTGGAAGGTGATGGGTCCTCGTAACCGTGGTGCTGTTGAACGTCGCTGGGCGGCACGTCGGACACCAGTAGCAGATATCTTCAGAGGTCGCACTCGCCTACGTCTTCACAGGGCCCCTAATCATGACGTCACAGATGCCGTACAACCATTCTTCACACTCCAACTTGATATTGAGCGTTCTACCACAGTTAAAGATGCGTTAGAACTTCTCGCCGGCAAGGATACTTTAGAAGGTGTATCGGATGCTTGGCAGCAATTGTCTCTGGAACAACTCCCTGTAGTGCTATTGCTGCATTTGAAATGTTTCCAACTGGATTCCGAGGGCCACACAGCCAAAATTGTGAAGAACATTGACTTCCCCATTGATCTCAAAATTGACCCCAAAATAATTTCATCGAAGCACACGACTAAGCAACGTCTATACAAACTGTTTGCTGTTGTGTACCATGAAGGTGTAGAGGCTGTGAAGGGACACTATCTGACGGACACCTTCCACGGACAGGTTGGATGGATTAGGTACGACGACTCCACTGTGACTCAAGTGACGGATGCCCAGGTGTTGAAACCCAAGCCGCCAAGGATGCCGTACCTGCTGATGTATCGTAGGCACGATACGCTTGCACCTAATCGTCAATCTGGCAAGGCGGAATAG

Protein sequence:

>DPOGS213726-PA
MDLMKTEYEFLDLSDVKEAELSSLQCALFEKKPRVASTAPNGWNDPTVDVSSCSSSTAGAPTVSSSLDSLSGGECEMAPHSPQRGAGGPPHAPPPYVQPPTYPPHQWPVAPPNVYVSQVTANVNVHGYMGQYYQPPQPQYIPPPQVERPARNQRRERRNKRAPSPPPPQPPPYYVPYSQYYPAAQAQGAPLYHLPMYQPLMYGPYAYPPYYPEYPIPVEGDAGDKGPDEYQQEVVMEQEAVDAYYASAHYAAPPYGPPVDGGVEYMPPLYLPPPHHPAQMHIPQQQLHQPVPHQFNVHAKNFVQGQNQIKNYTPDKTQEPKPPVVAPSSTPPTTVSSTTVSPVESLPIKDLKINKGPGSPNQERSPEIPKDATSNSKISPTLKTDPSKPAWTSDNKPQEPSVAQNTPKTFTPTPTTNVPSASAKVPPVPTKAPKGPTAPFSANKQLPKPPLPSAVPVQQSVTTPKAPFGNRQKREGNSNRSPSTEMPEMDKPAPIEHTKREPPLPPSKAPMPISITLHAQGPPVIVTNKSPFAHSRKVVPVPELPPVPQPPPPAPTASDFPPPPTPRNRGEPVPPPVVQPQPQPAPGKSWASLFSNKPSSITTTIAQTTVAPTEEPSSPTTLTPPAATNIQKPVAKVPPYDASPLQTNSVEKQIAPRPIPTPAPTLSYSEKTSVNAVSNVTTTMPPAKTATSPTTEVREMPIQKEATTPALPLPPSPFSDDPNSYRMGEFLSKYTLDNRPVSLTPRGLTNHSNYCYVNAILQALIACPPFYNMLKALPYQTRRGKSSTPVIDSMVELCYAFGPLPSANRRGRGESGASGAPAVPAMSPLDGSAGLRVLRALRPFPGSQEGRQEDAEEFLGCLLNSLNDEMLELIKLVEPEEPKDLNGKPNGIVAQEQPPDEDNDDDEWKVMGPRNRGAVERRWAARRTPVADIFRGRTRLRLHRAPNHDVTDAVQPFFTLQLDIERSTTVKDALELLAGKDTLEGVSDAWQQLSLEQLPVVLLLHLKCFQLDSEGHTAKIVKNIDFPIDLKIDPKIISSKHTTKQRLYKLFAVVYHEGVEAVKGHYLTDTFHGQVGWIRYDDSTVTQVTDAQVLKPKPPRMPYLLMYRRHDTLAPNRQSGKAE-