Monarch geneset OGS2.0

DPOGS201130
Transcript        DPOGS201130-TA  1725 bp
Protein           DPOGS201130-PA  574 aa
Genomic position  DPSCF300065 - 645033-654833
RNAseq coverage   3210x (Rank: top 4%)
Annotation
Heliconius      HMEL014977        3e-78   64.78%
Bombyx          BGIBMGA003934-TA  2e-64   56.15%
Drosophila      CG6972-PA         1e-70   46.74%
EBI UniRef50    UniRef50_E2ALF6   2e-124  44.73%  UPF0326 protein FAM152B n=12 Tax=Neoptera RepID=E2ALF6_CAMFO
NCBI RefSeq     XP_393278.3       5e-127  46.73%  PREDICTED: similar to CG6972-PA [Apis mellifera]
NCBI nr blastp  gi|307175901      8e-124  44.73%  UPF0326 protein FAM152B [Camponotus floridanus]
NCBI nr blastx  gi|157137449      1e-129  44.69%  hypothetical protein AaeL_AAEL013800 [Aedes aegypti]
Group
KEGG pathway     tbr:Tb09.160.2050  5e-22
                 K01227 (E3.2.1.96) maps-> Other glycan degradation
InterPro domain  [8-135] IPR008580  4.9e-34  Domain of unknown function DUF862, eukaryotic
Orthology group  MCL11626  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201130-TA
ATGTCTACTGAGGAGGGGGAGCCTGTGGACCTCTACATCTATGACCTCACAAAGGGTCTCGCTTCATTACTGTCACCAACTATACTTGGGCGCCAGGTGGAGGGCGTCTGGCACACGGCGGTGGTGGTGTTCGGCCGGGAGTACTTCTACGGAGGCGGCGGCGTCACCAGTTGTGCGCCGGGCAGCACCCAGCTTGGGGCTCCGTACCAGGTGGAGCGCCTCGGGACCACGTACGTGCCCTTCCCCGTGTTCCAGGAGTACATCCAGGGGCTCGCTACTAGCTCCTACACAGGCCAGGAGTACCGTCTGCTGGAGCACAACTGCAACCACTTCAGCGACGAGGTGGCTCAGTTCGTGTGCGGAGCTCGCGTCCCCAAGCACATCGTTTCTCAGGCCGAGCGGGACCTGCCCCCGCCGCTGAGGGTGGCGCTGCAAGCCGCCCTCGACCACCTCGTGCCGGACGGAGCGCCCGTCTACGGCGGAGTGAGACACAGTCGCCGGGACAGCCCCGACTACCTCACGCTCAACGACCAGATCGAGGAGGCCAGAGTGGCGTCCCAGGAGCTGGACGCGAGGCGGAGCACGCTCGCGGAAAAGTTGGCGAGGAAGGAGAGGCGGAAGGAAAAGAAGAGGAGAAAACAGATGGGAGGAGATCAGTCAGGGGAAGAGGGGGGCGGAGTCGAACTTGGACCCGAGGACACGGAGAGGGGAGGCGAGATGTCTGAAGCGGTGGAGGTTTTGGAGGCTCGGCCCGGACCCAGCACGCCGCCCCGCGAGGATGATCGCCCGCGGCCCAAAGACCCTCCCATACTGTTCAAGGACATAGACGGCGTGGCGGAGTACGAGGCGCTCGTGAAAGCTCTGGAGGGAGTCGACCTCAACGAGGAGGAGCGTCGCAGCTTGGACGAGTTACAGCAATACCTGGTGGCCGGGGAGGGCTCCTGGGTGCTCGGGGATGACTTCCTCGCCTTCGTCGGTCGCGTGCTGTCAGACTCTTGCTTGGCGTCGGCGGCGCGCGTGTCGATGCTCCGCTGCCTGTGCTGCGCCGCGCTTCGTGAGGACGTGTCGCTCGTGCTGCATCAGGACCGTCGCCACCACGCGCTGCTCTCATACGCGTACAACATCGACCGCCTCCCGGTGGACGAGCAGCTGGCGCTCCTTCTGTTCATGGTGAACCTGTTCTCGGGTCCGTCGTCGTCGGAGTGGTTGCTGTACATCAGCGAGTGGTCGGCGGGCGGGCCTCCTCTGTCCAACATCCGCGTCACCACCAAGGTGTGCGTGCACGGCGTGCTGGCCCCCGAGCCGGCGCTGAGAGACGCGGGCACCGCGCTGCTGTACAACGTAGCCACCAAGGAGGTAAAGACTGTGGTGTTCGACGAGGTGTGTGTAGAGCTGTGCATGGCGGCGCTCCAGCTGTGCTCGTCTGCTCCGGCCGAGGAGCTCCTGTGGCGCGCGCTGGCCTCTCTCGCCCGCCTCGCGGAACACTCACACGACGTGCCGCAACTCGTCGCACTCGTCGGACCTGACCCCAGCGCCTTCAGGTACACACACATCAACACACACAAACTCCTTCACCACTTGTCTACTCGACTGTACATGACGCGGCGCTCTGTGCAGGGGGACCAGCCCTCGAGTGGACGAGCAGGTGGACCTCATCACACAGAGAGTAGCGGCGAGGGGGTAGAAGGGGAGAGGGAGGAGATGGGAGGCGACCTGTACATATAG

Protein sequence:

>DPOGS201130-PA
MSTEEGEPVDLYIYDLTKGLASLLSPTILGRQVEGVWHTAVVVFGREYFYGGGGVTSCAPGSTQLGAPYQVERLGTTYVPFPVFQEYIQGLATSSYTGQEYRLLEHNCNHFSDEVAQFVCGARVPKHIVSQAERDLPPPLRVALQAALDHLVPDGAPVYGGVRHSRRDSPDYLTLNDQIEEARVASQELDARRSTLAEKLARKERRKEKKRRKQMGGDQSGEEGGGVELGPEDTERGGEMSEAVEVLEARPGPSTPPREDDRPRPKDPPILFKDIDGVAEYEALVKALEGVDLNEEERRSLDELQQYLVAGEGSWVLGDDFLAFVGRVLSDSCLASAARVSMLRCLCCAALREDVSLVLHQDRRHHALLSYAYNIDRLPVDEQLALLLFMVNLFSGPSSSEWLLYISEWSAGGPPLSNIRVTTKVCVHGVLAPEPALRDAGTALLYNVATKEVKTVVFDEVCVELCMAALQLCSSAPAEELLWRALASLARLAEHSHDVPQLVALVGPDPSAFRYTHINTHKLLHHLSTRLYMTRRSVQGDQPSSGRAGGPHHTESSGEGVEGEREEMGGDLYI-
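
The protein record appears to be a direct translation of the transcript, with the trailing "-" standing for the stop codon: 574 residues plus one stop codon give 3 x 575 = 1725 bp, matching the transcript length listed above. The following is a minimal sketch of how that correspondence could be checked, assuming the standard nuclear genetic code applies to this gene. The function name `translate` and the `stop_symbol` parameter are illustrative choices, not part of the database.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as `stop_symbol`; "-" is assumed here because
    the protein record above ends with that character.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


if __name__ == "__main__":
    # First 24 and last 18 nucleotides of DPOGS201130-TA, copied from the record.
    print(translate("ATGTCTACTGAGGAGGGGGAGCCT"))  # MSTEEGEP
    print(translate("GGCGACCTGTACATATAG"))        # GDLYI-
```

Passing the full DPOGS201130-TA sequence to `translate` and comparing the result with the DPOGS201130-PA string would confirm the whole record rather than only its two ends.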