Monarch geneset OGS2.0

DPOGS212386
Transcript: DPOGS212386-TA, 1005 bp
Protein: DPOGS212386-PA, 334 aa
Genomic position: DPSCF300019 + 808871-811514
RNAseq coverage: 407x (Rank: top 30%)
Annotation
Heliconius: HMEL013388, 2e-125, 69.76%
Bombyx: BGIBMGA012038-TA, 2e-158, 81.14%
Drosophila: CG4933-PA, 2e-146, 68.30%
EBI UniRef50: UniRef50_Q9NPF4, 2e-142, 71.64%, Probable tRNA threonylcarbamoyladenosine biosynthesis protein OSGEP n=151 Tax=Eukaryota RepID=OSGEP_HUMAN
NCBI RefSeq: XP_971657.1, 4e-155, 75.22%, PREDICTED: similar to o-sialoglycoprotein endopeptidase [Tribolium castaneum]
NCBI nr blastp: gi|91092092, 8e-154, 75.22%, PREDICTED: similar to o-sialoglycoprotein endopeptidase [Tribolium castaneum]
NCBI nr blastx: gi|91092092, 2e-149, 75.22%, PREDICTED: similar to o-sialoglycoprotein endopeptidase [Tribolium castaneum]
Group
Gene Ontology: GO:0006508, 8.5e-90, proteolysis
    GO:0004222, 8.5e-90, metalloendopeptidase activity
KEGG pathway:
InterPro domain: [24-299] IPR000905, 8.5e-90, Peptidase M22, glycoprotease
    [6-298] IPR017861, 8.5e-87, Peptidase M22, glycoprotease, subgroup
Orthology group: MCL11768, Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS212386-TA
ATGGTCGTAGCCATAGGGTTTGAAGGCAGTGCCAATAAATTGGGAATAGGGATAGTAAGAGATGGTGAAATACTTGCCAACGTGAGACGAACCTACATTACACCGCCTGGAGAAGGATTTCTTCCAAGGGAAACAGCTGAACATCATCAGGAAAATATTCATGTAGTACTTAAAGAAGCTTTTGAAACTTCTGGAATAACTCCCGATGATATAGATGTGGTCTGTTACACGAAGGGGCCAGGTATGGGTGCGCCACTGATGGTCTGTGCTGTAGTTGCGAGAACTTGTGCTAAACTGTGGAAGAAACCCATTCTTGGAGTTAATCATTGCATAGGCCATATAGAAATGGGTCGTCTGATAACGAAAGCACACAACCCTGCTGTTTTATATGTGAGTGGAGGCAACACACAAATCATAGCGTACTCAAGACAGAGGTACAGGATATTTGGTGAAACTATTGATATAGCCGTTGGAAACTGTTTAGATCGGTTTGCTAGAGTATTGAAACTATCAAACGCACCGAGCCCTGGATACAACATTGAACAACTAGCAAAGAAAGGAAAGAAATACTTACACCTTCCATACTGTGTGAAAGGCATGGATGTTAGTTTCTCTGGTATACTTTCATACATGGAGGATAAGATCGATGATTTGTTAAAAGAATACACCCCCGAAGACCTTTGCTACTCATTACAAGAAACTGTTTTCGCAATGTTGGTGGAAATAACTGAGAGGGCGATGGCTCACTGTGGATCTGAAGAGGTTCTTCTTGTCGGTGGAGTGGGATGCAATCAAAGACTTCAAGATATGATGGAAGTTATGTGTAAAGAGAGACAAGCAAAAATATTTGCTACAGATGAAAGATTTTGTATTGACAACGGTGTGATGATAGCGTACGCCGGATCACTGGCATACAGCAGCGGAGCCAGGATGGAATTCAAAGACACCACAATAACACAGAGATATAGAACTGATGACGTCTTAGTGACCTGGAGAGATGATTGA

Protein sequence:

>DPOGS212386-PA
MVVAIGFEGSANKLGIGIVRDGEILANVRRTYITPPGEGFLPRETAEHHQENIHVVLKEAFETSGITPDDIDVVCYTKGPGMGAPLMVCAVVARTCAKLWKKPILGVNHCIGHIEMGRLITKAHNPAVLYVSGGNTQIIAYSRQRYRIFGETIDIAVGNCLDRFARVLKLSNAPSPGYNIEQLAKKGKKYLHLPYCVKGMDVSFSGILSYMEDKIDDLLKEYTPEDLCYSLQETVFAMLVEITERAMAHCGSEEVLLVGGVGCNQRLQDMMEVMCKERQAKIFATDERFCIDNGVMIAYAGSLAYSSGARMEFKDTTITQRYRTDDVLVTWRDD-
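
The listed lengths (1005 bp transcript, 334 aa protein) fit a complete open reading frame: 1005 / 3 = 335 codons, of which the last is the stop codon (shown as "-" at the end of the protein sequence). The minimal sketch below illustrates that arithmetic. It assumes the transcript record contains only coding sequence (no UTRs) and uses the standard stop codons; the function names are hypothetical and not part of any MonarchBase tooling.

```python
# Assumption: the transcript is coding sequence only, from ATG through the stop codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code


def expected_protein_length(transcript_bp: int) -> int:
    """Residue count for a complete ORF of this length, excluding the stop codon."""
    if transcript_bp % 3 != 0:
        raise ValueError("length is not a multiple of 3")
    return transcript_bp // 3 - 1


def looks_like_complete_orf(cds: str) -> bool:
    """True if the sequence starts with ATG, ends with a stop codon, and is in frame."""
    cds = cds.upper()
    return len(cds) % 3 == 0 and cds.startswith("ATG") and cds[-3:] in STOP_CODONS


print(expected_protein_length(1005))  # 334
```

The same check can be run on the full nucleotide sequence by passing it to `looks_like_complete_orf` and comparing `expected_protein_length(len(sequence))` with the protein length given in the record.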