Monarch geneset OGS2.0

DPOGS203940
Transcript        DPOGS203940-TA   2805 bp
Protein           DPOGS203940-PA   934 aa
Genomic position  DPSCF300005 - 171616-198638
RNAseq coverage   10x (Rank: top 84%)
Annotation
  Heliconius      HMEL012088        0.0     73.02%
  Bombyx          BGIBMGA000748-TA  0.0     77.09%
  Drosophila      CG2930-PB         7e-88   43.41%
  EBI UniRef50    UniRef50_E0VT65   5e-164  45.48%  Oligopeptide transporter, putative n=2 Tax=Pediculus humanus corporis RepID=E0VT65_PEDHC
  NCBI RefSeq     XP_002429309.1    1e-164  45.48%  Oligopeptide transporter, putative [Pediculus humanus corporis]
  NCBI nr blastp  gi|242017667      2e-163  45.48%  Oligopeptide transporter, putative [Pediculus humanus corporis]
  NCBI nr blastx  gi|242017667      8e-159  45.48%  Oligopeptide transporter, putative [Pediculus humanus corporis]
Group
Gene Ontology
  GO:0016020  1.6e-72  membrane
  GO:0006857  1.6e-72  oligopeptide transport
  GO:0005215  1.6e-72  transporter activity
  GO:0003824  8.7e-16  catalytic activity
  GO:0004252  1.1e-08  serine-type endopeptidase activity
  GO:0006508  1.1e-08  proteolysis
KEGG pathway 
InterPro domain
  [21-567]   IPR000109  1.6e-72  Oligopeptide transporter
  [704-797]  IPR009003  8.7e-16  Peptidase cysteine/serine, trypsin-like
  [6-304]    IPR016196  8.7e-09  Major facilitator superfamily domain, general substrate transporter
  [703-792]  IPR001254  1.1e-08  Peptidase S1/S6, chymotrypsin/Hap
Orthology group   MCL22608   Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203940-TA
ATGATGTGTTATACCATGCCGCTGATGGGAGCCGTATTGGCAGATAATTTTATTGGTAGATATCGGGTAATTCTTTATTTTTCTGTAATTTATTTTATTGGTACAATATTAATATGCTGTTCTGCGATCCCGCCGCTTTGTCTTCCGTCAACGTCAACATCTATGATCGGTTTGGCTCTTATAGCAACTGGGACTGGTGGCATTAAACCATGTGTTGCGGCATTTGGTGGCGATCAGTTTCGCTTGCCTAAAGATACGGAAAGGTTGCAGCGTTTTTTCTCAACTTTTTACTGCACCGTTAACTTTGGAGGATTTATGGGAATGGTGGTGACTCCGGCTCTTCGCAGAAGCTACATGTGCTTTGGGGATGATGCTTGTTACGCTCTTGGATTTGGTTTTCCGGCTGTTTTAGTTCTCTTATCGATCATTATATTTGTGATGGGAAAACCATGGTATAGAATAAAAAAACCACGGGATAATATTACCATAAAATTTGTTTCTTGTGCCTGGTACGCTTTCAAAAGGCGTTTAAGATACGATAAAAATATAGATGGTCCCCCTCACAAGCATTGGTTGGATTATTCTATCGATAAGTATGGTGCCAAGCTAGTTTCTGATATGAAGGTGGTGTTCTCTATTTTGTATTTATATCTCCCTGTGCCAATTTATTGGTCTTTATTTGATCAACAAGGTTCGCGTTGGACTTTCCAAGCTTCGAGATTAAAAAGCGAAGTACTCGGAATAACTCTGATGCCTGATCAATTGCAAGTAATGAACCCTGCCATGGTGTTACTGATGATCCCCGTATGTCCCGGTGGAGCCTGGCCCTTATTGCCAGATCTGGGGCCTTTACATAAGATGTTTGCTGGCGGCATTCTGGCTGCAATGGCCTTTGTGTGTGCTGGTATTTTGCAAATAAGTATTGAGCGTTCTACACTTCAAGTACCAGGACACCAGCAGACTGGGATAATCATTATGAACACTCTCGATTGTCCTGTTGATATCGCCGTCAGTGGAGAGGGTATGGCTCGTGTTGATATTCACGGTACAGCGCTGATGACACCGCTACCAAGTCGGACTTATGGTATTGTAGTGTCCTGCCCTCGTGATTGTGCCGGCAGAATAATGAAGAAACATAATTTTGAAACTAAATTAAAGACCGTTTCCGAAATGTTTATGCCAATAATTGTCGGTCAGAATTCAGATGATGAAATATCCTTATATTTTATGGATCCCGGACCGTACATAAAATCAATGACAGGGAAACCGAAATTAAAGGTCGTGTATGTTGGAGATACAGGTCCTAATAAGAATGTTTCGATATGCGTTGAAACGGAAAAACGTTTGAATGATATTTATTACGTATCAGACTCCCCTATCGATCACATCGGAGAGTCGGCTTACATGTGCCTTCAAACAGGAAAGTTTACTTGGCGTGCTGATAGTACAGGGGGTGAAGTATCGGGTGCGGGTGAAGGCTATTTGCGAACCGGAGGAGTTTATGTTTTATGTCTGAGGGAGCGACTAGGCCGGTTGGATACAGCTGTGCTCCATGCACCAAATCCTCCTAATGAACTGCACCTGGCGTGGATCGTTCCCCAATACTTGCTTGTGTCTATGGCGGAGATCATGTTCGCGGTGTCCGGTCTTGAGTTTTCTTTTACACAAGCTCCGAAAAGTATGAAGACGATTACAATCGCAACTTGGTACATGTCAGTGGCTATTGGTAATCTAATTGTCATTCTAGTGGCTCAAACAAAAGTTTTTGAATCAAGGGCTACCGAATTCTTTGTATATGCAGGCGTGCTAACGGGCGCTATGTTGATATTCCTATGGATGGCACAAGGCTATTGTTCTAGAACAATGGAAGAAGACGGATCCTCCACTGAGAGCCGCCCTTTATTACAAAAGCATTCCAGGGTAATATCAATACACTCTATGTCTACAAGAAGTATAGCGTACAGTGCTCCTGGAAGTAGTGCAGCATCTTGTTTGGGGCT
CATGGAGAAAACAGAAGCTGCTGCGTGGCCTTCTCATGGCCTTATACACGGAAAAGAGTTGTGTCTGCCACGCGGGTCACAATATTATCCAGGCGAGATGTTGGACCCTATGCTGAGAACTATTTCATTAACTATTACTTCACCTAATTATTATTGCCAGGGATACAGAAAGCACGAGACTCCAGTCCGCAGAGGTATGTTTTGTACTGGCATGGCTCGTGAGGATCATCCAACTTTTCCCTGTCTGGCAGTGCCCGGGGCCCCGTTAGTGGTGAAGGGAAAATTGGCGGCCATCTTGTCTTGGGGCTTTGGATGTGGTTATCAGAACGATCTCCCTCTCGTATATACTGACATCAAATATTACATAAATTGGATAACTCAAAACGTGGCAAAAATACGAAAGATTGCTAAAAAAGATTTTAAGCCACTATTCGACGCAACTAAATCATGGGTTCTCTTGGAATGGTTTACAAAATCAAGAATAGTGAAACCAAAACGCCACTATCACAGAAACAGGGAGCTACAACTAATTAATCTAGACCAAAGTCTAAGTAACCTCCGAGGGCAGATTTTTGATTTAAGAGATTTTATTTTTAACAGACAATATAAAGAAAAAAAGCTGACAATGTATAAAGAAATCCAAAATTCAGCAAAAGAAGGAAATAATACAAAACATCTTATAGAAAAAATGAAACTTCTTTCATATACGGGAAAACCACTTCCTTTCTTCAGTAATAAGTCTTTATTAGAAATAGCTTCCAATTACGATGACTTCAGTAGTTCTAATGAAGATGAAAAGAATTAA

Protein sequence:

>DPOGS203940-PA
MMCYTMPLMGAVLADNFIGRYRVILYFSVIYFIGTILICCSAIPPLCLPSTSTSMIGLALIATGTGGIKPCVAAFGGDQFRLPKDTERLQRFFSTFYCTVNFGGFMGMVVTPALRRSYMCFGDDACYALGFGFPAVLVLLSIIIFVMGKPWYRIKKPRDNITIKFVSCAWYAFKRRLRYDKNIDGPPHKHWLDYSIDKYGAKLVSDMKVVFSILYLYLPVPIYWSLFDQQGSRWTFQASRLKSEVLGITLMPDQLQVMNPAMVLLMIPVCPGGAWPLLPDLGPLHKMFAGGILAAMAFVCAGILQISIERSTLQVPGHQQTGIIIMNTLDCPVDIAVSGEGMARVDIHGTALMTPLPSRTYGIVVSCPRDCAGRIMKKHNFETKLKTVSEMFMPIIVGQNSDDEISLYFMDPGPYIKSMTGKPKLKVVYVGDTGPNKNVSICVETEKRLNDIYYVSDSPIDHIGESAYMCLQTGKFTWRADSTGGEVSGAGEGYLRTGGVYVLCLRERLGRLDTAVLHAPNPPNELHLAWIVPQYLLVSMAEIMFAVSGLEFSFTQAPKSMKTITIATWYMSVAIGNLIVILVAQTKVFESRATEFFVYAGVLTGAMLIFLWMAQGYCSRTMEEDGSSTESRPLLQKHSRVISIHSMSTRSIAYSAPGSSAASCLGLMEKTEAAAWPSHGLIHGKELCLPRGSQYYPGEMLDPMLRTISLTITSPNYYCQGYRKHETPVRRGMFCTGMAREDHPTFPCLAVPGAPLVVKGKLAAILSWGFGCGYQNDLPLVYTDIKYYINWITQNVAKIRKIAKKDFKPLFDATKSWVLLEWFTKSRIVKPKRHYHRNRELQLINLDQSLSNLRGQIFDLRDFIFNRQYKEKKLTMYKEIQNSAKEGNNTKHLIEKMKLLSYTGKPLPFFSNKSLLEIASNYDDFSSSNEDEKN-