Monarch geneset OGS2.0

DPOGS201087
Transcript         DPOGS201087-TA    3033 bp
Protein            DPOGS201087-PA    1010 aa
Genomic position   DPSCF300185 + 214672-226023
RNAseq coverage    373x (Rank: top 32%)
Annotation
Heliconius       HMEL005838          0.0      56.71%
Bombyx           BGIBMGA007196-TA    4e-174   71.34%
Drosophila       rb-PB               0.0      55.80%
EBI UniRef50     UniRef50_E2BLT7     0.0      47.09%   AP-3 complex subunit beta-2 n=4 Tax=Arthropoda RepID=E2BLT7_HARSA
NCBI RefSeq      XP_624446.2         0.0      45.83%   PREDICTED: similar to ruby CG11427-PA isoform 2 [Apis mellifera]
NCBI nr blastp   gi|307204863        0.0      47.09%   AP-3 complex subunit beta-2 [Harpegnathos saltator]
NCBI nr blastx   gi|307204863        0.0      48.58%   AP-3 complex subunit beta-2 [Harpegnathos saltator]
Group
Gene Ontology    GO:0008565    0           protein transporter activity
                 GO:0005794    0           Golgi apparatus
                 GO:0006897    0           endocytosis
                 GO:0015031    0           protein transport
                 GO:0006886    3.3e-146    intracellular protein transport
                 GO:0030117    3.3e-146    membrane coat
                 GO:0016192    3.3e-146    vesicle-mediated transport
                 GO:0005488    9.1e-117    binding
KEGG pathway     ame:552064    0.0
                 K12397 (AP3B) maps-> Lysosome
InterPro domain  [1-1010]    IPR017108    0           Adaptor protein complex AP-3, beta subunit
                 [36-560]    IPR002553    3.3e-146    Clathrin/coatomer adaptor, adaptin-like, N-terminal
                 [36-613]    IPR011989    9.1e-117    Armadillo-like helical
                 [26-729]    IPR016024    4.8e-94     Armadillo-type fold
Orthology group  MCL10934    Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201087-TA
ATGTCCGGTACAACATCATACAATAATGAGAAAGTGGTGCAAGGAGAAGTTGAATATCCTGCTAGCGATCCGGCCTCCGGGGCTTTCTTTCAGCCAGATTATAAGAAAAATGAAGACCTGAAGCTGATGTTGGACGGATCAAAAGACTCTCTCAAGTTAGAGGCCATGAAAAGAATTATTGGCTTGATTGCAAAAGGAAGAGATGCATCCGACTTATTTCCAGCAGTAGTGAAGAATGTTGTATCAAAGAATCTTGAAGTGAAGAAGTTGGTGTATGTGTACCTGGTCAGGTATGCAGAGGAGCAACAGGACTTGGCGCTGCTATCAATCAGTACGTTTCAACGAGCTTTAAAGGATCCAAACCAGTTGATCCGAGCCAGTGCTCTCCGAGTGTTATCATCGATCAGAGTGCCAATGATAGTTCCTATCGTAATGCTTGCTATCCGGGACTCCGCCAGCGACATGAGCCCTTACGTTAGGAAGACGGCAGCCCACGCTATACCTAAACTATATAGCCTGGATCCAGATCAAAAGGAAGAACTGGTGGCCATCATCGACAAGCTTCTATCGGACAAGGCCCCGCTAGTGGTCGGATCGGCTGCTATGGCTTTTAACGAAGTCTGTGGGGACCGTATGAACTTGATACATAAGAGTTACAGGAAGCTATGTCTGCTGCTAGCTGATGTGGATGAATGGGGTCAGTTAGCACTTCTGAATGTGCTAACGTACTATGCCAAGACATGCTTTCCTGATCCCAACAATGAGAGCTGTTCCAGTGACAGCGACAACTCCTCGGGCCGTCACTCCCCTCGCGTGGAGGCCGACCTGCGTCTGGTGCTGCGAGCGGCCAAACCACTGCTACAGAGCAGGAACTCCGCCGTGGTACTGGCGGTCGCGCAATTGTTCTATCACTGCGGACCAGTCCAGGAGATGCCGCCCGTGGCCAAGGCCATGGTTCGCCTCCTGCGAGCGCCGTCCGAGATCCAGAGCGTGGTGCTGAACACCATCGCGTCTCTCACCGTCTCCAGGCCGAGCCTGTTCGAGCCGTTCCTGAAGTCGTTCTTCGTGCGTACGTCGGACCCCACGCACATCAAGCTCCTGAAGCTGGAGATCCTCACCAACCTGGCCACGGAGACCAGCTCGCCCGTGGTGCTGAGGGAGTACCAGACGTACGTCACCACCAGCGACAAGACCTTCGTGGCCGCCACCATACAGGCCATAGGACGCCTGGCCGTCCGCATACACTCCGAGACGGAGACCTGCCTCAGCGGCCTGCTGCACCTGCTGTCCAGCAAGGATGAGTGGGTGGTGTGCGAGGCGGTGGTGGTGGTGAAGCGCGTGGTTTGCGGAGGGGCGAGTTCGGCGCGGGCCGCCGTCAGCAGAGCTGCCAAACTGCTCCGATCAGACCGCCTGGCGGGCGGCGCTCGTGCGGCCGCGGTGTGGCTGGTGTGTGAGCACGGGTCCCAACACGCGAGGGCCGCGGCCGTCCTCGCTCACATGGCTGAAAGCTTCGCTGAGCAGGAGGAGCTGGTGAAGCTGCAGCTGCTGTCCCTGTCGGTGAAGCTGTCCGTGACGCAGCCGGCCACGCGGCCCGTGTGCCAGTACGTGCTGTCTCTGGCGCGCTACGACTCCAGCTACGACGTGAGAGACCGCGCCCGCTTCCTCCGCAGCTGCCTCGAGGGACGCCTGGCCGAGTTCGCCAGAGAGATCTTCTGCCCGGACACGCCCAAGCCCAGCGTGCAGGCGAACAACAAAGAGCGTACTCATTACACGGTGGGGTCTCTGTCTCAGTACATCGGGTCGTGTGCGTGCGGCTACCGTCCGTTGCCGACAGCGCCCTCTGCGGACACGTGCGCCGCGCTCCGGGACTCCGCCCCCGCTGAGACGGAGCCCGAGCGAGACATGGACCAACAGAAGAACAAGAGCTTCTACTCCGAGCCCGGATCCACGTCCTCCTCGTCCAGCTCCAGCAGCGAGGAGTCCACAGACGAAGAGGATTC
CAATGAGGCGGACGACAAACAAGAAAAAGCTAAACAAATCAATAATTACAAATCCTCCTCATCCGGCACCAGTGACTCGGAGTCGGACAGCTCGTCCTCACAGAGCGACATCGACTCGGAGACTGATCACCAACAGGACGGAGAAGAAGGAGACGGGGCGGACGGAGATGATAAGCCACCGAACAAACGGGAACCGGAAGTCAAAGAAAAGAACGACAAATCCAATCTGGAGCTGCTCCTTGAACTGGACGAAGTGGCGAGTTCGTTGCCCACCATGACGCCCACGTGCGGGGGCTTCCTGTCACCCACCACGCCACAAACGGAGAATGAGGACGACTCCATAGTCCCCGTGGGTCCGACCCACGTGCCCGGCGAGGTCGAGCTGGTGTCCCGCGTGCGATGGTCCGGGCTGGGGGTCAGCCGTCGCTGGTTACGAGCGCCCCACCTGTACTCGGACAAGATGGTCGCCGTGCAGCTGAAATTCACAAACCACGGGCACGAAGACGTAGAGAATATACGACTCGAGAAAAAGGTCCTCCAAGGGGGGAGGCGCTCCATCCAGGAGTTCCCGGCGATTCCCCGCCTCGCTCCGGGCTGCTCCACCACCGTCCTGCTGGGCATCGACTTCGCCGACACCATCCACCCCATGGACTTCACACTCACCAGTTCCAACGGCTCCGTGAGCGTGTCCGTGAGCCCGCCGGTGGGCGAGCTGGTGCGTGCCGTGCGCGTGTCGTCGGCCGTGTGGACCAAGACCTGCAACGCTCTGCGAGGAATGAACGAGTGCAGCAAAGAAACCTCCAACCCCGGAGATGAGAACATCTGCCGTCGTGTTCTGGAGGTGGCCAACCTGTGCCGCGCGGAAGGCTCGGAGGAGATGCGGTTCAGCGGACGGAGCATGAGCTCCCGGGCCCTGGTGCTGGTATCGGTGACGAGGGTCGAGGCCGACCGGTTGGCGGTCGTGGTCCGCTGCGAGAACATAGCACTCGCCAACCTCATGGCCGGCCACATAGCGACCGCCCTCGACGGCTAG

Protein sequence:

>DPOGS201087-PA
MSGTTSYNNEKVVQGEVEYPASDPASGAFFQPDYKKNEDLKLMLDGSKDSLKLEAMKRIIGLIAKGRDASDLFPAVVKNVVSKNLEVKKLVYVYLVRYAEEQQDLALLSISTFQRALKDPNQLIRASALRVLSSIRVPMIVPIVMLAIRDSASDMSPYVRKTAAHAIPKLYSLDPDQKEELVAIIDKLLSDKAPLVVGSAAMAFNEVCGDRMNLIHKSYRKLCLLLADVDEWGQLALLNVLTYYAKTCFPDPNNESCSSDSDNSSGRHSPRVEADLRLVLRAAKPLLQSRNSAVVLAVAQLFYHCGPVQEMPPVAKAMVRLLRAPSEIQSVVLNTIASLTVSRPSLFEPFLKSFFVRTSDPTHIKLLKLEILTNLATETSSPVVLREYQTYVTTSDKTFVAATIQAIGRLAVRIHSETETCLSGLLHLLSSKDEWVVCEAVVVVKRVVCGGASSARAAVSRAAKLLRSDRLAGGARAAAVWLVCEHGSQHARAAAVLAHMAESFAEQEELVKLQLLSLSVKLSVTQPATRPVCQYVLSLARYDSSYDVRDRARFLRSCLEGRLAEFAREIFCPDTPKPSVQANNKERTHYTVGSLSQYIGSCACGYRPLPTAPSADTCAALRDSAPAETEPERDMDQQKNKSFYSEPGSTSSSSSSSSEESTDEEDSNEADDKQEKAKQINNYKSSSSGTSDSESDSSSSQSDIDSETDHQQDGEEGDGADGDDKPPNKREPEVKEKNDKSNLELLLELDEVASSLPTMTPTCGGFLSPTTPQTENEDDSIVPVGPTHVPGEVELVSRVRWSGLGVSRRWLRAPHLYSDKMVAVQLKFTNHGHEDVENIRLEKKVLQGGRRSIQEFPAIPRLAPGCSTTVLLGIDFADTIHPMDFTLTSSNGSVSVSVSPPVGELVRAVRVSSAVWTKTCNALRGMNECSKETSNPGDENICRRVLEVANLCRAEGSEEMRFSGRSMSSRALVLVSVTRVEADRLAVVVRCENIALANLMAGHIATALDG-
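
The record lists a 3033 bp transcript and a 1010 aa protein. If the transcript is a complete coding sequence with one stop codon, the two lengths should satisfy bp = 3 x (aa + 1). The following is a minimal Python sketch of that consistency check. The helper names `read_fasta` and `cds_matches_protein` are hypothetical and are not part of any MonarchBase tooling. The sketch also assumes the trailing `-` in the protein record marks the stop codon and is not counted as a residue.

```python
def read_fasta(text):
    """Parse FASTA text into {header: sequence}, joining wrapped lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def cds_matches_protein(nt_len, aa_len):
    """Return True if nt_len equals aa_len codons plus one stop codon."""
    return nt_len == 3 * (aa_len + 1)


# Lengths as stated in the record: 3033 bp transcript, 1010 aa protein.
print(cds_matches_protein(3033, 1010))  # True, since 3 * 1011 == 3033
```

To run the check on the sequences themselves, pass the two FASTA records to `read_fasta` and strip the trailing `-` from the protein before taking its length, for example `len(records["DPOGS201087-PA"].rstrip("-"))`.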