Monarch geneset OGS2.0

DPOGS200446
Transcript | DPOGS200446-TA | 1848 bp
Protein | DPOGS200446-PA | 615 aa
Genomic position | DPSCF300260 - 318977-330169
RNAseq coverage | 19x (Rank: top 80%)
Annotation
Heliconius | HMEL010412 | 1e-139 | 52.05%
Bombyx | BGIBMGA011408-TA | 6e-103 | 64.20%
Drosophila | CG2736-PA | 1e-48 | 30.68%
EBI UniRef50 | UniRef50_D2A5M8 | 1e-110 | 42.58% | Putative uncharacterized protein GLEAN_15144 n=4 Tax=Endopterygota RepID=D2A5M8_TRICA
NCBI RefSeq | XP_971582.1 | 5e-111 | 42.58% | PREDICTED: similar to antigen CD36, putative [Tribolium castaneum]
NCBI nr blastp | gi|270009242 | 4e-110 | 42.58% | hypothetical protein TcasGA2_TC015144 [Tribolium castaneum]
NCBI nr blastx | gi|270009242 | 7e-108 | 42.58% | hypothetical protein TcasGA2_TC015144 [Tribolium castaneum]
Group
Gene Ontology | GO:0016020 | 2.2e-107 | membrane
Gene Ontology | GO:0007155 | 2.2e-107 | cell adhesion
KEGG pathway | gga:416814 | 3e-39
KEGG pathway | K13885 (SCARB1) | maps-> Phagosome
InterPro domain | [1-426] | IPR002159 | 2.2e-107 | CD36 antigen
Orthology group | MCL18903 | Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200446-TA
ATGCTGTGGGAGAAACTGAATATGAGACCTGGTTTCCCGCCTTACGACTGGTGGTCGGATCCTCCCGACCAGGTCAAGATGAGAGCCTACATATTCAATGTCACCAACCACGAACGGTTCCTCCAGGGTCTGGATGCCAAGATCAACGTAGAGGAGATAGGTCCCATTGTATACTTGGAGAAACTAAATCATTTGGACATCAGATTCAATGAGAACAGCACTCTGACGTACACAGCCAAACGTCACCTGATATACCTGCCCGAGGACAACCACATAGACCTAAACAGAACGGTCATTGCACCGAACTTGGCGTTACTGGGCATAGCGTCATATCTCCATGACGCGGATTTTTTCATCCGGAGCGGTTTCGTTGGTCTGGCTAGTTTGCATTCATCCAAGTTCTTCGTGGAGAAGACCGTGTACCAGTACCTGTGGGATTTTAGAGACAACATCCTGGACACATCCCAGCGAGTCGTCCCGGGGATGGTGCCCACAAACAACATGGGCATGCTGAGTCGAATATACGATGGCTTTTCTGACAACTACACCGTGAAGATAGGCCCGCAGTGGGGTCACCACGAGTTCTTCAAAATCGACAGGCTCAATGGAGCCCAGAATTTCAGGGAATACGACATACATAAATGCAGAGACCGAGTGACCGGCGCCACGGAGGGCGTCATGTACCACCACCACATGTCCAAGAGCGATGTCCTGTACGCCTTGAGGAAGACCGTCTGCAAACCGTTGCCTTTGTACTTTGACAAGGAATTAAAGATGGAAGGCGTGCCCGTTTACCGTTACAACCTATCAGAGCAAGCGTTCGATAGACAGAGGAACGGAAGCGACTGCTACGCAACCGACGATCCACTTCCGGACGGTGTCAGTGACGCCTCCAAATGTTTCTTTGATTTCCCTATGGTTGCTTCCTATCCTCATTTCTACACGGGATCTCCTCACAAGGACGCCTACGTCACCGGCCTGAAGCCGGACAGCGAGAAACACAGGTCCTACGTCATTGTAGAACCGATAACCGGGACCCCATACGACGCGGTTGCCAGGTTGCAGTGTAACCTCCGTATCAGCGACCTTTCTGGATTCTATTCGACCATGTACGAGAAATTTTCAAATCTCATCTTACCCATTGGGTGGATAGAATACCACCAAGAAGGTCTACCAGCTCGCGTCAAGCAAGCCATATATTTCATGGTGGTAATACTCCCGCCGCTGTCAACCGTCATCTTCATCATGACGTTCCTCCTCGGCAGCTCTCTGATAATAAAACAGATTATAACGCAGAAGATCAATAAAGAAATACTCCCGTCCATCATTAATTTTAAATCCCAGAAAGACGATACGAAACTATCCGACAATAACATATATACATACGAGAAGGAATTGTTTCTAAGGAAACCGCAGCTTTCTAGGTCCTACGTCATTGTTGAACCGATAACCGGGACGCCATACGACGCGGTTGCCAGGTTGCAGTGTAACCTCCGTATCAGCGACCTTTCTGGATTCTATTCGACCATGTACGAGAAATTTTCAAATCTCATCTTACCCATTGGGTGGATAGAATACCACCAAGAAGGTCTACCAGCTCGCGTCAAGCAAGCCATATATTTCATGGTGGTAATACTCCCGCCGCTGTCAACCGTCATCTTCATCATGACGTTCCTCCTCGGCAGCTCTCTGATAATAAAACAGATTATAACGCAGAAGATCAATAAAGAAATACTCCCGTCCATCATTAATTTAAAATCCCCGAAAGACGATACGAAACTATCCGACAATAACATATATACATACGAGAAGGAATTGTTTCTAAGGAAACCGCAGCTTTCTAGGTGA

Protein sequence:

>DPOGS200446-PA
MLWEKLNMRPGFPPYDWWSDPPDQVKMRAYIFNVTNHERFLQGLDAKINVEEIGPIVYLEKLNHLDIRFNENSTLTYTAKRHLIYLPEDNHIDLNRTVIAPNLALLGIASYLHDADFFIRSGFVGLASLHSSKFFVEKTVYQYLWDFRDNILDTSQRVVPGMVPTNNMGMLSRIYDGFSDNYTVKIGPQWGHHEFFKIDRLNGAQNFREYDIHKCRDRVTGATEGVMYHHHMSKSDVLYALRKTVCKPLPLYFDKELKMEGVPVYRYNLSEQAFDRQRNGSDCYATDDPLPDGVSDASKCFFDFPMVASYPHFYTGSPHKDAYVTGLKPDSEKHRSYVIVEPITGTPYDAVARLQCNLRISDLSGFYSTMYEKFSNLILPIGWIEYHQEGLPARVKQAIYFMVVILPPLSTVIFIMTFLLGSSLIIKQIITQKINKEILPSIINFKSQKDDTKLSDNNIYTYEKELFLRKPQLSRSYVIVEPITGTPYDAVARLQCNLRISDLSGFYSTMYEKFSNLILPIGWIEYHQEGLPARVKQAIYFMVVILPPLSTVIFIMTFLLGSSLIIKQIITQKINKEILPSIINLKSPKDDTKLSDNNIYTYEKELFLRKPQLSR-
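
Consistency check (illustrative sketch):

The reported lengths can be cross-checked against each other. The sketch below assumes the transcript is a pure coding sequence that includes its stop codon, which fits a sequence that starts with ATG and ends with TGA. Under that assumption the transcript length should equal 3 x (protein length + 1). The function names parse_fasta and lengths_consistent are hypothetical helpers introduced here for illustration; they are not part of any geneset tooling.

```python
def parse_fasta(text):
    """Return a dict mapping each FASTA header (without '>') to its sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            # sequence lines may be wrapped; concatenate them
            records[header] += line
    return records


def lengths_consistent(cds_bp, protein_aa):
    """True if a CDS of cds_bp bases encodes protein_aa residues plus one stop codon.

    Assumption: the transcript contains only coding sequence, stop codon included.
    """
    return cds_bp == 3 * (protein_aa + 1)


if __name__ == "__main__":
    # Lengths as reported in the record: 1848 bp transcript, 615 aa protein.
    print(lengths_consistent(1848, 615))

    # Toy records (not the real sequences) to show the parser on wrapped input.
    toy = ">example-TA\nATGAAA\nTGA\n>example-PA\nMK-\n"
    print(parse_fasta(toy))
```

To check the full record, the two FASTA entries above can be passed to parse_fasta and the sequence lengths compared, remembering that the trailing "-" in the protein entry marks the stop codon and is not a residue.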