Monarch geneset OGS2.0

DPOGS203693
Transcript        DPOGS203693-TA  3243 bp
Protein           DPOGS203693-PA  1080 aa
Genomic position  DPSCF300010 - 1826309-1940346
RNAseq coverage   54x (Rank: top 69%)
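The transcript and protein lengths above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the transcript is coding sequence only (no UTRs) and ends in a stop codon, which agrees with the nucleotide record starting with ATG and ending with TAG. The function names are illustrative and not part of the database.

```python
# Values copied from the record above.
TRANSCRIPT_BP = 3243
PROTEIN_AA = 1080


def expected_cds_length(protein_aa, has_stop=True):
    """Nucleotides needed to encode protein_aa residues.

    Assumption: one extra codon is counted for the terminal stop
    when has_stop is True.
    """
    codons = protein_aa + (1 if has_stop else 0)
    return 3 * codons


def lengths_consistent(transcript_bp, protein_aa):
    """True if the transcript length matches a CDS with a stop codon."""
    return transcript_bp == expected_cds_length(protein_aa)


print(expected_cds_length(PROTEIN_AA))                # 3243
print(lengths_consistent(TRANSCRIPT_BP, PROTEIN_AA))  # True
```

Under these assumptions, 3 x (1080 + 1) = 3243, so the two reported lengths agree.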
Annotation
Heliconius        HMEL013310        0.0     87.22%
Bombyx            BGIBMGA000351-TA  2e-62   36.39%
Drosophila        CG34347-PB        5e-119  56.30%
EBI UniRef50      UniRef50_E0VRU7   7e-144  44.80%  4.1 G protein, putative n=3 Tax=Neoptera RepID=E0VRU7_PEDHC
NCBI RefSeq       XP_970473.2       2e-153  45.52%  PREDICTED: similar to band 4.1-like protein 4A (NBL4 protein), putative [Tribolium castaneum]
NCBI nr blastp    gi|189234071      4e-152  45.52%  PREDICTED: similar to band 4.1-like protein 4A (NBL4 protein), putative [Tribolium castaneum]
NCBI nr blastx    gi|347966282      1e-179  44.37%  AGAP001632-PA [Anopheles gambiae str. PEST]
Group
Gene Ontology     GO:0005515  1.6e-25  protein binding
                  GO:0005488  4.5e-24  binding
KEGG pathway      mmu:269587  6e-64
                  K06107 (EPB41, 4.1R) maps-> Tight junction
InterPro domain   [107-350]  IPR019749  3.5e-49  Band 4.1 domain
                  [345-437]  IPR011993  1.6e-25  Pleckstrin homology-type
                  [187-273]  IPR014352  4.5e-24  FERM/acyl-CoA-binding protein, 3-helical bundle
                  [190-273]  IPR019748  5.5e-24  FERM central domain
                  [357-439]  IPR018980  2.1e-16  FERM, C-terminal PH-like domain
                  [122-189]  IPR018979  3.3e-16  FERM, N-terminal
                  [141-153]  IPR019750  2.7e-08  Band 4.1 family
Orthology group   MCL15344  Single-copy universal gene
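The InterPro rows in the annotation above list each domain as a bracketed residue interval followed by an accession. The following is a minimal sketch of how such a row could be parsed and compared, assuming the coordinates are 1-based, inclusive positions on the protein. The helper names `parse_domain` and `overlaps` are hypothetical and not part of any database tooling.

```python
import re

# Matches "[start-end] IPRnnnnnn"; InterPro accessions are "IPR" plus six digits.
DOMAIN_RE = re.compile(r"\[(\d+)-(\d+)\]\s*(IPR\d{6})")


def parse_domain(line):
    """Return (accession, start, end, length), or None if no interval is found."""
    m = DOMAIN_RE.search(line)
    if m is None:
        return None
    start, end = int(m.group(1)), int(m.group(2))
    # Assumption: coordinates are 1-based and inclusive.
    return m.group(3), start, end, end - start + 1


def overlaps(a, b):
    """True if two parsed domains share at least one residue."""
    return a[1] <= b[2] and b[1] <= a[2]


band41 = parse_domain("[107-350] IPR019749 3.5e-49 Band 4.1 domain")
ph_like = parse_domain("[345-437] IPR011993 1.6e-25 Pleckstrin homology-type")
print(band41)                      # ('IPR019749', 107, 350, 244)
print(overlaps(band41, ph_like))   # True
```

Under the inclusive-coordinate assumption, the Band 4.1 domain spans 244 residues and overlaps the Pleckstrin homology-type hit at residues 345-350.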
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203693-TA
ATGATGGACTGCCTGTGTCCAGCGACACGTACGCTGGCCTGCCGTGTGGTTTTGTTGGACGAAAGAGAATTGATGCATGAGATACAGTCAGAGACGGACGTCCCGACGGACTCGAATGTGACAGTCTTTTCAATTTCACGGGATGATGAAACAACGCATGGATTTTTCTCCATTTCTATTCGCAGTTTTGAAGTGCAGCTATTCCGCGACGAATTATTTGAAGTACACCCAGGGGCTCAAGAAAGGTCTTTTATATGCGGATCTCGGGTCCAAGGTGCGGGCGATGAATTCACTGAATCAGATGAAGTACGAATTGGTCAGCAATTTGAGATCAAAGAAGTGGTGGAGCATTATAAAGAATATTGTCGCAGTAAAGACAATAACACAGGACAAGCACTACTGGATGTTGTATTCAGGCACCTAGATTTGCTGGAAACGGCATACTTCGGGCTTCGATACGTAGATCCAGACAACCAGACGCACTGGCTCGATGCTGGAAAACGACTACGCCGTCAATTGCGTGGCTCCGACACGCACACTTTCTACTTTGGCGTCAAGTTCTACGCCTCCGATCCCTGCAAACTCTTGGAGGAAATTACTCGGTACCAGTTATTCTTGCAGCTAAAACAAGACGTATTACGAGGTCGTCTTCCAGTCAACTTCGAGCTAGCCGCCGAACTCGCAGCATATGTATTACAGTCGGAATTAGGTGACTATGATCCCCGTCGACATACACTTGGCTACGTGTCAGAGTTTAGGCTGCTTGCTCATCAAACACCTGAATTCGAAGGAAGAGCTGCGGATATACACAGAACACTCACATTCTGTATAAGTATTTCTTTATGTACAGCGGAATTAGGTGACTATGATCCCCGTCGACATACACTTGGCTACGTGTCAGAATTTAGGCTGCTTGCTCATCAAACACCTGAATTCGAAGGAAGAGCTGCGGATATACACAGAACACTCACAGGTATATCACCAGCACAAGCGGAACTTAGTTACTTAGATAAAGTGAAATGGCTTGACATGTATGGTGTTGATCTTCATCCTGTTCTGGGCGAAGACAGCGTTGAGTACTTCTTGGGACTAGCGCCAAGTGGCCTGTTGCTCTTACGTGGAAAACATACAGTCGCTACCTATTACTGGCCGCGCGTATCAAAACTTTATTACAAAGGACGATATTTCATGATACGCGTAGCAGATAAAAATAATGACACGTCAACGTATGGTTTTGAATCCCCAACAAGAGCAGCCTGCCGGCATTTGTGGCGGTGTTGTTCTGATCATCATACTTTCTTTCGCTTACAACAAACATCACCTGCATCAGCAGATATATTTGCGTTAGGTTCAAGACTGAGAAATAGTAGTCGAGCTGCAAGACCACGTCCTCCGCCTGCGTTCACGCGAACACCCTCTCGAAGGATTTCCCGTCCACTAACATCTTATTCTTCGTTACACGACGTGCCAAAGTTAGAAGATTTGAGAATAAAAGACTGCCCTCCTGAAGTTAAACAGCCTTCCTCAGTGCACCGCCCTAATTCTATAAGCGGTGAAGGCCCTTGTGAGACTGTTGGTCCCGGCGGGTCACCTCGTTCAACTCGCTCGGCACCAACTCGACGTGGTCTATACTCGGCATCACCTACTACTCACAGGCCGCCACCAGTGCCAAGACACCGTTCAGCTTCAGTTGATTCTCAGAGTTCTAACGACTCGAGGTCCAACCGCAAACATAGGCACCGCTCTCGACGACAGCAGTCAGATGCAGAGAGTGAACTGTCCCGTGGTTCCGGACGCTCTGGGCGCAGACATCGCCGACACCGCTCTAGACACAAGCAGGAGTCCGGCTCAGAGAGAGATGATTCGCAACCAGACAACAAGGAATACGAGCTTGTCGACTCGGAATCTCAATGGAAGGAAGTATTAAGACAGACGTCAGCTGGAGGCAGCGTGCAAGTGGCAAATGTTCGTCGTTCACAAATGGAACCGGAGACAGGAAC
GCATCGATCGTCACACAGACCACGACGACATAAAAAACATAGGTCAAGATCCCGTTCACCGAACGAAAAGAAGTGGCTACCAAATGAATTAAAGCAACACCTCGAATTCTCTCTAGTGGATACTACAGGAATGACTGAGGAACAGCTGAAGGAGATACCCTATACTGTTGTGCAAACGAGCCAGGCTCGGCACACCAAACTAAGGACTTCATCTAAACACAGACAGACGGATCACGGTTCACTTGCAAGACGGGAGAAGAGCTCTTCGTCACACAAAAGTGACCACGACAACCACAATCAAGGGTCTCTCAGATCGATATCAAGCACACTCAGCACACATAGAACTACCAATGAAAAACATGGAAGGCGTTTGTATCCAAACTATGATGATTCTGTTGGACGGATCGGTGAAGATTATTTAGCTAACAATGGATATAAAACTTCAGTGACGCCATTACCGCATAATAACAATCCGTATAGTCCAGTAACCAACAGCAATAGTAATAGTTCCAGCGGAGAATTGATAGGAAGCGCAAGGGTATCACACGAACATACAGATTCTGGATTAGGCGCTGACCAGGATTACGCGTATTCCTCTGAAAGATCCAGTGACAGTGCAAAATGCGGCGGTGGAAGTTCACAGGCCCCAGTGAGTAGGCAGTGGCGTGTTGGTGGGGGGAGTATAGGGGGTGCGGGGTGCGGCGCGGGCGCATTAACCCGTCGGCCGCCGGCCGCGCCGGGTCGCGCGCGTCTCTCCGTGTCCGGCAGCCAGAGGTCGCTGTTGTCGGTCGCGAGTGACAGCGCCACGTCACGCCGCCCGCGCGACCTCGCGCCCCGCTGCTATCCAAGCCCTGCGGACGATGGCTTTTCTCTCTTCAGACATGCTAGCAACAACAACATAATGGGCGAGAGCGGATCTCTGGCGCGGCGGTACGCACGTCCGTCGAGGGAGTCGCGCGACTACAATGCGAACATCGCGCGCACGCACTCTCGTCTCGCTCATGCGCACGCGCACATGCGACATGACAACACACTCGACATTATATTAAAACCCCTGATAGAAAGCACGCACCCCCCGCCGGCGATTACCTACCTTGTGGCAGACCCCTCACGCGTCCCCCCCCGGGTGGGGGAGGGCTCTGTTGTGGCCGGAGCGCATGGCCATCTCTGGTCGTTGCTAGAGGTGTGCGTTGTGATGCTGGCGGTAGCACCCCTGCGCTGGCCGCCTGCCGCACACCCCTAG

Protein sequence:

>DPOGS203693-PA
MMDCLCPATRTLACRVVLLDERELMHEIQSETDVPTDSNVTVFSISRDDETTHGFFSISIRSFEVQLFRDELFEVHPGAQERSFICGSRVQGAGDEFTESDEVRIGQQFEIKEVVEHYKEYCRSKDNNTGQALLDVVFRHLDLLETAYFGLRYVDPDNQTHWLDAGKRLRRQLRGSDTHTFYFGVKFYASDPCKLLEEITRYQLFLQLKQDVLRGRLPVNFELAAELAAYVLQSELGDYDPRRHTLGYVSEFRLLAHQTPEFEGRAADIHRTLTFCISISLCTAELGDYDPRRHTLGYVSEFRLLAHQTPEFEGRAADIHRTLTGISPAQAELSYLDKVKWLDMYGVDLHPVLGEDSVEYFLGLAPSGLLLLRGKHTVATYYWPRVSKLYYKGRYFMIRVADKNNDTSTYGFESPTRAACRHLWRCCSDHHTFFRLQQTSPASADIFALGSRLRNSSRAARPRPPPAFTRTPSRRISRPLTSYSSLHDVPKLEDLRIKDCPPEVKQPSSVHRPNSISGEGPCETVGPGGSPRSTRSAPTRRGLYSASPTTHRPPPVPRHRSASVDSQSSNDSRSNRKHRHRSRRQQSDAESELSRGSGRSGRRHRRHRSRHKQESGSERDDSQPDNKEYELVDSESQWKEVLRQTSAGGSVQVANVRRSQMEPETGTHRSSHRPRRHKKHRSRSRSPNEKKWLPNELKQHLEFSLVDTTGMTEEQLKEIPYTVVQTSQARHTKLRTSSKHRQTDHGSLARREKSSSSHKSDHDNHNQGSLRSISSTLSTHRTTNEKHGRRLYPNYDDSVGRIGEDYLANNGYKTSVTPLPHNNNPYSPVTNSNSNSSSGELIGSARVSHEHTDSGLGADQDYAYSSERSSDSAKCGGGSSQAPVSRQWRVGGGSIGGAGCGAGALTRRPPAAPGRARLSVSGSQRSLLSVASDSATSRRPRDLAPRCYPSPADDGFSLFRHASNNNIMGESGSLARRYARPSRESRDYNANIARTHSRLAHAHAHMRHDNTLDIILKPLIESTHPPPAITYLVADPSRVPPRVGEGSVVAGAHGHLWSLLEVCVVMLAVAPLRWPPAAHP-