Monarch geneset OGS2.0

DPOGS213354
Transcript: DPOGS213354-TA, 1077 bp
Protein: DPOGS213354-PA, 358 aa
Genomic position: DPSCF300109 - 191751-196358
RNAseq coverage: 74x (Rank: top 66%)
Annotation
Heliconius       HMEL016386         5e-48  35.40%
Bombyx           BGIBMGA009120-TA   1e-77  43.02%
Drosophila       cora-PA            1e-74  42.01%
EBI UniRef50     UniRef50_F4WT51    2e-74  41.74%  Protein 4.1-like protein n=5 Tax=Apocrita RepID=F4WT51_ACREC
NCBI RefSeq      XP_973434.1        1e-77  43.75%  PREDICTED: similar to coracle [Tribolium castaneum]
NCBI nr blastp   gi|270009420       1e-76  43.75%  hypothetical protein TcasGA2_TC008668 [Tribolium castaneum]
NCBI nr blastx   gi|270009420       2e-72  43.75%  hypothetical protein TcasGA2_TC008668 [Tribolium castaneum]
Group
Gene Ontology    GO:0005515  1.7e-25  protein binding
                 GO:0005488  5.7e-25  binding
KEGG pathway     tca:662229  4e-77
                 K06107 (EPB41, 4.1R)  maps-> Tight junction
InterPro domain  [13-231]   IPR019749  2.3e-42  Band 4.1 domain
                 [226-319]  IPR011993  1.7e-25  Pleckstrin homology-type
                 [110-223]  IPR014352  5.7e-25  FERM/acyl-CoA-binding protein, 3-helical bundle
                 [114-226]  IPR019748  1.7e-24  FERM central domain
                 [235-324]  IPR018980  4.4e-17  FERM, C-terminal PH-like domain
                 [50-62]    IPR019750  5.4e-16  Band 4.1 family
                 [21-113]   IPR018979  2.1e-12  FERM, N-terminal
Orthology group 
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS213354-TA
ATGCGGGAGAGTCTTCGCCGCCTCGCATCAGACGACATCCCTCCCAGAGGCAGGGTCCGCGTGGAACTGCTCACGGGAGAACACATCACTATTGATCTTGATAGGAAGGCCCTCGGCGGAGATTTACTGGACCTGGTCTGCGAGTCCTTAGACGTGATCGAGAAGGACTACTTCGGGCTGCTTCACGCACAAGGGGAGCCCAGAGTGTGGGTACACCTCGGCAGACGACTCAGCAAAACTTTCAAAATCCTAGCAGCAGTTCCACAGTTTCCGCTACACTATAGACCTCATATTTCAGACGAGCCCTGGGATGTACGGTTTGCTGTGAAGTTCTATCCGCTGGAGCCGTCAGCTCTCAGAGACGACATGACCCGCTACCAGCTGTCACTGGCGCTCAGGCGAGACCTCATGGAAGGACGTCTGACGTGCTCGACGATCACATACGCACTGCTCGCGTCCTACGTCCTCCAGGCAGAGGCGGGGGACAGGTCCGCGGCTGTCCCGCTGGGGGCGGGGGCCACGGCGGCGCTGGTGACCTCGCACAGAGCCGTGCCGCTTCACGTCCTCAACGAGGACATGGAGATGAGAGTCGATGAGTTGTATAGGAAACACAAAGGTCAGACGCCGGCGGAGGCCGAACTGAACTATCTGGAGAACGCCAAGAAGCTCGCGTTGTACGGAGCCGAGATGCATTCGGTGAAGGACTCTGATGATGTAGAGCTCTCACTCGCCGTCTGCGGGAGAGGAATCGCCGTGGTCAGGGACGGGACGGTCATGAATCGCTTCCCGTGGACGAAGATATTGAAGCTCAGTTACAACAAGCGTCTGTTCGTGATCCGCCTCCGAGCCGCGGACTCCGACGAGTGCGAGACGGATGTCAGCTTCCGACTCAACTCCTCGCGGGCCAGCGAGCGCCTGTGGACCAGCACCGTGGAACATCACGTGTTCTTCAGGCGCGAGAGTCCGGTGAAGGTGGAGCGAGTGTCAGGGTTCCCGATGCTCGGGGCCCGGCGACTGTCTTGTCGGCGGACGTTGCGACAGATGCGCGACACGACTGTCGCAAGACAAGTTATTTGA

Protein sequence:

>DPOGS213354-PA
MRESLRRLASDDIPPRGRVRVELLTGEHITIDLDRKALGGDLLDLVCESLDVIEKDYFGLLHAQGEPRVWVHLGRRLSKTFKILAAVPQFPLHYRPHISDEPWDVRFAVKFYPLEPSALRDDMTRYQLSLALRRDLMEGRLTCSTITYALLASYVLQAEAGDRSAAVPLGAGATAALVTSHRAVPLHVLNEDMEMRVDELYRKHKGQTPAEAELNYLENAKKLALYGAEMHSVKDSDDVELSLAVCGRGIAVVRDGTVMNRFPWTKILKLSYNKRLFVIRLRAADSDECETDVSFRLNSSRASERLWTSTVEHHVFFRRESPVKVERVSGFPMLGARRLSCRRTLRQMRDTTVARQVI-
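
The transcript and protein records above can be cross-checked against each other: 358 codons plus one stop codon give 1077 bp. The following is a minimal sketch of such a check, assuming the standard genetic code and assuming that the trailing "-" in the protein record marks the stop codon. The helper name `translate` and its `stop_symbol` parameter are hypothetical and not part of the database. Only the first and last few codons of DPOGS213354-TA are used here, to keep the example short.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence; stops are written as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First six codons and last four codons of DPOGS213354-TA.
print(translate("ATGCGGGAGAGTCTTCGC"))  # MRESLR
print(translate("CAAGTTATTTGA"))        # QVI-

# Length bookkeeping: 358 residues plus one stop codon.
print(358 * 3 + 3)  # 1077
```

Passing the full nucleotide record to `translate` and comparing the result with the protein record extends the same check to the whole sequence.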