Monarch geneset OGS2.0

DPOGS210349
Transcript: DPOGS210349-TA (3294 bp)
Protein: DPOGS210349-PA (1097 aa)
Genomic position: DPSCF300025 + 5683-11272
RNAseq coverage: 71x (Rank: top 66%)
Annotation
Heliconius       HMEL013753        0.0     87.50%
Bombyx           BGIBMGA011911-TA  0.0     65.92%
Drosophila       CG32206-PB        2e-155  58.92%
EBI UniRef50     UniRef50_D6WXQ1   0.0     49.30%  Putative uncharacterized protein n=3 Tax=Tribolium castaneum RepID=D6WXQ1_TRICA
NCBI RefSeq      XP_973209.2       0.0     48.40%  PREDICTED: similar to AGAP006059-PA [Tribolium castaneum]
NCBI nr blastp   gi|189240361      0.0     48.40%  PREDICTED: similar to AGAP006059-PA [Tribolium castaneum]
NCBI nr blastx   gi|189240361      0.0     48.02%  PREDICTED: similar to AGAP006059-PA [Tribolium castaneum]
Group
Gene Ontology    GO:0005515  1.8e-09  protein binding
KEGG pathway
InterPro domain  [303-433] IPR000859  1.3e-13  CUB
                 [116-154] IPR002172  1.8e-09  Low-density lipoprotein (LDL) receptor class A repeat
Orthology group  MCL15980  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210349-TA
ATGCTCATCGGAGCAGCGAGCAAGAGCAGCGAGCCGATCAATCCGGCAGCGGTGTGGACAGCAGCGCTCGGGTCGTGTTCCGCTCCGCGCCTGCGCCCAGCAGCACTCCTCGGCGCCCCTCTTCACCGTCGGCGGGCGCGCGCCGCGCTGCAGTGTGTCGCGGCATTACGTCACGAACGCATCCATCGTCATCGGAAGCTGTTTCAACACCAAAAGTCTGACACCGCGGACGTTAAAAGTTTAGTGTGGTGTTTAATGATCGCGTGCAGCATGTCAGCAGAAGGGAGGGTGTGGATGTGTCTGGTGCTGGCCTTGGCAGCTCTGTGGAGTCGCGCGGCCGCCGGCGGATGTGGCGTGGCGGAGTTCACGTGCCGTAGTGGAGCTTGCGTACGCCTCGATGCTTACTGCGACGGCGAAACGCAATGTCCTGATGGGAGTGATGAACCGCCACACTGCTCGGTTTGCAATCGGACGTATTACGGACGGATCGGCGTAGCTTATGGACTGGGACTGCGTGGAGCGCCGAGATCTCCTTTTCTGTGTCATCTCACGTTTACCGCTGGCGGAGGATCCCACGGAGATCTCGTTCAACTGGCTTTCGACGAGTTCCGTGTGGGTCGCTATGAATCGGGCGCCTTGGACGGGTGTCCTGATGGCTACATGCAGTTATCGGAACTGGGTCGTCCTTTCACTGGTGGCTCGTGGTGTGGTTCCGCAGAAGGTGTCGCTCTTTATTACAGTGAAACCGCAACAGTTACCGTATCAGTGAAGTTGTTTCGAGCTCGTCTTGGAGAACCTTTTGGTTTCAGACTAAGGTACAAGTTCCTTGCACAGCGTGATGCTATCGTAAGATTTGGAGCATTGGAAGCACCTTTGGAACGAGGGTCCGTGTCGCCTGGAACTTACTGTACGAGAACTTATGAGGAATGTCACCGAAAAGCCTGTCGCCTCCAGAGCCCCAACTACCCCGGCATGTACCCGAGAAATGTAACTTGTTACTGGAGTCTGCGTCAGAAGGACATCCCGACATGCAAGCATGCTATGATATCAGTTCGCCAGCTATATTCGCATAAAATGCAAATAAAACGTTCAATTTCGATGGCCAGTTTAAACAAGACGGGTCGTGCGGTGCGCGCGTGGCGCGAGTGTACGGGAGAGCGAGATCGCCTCATTTTCTATGACGGAGCGTCTACGGACGACCCCGTGCTGGTGGAGTATTGCGGCGGGGACTGGCTGCCTCCGGTCACTTCCCGGGGACCAGAGATGCTGGTTGCTTTCCACTCGTCCCCATTTAGCGCCCCTCCGCGGGCACCCACGCCACACGCACCACTCAGAGGATTTGAACTTGATGTAGACGTCATTTTCGCAGATTCTGATTCTCTAGATTATTCAAGGGAAGCAAAACGTTGTGAATTTCATGTTAAGGCTTCTTCGTCGGAGGAGGAATTTAATGCCACCACGGTAAATGTGAGAGGTCGCCGGGGACGTTTGCATGCACCGACTCACACGCTTCCACCCAATACTACATGCACTTGGACATTTCATGGTCGTCCTGGTGATCTGGTCTGGATCTATTTTTCGAGTTTTACACATTATTCTCTCGTGGAAGGACGACGGACGGAGAGTAACGAACGTGAAGATGACGCTGCAGTCACAGTAACTTCACGACATTTATCGGAGTATTCCCGTTCTGCTGGAGGGAGCGGCGTTTGCGCCGTGGAGCTTCGAATTTGGGACGGTGGTGGTGTAGACGAGGCTGCTGATCTTCTGGGTCACTACTGTGACTCTACACCCTCTCTCTGCGCTCGAGCCGCTCTCGCCAACGCCACGCGATCTCCTCGACCATGCGCACCTCCAGATGGCTACGTATCTGCGGCTGCACTAATGTCTCTCGCAGCTACATCACTTCCTGGTACCGCTACTCATCCCTTGGCATTCGTGATGCACTATGAATTTGTTGATGCTCGGTTAGAAGGTGTCCTGTTACCGATCTCGGAAAC
GCGTGTCCCAGTCGAGCCAGCGGAGTGTGCGAGGAGACTTACAGTACCTGGGGTATTTTCATCTCCTCGTAACGCATTGTGGTTCGGGCGTGGAGGTGCAAAAAGACTTCGCTGTGTATACAGATTACAAGTTGAAAGAGCTAGTATTGAATTACGTCTTTTAGCAGCAGCTTTTGGACGAGAGCCTAAATGTTCTACAAGAATAGATCCCTTAACGGGACGTTCGGCATGCATTCCTGATCCTATTGACCCTCTTGATCTAAGACCTAGTGATGCCCCCGTTGATTTCGATTATGATGAAAATCCATTACGCGTACCACATTTACGGATATACGAGTCACCTTGGCCAGGATATAGAGTACCTGTAGGATGTATATGTGATAATAGCAGTACACCTTTGATTATATCGAGCGGTGGCCCTTCAATGGAATTAGAACTGGTGGCCAGCACCTTGGCAGCGAACGAGGATCACCGCCACGTACACTTCCAAGGAGAGTGGGCCCGCGGCCCGGCCACCTCTGAGTGTGCTATAAGTCGCCGTCTGCCACCACCCGGGGCATCGGTGCGCCTTTTGCACCCATACAACGGGAATAAGATGTCAGAATGTGGAGAAACGCCTTTTCTGTTAGTGGCTCGAGGTAATCGCTCAGTATTTCTTCGAATCTGGGGCGAAGAATTGCCTGCTTCAGCTCCTACCTCCGAGGCACCATTATGTCACACAACAAATCGTGTTTTGGTTTACGAGTCTCATTCATCAAGATTACTAAAAGCAGTCTGTCCTGGTGGCGATGACTCACGAACAGTCCAAGTGTTCACCGAGGAATGGTGGGCCAGAAGCATCGGTCGTGAAGCCGCACTTATGGTGGTGTGGTCCGCAAGAGAGGCGGGATCCGCACGATTCACGTGGATGGAGGTCTGGCGGCCGGCAGGAGCGACTCCGGGAGCTGTTAGTGGAAACATTTCTTCGTGCGCTCACGAATGTGTGTCTCTGAGCGCTTGTATGGCGGGCGCGCTGTGGTGCGACGGTTCAGTGGATTGCCCAGGCGGCAGTGATGAGGCCGGCGCTTGCGGGGCGGGAGCACGACTTTTGGCAGCATTGGGTGCACCTGCGGCTGCCGGAGCAGCAGGCGCTTGCGGCGTAGCTGCGGCACTAGTGCTACTTGCCGCTTTAGCACTGCGCCGGCGCCGCTCTCGACGAGACAAACGCTTACTTGGTGCTCTGGCGGCCGGCCGCCGATTCACCGAAGAGCTTCTGTACGATGGATCACGTACGTCTTCAGTGACGTCATCTTGA

Protein sequence:

>DPOGS210349-PA
MLIGAASKSSEPINPAAVWTAALGSCSAPRLRPAALLGAPLHRRRARAALQCVAALRHERIHRHRKLFQHQKSDTADVKSLVWCLMIACSMSAEGRVWMCLVLALAALWSRAAAGGCGVAEFTCRSGACVRLDAYCDGETQCPDGSDEPPHCSVCNRTYYGRIGVAYGLGLRGAPRSPFLCHLTFTAGGGSHGDLVQLAFDEFRVGRYESGALDGCPDGYMQLSELGRPFTGGSWCGSAEGVALYYSETATVTVSVKLFRARLGEPFGFRLRYKFLAQRDAIVRFGALEAPLERGSVSPGTYCTRTYEECHRKACRLQSPNYPGMYPRNVTCYWSLRQKDIPTCKHAMISVRQLYSHKMQIKRSISMASLNKTGRAVRAWRECTGERDRLIFYDGASTDDPVLVEYCGGDWLPPVTSRGPEMLVAFHSSPFSAPPRAPTPHAPLRGFELDVDVIFADSDSLDYSREAKRCEFHVKASSSEEEFNATTVNVRGRRGRLHAPTHTLPPNTTCTWTFHGRPGDLVWIYFSSFTHYSLVEGRRTESNEREDDAAVTVTSRHLSEYSRSAGGSGVCAVELRIWDGGGVDEAADLLGHYCDSTPSLCARAALANATRSPRPCAPPDGYVSAAALMSLAATSLPGTATHPLAFVMHYEFVDARLEGVLLPISETRVPVEPAECARRLTVPGVFSSPRNALWFGRGGAKRLRCVYRLQVERASIELRLLAAAFGREPKCSTRIDPLTGRSACIPDPIDPLDLRPSDAPVDFDYDENPLRVPHLRIYESPWPGYRVPVGCICDNSSTPLIISSGGPSMELELVASTLAANEDHRHVHFQGEWARGPATSECAISRRLPPPGASVRLLHPYNGNKMSECGETPFLLVARGNRSVFLRIWGEELPASAPTSEAPLCHTTNRVLVYESHSSRLLKAVCPGGDDSRTVQVFTEEWWARSIGREAALMVVWSAREAGSARFTWMEVWRPAGATPGAVSGNISSCAHECVSLSACMAGALWCDGSVDCPGGSDEAGACGAGARLLAALGAPAAAGAAGACGVAAALVLLAALALRRRRSRRDKRLLGALAAGRRFTEELLYDGSRTSSVTSS-
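
The transcript and protein lengths listed above can be cross-checked against each other: 3294 bp is 1098 codons, which is 1097 amino acids plus one stop codon. The following is a minimal Python sketch of that check, assuming the transcript is a complete coding sequence translated with the standard genetic code and that the trailing "-" in the protein record marks the stop codon. The function names `translate` and `lengths_consistent` are illustrative and not part of any database tooling; only the first and last few codons of the transcript are used here.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any incomplete trailing codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def lengths_consistent(transcript_bp, protein_aa):
    """True if the transcript is exactly the protein's codons plus one stop codon."""
    return transcript_bp == 3 * (protein_aa + 1)


# First 12 codons and last 5 codons of DPOGS210349-TA, copied from the record.
head = "ATGCTCATCGGAGCAGCGAGCAAGAGCAGCGAGCCG"
tail = "GTGACGTCATCTTGA"

print(translate(head))                 # start of the protein sequence
print(translate(tail))                 # end of the protein sequence, with stop
print(lengths_consistent(3294, 1097))  # lengths reported in the record
```

The translated head and tail correspond to the first and last residues of DPOGS210349-PA, with `*` in place of the record's "-" for the stop codon.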