Monarch geneset OGS2.0

DPOGS209889
Transcript: DPOGS209889-TA; 3312 bp
Protein: DPOGS209889-PA; 1103 aa
Genomic position: DPSCF300049 - 435876-439187
RNAseq coverage: 356x (Rank: top 33%)

Annotation
Heliconius: HMEL011847; 0.0; 83.33%
Bombyx: BGIBMGA000187-TA; 0.0; 75.88%
Drosophila: gry-PA; 7e-155; 44.05%
EBI UniRef50: UniRef50_D6WZH4; 0.0; 43.15%; Putative uncharacterized protein n=2 Tax=Tribolium castaneum RepID=D6WZH4_TRICA
NCBI RefSeq: XP_971240.2; 0.0; 44.14%; PREDICTED: similar to FLJ12716-like protein [Tribolium castaneum]
NCBI nr blastp: gi|189241218; 0.0; 44.14%; PREDICTED: similar to FLJ12716-like protein [Tribolium castaneum]
NCBI nr blastx: gi|189241218; 0.0; 43.92%; PREDICTED: similar to FLJ12716-like protein [Tribolium castaneum]

Group
KEGG pathway: (none listed)
InterPro domain: [267-520]; IPR021773; 4.5e-60; Foie gras liver health family 1
InterPro domain: [576-1070]; IPR012880; 1.5e-43; Domain of unknown function DUF1683, C-terminal
Orthology group: MCL14282; Single-copy universal gene
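
The lengths and coordinates in this record can be cross-checked with simple arithmetic. The sketch below is illustrative only: the helper names `parse_position` and `span_length` are hypothetical, and it assumes the genomic coordinates are 1-based and inclusive, which the record does not state. Under that assumption the genomic span equals the transcript length, and the transcript length equals three times the protein length plus one stop codon.

```python
import re


def parse_position(field):
    """Split a 'SCAFFOLD - START-END' field into (scaffold, start, end)."""
    match = re.fullmatch(r"\s*(\S+)\s*-\s*(\d+)-(\d+)\s*", field)
    if match is None:
        raise ValueError(f"unrecognised position field: {field!r}")
    scaffold, start, end = match.groups()
    return scaffold, int(start), int(end)


def span_length(start, end):
    """Length of a span, ASSUMING 1-based inclusive coordinates."""
    return end - start + 1


# Values copied from the record above.
scaffold, start, end = parse_position("DPSCF300049 - 435876-439187")
transcript_bp = 3312
protein_aa = 1103

# Genomic span versus transcript length.
print(scaffold, span_length(start, end))
print(span_length(start, end) == transcript_bp)

# Coding length: one codon per residue plus one stop codon.
print(transcript_bp == 3 * (protein_aa + 1))
```

If the coordinates were 0-based or half-open instead, the span would differ by one, so the first comparison also serves as a quick test of the coordinate convention.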

Nucleotide sequence:

>DPOGS209889-TA
ATGGCGACTCAGCCGAGTGACAACACAGAATTTCCGCCTGAAATCATTCTTAAGCCCCTTGCCTTGATAGGGTTGTCAGGGCTTGATACGGTGAATAATGCAATCCACAAAGCTATATGGGATGCCTTCTCAAACAACCGCCGACCGGACCGAGCCGCCGTGAGGTTCAAGTTGTTGAATAACACGTTCGAATTTCCAGTGGTGAAGCCTAAGAGAAACTCCTATGAATGGTACATCCCCAAAGGTATATTGAAGAAAAACTGGATAACTAAACGTGTATCGTTAATTCCGGCCGTGGTGGTTATTTTTTATGATATGGAGTGGAATGATCCTCAGTGGAACGAAAAGATCATCGAGTGTGCGTCGAGAGTGCAGTCGATACGGGCAGCGGTGGAGGGACACGCTACACGTGTCGCTGTTGTGGTTGTACAGAGTGGACTTTCACCCCCACCATCGGAGTACATGCTCGGTGCTGAAAGAGCACAGGCACTTTGCTCAGCATGTGAAATACAATCTAAGTCTCTCTTTGTACTCCCTCACAGCGATCACCTCATGGGTTACATTATAAGACTAGAAAATGCTTTTTATGATATTGCACAAAATTATTATCATCACGAAACCAAGAACATCAAGCAGCATAGAGATCATCTGAATAAGACTACCCATCAGTATTTGTTTGTTAGACATCAGTTCAAACTAGGCTTCCTCAATGAACTCAAGCAAGACATAAGCACGGCCCACAAACACTACATGCATGCATATAACAACCTCCTTGATACCAGACAAGTAGATACTAATGTACATGAAATACGAACCGTGTCTGGTTACATCAATTATAAACTATGTAAGCTGCTGTTTGCCTTAAATTTGCCACGAGATGCAATTGCACAAGTTAAGTCACATATAGAGCGCTACAAAAACAGAATTGGACCCACTGAACTGTTGTTTGAGCATTATGGCTGGATTGCCAGGCAGTATAGTGCCTTTGGAGAATTATTTGATGAAGCTATAAGGTTAGGGCTTCCGGCAATTCAATCCCAACATCCTGGCTTTTATTACCAGTATGCAGCTCAATTTACAGTGAAAAGACGGCAAGCCATGAGGTCGGTATGCTGTGATGCTTCACACTATCCACCTGCCCCAGACCCCATGGAGGGTATTGTGGAGTTTTATGGCCAGAGACCTTGGAGACCGGGACGACTCAGTGCGGATCCACATGATCCACAAAAGGAACAAGCGGCGGTGTTGGCACTGCAATACAATGAAAGAATTTTCAACCATTCTGCTATGATAATTAGTTTTCTAGGTAGTGCTATCTCCCAGTTTAAAACATTTCACTCGCCCAGAATGAGGAAGCAGTTAGTGGTTGAGATGGCTAATGAATATTATTTTTGTGCGGACTATGGTAAAGCTTTGACTTTATTGTCTCATATGCTCTGGGATTATAGAAAAGAAAAGTGGTGGTTTTTGGCTTCCCATGTCTTAAACCGAGCTTTACAATGTGCCTACTTGTCTGCAAAAATTCAAGACTACATTCATTTATCAGTGGAGGCACTCTCCAAATACATTCAAGTGCCAAACAACGACAAAGATAGAATATTTAGAAACATAATGGCAGTTCTCAACATGAACATTCCATCACCAGAGCCGAACCTCCCTCCTTCCTCACAGAGTAAAGCATTAGAAATGTGGCAACTGGCTATAGACAAAGAGCCTCTCACCATTGCCATAGATATGATAAACATAGCCAGTTTCCTAGAAGTAAAAGCAAAGTTCAAGCAACAGAAATATAGGATGGATGATACAATTGAAGTTGAGTTGTTTGTTAGACTTACATATAACACAACCCTTGATGTTAAAAGTGCCTCTATGACAATTGCAACAAATACAGAAACTATTGACATAAATATAACGGATGAAGGCAGTACTACACTGAAACTGATCAGAGGAGAAGTTAAAAGGTTTCTGTGTCAATTTAAAGCCAGTCCACATGATAATGG
ATCGGAAATGAAAATCAAAAATGTATCATTTGTATTGGACAGTGACAGGAGAAAAATTATAATGAACTTTAAAATCGATGAAATCAAGAATGTAGAGCCCACAGTCCATCCTGAATTACTACACTTCATAATGAGTCCTAAAAGTGACTATGAATTTGATTGTATAATGCCTTTGACCACTACATCCATCACCAGCAGGGAATGTAGACTGTCTTTAGATATTAAAAATGCAGTGCCGGCTTTACAAGGCGAGTGGTTTCCCACCACTTTCACAGTAATAAACCATGAAGACGGTCCCGTTCATGATATGTCAATAGTGCTGACACTTCTAAGCTCTCCTGATAATCCAAACCCTGAATCGGTCACAGAGTTGGGCTTTAGACACGGTGAACCCGAAGCCCAACCCATTAAACTCTGTGTCGGAGATGTGAATAAAAGTTCTTCATATTCAAACACATTTTATTTAAAAACTAACAGAACAGCCACAACAACTGTTCAAATAAAAGTAACGTACACAGTAGATGCTTATGAAACACCTCAACTTGAATGTTCCAAAGAATTCACAACGAAAATCACAGTGATCAAACCGTTTGATGTATCAACCAGTTTCGTGTCCATGAACTTTAAGCCTATAACGAAATGCTATGTAGATGATCCCTTTATAGTTATGCCTCAAATAAAAATTTTAAGTCCCTGGAATTTAGTTATTTTAGATACAGAACTAGAAACGGTAGAAAGCTTTAGATATGCTGATGAGAAAAAACCTCAATCATGTATAAGTAACCTACGAGTGGCTGAGAAGAATGTGGCCTCTGATGCTATATGTATACAGGCTAACTACAAGCCAAAGGAAGTCGCTACGAGAGTAGGCTTGTACAACATCTCTTGGCGTAGAGAGAGCAACACAGATGGCCATTGTGTTATGAGCACTACTGCCCTCTCGGCACTTCCAATAGATGATTGCCCAATTACTGTTGAAGTCAATTATCCAGAGGTTGTTGACCTCCAAACATCCGTGCCATTAAAATGTACTCTAATTGGGAAAACTAATACTCCTATCAGACTGAGTCTCTCCGTGGAAGGCACAGATGCATATATGTTTTCAGGGTACAAAAAGTTCTCCATCACTGTACCACCCAGAGATAAGGTCGAGTTATGTTACAACATTCACCCCCTGGTGGCCGGGAACACAATCCCTCCTCGGTTAAAAGCAACAGTTCTTGGTGACACGTCTAGACAAGAGGTTGTAAAAGAAATGTTTGACAAAATCTTTCCTCAAAATATTTTTGTTATGCCTAAATATAATAAATAA

Protein sequence:

>DPOGS209889-PA
MATQPSDNTEFPPEIILKPLALIGLSGLDTVNNAIHKAIWDAFSNNRRPDRAAVRFKLLNNTFEFPVVKPKRNSYEWYIPKGILKKNWITKRVSLIPAVVVIFYDMEWNDPQWNEKIIECASRVQSIRAAVEGHATRVAVVVVQSGLSPPPSEYMLGAERAQALCSACEIQSKSLFVLPHSDHLMGYIIRLENAFYDIAQNYYHHETKNIKQHRDHLNKTTHQYLFVRHQFKLGFLNELKQDISTAHKHYMHAYNNLLDTRQVDTNVHEIRTVSGYINYKLCKLLFALNLPRDAIAQVKSHIERYKNRIGPTELLFEHYGWIARQYSAFGELFDEAIRLGLPAIQSQHPGFYYQYAAQFTVKRRQAMRSVCCDASHYPPAPDPMEGIVEFYGQRPWRPGRLSADPHDPQKEQAAVLALQYNERIFNHSAMIISFLGSAISQFKTFHSPRMRKQLVVEMANEYYFCADYGKALTLLSHMLWDYRKEKWWFLASHVLNRALQCAYLSAKIQDYIHLSVEALSKYIQVPNNDKDRIFRNIMAVLNMNIPSPEPNLPPSSQSKALEMWQLAIDKEPLTIAIDMINIASFLEVKAKFKQQKYRMDDTIEVELFVRLTYNTTLDVKSASMTIATNTETIDINITDEGSTTLKLIRGEVKRFLCQFKASPHDNGSEMKIKNVSFVLDSDRRKIIMNFKIDEIKNVEPTVHPELLHFIMSPKSDYEFDCIMPLTTTSITSRECRLSLDIKNAVPALQGEWFPTTFTVINHEDGPVHDMSIVLTLLSSPDNPNPESVTELGFRHGEPEAQPIKLCVGDVNKSSSYSNTFYLKTNRTATTTVQIKVTYTVDAYETPQLECSKEFTTKITVIKPFDVSTSFVSMNFKPITKCYVDDPFIVMPQIKILSPWNLVILDTELETVESFRYADEKKPQSCISNLRVAEKNVASDAICIQANYKPKEVATRVGLYNISWRRESNTDGHCVMSTTALSALPIDDCPITVEVNYPEVVDLQTSVPLKCTLIGKTNTPIRLSLSVEGTDAYMFSGYKKFSITVPPRDKVELCYNIHPLVAGNTIPPRLKATVLGDTSRQEVVKEMFDKIFPQNIFVMPKYNK-
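
The relationship between the two sequences above can be checked by translating the transcript with the standard genetic code. The following is a minimal sketch, not the pipeline used to build the gene set: the `translate` function is a hypothetical helper, it assumes the transcript is a complete coding sequence in frame 1, and it writes the stop codon as `-` to match the protein record. Only short fragments copied from the start, the line-wrap junction, and the end of the nucleotide sequence are used here.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence; whitespace is ignored."""
    # Remove line wraps so a sequence split across lines is read as one.
    cds = "".join(cds.split()).upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First ten codons of DPOGS209889-TA.
print(translate("ATGGCGACTCAGCCGAGTGACAACACAGAA"))   # MATQPSDNTE
# Fragment spanning the line wrap in the nucleotide record.
print(translate("GATAATGG\nATCGGAAATGAAA"))            # DNGSEMK
# Last six codons, ending in the TAA stop codon.
print(translate("CCTAAATATAATAAATAA"))                 # PKYNK-
```

Passing the full nucleotide sequence, with both wrapped lines joined, should reproduce the protein record, including the trailing `-`.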