Monarch geneset OGS2.0

DPOGS202337
Transcript        DPOGS202337-TA  2769 bp
Protein           DPOGS202337-PA  922 aa
Genomic position  DPSCF300032 + 846739-859123
RNAseq coverage   1982x (Rank: top 6%)
Annotation
Heliconius      HMEL010042        0.0  86.78%
Bombyx          BGIBMGA005003-TA  0.0  86.94%
Drosophila      FBX011-PA         0.0  77.66%
EBI UniRef50    UniRef50_Q9VH60   0.0  77.66%  CG9461 n=14 Tax=Coelomata RepID=Q9VH60_DROME
NCBI RefSeq     XP_395525.3       0.0  78.93%  PREDICTED: similar to CG9461-PA [Apis mellifera]
NCBI nr blastp  gi|380025995      0.0  78.93%  PREDICTED: F-box only protein 11-like [Apis florea]
NCBI nr blastx  gi|350417513      0.0  79.06%  PREDICTED: F-box only protein 11-like [Bombus impatiens]
Group
Gene Ontology    GO:0008270  7e-12    zinc ion binding
                 GO:0004842  7e-12    ubiquitin-protein ligase activity
                 GO:0005515  1.9e-10  protein binding
KEGG pathway
InterPro domain  [520-703]  IPR011050  2.3e-28  Pectin lyase fold/virulence factor
                 [132-215]  IPR022364  2e-18    F-box domain, Skp2-like
                 [572-749]  IPR012334  5.9e-15  Pectin lyase fold
                 [836-896]  IPR003126  7e-12    Zinc finger, N-recognin
                 [413-549]  IPR006633  9.4e-11  Carbohydrate-binding/sugar hydrolysis domain
                 [132-172]  IPR001810  1.9e-10  F-box domain, cyclin-like
                 [833-898]  IPR013993  2.1e-08  Zinc finger, N-recognin, metazoa
                 [499-540]  IPR022441  2e-06    Parallel beta-helix repeat-2
Orthology group  MCL14770 Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS202337-TA
ATGCCTAGTGCCTCCTTTTCCTCTTCGCGTTCTTACGTCCGTAGATCGCGAAGAAAAGGTGGTCATCGGATACCATTGCCTTCCAGGACACAATCGAGTGAGCCATGTGAATCAGTGCCGTGTCCGAATAACGTGACGGGGAGCGCTATGGCAGCTACAGCATGTGCAGGTCCGAGTGGCAGTGGTGGGGGTGGATCCCCGCCGGTCCCCGCGGCCGCAGCCGCCACCACAGCTGGTCATCACAGCCCATATGACTTGCGTAGGAAGTCACCACCGGCTTACCATGAACCGGGGCCATCGGGTACCTGCTCTCTTCCAGCAAGGAAAAGGCCCAGGACGTCGCTGTCCCAGGGTGTGGATGTGTGTAACGTGTCCCAGTACCTTCAGTACGAGCTGCCTGATGAGGTGTTGCTCTGCATCCTCTCACACCTGACAGAGAGGGACCTGTGCCGCGTCGCTCAGGTCTGCAAGCGGTTCAACACCATCGCCAATGATACAGAGCTGTGGAAGAGTTTATATCAGTCGGTGTTTGAGTACGACACTCCTCTGATGCACCCCGCTCCGTGCCAGTTCGAGTTCGTGGCCGCGGACGACTGTGAGGCCGACAACCCCTGGAAGGAGAGCTTCAGGCAGCTCTACTACGGGATACACGTGAGACCCAACTACCGCCCCAAGAAGGACTCTCGGATCAAACACTTCAACACTATCAGGGCGGCCCTAGAGTACGTGGAGGAGCGCGGCGGGTCTCCGGCGTCCGGCGCTTGTTCGTGCTCGTGCGGCGCGTCCGCGTGCTCGTGCCGGCGCACGCCCTCCGCCCCGGCCGCCGCAGCCCCCGCGCCCGCCCTGCTGTTCGTGCACGCCGGCCTCTACCAGGAGGAGTGCCTAGCGATTGACACGGACGTGCAGCTCATTGGTTGTGCCCCCGGTAATGTAGCCGAGTCTGTGGTGTTGGAACGCGAGGCGGAGTCTACGCTGACGTTCGCTGAGGGCGCTAACAGGTCATATGCTGGTCACATGACGCTCAAGTTCTCACCGGACGCCACGAGTACCATGCAGCATCACAAGCATTACTGTCTGGAGGTGTCCGATAACTGTTCGCCGACAGTCGACCATTGTATCATACGGAGCGCTAGTGTCGTGGGTGCGGCCGTGTGCGTGAGTGGAGCTGGGGCTAATCCGGTCATAAAGCACTGCGACATCAGCGACTGCGAGAACGTAGGGCTGTACGTAACGGACTACGCACAGGGAGCGTACCAAGACAATGAGATATCGAGGAACGCCCTCGCCGGTATATGGGTGAAGAACTTCGCCAACCCCATCATGCGTCGCAACCACATACACCACGGCCGGGACGTCGGTATATTCACCTTTGAAAATGGATTGGGATACTTCGAAGCCAACGACATCCACAACAACAGAATCGCCGGTTTCGAAGTGAAGGCCGGCGCCAATCCTACGGTAGTGCACTGCGAGATACATCACGGTCAAACGGGCGGCATCTACGTGCACGAGTCGGGACTGGGGCAGTTCATAGATAATAAGATACACTCCAACAACTTCGCCGGCGTCTGGATAACCTCCAACAGCAACCCCACCATCCGGCGCAACGAGATATACAACGGACACCAGGGCGGGGTTTATATATTCGGGGAGGGGCGCGGACTTATTGAACACAATAATATATATGGGAACGCGTTGGCCGGAATACAGATTCGCACAAACAGTGATCCTATAGTGAGGCACAATAAGATCCACCACGGCCAGCACGGCGGTATATACGTACATGAGAAGGGTCAGGGGTTGATCGAGGAGAACGAGGTGTACGCCAACACCCTGGCCGGGGTCTGGATCACGACCGGCTCCACTCCCGTGCTGCGGCGTAACCGCATACACTCGGGGAAACAGGTCGGCGTATATTTCTATGACAACGGCCACGGAAAGTTGGAAGACAATGATATATTCAACCACCTGTACTCCGGAGTGCAGATCAGGACCGGCAG
CAACCCCGTGATCCGAGGCAACAAAATCTGGGGCGGCCAGAACGGGGGCGTGCTGGTGTACAACGGCGGCCTCGGCCTGCTCGAACAGAACGAAATATTCGATAACGCCATGGCCGGGGTGTGGATAAAGACTGACTCAAACCCAACATTAAAAAGGAATAAGATCTTTGATGGCAGAGATGGTGGCATCTGCATTTTTAACGGAGGAAAGGGTGTGTTAGAAGAGAACGACATTTTTCGGAACGCTCAGGCCGGCGTGTTAATTTCCACCCAGAGTCACCCTGTACTCCGTAGGAATAGAATCTTCGACGGCCTGGCCGCTGGTGTCGAGATCACAAACAATGCCACGGCGACTCTCGAACACAATCAGATATTTAATAATCGGTTTGGAGGATTGTGTTTAGCGTCGGGCGTGTCGCCGCTGGTGCGAGGCAACAAGATCTTCAGCAACCAAGACGCCGTGGAGAAGGCTGTTGGCGGCGGACAGTGCCTGTACAAGATATCCTCGTACACCTCCTTCCCGATGCATGATTTCTATAGATGCCAGACATGTAACACGACTGATCGTAATGCCATATGTGTCAATTGCATCAAGACATGCCATTCTGGACATGACGTGGAATTTATACGCCATGACAGATTCTTCTGCGACTGCGGTGCTGGGACTCTGTCCAACCAGTGCCAGTTGCAGGGCGAGCCTACCCAGGACACGGACACGCTGTACGACTCAGCCGCGCCGATGGAGTCGCACACGCTCATGGTCAACTGA

Protein sequence:

>DPOGS202337-PA
MPSASFSSSRSYVRRSRRKGGHRIPLPSRTQSSEPCESVPCPNNVTGSAMAATACAGPSGSGGGGSPPVPAAAAATTAGHHSPYDLRRKSPPAYHEPGPSGTCSLPARKRPRTSLSQGVDVCNVSQYLQYELPDEVLLCILSHLTERDLCRVAQVCKRFNTIANDTELWKSLYQSVFEYDTPLMHPAPCQFEFVAADDCEADNPWKESFRQLYYGIHVRPNYRPKKDSRIKHFNTIRAALEYVEERGGSPASGACSCSCGASACSCRRTPSAPAAAAPAPALLFVHAGLYQEECLAIDTDVQLIGCAPGNVAESVVLEREAESTLTFAEGANRSYAGHMTLKFSPDATSTMQHHKHYCLEVSDNCSPTVDHCIIRSASVVGAAVCVSGAGANPVIKHCDISDCENVGLYVTDYAQGAYQDNEISRNALAGIWVKNFANPIMRRNHIHHGRDVGIFTFENGLGYFEANDIHNNRIAGFEVKAGANPTVVHCEIHHGQTGGIYVHESGLGQFIDNKIHSNNFAGVWITSNSNPTIRRNEIYNGHQGGVYIFGEGRGLIEHNNIYGNALAGIQIRTNSDPIVRHNKIHHGQHGGIYVHEKGQGLIEENEVYANTLAGVWITTGSTPVLRRNRIHSGKQVGVYFYDNGHGKLEDNDIFNHLYSGVQIRTGSNPVIRGNKIWGGQNGGVLVYNGGLGLLEQNEIFDNAMAGVWIKTDSNPTLKRNKIFDGRDGGICIFNGGKGVLEENDIFRNAQAGVLISTQSHPVLRRNRIFDGLAAGVEITNNATATLEHNQIFNNRFGGLCLASGVSPLVRGNKIFSNQDAVEKAVGGGQCLYKISSYTSFPMHDFYRCQTCNTTDRNAICVNCIKTCHSGHDVEFIRHDRFFCDCGAGTLSNQCQLQGEPTQDTDTLYDSAAPMESHTLMVN-