Monarch geneset OGS2.0

DPOGS203023
Transcript: DPOGS203023-TA (3027 bp)
Protein: DPOGS203023-PA (1008 aa)
Genomic position: DPSCF300068 + 473684-487955
RNAseq coverage: 444x (Rank: top 28%)
Annotation
Heliconius      HMEL011042        0.0     61.50%
Bombyx          BGIBMGA012260-TA  3e-127  66.22%
Drosophila      Br140-PA          1e-39   48.37%
EBI UniRef50    UniRef50_Q9VI63   1e-116  85.51%  AF10 n=23 Tax=Eumetazoa RepID=Q9VI63_DROME
NCBI RefSeq     NP_524250.3       2e-117  85.51%  alhambra, isoform A [Drosophila melanogaster]
NCBI nr blastp  gi|45549216       4e-116  85.51%  alhambra, isoform A [Drosophila melanogaster]
NCBI nr blastx  gi|91082411       7e-162  39.16%  PREDICTED: similar to mixed-lineage leukemia protein [Tribolium castaneum]
Group
Gene Ontology    GO:0005515  1.4e-07  protein binding
                 GO:0008270  1.4e-07  zinc ion binding
KEGG pathway
InterPro domain  [12-96]  IPR011011  3.1e-14  Zinc finger, FYVE/PHD-type
                 [24-79]  IPR013083  1.1e-08  Zinc finger, RING/FYVE/PHD-type
                 [29-77]  IPR001965  1.4e-07  Zinc finger, PHD-type
                 [30-78]  IPR019787  7.5e-07  Zinc finger, PHD-finger
Orthology group  MCL18915  Insect specific

Nucleotide sequence:

>DPOGS203023-TA
ATGTTACGTAACGAACCGGCGCAGAAGGTGGTGTGCAATAATGCTTCCTTCAATATGAAGGAGATTATGAAGGAGATGGTTGGTGGTTGCTGTGTGTGTTCCGACGAGCGTGGTTGGCCTGACAACCCGCTGGTCTACTGCGACGGCAATGGATGTTCGGTTGCAGTACACCAAGCCTGCTATGGAATCATTGCTGTTCCGACCGGTCCCTGGTACTGCAGGAAGTGTGAGAGCCCTGAGACTAAAAGCAAAGTGAGATGCGAGCTGTGCCCGTCTAAGTTGGGTGCGTTGAAGCGTACTGACACAGGTGGCTGGGCCCACGTCGTATGCGCTCTCTACATACCAGAGGTGCGTTTCGGGAACGTGACTTCTATGGAGCCGATTGTCCTGCGACTCATTCCCACCGAGAGATATAATAAGACTTGCTACATCTGCCAGGATCTTGGCAAAACTCATCGCGCCAACGCAGGCGCCTGCATGCAATGCAACAAATCTGGTTGTAAACAGCAGTTCCACGTGACCTGCGCCCAGTCCCTCGGCCTGTTGTGCGAGGAGGCCGGCAACTACCTGGACAACGTGAAATACTGCGGTTACTGCCAACATCACTACAGCAAACTGAAAAAAGGCGGTAACGTGAAGACCATCCCGCCGTACAAGCCGGTCAGTCACGATAGCCGCAGCGACTCGAGCGAGAGGGAGGGGGAGCCGCCGACGACGCACTGCAAGCGAGGGCCCGGCCGGAAGTCGTCTCACTCCAGCGGCGGCGCGTCCGGGAAGAACACGCCCAACTCGTCCAAGACGCCTACGAACACATCCCAACCGATGGACAAGAAGAAGCCTTCACCGTCTCGTCGAGGATCTGTGGCTGAGAGCGGCAGCAAGACCAATACACCAGCCCCCTCACCCTCGCCGCAACACATACAGGAGACACACACTAAAGGAGGTTGTTCTACGCCCATCAACACCGCTAAGATCCCCTTGCCGCCGGAGTCTCCTGGCAAGGAAGGCATGATCAGCTCAGCGGCCATAGCATCTATACCCATACCCCCGAGCACTTCAACAACGACTGTCGTGCAACCTACCAAGCCGTACGAGTCCGTCATCACCAACACGGAGACGGCTGATGCTAAACAGACCAAGAAAAGGAAGGCTGTTCAAGGTTCGCAGTCGGCCGTGGACTATGCATCATCACCGACGCCCGTGGAGGTCGCGAACCAGATCGGGAACAACACTTGGGAACAACAGACCAGCCACGCAACTAGCGACACTAATGTGGAAGTAGAGAAGATTATTAAAAAGGCTAAAACTGAAGGTATGGACGGCGGGTCATCGTCAGCTGGTCACTACACCAGCGTGAGCCCGGCTCCGCCGCCGCCGCCGCCGCCCGCCCACAGCCCGGCGTCACACACCTCGTTACAGAGCCCTAGACATCTCCCCAGTCCGATGCCAGGGCCGAGCGGGATCAACCAGGCTCCCAACATCAGATCGCCTTCGCAGCACCAGATGAAGGAGCGCGAGCCGCCGGCGTCGCTGCTGGTCTCCGTGCCGCTGCCTTCAGCCAGCCACGGCCTGAACCTGTCCGCGCACGCGCACGCCCTCATGCACGCACAGATCCCGCTGCCGTCTCCAATGCCAGAGATGGGGCATATCTTCCATCAGACCCACAAGCAGGTGGCGATGGAGTCGGGGCTGAGTCACTCACCCCACGCTCGGTCCTGGGGCGGTCTGAACGTCTCCTACGAACTACAGGATCCAAACAAACCCGGTGTGAGCGGTATAGCTGGGCCCAGCAAGGAGGCGCTCGTCGGCGCCAACATGGCCAATATGGCCAATATGGCCAACATGGCTAATATGGCGAACATGGCAAACATGGCGAATATGGGCATACCGCCAGCCTTACGGAACAAGAAGAGAGCAGCACTAGCCACGTCCACGGCGAACACCCCGCCGCCACCGCCGATGCAGTCAACCGCAGCTCAGAACCTGAGCGGGATGCGGAG
GGGACCCCAGCCGACCCCGCCGCCCGTGTATCACGAGGCAATCAAAGACTCTCCTCCGAGCTCGCCCGGCTCCGAGAGACCGCTGAAACCTAAATTGGAACACAAGTTGGGTGTGAACTGCTCGGCTCCTCATATGCTCGGTAACGAGCTGAACCCGGAGAGCGGAGCGGCGGCTCGTCTCCAGGAGCAGCTGACGGCCGAGCTGGCTGCCCACGCGGCCGGCGCCGTCAATTCAGCTGACACGCCCATACCGCCGCCCCTCATCAACAAGGCCGCTCCGAGATCCGGTGCTCAAAGCCTGGATCAGCTCCTCGAGCGACAATGGGAACAGGGCTCGCAGTTCCTCATGGAACAAGCACAGCACTTCGACATAGCGTCGCTGTTGTCCTGCCTGCACCAGCTGCGGACGGAGAACGTCCGCCTGGAGGAGCACGTCGGCAATCTCCTGCAGAGGAGGGACCACCTGCTGGCCGTGAACGCACGCCTCGCTATACCACTAGCTGTGGTGAGTGGACCCGGCGAGCCAGTCAGATGCGCTCGCGAGAACGGATCAGGCCTGAGGGCTCCTTCCGTCCCGGGACGACCAGCTGACAATGTAGCCTGCGATCGACATCAGGTAATGATAACAAATATACCTCTATATATAATAAGTGCTCCAGTTGATTCGTACGGCGGGGGGGGGTCCGTCAGCCGGCGAGGCGGCGAGTACGGCGACCTGGCCGACACCGACATGGTCGACGTGGAGAATGTGAGCGATATGGACTATGTGGACCGTGCTTACCGGCCGGGTGGTGGGGTGCTGGGGCCGGGAGAGGTGAGGTGGGCTCACGGATACGCCCTGCCCGGGGAGCCCGGCCCGAGCGGGCTGGCGGCGGGCGCTGTAGTCCCCGCGGCGGGGCGACTCGGCGGCCCCGACGAGAACGGACACGGAGCCCTGCGGACGGTCAGGGCCTTCATAGTACTGGAGCAGCGAAGGCGAGCCAGGAGCGACTTCTCCATCGACGCCATACTGGCGGCTGACTACTGA

Protein sequence:

>DPOGS203023-PA
MLRNEPAQKVVCNNASFNMKEIMKEMVGGCCVCSDERGWPDNPLVYCDGNGCSVAVHQACYGIIAVPTGPWYCRKCESPETKSKVRCELCPSKLGALKRTDTGGWAHVVCALYIPEVRFGNVTSMEPIVLRLIPTERYNKTCYICQDLGKTHRANAGACMQCNKSGCKQQFHVTCAQSLGLLCEEAGNYLDNVKYCGYCQHHYSKLKKGGNVKTIPPYKPVSHDSRSDSSEREGEPPTTHCKRGPGRKSSHSSGGASGKNTPNSSKTPTNTSQPMDKKKPSPSRRGSVAESGSKTNTPAPSPSPQHIQETHTKGGCSTPINTAKIPLPPESPGKEGMISSAAIASIPIPPSTSTTTVVQPTKPYESVITNTETADAKQTKKRKAVQGSQSAVDYASSPTPVEVANQIGNNTWEQQTSHATSDTNVEVEKIIKKAKTEGMDGGSSSAGHYTSVSPAPPPPPPPAHSPASHTSLQSPRHLPSPMPGPSGINQAPNIRSPSQHQMKEREPPASLLVSVPLPSASHGLNLSAHAHALMHAQIPLPSPMPEMGHIFHQTHKQVAMESGLSHSPHARSWGGLNVSYELQDPNKPGVSGIAGPSKEALVGANMANMANMANMANMANMANMANMGIPPALRNKKRAALATSTANTPPPPPMQSTAAQNLSGMRRGPQPTPPPVYHEAIKDSPPSSPGSERPLKPKLEHKLGVNCSAPHMLGNELNPESGAAARLQEQLTAELAAHAAGAVNSADTPIPPPLINKAAPRSGAQSLDQLLERQWEQGSQFLMEQAQHFDIASLLSCLHQLRTENVRLEEHVGNLLQRRDHLLAVNARLAIPLAVVSGPGEPVRCARENGSGLRAPSVPGRPADNVACDRHQVMITNIPLYIISAPVDSYGGGGSVSRRGGEYGDLADTDMVDVENVSDMDYVDRAYRPGGGVLGPGEVRWAHGYALPGEPGPSGLAAGAVVPAAGRLGGPDENGHGALRTVRAFIVLEQRRRARSDFSIDAILAADY-
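
The lengths reported in this record can be cross-checked against each other. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the transcript as listed is coding sequence only (it starts with ATG and ends with the stop codon TGA, shown as "-" in the protein) and that the genomic coordinates are 1-based and inclusive.

```python
def coding_length_matches(transcript_bp: int, protein_aa: int, has_stop: bool = True) -> bool:
    """Check that a coding transcript length equals 3 nt per codon.

    Assumption: the transcript contains no UTRs, so its length is
    3 * (number of amino acids + 1 stop codon, if present).
    """
    codons = protein_aa + (1 if has_stop else 0)
    return transcript_bp == 3 * codons


def genomic_span(start: int, end: int) -> int:
    """Length of a genomic interval, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Values copied from the record above.
print(coding_length_matches(3027, 1008))   # 1008 aa + stop codon vs. 3027 bp
print(genomic_span(473684, 487955))        # locus length on scaffold DPSCF300068
```

Under these assumptions the genomic span is considerably longer than the 3027 bp transcript, which is what one would expect for a gene with introns.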