Monarch geneset OGS2.0

DPOGS205830
Transcript: DPOGS205830-TA (1563 bp)
Protein: DPOGS205830-PA (520 aa)
Genomic position: DPSCF300081 - 195160-199448
RNAseq coverage: 148x (Rank: top 54%)
Annotation
Heliconius: HMEL017389  0.0  83.82%
Bombyx: BGIBMGA010850-TA  0.0  75.15%
Drosophila: CG6765-PA  6e-31  49.24%
EBI UniRef50: UniRef50_Q0IG94  1e-89  38.66%  Putative uncharacterized protein n=2 Tax=Culicinae RepID=Q0IG94_AEDAE
NCBI RefSeq: XP_001663069.1  2e-90  38.66%  hypothetical protein AaeL_AAEL003075 [Aedes aegypti]
NCBI nr blastp: gi|157133912  4e-89  38.66%  hypothetical protein AaeL_AAEL003075 [Aedes aegypti]
NCBI nr blastx: gi|380023857  4e-91  37.72%  PREDICTED: zinc finger protein 271-like [Apis florea]
Group
Gene Ontology:
GO:0005515  3.7e-18  protein binding
GO:0003676  4.3e-11  nucleic acid binding
GO:0008270  8.8e-06  zinc ion binding
GO:0005622  8.8e-06  intracellular
KEGG pathway 
InterPro domain:
[4-124]  IPR011333  3.4e-21  BTB/POZ fold
[25-125]  IPR013069  3.7e-18  BTB/POZ
[32-134]  IPR000210  1.3e-11  BTB/POZ-like
[463-491]  IPR013087  4.3e-11  Zinc finger, C2H2-type/integrase, DNA-binding
[470-493]  IPR007087  8.8e-06  Zinc finger, C2H2
Orthology group: MCL17011 (Insect specific)
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS205830-TA
ATGACTTCGACGGACACCTACCAGTTGAAATGGCATTCTCACAGTTCACATCTGAACGGCTCGGTAGCGTCTTTGCTGCGCTCGGAGCGTTTCACGGATGTCGTGCTCTGTACTATGGACGGGTCGCAGATACCGGCACACAAGTTCATACTGAGCTCTTGCAGCGTTTATCTCAGCACGTTGTTCGAGGGACAGCGATCTGTGACGCGAATGGGCGGGATGTTGTACGTAGTTCTCCCTTCAGAAATCTCAACAAAGGCTTTGAAGATCCTAGTGGAGTATATGTATAAAGGTGAGACAACTGTATCAAATGAGGTTCTAGATACAGTGCTTAAGGCGGGTGAAGTGCTTAAAATAAGAGGTTTGTGGCGTCAGAACGATGAGGCCGGTGGTGATTCAACTCCCGCTGAAAAAACAACAACCCAGCCGGCCAACATCAACAAGCAAGCCAAGAAACCAGATGAACCAAAGCTCACCGTAAAAAAAGATGATAAACTCCTTAAAACCTTCAACCCCGTGCAGCCGCAGGGCGTCACAAGACCTATGTTCATCGGTCCACCAAAGCTGGTGTTTATAAAGACTTCAGAAGGCAGCACTCAAGCGGCAGCATTGAGACCTGGTGCTCCGAAAGGTCAGACTATACTGGTGGCTCCTGCTCCAGCCACCGAGGCCAATTCATTACCGATACTTTCCAATGCTGAGGACTCTCCCGATGAAACCCCACCACCAAGAATACTGCGTCGACATGCAGCTGAGAGGAAGTACGGGAAGAAACAGAAGACTGATCAAGCGGAAGAGGAAAAGGAACAGGAAGAACCAGACAATGCGTCCGTTAGCTCAAAGAAATCTAGTCAGAGTGTTCCGGACACAAGCACAGAGATCCAGATTAAGGATGAGCCGGAGTGGGATGCGAGCAGTATCGAGGAGGAGGAGAGGTCTATAGCTGAAATTTTCCAAGCCGAGATGAGTGTCAAGTCAGAACCTATTGATGACATGGATATTGAAGAGGAAAGCCTGTTGTACAGTCCTCTGGCGTGTGAGCTGTGCGCGGAGGTGTTCACGGTGCCCGCGGCGTGGGTCCGACACGTACAGAGACACGCGCATTCCGACCATCACCATCCCAGGAAGAGGAGGCGCTCTGCGAGTGACGACACGGAGGAGACGATGGCATTACTCCGCTGTGATCTGTGCCAGAAACACTTCCCCAACCCCGCCGAGTGGGTCCGACACGTGCAGGGGACACACACAGAGACAGAATTGGCCATATCAAACAACAGTGCACCACCAAAACGTCACAATCGTTTCACAGACGGTGAGCAAAACAAGATTTGCTCGCAATGCAAGAAGACCTTCCCTTCGCACGCCTCCATGCTCATACACATGAGAACACACACAGGAGAGCGGCCGTTCGTGTGCGGTCTGTGTAACAAGGGCTTCAATGTGAAGTCCAACCTGCTGCGACATCTGCGGACGCTGCACGACCAGGTCATCAGTCCCGCCCGCATCGACGACGAGGAGGCCGGCGCTCCGCCCTCGCTCAGCGACCTGAAGAAAGAGACTTGA

Protein sequence:

>DPOGS205830-PA
MTSTDTYQLKWHSHSSHLNGSVASLLRSERFTDVVLCTMDGSQIPAHKFILSSCSVYLSTLFEGQRSVTRMGGMLYVVLPSEISTKALKILVEYMYKGETTVSNEVLDTVLKAGEVLKIRGLWRQNDEAGGDSTPAEKTTTQPANINKQAKKPDEPKLTVKKDDKLLKTFNPVQPQGVTRPMFIGPPKLVFIKTSEGSTQAAALRPGAPKGQTILVAPAPATEANSLPILSNAEDSPDETPPPRILRRHAAERKYGKKQKTDQAEEEKEQEEPDNASVSSKKSSQSVPDTSTEIQIKDEPEWDASSIEEEERSIAEIFQAEMSVKSEPIDDMDIEEESLLYSPLACELCAEVFTVPAAWVRHVQRHAHSDHHHPRKRRRSASDDTEETMALLRCDLCQKHFPNPAEWVRHVQGTHTETELAISNNSAPPKRHNRFTDGEQNKICSQCKKTFPSHASMLIHMRTHTGERPFVCGLCNKGFNVKSNLLRHLRTLHDQVISPARIDDEEAGAPPSLSDLKKET-
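
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, assuming the standard genetic code applies to this transcript and that the trailing "-" in the protein record marks the stop codon. The helper name translate and the variables head and tail are illustrative and not part of the database; head and tail are the first 30 and last 12 nucleotides of DPOGS205830-TA.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence (hypothetical helper).

    Stop codons are written as stop_symbol, matching the "-" that
    terminates the protein record above.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]  # KeyError on non-ACGT characters
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 30 and last 12 nucleotides of DPOGS205830-TA.
head = "ATGACTTCGACGGACACCTACCAGTTGAAA"
tail = "AAAGAGACTTGA"

print(translate(head))   # start of the protein record
print(translate(tail))   # end of the protein record, including the stop

# Length check: 1563 bp is 521 codons, i.e. 520 residues plus one stop codon.
print(1563 // 3 - 1)
```

Passing the full nucleotide sequence to translate should reproduce the whole protein record if the assumptions above hold.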