Monarch geneset OGS2.0

DPOGS214036
Transcript: DPOGS214036-TA, 1140 bp
Protein: DPOGS214036-PA, 379 aa
Genomic position: DPSCF300238 + 194770-197190
RNAseq coverage: 20x (Rank: top 79%)
Annotation
Heliconius      HMEL004778        3e-88  50.41%
Bombyx          BGIBMGA003260-TA  1e-18  34.46%
Drosophila      CG9215-PA         2e-19  33.77%
EBI UniRef50    UniRef50_D2A214   3e-23  28.65%  Putative uncharacterized protein GLEAN_08444 n=2 Tax=Tribolium castaneum RepID=D2A214_TRICA
NCBI RefSeq     XP_973104.1       6e-24  28.65%  PREDICTED: similar to novel KRAB box and zinc finger, C2H2 type domain containing protein [Tribolium castaneum]
NCBI nr blastp  gi|91081761       1e-22  28.65%  PREDICTED: similar to novel KRAB box and zinc finger, C2H2 type domain containing protein [Tribolium castaneum]
NCBI nr blastx  gi|91081761       2e-27  29.18%  PREDICTED: similar to novel KRAB box and zinc finger, C2H2 type domain containing protein [Tribolium castaneum]
Group
Gene Ontology   GO:0003676  6.6e-13  nucleic acid binding
                GO:0005634  6.8e-10  nucleus
                GO:0008270  6.8e-10  zinc ion binding
                GO:0005622  6.1e-05  intracellular
KEGG pathway 
InterPro domain  [326-358]  IPR013087  6.6e-13  Zinc finger, C2H2-type/integrase, DNA-binding
                 [5-76]     IPR012934  6.8e-10  Zinc finger, AD-type
Orthology group: MCL35005 Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214036-TA
ATGTCAATTAATTTGTGCAGAGTTTGTCTTGCAACTGGAGCAACTCTACCTATTTTTGATAAAGAAGAAAGTGTTTTAAATATACAGGCTAACTTAGCGATATGCTTAAAAGAAAAGGTTGAAGACCGAGATGGCTACCCCAGATTTATATGTGTAGTATGTAATCAAATATTATATGAGTTTTGTAAGTTTATAAATAAATATAAAGAGACATGCAAACTACTGGAAAAGGGTTTAGATACAGTAAAAGAGGAAAATAAAGATGATTTCATCTGCAATGAAGCCACAAATGAAGAAGTTGCAATTTGTTCTGTTAAGAAAGACAAAGAACAAAACAATTTATTTGAGACTAAAAAGCAACAAGAACCTTTAAAGAAAATTTTATATGTAAAGTTAGAAAGACTGGAATTACCAACGAAGAATAAATCTTCTGTAAGAAGATGTTTAGAAGTATCTAATAAAATTGCATCTTCCATTTTAGAAGGTGAATTCTCATGGAACGGAAGTGAATGGTGTGTGTCGTCCGACAGTATTAAAAATAAAGTAGTCAAGCCAAAACAATCACGCAATAATGAAAGGAAGGAGCTGAAAATACCAATCGTAAATATAAGAAAACCAAAGCCACCAAAATTGTGTGATCTGTGTGGAGAGGTTTTTAAATCCCACGAAAGGTTGACCGTTCACAAGAAGAGAGCCCATAATTGTGAACGCTCACAATGTAAATATTGCTCAAAAATGTTTAATGCCAGTTACGACCTCAAAAGACACATACTGAGGAGACATGAGACAAAGAAGAATTTTGTATGTGCAATTTGTGGCAGAGGGTTCGCTTTTAATGGAGAACTGACTACACACCATAAAAATGTACATGAGAAACATCTGAAACTTAAAAAGAATTTTAAATGCTATAGACCAGCGGTCTGTACGGTATGTGATTCAAGATTTTTCCATGAAGAATATTTAAAGGAACATATGCGTCTTCATACTGGAGAAACTCCCTTCAAGTGTCCAATTTGTAAAAGAGGTTACGCTCAAAGAGGCAATATGAAGAGTCACATGAAAACTCACAAAAAATCTGAACTTGATGAAGATACTCTGAGAAAATTAAGGCCCAATTATCTGAAACTATTAAAAACATAG

Protein sequence:

>DPOGS214036-PA
MSINLCRVCLATGATLPIFDKEESVLNIQANLAICLKEKVEDRDGYPRFICVVCNQILYEFCKFINKYKETCKLLEKGLDTVKEENKDDFICNEATNEEVAICSVKKDKEQNNLFETKKQQEPLKKILYVKLERLELPTKNKSSVRRCLEVSNKIASSILEGEFSWNGSEWCVSSDSIKNKVVKPKQSRNNERKELKIPIVNIRKPKPPKLCDLCGEVFKSHERLTVHKKRAHNCERSQCKYCSKMFNASYDLKRHILRRHETKKNFVCAICGRGFAFNGELTTHHKNVHEKHLKLKKNFKCYRPAVCTVCDSRFFHEEYLKEHMRLHTGETPFKCPICKRGYAQRGNMKSHMKTHKKSELDEDTLRKLRPNYLKLLKT-
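
The transcript and protein listed in this record can be cross-checked against each other. The sketch below is one way to do that in Python. It assumes the standard genetic code, that the transcript is coding sequence only (start codon through stop codon), and that the trailing "-" in the protein sequence marks the stop codon. The names translate and lengths_consistent are hypothetical helpers, not part of any database tool.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol ("-" is assumed here to match
    the way the protein sequence in this record ends).
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


def lengths_consistent(transcript_bp, protein_aa):
    """Check that the transcript length equals 3 * (residues + 1 stop codon)."""
    return transcript_bp == 3 * (protein_aa + 1)


# First ten codons and last seven codons of DPOGS214036-TA.
print(translate("ATGTCAATTAATTTGTGCAGAGTTTGTCTT"))
print(translate("CTGAAACTATTAAAAACATAG"))
print(lengths_consistent(1140, 379))
```

For this record, 379 residues plus one stop codon correspond to 380 codons, or 1140 bp, which agrees with the listed transcript length.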