Monarch geneset OGS2.0

DPOGS203949
Transcript: DPOGS203949-TA  1008 bp
Protein: DPOGS203949-PA  335 aa
Genomic position: DPSCF300005  +  93040-94047
RNAseq coverage: 13x (Rank: top 82%)
Annotation
Heliconius: HMEL012084  1e-175  91.37%
Bombyx: BGIBMGA013330-TA  2e-157  90.48%
Drosophila: croc-PA  7e-62  64.80%
EBI UniRef50: UniRef50_G5EI84  4e-155  90.48%  Transcription factor crocodile n=2 Tax=Obtectomera RepID=G5EI84_BOMMO
NCBI RefSeq: XP_001812698.1  1e-75  51.21%  PREDICTED: similar to forkhead protein/ forkhead protein domain [Tribolium castaneum]
NCBI nr blastp: gi|358031584  1e-154  90.48%  transcription factor crocodile [Bombyx mori]
NCBI nr blastx: gi|358031584  0.0  92.26%  transcription factor crocodile [Bombyx mori]
Group
Gene Ontology: GO:0006355  1.8e-63  regulation of transcription, DNA-dependent
               GO:0043565  1.8e-63  sequence-specific DNA binding
               GO:0003700  1.8e-63  sequence-specific DNA binding transcription factor activity
KEGG pathway 
InterPro domain: [55-145]  IPR001766  1.8e-63  Transcription factor, fork head
                 [51-152]  IPR011991  3.3e-47  Winged helix-turn-helix transcription repressor DNA-binding
Orthology group: MCL14510  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203949-TA
ATGCACGCTTTATTTGGAGAACAAAGCCATTACGCATACCGCAGCGGAGCTGCAGGAGGCTATTCTGCTGGAGTCACTCCTTACGCTTACGATCAATACAGATATGGATACGGACCGCCCTATCTTCACCCGCATCAACAGCATGTCGGAACACCGAAAGATATGGTCAAACCTCCCTATTCGTATATTGCACTGATAGCAATGGCGATTCAAAATGCACCCGATAGGCGAATAACCCTCAATGGTATATACCAATTCATCATGGAACGTTTTCCTTATTATAGAGAAAATAAACAAGGGTGGCAAAACTCAATACGACACAACTTGAGTCTTAACGAATGTTTTGTTAAGGTTGCAAGGGATGATAAAAAACCCGGCAAAGGAAGTTATTGGACGTTGGATCCAGATTCTTATAACATGTTTGATAACGGATCTTATTTGAGACGACGTCGCCGTTTCAAAAAAAAGGATGCGCTTAAAGAAAAAGAAGAAGCTTTGAAACGCCAGCAACAATTACAACAAGCGCAAGAACTGGCGGCACAGGAAGCTCTTAGCGCTGCTGATGCTTTAGGACAAGCGCGAGACGTCAAGCCAGACGTAAAGCCTAGAATATTTGAATGTAGACCAAAACGAGAACCCGGTGCGGATTGCACGCGGTATGACAAATTAAGTGAGCCTATAGACGAGTTTAGCGAGCCTCGATTACCGCCCTCCGCAGTTTACTGTTCGCCACAGCCATATTCTTTAGCAGCCGAGGAGTTTCGAGCAGCTACAAGCGGCTGGTATTCTGCACCCGAACCATCCGCTGACCAGCTTCCGCCAGCTTTTCGGGACCTCTTCGAACCGCCTAGTTGCCAGTTGGCGGGATACCGCGGCAGTTCACCGGCTCCCGACGCTTACCGCGCTTCACCTCCACCTCACCACCACTACCGTTCGCCAGCTCCATCCTACTACCATCATCAAGCTTGCGTAGCCGCCGCACCCGCCTCTGCTCATAAATCCTACTGA

Protein sequence:

>DPOGS203949-PA
MHALFGEQSHYAYRSGAAGGYSAGVTPYAYDQYRYGYGPPYLHPHQQHVGTPKDMVKPPYSYIALIAMAIQNAPDRRITLNGIYQFIMERFPYYRENKQGWQNSIRHNLSLNECFVKVARDDKKPGKGSYWTLDPDSYNMFDNGSYLRRRRRFKKKDALKEKEEALKRQQQLQQAQELAAQEALSAADALGQARDVKPDVKPRIFECRPKREPGADCTRYDKLSEPIDEFSEPRLPPSAVYCSPQPYSLAAEEFRAATSGWYSAPEPSADQLPPAFRDLFEPPSCQLAGYRGSSPAPDAYRASPPPHHHYRSPAPSYYHHQACVAAAPASAHKSY-
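
The lengths reported in this record can be cross-checked against each other with a little arithmetic. The Python sketch below is illustrative only: it assumes the genomic coordinates are 1-based and inclusive (the record does not say so), and the helper names `span_length` and `codon_count` are hypothetical, not part of any database tool. The numbers themselves are copied from the record.

```python
def span_length(start: int, end: int) -> int:
    """Length of a coordinate interval, assuming 1-based inclusive ends."""
    return end - start + 1


def codon_count(transcript_bp: int) -> int:
    """Number of whole codons in a coding sequence of the given length."""
    if transcript_bp % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return transcript_bp // 3


# Values copied from the record above.
GENOMIC_START, GENOMIC_END = 93040, 94047
TRANSCRIPT_BP = 1008
PROTEIN_AA = 335

genomic_bp = span_length(GENOMIC_START, GENOMIC_END)
codons = codon_count(TRANSCRIPT_BP)

# If the genomic span equals the transcript length, the coordinates leave
# no room for introns (under the inclusive-coordinate assumption).
span_matches_transcript = genomic_bp == TRANSCRIPT_BP

# One codon more than the protein length is what a terminal stop codon
# would account for (the trailing "-" in the protein sequence).
one_extra_codon = codons == PROTEIN_AA + 1

print(genomic_bp, codons, span_matches_transcript, one_extra_codon)
```

Under these assumptions the genomic span and the transcript are the same length, and the transcript holds one codon more than the protein has residues, which fits the stop codon at the end of the nucleotide sequence.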