Monarch geneset OGS2.0

DPOGS215556
Transcript        DPOGS215556-TA  1290 bp
Protein           DPOGS215556-PA  429 aa
Genomic position  DPSCF300129 + 702192-703481
RNAseq coverage   88x (Rank: top 63%)
Annotation
Heliconius      HMEL006168        3e-153  64.58%
Bombyx          BGIBMGA010685-TA  4e-116  50.62%
Drosophila      Pbp95-PA          4e-11   22.22%
EBI UniRef50    UniRef50_E2AAK2   2e-23   36.75%  snRNA-activating protein complex subunit 4 n=1 Tax=Camponotus floridanus RepID=E2AAK2_CAMFO
NCBI RefSeq     XP_001689386.1    1e-17   28.25%  AGAP001156-PA [Anopheles gambiae str. PEST]
NCBI nr blastp  gi|307182215      6e-23   36.75%  snRNA-activating protein complex subunit 4 [Camponotus floridanus]
NCBI nr blastx  gi|307182215      1e-22   36.75%  snRNA-activating protein complex subunit 4 [Camponotus floridanus]
Group
Gene Ontology    GO:0005515  1.1e-13  protein binding
                 GO:0003677  1.5e-12  DNA binding
                 GO:0006355  1.5e-12  regulation of transcription, DNA-dependent
KEGG pathway
InterPro domain  [6-101]   IPR015495  1.5e-15  Myb transcription factor
                 [60-158]  IPR009057  1.1e-13  Homeodomain-like
                 [62-153]  IPR012287  1.5e-12  Homeodomain-related
                 [59-107]  IPR001005  2.6e-08  SANT domain, DNA binding
                 [60-104]  IPR014778  6.1e-08  Myb, DNA-binding
Orthology group  MCL24798  Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS215556-TA
ATGAGCAATACATTCACAGGTATGAAATGGACTAAGGAGGAAGAGGAATATCTTAAAAGGTTAATTGATTACTACAAACAAGATAACTACATCCCTTGGGGTAAAGTTGCAGCATCTATGGAGAATAGGACCAAGATTCAAATTTATAACAAATTTTTACGATTGGAGGAACACAGAAAAGGCAGGTTTATGCCCGAGGAAGATGCTGTAATATTGACAGGTGTTGATAGTTTTGGGCAAGACTATAAGAAAATTTCAAAATACCTTCCGGGTAGATCCGCAGCCCAGTGCAGAGTTAGGTATCAAGTGTTAGCTAAGAAGCGAATTTCAGCTGTTTGGACAGTGGACGAGGATAGAAAACTAGTGCAATTAATGGCAAATCAAGATTCAAACATTAATTATTCCACTTTGGTTCCTTATTTTCAAGGAAAAGATAGGTTTCATTTGAGATCCCGATATTTGACCTTGACAAAGTGGATGAGACTGCATCCCAATATGGATATAGCTTTAGCACCCAGACGAGGGGCTCGACGCTTAGGTCACGGCCAATCATCTGATGACCTGAATTCAGCCATTGAAAGTTTAAAAACGAAGATTCAATCAGAACTCACAGACAATAGGAAGAAAAGGGTAACTAAAGATTCTCCTGAAAATGTTATTGAAGATGCAATTATTGCTACGCTTGTTACGGAAAATGTCAGGTTGGAAGAAGCTAGGAAAGGTCAGACATCTTGCGATACGCAGACTGGTATGGAACAAAGAAATAAAACGAGCAACGCCTGCAATCTATCAAGTTTGCAGAAAGTTTTAATTCTATTACGATCAAAGTTAAATAAAAAGAAATTTATTCAAAATGGCGATCCGAAGTATAAAGGTTTGATTGAGACAGAAAATGATATTTATTCCGTAAGAGTTAAATCCTATTCTAAGGAGAATATAAAAAAGAATAATGTTAACATTAATTCTAAACCCGATATTTGGGGTGAAGTTTGTCTTGGTCCTTTGGAACACGTGTTTCCACCGCATTACGCGACGATAACTGGTTGCAGGAAACTAATGTCCTATGTAAGCAGCAAACCCAATAGGGATGACACTGTTAACCTACAAACGTTACTAAGAAAAAACATCCTTCTCAAAGAACAACTGCTTCTTTTGATGGAACGATTTAATGCTTTATTTCTTTGGCCCCTTCTTCTATCCAACTCACCTCCAGAGCCTTTCGCATCAATAGAAAATGATAAGTCATTAATTGATAAAGACATTAAAATCTTTAAGGATGATGAAAATTAG

Protein sequence:

>DPOGS215556-PA
MSNTFTGMKWTKEEEEYLKRLIDYYKQDNYIPWGKVAASMENRTKIQIYNKFLRLEEHRKGRFMPEEDAVILTGVDSFGQDYKKISKYLPGRSAAQCRVRYQVLAKKRISAVWTVDEDRKLVQLMANQDSNINYSTLVPYFQGKDRFHLRSRYLTLTKWMRLHPNMDIALAPRRGARRLGHGQSSDDLNSAIESLKTKIQSELTDNRKKRVTKDSPENVIEDAIIATLVTENVRLEEARKGQTSCDTQTGMEQRNKTSNACNLSSLQKVLILLRSKLNKKKFIQNGDPKYKGLIETENDIYSVRVKSYSKENIKKNNVNINSKPDIWGEVCLGPLEHVFPPHYATITGCRKLMSYVSSKPNRDDTVNLQTLLRKNILLKEQLLLLMERFNALFLWPLLLSNSPPEPFASIENDKSLIDKDIKIFKDDEN-
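
Consistency check (illustrative sketch):

The transcript length, protein length, and genomic coordinates listed in this record can be cross-checked with a few lines of Python. The sketch below is not part of the geneset pipeline. The function names (read_fasta, translate, span_length) are hypothetical. It assumes the standard genetic code, 1-based inclusive genomic coordinates, and "-" as the stop symbol, as in the protein entry above.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def read_fasta(text):
    """Return {header: sequence} for FASTA-formatted text."""
    records, header = {}, None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif line and header is not None:
            records[header] += line
    return records


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon, keeping the stop codon."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))
    return protein.replace("*", stop_symbol)


def span_length(start, end):
    """Length of a coordinate range, assuming 1-based inclusive positions."""
    return end - start + 1


# Demo input: the first 12 codons of DPOGS215556-TA followed by an added
# TAG stop codon (the stop is appended here only to keep the example short).
demo = read_fasta(">demo\nATGAGCAATACATTCACAGGTATGAAATGG\nACTAAGTAG\n")
print(translate(demo["demo"]))      # MSNTFTGMKWTK-
print(span_length(702192, 703481))  # 1290
```

With the full records pasted in, translate(records["DPOGS215556-TA"]) can be compared directly against the DPOGS215556-PA entry. A 1290 bp transcript corresponds to 430 codons, that is 429 residues plus the stop codon, which agrees with the listed 429 aa. The genomic span 702192-703481 also covers 1290 positions under the inclusive-coordinate assumption.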