Monarch geneset OGS2.0

DPOGS202817
Transcript: DPOGS202817-TA (1929 bp)
Protein: DPOGS202817-PA (642 aa)
Genomic position: DPSCF300018 + 433399-438499
RNAseq coverage: 93x (Rank: top 62%)
Annotation
Heliconius | HMEL005991 | 8e-180 | 71.82%
Bombyx | BGIBMGA010518-TA | 7e-168 | 63.98%
Drosophila | chm-PA | 1e-101 | 47.23%
EBI UniRef50 | UniRef50_Q7PM09 | 9e-106 | 51.48% | AGAP009676-PA (Fragment) n=2 Tax=Culicidae RepID=Q7PM09_ANOGA
NCBI RefSeq | XP_318735.4 | 2e-106 | 51.48% | AGAP009676-PA [Anopheles gambiae str. PEST]
NCBI nr blastp | gi|158298552 | 3e-105 | 51.48% | AGAP009676-PA [Anopheles gambiae str. PEST]
NCBI nr blastx | gi|270013354 | 4e-123 | 42.30% | hypothetical protein TcasGA2_TC011945 [Tribolium castaneum]
Group
Gene Ontology:
GO:0005634 | 4.5e-37 | nucleus
GO:0006355 | 4.5e-37 | regulation of transcription, DNA-dependent
GO:0016747 | 4.5e-37 | transferase activity, transferring acyl groups other than amino-acyl groups
GO:0008270 | 8.4e-10 | zinc ion binding
GO:0003700 | 8.4e-10 | sequence-specific DNA binding transcription factor activity
KEGG pathway 
InterPro domain:
[498-629] | IPR016181 | 2.2e-56 | Acyl-CoA N-acyltransferase
[553-629] | IPR002717 | 4.5e-37 | MOZ/SAS-like protein
[357-386] | IPR002515 | 8.4e-10 | Zinc finger, C2HC-type
Orthology group: MCL11687 Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS202817-TA
ATGACTAATTTGGATGTGAGCAACAGCAGCACTGACAGCAGTTCAGGGTCGACTTCTGATTCTAGCAGTAGTGGGACGTCATCTACCAGCTCAGGCTCAGGATCCAGCAGTTCTGACTCTGAATCATCCACCTCCGACGTGCCGGCCACCGTACCGCCACCTAAATCACCAGTACCTCCACCAGTGGAAGAACCCGCAAAAGAGGAGCCAAAACCTCAAAGGCCAGCAAAAGTCTCTTCATCCAGTGATGATGAAGTACCAAAACCGGAGACACCAAAACCAATGCCACCAAGACGACGATCAATCAAATCTAAAAATGCTGCATCAGTACTCGTCAAAAAACAACCAGCAGTTCCGAGGATCACAACCGGTGCAAAGTCCAAAGCGATATCAAAAACAATACCGAAAGCAAATAAAATAGATGCAAATTCACTAAAAAAGAAAAGCATTTTCTCACCCGACAACAGTTCTGAATCAGAAACTGAGAGCAAGGACAGCAAAACGTCCAAAACCAGCCCAAAGGGTTCACCGATCAACAAAAAGACCAGTATTAAATCCAGCGACGAAAAAGATACGCCTTCACCATTGATGCCGGTGCAAAATGAAGATTCAAATTCCAACTCCAAGGATAGCTCTAAACGTAGAGCAAGCAGGTCAAGTGGGCCTCCGTCTAAAAAAGCTTCGGAGGACAAAAGCGCGTCTTCATGCTCGTCCAGCCAATCATCAGTGGAATCTGTGTCCTCCGAGAGTGATAGCGATCGCACGGAGAAGAAAGAAGATTCCAATACAAGCAAATCCAAACCATCCTCTAATGCGAGTAGTAAGCCGGAGACAACACCGAAGAGTGGTGATTCGAGTGACTCCGTGGGTACAATGACTCGTAAGTTGACTCGATCCTTGTCAGCCAGAGTGTCCCGGATGGCTGCCAAACCCACCAACACTGACACCGACTCAGAGGCTGATGATAAAACTGTGGAACAGAGTAAGGATGATAAACGCCTGGCAAAGGCTCGGGCTGCGATCGGACGCTCGCCGGTCACACCAGCTCCTCCGACCGCACCTTCAGAGAGGAGGTGTCCCGTCAGAGACTGTGACTCCAGCGGACATCTGGGCGGTAAGGTCAACCGTCACTTCACCTGGGACGCTTGTCCCGTGTATCACAACGTGACGGCTGCCTGGTGCGTCGCGGCGGCCGAGGAGCGAGCAGCCGCCGCCGCGACCAGGAGGCGGGCGCTCGCCGCAATGCACCAGAGGCCCAGGGCTATGCCCACCATCGAACAACGGGCGTACCAGCTCAAGGTCAAGGACCTGCGTTCGAAGTGGAAGGGCAGTCAGGAGTTACGGTCGATGGCGAACAACGAGGAGTTGGGTGATGAGAGGGAGCCGGTGCTGGAAGGCTTCGCCCCCGACTACGACCTGCGGCTGTTCAGGGAAGCGCAGGCTCTGGCGGCTGTTAAGATCGAAGAGGAACTTGGAGATATATCCACCGATAAAGGCACCAGATACGTGGTGATGGGCAAGTATCTGATGGAGGTCTGGTATCAGTCGCCGTACCCGGGCGACGCGGCTCGGGTGCCGAGGCTGTTCGTGTGTGAGTTCTGCCTGTCGCATCACAAGTGCGCGGCCGGCGCTAACAGACACAAGGCCAAGTGTGTATGGAGACATCCGCCCGGGGACGAGGTGTACAGGAAGGACAACCTGAGCGTGTGGCAGGTGGACGGCCGAAAACATAAGCAGTACTGCCAGCAGCTCTGTCTGTTGGCGAAGTTCTTCCTGGACCACAAGACGCTGTACTACGACGTAGAGCCCTTCCTCTTCTATGTGATGACCTGCGCTGATGATGAAGGCTGTCACATCGTTGGATATTTCAGTAAGAGCTTGGAGTCCCTCACTCCAAAAGTTCTAGAAGAAGACAAGAAGTATAACTGA

Protein sequence:

>DPOGS202817-PA
MTNLDVSNSSTDSSSGSTSDSSSSGTSSTSSGSGSSSSDSESSTSDVPATVPPPKSPVPPPVEEPAKEEPKPQRPAKVSSSSDDEVPKPETPKPMPPRRRSIKSKNAASVLVKKQPAVPRITTGAKSKAISKTIPKANKIDANSLKKKSIFSPDNSSESETESKDSKTSKTSPKGSPINKKTSIKSSDEKDTPSPLMPVQNEDSNSNSKDSSKRRASRSSGPPSKKASEDKSASSCSSSQSSVESVSSESDSDRTEKKEDSNTSKSKPSSNASSKPETTPKSGDSSDSVGTMTRKLTRSLSARVSRMAAKPTNTDTDSEADDKTVEQSKDDKRLAKARAAIGRSPVTPAPPTAPSERRCPVRDCDSSGHLGGKVNRHFTWDACPVYHNVTAAWCVAAAEERAAAAATRRRALAAMHQRPRAMPTIEQRAYQLKVKDLRSKWKGSQELRSMANNEELGDEREPVLEGFAPDYDLRLFREAQALAAVKIEEELGDISTDKGTRYVVMGKYLMEVWYQSPYPGDAARVPRLFVCEFCLSHHKCAAGANRHKAKCVWRHPPGDEVYRKDNLSVWQVDGRKHKQYCQQLCLLAKFFLDHKTLYYDVEPFLFYVMTCADDEGCHIVGYFSKSLESLTPKVLEEDKKYN-
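
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch in Python, assuming that the transcript as listed is coding sequence only (it starts with ATG, ends with TGA, and 1929 = 3 x (642 + 1)), that the standard genetic code applies, and that the trailing `-` in the protein record marks the stop codon. The function name `translate` and its `stop_symbol` parameter are hypothetical names chosen for this illustration; they are not part of the database.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), with amino acids
# listed in TCAG codon order: TTT, TTC, TTA, TTG, TCT, ...
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    stop_symbol is an assumption: the protein record above appears to
    use '-' for the stop codon, so that is the default here.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 12 codons and last 5 codons copied from DPOGS202817-TA above.
head = translate("ATGACTAATTTGGATGTGAGCAACAGCAGCACTGAC")
tail = translate("AAGAAGTATAACTGA")
print(head)  # MTNLDVSNSSTD
print(tail)  # KKYN-

# Length check using the values listed in the record:
# 642 residues plus one stop codon, three bases each.
transcript_bp = 1929
protein_aa = 642
print(3 * (protein_aa + 1) == transcript_bp)  # True
```

Passing the full DPOGS202817-TA sequence to `translate` should reproduce the DPOGS202817-PA record if the assumptions above hold; a mismatch would point to a different reading frame, untranslated regions in the transcript, or a non-standard genetic code.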