Monarch geneset OGS2.0

DPOGS201083
Transcript        DPOGS201083-TA  1401 bp
Protein           DPOGS201083-PA  466 aa
Genomic position  DPSCF300185 + 123217-131984
RNAseq coverage   125x (Rank: top 57%)
Annotation
Heliconius        HMEL004621        2e-147  90.07%
Bombyx            BGIBMGA001391-TA  2e-132  84.59%
Drosophila        svp-PA            6e-99   76.62%
EBI UniRef50      UniRef50_E3X805   2e-97   70.27%  Putative uncharacterized protein n=1 Tax=Anopheles darlingi RepID=E3X805_ANODA
NCBI RefSeq       XP_001655965.1    6e-98   77.23%  coup transcription factor [Aedes aegypti]
NCBI nr blastp    gi|312374364      5e-97   70.27%  hypothetical protein AND_16003 [Anopheles darlingi]
NCBI nr blastx    gi|347968055      4e-94   71.16%  AGAP002544-PB [Anopheles gambiae str. PEST]
Group
Gene Ontology     GO:0005634  2.5e-46  nucleus
                  GO:0003677  2.5e-46  DNA binding
                  GO:0006355  2.5e-46  regulation of transcription, DNA-dependent
                  GO:0004879  2.5e-46  ligand-dependent nuclear receptor activity
                  GO:0003707  2.4e-43  steroid hormone receptor activity
                  GO:0043401  2.4e-43  steroid hormone mediated signaling pathway
                  GO:0003700  2.4e-43  sequence-specific DNA binding transcription factor activity
KEGG pathway 
InterPro domain   [161-173]  IPR003068  2.5e-46  Transcription factor COUP
                  [195-347]  IPR008946  2.4e-43  Nuclear hormone receptor, ligand-binding
                  [224-347]  IPR000536  4.8e-22  Nuclear hormone receptor, ligand-binding, core
                  [238-259]  IPR001723  5.5e-18  Steroid hormone receptor
Orthology group   MCL12044   Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201083-TA
ATGTTTCGTACGATGCTGCAGGGCCGCTCGTTATGGTCCAAATTCTGGGAAAACTTAGCCTCACTATCCGCGTCTGACATTAAGATAAAATCACCTACGTGCCGACGCGTTCCCACGGCCGTCGCGAAAAACCCCCGACTCCGGCCAGTGGACAACCCTAGACGGGATCGATCGAAGACATTAATTAAAAAGGATGAAATAATAGTCCGGTTCCGATTGTTTATGGTGTATGGTCAAGCGTCCGACCCCCAGGACCTGCCGATACTGGTTCCGGGCGTCATCTTATCTGCTTTTGAATGGGAAATTGTGATATTTGTCGAAACCATACATCAATCGGAATGTTCCGAGACACCATGCAGCATGTTATGGCACGTCAAAATATTCACCTACAGGGGGTTATTTGAATGGTCGTTCCACTTGCCAGGACGAGAATCGAAGGTGGACCCCCGCGGTGCGGTCGGCGTGCACCTAACGAACCAGAGAGCTGTTCAGAGAGGTCGAGTGCCTCCGTCTCAGTCCGCCGGCCTGGCGCTGCCGGGGCAATTCGCCTTGACCAACGGTGATCCCGCCGCGGGCTTGAACAGTCACCCTTACCTCTCCTCGTACATCTCCCTGCTCCTTCGAGCGGAGCCCTACCCCACGCAGCCGGCCTCGAGGTACGGCCAATGCGTGCAGCCCACCAACGTCATGGGTATAGACAATATATGCGAACTAGCCGCCAGGTTGCTCTTCTCCGCCGTCGAGTGGGCGAGGAACATCCCCTTCTTCCCCGAACTGCAGGTCACGGACCAGGTCGCGCTCCTGCGACTGGTTTGGTCCGAGCTGTTCGTCCTCAACGCCTCCCAATGCTCGATGCCCCTCCACGTGGCGCCGCTGTTGGCCGCCGCGGGTCTACACGCGTCACCCATGGCCGCCGACCGCGTGGTGGCCTTCATGGACCACATACGGATCTTCCAGGAGCAGGTGGAGAAGCTGAAAGCGCTCCACGTGGACTCCGCGGAGTACTCCTGTCTGAAGGCCATCGTCCTCTTCACGACAGGTAAAATTTTGGACAGCTTATTCGGGGAGGCGAGGTTGCTGCTGTACAGAGTCGCCGGCGCGTTCGCTGCTATCACGAACCACGGGGAGCTCCTGGCGCTGGTCCGCACGCACTTGGACGCGTACGCCGAGGCGACCAGGGCTCCCCAGCCGCCCGCGCCGCCGCCTCCGTCCGCAGCCTCCTCGGGCTACTACTCCACGATGGAGACATCGCTCGGCGTCAACTCCTCCCTGTCCTACGGCAGCTTCCTGTCTCCGTCGCGTGTGCCGCCTCAGTATACGAGCAGTCCGCGTTTGGACGCGGGTACGTCATCGTTTAAGATATACGAGGGCAGCGGGAGCAGGGTTGACGCCAAGCGATGA

Protein sequence:

>DPOGS201083-PA
MFRTMLQGRSLWSKFWENLASLSASDIKIKSPTCRRVPTAVAKNPRLRPVDNPRRDRSKTLIKKDEIIVRFRLFMVYGQASDPQDLPILVPGVILSAFEWEIVIFVETIHQSECSETPCSMLWHVKIFTYRGLFEWSFHLPGRESKVDPRGAVGVHLTNQRAVQRGRVPPSQSAGLALPGQFALTNGDPAAGLNSHPYLSSYISLLLRAEPYPTQPASRYGQCVQPTNVMGIDNICELAARLLFSAVEWARNIPFFPELQVTDQVALLRLVWSELFVLNASQCSMPLHVAPLLAAAGLHASPMAADRVVAFMDHIRIFQEQVEKLKALHVDSAEYSCLKAIVLFTTGKILDSLFGEARLLLYRVAGAFAAITNHGELLALVRTHLDAYAEATRAPQPPAPPPPSAASSGYYSTMETSLGVNSSLSYGSFLSPSRVPPQYTSSPRLDAGTSSFKIYEGSGSRVDAKR-
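
Length check (illustrative sketch, not part of the original record): the listed transcript and protein lengths can be cross-checked with simple arithmetic, assuming the transcript entry is the coding sequence only and ends in a stop codon (the sequence above starts with ATG and ends with TGA). The function name expected_cds_length is hypothetical and chosen for this example.

```python
def expected_cds_length(protein_aa: int, has_stop_codon: bool = True) -> int:
    """Return the coding-sequence length in bp implied by a protein length.

    Assumes three nucleotides per codon and, optionally, one terminal
    stop codon that is not counted in the protein length.
    """
    codons = protein_aa + (1 if has_stop_codon else 0)
    return 3 * codons


# Values as listed in the record for DPOGS201083.
transcript_bp = 1401
protein_aa = 466

# (466 + 1) codons * 3 nt = 1401 bp, matching the listed transcript length.
print(expected_cds_length(protein_aa) == transcript_bp)
```

Under these assumptions, 466 amino acids plus one stop codon give 467 codons, or 1401 bp, which agrees with the transcript length in the record.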