Monarch geneset OGS2.0

DPOGS214105
Transcript: DPOGS214105-TA; 1098 bp
Protein: DPOGS214105-PA; 365 aa
Genomic position: DPSCF300014 - 1995104-1996201
RNAseq coverage: 268x (Rank: top 40%)
Annotation (hit; E-value; identity; description)
Heliconius: HMEL011417; 2e-161; 94.32%
Bombyx: BGIBMGA006156-TA; 1e-138; 81.45%
Drosophila: SoxN-PA; 4e-67; 51.20%
EBI UniRef50: UniRef50_Q17NV3; 1e-81; 50.78%; Sex-determining region y protein, sry n=4 Tax=Endopterygota RepID=Q17NV3_AEDAE
NCBI RefSeq: XP_001648654.1; 2e-82; 50.78%; sex-determining region y protein, sry [Aedes aegypti]
NCBI nr blastp: gi|157104969; 3e-81; 50.78%; sex-determining region y protein, sry [Aedes aegypti]
NCBI nr blastx: gi|157104969; 1e-94; 51.00%; sex-determining region y protein, sry [Aedes aegypti]
Group:
Gene Ontology: GO:0003677; 2.2e-32; DNA binding
    GO:0005515; 3.1e-29; protein binding
KEGG pathway:
InterPro domain: [86-167] IPR000910; 2.2e-32; High mobility group, HMG1/HMG2
    [67-159] IPR009071; 3.1e-29; High mobility group, superfamily
    [156-236] IPR022097; 6e-08; Transcription factor SOX
Orthology group: MCL13617; Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS214105-TA
ATGGAGACGGATCTAAAGGGCGGCAGCCTGCACGCCACGGTGCCGCCGCACCACGCGCTGCAGCACGGGTACGGGTCGCTAGGCGCGCTCGGTGGCATGATGGCGCTGCCGCAGCAGCAGCCGCTGGCCCAGCACCAGCCACTGCAGCACCACCAGCACCAACCCCTGCCCCAGCACCACGCGCAGCCCCAGCAGCAACACCATCAACCCCCAAACAACAACAACAACAACTCCAGCAAGAACTCTAACGCGGAGAGGGTGAAGCGTCCCATGAACGCTTTCATGGTGTGGTCGCGCGGGCAACGCAGGAAGATGGCTTCCGACAATCCTAAAATGCACAACTCGGAAATATCTAAACGTTTGGGTGCACAGTGGAAAGACCTCTCGGAGTCAGAGAAGCGACCATTCATCGACGAGGCGAAGAGGCTTCGGGCCGTTCACATGAAGGAACACCCGGACTACAAGTACAGACCCCGGAGGAAAACGAAGACGCTCGCGAAGAAACAGGAAAAGTATCCGTTAGGAGGCGGCGCTCTACTGGGAGCCGGTGACGGTCAGCGCACGAACGCGCCGACGGCTCAGCAGCCGCGGGACGTGTACCAGATGACACCGAACGGGTACATGCCCAACGGTTACATGATGCACGATCCCAGCGCCTATCAACAGCAGGCGTACGGCTACCCGCGCTACGACGTGTCACAGATGCAGCAGCAGTACAGCGGCGGATACTACGGCGGCGGGCAGGGCTCGCCGTATCTCCCTCAGCCGCCGTCTCCTTCCGCGTACGGCCTGGGTCCCGGCTCGCCAGGAGGGTACGCGATGCCCGCCTCGTGCGCCTCGCACTCACCCAGCGGATCATCCGCCAAGTCGGAGCCGGTTTCCCCGGGCCCGCCGGGTATGAAGCGCGAATACGTGCACGAGCCGCAGCTAAAGCGCGAGTTCGCGCACGCGCACGGCGGCCAGAGTCACCCGCACGAGCAGCTCGGCATGAAGCGCGAGTACGGCCAGCAGGACCTCAGCCACATCATCAACATGTACCACGTGCCTGATGAACAGCGGTACGCGGCGGCCGGAGAGCGAGCCATGCCGCTCATCTGA

Protein sequence:

>DPOGS214105-PA
METDLKGGSLHATVPPHHALQHGYGSLGALGGMMALPQQQPLAQHQPLQHHQHQPLPQHHAQPQQQHHQPPNNNNNNSSKNSNAERVKRPMNAFMVWSRGQRRKMASDNPKMHNSEISKRLGAQWKDLSESEKRPFIDEAKRLRAVHMKEHPDYKYRPRRKTKTLAKKQEKYPLGGGALLGAGDGQRTNAPTAQQPRDVYQMTPNGYMPNGYMMHDPSAYQQQAYGYPRYDVSQMQQQYSGGYYGGGQGSPYLPQPPSPSAYGLGPGSPGGYAMPASCASHSPSGSSAKSEPVSPGPPGMKREYVHEPQLKREFAHAHGGQSHPHEQLGMKREYGQQDLSHIINMYHVPDEQRYAAAGERAMPLI-
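
Consistency check:

The lengths reported in this record can be cross-checked against each other. The sketch below is only an illustration: the function names are hypothetical, the genomic coordinates are assumed to be 1-based and inclusive, and the transcript is assumed to be a complete coding sequence that ends in a stop codon (the nucleotide sequence above ends in TGA).

```python
def span_length(start: int, end: int) -> int:
    """Length in bp of an interval, assuming 1-based inclusive coordinates."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1


def expected_protein_length(cds_bp: int) -> int:
    """Residue count for a CDS of cds_bp bases, assuming it includes the stop codon."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    return cds_bp // 3 - 1  # the stop codon does not encode a residue


# Values copied from the record above.
genomic_bp = span_length(1995104, 1996201)
protein_aa = expected_protein_length(1098)

print(genomic_bp)  # 1098
print(protein_aa)  # 365
```

Under these assumptions the genomic span equals the transcript length (1098 bp), which would be consistent with a single-exon gene model, and the 366 codons correspond to the 365 residues listed plus the stop.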