Monarch geneset OGS2.0

DPOGS213486
Transcript: DPOGS213486-TA, 1272 bp
Protein: DPOGS213486-PA, 423 aa
Genomic position: DPSCF300100 + 234515-236331
RNAseq coverage: 524x (Rank: top 24%)
Annotation
Heliconius       HMEL016849        1e-168  84.58%
Bombyx           BGIBMGA004368-TA  1e-148  74.74%
Drosophila       Sox15-PA          3e-26   35.63%
EBI UniRef50     UniRef50_D6WQ15   6e-48   35.88%  Putative uncharacterized protein n=3 Tax=Tribolium castaneum RepID=D6WQ15_TRICA
NCBI RefSeq      XP_974207.1       4e-48   35.71%  PREDICTED: similar to conserved hypothetical protein [Tribolium castaneum]
NCBI nr blastp   gi|270010459      2e-47   35.88%  hypothetical protein TcasGA2_TC009856 [Tribolium castaneum]
NCBI nr blastx   gi|270010459      9e-53   34.97%  hypothetical protein TcasGA2_TC009856 [Tribolium castaneum]
Group
Gene Ontology    GO:0003677  3.1e-12  DNA binding
                 GO:0005515  1.5e-10  protein binding
KEGG pathway     xtr:395066  4e-14
                 K04495 (SOX17) maps-> Wnt signaling pathway
InterPro domain  [13-60] IPR000910  3.1e-12  High mobility group, HMG1/HMG2
                 [14-52] IPR009071  1.5e-10  High mobility group, superfamily
Orthology group  MCL19498  Insect specific

Nucleotide sequence:

>DPOGS213486-TA
ATGTCGTACTATCGACTTTTTTATCGACAACTCGCCTCTCTAAGTAAAAAATGGCGTTCATTGACCCCTCAAGACCGACGGCCGTTCGTCGAGGAAGCGGAGCGGCTTCGGGTCATTCATATGACGGAACATCCCAACTACAAGTACCGGCCGCGGAGACGGAAACAGAACAAAGCGAGGCCCAACCAGCCCTCGACTCCGTCTAGCGCTCCGCCCCCAGCCTCCGCCACGCTCGGCTCTCCGTACGCTACTGCCACCGACTCACCAGACACCCGCTTTACGCACAACCCTGGCGCTGGTTTCTCTCCTTACCGAACCTCCCCACTCGCGTCCGACTACCAAACAGGTCAATTCAATGGCCACGTTCAAACACCGGAATCCTCCCCCGCGAGGTCGCCAGAACCTCAGGGCAGGAGGAGCGCCCCAGCTGAAGCGCCACTCCCTACTCCCGACGCTTCCCCAGTGGAGAACGAAAAGGAGAACTTCCAGTACGAGGACAGGCGAAGAGCGATAAACGCTTCGTCAATGAACGATTCTTATTCTTACAAAACGTTCAGAAACCCTGGGTCGTACTCCCCGGCGCCGGTCGCCGCTATGGGGATGGCTAACGGTATGTACGTGATGTGTCGGACTTTAGCTGAACAGCCGCCTCTGGTTACGGGAACGTTCTTCCCACCAGTTGCGACATCTCAAGACCAGCAGGCGCTTGGATCCACGGCACCCAGAGTCTCATCTGCCCCGAGTGGTCCTCTCAATACCGAATACAACATCCAGTACCACCCTTACGATCACTACGAACAGATGTACAAAACCGAAGACTCATACGTCACACACTATCCAGAGCAACCGAAAACGGAGTACGAGGAATACAACCCAGGGGGTCCGTACTTCGCAGAGGAACCACCAAACCAGCCTCAGGAAACCGAGCAGATAATCAACCCGCGGCCGGAAATACGGACCGGATCTCCGGAGTCCGACGTCGACGCGAGGGAATTTGATAAGTATTTGGACTATGGAGCAGAAGGAGCGATGGAGCAATATAGATACGAGCAGCAACAGCAGCAGCAGTACAGACAGCAGTATTGCGAACAAACTCAGGGAACGGAGTACTGTTCCGAAAGAATGTACCCCTCACCGTACGCTTCCGTCATCTCCGGGGCTCCCCCGCCTGGGAACTACAGCACACCCCCCGGAACAGACCCGACCCCACGACCTGACGACGAGTTCAGCGTCATACTCGCTGGCGTCCGGCAGACTTGTTACAGCAATTGA

Protein sequence:

>DPOGS213486-PA
MSYYRLFYRQLASLSKKWRSLTPQDRRPFVEEAERLRVIHMTEHPNYKYRPRRRKQNKARPNQPSTPSSAPPPASATLGSPYATATDSPDTRFTHNPGAGFSPYRTSPLASDYQTGQFNGHVQTPESSPARSPEPQGRRSAPAEAPLPTPDASPVENEKENFQYEDRRRAINASSMNDSYSYKTFRNPGSYSPAPVAAMGMANGMYVMCRTLAEQPPLVTGTFFPPVATSQDQQALGSTAPRVSSAPSGPLNTEYNIQYHPYDHYEQMYKTEDSYVTHYPEQPKTEYEEYNPGGPYFAEEPPNQPQETEQIINPRPEIRTGSPESDVDAREFDKYLDYGAEGAMEQYRYEQQQQQQYRQQYCEQTQGTEYCSERMYPSPYASVISGAPPPGNYSTPPGTDPTPRPDDEFSVILAGVRQTCYSN-
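
The transcript and protein lengths listed in this record can be cross-checked with simple codon arithmetic. The sketch below assumes the transcript is coding sequence only, with no UTR (the sequence above starts with ATG and ends with TGA), and that the trailing "-" in the protein sequence marks the stop codon. The function name `cds_length_bp` is a hypothetical helper, not part of any database tooling.

```python
def cds_length_bp(protein_length_aa: int, has_stop_codon: bool = True) -> int:
    """Return the coding-sequence length in bp for a protein of the given length.

    Each amino acid is encoded by one 3-nt codon; a terminal stop codon,
    if present, adds one more codon that is not translated.
    """
    codons = protein_length_aa + (1 if has_stop_codon else 0)
    return 3 * codons


# Values copied from the record for DPOGS213486.
TRANSCRIPT_BP = 1272
PROTEIN_AA = 423

# True when the listed transcript length matches 423 codons plus one stop codon.
consistent = cds_length_bp(PROTEIN_AA) == TRANSCRIPT_BP
print(consistent)
```

Under these assumptions, 423 codons plus one stop codon account for the full 1272 bp listed for the transcript.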