Monarch geneset OGS2.0

DPOGS212254
Transcript: DPOGS212254-TA, 2220 bp
Protein: DPOGS212254-PA, 739 aa
Genomic position: DPSCF300077 - 853623-886881
RNAseq coverage: 16x (Rank: top 81%)
Annotation:
  Heliconius: HMEL014948 | 8e-140 | 76.99%
  Bombyx: BGIBMGA011565-TA | 4e-142 | 72.93%
  Drosophila: kn-PC | 0.0 | 72.55%
  EBI UniRef50: UniRef50_P56721 | 0.0 | 74.13% | Transcription factor collier n=18 Tax=Diptera RepID=COLL_DROME
  NCBI RefSeq: XP_973686.2 | 0.0 | 64.23% | PREDICTED: similar to knot CG10197-PB [Tribolium castaneum]
  NCBI nr blastp: gi|189234876 | 0.0 | 64.23% | PREDICTED: similar to knot CG10197-PB [Tribolium castaneum]
  NCBI nr blastx: gi|189234876 | 0.0 | 64.28% | PREDICTED: similar to knot CG10197-PB [Tribolium castaneum]
Group:
Gene Ontology:
  GO:0003677 | 0 | DNA binding
  GO:0006355 | 0 | regulation of transcription, DNA-dependent
  GO:0005515 | 4.1e-13 | protein binding
KEGG pathway:
InterPro domain:
  [107-602] | IPR003523 | 0 | Transcription factor COE
  [409-495] | IPR014756 | 3.2e-22 | Immunoglobulin E-set
  [410-492] | IPR002909 | 4.1e-13 | Cell surface receptor IPT/TIG
  [408-492] | IPR013783 | 1.4e-09 | Immunoglobulin-like fold
Orthology group: MCL10454 | Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS212254-TA
ATGACGTTGTCTCGCTTTAGTGGCGAGGGCGTGTTCCCAGGGCGCCTCCCCCAGTCCGCGGGGACGGGGGCACGCGAAGCTCGAGCACTCCGCGGCCTGTTCCCGCGACGCTTTGATTGTCGCCGAGCCTGTGGGGCTCGCGCGTGTTCGGTCGCGATTCGGTCTGACGAGTTGAACGCGGCGGCGGGCGGGTACACCGGCTCGCCGTGGCCGCCGCTGGAGCTCGACCCGGTTGGATGGGGAAGGAAACTCTACCCCACTGGAGCCCCCAGGTCATCAGGGGGGCTTATGTTCGGGCTCCATCAGGAGGGGGTCCACGCCCAGCCTCGAGGGCCTGTCACCTCGCTGAAGGAGGAACCTCTCACCAGGGCCTGGATGACACCAACCTCACTAGTCGACAATACAAATACGGTGGGTGTTGGCCGCGCTCACTTCGAGAAACAACCGCCAAGCAACCTGCGCAAGAGCAACTTCTTCCACTTCGTAGTTGCTTTATACGACCGAGCCGGACAGCCCGTAGAGATAGAACGAACAGCTTTCATAGGATTCATCGAAAAGGATCAGGAAGCGGAAGGCCAAAAGACAAACAACGGTATCCAGTACAGATTACAGTTACTTTACGCAAATGGTATTCGACAGGAGCAAGACATATTCGTTCGATTGATAGATTCCGTCACGAAACAGGAACAAAACGCAAGCTTATTAAAAGCAATGAGCTCTAAAGGCATGTTCGCACACAAGCCCACCTCAATCAGTAGGGCACGCGGCGGGCACGCGGCGAGCATGCGGCGACCACGGGGCAGATTGTCATCGCGAACCGATTACCCCATCGTATATGAAGGTCAAGATAAGAATCCTGAAATGTGCCGAGTCCTTCTCACGCATGAAGTTATGTGCAGTCGGTGCTGTGATAAGAAAAGTTGCGGCAACAGAAACGAAACGCCATCAGATCCTGTTATCATAGATCGATTTTTCTTAAAGTTCTTCTTAAAATGCAACCAGAATTGTTTGAAAAATGCCGGCAACCCGCGAGATATGAGACGATTTCAGGTGGTAATCTCGACGCAAGTGATGGTGGATGGTCCTCTGCTAGCGATATCGGACAACATGTTTGTCCACAACAACAGCAAGCACGGTAGACGTGCCAAGAGATTAGACCCATCTGAAGGTATTGACGCCGCGCCCGACTCTAATTCGGGGCTATACCCACCGTTGCCCGTAGCAACGCCATGCATCAAAGCAATATCTCCCAGCGAAGGCTGGACGTCAGGGGGCTCTACCGTAATAATAGTGGGGGACAACTTTTTCGATGGACTTCAAGTTGTATTTGGAACTATGTTGGTGTGGAGCGAGCTGATAACATCACATGCGATAAGAGTTCAAACTCCACCGCGGCACATACCTGGTGTAGTTGAAGTCACACTTTCATACAAGAGCAAGCAGTTCTGCAAAGGAGCGCCTGGAAGATTCGTATATGTTTCTCTCAACGAGCCTACAATAGATTACGGCTTTCAGAGACTACAGAAACTTATACCGCGGCATCCTGGTGATCCAGAGAAACTACCAAAGGAGATAATTCTAAAGCGAGCAGCAGACCTCGCCGAGGCCTTGTATTCAATGCCTCGTAATAACCAACTGGGTCTATCCGCTCCTCGCTCGCCCTCCAGTATGCCCTTCAACTCATACACCGGACAGTTGGCGGTCAGCGTCCAAGATACTGCCGCCTCACAGTGGACTGAAGAAGAGTACGCACGCAGCGGCGGTTCGGTATCGCCGCGGTATTGTTCTGCCGCGTCTACGCCGCACGCGCCCGCCGCCTACCCACCGCAACACTACCCTGCACCACCTACCTCACTCTTCAATACCTCCTCGCTGTCTCTAGGTCCCTACCACCCGGCCAACGTAAACGGACATACAGAATACAATGCTTACAAAGAAACCGAGCATTATACGGAAAGAAATGATGATAATAAAACCATCTATCAAAACACTCATACGAA
ATGTATCGACACGAAAACGCATAAAGACAAGTCCCGAAGTGCGTTCGCAGTTGTCAGACAAAGTCCACCGCGTAATTTCCAACAGCAAAATTGGCAACATCTCGCTGTACAGTCAGGAATGGGCGGTCTGGTGTCATCGCCTTTTAGTGTGAATCCGTTCTCGCTGCCCACTTGCAGCGCACAGCAATACGCGCAGACAGCGCCGCTTGCCTCCAAGTAA

Protein sequence:

>DPOGS212254-PA
MTLSRFSGEGVFPGRLPQSAGTGAREARALRGLFPRRFDCRRACGARACSVAIRSDELNAAAGGYTGSPWPPLELDPVGWGRKLYPTGAPRSSGGLMFGLHQEGVHAQPRGPVTSLKEEPLTRAWMTPTSLVDNTNTVGVGRAHFEKQPPSNLRKSNFFHFVVALYDRAGQPVEIERTAFIGFIEKDQEAEGQKTNNGIQYRLQLLYANGIRQEQDIFVRLIDSVTKQEQNASLLKAMSSKGMFAHKPTSISRARGGHAASMRRPRGRLSSRTDYPIVYEGQDKNPEMCRVLLTHEVMCSRCCDKKSCGNRNETPSDPVIIDRFFLKFFLKCNQNCLKNAGNPRDMRRFQVVISTQVMVDGPLLAISDNMFVHNNSKHGRRAKRLDPSEGIDAAPDSNSGLYPPLPVATPCIKAISPSEGWTSGGSTVIIVGDNFFDGLQVVFGTMLVWSELITSHAIRVQTPPRHIPGVVEVTLSYKSKQFCKGAPGRFVYVSLNEPTIDYGFQRLQKLIPRHPGDPEKLPKEIILKRAADLAEALYSMPRNNQLGLSAPRSPSSMPFNSYTGQLAVSVQDTAASQWTEEEYARSGGSVSPRYCSAASTPHAPAAYPPQHYPAPPTSLFNTSSLSLGPYHPANVNGHTEYNAYKETEHYTERNDDNKTIYQNTHTKCIDTKTHKDKSRSAFAVVRQSPPRNFQQQNWQHLAVQSGMGGLVSSPFSVNPFSLPTCSAQQYAQTAPLASK-
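
The lengths stated in this record agree with each other: 2220 bp is 740 codons, which is 739 residues plus one stop codon. The sketch below shows one way such a check could be run on FASTA text like the two records above. The function names `parse_fasta` and `lengths_consistent` are hypothetical, the input is a toy record rather than the real sequences, and the trailing `-` is assumed to mark the stop codon.

```python
def parse_fasta(text):
    """Return {record_id: sequence}, joining sequences that span several lines."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # The record ID is the first token after '>'.
            current = line[1:].split()[0]
            records[current] = ""
        elif current is not None:
            records[current] += line
    return records


def lengths_consistent(transcript, protein):
    """True if the transcript has exactly one codon per residue plus a stop codon."""
    # Assumption: a trailing '-' or '*' in the protein marks the stop codon.
    residues = protein.rstrip("-*")
    return len(transcript) == 3 * (len(residues) + 1)


# Toy records for illustration only (not the real DPOGS212254 sequences).
toy = """>toy-TA
ATGACG
TTGTAA
>toy-PA
MTL-
"""

records = parse_fasta(toy)
print(lengths_consistent(records["toy-TA"], records["toy-PA"]))
```

This compares lengths only. It does not translate the transcript, so it would not catch a substitution or an internal stop codon.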