Monarch geneset OGS2.0

DPOGS203118
Transcript          DPOGS203118-TA    3003 bp
Protein             DPOGS203118-PA    1000 aa
Genomic position    DPSCF300094: 58420-68397
RNAseq coverage     352x (Rank: top 33%)
Annotation
Heliconius        HMEL022127                0.0       94.54%
Bombyx            BGIBMGA001528-TA          0.0       89.82%
Drosophila        CG42251-PC                2e-48     46.22%
EBI UniRef50      UniRef50_UPI0001757D4E    2e-139    44.42%    UPI0001757D4E related cluster n=1 Tax=unknown RepID=UPI0001757D4E
NCBI RefSeq       XP_971667.2               4e-140    44.42%    PREDICTED: similar to CG11146 CG11146-PA [Tribolium castaneum]
NCBI nr blastp    gi|189233990              8e-139    44.42%    PREDICTED: similar to CG11146 CG11146-PA [Tribolium castaneum]
NCBI nr blastx    gi|328780654              3e-158    37.89%    PREDICTED: hypothetical protein LOC550870 isoform 2 [Apis mellifera]
Group
Gene Ontology      GO:0005515             1e-26    protein binding
KEGG pathway       hmg:100201881          7e-11
    K07365 (NCK) maps ->
        Pathogenic Escherichia coli infection
        T cell receptor signaling pathway
        ErbB signaling pathway
InterPro domain    [885-997] IPR000980    1e-26    SH2 motif
Orthology group    MCL16054               Insect specific

Nucleotide sequence:

>DPOGS203118-TA
ATGCCTTCAAATAAAAAGTTGATGGCCAAAATGAAACAAATAGCCCAGTGGATTTTGGTTGTCCATGTGGTTGAATTTATAGAAGCTTATTGCACACCACGCGACACAGCGAGAAGATCACCACAGCGTAGTCGTCGCTCGAGGGAGCCTCGTCGTTCCCCAGATACATCCCGATGGCGTTACGATTCTCGTCGCCACAACTACGACTTTGATACACGAAGACCATACGAGTACGAATCTAGGAAGTATTTCGGAAGAGAATACGACGCGGATTCGGGAAGACGACCAGCGTTTGACGCGGAATCAGCCCGTCGGCGTGCGGATTACGAATCAGCTCGAAGGACCGACTTTGATTTGTCCGCTCCCGAGGTCTCTAGGCGGAGTCCCGAAAGCGCACCAGAAGGTTCAGCAGATTCCCCCCCTGAACCAGCTCCTTCCGCTCCACTACGTCAAGAAAGACATTCAAGACATCGATCACAGCCATATTACGAATCTGAATCATCTTCAGAAGATCGATCATCTTCAACTGAGGAGGAAGACGAGTTCGGTGAATTAGATCCGTTCGCTACAGCACCTTCACTACACTCGCCGGACAGTGGACGTTCTGATCCCCGATACGCGAACGCACCCAGACGAAACCCAAGCCCTTATTATTACGGCGACCTATTTAAATCAGCGCCGCCACCCCCAGCTCCTTCAGCGCCAACTGTATCATCAACATCTTCATCACGTGGCCCTCGCTATAGGAAATCTGCGAGTCTCGACGCTCCACACGTCGCACAGAGACCAAATCTGCCTAAAAGGTTCTCCGTGGCAGAAGATGGCTTTCGTGAATTAGAGCGGTCATCATCATCGTCCGGTGAGAGCGCGTGGATGCCGCCGAGGCGCCGCGGGCGCGACGCGGCCGGCTGCGCATGCGCCGCGCCCGAAGATGTAGAGGAGCATAGGTCCTCGAGAAGTGTGTTCTACGTGCCAGCGCCATTGCCACGCTGGCCTCAAGAAATGACGACTCGACAGATATACGAAACAGCCTTCGACTGTAAAATAGCCCGCTCTGATGATGACTTGGACGACTTTGATCGTGTCAGCAATCATCCGGCCTTATTGCAAGCGGAAGAGCGGAAGGTGGTTTCTTCTGCATCATCAGCGTTTGAGGCTGTTGAACGGAGTGAACCTGACTACAAAGGCCGGCGTAAAGTGACCATGAAAACTGACAGCGTACGACGCTCTAAAATTCCTACGATACGACAAAAAAGAGAAAATGAAAGTAATGCTTCAGCGTTATCAGAAGAGCTAGAAAAAGTACACATTGAAGAAGAGGATCGTACTTCAACATCTCAACTTCCATTGCGCGGTTATACTCCATCACCGCCTTCTACGGCACCGTTACCCACGAAATTTCAGCAAAAGGATGGGTCTGCTATGAACAGTATTAAAAGTGCTCCAAATTTACCTCAAACCCAACCATCACATCCCCGCCTTAAAGATTTGCGGTTACCTGTTAAATCACTAAGAGCTCGTGAGACTCCAACCAGTAGTGATGCGAGTTTGACTGATGTGAAAGCCTCAGTATCAACTGAACTTGACGTTGGACAACGACTCAGGGATTCGAGCCGCGAATCCCGTGATGGTGACGACGAGAGGGTACAATCTAAAGACGGTATTTTTTTAGAGATTAAGGGTCGACCCACTTCCGATTCTGATACACCTGAAATGAATCGATATAAGGGCCCTAAAGGGGTCGGCCCTATAATGGAGTTCAAAGGTAGGCCCGGTGTTGCACGTCCACGGCGAAAATATTCGAGCACAGAGAGTATGGCTACTAGCAGCAGTGGTGGAAGCATGGAATCACTGCGAAGTAGCAACAGTGAGGGAGATAGAAGTAGTAGTAGCTCAGAGAGTAGACATTCATCGTCTTTGAGTTCACACAGCTCGGATTCTGGTAATGTGCCATTTGTTAAGTCACATCACATGCAACTATCGGGGTTTGGGCATCATCC
TAATAAACTACACATTCTAAGTCCTATATCAGATAAATCTTCTCAAGAACCGGCTTCCGAAACATCTGATAATAACAAGAACAACAATTCACAGAAAGTTTCACCCGAAGACGCCGAAACTGGTAATGTAACAATGCAGACAACCGTTGAAATCTTACCCAAACCTAAAAGACGTGCTTTACAGAATAGAAATTTGCTAAACCTAACGTTCAGACATTCAACGCCTGGCGATACAGAAATCCAAGGATCAGATAGTGGAATATCAATACATTCAAGAGAAGGAGTTGATTCGAGAAATGCATTCGTTAACTTCAAGAACAGCAACACTGAAGAGGAAAGGAAAGATGAAGACGTTGATTTGTCTGATCTTCCATTCGATATGCCGAAACTGCGAAGACGTAGGGCTGAGGCCGAGGTTGACCTTAGATCTTTGCCCTTCGATATGCCCAAACTAAGACGTAAACTCCGCGGACAATCTTTACAATTAAATAGTGACTTCGGCGAAGCCATCTCTAATGCTTCATCCAGTCAAAGCGTACAAGACTTGAATCAAGATAAGAAACATCGCGATAAATTGACTTTGAACTTCGAAAGTGGCAGTGGTTCTGGTAGTACCAAAGGGTTGCATTTGAATCTGGGACCGATCGCTCCGCCACGAGACTTGATTGATGCCTCTCTTCCACTTGACCGTCAAGGGTGGTATCATGGAACGTTGTCGCGTTTGGAAGCTGAGGGTCTGTTGCGAGACGCGGACGAGGGAGCTTTTCTAGTACGAAACAGTGAATCCGCGAAACACGACTACTCCCTCAGCTTAAAATCGACACGTGGGTTTATGCATATGCGTATATGTCGTGGAGGTGAAGGTTATACTTTGGGAGGTGCGACTACCGCCTTCCCTACCGTTCCGGCTCTCATGAGACATTACGTCACAGCCCAAAGACTCCCTGTCAGGGGAGCTGAACATATGGCACTGTCCACACCACTGCCAGCTGTTATGCTATGA

Protein sequence:

>DPOGS203118-PA
MPSNKKLMAKMKQIAQWILVVHVVEFIEAYCTPRDTARRSPQRSRRSREPRRSPDTSRWRYDSRRHNYDFDTRRPYEYESRKYFGREYDADSGRRPAFDAESARRRADYESARRTDFDLSAPEVSRRSPESAPEGSADSPPEPAPSAPLRQERHSRHRSQPYYESESSSEDRSSSTEEEDEFGELDPFATAPSLHSPDSGRSDPRYANAPRRNPSPYYYGDLFKSAPPPPAPSAPTVSSTSSSRGPRYRKSASLDAPHVAQRPNLPKRFSVAEDGFRELERSSSSSGESAWMPPRRRGRDAAGCACAAPEDVEEHRSSRSVFYVPAPLPRWPQEMTTRQIYETAFDCKIARSDDDLDDFDRVSNHPALLQAEERKVVSSASSAFEAVERSEPDYKGRRKVTMKTDSVRRSKIPTIRQKRENESNASALSEELEKVHIEEEDRTSTSQLPLRGYTPSPPSTAPLPTKFQQKDGSAMNSIKSAPNLPQTQPSHPRLKDLRLPVKSLRARETPTSSDASLTDVKASVSTELDVGQRLRDSSRESRDGDDERVQSKDGIFLEIKGRPTSDSDTPEMNRYKGPKGVGPIMEFKGRPGVARPRRKYSSTESMATSSSGGSMESLRSSNSEGDRSSSSSESRHSSSLSSHSSDSGNVPFVKSHHMQLSGFGHHPNKLHILSPISDKSSQEPASETSDNNKNNNSQKVSPEDAETGNVTMQTTVEILPKPKRRALQNRNLLNLTFRHSTPGDTEIQGSDSGISIHSREGVDSRNAFVNFKNSNTEEERKDEDVDLSDLPFDMPKLRRRRAEAEVDLRSLPFDMPKLRRKLRGQSLQLNSDFGEAISNASSSQSVQDLNQDKKHRDKLTLNFESGSGSGSTKGLHLNLGPIAPPRDLIDASLPLDRQGWYHGTLSRLEAEGLLRDADEGAFLVRNSESAKHDYSLSLKSTRGFMHMRICRGGEGYTLGGATTAFPTVPALMRHYVTAQRLPVRGAEHMALSTPLPAVML-
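
The transcript and protein records above can be cross-checked against each other. The following is a minimal sketch, assuming the transcript is a complete coding sequence read in frame from its first base, that the standard genetic code applies, and that the trailing "-" in the protein record marks the stop codon. The function names `translate` and `lengths_consistent` are hypothetical helpers introduced for illustration; they are not part of any database tooling.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence; stop codons become stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def lengths_consistent(transcript_bp, protein_aa):
    """True if the transcript length equals 3 bases per residue plus one stop codon."""
    return transcript_bp == 3 * (protein_aa + 1)


# First six codons and last four codons of DPOGS203118-TA, copied from the record.
print(translate("ATGCCTTCAAATAAAAAG"))
print(translate("GTTATGCTATGA"))
print(lengths_consistent(3003, 1000))
```

To check the whole record, join the nucleotide sequence lines into one string before passing it to `translate`, then compare the result with the DPOGS203118-PA sequence.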