Monarch geneset OGS2.0

DPOGS211181
Transcript: DPOGS211181-TA | 2904 bp
Protein: DPOGS211181-PA | 967 aa
Genomic position: DPSCF300007 + 466032-470295
RNAseq coverage: 31x (Rank: top 75%)
Annotation
Heliconius: HMEL012424 | 0.0 | 82.57%
Bombyx: BGIBMGA003170-TA | 0.0 | 78.29%
Drosophila: Tg-PA | 2e-102 | 34.14%
EBI UniRef50: UniRef50_D6WL52 | 0.0 | 55.75% | Putative uncharacterized protein n=1 Tax=Tribolium castaneum RepID=D6WL52_TRICA
NCBI RefSeq: XP_972752.1 | 0.0 | 55.75% | PREDICTED: similar to transglutaminase [Tribolium castaneum]
NCBI nr blastp: gi|91082923 | 0.0 | 55.75% | PREDICTED: similar to transglutaminase [Tribolium castaneum]
NCBI nr blastx: gi|91082923 | 0.0 | 55.75% | PREDICTED: similar to transglutaminase [Tribolium castaneum]
Group
Gene Ontology: GO:0018149 | 4.4e-28 | peptide cross-linking
               GO:0003810 | 4.4e-28 | protein-glutamine gamma-glutamyltransferase activity
KEGG pathway 
InterPro domain: [352-965] | IPR023608 | 4.3e-201 | Protein-glutamine gamma-glutamyltransferase, eukaryota
                 [533-630] | IPR002931 | 7e-34 | Transglutaminase-like
                 [757-870] | IPR013783 | 1e-28 | Immunoglobulin-like fold
                 [868-966] | IPR008958 | 4.4e-28 | Transglutaminase, C-terminal
                 [100-235] | IPR014756 | 2.4e-17 | Immunoglobulin E-set
                 [146-231] | IPR001102 | 4.2e-15 | Transglutaminase, N-terminal
Orthology group: MCL15537 | Patchy
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS211181-TA
ATGGAGACGTCAGTGGTACAAAAATCAACAGCGGCCGCTGACAAAATGACTACCAGCATGACTTCGAGCGTGACTGGAATGGGACTTGGCGGGATTACTGGGTTAACAGCTGGAATAAGTACTCTGACATGCATGAGGGAACAGCAAGTAACAATCGGAGGAATGCCGACTAATTATGTGCCCGGGTTATCTGCTAATTATAATATTTCAAGTTTCTCCCAACGGCCAGGAGAGAGCCGTCGCTGTCCGCCTGCGAAGGCTGGACCAACAGCACGATTGCAGACTGTTGGAACAAGACTAAGTCATTCTTCGAGAAACAGTCCCAATACGCAATACGCTCATAATCTCATATGTAAACTGGCCGACCAAAATGAACGTCGACGCCGTCAAGATACTATTGAGTTAATTCCTCAAAGCTATTATGCGCAACATCCGCTTAAAGTTGAACTTACAGAATTTTATTCACGTGATAATGCGAAGGATCACCACACGGATCAATATGATTTAGTTAACGATACCGTTTTACCTAATCCAGTTATTAGAAGAGGACAGAATTTCTTTTTTGCTGTTCGTTTTGATAGGACTTACGATAAACAACAAGACGTTATACGTGTTGTGTTTTGTTTTGGACCTAAGCCAGGCGTTACTAAGGGAACCCGTGTAGTATTGCAAGTTAACTGGAATACCCAACAAGGAGTTTTCCAACATCCGCGTGATGTTATTGGAATGGGCATGGGAATGGCTCGTTCTCTTCCCCAAGAACCAACTACTACTGTTGCCCCTACAGGACCAGTTAGTACTATTTGCACTGTGGGGGCAATTAGTTGTATCCGCGAGACAACAAATTATGGTGTGCGTCGCAGTTCCGTTAGTAATGATCCCTTAACAGTGTCTTCCCAAAACAGTCCCCTAAGCCCCCATTCGTCTGAGACATATACAACTCGTCCAATTATCGAACGCTATAGTGGAACATCCCAACATCCATCATTTGGACGTAGCTATGGTTCAAGACATGGATCTTCACAAAATTTGGCATCTATTGTACAGGAGACGGACAAATGGGACATAAGTATTCAACGTCAAGATGGAAACACTATTACATTCCAAGTGCACGTACCAGCTTCCGCTCCAGTTGGTATATGGAATTGTTGGATTCAAACCAATCGTTTGGGACAGCGTGATAACCGCAACGATTACAAATGCGATGAGGATATTTATGTATTATTTAATCCCTGGTGCCGTGAGGACGCGGTTTATATGGATAATGATTCGCTAAGGAAGGAATATATCTTAAATGACCAAGGAAAACTGTGGTGTGGCACTTGGCGGCAGCCTATAGGCCGCAAATGGATTTTCGGTCAATTTGATGATGTAGTTTTACCTGCTTGTATGTATTTATTGGAAAGCAGTGGACTTGAACATTCAGAACGTGGAAACCCCGTTCGTGTAGTGAGGGCGATTTCAGCAATGATAAACGCTACAAATGAAACCGATGGCTTGATTGTTGGTCGTTACGACGGCGTGTATAAAGATGGAGTAGCTCCTCACGCTTGGACAGGTTCAGTGGCTATTCTTGAACGATATCTCACCAGTGGTGGGAAGTCTGTTGAATACGGTCAATGCTGGGTATTTTCAGGTTTAGTCGTAACCATTTGTAGAGCATTGGGAATACCGTGTCGATCTGTAACAAATTACGTGTCGGCTCTCGATACGAATCGCACTTTTACTGTGGATAAATTTTTTGATCGCGATGGCAACGAGGTACCCAACGGTCCTGATGAGGACTGTTATGATTCTTGCTGGAACTTCCACGCATGGAATGACGTTTGGATGCAAAGGCCTGATTTACCACAAGGTTATGGTGGATGGCAGATAATAGATTCGACACCTCAAGAAGAAGCCGAGTCTGTCAACCAATGTGGCCCGGCGAGTGTGGAGGCTGTTCGACGCGGTGAAGTCGGTTTCCAGTTCGACACACCATTCGTTTTCTCGCAGCTCAA
CTCCGAGCTTTGTCATTTCCAAGAGGAAGAAAATTCCGACTGGGGCTTCGTTAGAATGGCGTCCAATCAAAACACTGTTGGACGAAAAATTATCACCAAGAATCCTAATCGCGAGGACGATGAGGGTGACGGTGATTTGTTAGAAATAACCCACGAGTATAAAACTATTGAGAGCACGTCGCCTGAACGCCTCGCCGTGATCGCTTCGTGTCGTGGCTTCCAGCGTCTGCAGCAGTACTATGAACTGCCAGATAGAAATAACGAAGACGTCCATTTTGATCTAATGGATATTGAAATCGTTCCGTACGGTCAACCTTTCGATTGTACTATGAATATTCAGAACAAGTCCAATGAGGACCGTACGATTTGGTGCGTACTCACAGCCTCATCATGCTATTACACTGGAGCTGTATCATCAAGATTGCGTCGAGCTCAAGGCGAATTCATAGTTCGTGCGGGACAGAAGGAAGTCCTCAAATTGCACGTCACAGCTCAGGAATACATGGACAAACTAGTTGACCACGCGATGGTGAAGGTGTGTGCTATGGCTTACGTGAAGCAGACGCGTCAAACTTGGTCCGATGAGGATGACTTCTCATTACACAAACCAAAACTGCAAATTCAGGTACGCGGAACACCTGCAGTGGAGCAAGAGTGTTCTGCGACGTTAAGCTTTCAGAATCCTTTAAGCGTTCACTTAACTGATTGCTATTTCAGTGTCGAAGGTCCCGGATTGCAAAGACCACGTCAGGTAAAGTTCCGAGATGTAAAACCAGGAGAGCTGGTTAGCTACCAGGAGAAGTTCGTCCCGACACGTCAGGGAGAGAGCCGTATTGTAGTGACTTTCTCATCACGAGAGATCGATGACATCATCGGCTGTACGGCGGTGACGGTGCGTGGCTAG

Protein sequence:

>DPOGS211181-PA
METSVVQKSTAAADKMTTSMTSSVTGMGLGGITGLTAGISTLTCMREQQVTIGGMPTNYVPGLSANYNISSFSQRPGESRRCPPAKAGPTARLQTVGTRLSHSSRNSPNTQYAHNLICKLADQNERRRRQDTIELIPQSYYAQHPLKVELTEFYSRDNAKDHHTDQYDLVNDTVLPNPVIRRGQNFFFAVRFDRTYDKQQDVIRVVFCFGPKPGVTKGTRVVLQVNWNTQQGVFQHPRDVIGMGMGMARSLPQEPTTTVAPTGPVSTICTVGAISCIRETTNYGVRRSSVSNDPLTVSSQNSPLSPHSSETYTTRPIIERYSGTSQHPSFGRSYGSRHGSSQNLASIVQETDKWDISIQRQDGNTITFQVHVPASAPVGIWNCWIQTNRLGQRDNRNDYKCDEDIYVLFNPWCREDAVYMDNDSLRKEYILNDQGKLWCGTWRQPIGRKWIFGQFDDVVLPACMYLLESSGLEHSERGNPVRVVRAISAMINATNETDGLIVGRYDGVYKDGVAPHAWTGSVAILERYLTSGGKSVEYGQCWVFSGLVVTICRALGIPCRSVTNYVSALDTNRTFTVDKFFDRDGNEVPNGPDEDCYDSCWNFHAWNDVWMQRPDLPQGYGGWQIIDSTPQEEAESVNQCGPASVEAVRRGEVGFQFDTPFVFSQLNSELCHFQEEENSDWGFVRMASNQNTVGRKIITKNPNREDDEGDGDLLEITHEYKTIESTSPERLAVIASCRGFQRLQQYYELPDRNNEDVHFDLMDIEIVPYGQPFDCTMNIQNKSNEDRTIWCVLTASSCYYTGAVSSRLRRAQGEFIVRAGQKEVLKLHVTAQEYMDKLVDHAMVKVCAMAYVKQTRQTWSDEDDFSLHKPKLQIQVRGTPAVEQECSATLSFQNPLSVHLTDCYFSVEGPGLQRPRQVKFRDVKPGELVSYQEKFVPTRQGESRIVVTFSSREIDDIIGCTAVTVRG-
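
Consistency check (illustrative):

The lengths listed for this record agree with each other: 967 codons plus one stop codon give 968 x 3 = 2904 nucleotides, and the trailing "-" in the protein sequence appears to mark the stop codon (TAG) at the end of the transcript. The following is a minimal sketch of how such a check could be done, assuming the standard genetic code. The function names `expected_cds_length` and `translate` are hypothetical helpers written for this example and are not part of any Monarch geneset tooling.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order
# (TTT, TTC, TTA, TTG, TCT, ...). "*" marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def expected_cds_length(protein_length, has_stop=True):
    """Nucleotide length of a coding sequence for a protein of the given length."""
    return 3 * (protein_length + (1 if has_stop else 0))


def translate(cds):
    """Translate a coding sequence whose length is a multiple of three."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3))


# Lengths taken from the record above: 967 aa protein, 2904 bp transcript.
print(expected_cds_length(967))            # 2904

# First and last codons of DPOGS211181-TA, copied from the sequence above.
print(translate("ATGGAGACGTCAGTGGTACAA"))  # METSVVQ
print(translate("GTGCGTGGCTAG"))           # VRG*
```

To check the whole record, the two sequence lines under `>DPOGS211181-TA` would be joined into one string before calling `translate`, and the result compared with `>DPOGS211181-PA` after replacing the final "*" with "-".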