Monarch geneset OGS2.0

DPOGS207250
Transcript: DPOGS207250-TA, 2760 bp
Protein: DPOGS207250-PA, 919 aa
Genomic position: DPSCF300008 - 984634-1000627
RNAseq coverage: 104x (Rank: top 60%)
Annotation
Heliconius      HMEL009803        1e-165  78.20%
Bombyx          BGIBMGA012062-TA  9e-109  46.83%
Drosophila      mp-PG             2e-97   38.11%
EBI UniRef50    UniRef50_F4X517   2e-145  47.83%  Collagen alpha-1(XV) chain n=5 Tax=Acromyrmex echinatior RepID=F4X517_ACREC
NCBI RefSeq     XP_396317.3       3e-126  45.59%  PREDICTED: similar to CG33171-PC, isoform C [Apis mellifera]
NCBI nr blastp  gi|340720909      9e-147  44.84%  PREDICTED: collagen alpha-1(XXII) chain-like isoform 1 [Bombus terrestris]
NCBI nr blastx  gi|189235667      0.0     46.39%  PREDICTED: similar to collagen alpha 1(xviii) chain [Tribolium castaneum]
Group
Gene Ontology   GO:0031012  1.4e-76  extracellular matrix
                GO:0007155  1.4e-76  cell adhesion
                GO:0005198  1.4e-76  structural molecule activity
                GO:0005488  6e-70    binding
KEGG pathway 
InterPro domain  [707-879]  IPR010515  1.4e-76  Collagenase NC10/endostatin
                 [709-876]  IPR016186  6e-70    C-type lectin-like
                 [708-878]  IPR016187  1.7e-62  C-type lectin fold
                 [359-415]  IPR008160  7.5e-10  Collagen triple helix repeat
Orthology group: MCL11595 Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS207250-TA
ATGAAATGGATTTGGTTACGTGTTGTAATACTTCTCGTGGTACAGGGTGCCTTTCAAGAGCTCAAGCTTTATGGCTCTCCATCGCAAGCGGAAGTGCAATGTGTTAATACATACGAGGTCGACCAGGAGGAAGGAGATTCAGAGGGTTCAGGGCGGTATGGGACCATACCACCGTTTCCACCACCACCACCAGGAATGGATGTAATAATTTATTCGCAGGGTCCGCCTGGGGAATCTATACGTGGCCCTCCTGGGCCACCAGGCCCTCCCGGTCCTCCTGGCGTAAACACCGTCTCGGAAACTTCAGGGTCCGGAGATGACCAAATTTTTGGAGAAAATTACGCATCGCTCGGTCATTGTGGATGCAATTCAAGTGTTTTATTATCACTTTTGGAAATTGCTCCCGAACTTCAAGGCCCTCCAGGTCCTCCTGGTATAACGGGGGCTGACGGATTAACAGGTGCTCCTGGAATACCTGGACAACCTGGCATGCCTGGAGAGAGAGGTTCAATCGGCCAACGAGGCGAAAAGGGTGATAGAGGTGACAGTGGACCACGAGGATCAGAGGGTCAACCAGGTCCTAAGGGAGAGCCAGGTGTAGATGGAAGACCTGGAAGTCCGGGACCACCAGGTCCACCAGGAACACCTGGCTCTTCTGATTATAACAACTTTGATGAATCATTACTGGGTTCATACGGTGGAGCAATTGGAAGACCAGGTGCCCCGGGGCCTAAGGGAGATGCAGGTCAACCTGGACCTATAGGACTACAAGGAGAACGGGGATTCCCTGGACCCAAAGGTGAAAGAGGGCAAATAGGACAAACAGGAGCCAAAGGTGACCGTGGACATCCTGGCCACAAGGGAGATAGAGGAGTAAAGGGAGATCGAGGTAATCCAGGACTAGATGGGCGTTCTGGTCTACCGGGAGCCAATGGACGTTTTGGAGAAAAAGGAGAAAAGGGAGAACGAGGCATACCTGGTCCACCGGGACCGCCATCCCTACCCATAGGAGTTGTTGCTTCGGAAGAACCGGAATTTCTGGCGACAGGTTTACGACACTTAGGACCAGCTGAAAAAGGAGAAAAGGGAGAGAAGGGGAGTCGAGGAAATGACGGAACATCAGGTTTTCCTGGAAAAGATGGTAAGCCAGGCGAAAGGGGTGATATAGGTCCCTCTGGTTTACCAGGTATTGCAGGTCCTCCAGGGAGCCCAGGTCTAAAGGGTGACAGAGGAGAAAGAGGTCCTCCAGGCCCTGTCAGTTTAACATCTGCCGGCTCTGATATTCTAACAATCAAAGGTGAAAAAGGTGAACCAGGCTTAAGGGGCCGAAGAGGACGACCTGGTCCACCTGGGCCGCGAGGAGCCCAAGGACTTCAAGGCTTAGTTGGTCCAACCGGAAAACCGGGCGAAAAAGGTGACATTGGTTTACCTGGTTGGATGGGACGACCAGGAACATTAGGACCACCGGGAATTCCAGGCCCAGTAGGACCAAAGGGAGAAAAAGGGGACCCTGGAGTGAATATATTAGATGTCTCAATGGGAGAAAAAGGAGACCGAGGATTAGAGGGGATATCTGGTCCGAAAGGAGAGCAGGGTCCTATTGGACCTCCAGGTCCACCTGGTCCCGGTTCTAGATCAGAAGCAGTACAATATATTCCTGGACCACCAGGGCCTCCAGGACCACCAGGGCAACCTGGAACTCCTGGAATATCTATTGTCGGACCCAAGGGTGAACCTGGAGTTAGCTACCTAGAAGAATATCCTGTGCATGGAAGCACGAAATACTTTGGTAGACCAGCCTCTCCAGAATATCGACCTCATCAAGACGAAATGAACGCCAACAAGAATGTACCAGGCGCTCTGGTATTCCACACTACGGAAGAGATGCTACGGCTTGCGTCAACAAGTCATCTTGGAGCACTTGCGTATGTGATTGAAGAACAATCCCTTTTTGTAAAGGTTAACTCAGGCTGGCAGTACGTTTTGTTAGGTTCCCTAGT
GACGCAATCAGCTCTCCATACAACAACAACGTCTGCTCCGGCACCACCACCACTACTGCCGGCTGCAAGCCTTGTGCATGCACCTTTATCAAACATGGTGGATACGCCTCTAGCTCCCATGGGACCTAGTCTCCGTCTAGCCGCTCTGAATGAGCCGCTGTCCGGCGACATGCATGGCATACGCCGTGCTGACTATGCCTGCTACCGACAAGCTCGTCGAGCTGGCCTGAAAGGAACATTCAGGGCCTTTCTTACAAGCAGAATACAAAACTTAGATTCCACAGTGCGATATGCTGATAGGCATTTGCCAGTTATCAACACTCAGGGTGACGTCCTATTCCAATCATTCTCAGATATTTTTGATGGAAATGGTGGTGTGATAGCTGGATCCCCAAGGATATACAGCTTTAGCGGAAAGAATATAATGCTTGATTCAAACTGGCCTCAAAAGCTCATCTGGCACGGATCTCATGCGAGCGGAGAACGAGCTCTGGAGACTTTCTGTGAGGAGTGGCAGAGCGCTGATCCCTCATCCCGTGGCATGGCCGCCTCGTTACATTCACACCGACTTTTGTCTCAGGAGAGATATTCCTGTAATAACCACTTTGCAGTATTATGTATTGAAGCTACTTCGCACTTGAGTGTTCGAAGAAAACGAGAGATAGCAAGGTACAACATGTCTTCGGTGAATGACGAGTATCATCCGTACAACGCTGAAGAATATCAAGACTTGTTAAATGAGATATTCGGACAACCATAA

Protein sequence:

>DPOGS207250-PA
MKWIWLRVVILLVVQGAFQELKLYGSPSQAEVQCVNTYEVDQEEGDSEGSGRYGTIPPFPPPPPGMDVIIYSQGPPGESIRGPPGPPGPPGPPGVNTVSETSGSGDDQIFGENYASLGHCGCNSSVLLSLLEIAPELQGPPGPPGITGADGLTGAPGIPGQPGMPGERGSIGQRGEKGDRGDSGPRGSEGQPGPKGEPGVDGRPGSPGPPGPPGTPGSSDYNNFDESLLGSYGGAIGRPGAPGPKGDAGQPGPIGLQGERGFPGPKGERGQIGQTGAKGDRGHPGHKGDRGVKGDRGNPGLDGRSGLPGANGRFGEKGEKGERGIPGPPGPPSLPIGVVASEEPEFLATGLRHLGPAEKGEKGEKGSRGNDGTSGFPGKDGKPGERGDIGPSGLPGIAGPPGSPGLKGDRGERGPPGPVSLTSAGSDILTIKGEKGEPGLRGRRGRPGPPGPRGAQGLQGLVGPTGKPGEKGDIGLPGWMGRPGTLGPPGIPGPVGPKGEKGDPGVNILDVSMGEKGDRGLEGISGPKGEQGPIGPPGPPGPGSRSEAVQYIPGPPGPPGPPGQPGTPGISIVGPKGEPGVSYLEEYPVHGSTKYFGRPASPEYRPHQDEMNANKNVPGALVFHTTEEMLRLASTSHLGALAYVIEEQSLFVKVNSGWQYVLLGSLVTQSALHTTTTSAPAPPPLLPAASLVHAPLSNMVDTPLAPMGPSLRLAALNEPLSGDMHGIRRADYACYRQARRAGLKGTFRAFLTSRIQNLDSTVRYADRHLPVINTQGDVLFQSFSDIFDGNGGVIAGSPRIYSFSGKNIMLDSNWPQKLIWHGSHASGERALETFCEEWQSADPSSRGMAASLHSHRLLSQERYSCNNHFAVLCIEATSHLSVRRKREIARYNMSSVNDEYHPYNAEEYQDLLNEIFGQP-
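
Length check (illustrative sketch, not part of the original record):

The record lists the transcript as 2760 bp and the protein as 919 aa. If the transcript is a complete coding sequence ending in a single stop codon (an assumption suggested by the trailing "-" in the protein sequence), the two figures should satisfy bp = 3 x (aa + 1). The Python sketch below shows that arithmetic, plus a small FASTA reader that joins wrapped sequence lines such as the two-line nucleotide entry above. The function names `read_fasta` and `coding_length_consistent` are hypothetical and are not taken from any MonarchBase tool.

```python
def read_fasta(text):
    """Parse FASTA text into {header: sequence}, joining wrapped lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = ""
        elif header is not None:
            records[header] += line
    return records


def coding_length_consistent(transcript_bp, protein_aa):
    """True if the transcript length equals 3 * (residues + one stop codon).

    Assumes the transcript is CDS-only with exactly one terminal stop codon.
    """
    return transcript_bp == 3 * (protein_aa + 1)


# Values as stated in the record: 2760 bp transcript, 919 aa protein.
print(coding_length_consistent(2760, 919))
```

To run the same check on the sequences themselves, pass the FASTA text to `read_fasta` and compare `len()` of the nucleotide entry against the protein entry with its trailing "-" removed.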