Monarch geneset OGS2.0

DPOGS213101
Transcript        DPOGS213101-TA  5115 bp
Protein           DPOGS213101-PA  1704 aa
Genomic position  DPSCF300016 + 61646-71514
RNAseq coverage   67x (Rank: top 67%)
Annotation
Heliconius        HMEL006557        0.0     73.28%
Bombyx            BGIBMGA007861-TA  0.0     58.37%
Drosophila        Cg25C-PB          1e-102  38.49%
EBI UniRef50      UniRef50_B0W426   0.0     45.86%  Collagen alpha chain n=15 Tax=Coelomata RepID=B0W426_CULQU
NCBI RefSeq       XP_001843460.1    0.0     45.86%  collagen alpha chain [Culex quinquefasciatus]
NCBI nr blastp    gi|340722701      0.0     47.51%  PREDICTED: collagen alpha-1(XI) chain-like [Bombus terrestris]
NCBI nr blastx    gi|383858944      0.0     49.80%  PREDICTED: collagen alpha-1(XI) chain-like [Megachile rotundata]
Group
Gene Ontology     GO:0005201  3e-28    extracellular matrix structural constituent
                  GO:0005581  3e-28    collagen
                  GO:0007155  1.8e-21  cell adhesion
                  GO:0005198  1.8e-21  structural molecule activity
KEGG pathway      cqu:CpipJ_CPIJ001786  0.0
                  K06236 (COL1AS) maps-> Amoebiasis
                                         Focal adhesion
                                         ECM-receptor interaction
InterPro domain   [23-230]     IPR008985  2.7e-35  Concanavalin A-like lectin/glucanase
                  [1544-1697]  IPR000885  3e-28    Fibrillar collagen, C-terminal
                  [27-215]     IPR003129  1.8e-21  Laminin G, thrombospondin-type, N-terminal
                  [488-544]    IPR008160  3.7e-09  Collagen triple helix repeat
                  [138-215]    IPR013320  4.3e-08  Concanavalin A-like lectin/glucanase, subgroup
                  [105-208]    IPR012680  8.2e-06  Laminin G, subdomain 2
Orthology group   MCL10234 Patchy
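
The InterPro domain coordinates listed above overlap heavily at the N-terminus, so a quick calculation helps show how much of the 1704 aa protein they actually cover. The following is a minimal sketch, assuming the bracketed coordinates are 1-based and inclusive positions on the protein; the names DOMAINS, span_length and merged_coverage are hypothetical helpers, not part of any database tool.

```python
# Domain rows copied from the InterPro section of this record.
# Assumption: coordinates are 1-based and inclusive on the protein.
DOMAINS = [
    (23, 230, "IPR008985"),
    (1544, 1697, "IPR000885"),
    (27, 215, "IPR003129"),
    (488, 544, "IPR008160"),
    (138, 215, "IPR013320"),
    (105, 208, "IPR012680"),
]
PROTEIN_LENGTH = 1704  # aa, from the record header


def span_length(start, end):
    """Number of residues in an inclusive [start, end] span."""
    return end - start + 1


def merged_coverage(domains):
    """Count residues covered by at least one domain (overlaps counted once)."""
    covered = 0
    current_start = current_end = None
    for start, end, _ in sorted(domains):
        if current_end is None or start > current_end + 1:
            # Close the previous merged block before starting a new one.
            if current_end is not None:
                covered += span_length(current_start, current_end)
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent span: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += span_length(current_start, current_end)
    return covered


lengths = {ipr: span_length(s, e) for s, e, ipr in DOMAINS}
covered = merged_coverage(DOMAINS)
print(lengths)
print(f"{covered} of {PROTEIN_LENGTH} residues fall in an annotated domain")
```

Under the inclusive-coordinate assumption, the four N-terminal hits collapse into the single [23-230] span, leaving three non-overlapping annotated regions.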
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS213101-TA
ATGGCACTTCAGATGAAAGGGTTGCTCCTTATATTGTTTTGCGCTCTCGCCCGGGGAAAGGGGGAAGACAAGCTCGAGATATTTGACGTTCTCTCCGTGGTCAACTATGATGAACTTCCTGAAGGGGTGTCACGGACTCCAGGGCGATGTCAAAATAATCCTTCAGAGCAAGATTACTCAGCTCTTTCCCTTAACGAGAACGCAACATTGCATCAGCTCGCTGCAGGAATGTTTTATAATACATTTCCTGAAGACTTTTCCATATTGTCAGTCGTTAGACTAAATGGCCCAGAACAAAATCCTCTCTTCGTATTATACTCGGACGCCGGAGACGAACAACTACAAGTGGTCGTTGGGGAGATGGTAGAATTGTATTACGAAGATACACGAGGAGATCCCGAGGATCATGAGCTACTCAGCTTCCGAGTCAACATTACAGATGAAAAATGGCATAGAGTGGCACTAAGTATAAAGGGTGACTCAGTAACATTGTTAGTAGACTGTGAAATTCAAGATACACTGCCTTTACGACGTCATCCGGGAAGTACATTCAACTTGGCAGGAGTTTTAGTAGTCGGAGCTCAAGTGACGCCAGATCAATATTACGAGGGCGATATTGAATTGCTCCAATTTGCTAATCAACCGGACCTTGCCTACGATATGTGTATATCAATCGCCCCAGACTGTGGATCCTTTGGGTTGCAAACATCAGATTTAGCGTTGGATAATACTGATTACGATGAAAAAGACTACGACGATCGTATTGAAGTAACGGAATCGTTAGTTCACGTTGAAAGTGAGGGGAAACCTTCGGATCAATTTGATCAGAGTTTACTTATAGAAAGACAGAATCAATCGGAGAATCAGTGGGCAGTTTCAAGCAGAGATATGCCTACATTGAGGCCGTATTGGCAAGTCTCGGAGTCGTATGCTGTGACCCCACCCTATCCTCTAGTAACAGATAATGATGAAAATAGCATCTTCGGGAGATTTACGGATTCAGACGATTTTATTTCATCCGGTAACATTCAAGAAACGCCACCCTCACTCCTAGAAGTATCTAGTACAACGGTTAAAAATGATGTTGATGAAGACGTTGCTCTGACCACAACTGAAGTATCGTCTGCGGGGATAACGACTGAAGATACGGATTATCTCACCCACCCACCAGATATTAGGGGTAACTCTTCTACTGGGTATGATAATTCAGACAGTTACTACGACTATGGCACAATTGGCACTTATTTGGGACCACGTGGGTACCCTGGACCTCCTGGTAGGCAAGGTCCACGAGGACCCAAAGGGGAACCTGGGAAGCCAGGTGCCGAGGGTCAACAAGGGTTTCAGGGAGCTCCAGGGCACGTTTTCGTAGTCCCGTTGCCACAGTCGGGAAATGATAAGGGACCTGACGCACATTCGGAAGCACTACGCCAAATGTTAACACAACATATGGCTTCAATGCGGGGTGCGGAAGGCCCCATGGGTCTCACAGGACCGCCGGGTCCTGAAGGCCCTACTGGAGTTGAAGGTTCGAAGGGCGAACAAGGAGATCAAGGTGAACCAGGACCACCAGGATCAAGGGGCCTCCAAGGACAACCTGGAAGATTAGGACGTCGAGGCCACCCAGGCAGAGATGGAGAACGAGGTCCACCCGGACCTCAAGGTCTAAAAGGAGATCAAGGATACCCTGGCCAAGCCGGAATACCTGGAGATAAAGGCGAAAGGGGTACACCAGGACAACAGGGAGAAACAGGTGCTCCAGGTTTAGATGGACCTCCTGGTGAAGATGGACCTCCAGGACCCCCAGGAATTTCGGGTGAATTGGGGCCTAGAGGCTTTACCGGCCCAAGAGGATTTCCGGGTCTTATTGGATATCCCGGTATACCAGGAAACGAAGGCCAACAAGGCATAAAAGGTGCCGCAGGACAGCCAGGTCCACCTGGATCCCCGGGTCAACCAGGAGTAATGGGACCACCCGGATCCCCAGGACCTCAGGGTCC
CATTGGGGCCCCAGGATTACAGGGGTCCCAAGGGAAGCAGGGAATATCTGGTTTGCCAGGACCCGAAGGCTCACCAGGTACACCGGGTACACCTGGACAACAAGGTCCTGCAGGAGATGTGGGATTACCTGGTCCACAGGGTATGTTAGGATTTCCGGGGCCACGAGGTCTAAAAGGAGATGACGGACCGCGTGGTCCACCCGGTGATAAAGGAGACAAGGGAATAAGAGGAATTGAAGGAGAGAAAGGTGAAATGGGGCAAAAAGGGGAGCGCGGGGTGGCTGGAGAGCCTGGCCCTGCTGGTATCGAAGGACCAGAAGGACAAAAAGGTTCAGAAGGTCCTAGAGGTGAAACTGGTTCAATCGGTCCTGTTGGTGAAAAAGGTGCAACAGGACCTCAAGGACCATCAGGTTACCCTGGAGCTCAAGGCGAAAAGGGAGATAAAGGGGCTTCCGGCAGACGGGGAAGACGAGGGAGCAAAGGAGTTGCGGGCTTAGTAGGAATTCCCGGAGACCGAGGCGAAAGCGGACCAAGGGGCTATCGTGGCCCAAGAGGTCGTAGAGGATCAGATGGGCCGCCAGGACCTAAAGGCGATACAGGACAACCCGGGCCTCCGGGGTCAAGTGGTGAACGTGGTCCACAGGGTTTGGAAGGGCCTAGGGGATATCCTGGGTCCATCGGTCCACCGGTAATGAATATCCCACTTTGCGGGACTCCGGGACAAGCTGGACCTCCAGGCTTACCTGGACCCCCTGGATCTAACGGTGAACCAGGTCCACCTGGTCTACAAGGACCGTCTGGTATGTCAGGAGCACCTGGTGAGGTGGGTCCACCGGGAGATTCAGGAAAAGAAGGACACCCGGGACCACCAGGACCGGAGGGAAAACCAGGCCCTCTGGGACCTCCAGGATCACCTGGCGCAAACGGAGAGCCAGGTTTACCTGGGGCCCCCGGAATTCCTGGAAGTAAAGGTGACATGGGTCCACCAGGACAAGCAGGCGTAAGAGGTGAGAAAGGAGAACAAGGAGAACCTGGACGTGAAGGTTTACAAGGACTTATTGGCCGAGATGGGCCAAGAGGATCTCCTGGACCAGGAGGTCAAAAAGGAGAAGTTGGCGAACCTGGTCCTATAGGTCCTGTTGGCCGTGATGGTCTACCAGGTCCACGGGGCCTCTCTGGGGTCCCTGGACCTATTGGACCTCCAGGAGAAGATGGTGACAAAGGTGAATCCGGTCCACCTGGAGAAAAAGGTTTCAAAGGCGCAATGGGGCAACCCGGCCCATCGGGTGCTCCAGGAATTCAAGGTCTTAGAGGAGAACCTGGACCAGTGGGTTTACCTGGTGATAAAGGACCCCCGGGCGATATTGGTCCACCTGGACCGGCGGGAACTGATGGCACACGTGGGCCTCCGGGACTTATCGGTAAAATTGGGCCCGAAGGTCCAAAAGGTGATCAGGGTTCAAAAGGAGATAGTGGAGAAGTCGGACCTATAGGTCCCCCTGGACCCGCTGGTCCTACTGGATCTGTTGGAAGGAGGGGTCCAAAAGGAAATCAAGGTGAACAAGGTCCTAGAGGTCCGGAAGGAGAAAGAGGGGAAATAGGAAGCCCGGGATCGACAGGTCCGCAAGGACCACAAGGATCTGAAGGAAAAGTGGGACCACGAGGATACGCAGGACCAAAAGGTGATGATGGTTTACCGGGACCTCCAGGTGAAGCAGGTGCTAAAGGACTTCCCGGTCCCGAAGGCGCCAAAGGTGACACTGGACCGTCTGGCTTCCCTGGAGATAGGGGAGAACCTGGTCCACAAGGAGTCAAGGGTGAACCTGGTACTGATGGTCCAGAAGGAAGCCCGGGACCCCCAGGTTCACCTGGACCAATAGGACCTATTGGAAAACCAGGTGAAACTGGTATTCCAGGAAGTCCTGGCACCGAGGGGCAACCTGGTATACAAGGAAATCCTGGAAATCCTGGTGAAAAGGGTAATATGGGTCCTAGGGGAC
TTCAAGGGGAACAAGGTCCACCAGGAGCAATTGGTCCTGTTGGACCAGAAGGACCCCCTGGTTTAAGGGGACTGGCTGGACCAACTGGAGATGTTGGAGCACCTGGTGTTATGGGACCAATGGGTGTACCAGGACCTAGCGGATCCCCCGGCCAACAAGGAATAAAAGGAGAAAAGGGGAATAGGGGAGCGAAAGGTCATACTGGTGATACGGGAAATATTGGAATTAAAGGAGACCAAGGCGAAATTGGAAAACAGGGACCAACAGGTCCAATAGGTCCTATGGGGCCAAAAGGAGACACAGGTCCAATTGGGCCTCCCGGATCCAAAGGAGATGTAGGACCGGCTGGGCTGGCTGGACTTGAGGGACCTCTGGGCCCAAAAGGCACGGCAGGACCTGAAGGTCGCCCCGGTCTACCCGGCCCTCCCGGTGCTCCAGGACCTCCAGGTCCTCCCGCGCCAATCCCTCAGCTACCTTCTGATCTGTTCATGTCCAGTAGACGAAGGCGAAGTATTGAAACTGAATCTACGGAAACCGCTACTGAAGATAGTTATGAAGAAGAGGAAATCGAATGGACTCGTGAGATAATGGCTGGCGTATTAGCAGCTCGAGGAACGCTGGATGCGGCCCGCCGACCGCGTGGTACTCGCTCCAACCCAGCGCTGTCATGCAGGGACTTACGAACATCTCATGCTAATTTAACTGATGGTTTCTACTGGATAGATGCTCGTGGGGGTTCTGGACGTCCTATTAAAGTGTTTTGCGATGGCCAAAGCACCTGTTTGTACCCGGAAAATGTCGATGCTGCAGCTGTGTATTTCGATATATCAGGACAGAAGTTCTCACAGCTTGATGGAGGATATCGGATAAATTACGACAGCGAAGGTTCCGGCTTCATACAAATGCGGTTCTTACGATTGCTATCAACTGGTGCAAGACAAAATTTCACTTACACGTGTGTCAAAACAGTTGCACCACAAAGATCCGATATTCCCGTGGATCTCATAAAAAACAAAAAAATTAAGTTGTTGGGTCAAAACAATTTTGAGTTTAAGGAGCCCCAGATAATAAAGGATGATTGCAAGGTAAAATTTTCGACGCCATTACTAAAATAG

Protein sequence:

>DPOGS213101-PA
MALQMKGLLLILFCALARGKGEDKLEIFDVLSVVNYDELPEGVSRTPGRCQNNPSEQDYSALSLNENATLHQLAAGMFYNTFPEDFSILSVVRLNGPEQNPLFVLYSDAGDEQLQVVVGEMVELYYEDTRGDPEDHELLSFRVNITDEKWHRVALSIKGDSVTLLVDCEIQDTLPLRRHPGSTFNLAGVLVVGAQVTPDQYYEGDIELLQFANQPDLAYDMCISIAPDCGSFGLQTSDLALDNTDYDEKDYDDRIEVTESLVHVESEGKPSDQFDQSLLIERQNQSENQWAVSSRDMPTLRPYWQVSESYAVTPPYPLVTDNDENSIFGRFTDSDDFISSGNIQETPPSLLEVSSTTVKNDVDEDVALTTTEVSSAGITTEDTDYLTHPPDIRGNSSTGYDNSDSYYDYGTIGTYLGPRGYPGPPGRQGPRGPKGEPGKPGAEGQQGFQGAPGHVFVVPLPQSGNDKGPDAHSEALRQMLTQHMASMRGAEGPMGLTGPPGPEGPTGVEGSKGEQGDQGEPGPPGSRGLQGQPGRLGRRGHPGRDGERGPPGPQGLKGDQGYPGQAGIPGDKGERGTPGQQGETGAPGLDGPPGEDGPPGPPGISGELGPRGFTGPRGFPGLIGYPGIPGNEGQQGIKGAAGQPGPPGSPGQPGVMGPPGSPGPQGPIGAPGLQGSQGKQGISGLPGPEGSPGTPGTPGQQGPAGDVGLPGPQGMLGFPGPRGLKGDDGPRGPPGDKGDKGIRGIEGEKGEMGQKGERGVAGEPGPAGIEGPEGQKGSEGPRGETGSIGPVGEKGATGPQGPSGYPGAQGEKGDKGASGRRGRRGSKGVAGLVGIPGDRGESGPRGYRGPRGRRGSDGPPGPKGDTGQPGPPGSSGERGPQGLEGPRGYPGSIGPPVMNIPLCGTPGQAGPPGLPGPPGSNGEPGPPGLQGPSGMSGAPGEVGPPGDSGKEGHPGPPGPEGKPGPLGPPGSPGANGEPGLPGAPGIPGSKGDMGPPGQAGVRGEKGEQGEPGREGLQGLIGRDGPRGSPGPGGQKGEVGEPGPIGPVGRDGLPGPRGLSGVPGPIGPPGEDGDKGESGPPGEKGFKGAMGQPGPSGAPGIQGLRGEPGPVGLPGDKGPPGDIGPPGPAGTDGTRGPPGLIGKIGPEGPKGDQGSKGDSGEVGPIGPPGPAGPTGSVGRRGPKGNQGEQGPRGPEGERGEIGSPGSTGPQGPQGSEGKVGPRGYAGPKGDDGLPGPPGEAGAKGLPGPEGAKGDTGPSGFPGDRGEPGPQGVKGEPGTDGPEGSPGPPGSPGPIGPIGKPGETGIPGSPGTEGQPGIQGNPGNPGEKGNMGPRGLQGEQGPPGAIGPVGPEGPPGLRGLAGPTGDVGAPGVMGPMGVPGPSGSPGQQGIKGEKGNRGAKGHTGDTGNIGIKGDQGEIGKQGPTGPIGPMGPKGDTGPIGPPGSKGDVGPAGLAGLEGPLGPKGTAGPEGRPGLPGPPGAPGPPGPPAPIPQLPSDLFMSSRRRRSIETESTETATEDSYEEEEIEWTREIMAGVLAARGTLDAARRPRGTRSNPALSCRDLRTSHANLTDGFYWIDARGGSGRPIKVFCDGQSTCLYPENVDAAAVYFDISGQKFSQLDGGYRINYDSEGSGFIQMRFLRLLSTGARQNFTYTCVKTVAPQRSDIPVDLIKNKKIKLLGQNNFEFKEPQIIKDDCKVKFSTPLLK-
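
The header reports a 5115 bp transcript and a 1704 aa protein, which agree if the nucleotide sequence is a coding sequence that includes the stop codon (1704 + 1 codons). The sketch below shows one way to check this for FASTA records like the two above. It assumes the trailing "-" in the protein marks the stop codon; parse_fasta and cds_matches_protein are hypothetical helper names, and the toy records are short stand-ins built from the first three codons and the final stop codon of this record, not the full sequences.

```python
def parse_fasta(text):
    """Return {record_id: sequence}, joining sequences wrapped over several lines."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {key: "".join(parts) for key, parts in records.items()}


def cds_matches_protein(cds, protein):
    """True if the CDS length equals 3 * (residues + 1 stop codon)."""
    # Assumption: a trailing '-' or '*' in the protein denotes the stop codon.
    residues = protein.rstrip("-*")
    return len(cds) % 3 == 0 and len(cds) == 3 * (len(residues) + 1)


# Toy stand-ins for the two records (first three codons plus the stop codon).
toy = """>toy-TA
ATGGCA
CTTTAG

>toy-PA
MAL-
"""
seqs = parse_fasta(toy)
toy_ok = cds_matches_protein(seqs["toy-TA"], seqs["toy-PA"])

# The same arithmetic applied to the lengths stated in the record header.
header_ok = 5115 == 3 * (1704 + 1)
print(toy_ok, header_ok)
```

To run the check on the real record, pass the full text of the nucleotide and protein FASTA blocks to parse_fasta and compare the DPOGS213101-TA and DPOGS213101-PA entries.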