Monarch geneset OGS2.0

DPOGS206549
Transcript: DPOGS206549-TA  5289 bp
Protein: DPOGS206549-PA  1762 aa
Genomic position: DPSCF300190 + 217412-225660
RNAseq coverage: 2800x (Rank: top 4%)
Annotation
Heliconius: HMEL002288  0.0  68.73%
Bombyx: BGIBMGA014040-TA  0.0  61.19%
Drosophila: vkg-PA  0.0  46.56%
EBI UniRef50: UniRef50_B0WDA5  0.0  45.23%  Collagen alpha-2(IV) chain n=4 Tax=Culicidae RepID=B0WDA5_CULQU
NCBI RefSeq: XP_001951336.1  0.0  48.03%  PREDICTED: similar to collagen alpha-2(IV) chain, partial [Acyrthosiphon pisum]
NCBI nr blastp: gi|350418539  0.0  49.54%  PREDICTED: collagen alpha-2(IV) chain-like [Bombus impatiens]
NCBI nr blastx: gi|383858152  0.0  50.37%  PREDICTED: collagen alpha-2(IV) chain-like [Megachile rotundata]
Group
Gene Ontology: GO:0005201  2.6e-107  extracellular matrix structural constituent
GO:0005581  2.6e-107  collagen
GO:0005488  3.3e-47  binding
KEGG pathway: dpo:Dpse_GA17987  0.0
 K06237 (COL4A) maps-> Small cell lung cancer
    Pathways in cancer
    Amoebiasis
    Focal adhesion
    ECM-receptor interaction
InterPro domain: [1494-1718]  IPR001442  2.6e-107  Collagen IV, non-collagenous
[1494-1603]  IPR016187  3.3e-47  C-type lectin fold
[600-658]  IPR008160  3.4e-10  Collagen triple helix repeat
Orthology group: MCL10127  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS206549-TA
ATGCAGGTTGTATGCAATCAGACCGTATGTGATTGTGCTGGTCTCAAAGGAGATCGAGGTGACATTGGTTCTCCAGGAATACCCGGACCCCAAGGCGACTATGGAGAGGATGGACCTGATGGACCCATGGGACCCCCTGGTGAACCAGGGGACTGGGGCGAGAAGGGTATCTCGGGTGACAAAGGAGAAAGGGGTGTAGATGGCCCATACGGACCGAGGGGATTCACTGGACCTCAGGGACCCACTGGTTTAGAAGGTGTGAGAGGTATTGCTGGTCTTGATGGATGCAGTGGTATTGATGGTATTATGGGTCCTCCTGGCCCCCAAGGTATACCTGGTGATAGAGGGTTACCAGGTCCCTATGGTGAAAAAGGAACACAAGGGTTAGCAGGAGAGGGTGGTGTTAATTCAAGAGGAGCGAAGGGAGATCAAGGTGATAGTGGACGACCTGGCTTACAAGGACCAAGAGGGCCTATTGGATGGCGAGGTGATAGTGGTATGCCGGGAGAAGTTGGTGATCAGGGACCTATGGGGTACCGAGGAGAACCGGGATATAGAGGAGATCTCGGCGACGACGTTGTGGGGCCTCAAGGTGAAAAGGGAGATCAAGGGGAAGTTGGCGAATCAGGGAGGCCAGCGAAAGTTATCTATATTGATCATCTTCAAGAAAATGTTACGATATTAGCCAAGGGAAACAGGGGTGACAAAGGATTTGCGGGTACACAGGGCGTCAAAGGTGTTAAAGGAGACACTGGCTCTATGGGTCCTCCAGGGCCAAATGGACTTAACGGAGACCAAGGTTACAAAGGCGATCAAGGAATAGACGGCCCTAGAGGTAAACCTGGACATCGTGGACGCAAAGGGCCTCTTGGACCAAAAGGTGATAAAGGCTGGCCAGGATATGCTGGGCTAGATGGAGAAGACGGAGAGCCGGGAGCAAGAGGAGAAGACGGAAGACCTGGAATGCCGGGAGTTCAGGGACCTCAAGGAGAGAAAGGAATATACGACGAACGACTTAATGAACCACTTCTGCCAGGTTCAAACGGTCCACAAGGTCCTGTAGGTTATCCAGGACCTCCAGGACCTCCAGGTGCAGATGGAACCAGAGGTTTGCCTGGAATTCAAGGTCCACCAGGTCTTCCGGGACCGAAAGGAATAGCTGGACGTCAAGGACCGCCAGGAAGTTCAGAAAAAGGAGAACCAGGAAATGACGGTTTTAAGGGTTTACCTGGACCTCGAGGTCCTATGGGCTACCCAGGTCCACAGGGAGTATTAGGACCAAAAGGTTTTAAAGGCTCAGCTGTAAGAGGCCCCGAGGGTGAAGAAGGTACACCGGGATTAGATGGCAGACCTGGTATAAGAGGAGACAGAGGGGACTTCGGTTTCATGGGCCTGCCCGGGTATCCAGGTCGAGGTGTTCACGGTGTTGGTCCACCAGGGGAGGATGGTCCTCCAGGACGTCCTGGAGTCGTTGGTGATAGTGGAACACCTGGAAGACCTGGATTTAGAGGTCCAAAAGGAGAACGTGGTGACGACTGTCCATTCTGTCCATCAGGTTTACCAGGTATGAAGGGAAAGAGAGGAGATGAAGGTTTTAAAGGTCAAAAAGGATATCCTGGCCCTGAAGGAGATCGTGGCCAGCGTGGGTTAAAAGGAGAAAGTGGATCACCAGGTTTACCTGGATCAAAAGGTCCGAAGGGCATTACTGGTCCGCCCGGAATGACTGGCCGTCCTGGTCTACAAGGAGAGAAGGGACGACTTATACAGCCTCCTCTTTCCTTAATAATAGCTGAACGTGGACCTCGTGGTTTTATTGGGGATCCTGGTCTTCGTGGAGATCCAGGTTTTCCTGGGCTTCGAGGAGAAAATGGCTGGAAAGGTTCCAAAGGTATGGCTGGTGAGGATGGTTTCCCTGGTCCTGATGGAAGGGATGGATTAAAAGGACGAGACGGCGTACCAGGAATGCCTGGTGAACATGCCGATGTTCCTATACAATTTCT
ATTTGGACAACGAGGAGATAAAGGGATTAAAGGACAACTCGGAGAACCCGGAGATGATGGTCTAAAGGGTGATGCCGGAGAAGCCCTTGGTTTCGGAATAAATGCTAAAGGAGAAAAAGGGGAACCGGGACCGATGGGTCCAGAAGGTTTGCAAGGAATTAAAGGAGATTCTGGTGATATTGGATACGAAGGACTTCCAGGAGAACGAGGGGATATTGGTCTACCTGGTGTTTCTAAACAAGGAGAAAGAGGTGCTAGAGGTTTTCCCGGAGACAAAGGAGATATAGGTCCCTACGGAGAACCTGGAGGTCCAGGTCTTAGGGGTCCTGTGGGATTTGATGCACTTAAAGGCAAGAAAGGTAGTCGGGGAGAAGTTGGGTACGCAATTATTTACGGAGAAAGAGGTTTCGATGGTATGGCCGGGGATTATGGTGATGTAGGTGAACCTGGTTATGCTGGAAACCCCGGAAGAGCAGGTTTGATGGGACCTAAAGGGGAACCAGGTTTACCTGGTGATGTGGGTCCACCTGGACCCGTAGGACCACCAGGACGAAAAGGAATGTCAGGAAACATTATACAGGGTGCACCTGGTATGCCAGGTCAACCCGGACGACTGGGTTCTATAGGATTAATCGGTGAACCAGGACTACAGGGCTACAATGGCTTGCAAGGGGATGTTGGTCCTAAAGGGATGAAAGGAGAAGCTGGTCGAATGGGAAATCGTGGCTGGACTGGTGAACGTGGTCTTACGGGCAGAAGAGGGCGACCCGGACTTATGGGTCAACCTGGCCTGAGTGGCGAAACGGGAGACCGAGGTGAAACTGGTCTTCGTGGTTATGATGGTTTACCTGGTAAAGAAGGTCCCCTGGGCATAATTGGTCAAAAAGGAATACGTGGTGATATTGGTTTACCGGGAGCAGACGGTTTAGGTGGACCTCCAGGTCCTAAAGGAGAGAGAGGTTACGATGGAGTTGTTGGTGATAAAGGAATGCAGGGAGAAAACGCCTCCATAGGAATGAAAGGCATGTCTGGAGACATGGGTTTTAATGGAATGCCAGGAAGACCAGGGCAAACTGGTTTAAAAGGTTTAAGAGGTGACATCGGCAACCCAGGATTAAATTTAAGAGGCCTTAATGGTACAAAAGGATTCCGAGGTGATGATGGCATTCCTGGAAGAGTAGGGGAAAAGGGTTTAAAAGGATTCCAAGGAGATTACGGTTTCGAGGGTATTGCTGGTGAAATAGGAGACGAAGGTTTTCCAGGTTTATCTGGTTTACCTGGACGAATAGGATTTGATGGTGCCAAAGGACCTTCAGGGCACAAAGGATTGCCAGGTTTACAGGGTCCGAAAGGTGATACAGGATTTGAGGGTGAACCAGGTAGAATGGGTTCACCAGGATATCCCGGTGACGTAGGCTTGCGAGGCTTGGTTGGTGAAAGGGGTCCATCTGGCGCCAAAGGAATGTCAGGAGATATTGGACCCAGTATTTATTTACCAGCCACCAAAGGGGATATGGGAGATATCGGAATGGAGGGACTAAAAGGGGGTAAAGGCGAAATGGGTGAACCTGGATTTCCAGGATTAAAAGGCCACAAAGGAGAACAAGGCGATGTAGGCTTACAAGGAGAATTTGGTGATGATGGACTTCCAGGTCCTAAGGGTTATTTAGGAGTAATGGGACCTCCAGGTTTACCAGGTCTAGATGGCATCAACCCTGAGCCAGGAGAACAAGGCAAATCTGGAATTGACGGATTACCAGGTTGGCCAGGTCCCATGGGTCAAAAGGGTGCTCCGGGAGAGTTTGGTATTAATGGTCCTGAAGGAGCACCTGGTCAACCAGGGCTCATTTTTAGTGGACCAAAAGGGTATAAGGGAGCAACTGGTCGACCCGGGCTAAGGGGCATTTCTGGTAAGCCTGGTTCAACAGGATTACAAGGAAATCCGGGACTAAAAGGATTAACTGGTGACATTGGTGAACCTGGCTATGCTATAAGCCCTA
AGGGTGAAACAGGAAATCCTGGTATATCAGGGTTTTATGGCTTGAAAGGGATAAAAGGAGAAGCTGGAGATTTGGGACTGGCAGGTTTGAAAGGATATCAAGGCCCAATGGGAATGAAAGGAGAAAGAGGTGACGAAGGCTATGAAGGACTTAATGGATATTCAGGTGCTAAGGGAATGAAAGGTGATAGAGGAGATGAAATACTTCCATCAGATGTTGAGCCCGGGCCAATTGGTGATATAGGTCCTCCTGGATTTGATGGGCAACCTGGTCGTGCAGGAGCTCCCGGAAATTTCGGAGAAAATGGCATTCCTGGATTCAAAGGTGAAAGAGGTGATATTGGAGATATTGGTCCTGAAGGTTTGCTAGGCAAACAAGGTGGACAAGGGTTCATGGGTATCAAAGGAGAAATTGGTTTTGATGGAATCCGTGGTTTGCCTGGTCTTCCTGGATTACCAGCACCTCCTCCACCAATTCCTAAATCAAGAGGATTCTATTTTACAGTACATTCACAGACTCATCTCATTCCCGAATGCCCCTCTGGAACTACACCTTTATGGGAAGGATTCTCCTTACTTCATATAGTTGCAAATTCTAAGGCCCATGGACAAGATTTAGGTGCACCTGGAAGTTGTCTTCGAAGATTTTCAACAATGCCTTATATGTTCTGTAACATAAACAATGTTTGTGATTTCGCCCAACGCGAAGACTACAGTTTTTGGCTATCAACACCAGAACCAATGCCAAGCGGAATGACCCCAATTCCAGCAACTGACGTTGGATCATACATATCCAGGTGTCAAGTGTGCGAGACATCAACACGATCCATTGCTATTCATAGCCAAAGCAGCTCCATACCAACTTGTCCAGATGGTTGGGATGAATTATGGATAGGTTATAGTTTCCTTATGCATACCGCTGGAGCTGATGCGGCAGGTCAAAGTCTCATATCACCGGGATCCTGCCTTCGGGAATTCAGAACGCGACCATTCATAGAATGTAACGGACTCGGCCGTTGCAACTTTTTCGCAACCGCGGTTTCATATTGGTTATCAACAATTGATGACAACAAAATGTTTGAAACACCTATTCAAGAAACACTGAAACAAAATAAAGTTTCTAGAGTCAGCAGGTGCGCCGTATGTATGCGACGTCAACCACAGAGGTCGTATAGCGCAGGCACAGTGGAGGCTGTACCTAACGCAGTAGTACGACGCCCCGTCAACCGACCTCTTAACCGGCTTCGGCCTCGCTACCCTGCGAGGTACCGGGGGAGACGCCGCCATTGA

Protein sequence:

>DPOGS206549-PA
MQVVCNQTVCDCAGLKGDRGDIGSPGIPGPQGDYGEDGPDGPMGPPGEPGDWGEKGISGDKGERGVDGPYGPRGFTGPQGPTGLEGVRGIAGLDGCSGIDGIMGPPGPQGIPGDRGLPGPYGEKGTQGLAGEGGVNSRGAKGDQGDSGRPGLQGPRGPIGWRGDSGMPGEVGDQGPMGYRGEPGYRGDLGDDVVGPQGEKGDQGEVGESGRPAKVIYIDHLQENVTILAKGNRGDKGFAGTQGVKGVKGDTGSMGPPGPNGLNGDQGYKGDQGIDGPRGKPGHRGRKGPLGPKGDKGWPGYAGLDGEDGEPGARGEDGRPGMPGVQGPQGEKGIYDERLNEPLLPGSNGPQGPVGYPGPPGPPGADGTRGLPGIQGPPGLPGPKGIAGRQGPPGSSEKGEPGNDGFKGLPGPRGPMGYPGPQGVLGPKGFKGSAVRGPEGEEGTPGLDGRPGIRGDRGDFGFMGLPGYPGRGVHGVGPPGEDGPPGRPGVVGDSGTPGRPGFRGPKGERGDDCPFCPSGLPGMKGKRGDEGFKGQKGYPGPEGDRGQRGLKGESGSPGLPGSKGPKGITGPPGMTGRPGLQGEKGRLIQPPLSLIIAERGPRGFIGDPGLRGDPGFPGLRGENGWKGSKGMAGEDGFPGPDGRDGLKGRDGVPGMPGEHADVPIQFLFGQRGDKGIKGQLGEPGDDGLKGDAGEALGFGINAKGEKGEPGPMGPEGLQGIKGDSGDIGYEGLPGERGDIGLPGVSKQGERGARGFPGDKGDIGPYGEPGGPGLRGPVGFDALKGKKGSRGEVGYAIIYGERGFDGMAGDYGDVGEPGYAGNPGRAGLMGPKGEPGLPGDVGPPGPVGPPGRKGMSGNIIQGAPGMPGQPGRLGSIGLIGEPGLQGYNGLQGDVGPKGMKGEAGRMGNRGWTGERGLTGRRGRPGLMGQPGLSGETGDRGETGLRGYDGLPGKEGPLGIIGQKGIRGDIGLPGADGLGGPPGPKGERGYDGVVGDKGMQGENASIGMKGMSGDMGFNGMPGRPGQTGLKGLRGDIGNPGLNLRGLNGTKGFRGDDGIPGRVGEKGLKGFQGDYGFEGIAGEIGDEGFPGLSGLPGRIGFDGAKGPSGHKGLPGLQGPKGDTGFEGEPGRMGSPGYPGDVGLRGLVGERGPSGAKGMSGDIGPSIYLPATKGDMGDIGMEGLKGGKGEMGEPGFPGLKGHKGEQGDVGLQGEFGDDGLPGPKGYLGVMGPPGLPGLDGINPEPGEQGKSGIDGLPGWPGPMGQKGAPGEFGINGPEGAPGQPGLIFSGPKGYKGATGRPGLRGISGKPGSTGLQGNPGLKGLTGDIGEPGYAISPKGETGNPGISGFYGLKGIKGEAGDLGLAGLKGYQGPMGMKGERGDEGYEGLNGYSGAKGMKGDRGDEILPSDVEPGPIGDIGPPGFDGQPGRAGAPGNFGENGIPGFKGERGDIGDIGPEGLLGKQGGQGFMGIKGEIGFDGIRGLPGLPGLPAPPPPIPKSRGFYFTVHSQTHLIPECPSGTTPLWEGFSLLHIVANSKAHGQDLGAPGSCLRRFSTMPYMFCNINNVCDFAQREDYSFWLSTPEPMPSGMTPIPATDVGSYISRCQVCETSTRSIAIHSQSSSIPTCPDGWDELWIGYSFLMHTAGADAAGQSLISPGSCLREFRTRPFIECNGLGRCNFFATAVSYWLSTIDDNKMFETPIQETLKQNKVSRVSRCAVCMRRQPQRSYSAGTVEAVPNAVVRRPVNRPLNRLRPRYPARYRGRRRH-
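
Consistency check:

The lengths in this record can be checked against each other with simple arithmetic. The sketch below assumes that the 5289 bp transcript is the coding sequence only (it starts with ATG and ends with TGA) and that the trailing "-" in the protein sequence marks the stop codon. The function names are hypothetical helpers and not part of any database tooling.

```python
def expected_protein_length(cds_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by a CDS of cds_bp bases."""
    if cds_bp % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    codons = cds_bp // 3
    # The stop codon is counted in the CDS but does not encode a residue.
    return codons - 1 if includes_stop else codons


def domains_within(protein_len: int, domains: list[tuple[int, int]]) -> bool:
    """Check that every (start, end) domain lies inside the protein (1-based)."""
    return all(1 <= start <= end <= protein_len for start, end in domains)


# Values copied from the record above.
TRANSCRIPT_BP = 5289
PROTEIN_AA = 1762
INTERPRO_DOMAINS = [(1494, 1718), (1494, 1603), (600, 658)]

print(expected_protein_length(TRANSCRIPT_BP) == PROTEIN_AA)  # True
print(domains_within(PROTEIN_AA, INTERPRO_DOMAINS))          # True
```

5289 bp is 1763 codons, i.e. 1762 residues plus the stop codon, which matches the listed protein length. All three InterPro domain ranges fall within residues 1 to 1762.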