Monarch geneset OGS2.0

DPOGS206535
Transcript         DPOGS206535-TA  5805 bp
Protein            DPOGS206535-PA  1934 aa
Genomic position   DPSCF300190 - 195251-204248
RNAseq coverage    3574x (Rank: top 3%)
Annotation
Heliconius         HMEL002285        0.0  74.43%
Bombyx             BGIBMGA014039-TA  0.0  65.58%
Drosophila         Cg25C-PB          0.0  51.33%
EBI UniRef50       UniRef50_P08120   0.0  51.33%  Collagen alpha-1(IV) chain n=16 Tax=Diptera RepID=CO4A1_DROME
NCBI RefSeq        XP_001962982.1    0.0  52.82%  GF15711 [Drosophila ananassae]
NCBI nr blastp     gi|328723513      0.0  55.76%  PREDICTED: collagen alpha-1(IV) chain-like [Acyrthosiphon pisum]
NCBI nr blastx     gi|383858176      0.0  58.68%  PREDICTED: collagen alpha-1(IV) chain-like [Megachile rotundata]
Group
Gene Ontology      GO:0005201  3.4e-117  extracellular matrix structural constituent
                   GO:0005581  3.4e-117  collagen
                   GO:0005488  2.4e-47   binding
KEGG pathway       dpo:Dpse_GA17987  0.0
                   K06237 (COL4A) maps->
                       Small cell lung cancer
                       Pathways in cancer
                       Amoebiasis
                       Focal adhesion
                       ECM-receptor interaction
InterPro domain    [1709-1932]  IPR001442  3.4e-117  Collagen IV, non-collagenous
                   [1815-1932]  IPR016187  2.4e-47   C-type lectin fold
                   [1320-1378]  IPR008160  4e-11     Collagen triple helix repeat
Orthology group    MCL10127  Multiple-copy universal gene
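
The lengths listed in the record can be cross-checked with simple arithmetic. The sketch below is illustrative only. It assumes the transcript is entirely coding sequence ending in a single stop codon, and that the genomic coordinates are 1-based and inclusive; the record states neither. All variable names are hypothetical.

```python
# Values copied from the record above.
transcript_bp = 5805
protein_aa = 1934
genomic_start, genomic_end = 195251, 204248

# Assumption: the transcript is pure coding sequence plus one stop codon.
expected_cds_bp = (protein_aa + 1) * 3

# Assumption: coordinates are 1-based and inclusive.
genomic_span_bp = genomic_end - genomic_start + 1

# Bases in the genomic span that the transcript does not account for
# (presumably introns, if the assumptions above hold).
non_transcript_bp = genomic_span_bp - transcript_bp

print(expected_cds_bp == transcript_bp)
print(genomic_span_bp, non_transcript_bp)
```

Under these assumptions the transcript length matches the protein length exactly, and the genomic span is longer than the transcript.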

Nucleotide sequence:

>DPOGS206535-TA
ATGAAGTTAGCAGGATTCGTTATTCGATCGTTTGTTCAAGTTCAAGATCTCGTTGAAGTCTTTTCAATGATCAATTCTTGGTTAGTTCTTAGAGTAGAAGTTACAGAGGCACTTACAGCAGATACAGGAGCCGAAATCCTGGAGAGAAGGAGGAGAACGGGCACAAGGTGTACAGCGAGGAGGCGCGCACGCGCCGGGCCCGACATGGCTGCGCCGCACTACTGGTCGCGCGCCCTCACACACAGGGACGAGCGCAGCGCCTCCCGTACAGTCAGGCGCGGGGCGGGGGCTCAGAGACCGCCGCCCGAGCGCTCCTCTTTCCTTAGTATCCCGGCTCAAACGGCGCGCGCGCACGGGACACTGGCGCAGCGCGCTTTGGGCCATGGCGGTCCACGTACTTTGCAACGAAACGAATATGAAAGAGGAGACGTCGAAGAAAATAATATTTATGATAATCAAGACTGGATGTATCAGAATAGTTACAATCCGCAACCAACAATAAATTCATACTATGGTTTATCTAGAAGGCAGGACTTACCGACACCCTCTGCCCCCCCTTCTCCTCCACCGGAAAGAGTACAGCCCTCAAGAAGTTTTGGACAAAACTTTGCTGTGTATGACCCAGTAACTCGTCAGCGGACAAATGCATTTGATCGTAATTGTACGGCCCCTGGCTGCTGTGTACCAAAATGTTTTGCAGAAAAGGGTAGTAGGGGTTTCCCAGGAATGCGGGGACCACCTGGAATCACAGGTCTACCAGGACACGTTGGCGCTGAAGGTCCACAGGGACTTAAGGGTCAAAAAGGTCAAGATGGACCACAAGGTCCTCGGGGTCCGCGAGGAGAAAAGGGTAAACCTGGAGCTCAAGGATTTATAGGCTTAGCAGGACCACCAGGTCCTCAAGGAGAACCTGGTATGCCAGGGATTCCTGGACGTGATGGTTGTAACGGAACTGATGGAGAACCTGGAATGGTGGGGATCAAAGGTTCACAGGGTCCACGTGGATTTGCTGGGCCTAAAGGTAACAAGGGTGATAAAGGAGAGCCGGCTTATATGGGTCGATACCCAAAAGGTGAAAAAGGAGAACCTGGAGCTGATGGTTTACAAGGCCAATCTGGCCCAGCTGGACCAACAGGTCCTCCAGGTTTGGCTGGTCCCAAAGGAATGACTGGACCTATGGGACCACCTGGATATAAAGGTGATAAAGGTCCTAAAGGATCTAAAGGACAATCTATTCAAGGTGATAAAGGAGACCGAGGTGACAAAGGTGACAGAGGACCCGGTTGTCCATCAACAACGTTACCTTCATTGGATAATAAAGGAGCAATAAAAGGTGTCAAAGGTGATATGGGATCAAAAGGTGAAAAGGGAGAACCTGGGAGAATGGGTGAAAAGGGAGAAACAGGTCCAATGGGGGAACCTGGCTTGCCTGGATTAATGGGCATTAAAGGAGAAAAGGGCTTAAGAGGAAATCCTGGGGAACGGGGTCGTGAAGGAATGTATGGTGAACCCGGACCTATGGGAAGAAAAGGTGATAGAGGCATTGATGGACTGAATGGTCTTCCCGGCCGACCGGGTTTGAAAGGAGAACCCGGCAGGGATGGAGCAACAGGTCTAATGGGCTTAAAAGGAGTGCCAGGTCCACCTGGTGGTCGAGCTGGAGCACGAGGTCCACCTGGGCCACCAGGTCCTCGGGGCTATATCGGCGTTGCTGGTGCACCAGGGTCTAGTGGTAGGCCCGGAGAAAATGGATTACCAGGACCTATGGGTCCAAGAGGTGGACAGGGAGAACCAGGTGACACAGGCATTGAAGGTCCAGCAGGTCAAAAAGGAGAAAAAGGAGAACCTGGTCTTGATGGCTTGCCTGGAGAAATAGGTCAACGAGGATATGATGGACCCATTGGTCCTCAAGGACCTAGGGGACTAAAAGGAGAAGAAGGTCAATCAATTCCTGGTGACAAGGGAAACAGTGGCCAACCAGGAATTCCAGGAGATAAGGG
AGCCAAGGGCGAAAGAGGTTATCCAGGATTACGAGGTACACCTGGAAACTCTACATTAGGTACACCAGGAAGTCCCGGAGAAATGGGTCCACCTGGTGAAAAAGGTGAAAAAGGAACTCCTGGGTACGATGGAATACCTGGTAATCCTGGACAAAAAGGTGACATTGGAGGACGGTGTAACGAATGTCGACCTGGAAGTATGGGCGAAAAGGGAGACCGCGGTGCTGATGGTCTACCTGGTGAACGGGGTGAACGAGGTCACATCGGACCCATCGGGATGACCGGGGAGCGTGGTGCTGACGGTATGAATGGAATGCCTGGAGCTGCTGGAGCACCGGGTGAACGTGGATTGGACGGACCAATAGGACCACCAGGAATGAGGGGAGCAGATGCAATGATACCGTCCAATTTAGTAAAAGGACCTCCCGGAGAAAGAGGTGAACCAGGAGAAAAAGGAAACATGGGACCTAAGGGTGAAAGAGGACCTGATGGAATAATGGGTGATCGTGGATTAAATGGCATGCCCGGACAGAAGGGTGACATGGGTAGAATGGGACCTTCTGGTATAGATGGCACACCTGGTAGTGATGGAATACCGGGACGGCCAGGAATGAAAGGCATGTCCATCAAAGGTGAAAAAGGAATATCTGGTGATCAGGGTGAAAAAGGTGACAAAGGATTTTCTGGAAGACCAGGACTTAAAGGTGAACCTGGTCAATGTCCCAATGAGTTAAAAATTCGCACAAAGGGAGAAAAAGGCAACCCTGGCGTTCCAGGACCACAAGGACCATTAGGTATGAAGGGTGAAAAAGGTAATCAAGGGCCATTCGGTTTTACTGGTCCAAAGGGAGAGATGGGTTTACCAGGACGAGCTGGACCGGTAGGTCCACGTGGTCTTCCGGGTTTCAAAGGCGATAAAGGTGAAATGGGTTCAATGGGATTTCCGGGAACACCAGGGGATTTAGGCCCTAGAGGTTTTCCAGGGTTACCAGGATTAAAAGGAGACAAAGGTGAGATTGGTCCTTCTATGCCTGGACCACCTGGACCTGCTGGATTAAAGGGAGATAAAGGAGAACAAGGTCCAAGAGGTCAACCTGGAATAGAAGGAAAGGATGGTCCTCCAGGATTAGCTGGCTTACAAGGTGAAAAAGGTGATATGGGATTAATAGGAAGGCAAGGTTATCCAGGACCTATTGGATTAAAGGGCGAACCGGGTCCTATAGGACCATCCGGAGTTCCGGGCATTCCTGGTACGCCAGGAAGAGATGGACCTAAAGGTCAACAAGGATTTCCCGGTCCACCTGGTAAACCTGGTGTAATTGGCTTACCTGGACAAAAAGGTGAACCAGGTATTCAAGGTCCAGATGGCCCGAAAGGTTTCCCAGGACCTCGTGGTCATGTTGGTATGCAAGGGCAAACTGGTCTTGATGGAAGTCCCGGTGAAAAAGGAGATAAGGGTGATATAGGATTCCCGGGTGAGCCTGGTAGACCTGGTCTTGATGGACCTAGAGGATTAGCTGGTGCACCTGGTGAGAAAGGTGATATAGGTTTCCCAGGAAACCCTGGGTTGAATGGATTTATTGGACCAGCTGGCCCAAGAGGTGATATAGGCTTCAAGGGTTCCAGGGGACCAAAAGGAGAACCTGGTTTAGCTTCAGAAAAGGGAGAAAAAGGAGATCAAGGTTTTCCAGGATTACCTGGTGTTGATGGAAGACCTGGGCAAGATGGAGAAAAAGGTGACAAAGGTTTCCCTGGCTATCCAGGTCAAGGCATTCCAGGAAGTCAAGGTGAAAAGGGAGATGCAGGTTTGCCTGGAAAAATGGGTTTTCCTGGTATTCCTGGCGATAAAGGCGACCGAGGCTTTCCAGGACTGGCAGGTTTAAAGGGAGAAAGAGGCCCTGCAGGCAAAGACGGTTTGCCAGGAATGCCGGGAAGAGATGGCAGTCCTGGTGCTCCAGGCCAAGATGGTTTACCAGGAATGGATGGCGAAAAGGGTGAAA
GAGGTGATCGAGGATTACCAGGTCGTGATGGTCTTGATGGATTGAAAGGTGACCAGGGTATTGCTGGACCACCAGGGCCAATAGGACCAATGGGTTTTCCGGGTCCTAAAGGAGACATTGGTTTACCTGGGCCATCTATAAATATCAAAGGTGAAAAGGGAGATATAGGTTTTCCCGGTATTACTGGACTTCAAGGAGATAAGGGTGATCGAGGTAGAGATGGCTTCCAAGGTCTACAAGGGGAAAAGGGTGATCAAGGATTCACTGGACAAAAGGGTGAAATGGGTAGAATGGGCGCCATGGGTGAAAGAGGTGAAAGAGGTCCAATTGGACCGACTGGTATTCCTGGACTCACAGTAAAAGGTGAAAAAGGTTTACCTGGAAATAACGGAAAACACGGCAGACCTGGCATGCGCGGTGCTACTGGAGAAAAAGGAGAACAAGGATTACCTGGACTTCCAGGTCCAATTGGGCGCTCTGGCATGCCAGGAACACCGGGACCTAGAGGTGAACCCGGTGAACCAGGAAGTGAAGGAGTCGCAGGACCCCCTGGGTTTGACGGTCCTCCGGGGCTACAAGGTCGTCCTGGCGAATATGGTGAAAAAGGTAACAAGGGTGATAAGGGTGCTGTTGGTTTTGGTTTACCTGGCCCGAAAGGAGACACTGGCTTGCCAGGATTACCGGGTTTAAATGGTGAAAAAGGTGATAAAGGAGATCAGGGTTTCGATGGATTAGTTGGAGAGATGGGTGAGAAAGGTAACCAAGGAGAAAAAGGTGACAGAGGCTATCCTGGTCGGCCTGGAATTCCTGGCCTTGATGGTGTAAAAGGAGATAAGGGAGAAGCGGCTGCTATAGTTTATGGAAGTAAGGGAGAACCAGGACCAAGAGGTCCTCCTGGATTGAATGGTCCACCTGGACTTGACGGATTACCTGGTCCTAAAGGCTGGGATGGTGCTCCAGGCATGAAAGGAGATAAAGGTTTCCAAGGACCTATGGGCCCACCAGGCTTACCAGGACCTCAAGGAATAATGGGTATTCAAGGTGAACGTGGTGAAACAGGTCGTATGGGATTACAAGGTGTACCTGGAATACCTGGTGCTCCTTGTGCTACTACAGACTATCTTACTGGCATCCTTTTAGTGCGTCATAGTCAAACAAACATAGTACCCCAATGTGAACCCGGACATATTAAATTGTGGGATGGCTATTCCTTACTTTACATTGATGGAAATGAAAAGGCTCATAATCAAGATCTGGGATATGCTGGATCTTGTGTAAGAAAGTTCAGTACCATGCCATTCCTTTTCTGTGATCTTAATGATGTATGCAATTACGCAAGTCGAAATGATCGCAGTTATTGGCTTTCTACAAATTTGCCGATACCCATGATGCCAGTAAACAACAATGAAATTTCACGATATATTTCAAGATGTGTTGTTTGTGAGGTTCCAGCCAATGTCATAGCTGTTCACAGTCAAACTCTTGATATACCTAGTTGTCCAGTGGGTTGGAACTCATTATGGATTGGATACAGTTTTGTTATGCACACTGGAGCTGGTGGACAAGGCGGTGGTCAAGCCCTTGCTAGTCCGGGATCTTGTCTTGAAGACTTCCGAGCGACACCATTTATTGAATGTAACGGTGAAGGTGGTACTTGCCATCATTTCGCCAATAAACTTAGTTTTTGGCTAACAACTATAGATGATAAGAAGCAATTCGCAAAACCAGAGCGTGAAACTCTTAAATCTGGACGACTATTGCAGCGAGTGTCTAGATGCGCTGTTTGCATTAAGAATACCACATAG
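
As a quick sanity check, the ends of the transcript can be translated and compared with the ends of the protein record. This is a minimal sketch, assuming the standard genetic code (NCBI translation table 1) applies and that the transcript is read in frame from its first base. The names `CODON_TABLE` and `translate` are hypothetical helpers, not part of any database tooling.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered over T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence codon by codon; '*' marks a stop codon."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore any trailing partial codon
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


# First and last 12 nucleotides of DPOGS206535-TA, copied from the record above.
first_codons = translate("ATGAAGTTAGCA")
last_codons = translate("AATACCACATAG")
print(first_codons, last_codons)
```

The translated ends can be compared with the first and last residues of DPOGS206535-PA. The protein record appears to write the terminal stop codon as `-`, where this sketch prints `*`.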

Protein sequence:

>DPOGS206535-PA
MKLAGFVIRSFVQVQDLVEVFSMINSWLVLRVEVTEALTADTGAEILERRRRTGTRCTARRRARAGPDMAAPHYWSRALTHRDERSASRTVRRGAGAQRPPPERSSFLSIPAQTARAHGTLAQRALGHGGPRTLQRNEYERGDVEENNIYDNQDWMYQNSYNPQPTINSYYGLSRRQDLPTPSAPPSPPPERVQPSRSFGQNFAVYDPVTRQRTNAFDRNCTAPGCCVPKCFAEKGSRGFPGMRGPPGITGLPGHVGAEGPQGLKGQKGQDGPQGPRGPRGEKGKPGAQGFIGLAGPPGPQGEPGMPGIPGRDGCNGTDGEPGMVGIKGSQGPRGFAGPKGNKGDKGEPAYMGRYPKGEKGEPGADGLQGQSGPAGPTGPPGLAGPKGMTGPMGPPGYKGDKGPKGSKGQSIQGDKGDRGDKGDRGPGCPSTTLPSLDNKGAIKGVKGDMGSKGEKGEPGRMGEKGETGPMGEPGLPGLMGIKGEKGLRGNPGERGREGMYGEPGPMGRKGDRGIDGLNGLPGRPGLKGEPGRDGATGLMGLKGVPGPPGGRAGARGPPGPPGPRGYIGVAGAPGSSGRPGENGLPGPMGPRGGQGEPGDTGIEGPAGQKGEKGEPGLDGLPGEIGQRGYDGPIGPQGPRGLKGEEGQSIPGDKGNSGQPGIPGDKGAKGERGYPGLRGTPGNSTLGTPGSPGEMGPPGEKGEKGTPGYDGIPGNPGQKGDIGGRCNECRPGSMGEKGDRGADGLPGERGERGHIGPIGMTGERGADGMNGMPGAAGAPGERGLDGPIGPPGMRGADAMIPSNLVKGPPGERGEPGEKGNMGPKGERGPDGIMGDRGLNGMPGQKGDMGRMGPSGIDGTPGSDGIPGRPGMKGMSIKGEKGISGDQGEKGDKGFSGRPGLKGEPGQCPNELKIRTKGEKGNPGVPGPQGPLGMKGEKGNQGPFGFTGPKGEMGLPGRAGPVGPRGLPGFKGDKGEMGSMGFPGTPGDLGPRGFPGLPGLKGDKGEIGPSMPGPPGPAGLKGDKGEQGPRGQPGIEGKDGPPGLAGLQGEKGDMGLIGRQGYPGPIGLKGEPGPIGPSGVPGIPGTPGRDGPKGQQGFPGPPGKPGVIGLPGQKGEPGIQGPDGPKGFPGPRGHVGMQGQTGLDGSPGEKGDKGDIGFPGEPGRPGLDGPRGLAGAPGEKGDIGFPGNPGLNGFIGPAGPRGDIGFKGSRGPKGEPGLASEKGEKGDQGFPGLPGVDGRPGQDGEKGDKGFPGYPGQGIPGSQGEKGDAGLPGKMGFPGIPGDKGDRGFPGLAGLKGERGPAGKDGLPGMPGRDGSPGAPGQDGLPGMDGEKGERGDRGLPGRDGLDGLKGDQGIAGPPGPIGPMGFPGPKGDIGLPGPSINIKGEKGDIGFPGITGLQGDKGDRGRDGFQGLQGEKGDQGFTGQKGEMGRMGAMGERGERGPIGPTGIPGLTVKGEKGLPGNNGKHGRPGMRGATGEKGEQGLPGLPGPIGRSGMPGTPGPRGEPGEPGSEGVAGPPGFDGPPGLQGRPGEYGEKGNKGDKGAVGFGLPGPKGDTGLPGLPGLNGEKGDKGDQGFDGLVGEMGEKGNQGEKGDRGYPGRPGIPGLDGVKGDKGEAAAIVYGSKGEPGPRGPPGLNGPPGLDGLPGPKGWDGAPGMKGDKGFQGPMGPPGLPGPQGIMGIQGERGETGRMGLQGVPGIPGAPCATTDYLTGILLVRHSQTNIVPQCEPGHIKLWDGYSLLYIDGNEKAHNQDLGYAGSCVRKFSTMPFLFCDLNDVCNYASRNDRSYWLSTNLPIPMMPVNNNEISRYISRCVVCEVPANVIAVHSQTLDIPSCPVGWNSLWIGYSFVMHTGAGGQGGGQALASPGSCLEDFRATPFIECNGEGGTCHHFANKLSFWLTTIDDKKQFAKPERETLKSGRLLQRVSRCAVCIKNTT-
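
The InterPro annotation lists a collagen triple helix repeat, which is characterized by glycine at every third residue (Gly-X-Y). The sketch below shows one way to measure uninterrupted Gly-X-Y runs in a protein string. The function `longest_gxy_run` is a hypothetical helper, and the two excerpts are short stretches copied from the protein sequence above rather than the full record.

```python
def longest_gxy_run(seq: str) -> int:
    """Return the largest number of consecutive Gly-X-Y triplets in seq.

    All three reading offsets are tried, because a repeat region can
    start at any position in the protein.
    """
    best = 0
    for offset in range(3):
        run = 0
        for i in range(offset, len(seq) - 2, 3):
            if seq[i] == "G":
                run += 1
                best = max(best, run)
            else:
                run = 0  # a non-glycine at the first position breaks the run
    return best


# Excerpts copied from DPOGS206535-PA.
regular = "GSRGFPGMRGPPGITGLPGHVGAEGPQG"   # uninterrupted repeat
interrupted = "GDKGEPAYMGRYPKGEKGEPG"     # contains a break in the repeat

print(longest_gxy_run(regular), longest_gxy_run(interrupted))
```

Applied to the full sequence, the same function would report the longest uninterrupted stretch. Short interruptions such as the one in the second excerpt reset the count.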