Monarch geneset OGS2.0

DPOGS202079
Transcript        DPOGS202079-TA  1266 bp
Protein           DPOGS202079-PA  421 aa
Genomic position  DPSCF300116 - 343825-360512
RNAseq coverage   50x (Rank: top 70%)
Annotation
Heliconius        HMEL005052        0.0     88.50%
Bombyx            BGIBMGA011301-TA  6e-106  89.27%
Drosophila        CG12594-PA        3e-23   30.74%
EBI UniRef50      UniRef50_E0W096   1e-27   45.38%  Putative uncharacterized protein n=1 Tax=Pediculus humanus corporis RepID=E0W096_PEDHC
NCBI RefSeq       XP_001811794.1    7e-68   47.39%  PREDICTED: similar to CG12594 CG12594-PA [Tribolium castaneum]
NCBI nr blastp    gi|189235662      1e-66   47.39%  PREDICTED: similar to CG12594 CG12594-PA [Tribolium castaneum]
NCBI nr blastx    gi|189235662      3e-65   47.39%  PREDICTED: similar to CG12594 CG12594-PA [Tribolium castaneum]
Group
Gene Ontology     GO:0007155  2e-05  cell adhesion
                  GO:0005198  2e-05  structural molecule activity
KEGG pathway      smm:Smp_123830  1e-07
                  K06236 (COL1AS) maps-> Amoebiasis
                                         Focal adhesion
                                         ECM-receptor interaction
InterPro domain   [227-408]  IPR008985  3.7e-22  Concanavalin A-like lectin/glucanase
                  [163-214]  IPR000884  1.2e-11  Thrombospondin, type 1 repeat
                  [242-359]  IPR013320  5.8e-07  Concanavalin A-like lectin/glucanase, subgroup
                  [289-384]  IPR012680  6.3e-06  Laminin G, subdomain 2
Orthology group   MCL20535  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS202079-TA
ATGATTAAGTTTAAAATGGCATCTCGGATGTGCCGTGGCATTTCCAATGGATATATCTTGATGGTCGTTTTGTTTTTGTCGGGGCAAGCTGTAGTTTGTGACAGCCAGTGCCCAAAGTTTGCAGAAAGGCCTCTGGAAACGAGAGTTCAAGATGCTTCAATAGTTTTTAGGGCGGTGGTCGTTCAGGCACACTATCAGATAAAGACATTTGATTTGGCTTTAGTGTCAATATACAGAGGTGGAGTTGAGTTGGCATCGATCAGCCAATACGCAGGATCACCCTACAACACGACAGATAGGCAGGTAAATCTCAAAATCAATAATCAATTACGTGATTGCTTTAACTGGAGCATGGTCCAACAAAGCGAGCTTGTGGTTTTCGCTCGTGTCAGTGAACCGGCTGTGGACCTGGAAACAACACCAGCTGATGGGCCCTGGCTGGAAGCTACTGCAGCAGCAGTTCCTTGGAGCTTGGGAGTCGATATAGCAATATGGAATGCTGTCGGCTGGGCTGGCTGGGGGGAGTGGGGTGTGTGTAGCAAGACGTGTGGTGGGGGAAGACAAACCAGAAGAAGATACTGCTCAAGAAATTTTTGTGAAGGTTACGGAGAACAGGGAAGGTCATGTAATTCCTTCAAATGTGATGGTACAATAAATCCTCTGGCACCAGACGCCAGGCGAAATTTTCATCCAGCACAAGCCAGATGGGGTCTAGTACCAGATAGACCTCATGCCTTTAGTCTGAAACCCAACTCTTATATCTGGATAGCGTCTTCCGAACTCTTCGCTCCAGGCAAGACCTTCCCCAGAGAATTCACACTATTCATTTCTTTAAGATTAAGACCTGAGAGCGGGGGTTACGGACAAGGAACGTTATTTTCAGTTCGTTCAAGACGTAAAACTGGTTCATTTTTGTCTCTGGAACTAGCCGGGCGAGGAGCAGCTAGATTGGTTCATTCAGGTGCTGGAACTTCCCGGTCTATATACCTCGCTGTCCCACTTTATGACTTTAGGTGGCACCACATCGCTATAAGTGTCCATGACGACAACACTGTGAGAGTGTATGTGGATTGCCGATGGCTGAGGACTGACGTACTCGAAAAGGACGCTTTAGATACACCAAAGGACGCTGATCTCATTATAGGCTATCTCTTCTCAGGGGACTTGGAACAAATGGTCGTTGTGCCGAAAGCCGGTCAAGCCCACGAGCAGTGCTCTAGCCAAGTGACTGGCATAACACCATTCGTTACCCCGCGCGACACATAA

Protein sequence:

>DPOGS202079-PA
MIKFKMASRMCRGISNGYILMVVLFLSGQAVVCDSQCPKFAERPLETRVQDASIVFRAVVVQAHYQIKTFDLALVSIYRGGVELASISQYAGSPYNTTDRQVNLKINNQLRDCFNWSMVQQSELVVFARVSEPAVDLETTPADGPWLEATAAAVPWSLGVDIAIWNAVGWAGWGEWGVCSKTCGGGRQTRRRYCSRNFCEGYGEQGRSCNSFKCDGTINPLAPDARRNFHPAQARWGLVPDRPHAFSLKPNSYIWIASSELFAPGKTFPREFTLFISLRLRPESGGYGQGTLFSVRSRRKTGSFLSLELAGRGAARLVHSGAGTSRSIYLAVPLYDFRWHHIAISVHDDNTVRVYVDCRWLRTDVLEKDALDTPKDADLIIGYLFSGDLEQMVVVPKAGQAHEQCSSQVTGITPFVTPRDT-
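
The transcript and protein lengths in this record can be cross-checked against each other: 421 residues plus one stop codon correspond to 3 x 422 = 1266 bp. The following is a minimal Python sketch of that check, assuming the standard genetic code (NCBI translation table 1) and the record's convention of writing the stop codon as "-". The function name `translate` and the variable names are illustrative and are not part of the database.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon; stop codons become stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = [CODON_TABLE[cds[i:i + 3]] for i in range(0, len(cds), 3)]
    return "".join(stop_symbol if r == "*" else r for r in residues)


# Lengths reported in the record: 421 residues plus one stop codon.
transcript_bp = 1266
protein_aa = 421
print(transcript_bp == 3 * (protein_aa + 1))

# First and last 21 nucleotides of DPOGS202079-TA, copied from the record.
print(translate("ATGATTAAGTTTAAAATGGCA"))
print(translate("GTTACCCCGCGCGACACATAA"))
```

The two translated fragments, MIKFKMA and VTPRDT-, match the start and end of DPOGS202079-PA above. Passing the full nucleotide sequence to `translate` would allow the same comparison over the whole protein.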