Monarch geneset OGS2.0

DPOGS207308
Transcript          DPOGS207308-TA    2820 bp
Protein             DPOGS207308-PA    939 aa
Genomic position    DPSCF300008 + 1316218-1333044
RNAseq coverage     265x (Rank: top 40%)
Annotation
Heliconius        HMEL016296          1e-56     38.13%
Bombyx            BGIBMGA012091-TA    7e-58     42.96%
Drosophila        vkg-PA              7e-31     32.83%
EBI UniRef50      UniRef50_F4X517     7e-67     36.81%    Collagen alpha-1(XV) chain n=5 Tax=Acromyrmex echinatior RepID=F4X517_ACREC
NCBI RefSeq       NP_001163364.1      2e-62     31.89%    multiplexin, isoform M [Drosophila melanogaster]
NCBI nr blastp    gi|307174439        1e-69     34.91%    Collagen alpha-1(XV) chain [Camponotus floridanus]
NCBI nr blastx    gi|189235667        2e-120    38.56%    PREDICTED: similar to collagen alpha 1(xviii) chain [Tribolium castaneum]
Group
Gene Ontology      GO:0031012    1.5e-47    extracellular matrix
                   GO:0007155    1.5e-47    cell adhesion
                   GO:0005198    1.5e-47    structural molecule activity
                   GO:0005488    1.4e-42    binding
KEGG pathway
InterPro domain    [754-938]    IPR010515    1.5e-47    Collagenase NC10/endostatin
                   [782-938]    IPR016186    1.4e-42    C-type lectin-like
                   [773-936]    IPR016187    1.9e-37    C-type lectin fold
                   [19-218]     IPR008985    4e-10      Concanavalin A-like lectin/glucanase
                   [466-522]    IPR008160    1.3e-08    Collagen triple helix repeat
Orthology group    MCL25921    Lepidoptera specific

Nucleotide sequence:

>DPOGS207308-TA
ATGGTCCTCTTTTTCAGATGGTTTGGCGCCTTTAGCCCCATGGTGATAGCTGAACCAGACAACTACGATATCCTGAGTCTAGTCCGTGCGAATGTTTTGACCGATTTTATAGATATAGTCAAAGGCACAGATGTGTATGGCGCCATTAAACTTGTAAAAAACGAATTAATTACTATTAAACTGGATCAATTCCCAGATCCAATAAACCATCTAGCAACTCCTTTTGAAATATATGCCTTAGTGAAGTTGAACGTGGATGTGACGTCATGTTTGTTTCAAATCATATCGAATAAAGAAAACAAACTCAGTCTATGTTTTACACCTGAAGGAGAAGATTTAATTAGAATTACATTAAATGGTAGCGATCTACCAGAAAATGGAATATCTTTCCATTACTTGATAGAAGATTATAATGCATTTGTAAATATAATCTTAGCTGTGAATGATAAAAATGTGGAATTTTACTCTAACTGTGAAAAAATTGAAACCCAATATTTCGATTCCGACTACACCATCGAAAACATAAATCTCGAAAAAGATTCTATACTACATTTTGGCAAATTGACCGAAGAAAGTAATTTATTTGAGGCTGCGATACAGACACTTGTGATATATCCAAAGCCTGATATTAATGGACGAAGATATATATGTTCCGATGATAAATTGCCGGCTAGCATTAAAGCTAACCCGAGCACCGAAAGTAATGATTTCACAAAGGCAGAAAACCTCGAAGTGAATACGTTTATTGATTTTGATTCAAGCGAAAAAATATCAACCAACTCCCTGTTTGATAGCACTGAAGAAACTGTTGTTAAAGGAGAAAAAGGTGATAAGGGGGACAAGGGCGAGAAAGGCGATCGAGGAGACAAGGGTGAACGAGGTGAATCTGTCATGGGTGAACGTGGTCCAATTGGTCCTGATGGAGCTCCTGGAACACCGGGTGTGATGGGAAAAGAAGGCTCCTGCAAATGTTCAGAAGCTATTGTGTCAGACTTACTACTAAAAATGCCAGAAATGAGAGGACCTCCAGGTGACTACGGGCTGAAAGGCGATAGAGGTGAAAAGGGGGTGAAAGGAGATAGCGGATTACCAGGAAAAGATGGTAGAGATGGTAATGAGGGCGATCCCGGTATACAAGGTCCTCCTGGAACACCAGGTCTTGTTCGTAAGGAAATAGTAGAGACAAAAGTGCCAGTTGTTGGAGAAAAAGGAGAAAGAGGGCCCGTTGGACCACCTGGTACTCCTGGTAGAGACGGCTTAAGAGGAGAAAAAGGAGACAAAGGTGAACCGGGTCTCATGGGACTACCTGCAAAACTATCATCGATATTAGACGAGGACATCGATCCTAATGAAGAAAAGGCTATCGTCGAAAAATTCAGAGGATATAAAGGGGCAAGTGGTCCTGAAGGACCGAAGGGTGAAAAAGGGGATACAGGAGCAATTGGTCCTCAAGGTGAAACTGGCAGAGATGGTATTCAGGGTCCCCCAGGAAAACATGGACATAAAGGAGAAACTGGCAAAGATGGATCAAAGGGTGACAAAGGAGAACCAGGAATACCCGGTCCTCCTGGTACTGTGCCATCATCTCAAATAAGTCTCATGAAAGGACCGAAAGGTGACCGTGGTCCACCAGGTCAGACAGGTCCTCGAGGACCAACGGGACATCATGGAAAAGTGGGCCCCATAGGACCACCGGGTAAAAGCCACAAGGGAGAGCCTGGGAAACCAGGTCCTATGGGACCCAAAGGAGAAAAGGGTGCTACTGGACCTAGAGGAGAAAAAGGTGAAGGGTTGTCGCCCAGTGATATCGAGAGGTTAAAAGGACATAAAGGTGACAGAGGTGAAATTGGTTTACCTGGTGAAGCTGGAAAGCCTGGGTTGCCGGGGACTTGTGGCGAATGTGTTCGCGTATCAATCCCGGGCCCATCTGGACCACCGGGACCTCCGGGTCCATCAGGTCCTCCTGGAGTCTCTATCATCGGTCCTAAAGGAGAACCTGG
TGGATTAGTAACTAAGAAATCATTTTTTGCATTCAATGACATTCATCATGAGAGCACAGATGAAGACGATGATTTTTATACAGCAGCGACTGTCATTTTCAAAACAACTACCGGTCTTCTTAAGAGAACTACTGACACCCCTCTGGGGACGCTGGCATATATATTACAAGAGAAAATATTATTAATGCGGGTTGAAAATGGATGGCAATACGTTGTGATGGGTTCTTTTTTGCAAACAAGGGAATCACATACCAGCACAACATTCAGACCAACGTACTATTCATCAACTCCATCAAGTCCACCCTCTTCAGATGAAACGACAGAGAATAATGAAGATAATTACATACGTTTGGTCGCCTTAAACCAAGCATATGCAGGAAATATACTTATGGCAAACAATAGAACTGGGCGTAATGCTGCTGACCAGGAATGTTACCGACAAGCTTATATACATAATTTTAAAAGCACTTTTGCAGCCTTCCTAGCTACTAGGGTTGAAGATCTAAGATTTATTGTAAAAAGAAAACGAGACAGATATGTTCCGGTAGTCAACTTGTACGGACAAGTTCTTTTCGATTCCTGGGCGAGCATGTTTAATGGTTCAGGAGCACTGTTTGCAAAATCAAGTATTTACAGCTTTAATGGAAAAAATGTTCAGATTGATACTACTTGGCCTTTAAAAGCTGTATGGCATGGCAGCAACTCTTTTGGCACAGTTTTATCAAGAGCAAATTGCAATGAATGGACGAGTGACAGTCCGCTGAACGTTGGCGCGGCCTCCCTACTATATACCCATAGACTATTAGAGGAAGAACAGTAA

Protein sequence:

>DPOGS207308-PA
MVLFFRWFGAFSPMVIAEPDNYDILSLVRANVLTDFIDIVKGTDVYGAIKLVKNELITIKLDQFPDPINHLATPFEIYALVKLNVDVTSCLFQIISNKENKLSLCFTPEGEDLIRITLNGSDLPENGISFHYLIEDYNAFVNIILAVNDKNVEFYSNCEKIETQYFDSDYTIENINLEKDSILHFGKLTEESNLFEAAIQTLVIYPKPDINGRRYICSDDKLPASIKANPSTESNDFTKAENLEVNTFIDFDSSEKISTNSLFDSTEETVVKGEKGDKGDKGEKGDRGDKGERGESVMGERGPIGPDGAPGTPGVMGKEGSCKCSEAIVSDLLLKMPEMRGPPGDYGLKGDRGEKGVKGDSGLPGKDGRDGNEGDPGIQGPPGTPGLVRKEIVETKVPVVGEKGERGPVGPPGTPGRDGLRGEKGDKGEPGLMGLPAKLSSILDEDIDPNEEKAIVEKFRGYKGASGPEGPKGEKGDTGAIGPQGETGRDGIQGPPGKHGHKGETGKDGSKGDKGEPGIPGPPGTVPSSQISLMKGPKGDRGPPGQTGPRGPTGHHGKVGPIGPPGKSHKGEPGKPGPMGPKGEKGATGPRGEKGEGLSPSDIERLKGHKGDRGEIGLPGEAGKPGLPGTCGECVRVSIPGPSGPPGPPGPSGPPGVSIIGPKGEPGGLVTKKSFFAFNDIHHESTDEDDDFYTAATVIFKTTTGLLKRTTDTPLGTLAYILQEKILLMRVENGWQYVVMGSFLQTRESHTSTTFRPTYYSSTPSSPPSSDETTENNEDNYIRLVALNQAYAGNILMANNRTGRNAADQECYRQAYIHNFKSTFAAFLATRVEDLRFIVKRKRDRYVPVVNLYGQVLFDSWASMFNGSGALFAKSSIYSFNGKNVQIDTTWPLKAVWHGSNSFGTVLSRANCNEWTSDSPLNVGAASLLYTHRLLEEEQ-