Monarch geneset OGS2.0

DPOGS207247
Transcript         DPOGS207247-TA   2592 bp
Protein            DPOGS207247-PA   863 aa
Genomic position   DPSCF300008 - 1130084-1149704
RNAseq coverage    7x (Rank: top 86%)
Annotation
Heliconius       HMEL007561         1e-70    70.29%
Bombyx           BGIBMGA012062-TA   0.0      66.22%
Drosophila       mp-PG              2e-86    35.45%
EBI UniRef50     UniRef50_E9IQA3    1e-109   42.66%   Putative uncharacterized protein (Fragment) n=1 Tax=Solenopsis invicta RepID=E9IQA3_SOLIN
NCBI RefSeq      XP_396317.3        2e-117   44.57%   PREDICTED: similar to CG33171-PC, isoform C [Apis mellifera]
NCBI nr blastp   gi|340720909       2e-129   40.93%   PREDICTED: collagen alpha-1(XXII) chain-like isoform 1 [Bombus terrestris]
NCBI nr blastx   gi|189235667       0.0      43.28%   PREDICTED: similar to collagen alpha 1(xviii) chain [Tribolium castaneum]
Group
Gene Ontology    GO:0031012   2e-69     extracellular matrix
                 GO:0007155   2e-69     cell adhesion
                 GO:0005198   2e-69     structural molecule activity
                 GO:0005488   1.1e-64   binding
KEGG pathway 
InterPro domain  [652-831]   IPR010515   2e-69     Collagenase NC10/endostatin
                 [661-828]   IPR016186   1.1e-64   C-type lectin-like
                 [660-830]   IPR016187   2.1e-57   C-type lectin fold
                 [303-356]   IPR008160   8.5e-10   Collagen triple helix repeat
Orthology group  MCL11595   Multiple-copy universal gene

Nucleotide sequence:

>DPOGS207247-TA
ATGGCGATGGTGATATCACAGACGACTAAAGCAGCTATTGCTGGATGTTTAGCTTTAATGATATTAGGAGCTGTTCTGGTCGTTGGTGTGGCGTTCGGTTGGTTTGATCCCAAACGAGAAGATGAAGACACACCTGCAAAGATCAGTGCAAGACTTCAAGGCCGGACATCTTTCGTCACTCACAGAGCTGTAACGCTGTCGAGTCAATGCCAATGTACGGTAAATGATATATTTAGAATGATGGAGGTTGTACCTGGTTTGGTTGGACCACCTGGACCTCCAGGGCCGACTGGTGCCGATGGCATGACAGGTGCCCCAGGAAAAACTGGACAAATAGGAGAAACAGGAGCTCCTGGACCTCAAGGGACAAAGGGTGATCGTGGAGAGAGGGGCGATACAGGTCCCACCGGAATAGAGGGCCAACCAGGACCTAAAGGAGAGCCTGGTGCTGATGGTTCACCAGGTCTTCAGGGACCGCCGGGACCTCCAGGTCCACCTGGATTGCCTTCATCGCCAATGATGGAATCAACAGGATTATACGTAGCTGGGGATCGTGGAGTTATGGGACCTCCCGGCGAACGAGGCCCAATGGGTCTACCCGGACCCCAAGGAGAAAGGGGTTACGCGGGGAATAAAGGCGAAAAAGGATTACATGGGGCTAAAGGCGATAAAGGTGAACGGGGGTACGTGGGATTGCGGGGGCCACATGGTGCGAAAGGGGAACGAGGGGTGCCAGGAAAAGATGGCACACCTGGTTTGCCCGGCGCTCACGGACGCCCAGCGGAAAAGGGTGAAAAGGGTGCTCGCGGACTTCCTGGTTTACCCGGACCTTCAGTAGTCGGCATTTCTGAAAATTCTGTTTTAAGTGAAATCGCACTTCCAGGATCTCGTGACGTCATGAAGTTGAAAGGCGAAAGAGGAGAAAGGGGCGAAAAGGGTGAAAAAGGTAGCAGAGGTATGGAAGGCCCACAAGGTTTTCCAGGCACTGATGGAAGGCCTGGTGAAAGAGGGGATATCGGCCCATCAGGTGTCCCTGGACCTCAAGGAGTACTAGGTCCACCCGGACCTGTAGCAATTTCAAGAGAAGAAGCTTTGATCATGACCAAGGGTGAAAAGGGTGAGACCGGTCCCAGGGGAAAACGGGGTCACCCCGGCGCTCCAGGCCCGAGAGGACCGCCAGGGCTTCCTGGACCCCCCGGAGTCCCTGGAATTAATGGACCTTCTGGTGATATTGGCCTGCCAGGATGGACGGGTCCACCGGGTGTAGCGGGTCAACCAGGCCCGCCGGGACAAAAAGGTGAAAAAGGAGACTCCGGTATATCACCAGCCGACCTCGAAAAGGTAAAAGGTGAAAAAGGTGAACGTGGCTACGATGGAACCTCTGGACCACCGGGTAAAGATGGTCCTAGAGGTCCGCCTGGACCTCCAGGAACTCCAAGCACAAGTTTGCAATATATTCCGGTTCCTGGCCCTCCTGGCCCCCCAGGGCCACCCGGGCCTCCTGCGGTTTTCACGAATAACGTTCCAATCGACGCTTTGACAGATAGCCCTGGGATTAATCGCCTTCAACCTGGCACAGGAAAACCACGAGATCCGCTACAAATTCTAAGAAACTTGAATAATTTGATGCAGTACCGCCAAGAACAATTTGAGCCTGGAATTCGAGATTCACTAGATAGTGACGGAGAAAATACAGATTTCGATGATGAAGAAGATGGCAGGACTCTGGTCGGCACTATACTATTTAAATCAACTGAATCATTATTACGGTTGGGAACAAACACTCCTCGAGGAACATTAGCATACGTATTGCAAGAGCAAGCGCTCCTTGTAAGAGTCAACAAAGGCTGGCAATATGTTGCAATGGGTTCGCTTCTAAAGATACCGAGTCCGCCGGGTAGCGGCGTTACGCTCACTCCAGTTCAAAATATATTAGAAACTTCTAGTTTGGTACATCATAAAAACTCGGCAACAGGCGGACCTGCGCTTCGTCTGGCAAC
ACTTAATGAGCCTCACACAGGCGATATGCATGGAGTCAGCAGCACAAACTACGAATGTCACAGACAAGCTGAAAGATCCGGATTAGATGGAACTTTCAGAGCCTTTATTACTTCAAGGGTACAAAACATAGAGTCCATAGTGAATTGGGTGGACCGTGAAATACCAGTAGTGAATATCCGAGGGGACATTCTCTTCAATTCGTGGGGTGAAATGTTGGATGGGTCTGGTGCTGTATTTGCACACGCTCCTAAATTATACAGCTTCAATGGAAAAAACGTAATGATGGATCCCAGTTGGCCAACAAAAGCTGTTTGGCATGGGGCCACACCAAATGGGGAACCGGCAATGGATGCGTATTGTGACGCATGGCACAGCAGTAGCCCGACAAAATTCGGATTGGCCTCTTCATTACGCTCTAACAAGCTTTTAGATCAAGAAACGTACCCGTGCAGCACGCGACTAATCGTGCTCTGCATTGAAACTACTCCGCTTAACACAGTGAGAAGAAAAAAACGTTCCAAATATCGGGTATCCGACAAAACACATTTCCTCAAAGACATCGAAAAACGAAACGAAACTCTAAACTTATAG

Protein sequence:

>DPOGS207247-PA
MAMVISQTTKAAIAGCLALMILGAVLVVGVAFGWFDPKREDEDTPAKISARLQGRTSFVTHRAVTLSSQCQCTVNDIFRMMEVVPGLVGPPGPPGPTGADGMTGAPGKTGQIGETGAPGPQGTKGDRGERGDTGPTGIEGQPGPKGEPGADGSPGLQGPPGPPGPPGLPSSPMMESTGLYVAGDRGVMGPPGERGPMGLPGPQGERGYAGNKGEKGLHGAKGDKGERGYVGLRGPHGAKGERGVPGKDGTPGLPGAHGRPAEKGEKGARGLPGLPGPSVVGISENSVLSEIALPGSRDVMKLKGERGERGEKGEKGSRGMEGPQGFPGTDGRPGERGDIGPSGVPGPQGVLGPPGPVAISREEALIMTKGEKGETGPRGKRGHPGAPGPRGPPGLPGPPGVPGINGPSGDIGLPGWTGPPGVAGQPGPPGQKGEKGDSGISPADLEKVKGEKGERGYDGTSGPPGKDGPRGPPGPPGTPSTSLQYIPVPGPPGPPGPPGPPAVFTNNVPIDALTDSPGINRLQPGTGKPRDPLQILRNLNNLMQYRQEQFEPGIRDSLDSDGENTDFDDEEDGRTLVGTILFKSTESLLRLGTNTPRGTLAYVLQEQALLVRVNKGWQYVAMGSLLKIPSPPGSGVTLTPVQNILETSSLVHHKNSATGGPALRLATLNEPHTGDMHGVSSTNYECHRQAERSGLDGTFRAFITSRVQNIESIVNWVDREIPVVNIRGDILFNSWGEMLDGSGAVFAHAPKLYSFNGKNVMMDPSWPTKAVWHGATPNGEPAMDAYCDAWHSSSPTKFGLASSLRSNKLLDQETYPCSTRLIVLCIETTPLNTVRRKKRSKYRVSDKTHFLKDIEKRNETLNL-