Monarch geneset OGS2.0

DPOGS210743
Transcript        DPOGS210743-TA  4428 bp
Protein           DPOGS210743-PA  1475 aa
Genomic position  DPSCF300013 + 349799-359455
RNAseq coverage   69x (Rank: top 66%)
Annotation
Heliconius      HMEL021773              0.0     72.78%
Bombyx          BGIBMGA006274-TA        0.0     72.80%
Drosophila      Cg25C-PB                4e-112  36.77%
EBI UniRef50    UniRef50_UPI0002062AE2  0.0     54.92%  UPI0002062AE2 related cluster n=1 Tax=unknown RepID=UPI0002062AE2
NCBI RefSeq     XP_002428942.1          0.0     59.13%  collagen alpha-2 precursor, putative [Pediculus humanus corporis]
NCBI nr blastp  gi|242016919            0.0     59.13%  collagen alpha-2 precursor, putative [Pediculus humanus corporis]
NCBI nr blastx  gi|242016919            0.0     59.12%  collagen alpha-2 precursor, putative [Pediculus humanus corporis]
Group
Gene Ontology    GO:0005201  2e-70  extracellular matrix structural constituent
                 GO:0005581  2e-70  collagen
KEGG pathway     phu:Phum_PHUM411770  0.0
                 K06236 (COL1AS) maps-> Amoebiasis
                                        Focal adhesion
                                        ECM-receptor interaction
InterPro domain  [1253-1475]  IPR000885  2e-70    Fibrillar collagen, C-terminal
                 [249-307]    IPR008160  2.6e-10  Collagen triple helix repeat
Orthology group  MCL10767  Patchy
Genotypes for resequenced monarchs and outgroup Danaus species
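
The transcript and protein lengths listed above can be cross-checked with simple arithmetic. The following is a minimal sketch, assuming the transcript is coding sequence only (start codon through stop codon, no UTRs) and that the genomic coordinates are 1-based and inclusive; neither convention is stated in the record. The function names are hypothetical and used only for illustration.

```python
# Values copied from the record above.
TRANSCRIPT_BP = 4428
PROTEIN_AA = 1475
GENOMIC_START = 349799
GENOMIC_END = 359455


def expected_cds_length(protein_len, include_stop=True):
    """Length in bp of a coding sequence for a protein of protein_len residues.

    Assumption: one codon (3 bp) per residue, plus one stop codon if requested.
    """
    codons = protein_len + (1 if include_stop else 0)
    return 3 * codons


def genomic_span(start, end):
    """Span in bp of a 1-based, inclusive coordinate range (assumed convention)."""
    return end - start + 1


# True when the transcript is exactly the protein's codons plus one stop codon.
cds_matches = expected_cds_length(PROTEIN_AA) == TRANSCRIPT_BP
span_bp = genomic_span(GENOMIC_START, GENOMIC_END)

print("CDS length consistent with protein length:", cds_matches)
print("Genomic span (bp):", span_bp)
```

Under these assumptions, a genomic span larger than the transcript length would be consistent with introns in the gene model, although the record itself does not list exon coordinates.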

Nucleotide sequence:

>DPOGS210743-TA
ATGTGGCAAGTGTCAGTAGTGTGGCTGGCGGTGCTGGCGCTGCCAGTGGCTTGCAAGGACCTGGTAGAGGTCCGAGAGGAACTGCCCCAAGCTCAAGAACCAGATCTCGTGAATTATGATGATTATAATCCGTTTGGAGAAGATGAAAACGCTATACTTGCAGAAAGAGACTTCACTCAATCGACCACTGCGCAATCCACCAGATGTTTCGACATAGGACGCTCGTTCTCTGAAGGGGAAACCTGGAAACGAGACAACTGCACGCTCTGTCAGTGTTATAACACTCGTGTCAATTGCTCTGTGCTGCCATCTTGTCTTTTAAGCCCGGGCACGATCCGTCCAACACCATTGGTACCGCCGCAATCTCCACCGGTTCGTGGCATGGCTGGGGAGGCTGGGGCCGACGGACCCCCCGGACCTCCAGGGACCCCTGGCGTTAATGGAGTCCCTGGAGCACCTGGCATTCCAGGACCGGTCGGACCAATGCCTGATGTAAACGCCTACCTGGCTCAGTTAGCAGCTGCCGCCGGTGGTGACAAAGGTCCTGCCTTCGATCCATATCACTACATGCAAGCACCTGTGGGAGTTCCAGGGGTTAGGGGAATTCCAGGGCCCCAGGGTCCTCCGGGACCTCAAGGATTTCAAGGCCCCCGAGGTGAACCCGGTGAGCCTGGATTGCCTGGTCCACCAGGAGTTGCTGGTGAGAGAGGTCCTCAAGGGCTACCCGGAAAAGATGGATCACCGGGAGAAGATGGGGAGCCTGGTCCAGCCGGTGCTATCGGCCAGCCAGGCCAGAGAGGAAGCCCTGGTATACCTGGTATTCAAGGATTGAAGGGACATCGTGGCTTTGACGGCAAGGACGGTGCTAAGGGCGAACAAGGTTTGCCTGGAGAAAAGGGCCCTACTGGCCTCCCTGGACCTATGGGCCCAAATGGACCCGCGGGTCCTGTCGGACCGAGAGGCGAAAGGGGCCGTGAAGGCCCTGCTGGAGTTCCGGGTATAAGAGGAGTTGACGGAGCTCCAGGATCACCGGGACTAATGGGTAATATTGGAAAGCCTGGCGCACCTGGTTTTCCTGGATCCCCCGGCTTGAAGGGCGAACAAGGAGCAGTCGGCCCAAAAGGCAGCCAAGGAATGCAAGGTCCACGAGGGGAGCCAGGACGTAATGGACAGTCAGGGGAAGTGGGTCCCCCGGGTCTAGCTGGTAGAGATGGCACACCGGGAGAAAAAGGGGTACAAGGACAAACAGGCGCCGCCGGACCTCAAGGATTTCCGGGACCTCGTGGTACGCCTGGAGTTGCCGGTGATCCTGGGTTGCCAGGGACGAAGGGAATGCCAGGCCAGCCGGGGGATAGAGGACCTAAGGGTGAAATTGGCTTAAGAGGTGAAATTGGCGCACCAGGTCCGCGTGGTATGCCAGGTGGCATTGGACCTGAGGGTAAAAGGGGAAAAAGAGGTTTGAGAGGACCACCGGGAAACTTAGGACCGCAAGGAGATAGAGGGTTACAAGGATTAAGAGGTTTGAATGGCGCTGACGGTCCAATGGGTCCAAAAGGTCAAACCGGGGAAAGAGGCGCTGTTGGTCTCCCAGGTCCAAAAGGCAGTACCGGCGATATTGGCAGACCTGGACCTCAAGGACTTCTCGGCCCCAGAGGGTTTATGGGCAGACCAGGCATTCCAGGTAAATCTGGTCAACCTGGCGAAAGAGGTATACCTGGTGCAGATGGTAGGCCTGGTGAACAGGGACTTCAAGGACTTCAAGGTCCACCTGGCTTAATTGGAAGTCCTGGCGAACGTGGATTGCAGGGTGAACATGGTAAAGATGGAGATGTTGGTCAACCTGGAGCACCTGGTGCTAAAGGAGACGCCGGAAGAGATGGAAGTCCAGGGAGTCAAGGACCGCAAGGCCCCGTTGGTGCAACTGGGGAACGCGGTCCCATAGGGCCTGCTGGACCGACTGGTTTTCCTGGTTTACCTGGAGCACCAGGATCACCAGGTCC
TGCAGGTAAAGACGGAGAACCAGGTGTGTCTGGACCTGCCGGACCACCAGGAGCTACAGGACAAAGAGGTGAAAGAGGTTTCTCAGGAGAAAGAGGCAGTCCAGGTCTACCAGGTGTTGCAGGAGAAAAGGGTGAAGCAGGTGCTCAGGGTCTTGATGGTCCACCGGGTGCGGAAGGACCTCGTGGAAGTAAGGGACATCCAGGACCAATGGGCACGATGGGATTACCAGGACTACGTGGTATGTCCGGTTTGCCAGGTGAAAAGGGGGAGAGAGGATCTTCAGGACCACAGGGGCCAGAGGGACCAGCGGGTCGCCAAGGAGAGCAAGGACCTCAAGGTCCTATAGGTCCTGCTGGACCACCTGGAGAGCCGGCCGAAAGAGGTGAACCTGGAACACCAGGGATGCCAGGAGAATCAGGGGCACCGGGGTCTACAGGAGAGCGGGGTCACCCTGGTCCTCAGGGCAATAATGGTTTACCTGGTCCTCCAGGATTAACAGGCATGCCCGGCTTGAAAGGGGATAGAGGCTATGCAGGACCCAAAGGTCAACAAGGAGCTCAAGGAAGTCCAGGGATTTCTGGTGAACCGGGGCAACGAGGGCTTCCAGGGCAACCTGGAGCTAAGGGTGCTCATGGAGAGCAAGGACATAAAGGTGAAATTGGACGTGCTGGCTTACCGGGAAGACCAGGGGACATGGGGCCGCAGGGTCCTCAAGGAAGCCCGGGTCCTACTGGGGCTCCTGGGTTACCAGGTGCCAAAGGGTCGACTGGAGATACTGGACGGTCCGGACCACCTGGACCTCAAGGGTTAATTGGACCACAAGGACCAGAAGGTCCTAAAGGCGAAAGAGGTGCTGAAGGAGAAACTGGGCCACAGGGACAGCAGGGTATCCCTGGACAAGCTGGAGAAAGAGGACCAAGTGGATTACCAGGTTTAACTGGGGCACCAGGACCACAAGGTCTGAGAGGAGTGCAGGGAGAAAGTGGAGTTCCGGGAAAACCAGGAGCAGACGGTGCCCCAGGTCCTATAGGCGCTCCTGGACCACAAGGGATGACTGGTCCAATGGGGGAACCTGGACCAGAAGGTCGTCCGGGCAAGCTGGGTCAGCCTGGTATTCCAGGACGGCAGGGTGAAAAAGGTCCTATGGGACAACCAGGACAACCAGGTCCACCAGGTTCTCCAGGCGTGCAGGGACCTCCAGGTTCCTCCGGACCTCCTGGAGCAACTGGAGAAAGAGGACCTAGAGGAGAAAGTGGTTCTCCTGGCATAGAAGGGCCTCAAGGTCCATCTGGCAAGCAGGGCCCGCCTGGTATGGATGGTATTAAGGGTGAGCGAGGAGAAAACGGCGCTGATGGACCTAAAGGTCACGCGGGATTGCCAGGACTCCCGGGTTTGATAGGTACTCCAGGAAGACAAGGTGACAGAGGTTTACCAGGTGCCATAGGACCGCCAGGAAAGGATGGAGACGCTGGACCAAGAGGACCACCTGGCCGTGATGGTAGTCCGGGTCCCCAAGGGCCATTAGGTCCCCCGGGAGGTCGTGGTCCTCCAGGAGAGCCTGGACGTCATGGAACACCTGGACCCGCTGGACCTCCTGGACCACCAGGACCTCCTGGAGAAGGGTTGGCATATGATGCTGCTGCAATTGCTGCTATGCTTCAACAAGGAACGATGAAAGGTCCAGATCCGATGGGAGATGATCCGAACATAATGCCACCAAGGTTCTTCAAAGAAGATATGTCCCCAGAAGAGAGGAAAAGTATTGTAATGAAAGCGTACGAAAGACTCCAGGTTTCTCTGGACAAATTCTTAAAACCTGATGGCTCTAAGGAAGCACCCGCCAAGACATGTGGGGATATCAAATACCATCATCCTCATTTCGAAAGTGGTCAGTACTGGATAGATCCCAACGGTGGTGATATCAAGGATGCGATCTTGGTGCACTGCGATTTAAGTACTGGCGCTAGCTGTGTATTTCCGAAGCCCATGATGTCGGAGG
AACTCGTCCACTCCGAAAGAAACGAGGCGTGGTTAAGTGAGATGGATAACGGGTTCGCTATATCTTACAAAGCGGAGCACAGTCAGCTGACTTACCTACAACTGCTATCAGTGAAAGCGGTACAGAATGTCACACTTCATTGTCGAAATATTGTTGGCTACTATGACCCAGCCACTAGGAACTACAAACATGGCCTGAAACTACTGGCTTATAATGATGCCGAGATTCTTCCTAAGGCCAATAACAGACTGCGATATAAGGCTTTAATAGATGAATGTCAGTTTAAATCTCAAGATTGGTCCAAGACGATAGTCCAGTATGAGACGGACAAGCCTGGGCGGTTGCCAGTTCTGGATGTGGCTGTAAGAGACCTGCCCAGAACCGACCAGGCCTTCAGGATTGAACTGGGACTGGCGTGCTTTACTTAA

Protein sequence:

>DPOGS210743-PA
MWQVSVVWLAVLALPVACKDLVEVREELPQAQEPDLVNYDDYNPFGEDENAILAERDFTQSTTAQSTRCFDIGRSFSEGETWKRDNCTLCQCYNTRVNCSVLPSCLLSPGTIRPTPLVPPQSPPVRGMAGEAGADGPPGPPGTPGVNGVPGAPGIPGPVGPMPDVNAYLAQLAAAAGGDKGPAFDPYHYMQAPVGVPGVRGIPGPQGPPGPQGFQGPRGEPGEPGLPGPPGVAGERGPQGLPGKDGSPGEDGEPGPAGAIGQPGQRGSPGIPGIQGLKGHRGFDGKDGAKGEQGLPGEKGPTGLPGPMGPNGPAGPVGPRGERGREGPAGVPGIRGVDGAPGSPGLMGNIGKPGAPGFPGSPGLKGEQGAVGPKGSQGMQGPRGEPGRNGQSGEVGPPGLAGRDGTPGEKGVQGQTGAAGPQGFPGPRGTPGVAGDPGLPGTKGMPGQPGDRGPKGEIGLRGEIGAPGPRGMPGGIGPEGKRGKRGLRGPPGNLGPQGDRGLQGLRGLNGADGPMGPKGQTGERGAVGLPGPKGSTGDIGRPGPQGLLGPRGFMGRPGIPGKSGQPGERGIPGADGRPGEQGLQGLQGPPGLIGSPGERGLQGEHGKDGDVGQPGAPGAKGDAGRDGSPGSQGPQGPVGATGERGPIGPAGPTGFPGLPGAPGSPGPAGKDGEPGVSGPAGPPGATGQRGERGFSGERGSPGLPGVAGEKGEAGAQGLDGPPGAEGPRGSKGHPGPMGTMGLPGLRGMSGLPGEKGERGSSGPQGPEGPAGRQGEQGPQGPIGPAGPPGEPAERGEPGTPGMPGESGAPGSTGERGHPGPQGNNGLPGPPGLTGMPGLKGDRGYAGPKGQQGAQGSPGISGEPGQRGLPGQPGAKGAHGEQGHKGEIGRAGLPGRPGDMGPQGPQGSPGPTGAPGLPGAKGSTGDTGRSGPPGPQGLIGPQGPEGPKGERGAEGETGPQGQQGIPGQAGERGPSGLPGLTGAPGPQGLRGVQGESGVPGKPGADGAPGPIGAPGPQGMTGPMGEPGPEGRPGKLGQPGIPGRQGEKGPMGQPGQPGPPGSPGVQGPPGSSGPPGATGERGPRGESGSPGIEGPQGPSGKQGPPGMDGIKGERGENGADGPKGHAGLPGLPGLIGTPGRQGDRGLPGAIGPPGKDGDAGPRGPPGRDGSPGPQGPLGPPGGRGPPGEPGRHGTPGPAGPPGPPGPPGEGLAYDAAAIAAMLQQGTMKGPDPMGDDPNIMPPRFFKEDMSPEERKSIVMKAYERLQVSLDKFLKPDGSKEAPAKTCGDIKYHHPHFESGQYWIDPNGGDIKDAILVHCDLSTGASCVFPKPMMSEELVHSERNEAWLSEMDNGFAISYKAEHSQLTYLQLLSVKAVQNVTLHCRNIVGYYDPATRNYKHGLKLLAYNDAEILPKANNRLRYKALIDECQFKSQDWSKTIVQYETDKPGRLPVLDVAVRDLPRTDQAFRIELGLACFT-