Monarch geneset OGS2.0

DPOGS210757
Transcript: DPOGS210757-TA (1002 bp)
Protein: DPOGS210757-PA (333 aa)
Genomic position: DPSCF300013 + 1509244-1514185
RNAseq coverage: 171x (Rank: top 51%)
Annotation
Heliconius       HMEL003806         5e-33   33.86%
Bombyx           BGIBMGA006274-TA   3e-11   47.44%
Drosophila       Cg25C-PB           6e-09   54.84%
EBI UniRef50     UniRef50_F0IEL6    7e-16   43.75%   Putative uncharacterized protein n=1 Tax=Capnocytophaga sp. oral taxon 338 str. F0234 RepID=F0IEL6_9FLAO
NCBI RefSeq      XP_001634304.1     1e-10   39.09%   predicted protein [Nematostella vectensis]
NCBI nr blastp   gi|326335279       3e-15   43.75%   hypothetical protein HMPREF9071_0940 [Capnocytophaga sp. oral taxon 338 str. F0234]
NCBI nr blastx   gi|326335279       2e-22   44.30%   hypothetical protein HMPREF9071_0940 [Capnocytophaga sp. oral taxon 338 str. F0234]
Group
Gene Ontology     GO:0005201   2.1e-07   extracellular matrix structural constituent
                  GO:0005581   2.1e-07   collagen
KEGG pathway      mcc:709493   3e-13
                  K06238 (COL6A) maps-> Focal adhesion
                                        ECM-receptor interaction
InterPro domain   [53-107]     IPR008160   7.9e-10   Collagen triple helix repeat
                  [145-332]    IPR000885   2.1e-07   Fibrillar collagen, C-terminal
Orthology group   MCL34847     Lepidoptera specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210757-TA
ATGTTGCAGAATAATTTGACATTTCCACTGCTTCCGTCTACGTGGCACAAAATAAAAAGTAAAGTAGCATGGAACAGTTCTACCAATCCAAGTGATTGTGGAAGTAAGGAAGACAATGCAACAAAAGGCCCGGTGCTTACTACAACGGTAGAAAAGGGAGGAAAAGGTGAAAAGGGAGAGAAAGGTGAAAAGGGAGAGAGGGGAGAAAAAGGTGAAAATGGAGAAAAAGGGGAAAGGGGTGAGAAAGGATCGCTAGGATCACGTGGGTTAGATGGCGCAAAGGGCGATAAAGGTGTAATGGGTTCACCCGGCAACATAGGCTTAACTGGTCCAAAGGGAGATAAAGGAGACACAGGAGAGCAGGGACCAAAGGGAGAAAATGGAAAGGATTGTGAAGGTTTGTCAGCAGCGGAATCGGGGCTAAATAATGTGGGTTCAAAATCAATGCCGGCATACACCTGTTCATCAATTCCTGATGATGGTTCTTTTTACATCAATCCCTCACGAAGTTTCGAAGTTAATTGCGAAGGATACAAAACCTGTATTAAGCTACCGGATCGTCTTAGGAGCTCCAACTTTAATACCTCAGCATATGAATTTAAAGATAAAAAAGGTTTTTGGCTAAGTACTTTAGGATTCACCGTGAAAAAACTATATGATCAAGATTTTTCAAGGTTGTTATATCTCATGACGAAGTCAGAGGGCTTTCAGCTCACCATGAAATATCATTGTTTCAACAGTCCAGTTGAGTCATTGAGGATAATGACATGGAATGGAAACATTATAAGTAATATATCCAGCCCCAAAACTCCAATAAAATTTGTTATACCCGAAAAGAGTAATGGGTGCAAGGACGCTAAGAAGAATGAATTAAAAAGTTCGACTATAGAAATTGAATCTGACATAACAAGTTACCTTCCTGTTGATTTTTACTTAGGGGACATTATAGGAGATACCATTCAAAAAGTTTATATCGAGCTGACAAGCCTATGCTTTAAATAA

Protein sequence:

>DPOGS210757-PA
MLQNNLTFPLLPSTWHKIKSKVAWNSSTNPSDCGSKEDNATKGPVLTTTVEKGGKGEKGEKGEKGERGEKGENGEKGERGEKGSLGSRGLDGAKGDKGVMGSPGNIGLTGPKGDKGDTGEQGPKGENGKDCEGLSAAESGLNNVGSKSMPAYTCSSIPDDGSFYINPSRSFEVNCEGYKTCIKLPDRLRSSNFNTSAYEFKDKKGFWLSTLGFTVKKLYDQDFSRLLYLMTKSEGFQLTMKYHCFNSPVESLRIMTWNGNIISNISSPKTPIKFVIPEKSNGCKDAKKNELKSSTIEIESDITSYLPVDFYLGDIIGDTIQKVYIELTSLCFK-
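
Consistency check (illustrative sketch, not part of the original record): the 1002 bp transcript corresponds to 333 codons plus one stop codon, and the protein record appears to mark that stop with a trailing "-". The Python below is a minimal sketch of how the two records could be cross-checked. It assumes the standard genetic code (NCBI translation table 1); the function name translate and the choice to render stop codons as "-" are assumptions made for this example, not something the database documents.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), listed in TCAG order.
# Assumption: the gene set was translated with the standard code.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence in frame 0, writing stops as stop_symbol.

    Hypothetical helper for illustration; any trailing partial codon is ignored.
    """
    seq = nucleotides.upper()
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 30 nt and last 18 nt of DPOGS210757-TA, copied from the record above.
first_codons = "ATGTTGCAGAATAATTTGACATTTCCACTG"
last_codons = "AGCCTATGCTTTAAATAA"

print(translate(first_codons))  # start of the protein record
print(translate(last_codons))   # end of the protein record, including the stop
```

Running translate on the full DPOGS210757-TA sequence and comparing the result with the DPOGS210757-PA record is the natural extension; the length relation 3 x (333 + 1) = 1002 is what makes the two records consistent.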