Monarch geneset OGS2.0

DPOGS210689
Transcript: DPOGS210689-TA, 969 bp
Protein: DPOGS210689-PA, 322 aa
Genomic position: DPSCF300013, 751584-778063
RNAseq coverage: 26x (Rank: top 77%)
Annotation
Heliconius: HMEL013877, 3e-53, 77.17%
Bombyx: BGIBMGA006304-TA, 7e-59, 81.75%
Drosophila: (none listed)
EBI UniRef50: UniRef50_D6W8H2, 6e-87, 55.29%, Putative uncharacterized protein n=3 Tax=Endopterygota RepID=D6W8H2_TRICA
NCBI RefSeq: XP_001814363.1, 6e-91, 54.40%, PREDICTED: similar to C3 and PZP-like alpha-2-macroglobulin domain-containing protein 8 [Tribolium castaneum]
NCBI nr blastp: gi|189234328, 1e-89, 54.40%, PREDICTED: similar to C3 and PZP-like alpha-2-macroglobulin domain-containing protein 8 [Tribolium castaneum]
NCBI nr blastx: gi|189234328, 8e-87, 54.40%, PREDICTED: similar to C3 and PZP-like alpha-2-macroglobulin domain-containing protein 8 [Tribolium castaneum]
Group
Gene Ontology: GO:0004866, 6.2e-10, endopeptidase inhibitor activity
KEGG pathway: cfa:477699, 5e-08
 K03910 (A2M) maps-> Complement and coagulation cascades
InterPro domain: [136-216] IPR002890, 6.2e-10, Alpha-2-macroglobulin, N-terminal
Orthology group: MCL25224, Lepidoptera specific

Nucleotide sequence:

>DPOGS210689-TA
ATGGCAATTACCGTTGTGTTGCTAGCACTGCAACTAGCAGTTGTTGCTGCAGGGAAGTGCAATACAGGATTCTCTGTTGTGGCCCCGGAGGTTGCTGTCCCGGGTAAGACGACAGCTGTTCTTGTTACTCTTCATGGAAGAGTAGGAGATGACTTATTAGATCCACTCAACGTGACAGTTCGACTCGAAACTCAAGATGCTAACAATGACGTCAAACTGCTTACAACCGCCAGTCAAATGATCACAGGCTACGGAATCATACCTCTCAAGATCCCATCGTCCCCAGCATCTCATTGCACTCTGCACACCAGTGTCGGCTGCCTCGGTAATGACGAGTGCACAGCAAAGAGCCACAGCGTTATAAGACTCTTGGGCCCCGTGCGAGATGTAATCGTCCGACCAGCAAGACATCATTACAGGCCCGGAGAAACCATCACGTTTTGGATTCTTGCCCTGGATCACGATTTGCGGTTGGTTAGAGAGGAGATAGCTTACGTGGCTCTCAGCGATCCAGCAGGAACGAAGGTTGCATTGTGGGAGCAGCTGTCGTTAGACGATGGTGTGAGGAAACTAAGCGCAGTTTTGGCTGCTGGAGCACGTTCTGGAACATGGAGAATCGAGGCGTCTTGCGGCGGATCTTCGGCTCGAGTCGACATCTCAGTTGGTGCTAGTTTGAGGAGCACAACCCCCGCGACCCCGGCCGCAGAACAACATTACGTGGAGCTGAAATTTGCGGATAGCATGAGAAGACGATATAAGCCTGGTTTACCATTTGTTGGAAAAGTCGAAGCAATGAGCACGGAGAAGAGAGTGAGCGTTCGCGTGAAGCTCTACGATGATAAAACCGACATCTACAGCCAAGATATTGATATGTCGACCGGAGAAGGCGGTTTCATAGTCCCGACTGTGATGGCTGATTCACCATTTATAAATTTGCAGGACCGCAACGCGTCACCAACAACTGTTTAA

Protein sequence:

>DPOGS210689-PA
MAITVVLLALQLAVVAAGKCNTGFSVVAPEVAVPGKTTAVLVTLHGRVGDDLLDPLNVTVRLETQDANNDVKLLTTASQMITGYGIIPLKIPSSPASHCTLHTSVGCLGNDECTAKSHSVIRLLGPVRDVIVRPARHHYRPGETITFWILALDHDLRLVREEIAYVALSDPAGTKVALWEQLSLDDGVRKLSAVLAAGARSGTWRIEASCGGSSARVDISVGASLRSTTPATPAAEQHYVELKFADSMRRRYKPGLPFVGKVEAMSTEKRVSVRVKLYDDKTDIYSQDIDMSTGEGGFIVPTVMADSPFINLQDRNASPTTV-
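
The transcript and protein lengths given in this record are consistent with each other: 969 bp is 323 codons, which is 322 residues plus one stop codon, shown as the trailing "-" in the protein sequence. The relationship between the two sequences can be sketched in Python as below. This is a minimal illustration, assuming the standard genetic code and a transcript that is coding sequence only; the function name translate and the use of "X" for unrecognized codons are choices made for this example and are not part of the database.

```python
# Minimal sketch: translate a coding sequence with the standard genetic code.
# Assumption: the transcript contains only coding sequence, starting at the
# first base of the start codon (as appears to be the case for this record).

BASES = "TCAG"
# Amino acids in the conventional TCAG codon order; "*" marks stop codons.
AMINO = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: aa
    for (a, b, c), aa in zip(
        ((a, b, c) for a in BASES for b in BASES for c in BASES), AMINO
    )
}


def translate(nucleotides: str, stop_symbol: str = "-") -> str:
    """Translate complete codons; stop codons become stop_symbol.

    Codons containing characters other than A, C, G, T become "X".
    Trailing bases that do not form a full codon are ignored.
    """
    seq = nucleotides.upper().replace("U", "T")
    residues = []
    for i in range(0, len(seq) - len(seq) % 3, 3):
        aa = CODON_TABLE.get(seq[i:i + 3], "X")
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First five codons of DPOGS210689-TA followed by its stop codon (TAA).
print(translate("ATGGCAATTACCGTTTAA"))
```

Applied to the full DPOGS210689-TA sequence, the same function would be expected to return a string of 323 characters (322 residues and the final "-"), which can be compared against DPOGS210689-PA.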