Monarch geneset OGS2.0

DPOGS212148
Transcript        DPOGS212148-TA  978 bp
Protein           DPOGS212148-PA  325 aa
Genomic position  DPSCF300038 + 271384-273325
RNAseq coverage   3x (Rank: top 90%)
Annotation
Heliconius        HMEL007719        2e-104  56.45%
Bombyx            BGIBMGA006747-TA  7e-36   43.93%
Drosophila        CG7142-PA         3e-41   33.94%
EBI UniRef50      UniRef50_D0R8R1   4e-45   36.71%  Chymotrypsin-like proteinase 6A n=3 Tax=Tribolium castaneum RepID=D0R8R1_TRICA
NCBI RefSeq       NP_001161130.1    8e-46   36.71%  hypothetical protein LOC660544 [Tribolium castaneum]
NCBI nr blastp    gi|307168675      2e-47   38.46%  Cationic trypsin-3 [Camponotus floridanus]
NCBI nr blastx    gi|307168675      3e-48   38.52%  Cationic trypsin-3 [Camponotus floridanus]
Group
Gene Ontology     GO:0004252  3.5e-65  serine-type endopeptidase activity
                  GO:0006508  3.5e-65  proteolysis
                  GO:0003824  8.7e-65  catalytic activity
KEGG pathway 
InterPro domain   [39-319]  IPR001254  3.5e-65  Peptidase S1/S6, chymotrypsin/Hap
                  [40-324]  IPR009003  8.7e-65  Peptidase cysteine/serine, trypsin-like
                  [70-85]   IPR001314  7.3e-13  Peptidase S1A, chymotrypsin-type
Orthology group   MCL10207  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS212148-TA
ATGCAAAAGCAGGATGCCATGGAATACACCATGGAGTATGATTGGGGAATCATTTGCCTATTGATCATTAGTTTCACAAGATTAAACGCAGCGTCAAGTACTTCTGAAAGCAGCAGGATTGTTGGCGGCCATGAATCCAAACCGTACAACCACCCATATCTGGTGACCCTCCAATTGAGATTCTTGTGGGTGAGAATGCACGTTTGCGGTGGAAGCATCATAAATGAGAAATGGATATTAACTGCAGCACACTGCGTGCAAGATTCCTGGTTGCTGAGATGGCTCCCAATGGATGCTGTAGCTGGCGTTCATAATATAAACACGTTTGGGAAAGAGGCACAGATTAATACGATCAACGAACGCATACCACATCCTTTATATGAAGGAGGTATTGGTGCATATGACATCGCTCTACTGGGCTTGCGTACGCCATTCGTCTTCACGGATCAAGTCCAGCCGATAAACCTTCCTTACACTTCTAAAATTTCAAACGAATCTCTTCTTCTGGTCGGATGGGGAGCTTTGAGGACAACTTCCTTCATACCTGACCTACCAAATGAATTACAAGAAGTCAAAGTGACGTACATCCCGTACCAACAATGTTACGACGCTATTGAAGAAATAAAAGAACCTTCTGAATATAATCCTCTCGATAAAGAAGCGCATTTATGTACTGGACCCCTGACTGGAGGCATCGCTGCTTGTAGTGGAGATTCCGGAGGCCCTCTGGTACAAATGACTTCTATAGAAGCCTTGAATAATAGAAACGAAAAGGATAATGATTACGACGAATATTATAATGACAAAAGACTGATTAGAAGTGAAAAACTAGTGCCCAATGTAAACCGCGATCAAGTGCCTTTTATTATTGGGATTGTGTCTTGGGGTATGGCTCCCTGTGGAACCAAAGGAGCTCCAACGGTTTACACTAATGTATCACAATACATGGATTTCATTAACGCACATATTAAAACGTAA

Protein sequence:

>DPOGS212148-PA
MQKQDAMEYTMEYDWGIICLLIISFTRLNAASSTSESSRIVGGHESKPYNHPYLVTLQLRFLWVRMHVCGGSIINEKWILTAAHCVQDSWLLRWLPMDAVAGVHNINTFGKEAQINTINERIPHPLYEGGIGAYDIALLGLRTPFVFTDQVQPINLPYTSKISNESLLLVGWGALRTTSFIPDLPNELQEVKVTYIPYQQCYDAIEEIKEPSEYNPLDKEAHLCTGPLTGGIAACSGDSGGPLVQMTSIEALNNRNEKDNDYDEYYNDKRLIRSEKLVPNVNRDQVPFIIGIVSWGMAPCGTKGAPTVYTNVSQYMDFINAHIKT-
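
One way to cross-check the two FASTA records above is to translate the transcript and compare the result with the listed protein. The following is a minimal sketch, not part of the record itself. It assumes the standard genetic code (NCBI translation table 1) and writes the stop codon as "-", as the protein record does. The helper names read_fasta and translate are hypothetical and introduced only for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def read_fasta(text):
    """Parse FASTA text into a dict mapping record name to sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon; stops become stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


# The first nine codons of DPOGS212148-TA give the start of DPOGS212148-PA.
print(translate("ATGCAAAAGCAGGATGCCATGGAATAC"))  # MQKQDAMEY
```

To check the full record, pass both FASTA blocks through read_fasta and compare translate(records["DPOGS212148-TA"]) with records["DPOGS212148-PA"].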