Monarch geneset OGS2.0

DPOGS203240
Transcript: DPOGS203240-TA; 1071 bp
Protein: DPOGS203240-PA; 356 aa
Genomic position: DPSCF300210 - 43036-47507
RNAseq coverage: 119x (Rank: top 58%)
Annotation
Heliconius: HMEL008701; 2e-154; 93.42%
Bombyx: BGIBMGA007075-TA; 5e-147; 88.96%
Drosophila: stet-PB; 5e-109; 60.52%
EBI UniRef50: UniRef50_Q9BML4; 6e-107; 60.84%; Rhomboid-2 n=7 Tax=Coelomata RepID=Q9BML4_DROME
NCBI RefSeq: XP_313932.4; 3e-116; 66.67%; AGAP005058-PA [Anopheles gambiae str. PEST]
NCBI nr blastp: gi|158292463; 7e-115; 66.67%; AGAP005058-PA [Anopheles gambiae str. PEST]
NCBI nr blastx: gi|158292463; 9e-114; 66.67%; AGAP005058-PA [Anopheles gambiae str. PEST]
Group
Gene Ontology: GO:0004252; 4.2e-136; serine-type endopeptidase activity
GO:0016021; 4.2e-136; integral to membrane
GO:0006508; 1e-91; proteolysis
KEGG pathway 
InterPro domain: [1-350] IPR017213; 4.2e-136; Peptidase S54, rhomboid, metazoan
[52-336] IPR002610; 1e-91; Peptidase S54, rhomboid
[157-306] IPR022764; 1e-39; Peptidase S54, rhomboid domain
Orthology group: MCL11662; Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203240-TA
ATGTCGTGGGAAACCGAGTTGAACGACTCTCGGGGACAATACGGTAGCTGGAGTGCGGATGGCGAGTTCACTGTTCAGCGCCGCCCACCTGTTCAGAGACCTGATCTATACCACTGCCCCATAGACCACACAGCATTCCTACTCGATAACTATATGAGCGGCAAGCGCACGCGCAGCTATAGATGTGCGGTGCATCAAAGAGACAGAGAAGTCAGCTCTGAGAACGACTTTCATTTACTATTAGAAGATCCAACACTTTTTGCCAGGATGGTTCATCTCGTGGCTATGGAAGTGCTCCCGGAGGAGAGGGATCGGAAATACTATCAGGAAAGATACACGTGCTGTCCCCCGCCGCTCTTCATCATCTGCGTCACTTTACTAGAATTGGGAGTGTTCGCGTGGTACGCGTGGGGCGGCGTGGCGGCCGCGGGCCCGGTGCCTGTGGACTCGCCCCTCGTGTATCGACCTGACAGGAGACAGGAGTTGTGGCGCTTCCTCACATACAGCGTGGTACACGCCGGCTGGCTGCACCTCGCATTCAACTTGCTTGTACAGTTAGCGGTGGGTCTGCCGTTGGAGATGGTTCACGGTGCGGTGAGGTGTGGCGCGGTGTACTTGGCGGGTGTGCTGGGCGGGTCGCTGGCCGCCTCCGTCCTCGACCCTGATGTGTGTCTCGCGGGAGCCTCGGGTGGAGTGTACGCCTTACTAGCGGCTCACCTCGCTAATGCCTTGTTGAACTTCCACGCAATGAGATACGGCGCTGTGAGACTCGTGGCAGCCCTCGCCGTCGCATCCTGTGACGTCGGTTTCGCTGTCCACGCTAGGTATACTAAGGAGGCGCCGCCTGTATCGTACGCGGCGCACGTGGCGGGCGCTCTGGCCGGTCTCACCATCGGACTGTTGGTGTTGAAACACGCACAACAGCGCTTATGGGAGAGACTGTTGTGGTGGGCGGCGCTAGGAGCGTACGCGGCCTGCACTCTATTCGCCGTATTATACAACGTGTTCAGCGGACCGGTCGATGAATTGCACTATATGCCGCCCGATCCGCCGCCAGACGTCGGCTTTTGA

Protein sequence:

>DPOGS203240-PA
MSWETELNDSRGQYGSWSADGEFTVQRRPPVQRPDLYHCPIDHTAFLLDNYMSGKRTRSYRCAVHQRDREVSSENDFHLLLEDPTLFARMVHLVAMEVLPEERDRKYYQERYTCCPPPLFIICVTLLELGVFAWYAWGGVAAAGPVPVDSPLVYRPDRRQELWRFLTYSVVHAGWLHLAFNLLVQLAVGLPLEMVHGAVRCGAVYLAGVLGGSLAASVLDPDVCLAGASGGVYALLAAHLANALLNFHAMRYGAVRLVAALAVASCDVGFAVHARYTKEAPPVSYAAHVAGALAGLTIGLLVLKHAQQRLWERLLWWAALGAYAACTLFAVLYNVFSGPVDELHYMPPDPPPDVGF-