Monarch geneset OGS2.0

DPOGS209584
Transcript        DPOGS209584-TA  1302 bp
Protein           DPOGS209584-PA  433 aa
Genomic position  DPSCF300015 - 876560-879728
RNAseq coverage   502x (Rank: top 25%)
Annotation
Heliconius        HMEL017030        5e-62   77.62%
Bombyx            BGIBMGA006645-TA  4e-148  68.14%
Drosophila        CG7556-PA         8e-103  44.32%
EBI UniRef50      UniRef50_E3WPN5   8e-111  48.35%  Putative uncharacterized protein n=1 Tax=Anopheles darlingi RepID=E3WPN5_ANODA
NCBI RefSeq       XP_624533.1       8e-113  50.68%  PREDICTED: similar to CG7556-PA [Apis mellifera]
NCBI nr blastp    gi|66514203       1e-111  50.68%  PREDICTED: dnaJ homolog subfamily C member 1-like [Apis mellifera]
NCBI nr blastx    gi|312383172      8e-113  48.00%  hypothetical protein AND_03860 [Anopheles darlingi]
Group
Gene Ontology     GO:0031072  2.5e-21  heat shock protein binding
                  GO:0006457  9.3e-13  protein folding
                  GO:0051082  9.3e-13  unfolded protein binding
                  GO:0005515  9.3e-10  protein binding
                  GO:0003677  4.4e-09  DNA binding
                  GO:0006355  8.7e-05  regulation of transcription, DNA-dependent
KEGG pathway      dme:Dmel_CG7556  6e-101
                  K09521 (DNAJC1)  maps-> Protein processing in endoplasmic reticulum
InterPro domain   [19-107]   IPR001623  2.5e-21  Heat shock protein DnaJ, N-terminal
                  [37-55]    IPR003095  9.3e-13  Heat shock protein DnaJ
                  [358-418]  IPR009057  9.3e-10  Homeodomain-like
                  [357-410]  IPR001005  4.4e-09  SANT domain, DNA binding
                  [359-404]  IPR014778  1.6e-06  Myb, DNA-binding
Orthology group   MCL13656 Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS209584-TA
ATGCGGGTACATGTACTATTGTTTTTGTCATTTTCATTGGGTGTATATGGTTGGGACGACGGAGATTTAGAAGTATTTGACGTGGTAGAAGAAGTCAATTCGAATTTTTACGAATTATTAGGAGTTAAGCAGGATGCATCACAATCAGAAATTAAGAGAGCCTTCAAGCAACTCACATTAAAGCTCCATCCTGACAAAAATGACGCACCTGATGCTGACGTGCAGTTTAGGAACCTAGTATCAGTTCATAATATTCTGAAGGATCCAGGAAAACGAGAAAAGTATAATGAGGTGCTCAAGAATGGTCTGCCTAATTGGCGATCTGCGGTGTATTATTACCGTCATGTCAGAAAAATGGGATTGTTGGAGGGAGCTATCATCTTGTTTATAATTATCAGTTTTGGACAGTATGCCGTAGGATGGGCAGCGTATTTGGAGAAGAGATACACAGCTGAACAAATATTGAGTTCGAGAGGCAAAAAACAATCAAAGAAGTCAGGCTTTGATACAGGGTTGGTGGAGATCCTACATCATTTACCAAAACCTAGCATTAAAGATACATTGCCGTTCCAAATACCACGAGGCGTGTGGTGGACAATCACAAGTATACCATATGCCATAATGGAATTAAAAAAGAGAAGAAAAGAGATGCAGGAAGAAAAGATCAGGCAGAAGAAGAAAGAGGAGGATGATCGTGTGAGGGCGGAGCGCGAGGCGGTGGCGGCGGAGGAGCGCGCACCACCTGCTAGAGACTGCTTGTTGGACAGTGCCCCTGGACCGGAGCCCCAGAAGGAAACCGCTGCTGTTACTCCGGCTCCTCCCATAATTTCTGGTGGTTTGTGGACCGACGACGACCTGGCGGAATTGGTTCGCCTGATCAAGAAGTATCCGCCGGGAGCGTCGGAGCGCTGGGAAAGAATAGCGGAGGCCATGGGTCGCAGCGTTCCTGAGGTCACGCATATGGCCGCGAAGGTCAAGGAGAACTGCTATAAGATACCTGGACAGGAGACGGCGGAGGAAGTGCCAGAACCACCTAAGAAGGTGAAGACTCGTCAGACGGAGGAGTCTTCCGGGGGTAACTGGTCGCAGGTTCAACAGAAGGCTTTGGAGACGGCGCTCGCTAAACATCCTAAGGGCACAGCTGGTGATCGCTGGCAGAAGATCGCCGCTGCCGTGCCAGGGAAAACTAAGGAGGAATGCATGCAAAGGTGTAAATATTTATCAGAGATGCTGAGGAAACAGAAACAAAAGGAGGAACAGAAGGATAAAGAGGCTGTGACAGAAGATGAGGTGGCCACGTGA

Protein sequence:

>DPOGS209584-PA
MRVHVLLFLSFSLGVYGWDDGDLEVFDVVEEVNSNFYELLGVKQDASQSEIKRAFKQLTLKLHPDKNDAPDADVQFRNLVSVHNILKDPGKREKYNEVLKNGLPNWRSAVYYYRHVRKMGLLEGAIILFIIISFGQYAVGWAAYLEKRYTAEQILSSRGKKQSKKSGFDTGLVEILHHLPKPSIKDTLPFQIPRGVWWTITSIPYAIMELKKRRKEMQEEKIRQKKKEEDDRVRAEREAVAAEERAPPARDCLLDSAPGPEPQKETAAVTPAPPIISGGLWTDDDLAELVRLIKKYPPGASERWERIAEAMGRSVPEVTHMAAKVKENCYKIPGQETAEEVPEPPKKVKTRQTEESSGGNWSQVQQKALETALAKHPKGTAGDRWQKIAAAVPGKTKEECMQRCKYLSEMLRKQKQKEEQKDKEAVTEDEVAT-