Monarch geneset OGS2.0

DPOGS201100
Transcript        DPOGS201100-TA   1773 bp
Protein           DPOGS201100-PA   590 aa
Genomic position  DPSCF300137 - 480570-487025
RNAseq coverage   303x (Rank: top 37%)
Annotation
Heliconius      HMEL022621        3e-78   60.00%
Bombyx          BGIBMGA013649-TA  2e-149  54.42%
Drosophila      stmA-PC           5e-150  51.49%
EBI UniRef50    UniRef50_Q8IGJ0   7e-148  51.49%  Protein EFR3 homolog cmp44E n=42 Tax=Coelomata RepID=EFR3_DROME
NCBI RefSeq     XP_001120727.1    6e-173  55.14%  PREDICTED: similar to conserved membrane protein at 44E CG8739-PA, isoform A [Apis mellifera]
NCBI nr blastp  gi|350420191      4e-173  56.24%  PREDICTED: protein EFR3 homolog cmp44E-like [Bombus impatiens]
NCBI nr blastx  gi|350420191      9e-171  56.24%  PREDICTED: protein EFR3 homolog cmp44E-like [Bombus impatiens]
Group
KEGG pathway     ame:412176  7e-08
                 K03125 (TFIID1, KAT4) maps-> Basal transcription factors
Orthology group  MCL11441  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201100-TA
ATGTGTCACAGTAACCACAACGACCAGGCGGTCAGAGACAACATCCGGCTCGCTGGGATACAGGGGCTCCAGGGCGTGATAAGGAAGACAGTGTCCGACGACCTCGTGGAGAACATCTGGGAGGCCCAGCACATGGACAAGATCGTACCCTCGCTGCTCTATAACATGCAGACAGCCGAGAAATACGAAACGGTCACCTGTATGGAGACGGACGCGAGGGACGGCTTGGAGGACGACCCGCCACGCCTCGCCGAGGCCTGTCTCAGGGAGCTGGTGGGCCGCGCCTCCTTCGGACACATACGGAGCGTGCTCAGACCCGTGCTCACTCATTTCGATCGTCACGAACTGTGGGTGCCGAACGACTTCGCCGTCCACACATTCAAAATCATCATGTTCTCAATTCAAGCCCAGTACTCGTATAGTGCTGTAGAGGCTCTGATGCAGCACTTGGACGCGGGGACCTGCGGAGACCCGGCCGCCAGGACCCGGGTCAGAGCCGCGAGGGCCGCGGTCCTTAGTAACATAGTCGCCATAGCAGCTGGAGATAGTGTTGGTCCATCAGTTTTGGAGATCATCAACAACCTCCTGACCAACCTGAGAACGTCTGTGGCGAGAGATTCAGAGAAGGAGTCGGACGAGAGGTTGTACCAGGAGGCTCTCATCAACGCGCTGGGGGAGTTCGCGGACCACCTGCCCGACTACCAGAAGATAGACATCATGATGTTCATAGTGAGCAAGATACCCACCACGCGCGGCAAGCCCGCCCGCGCCGACGTCATGCTGCAGAGCATCCTGCTCAAGTCCCTGCTCAAGGTGGGCACGACGTACAAGACGAGCGAGCTCAGCAAGGCCTTCCCCGCCGCCTTCCTGGAGGCGCTGCTGCGGCTGTCCGCGGCCGCCGGAGACTCGCCGCCGCCCGTCCTCTTGCAGCGGATACTGCACACGCTGCTCGACAGGAGGGGCAACGCGCACCTGCTCGCCGAGCCCACGGTGGAGTACGAGGCGCTGGGCCTGTCGGTGGGCAAGTGCTCGCGGCCCGACCTCATATTCATCAGCAAACACGGCTACGCCATATTTAACTCGCTGTACGAAGGGTTACAGCTGGAGTCCAACAACCTGGAGAACATCAGCGCCATCTACACCACGCTGGCGCTGCTGTTTGTGGAGCTGGCCTCGGACGAGACGGTGTGCGACATGCTGCAGCTCGTGCTGTCTATCCAGCAGTCCGCGCTGTCCAACCCGGTGCTGTCCGTGTGGCAGCAATGTTCGCTGCACGTCGTGTGCGCGTCCCTGGCGGCGCTGGTGTGTCACGTGATGATGCTGCCCGCCCTGCAGCACTACATCACTCAGATCGTCGACGCTCGCCGCGAGGAGGCGCCGCATCTGTTACCGCCGCTGAAGCAATACGACCAGCTGCCTCCCTCCAAGATGCCCAGCAAACTGCCCTACCTCATGATAGACCAGATGGCGTTGTCGGAGTGTTTGTCATCGTGCGGGGTGGAGGGTAGTCGACTGTCGAGTGGTGCTCGGTACGGCCCAGCTGTACACAGGCACTCCTGGGTCGAAGCTGGAGCTGCCCAGGGTAGAGACAGCTTGGCAGACATCTCAGCCGGGCCCACCACGGATCTGGACAGCGCTAACAGCTCTCCAGGTGTACAGAGGAGAATTCAGTACGACGATCTGGACGAGGAGTACAGACAGTTCATAGAGAAATATAACCACAACCACCGACCCACCGCACACGACTTCGGAACATACGTAACATACTACTAA

Protein sequence:

>DPOGS201100-PA
MCHSNHNDQAVRDNIRLAGIQGLQGVIRKTVSDDLVENIWEAQHMDKIVPSLLYNMQTAEKYETVTCMETDARDGLEDDPPRLAEACLRELVGRASFGHIRSVLRPVLTHFDRHELWVPNDFAVHTFKIIMFSIQAQYSYSAVEALMQHLDAGTCGDPAARTRVRAARAAVLSNIVAIAAGDSVGPSVLEIINNLLTNLRTSVARDSEKESDERLYQEALINALGEFADHLPDYQKIDIMMFIVSKIPTTRGKPARADVMLQSILLKSLLKVGTTYKTSELSKAFPAAFLEALLRLSAAAGDSPPPVLLQRILHTLLDRRGNAHLLAEPTVEYEALGLSVGKCSRPDLIFISKHGYAIFNSLYEGLQLESNNLENISAIYTTLALLFVELASDETVCDMLQLVLSIQQSALSNPVLSVWQQCSLHVVCASLAALVCHVMMLPALQHYITQIVDARREEAPHLLPPLKQYDQLPPSKMPSKLPYLMIDQMALSECLSSCGVEGSRLSSGARYGPAVHRHSWVEAGAAQGRDSLADISAGPTTDLDSANSSPGVQRRIQYDDLDEEYRQFIEKYNHNHRPTAHDFGTYVTYY-
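
Length check:

The transcript and protein lengths listed for this record agree with each other: 1773 bp is 591 codons, which is 590 residues plus one terminal stop codon (the nucleotide sequence ends in TAA, and the trailing "-" in the protein sequence appears to mark that stop). The sketch below shows one way to run the same sanity check on any coding sequence. The function name `check_cds` and the toy sequence are hypothetical and used only for illustration; the check assumes the standard genetic code and a CDS that includes its stop codon.

```python
# Standard-code stop codons (assumption: standard genetic code applies).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_cds(cds: str, protein_length: int) -> dict:
    """Basic consistency checks between a CDS and its stated protein length.

    Assumes the CDS includes the terminal stop codon, so the expected
    number of residues is (number of codons - 1).
    """
    cds = cds.upper()
    codons, remainder = divmod(len(cds), 3)
    return {
        "in_frame": remainder == 0,
        "starts_with_atg": cds.startswith("ATG"),
        "ends_with_stop": cds[-3:] in STOP_CODONS,
        "length_matches": codons - 1 == protein_length,
    }


# Toy example (not the real transcript): Met-Lys followed by a stop codon.
result = check_cds("ATGAAATAA", protein_length=2)
print(result)

# Length-only check using the figures listed in this record.
codons = 1773 // 3
print(codons, codons - 1)
```

To check the actual record, pass the full DPOGS201100-TA sequence as `cds` with `protein_length=590`.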