Monarch geneset OGS2.0

DPOGS202124
Transcript: DPOGS202124-TA; 3570 bp
Protein: DPOGS202124-PA; 1189 aa
Genomic position: DPSCF300150 + 399539-407178
RNAseq coverage: 218x (Rank: top 45%)
Annotation
Heliconius: HMEL007759; 0.0; 90.62%
Bombyx: BGIBMGA006967-TA; 0.0; 86.70%
Drosophila: Taf2-PA; 0.0; 68.98%
EBI UniRef50: UniRef50_Q24325; 0.0; 68.98%; Transcription initiation factor TFIID subunit 2 n=42 Tax=Eukaryota RepID=TAF2_DROME
NCBI RefSeq: XP_393397.3; 0.0; 69.36%; PREDICTED: similar to TBP-associated factor 2 CG6711-PA [Apis mellifera]
NCBI nr blastp: gi|270013200; 0.0; 69.34%; hypothetical protein TcasGA2_TC011774 [Tribolium castaneum]
NCBI nr blastx: gi|270013200; 0.0; 68.04%; hypothetical protein TcasGA2_TC011774 [Tribolium castaneum]
Group
Gene Ontology: GO:0008237; 5.7e-19; metallopeptidase activity
  GO:0008270; 5.7e-19; zinc ion binding
  GO:0005488; 4.6e-07; binding
KEGG pathway: ame:409906; 0.0
  K03128 (TFIID2, TAF2) maps-> Basal transcription factors
InterPro domain: [14-383] IPR014782; 5.7e-19; Peptidase M1, membrane alanine aminopeptidase, N-terminal
  [639-1009] IPR016024; 4.6e-07; Armadillo-type fold
Orthology group: MCL12486; Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS202124-TA
ATGAAAAAAGAACGTACTGGCGACAATTGTCGCCCATTTAAATTAGCCCATCAAATTTTGAGCTTAACAGGAATAAGCTTTGAAAGAAGAAGTGTGATAGGTTTTGTTGAGTTAACAATAGTACCCCTAAAGGATAACTTAAGGTATATTCGCCTAAATGCCAAGCAATGTCGTATATATCGTGTGTGCCTAAACGACCAGTATGAAGCCAACTTTCAGTATTTTGATCCTTTCCTTGATATTTGTCAAAGTGATGCCAACACGAGATCCCTCGAGGTGTTTTCTCAAAACCATTTATCAGCTGCACAAAAGACAGATCCTGATCACAATTCTGGTGAACTCCACATACAAGTTCCAGATGATGCTGCCCACTTAGTCGGTGAAGGAAGGGGTTTGAGGATTGGCATTGAATTCTCTCTTGAATCCCCACAGGGCGGGATGCACTTTGTGGTTCCAGAAGGAGAGGGAACTATGGTTGAGAAATCAGCACATATGTTTACATACGGCCATTCAGCACGACTCTGGTTTCCTTGTGTGGATAGCTTTGCTGAACCTTGCACTTGGAAGCTTGAGTTTACTGTAGATGAGACATTCACAGCTGTGTCTTGTGGGGAGTTATTAGATGTAGTGTACACACCTGATCATAGACGGAAGACATTCCACTATGTTGTTAATACTCCAGCCTGTGCTCCGAATATTGCGCTGGCCATTGGGCCATTCGATACCTATGTTGATCCACATATGAATGAAGTGACACATTACTGTTTGCCTCATCTCCTACAAATTCTCAAGAACACTGTGAGATATTTGCATGAGGCCTTTGAATTTTATGAAGAAACACTATCTACAAGATATCCTTACCCGTGTTATAAGCAAGTGTTTGTTGATGAGACGGAAGATGATGCAACGGCATACACAACAATGTCAATACTTAGCACGCATCTTCTTCATTCAATTGCCATCATAGATCAAACATACATCAGTAGAAAGGCCATGGCCCAAGCTGTGGCTGAACAATTCTTTGGCTGTTTTATAACTATGCAGAACTGGTCCGATCTGTGGCTTGCCAAGGGTATACCTGATTACTTGTGTGGTCTTTACTCTAAGAAGTGCTTTGGTAATAATGAGTACAGATATTGGATTCAACAGGAATTACAAGAGGTGGTGAGTTACGAAGAGCATTATGGTGGTATAGTCCTCGATCCATGGCAGCCGCCAGCAAGCGGAGCTCGTGTTGAACCCAAGGACGTTTTCTATTTCCCTGTCAGAAATGTACACACCATGTCCCCTAGATATATCGAGGTAATGCGAAAGAAATCCCATCTGGTATTGCGGATGTTAGAACAACGAATAGGCCAAGAGCTGTTGTTACAAGTATTCAATAAACAGCTTTCGTTAGCAACAAACGCAGCAAACACAAAAATCGGTAGCGGTTTGTGGGGACATCTGCTTTTATCGACAAATTTGTTTGTCAAAGCTATATTTACTGTGACTGGCAAAGATATGGCCGTGTTTGTAGATCAGTGGGTTAGAACGGGCGGGCATGCTAAGTTTCAATTGACTTCCGTTTTCAACAGGAAAAGAAATACAGTTGAATTGGAAATTCGTCAAGACAGCGTTCATGAGCGTGGGATCAGGAAGTATGTGGGGCCTCTCTTAGTCCAACTACAAGAATTAGATGGAACTTTCAAACATACTTTGCAAATAGAAAATACTGTTGTAAAAGCGGATATCACGTGCCACAGTAAGAGTAGGAGGAATAAAAAGAAGAAAATTCCATTATGCACTGGAGAGGAAGTTGACATGGATTTATCTGCTATGGATGACTCACCAGTACTATGGATTCGGCTGGACCCAGAGATGTCCCTCTTACGAAGTACAGTGATATCCCAACCGGATTACCAATGGCAGTACCAATTACGTCACGAACGTGACGTCACAGCTCAAAGCGAGGCTATAGACGCGCTCCACAACTACCCCGAACCAGCTACCAGGAAGGC
CTTGACGGATATCATAGAGAATGAACAAACACATTATAAAATCCGATGCCGGGCCGCGCACTGTTTGACTAAGGTTGCTAATGCCATGATAAGCTCGTGGGCGGGACCGCCGGCTATGTTGACGATATTCAGGAAAATGTTCGGATCATTCGCCGCACCGCACATCATCAAACAAAATAACTTCGATAATCTACAACATTACTTTTTGCAGAAAACTATACCTGTTGCTATGGCCGGTTTGAGAAATATCCATGGTATATGTCCACCGGAAGTCGTAAGATTTCTATTGGATCTGTTCAAATATAACGATAATTCAAAGAACCACTTCTCTGACAACTATTACAAAGCTGCTCTAGTCGATGCGCTGGCTGCAACCATAACTCCCGTCATATCTGTTTTACAACCCGGTGCTCCAATAACCGCGGAATCGTTATCAGCAGACACGCGTTTGGTCCTCGAAGAGATAACGCGCGTGTTGAATCTGGAGAAGGTGCTGCCGTGCTACAAAAACACGGTGACAGTCAGTTGTCTACGAGCTATCAGACGTTTACAGCAGTGTGGTCACTTGCCCAGTATACCCACAGTGTTTAGAGCGTACGCGCAATACGGGCAGTATATAGATGTGCGTTTAGCAGCGTTTGAGGGTCTAGTGGACTTCGTACGAGTGGACGGCAAGCCAGAGGACCTGTCATATCTGTTGACCGCTATAGAGAACGACCCTGACCCTGGCGTGAGGCATGGTCTGGCGCGACTCATGGTCTCAATGCCGCCCTTCGAGAGAGCTCAGAGACATAGACTGGATACGGAATCCGTCGTTCATAGATTATGGAACAACATAAATAGTCAATTATCAAATGATGCAAGGTTGAGATGCGATCTCGTCGACTTGTATTACACGCTGTATGGCCTCAAACGACCTATATGCGTGCCCTTGCCCGAAATTCAGGCCATGATGAAACAGATGCATCACAAGGAAAGAGAAAGACTTGACAGAGAAAGAGAACGAGCAGAGAGGGGGAGGGAGAAGGAGAAAATAAGAGAAATGGACATAAAACCGGTTATCAAACAAGAAATTGAGGATATACCTGTGAAAGATGAGTTAGATATGGGTTTAGATGAGACGATGCAAGTAAGTGAGTTACCAGTACCGGTGTCAAGCATTAAGGAGGAAGATGACAAAATTGATGTCACAACAGTCCATGAGTTGCCGATCAGGGTTTACAGTGACGATTCCAAACGCGAGTTTTCATCAGATAATGCGGTTCCTCTTCCCGGTATTCCGGGCTCGTGTGGTCCTGTGGGCTTCGAGCCGGGAATGTTTAAACTGGAGAGAGATGACCCAGCTGCACCGAAGGCCAAAAAGAAGAAGAAGGAGAAAAAGAAGCACAAACACAAGCACAAGCACAAACACAGCAAAGAAAAAAGCAAAGACAAAGACAAGCTGCCGCGTCCTCCCTCCACAGACACGCTTCGTATTAAAGAAGAGACGAGGGAAACGCTGAGTTCATTCAGCTCAAGTCAGAGCCCCTCGGAAGATATATCATCAATGCCATCTAATATGAGTTTCTAA

Protein sequence:

>DPOGS202124-PA
MKKERTGDNCRPFKLAHQILSLTGISFERRSVIGFVELTIVPLKDNLRYIRLNAKQCRIYRVCLNDQYEANFQYFDPFLDICQSDANTRSLEVFSQNHLSAAQKTDPDHNSGELHIQVPDDAAHLVGEGRGLRIGIEFSLESPQGGMHFVVPEGEGTMVEKSAHMFTYGHSARLWFPCVDSFAEPCTWKLEFTVDETFTAVSCGELLDVVYTPDHRRKTFHYVVNTPACAPNIALAIGPFDTYVDPHMNEVTHYCLPHLLQILKNTVRYLHEAFEFYEETLSTRYPYPCYKQVFVDETEDDATAYTTMSILSTHLLHSIAIIDQTYISRKAMAQAVAEQFFGCFITMQNWSDLWLAKGIPDYLCGLYSKKCFGNNEYRYWIQQELQEVVSYEEHYGGIVLDPWQPPASGARVEPKDVFYFPVRNVHTMSPRYIEVMRKKSHLVLRMLEQRIGQELLLQVFNKQLSLATNAANTKIGSGLWGHLLLSTNLFVKAIFTVTGKDMAVFVDQWVRTGGHAKFQLTSVFNRKRNTVELEIRQDSVHERGIRKYVGPLLVQLQELDGTFKHTLQIENTVVKADITCHSKSRRNKKKKIPLCTGEEVDMDLSAMDDSPVLWIRLDPEMSLLRSTVISQPDYQWQYQLRHERDVTAQSEAIDALHNYPEPATRKALTDIIENEQTHYKIRCRAAHCLTKVANAMISSWAGPPAMLTIFRKMFGSFAAPHIIKQNNFDNLQHYFLQKTIPVAMAGLRNIHGICPPEVVRFLLDLFKYNDNSKNHFSDNYYKAALVDALAATITPVISVLQPGAPITAESLSADTRLVLEEITRVLNLEKVLPCYKNTVTVSCLRAIRRLQQCGHLPSIPTVFRAYAQYGQYIDVRLAAFEGLVDFVRVDGKPEDLSYLLTAIENDPDPGVRHGLARLMVSMPPFERAQRHRLDTESVVHRLWNNINSQLSNDARLRCDLVDLYYTLYGLKRPICVPLPEIQAMMKQMHHKERERLDRERERAERGREKEKIREMDIKPVIKQEIEDIPVKDELDMGLDETMQVSELPVPVSSIKEEDDKIDVTTVHELPIRVYSDDSKREFSSDNAVPLPGIPGSCGPVGFEPGMFKLERDDPAAPKAKKKKKEKKKHKHKHKHKHSKEKSKDKDKLPRPPSTDTLRIKEETRETLSSFSSSQSPSEDISSMPSNMSF-
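
The transcript length (3570 bp) and protein length (1189 aa) listed in this record agree with each other if the transcript is read as a coding sequence from the ATG start to the terminal TAA stop, with no untranslated regions. That reading is an assumption based on the sequence shown here. The sketch below works through the arithmetic; the function name `expected_protein_length` is hypothetical and not part of any database tool.

```python
def expected_protein_length(cds_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by a coding sequence.

    Assumes cds_bp counts only coding nucleotides (no UTRs), which is an
    assumption about this record rather than something it states.
    """
    if cds_bp % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = cds_bp // 3
    # The stop codon occupies one codon but contributes no amino acid.
    return codons - 1 if includes_stop else codons


# 3570 bp -> 1190 codons -> 1189 amino acids plus one stop codon
print(expected_protein_length(3570))  # prints 1189
```

The trailing `-` at the end of the protein sequence marks the stop codon and is not counted among the 1189 residues.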