Monarch geneset OGS2.0

DPOGS210296
Transcript: DPOGS210296-TA; 1140 bp
Protein: DPOGS210296-PA; 379 aa
Genomic position: DPSCF300551 + 1776-7173
RNAseq coverage: 129x (Rank: top 56%)
Annotation
Heliconius: HMEL005543; 9e-65; 89.40%
Bombyx: BGIBMGA004227-TA; 2e-175; 83.66%
Drosophila: Fen1-PA; 5e-157; 68.91%
EBI UniRef50: UniRef50_Q7K7A9; 7e-155; 68.91%; Flap endonuclease 1 n=38 Tax=Eukaryota RepID=FEN1_DROME
NCBI RefSeq: XP_001651504.1; 1e-169; 74.54%; flap endonuclease-1 [Aedes aegypti]
NCBI nr blastp: gi|320202935; 7e-172; 83.38%; flap endonuclease-1 [Bombyx mori]
NCBI nr blastx: gi|320202935; 1e-177; 82.46%; flap endonuclease-1 [Bombyx mori]
Group
Gene Ontology:
  GO:0006281; 1.4e-205; DNA repair
  GO:0004518; 1.4e-205; nuclease activity
  GO:0003677; 3e-42; DNA binding
  GO:0003824; 3e-42; catalytic activity
KEGG pathway: aag:AaeL_AAEL005870; 3e-169
  K04799 (FEN1, RAD2) maps->
    Base excision repair
    DNA replication
    Non-homologous end-joining
InterPro domain:
  [1-376] IPR006084; 1.4e-205; DNA repair protein (XPGC)/yeast Rad
  [1-108] IPR006085; 2.8e-45; XPG N-terminal
  [219-358] IPR020045; 3e-42; 5'-3' exonuclease, C-terminal subdomain
  [147-219] IPR006086; 5e-33; XPG/RAD2 endonuclease
  [221-254] IPR008918; 1.1e-12; Helix-hairpin-helix motif, class 2
Orthology group: MCL12447; Single-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210296-TA
ATGGGTATTTTAGGATTATCAAAGTTGATTGCAGATATTGCTCCAATGGCTGTAAAAGAAACAGAGATAAAAATTATTTCGGTTGGTAGGAAAGTTGCCATCGATGCATCTATGAGCTTGTATCAATTCTTAATTGCTGTAAGAAGTCAAGGCGCTCAGCTGACGTCCGTTGATGGTGAAACAACATCACACCTAATGGGTACATTCTACAGAACGATTCGTCTCATAGAAGATGGTATCAAGCCTGTGTATGTCTTTGATGGTAAACCGCCTGATATGAAGTCACATCAATTGAACAAGAGGGCCGAGAGACGAGAGGAAGCTGAGAAAGAACTCCAGAAGGCTACCGAGGCTGGTGATACGGCATCTATTGACAAGTTCAACCGTCGGTTGGTGAAGGTGACTCAGCAACACGGTGCCGAAGCTCGGCAGTTGTTGAAGCTTATGGGGATACCCGTGGTGGAGGCTCCGTGTGAAGCTGAGGCACAATGCGCTGAATTAGTCAAAGGTGGTAAGGTGTATGCTGTAGCCACTGAGGATATGGATGCTTTGACCTTCGGAGCGAACGTGCTGTTGAGGCACCTCACCTTCTCCGAGGCGAGGAAGATGCCAGTACAGGAGTTCCACCTGGACCAGGTGCTGAGAGGATTGGAATTGGAACAGACAGAGTTCATTGACCTCTGCATTCTGTTGGGTTGTGATTACTGCGGCTCCATCAAAGGGATCGGACCGAAACGGGCCATCGAACTCATCAAGCAACACCGCAGTATAGAACAGGTCCTTCACAATATCGACACAAAGAAGTACAGTCCGCCGGAGAATTGGGAATATGAAAACGCTCGGAGACTGTTCCAGCAACCAGAAGTTACCGAGGCGAAGGATGTCGAGTTAAAATGGTCGGATCCTGACGAGGAAGGTCTGGTGAAGTTCCTCTGTGGAGACAAACAGTTCAACGAGGAGCGCGTCAGGAACGGGGCCAAGAAACTCATGAAGGCGCGCACCGGAACCACGCAGGGCAGGCTGGATGGATTCTTCAAGGTGTTATCAACAACACCAAACCCAAAAAGGAAAGCGGAGGAAGATAAAAAGAGTGCCAACAAGAAAGTTAAAACAGCTGGAAGGGGGCGGAAACCGAAATAA

Protein sequence:

>DPOGS210296-PA
MGILGLSKLIADIAPMAVKETEIKIISVGRKVAIDASMSLYQFLIAVRSQGAQLTSVDGETTSHLMGTFYRTIRLIEDGIKPVYVFDGKPPDMKSHQLNKRAERREEAEKELQKATEAGDTASIDKFNRRLVKVTQQHGAEARQLLKLMGIPVVEAPCEAEAQCAELVKGGKVYAVATEDMDALTFGANVLLRHLTFSEARKMPVQEFHLDQVLRGLELEQTEFIDLCILLGCDYCGSIKGIGPKRAIELIKQHRSIEQVLHNIDTKKYSPPENWEYENARRLFQQPEVTEAKDVELKWSDPDEEGLVKFLCGDKQFNEERVRNGAKKLMKARTGTTQGRLDGFFKVLSTTPNPKRKAEEDKKSANKKVKTAGRGRKPK-
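
The transcript and protein entries above can be cross-checked against each other. The following is a minimal sketch, not part of the original record. It assumes the standard genetic code, and it assumes that the trailing "-" in the protein sequence marks the stop codon. The names translate and expected_cds_length are hypothetical helpers introduced here for illustration only.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon.

    Stop codons are written as stop_symbol. The default "-" is an assumption
    chosen to match how this record appears to mark the stop.
    Raises ValueError if the length is not a multiple of 3, and KeyError
    for codons containing characters other than A, C, G, T.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


def expected_cds_length(protein_length):
    """Nucleotides needed for protein_length residues plus one stop codon."""
    return 3 * (protein_length + 1)


# First eight codons and last four codons of DPOGS210296-TA, copied from the record.
print(translate("ATGGGTATTTTAGGATTATCAAAG"))  # MGILGLSK
print(translate("AAACCGAAATAA"))              # KPK-
print(expected_cds_length(379))               # 1140
```

Under these assumptions, the first and last codons of DPOGS210296-TA translate to the start (MGILGLSK) and end (KPK-) of DPOGS210296-PA, and 379 residues plus one stop codon account for the listed 1140 bp. Passing the full nucleotide sequence to translate would extend the same check to the whole protein.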