Monarch geneset OGS2.0

DPOGS203775
Transcript: DPOGS203775-TA, 819 bp
Protein: DPOGS203775-PA, 272 aa
Genomic position: DPSCF300010 + 694848-699357
RNAseq coverage: 582x (Rank: top 22%)
Annotation
  Heliconius       HMEL003745         2e-136   86.59%
  Bombyx           BGIBMGA013352-TA   1e-129   94.40%
  Drosophila       NfI-PB             9e-111   75.98%
  EBI UniRef50     UniRef50_D6W8F2    8e-119   87.29%   Nuclear factor 1 n=2 Tax=Tribolium castaneum RepID=D6W8F2_TRICA
  NCBI RefSeq      XP_971603.1        2e-120   87.71%   PREDICTED: similar to Nuclear factor I CG2380-PB [Tribolium castaneum]
  NCBI nr blastp   gi|91092246        3e-119   87.71%   PREDICTED: similar to Nuclear factor I CG2380-PB [Tribolium castaneum]
  NCBI nr blastx   gi|91092246        2e-116   88.84%   PREDICTED: similar to Nuclear factor I CG2380-PB [Tribolium castaneum]
Group
Gene Ontology
  GO:0005634   1.1e-150   nucleus
  GO:0006355   1.1e-150   regulation of transcription, DNA-dependent
  GO:0006260   1.1e-150   DNA replication
  GO:0003700   1.1e-150   sequence-specific DNA binding transcription factor activity
  GO:0005622   5.5e-32    intracellular
KEGG pathway 
InterPro domain
  [39-212]   IPR000647   1.1e-150   CTF transcription factor/nuclear factor 1
  [96-204]   IPR003619   5.5e-32    MAD homology 1, Dwarfin-type
  [35-75]    IPR019548   1.5e-25    CTF transcription factor/nuclear factor 1, N-terminal
Orthology group: MCL10840 (Multiple-copy universal gene)
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS203775-TA
ATGTCACCAAGTGTATTAACCTTCCCTGGTCAAGTCACTGGCATGTTCTTCGTCGAGAGCTGCCAGTGCCACGTGTCAGATCCTGGCAGCCCAGCAGTGTTCTGTCTAGCCAATGACGAGTTCCATCCGTTCATAGAGGCCCTGATGCCTTTTGTAAAGTCCTTTTCGTACACGTGGTTCAATTTGCAAGCGGCCAAAAGGAAATATTACAAAAAGCACGAGAAACGGATGAGTCTAGAGGAGGAGCGACACACTAAATATGAACTACAGAACGAGAAAGCAGAAGTGAAACATAAATGGGCATCGCGGTTATTAGGGAAGTTGAGGAAAGATATCACACAGGACTGTAGGGAGGACTTCGTGTTGAGCATCACCGGGAAAAGACCAGCCGTCTGCGTGCTCTCCAACCCTGACCAGAAGGGGAAGATGAGGAGGATAGACTGTCTGAGACAAGCTGATAAGGTGTGGCGATTAGATTTAGTGATGGTTATATTGTTCAAAGCGATACCCCTAGAGAGTACGGACGGTGAACGTTTAGAGAAACACCCCGAGTGCACCCAGCCGGGGCTGTGCGTGAACCCCTACCACATCAACGTGTCAGTGAGAGAACTCGACCTGTACCTGGCGAATTTCATCAACAGCTATGATATACTTAGCGGATCTCTCTCGCCTCACCCGAATAGGGACAAAGAGAATGAACACGAGGCTAAGAACAAAGGGTATACCCATAATCCATACAACGGTGTTATTTGCAACGATATCATTTTAGCGACCGGTGTCTTCTCTTCGAAGGAGCTTTGGAAGCTGTCTAAAGGTTAG

Protein sequence:

>DPOGS203775-PA
MSPSVLTFPGQVTGMFFVESCQCHVSDPGSPAVFCLANDEFHPFIEALMPFVKSFSYTWFNLQAAKRKYYKKHEKRMSLEEERHTKYELQNEKAEVKHKWASRLLGKLRKDITQDCREDFVLSITGKRPAVCVLSNPDQKGKMRRIDCLRQADKVWRLDLVMVILFKAIPLESTDGERLEKHPECTQPGLCVNPYHINVSVRELDLYLANFINSYDILSGSLSPHPNRDKENEHEAKNKGYTHNPYNGVICNDIILATGVFSSKELWKLSKG-
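
The two sequences above are consistent with each other: the 819 bp transcript is 273 codons, which gives the 272 residues of the protein plus a terminal stop, written here as "-". A minimal Python sketch of that check follows. It assumes the standard genetic code and that the transcript record is coding sequence only; the `translate` helper and its `stop_symbol` parameter are illustrative names, not part of the geneset's tooling.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order
# at each of the three positions.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon (hypothetical helper).

    Stop codons are emitted as `stop_symbol`, matching the "-" that ends
    the protein record above. Codons with non-ACGT letters raise KeyError.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    residues = []
    for pos in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[pos:pos + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First seven codons and last three codons of DPOGS203775-TA.
print(translate("ATGTCACCAAGTGTATTAACC"))  # MSPSVLT
print(translate("AAAGGTTAG"))              # KG-
```

Passing the full DPOGS203775-TA sequence to `translate` and comparing the result with the DPOGS203775-PA record is the complete version of the same check.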