Monarch geneset OGS2.0

DPOGS200154
Transcript: DPOGS200154-TA (1122 bp)
Protein: DPOGS200154-PA (373 aa)
Genomic position: DPSCF300128 - 47570-52445
RNAseq coverage: 600x (Rank: top 21%)
Annotation
Heliconius: HMEL014417  5e-135  70.42%
Bombyx: BGIBMGA001621-TA  3e-96  68.75%
Drosophila: Kr-h1-PC  1e-16  26.60%
EBI UniRef50: UniRef50_D3ZYA2  1e-19  31.40%  RCG46571 n=4 Tax=Eutheria RepID=D3ZYA2_RAT
NCBI RefSeq: XP_001947159.1  6e-19  30.37%  PREDICTED: similar to Zinc finger protein 271 (Zinc finger protein 7) (HZF7) (Zinc finger protein ZNFphex133) (Epstein-Barr virus-induced zinc finger protein) (ZNF-EB) (CT-ZFP48) (Zinc finger protein dp) (ZNF-dp) [Acyrthosiphon pisum]
NCBI nr blastp: gi|73672453  2e-20  31.13%  zinc finger transcription factor KRAB-E2S [synthetic construct]
NCBI nr blastx: gi|327288769  3e-24  28.42%  PREDICTED: zinc finger protein 91-like [Anolis carolinensis]
Group
KEGG pathway 
Orthology group: MCL26455 (Lepidoptera specific)
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS200154-TA
ATGCCTTCGAACGAGCCCATCACCCTCGACAGTGACGACGAACCAGATTTGGGTTCCCCGAGTCAGGCTTCGGATTCCCTGAGTATGGATGAAAATTTAAGAGCGTTGCGTCACATACACGAGAAATACGCGAATATACATAAGGACGACCTGTACGAAGCCCATCTCCTTGCTGCGAATTTCAAACAACCAGATGCGAGGGGTGACGACGACGTGCTCTCAGTAGCTAGCTCCGATGGTAGCGCGTCAGTAAAAGAATTAGGTACAAGAGTAGAAAAATTGGAGGATAGTTCGAATTCCGAGTCAGGCAGTAGCAGTTCGGAGTCAGATAGTAGTAGTTCATCGTCGTCTAGCTCGTCGGACAGCTCACACTCGTCGGACTCGTCGGATTCTGAGGAAGACGAGAAGCCAAACAGCAGCGTCCATGATGACAGATCCTTCGGCGCCTGGTCCCACACGCCGGAGTCGGATTCGGATCCGGAGCGCGGGGCCCCGGTCGCTCGGACCGATGACGACATCAGCGAATCGGATAACGAGTCAGCTGATAAGAGCTTCCCATGTAGGGTCTGCGGGAAGTGGTATTCGACCAGGGTCACGCTCAAGATCCACGCGCGCGTTCATCAGAACAAGGGCGGCGGCGGCTCCAGGTCCAGGGCGCGTTCCTCGGACAGATACGAGTGTGACTGTTGCAGCGAGACGTTCAGCAGGAGAGAGAAGCTTTGGGAGCATAAGGCTGAAGCCCACCGCGGCGCTATGACTGTCCGATGCGAGGTTTGTCGTCGGTGCTTCGAGGACGACAACGAGCTGGCGGCGCACGCCACCACACACACTAGTGATGACAGGATAGGTCGTTGCTCGGACTGCGGCTCGTCGTTCGCCAGATACGACCAGCTGCGCCGCCACCGCGCCTCCGTCCACGGCTCGGCGCCCGCCCGCCTGCCGCACGCGTGCGTCCAGTGCGGCAAGAGATTCTCACACGCGCACTCCCTCACCAGACACGCGCACAACCACGCCAAGCAACTGTACAGATGCGTGGTTTGTAAGGCATCCTTCGCCCGCGCGGACCAACTCGCCCAGCACCTGAACAGCCACCTCGCCACCTACAAACGTATGAAGCAGTGA

Protein sequence:

>DPOGS200154-PA
MPSNEPITLDSDDEPDLGSPSQASDSLSMDENLRALRHIHEKYANIHKDDLYEAHLLAANFKQPDARGDDDVLSVASSDGSASVKELGTRVEKLEDSSNSESGSSSSESDSSSSSSSSSSDSSHSSDSSDSEEDEKPNSSVHDDRSFGAWSHTPESDSDPERGAPVARTDDDISESDNESADKSFPCRVCGKWYSTRVTLKIHARVHQNKGGGGSRSRARSSDRYECDCCSETFSRREKLWEHKAEAHRGAMTVRCEVCRRCFEDDNELAAHATTHTSDDRIGRCSDCGSSFARYDQLRRHRASVHGSAPARLPHACVQCGKRFSHAHSLTRHAHNHAKQLYRCVVCKASFARADQLAQHLNSHLATYKRMKQ-
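
Consistency check (sketch):

The stated lengths agree with each other: a 1122 bp coding sequence holds 374 codons, which is 373 amino acids plus one stop codon, shown as "-" at the end of the protein record. The minimal Python sketch below illustrates that check, assuming the standard genetic code applies to this transcript. The names `translate` and `CODON_TABLE` are hypothetical helpers and are not part of the record. The two fragments are copied from the start and end of the nucleotide sequence above.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds, stop_symbol="-"):
    """Translate an in-frame coding sequence; stops are written as stop_symbol."""
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("CDS length is not a multiple of 3")
    protein = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        protein.append(stop_symbol if aa == "*" else aa)
    return "".join(protein)


# First 30 nt and last 21 nt of DPOGS200154-TA, copied from the record.
head = "ATGCCTTCGAACGAGCCCATCACCCTCGAC"
tail = "TACAAACGTATGAAGCAGTGA"

print(translate(head))  # MPSNEPITLD
print(translate(tail))  # YKRMKQ-

# Length arithmetic from the record: 1122 bp -> 374 codons -> 373 aa + stop.
transcript_bp = 1122
print(transcript_bp // 3 - 1)  # 373
```

To check the whole record, the full nucleotide string can be passed to `translate` and the result compared with the protein sequence.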