Monarch geneset OGS2.0

DPOGS213697
Transcript: DPOGS213697-TA; 3477 bp
Protein: DPOGS213697-PA; 1158 aa
Genomic position: DPSCF300219 + 385907-397848
RNAseq coverage: 468x (Rank: top 27%)
Annotation
Heliconius: HMEL016428; 0.0; 70.20%
Bombyx: BGIBMGA010670-TA; 0.0; 73.04%
Drosophila: CG3542-PA; 6e-179; 49.74%
EBI UniRef50: UniRef50_E2BU72; 0.0; 46.84%; CDK5 regulatory subunit-associated protein 1 n=13 Tax=Eumetazoa RepID=E2BU72_HARSA
NCBI RefSeq: XP_001810113.1; 0.0; 61.90%; PREDICTED: similar to U1 small nuclear ribonucleoprotein, putative [Tribolium castaneum]
NCBI nr blastp: gi|270009175; 0.0; 61.90%; hypothetical protein TcasGA2_TC015831 [Tribolium castaneum]
NCBI nr blastx: gi|189238624; 0.0; 54.85%; PREDICTED: similar to U1 small nuclear ribonucleoprotein, putative [Tribolium castaneum]
Group
Gene Ontology: GO:0016020; 9.1e-34; membrane
 GO:0016192; 9.1e-34; vesicle-mediated transport
 GO:0005515; 4.1e-14; protein binding
KEGG pathway: tca:658510; 0.0
 K12821 (PRPF40, PRP40) maps-> Spliceosome
InterPro domain: [100-300] IPR010989; 9.1e-34; t-SNARE
 [567-628] IPR002713; 6e-18; FF domain
 [249-308] IPR000727; 4.1e-14; Target SNARE coiled-coil domain
 [411-438] IPR001202; 8.9e-13; WW/Rsp5/WWP
Orthology group: MCL11493; Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS213697-TA
ATGCTCCCTCGTCGTCGAAACGTCGGAGTGTCGGATAAGACACCTCTTTTACAAGAAGAATTGATACCCTACGATAAACTCAACAAAAACGGACACATTTACCAGCCAACATTTTCTGATAAGCAGCAAGAGAACTACTCGGTACCAGATACTTCGTTCGACTTTCTTGAGGAATTTGTATTTGAAGCAGTGATGGCTGCTAGAGATAGGACTCAAGAATTTGCGTCAACAGTTAGAAGTCTCCAAGGTCGAACGTTTGCTCGGCCTATAATAAAAGATGAAAAAAAAGCTGCAATGCTCGCAACATATTCTCAATTTATGAGTATGGCAAAAGTTATAAGTAAAAACATAACAAGTACCTACACCAAACTTGAAAAGCTTGCCTTGTTGGCAAAAAAGAAGTCTCTATTCGACGATCGGCCTATGGAAATCCATGAGCTAACATATATAGGAGAAATGCCTAGAGGACGGAGGAGCATGCATAGCCATTCCTCTAGTGTAGTCCTAGCACTACAATCAAGACTTGCATCTATGAGTAACCAGTTTAAACAGGTACTTGAAGTAAGGTCTGAAAATCTTAAGCATCAAAATAATAGACGCACACAATTTTCTGCATCTGCTCCAGTGGTCAAAGAAGTTCCATCTTTATTGCAACCAGATGAAGTTAGTATAGATTTAGGGGACACCTCTCCTCTTCAAAGCCAACAATTAGCACTAAGGGATGATACGGATTCCTATGTGCAACAAAGAGCGGAGACTATGCATAACATTGAGAGTACTATTGTAGAGTTGGGTGGGATTTTCCAACAATTGGCTCACATGGTCAAAGAACAGGATGAGGCTATAGGCAGAATAGATGCCAACATCCAAGAAGCTGAAATGAATGTTGAAGCTGGCCATAGAGAAATAATGAAATATTTCCAAAATATAACAGGAAATAGAGCACTCATGTTTAAAGATACTCCTGGCTCTGGTACTAGTTCACCAGGCCTTATGAGCACTGGGCCTTTGCTCCCGCCGCCGATGCTCGGCGGATTGCCGCCTCCCATGCCGCCCGCGGTAGCGATGCCGCCGGTTCCCGGCATGCCACCGAATATGCCACTTCCTCCGCCGATGGGTTTTCCACCTATGATGGCACCATTTTCAATGCCTCCACCAGGATTTCCACCTTTTAAACCAGAGTTAAGTGCACCAGCACCAGAATTATCGCCCATGGTGAATCAGAATTCCCCATGGACTGAACACAAAGCACCAGATGGACGTACTTACTATTACAATTCCATAACAAAACAGAGTCTTTGGGAAAAACCAGATGATTTGAAAACCCCCGCTGAAAAACTTTTGTCGTCGTGTGTATGGAAAGAATATACCACAGATGCTGGCCGAGTCTACTATCATAATATTGAAACAAAGGAATCTAGTTGGGTTATCCCAAAGGAACTTCAAGAGATAAAAGACAAAATAGCCGCTGAAGAAGCAGCACATGCTATAATGAATGCCGAGGTACCGCCAGGTGAAGTACCTCTGCCAATGTCACCTGCCATTAACAGTACATCCGCTTTAGATGAAGCTATGGCCAAGACACTAGCATCTATAGACCCCGGTCTTACGACATCTATACCTATCCCAGAGGAAATAAAGCCTGAGGAAATAGCCGCCCCACCACAACCAAATGGAGCTGATGCTGAACCAACTCCTGAAACACTGTACAAAGACAAGAAGGAAGCCATAGAGGCCTTCAAGGAGCTGCTCAAAGAAAGGAATGTACCTTCCAATGCTACATGGGAGCAATGCGTCAAGATTATATCAAAAGATCCGAGATATGTCACGTTCAAGAAATTAAATGAAAAGAAACAGGCGTTCAATGCCTACAAGACACAGAAACTTAAAGACGAGAGGGAGGAGCAGAGATTAAAAACAAAAAAGAATCGGGAGAATTTGGAGGAGTTCTTATTGAGTTGCGATCGTGTGACGTCACTCACTAAGTATTATAAGTGCGA
GGAAATGTTTAATAATCTAGAGATATGGCGATGTGTTCCTGACTCGGACAGAAGAGACATTTATGAAGACTGCATCTTCACGATAGCGAAACGCGAAAAAGAGGAGGCCAAGGCATTGAAGAAACGGAATATGAAAATTTTATCTCAAGTCTTAGAAAACATGAATGAAATAAGTTACAACAGTACTTGGAGTGAAGCTCAAGTATTACTACTCGAGAATGCTGCTTTTAAGAATGACGTCAGTCTACTGGGTATGGACAAAGAGGACGCTTTAATAGAATATTATATGATGAAGCGATCTACCCACATTATACTTGACATCATATTATTTACCAATACCGACCAAGTTTTAACTTGTTCACAACCAGAATTTATTAAAATAAAACTTATATTTTGGGCACTATTGGACGGTCTGCACGAAGAAGGGAAGTTGACTTCTATGTCTCTGTGGGTCGAACTGTATCCAGTCATATCAGCTGATACACGATTTTCGGCTATGCTCGGACAAAACGGCTCAACACCTCTGGACCTGTTCAAGTTCTATGTGGAGAACCTCAAGGCGAGGTTCCACGACGAGAAGAAAATCATCAAGGAGATCCTGAAAGAGAAAGAGTTTGAAGTCAAGCCGACCACCACCTTCGAGGAATTCGCCACCGTCGTGTGCGAGGACAGCAAGTCGGCGTCGCTGGACGCCGGGAACGTTAAGCTCACGTACAATTCGTTACTTGAAAAGGCCGAAGCGAAGAACAAGGAGAAATTGAAAGAGGAATCAAAAGCTCAGAAGAAAATCGAGAGCGCGTTCAAGTGGTTGTTAAGCGAGGCTCGACTGGACCCGGCGCTGTCGTGGGCCGAGGCGAAGGAGAAGATAGATCTGAACGCACCAGAGATAGTGGCCGTGCAGGACGAGAACGAACGTGAGCGGATATACATGGACTATCAACACGAGCAGGAGGAGAGCTGTATGCATTACCATCATCCCAAGCCGAGAAAATCGAAGAAATCCAAAAAGAAGAAGCAACGGTCAAGATCGCCGTCTATCGTGAGCTTGAAGCCGTCTCGTTCCCGTTCCGTGTCCGAGACTCGTCTGTCCTCCGGGACCGCCTCGCCCTCGGACGAGGAACGTAAGAACAAGAAGACTAAAAAGAAACATCGCAAGCATTCGCCGCCGAAATCTCCTACGCCTGAGGAGGGAGGTATCACAGACGAGGAACCAGCGAAGCACAGGAGTAAGAAAACCAAAAGGAGCGCTCCCAGCTCACCCGACCAGCCCGAACCGCCCCACAGACCCAAGAAGAAAAAAGAGAAACGGGACAAGAAGGACAGGTCGGTGGCGTGTCCCGGCGGGGCGGCGGCGGTGGGAGGCGCGGCGGTGACGGCGGCGGCCGGCTCTGCGACCGGCGCCGTGTGGAGCGACGCCGAGCTGGAGTCGCGCCGTGCCGCTCTCCTCGCTCAGCTGCACGAGCACGAGGCCGACTGA

Protein sequence:

>DPOGS213697-PA
MLPRRRNVGVSDKTPLLQEELIPYDKLNKNGHIYQPTFSDKQQENYSVPDTSFDFLEEFVFEAVMAARDRTQEFASTVRSLQGRTFARPIIKDEKKAAMLATYSQFMSMAKVISKNITSTYTKLEKLALLAKKKSLFDDRPMEIHELTYIGEMPRGRRSMHSHSSSVVLALQSRLASMSNQFKQVLEVRSENLKHQNNRRTQFSASAPVVKEVPSLLQPDEVSIDLGDTSPLQSQQLALRDDTDSYVQQRAETMHNIESTIVELGGIFQQLAHMVKEQDEAIGRIDANIQEAEMNVEAGHREIMKYFQNITGNRALMFKDTPGSGTSSPGLMSTGPLLPPPMLGGLPPPMPPAVAMPPVPGMPPNMPLPPPMGFPPMMAPFSMPPPGFPPFKPELSAPAPELSPMVNQNSPWTEHKAPDGRTYYYNSITKQSLWEKPDDLKTPAEKLLSSCVWKEYTTDAGRVYYHNIETKESSWVIPKELQEIKDKIAAEEAAHAIMNAEVPPGEVPLPMSPAINSTSALDEAMAKTLASIDPGLTTSIPIPEEIKPEEIAAPPQPNGADAEPTPETLYKDKKEAIEAFKELLKERNVPSNATWEQCVKIISKDPRYVTFKKLNEKKQAFNAYKTQKLKDEREEQRLKTKKNRENLEEFLLSCDRVTSLTKYYKCEEMFNNLEIWRCVPDSDRRDIYEDCIFTIAKREKEEAKALKKRNMKILSQVLENMNEISYNSTWSEAQVLLLENAAFKNDVSLLGMDKEDALIEYYMMKRSTHIILDIILFTNTDQVLTCSQPEFIKIKLIFWALLDGLHEEGKLTSMSLWVELYPVISADTRFSAMLGQNGSTPLDLFKFYVENLKARFHDEKKIIKEILKEKEFEVKPTTTFEEFATVVCEDSKSASLDAGNVKLTYNSLLEKAEAKNKEKLKEESKAQKKIESAFKWLLSEARLDPALSWAEAKEKIDLNAPEIVAVQDENERERIYMDYQHEQEESCMHYHHPKPRKSKKSKKKKQRSRSPSIVSLKPSRSRSVSETRLSSGTASPSDEERKNKKTKKKHRKHSPPKSPTPEEGGITDEEPAKHRSKKTKRSAPSSPDQPEPPHRPKKKKEKRDKKDRSVACPGGAAAVGGAAVTAAAGSATGAVWSDAELESRRAALLAQLHEHEAD-
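
The transcript and protein lengths listed in this record (3477 bp and 1158 aa) can be cross-checked against the two FASTA records above. The following is a minimal sketch, not part of the database itself. The helper names parse_fasta and cds_matches_protein are hypothetical, and the sketch assumes that the trailing "-" in the protein record marks the stop codon and that the transcript is coding sequence only. The toy sequences stand in for the full records.

```python
def parse_fasta(text):
    """Return {header: sequence} from FASTA text, joining wrapped sequence lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


def cds_matches_protein(cds, protein):
    """True if the CDS length equals 3 * (residue count + 1 stop codon).

    Assumption: a trailing '-' or '*' in the protein string marks the stop
    codon and is not counted as a residue.
    """
    residues = protein.rstrip("-*")
    return len(cds) == 3 * (len(residues) + 1)


# Toy records with the same layout as the entries above (hypothetical IDs).
toy = ">toy-TA\nATGCTC\nTGA\n\n>toy-PA\nML-\n"
seqs = parse_fasta(toy)
print(cds_matches_protein(seqs["toy-TA"], seqs["toy-PA"]))
```

Applied to the lengths stated in this record, the same relation holds: 3 * (1158 + 1) = 3477.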