Monarch geneset OGS2.0

DPOGS210114
Transcript        DPOGS210114-TA   1407 bp
Protein           DPOGS210114-PA   468 aa
Genomic position  DPSCF300017 + 1305731-1319892
RNAseq coverage   572x (Rank: top 22%)
Annotation
Heliconius       HMEL002235         5e-62   79.85%
Bombyx           BGIBMGA000222-TA   5e-177  78.14%
Drosophila       CG17119-PA         1e-106  52.24%
EBI UniRef50     UniRef50_E2AMM9    7e-111  54.45%  Cystinosin-like protein n=12 Tax=Endopterygota RepID=E2AMM9_CAMFO
NCBI RefSeq      XP_001954456.1     4e-108  52.34%  GF18270 [Drosophila ananassae]
NCBI nr blastp   gi|383849782       4e-114  59.59%  PREDICTED: cystinosin homolog [Megachile rotundata]
NCBI nr blastx   gi|383849782       9e-115  59.59%  PREDICTED: cystinosin homolog [Megachile rotundata]
Group
KEGG pathway     dan:Dana_GF18270     1e-107
                 K12386 (CTNS) maps-> Lysosome
InterPro domain  [134-364] IPR005282  1.7e-83  Lysosomal cystine transporter
                 [286-317] IPR006603  2.2e-10  Cystinosin/ERS1p repeat
Orthology group  MCL11727  Multiple-copy universal gene
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS210114-TA
ATGTGCGAATGTGAGAAAAATATGGAGATAACGGATGCTGGTGTCCCAGCTGGGTATTGCCAGATAAAAAAATTGTATGTTGATGTGAAAGATATGAACTTATTGGTGGATGAGACCAGATTGTTCAAATTAGTCCAAATCGGCTCATACAACACATCCCTCCAAGTGACATCAAACATCCAACATCCTGATATAGTGCAGTTCGTGCCGCCCGTCGTCGACATCCCCGGGTCCGACCCGCCCAATGCCACCGTCGAGGTGACCTACGACATATGCGCCTACGGAAAAAGCCCCGGACATTCAGAAGTCACGTTCAATGTCACCCCGGGTGGTTTCGTTGACATGGACACGGTGTTTGTTCGCGTGACTGTGATGTATTCCAACGCGATCTACGTCATATCCTATATCATGGGCTGGATCTACTTCCTGGCGTGGTCCGTATCATTCTACCCTCAGATATACATCAACTTCAAGAGGAAGAGCGTAGTCGGCTTGAACTTTGACTTCCTGGCCCTAAACATAATGGGATTCGCCATGTATTCACTGTTTAACTGCGGTTTATACTTCTCTAAGGACATACAGTCGGAGTATTTCTCCCGTCACCCCCGCAGCCTCAACCCTGTGCAACTGAACGACGTGTTCTTCTCGCTCCACGCATCATTCGCTACCCTCATCACGATAACGCAATGCTTCCTATATGAGCGTGAAAACCAGCGAGTGTCGGTGACGGGGCGTTGCGTGCTGGGTGGGATGGCGGGCGTGGCTCTAGTGTCGGCGAGCGTGGCGGGCGCTGGCAAACTAGCCTGGCTGGACTTCCTCAATTATTGCAGCTACATCAAGCTTTGCATCACTCTTATCAAATACGTGCCTCAGGCGTACATGAACTATAAGAGGAAGTCAACGGTCGGTTGGAGCATTGGGAACATCTTCCTGGATTTCGTCGGCGGTTCGCTATCAGTCCTCCAGATGACCTTGAACGCTTACAACTACAACGACTGGGTGTCATTCTTCGGAGACGCGACCAAATTCGGTCTCGGATTATTCAGCCTGGTGTTTGATATCTTCTTTATATTGCAACACTACGTCTTCTACAGGTCATCAGAACGTTCATCATCATCATCACCTAAATGTATTGTGGGTAACAACAATCAAAACAAAAATCACTCGGCATCGCATTCCACAAGCTTGAGGGGCGCTTGCGACGCGGAATTCGAATTTGGAACGAACGGCGGCTTAAAGACTGTCATAAATAAGGTCGTGTTGTGGATTGATTTCAGAGAGCGCAAGGATTTTATGTTTGATTCAAGCGAAGAAGGTTCTATTAGGACAACTGGTGATAGTGAATATTGTTTGGTACCGACCTGGAACGCCGACGAAAAGAAATACGATCTGGAAGCGAAGGCGTGA

Protein sequence:

>DPOGS210114-PA
MCECEKNMEITDAGVPAGYCQIKKLYVDVKDMNLLVDETRLFKLVQIGSYNTSLQVTSNIQHPDIVQFVPPVVDIPGSDPPNATVEVTYDICAYGKSPGHSEVTFNVTPGGFVDMDTVFVRVTVMYSNAIYVISYIMGWIYFLAWSVSFYPQIYINFKRKSVVGLNFDFLALNIMGFAMYSLFNCGLYFSKDIQSEYFSRHPRSLNPVQLNDVFFSLHASFATLITITQCFLYERENQRVSVTGRCVLGGMAGVALVSASVAGAGKLAWLDFLNYCSYIKLCITLIKYVPQAYMNYKRKSTVGWSIGNIFLDFVGGSLSVLQMTLNAYNYNDWVSFFGDATKFGLGLFSLVFDIFFILQHYVFYRSSERSSSSSPKCIVGNNNQNKNHSASHSTSLRGACDAEFEFGTNGGLKTVINKVVLWIDFRERKDFMFDSSEEGSIRTTGDSEYCLVPTWNADEKKYDLEAKA-
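
The transcript and protein records above can be cross-checked by translating the coding sequence. The following is a minimal sketch, not part of the original record: it assumes the standard genetic code (NCBI translation table 1) and assumes that the trailing "-" in the protein record marks the stop codon. The function name `translate` and its `stop_symbol` parameter are hypothetical names chosen for illustration.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered
# with bases T, C, A, G at each position; "*" denotes a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str, stop_symbol: str = "-") -> str:
    """Translate a coding sequence codon by codon.

    stop_symbol is an assumption: the protein record above ends in "-",
    which is taken here to represent the stop codon.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("sequence length is not a multiple of 3")
    residues = []
    for i in range(0, len(cds), 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        residues.append(stop_symbol if aa == "*" else aa)
    return "".join(residues)


# First 36 nt and last 15 nt of DPOGS210114-TA, copied from the record.
print(translate("ATGTGCGAATGTGAGAAAAATATGGAGATAACGGAT"))  # MCECEKNMEITD
print(translate("GAAGCGAAGGCGTGA"))                       # EAKA-

# Length check: 468 residues plus one stop codon should give 1407 nt.
print(3 * (468 + 1) == 1407)                              # True
```

Passing the full DPOGS210114-TA sequence to `translate` and comparing the result with the DPOGS210114-PA record would extend the same check to the whole gene model.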