Monarch geneset OGS2.0

DPOGS201167
Transcript        DPOGS201167-TA  1875 bp
Protein           DPOGS201167-PA  624 aa
Genomic position  DPSCF300065 + 706439-710309
RNAseq coverage   1397x (Rank: top 9%)
Annotation
Heliconius        HMEL013734        2e-156  47.87%
Bombyx            BGIBMGA003964-TA  0.0     88.42%
Drosophila        Esp-PB            0.0     52.32%
EBI UniRef50      UniRef50_Q7QBV8   0.0     58.32%  AGAP002331-PA n=5 Tax=Endopterygota RepID=Q7QBV8_ANOGA
NCBI RefSeq       XP_972290.1       0.0     56.93%  PREDICTED: similar to AGAP002331-PA [Tribolium castaneum]
NCBI nr blastp    gi|91089581       0.0     56.93%  PREDICTED: similar to AGAP002331-PA [Tribolium castaneum]
NCBI nr blastx    gi|347967643      0.0     58.32%  AGAP002331-PA [Anopheles gambiae str. PEST]
Group
Gene Ontology     GO:0006810  3.2e-52  transport
                  GO:0055085  3.2e-52  transmembrane transport
                  GO:0016021  3.2e-52  integral to membrane
                  GO:0005215  3.2e-52  transporter activity
KEGG pathway 
InterPro domain   [157-458]  IPR011547  3.2e-52  Sulphate transporter
                  [502-589]  IPR002645  1.1e-06  Sulphate transporter/antisigma-factor antagonist STAS
Orthology group   MCL10158  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS201167-TA
ATGATACCGCGGAGAAAACACAACGCGTCTTCAGGGTCGCTTCCGCCACCAGACGACAAGAACGCCAGCAACGACTATATTCTTTCAGAAAATGGCACGAGCGAGGGTTGGCGAGCAGCTCTACGAAGACGGTTCAATAAGAAAACCCTTAACAAGAGATTTCCCGTCACAGCCTGGCTACCGCAGTACAATGTGGAAGAGGCGATCGGAGACGTCATAGCGGGGGTTTCTGTGGGTTTGACAGTTATCCCCCAGTCTTTGGCCTACTCTAACATCGCCGGTCTGCCTCCTCAATACGGATTGTACGGCTCGTTCATCGGTTGCTTTGTATACATCATACTGGGAGGGTGCCGGGCCGTACCCGCAGGACCTACTGCCATTGCATCGTTACTCACTTGGCAAGTGGCTGGCGGCGTAGTGGAGAAGGCGATCCTCTTGAATTTGCTCACGGGACTCGTGGAACTTATGATGGGAGTGCTCGGTCTAGGCTTTCTTATCAATTTCGTCTCAGGACCCGTTTCTTCAGGTTTCACATCAGCAGTTGCTCTGATGATCGCCACATCCCAAGTCAAAGACATGTTCGCTATATCTGTGACCGGAACTACTTTCTTACAACAGTGGATTTCTGTATTCCAAAATATTCACAATGCATCACTTTGGGATCCAGTACTAGGATTCATCTGCATTGCGTTACTTCTTTCAATGAGGAAAATCGGGATGATTAAATTAGGAGCAAAAAACCCGGAAGGTCCAAGCACGCGACAAAAAGTTCTGACGCGTTGCATGTGGCTGTTGGGGACATGTCGGAATGCCATCGTGGTGGTGGCGTCGGGAGCTTTGGGATTCTGGTTTGTGAGCGAACAGGGATCCTCACCCGTGCGACTCATGGGAGCGATACCGTCGGGAGTACCGACACCGCAGGCGCCGCCGATGAGCTACGTGCGTGCCGACAACACCACAGCAGACTTCTTAGAGATGGTCTCGGAATTGGGCTCGGGTCTGCTGGTGATACCCATCATTGTACTTCTGGAGGATATCGCTATCTGCAAGGCGTTCTCAGATGGACGAACTATAGATGCCACGCAGGAGATGATCGCACTCGGTGTAGCCAACATCGCTAACTCCTTTATGCAAGCGTACCCGGGCGGCGGGTCACTGGCACGATCCGTCGTCTCCAACGGCTCCGGAGTCAGAACAACCTTCAATGGACTTTATACTGGTGTCATGGTTATCTTGGCCCTACAATTTTTTACGCAATATTTCGAGTACATACCCAAAGCTGCACTTGCTGCTGTGATTATTTCTGCAATTTTATTTATGGTGGAATACGATGTTATAAAACCAATGTGGCGAGCTAAAAAATTGGACTTGATACCTGGCGTAGGCACATTCATTCTCTGCTTAACTCTTCCTATCGAGTTGGGTATTCTAACTGGAGTCGTCGTCAACATTATTTTCATCCTGTACCACGCAGCACGCCCTAAATTCTCTGTTGAAATGTTAAAGACGGAGCAGGGTGTGGAGTATTTGATGATCACCCCCGACCGCTGTCTCATGTTTCCGTCTGTCGACTATGTGCGCCGGCTCGTCACTAAATGTGCCGCCAGCAGCTCGGTACCCGTCGTGATCGAATGCACACACATATACAGCGCTGACTACACCGCTGCCAAGAGTATCGAACAACTCACAGGGGACTTTCACGCCCGCCAACAGCCCCTCTACTTCTACAACGTCAAACCCTCAGTCTGCTCTATTTTCGAAGCTGTTACGAAGCCTGAGCACTTCGTTGTGTTCTACGAGGACGATGAATTGGACCGCCTCTTGGCCGCCGACGAACGCCTCGCACCTCGTAAACCGCCCCCACTGCACGTTTAG

Protein sequence:

>DPOGS201167-PA
MIPRRKHNASSGSLPPPDDKNASNDYILSENGTSEGWRAALRRRFNKKTLNKRFPVTAWLPQYNVEEAIGDVIAGVSVGLTVIPQSLAYSNIAGLPPQYGLYGSFIGCFVYIILGGCRAVPAGPTAIASLLTWQVAGGVVEKAILLNLLTGLVELMMGVLGLGFLINFVSGPVSSGFTSAVALMIATSQVKDMFAISVTGTTFLQQWISVFQNIHNASLWDPVLGFICIALLLSMRKIGMIKLGAKNPEGPSTRQKVLTRCMWLLGTCRNAIVVVASGALGFWFVSEQGSSPVRLMGAIPSGVPTPQAPPMSYVRADNTTADFLEMVSELGSGLLVIPIIVLLEDIAICKAFSDGRTIDATQEMIALGVANIANSFMQAYPGGGSLARSVVSNGSGVRTTFNGLYTGVMVILALQFFTQYFEYIPKAALAAVIISAILFMVEYDVIKPMWRAKKLDLIPGVGTFILCLTLPIELGILTGVVVNIIFILYHAARPKFSVEMLKTEQGVEYLMITPDRCLMFPSVDYVRRLVTKCAASSSVPVVIECTHIYSADYTAAKSIEQLTGDFHARQQPLYFYNVKPSVCSIFEAVTKPEHFVVFYEDDELDRLLAADERLAPRKPPPLHV-