Monarch geneset OGS2.0

DPOGS204736
Transcript  DPOGS204736-TA  2655 bp
Protein  DPOGS204736-PA  884 aa
Genomic position  DPSCF300231 - 635867-638758
RNAseq coverage  107x (Rank: top 60%)
Annotation
Heliconius  HMEL011585  0.0  46.14%
Bombyx  BGIBMGA013655-TA  3e-33  28.93%
Drosophila  CG32354-PA  1e-27  28.28%
EBI UniRef50  UniRef50_F0VGR8  3e-75  32.70%  Serine protease inhibitor dipetalogastin n=4 Tax=Sarcocystidae RepID=F0VGR8_NEOCL
NCBI RefSeq  XP_002117100.1  1e-36  24.20%  hypothetical protein TRIADDRAFT_61048 [Trichoplax adhaerens]
NCBI nr blastp  gi|237831751  5e-77  33.28%  serine protease inhibitor dipetalogastin precursor, putative [Toxoplasma gondii ME49]
NCBI nr blastx  gi|221486979  2e-99  34.42%  follistatin, putative [Toxoplasma gondii GT1]
Group
Gene Ontology  GO:0005515  4.2e-12  protein binding
KEGG pathway  spu:586604  5e-32
  K06254 (AGRN)  maps-> ECM-receptor interaction
InterPro domain  [335-382]  IPR002350  4.2e-12  Proteinase inhibitor I1, Kazal
  [338-382]  IPR011497  9.3e-11  Protease inhibitor, Kazal-type
Orthology group  MCL20538  Insect specific
Genotypes for resequenced monarchs and outgroup Danaus species

Nucleotide sequence:

>DPOGS204736-TA
ATGAAGACTGTAATAGGTCTACTGTTTATCTTCGCATCGATATGTTATTTGGATGCGAAAAGAATAAAAAAAAGGTCATGCATCTGTACAGAACTATACAGTCCTATATGTGGTACTGATGGAACCACTTACACGAATAAATGTTTTTTCAATTGCGCCAAAAATACCCACAAAAAACATGGTTCCACAAAAGACATATATATAGCCTATGAAGGAAAATGTAGTGACTCATGTATTTGCAAGGATAACTATTCTCCGGTATGTGGCAGTGACGGCAAAACGTACCCTAATAGCTGTTATCTTAATTTTAAAAGTAAAGAAATAGAAAATGACTGTAAAAATAACGGAGACGATCCTGATGAAAACAAATTAATAGAAGCATATAAAGGTGAATGTTCCGACGAATGTTTTTGCACAGATGAATATGCACCAATTTGCGCTAACAACAATAAAACTTACTCGAATTCCTGCCAACTAGAGTGTGAGAATAAAAAAAGAAAAAATAATAATTTACCGCCTCTTGTGGTTAAAAGTGACGGTCAATGTCCCAAACCATGTATCTGCGAAGGAATGTATCAACCGATATGCGGTGACGATGGGAAAACTTATGCCAATGTTTGTAGTTTAGGATGTATTAATGAAGAAAGACAAAATAATAACCTTCCACCAATAAGTAAAAGGAGTGACGGAAAATGTCCGAACATATGTAAATGTCCAAAAATATATAAACCAGTATGTGGAAATGATGGCAAAACATATCCAAGTAATTGCAATTTAAAATGTATAAACAAAGAGAGAGAAGGAAACAAGCTGTCACCCATTAGAGAAATCAGTAAGGGCGAGTGTCCAAAAACATGTGTATGCCCTTTTAATTATTTACCTGTATGCGGTTCTGATGGAGTTACTTATTCTAATGAATGTTTACTTAAATGTGCAAGTAAGGACAATGAAAAGAAAAACTTACCACCTATAACTGTTGTAAATGAAACGTCATGCCCAGAATCATGTCTGTGTCCATTAATCTATGAGCCAATATGCGGTGACGACGGCAAAACATATTCCAGTAGTTGTGAACTGAGATGTAAAAATAAAGAAAGAGAAATAAATAAAGAACTACCAATTAAAAAAGTCAGTGATGGGGAATGCTCAAAACCATGTCGTTGTCCAAAAATTTATAGTCCCGTGTGTGGTGATAATGGTGAAACATTTTCTAATAACTGTGAATTAGAATGTGAAAACAAAAAACGCCAAGCTAAAAATGAATCACCAATAGCTGTGGTAAGTAAGGGAAAGTGTCCGGAACCTTGTAGTTGCCCAAAAATATTCGAACCTGTATGTGGTGATGACGGAATAACTTATTCCAGCAGTTGTGATTTAGGTTGTGTTAATAAAGAAAAAGAAAAAAATAATGAAGCACCCATCCTTGAGGTTTCCAAAGGTGCATGCCCAGGTTCCTGTATATGTCCATTAATAATTTCAGAGCCTGTTTGTGGAAGCGACGGTCAAACTTATCGTAGTGAATGTGAATTAGACTGTGAAAATAAAATAAGAATAGCAAAAGATGAATCACCTCTCTCTGTTATTAGCAAGGGTGAATGTCCAAAAGCTTGCGCGTGTCCTTTAATAGATCTTCCTGTTTGCGGTTCGGATGACGTCACTTACCCTAACGAATGTTCACTTAACTGTACAAGTGCAGATAATGTAAGAAAAAGTTTACCTGCTATTACTGTGAAAAGCCAAGGAGAATGTGAAGAGTCATGCATATGTTCAACAAATTATGATCCTATATGTGGTTCAGACGGTGTAACTTACTCCAACGAGTGTCAACTAGAATGCAAAAATAAAAAGCGAATCAAAAACTCCCTAGATAGAATAGATATTGTAAAAAAAGGAAAATGTAATGGATCCTGCAGCTGTCCTGCAGATGTCAATCCAGTATGTGGCAGTGACGGACAAAGTTATCCCAATGAATGTCAATTAGTATGCGAGAGCGATGATTT
GGTACGACAGGGGCTTTCAGCTTTAGAAGTCATCGAAAGTGATCTTTGTGAAGAATCATGCGAATGTTATAACGCAATTATACCAGTTTGCGGGTCAAATAATAAATCTTACAGAAATGCTTGCTATTTAGATTGTGCCAACAGAAACAGAAGAGGCAATGAAACATCAATTACGATAAAATATAGTGGTGCATGCAGAAGTTGCACTTGCACCCGAGAACTTAACCAAGTGTGTGGTAGCGACGGTAATACGTATAATAATCCTTGTCTTTTAGATTGTGAAAGTGAAAGACTAAAAGGAATAGGAAAATCACCTCTGTATATTATTCACTATGGCGACTGTCAAGGATGTGATTGTTCAAATGAATACGAACCTGTCTGTGGAACTGATAACAATACATACACAAACTTATGTCAATTACAGTGTGAAAGTAACATTAGACAACGTGAAAATCAGAAAGAGATAGCTCTCCTCAGCAAAGGAACATGCCCAGAGAGTGATTATGATTGTGAAAATTGCCCTCTTACGTACCAACCAGTTTGTGGTAAAGATCTTGTAAGCTACTGGAACGACTGCTGGTTTAAATGTAGTAATAAATGTAAACTGAGTCGTGGGGAAAAACCTATCCCGATGGCTAAAACTGGATGCTGTTAA

Protein sequence:

>DPOGS204736-PA
MKTVIGLLFIFASICYLDAKRIKKRSCICTELYSPICGTDGTTYTNKCFFNCAKNTHKKHGSTKDIYIAYEGKCSDSCICKDNYSPVCGSDGKTYPNSCYLNFKSKEIENDCKNNGDDPDENKLIEAYKGECSDECFCTDEYAPICANNNKTYSNSCQLECENKKRKNNNLPPLVVKSDGQCPKPCICEGMYQPICGDDGKTYANVCSLGCINEERQNNNLPPISKRSDGKCPNICKCPKIYKPVCGNDGKTYPSNCNLKCINKEREGNKLSPIREISKGECPKTCVCPFNYLPVCGSDGVTYSNECLLKCASKDNEKKNLPPITVVNETSCPESCLCPLIYEPICGDDGKTYSSSCELRCKNKEREINKELPIKKVSDGECSKPCRCPKIYSPVCGDNGETFSNNCELECENKKRQAKNESPIAVVSKGKCPEPCSCPKIFEPVCGDDGITYSSSCDLGCVNKEKEKNNEAPILEVSKGACPGSCICPLIISEPVCGSDGQTYRSECELDCENKIRIAKDESPLSVISKGECPKACACPLIDLPVCGSDDVTYPNECSLNCTSADNVRKSLPAITVKSQGECEESCICSTNYDPICGSDGVTYSNECQLECKNKKRIKNSLDRIDIVKKGKCNGSCSCPADVNPVCGSDGQSYPNECQLVCESDDLVRQGLSALEVIESDLCEESCECYNAIIPVCGSNNKSYRNACYLDCANRNRRGNETSITIKYSGACRSCTCTRELNQVCGSDGNTYNNPCLLDCESERLKGIGKSPLYIIHYGDCQGCDCSNEYEPVCGTDNNTYTNLCQLQCESNIRQRENQKEIALLSKGTCPESDYDCENCPLTYQPVCGKDLVSYWNDCWFKCSNKCKLSRGEKPIPMAKTGCC-