Monarch geneset OGS2.0

DPOGS203712
Transcript: DPOGS203712-TA  3267 bp
Protein: DPOGS203712-PA  1088 aa
Genomic position: DPSCF300010 - 1413144-1420003
RNAseq coverage: 133x (Rank: top 56%)
Annotation
Heliconius: HMEL002390  0.0  81.74%
Bombyx: BGIBMGA003498-TA  0.0  81.06%
Drosophila: CG32354-PA  5e-21  24.04%
EBI UniRef50: UniRef50_D6W6H4  0.0  54.23%  Putative uncharacterized protein n=3 Tax=Tribolium castaneum RepID=D6W6H4_TRICA
NCBI RefSeq: XP_001811978.1  0.0  54.29%  PREDICTED: similar to agrin [Tribolium castaneum]
NCBI nr blastp: gi|189233617  0.0  54.29%  PREDICTED: similar to agrin [Tribolium castaneum]
NCBI nr blastx: gi|270014663  0.0  54.29%  hypothetical protein TcasGA2_TC004709 [Tribolium castaneum]
Group
Gene Ontology: GO:0005515  3.2e-18  protein binding
KEGG pathway: tca:100141763  0.0
  K06254 (AGRN) maps-> ECM-receptor interaction
InterPro domain:
  [905-1077]  IPR008985  4.8e-21  Concanavalin A-like lectin/glucanase
  [127-172]  IPR002350  3.2e-18  Proteinase inhibitor I1, Kazal
  [909-1060]  IPR013320  7.3e-15  Concanavalin A-like lectin/glucanase, subgroup
  [132-172]  IPR011497  6.1e-12  Protease inhibitor, Kazal-type
  [658-707]  IPR002049  2.4e-11  EGF-like, laminin
  [943-1057]  IPR012680  3.4e-08  Laminin G, subdomain 2
Orthology group: MCL11844  Patchy

Nucleotide sequence:

>DPOGS203712-TA
ATGCTAACTTCACCGAATGCGATTTTATTAATGGTCCCTAATATATTGGGATGCTATATTTTTCCGAATGACGTGACGAATCCGTGTCGAGGCGTGATCTGCGGCCCAGGAGAGCTGTGTCGCCCTACTGCAGACGGAAAGAATTACAGTTGTGAATGTCCAACATCTTGTCCAAGTTACGGGGATCATGAAGGTTCACGCCCCTTATGCGCTAGCGACGCTAAAGATTATCCCGGAACATGTGAAATGCGAAGAGCTGCTTGCGAGAGCAACACAAACATAACTTTTAAATATCATGGCAAATGTGACCCCTGCGCCGGCGTATCCTGTCCAGATCCAGAAGTTTGCCAATTGGATGACCAACGCCAACCTTCCTGTCGCTGCGCAGAACCTTGTCCTTTAGAATTTTCCCCTGTATGTGCGTCTGATGGAAAGACTTATTCGAACGAGTGTCAAATGCATAGAGAGTCCTGTCGAGCCAGAAAACAATTAAAGATTATTTTTAAAGGACAATGTATTTCAGGTGTAAACCCATGTGCGGAGGTGGAGTGTCGTCACGGCGCTGAATGTCGTGTGGAGGGTAGTGGCGCGGTCTGCGCTTGTCCGCCCCCCTGTGAACAAGTGCTGCGACCTGTCTGCGGTTCTGATGCGAGGACCCATGACAGCGAATGTGAACTTCGACGTGCTGGCTGTTTGTTAGGAAGAGAGTTGAAGGTCGTTCACGCTGGAGCCTGCGGTTCCAACGGTGTTTGTGCTGGGCGGGTATGCCCTCACGGTGGTGAATGTGTTTCCTCAGGAGGTCGAGGCGTTTGTCGATGTCCAAAATGTTCTAATGAATTTGCTCCCGTGTGTGGTTCTGATGGTATTTCCTACGGCAACAGATGCAAGCTGCAGTTAGAGTCCTGTAGACATCGTCGTCACGTTCAAGTGTTGTACGATGGACCTTGCAATGGATGTGAAAATAAAAAGTGTGAATATTATGCCGTTTGCGAGAGTGATGGTGTTTCTGAAGCTAGCTGCGTTTGTCCAAAACATTGTGAAGAAGGAACTGAAACTGAAGAAGTTTGCGGTAATGACAACAAAACTTATAGCAGCGTTTGTGCTCTTCGTAACATAGCCTGCCGCGAAAAGAGGAGACTCCACGTTAAACATATGGGTTCTTGTGAATCTTGCGGCAATGTCGAATGTCCGCTAGGTATGTGGTGTTCTCGAGGCAAATGTGCTTGTGCAAGTTGTGCTGATGTTCCTCGAGAGACCGTTTGTTCTGACACGAGGCGAACGTTCCCCAACGAGTGTTCATTGCATAAAGCTGCGTGCGAGGCGCGAGCCCGCGGGGAACCTCCTCCGCAGGTTGCTTACTATGGAGACTGTACTGATGCTAATAAAGATAATAGTTCAGGGGCTAATGTATCAGAGAAAATGGAACAGAGTAATGAGATACGGAATGGTGTGGAGGTCGACAGTACCAGTGAACCATCTGTTGGAACAGCTGTTTGTGCTAGAGTGCAATGCGCCTACGAGGCGACCTGCGCTGTGGACGGTAACGGCCAGCCACGTTGTGCATGTCTGTTTGACTGCGCCGCTGCGGCAGCTTCTTCCTCAGCGCCCGTCTGCGCCTCCGACTTACGCATGTACCCCACGCTGTGTCATATGAAACTGGAGTCTTGTCGCCGTCAAGAGGACCTTCGACTGAGGCCTTTAGCATTGTGTAGGGGTCTCGAGTTCAGGCCATGTGGTGATGATGAAACCGTAACAGATTCGGAAGGTCTTCCAGTTGATTGTGGCGGTGGACCTCATCGTAAGGACTGTCCGACGGATAGCTACTGCCATCATACTGCTAAGGCTGCAAGATGTTGTAGGAAAGACAAAGCTGTCGCAGAGAAGAAAGACTGTCAAGAATCTTGGTACGGCTGTTGTGCGGACGGGGTGACGTCAGCACGTGGTCCCGGGGGGGCGGGATGTCCCTCACAATGTGGTTGTCACAGGCTGGGTTCTGTGTC
CGAGATGTGTGATGAAAGCGGCCAGTGTCAATGTAGACCTGGTGTGGGAGGGCACAAATGTGACAGATGCGAACCAGGTTATTGGGGCTTACCCCGGATCGGTACTGGACATACTGGCTGTATACCATGTGGGTGTTCGGCCTTCGGATCAGTTCGTGAGGACTGCGAACAGATGACCGGGCGGTGTGTTTGCAAACCGGGCATTCAGGGGCAGAAGTGTACTGTTTGTTCCAACCATGAACATACTCTAGGACCTAACGGCTGCTTTGACCCGGAATCCACCCAACTACCAGCTACCGACTGTGAACGTATGACATGTTATTTCGGAGCCTACTGCGCTATACGTAGCGGTCTTGCCACTTGTGAATGTAATGCTCAGGAATGTTTTACAACCGAGGGCCCGTCTGTTTGCGGTAGTGACGGACGGACATATCTATCAGCTTGTCATGCGAGGGCGCACGCTTGTCGGACACAATCGGACATAGTTGTACAGGCGTTTGGTCCCTGTGCTGAAGATACGCCGTCTGTGAAGCGAGAGGAAATAAATTCATCTATTATTTCGAAAGAAAATGCCGAAGAAGGTTATTGTAACAAAAACCCATCTCAAATAGATACCGATATTGAAGTTACAGAATCGGAGGAAGAGCAATACATAACAAATGAGGTTGAAGAAAATTATCCAATATACGAAGAATACATCGAGGAAAACGAAAACGAAATATACTCATCGCCATTGTTCGACGGGCATGCTCGGATGACAGCTCGCACAAGATTGCCCGCTAAACGATTCGATATTTGGGCCGAAGTATCGGCGGTGTGCGGTAAAGGCGCTTTAATAAGTGCCTCAGGTGTGCGAGATTATTTATGGCTCGGGTTCGTAAAAGACAGAGCTGTATTGCGTTGGGACGCTGGCAATGGCCCTTTAGAGTTACGATCTGGTAAAATAAGAGTTGATACTAAGTCTAAAATATCGGCGCGGCGATATAAGAAGGACGCCATGTTGAAACTTGAATCTTATACAGTTAGGGGTACGACACATGGACGCATGAGTTCATTAGACGTTGATCCTTATATTTATATTGGCCATCCGCCGGATAACGTTACAAAGTTATCTGGTGTACACACAATGAACGGTTTTGTGGGATGTGTACATCGCTTGCGTGTGAGCGGACGTGACGTCATCCCCCCGTCCCGAGGCCTAAATATTGTGGCTCATGGTCTGCGACCATGCACTCCTTACAATCTAGCCAAGGTCGTGTGTCCTTAG

Protein sequence:

>DPOGS203712-PA
MLTSPNAILLMVPNILGCYIFPNDVTNPCRGVICGPGELCRPTADGKNYSCECPTSCPSYGDHEGSRPLCASDAKDYPGTCEMRRAACESNTNITFKYHGKCDPCAGVSCPDPEVCQLDDQRQPSCRCAEPCPLEFSPVCASDGKTYSNECQMHRESCRARKQLKIIFKGQCISGVNPCAEVECRHGAECRVEGSGAVCACPPPCEQVLRPVCGSDARTHDSECELRRAGCLLGRELKVVHAGACGSNGVCAGRVCPHGGECVSSGGRGVCRCPKCSNEFAPVCGSDGISYGNRCKLQLESCRHRRHVQVLYDGPCNGCENKKCEYYAVCESDGVSEASCVCPKHCEEGTETEEVCGNDNKTYSSVCALRNIACREKRRLHVKHMGSCESCGNVECPLGMWCSRGKCACASCADVPRETVCSDTRRTFPNECSLHKAACEARARGEPPPQVAYYGDCTDANKDNSSGANVSEKMEQSNEIRNGVEVDSTSEPSVGTAVCARVQCAYEATCAVDGNGQPRCACLFDCAAAAASSSAPVCASDLRMYPTLCHMKLESCRRQEDLRLRPLALCRGLEFRPCGDDETVTDSEGLPVDCGGGPHRKDCPTDSYCHHTAKAARCCRKDKAVAEKKDCQESWYGCCADGVTSARGPGGAGCPSQCGCHRLGSVSEMCDESGQCQCRPGVGGHKCDRCEPGYWGLPRIGTGHTGCIPCGCSAFGSVREDCEQMTGRCVCKPGIQGQKCTVCSNHEHTLGPNGCFDPESTQLPATDCERMTCYFGAYCAIRSGLATCECNAQECFTTEGPSVCGSDGRTYLSACHARAHACRTQSDIVVQAFGPCAEDTPSVKREEINSSIISKENAEEGYCNKNPSQIDTDIEVTESEEEQYITNEVEENYPIYEEYIEENENEIYSSPLFDGHARMTARTRLPAKRFDIWAEVSAVCGKGALISASGVRDYLWLGFVKDRAVLRWDAGNGPLELRSGKIRVDTKSKISARRYKKDAMLKLESYTVRGTTHGRMSSLDVDPYIYIGHPPDNVTKLSGVHTMNGFVGCVHRLRVSGRDVIPPSRGLNIVAHGLRPCTPYNLAKVVCP-
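The transcript and protein lengths listed for this entry agree with each other: 1088 residues plus one stop codon is 1089 codons, or 3267 bp. The following is a minimal sketch of how that check, and a spot translation of the first and last codons, could be done. It assumes the standard genetic code and that the trailing "-" in the protein sequence marks the stop codon; the function names `translate` and `expected_cds_length` are hypothetical helpers, not part of any MonarchBase tooling.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds, stop_symbol="-"):
    """Translate a coding sequence codon by codon (hypothetical helper).

    Stop codons are written as `stop_symbol`; "-" is assumed here to match
    the convention used in the protein sequence of this record.
    """
    cds = cds.upper()
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    residues = []
    for pos in range(0, len(cds), 3):
        amino_acid = CODON_TABLE[cds[pos:pos + 3]]
        residues.append(stop_symbol if amino_acid == "*" else amino_acid)
    return "".join(residues)


def expected_cds_length(protein_length):
    """Length in bp of a CDS encoding `protein_length` residues plus one stop codon."""
    return 3 * (protein_length + 1)


# Lengths taken from the record: 1088 aa protein, 3267 bp transcript.
print(expected_cds_length(1088) == 3267)

# First six and last five codons of DPOGS203712-TA, copied from the sequence above.
print(translate("ATGCTAACTTCACCGAAT"))
print(translate("GTCGTGTGTCCTTAG"))
```

Running the same `translate` call on the full nucleotide sequence, once its two lines are joined, would give a complete comparison against DPOGS203712-PA.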