TextAnalysis

FREEMIUM
By TextMiner | Updated 4 days ago | Tools

Sentence segmentation is converting some symbols into spaces

simonsmith
7 years ago

Example paragraph: “Nicotinamide riboside ( NR ) is a pyridine - nucleoside form of vitamin B 3 that functions as a precursor to nicotinamide adenine dinucleotide or NAD+. According to the peer-reviewed literature, NR was discovered as a human vitamin precursor of NAD+ in 2004 and as a sirtuin-activating compound in 2007 by Charles Brenner.”

When I process this through the sentence segmenter, NAD+ comes back as NAD followed by a space in place of the plus sign. Is there a way I should be escaping symbols so they’re properly processed?

textanalysis Commented 7 years ago

Sorry, closing this issue now.

simonsmith Commented 7 years ago

Thanks for the quick response.

Using https://textanalysis.p.mashape.com/nltk-sentence-segmentation.

Sending text from here: https://en.wikipedia.org/wiki/Nicotinamide_riboside

The acronym “NAD+” gets turned into "NAD " with a space replacing the + sign.

I’m using this through Node.js with unirest.
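
One possible cause, not confirmed anywhere in this thread: if the request body is sent as application/x-www-form-urlencoded and the text is not percent-encoded first, a literal + is decoded as a space on the receiving side. The sketch below reproduces that effect locally using only Node.js built-ins. The field name "text" and the assumption that the endpoint decodes a form-encoded body are illustrative guesses, and how unirest encodes the body depends on how the request is built, which is not shown here.

```javascript
// Sketch only: shows how "+" behaves under form-urlencoded decoding.
// Assumption: the API reads a form-encoded field (called "text" here
// purely for illustration).

const text = "NAD+";

// Body built by plain string concatenation: the "+" is sent as-is,
// and a form decoder reads "+" as an encoded space.
const naiveBody = "text=" + text;
const naiveDecoded = new URLSearchParams(naiveBody).get("text");

// Percent-encoding the value first turns "+" into "%2B",
// which decodes back to a literal plus sign.
const safeBody = "text=" + encodeURIComponent(text);
const safeDecoded = new URLSearchParams(safeBody).get("text");

console.log(JSON.stringify(naiveDecoded)); // "NAD "
console.log(JSON.stringify(safeDecoded));  // "NAD+"
```

If the symptom disappears once the text is passed through encodeURIComponent (or sent as a structured form object rather than a hand-built string), the encoding step is the likely culprit rather than the segmenter itself.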

textanalysis Commented 7 years ago

There are four sentence segmenter APIs; which one did you use for this example? I tested it on the online demo at http://textanalysisonline.com/ and found no problem. Can you tell me which one you used?
