By TextMiner | Updated 2 years ago | Tools


Sentence segmentation is converting some symbols into spaces

Rapid account: Simonsmith
7 years ago

Example paragraph: “Nicotinamide riboside ( NR ) is a pyridine - nucleoside form of vitamin B 3 that functions as a precursor to nicotinamide adenine dinucleotide or NAD+. According to the peer-reviewed literature, NR was discovered as a human vitamin precursor of NAD+ in 2004 and as a sirtuin-activating compound in 2007 by Charles Brenner.”

When I process this through the sentence segmenter, NAD+ comes back as NAD followed by a space in place of the plus sign. Is there a way I should be escaping symbols so they’re properly processed?

textanalysis Commented 6 years ago

Sorry, closing this issue now.

simonsmith Commented 7 years ago

Thanks for the quick response.

Using https://textanalysis.p.mashape.com/nltk-sentence-segmentation.

Sending text from here: https://en.wikipedia.org/wiki/Nicotinamide_riboside

The acronym “NAD+” gets turned into "NAD " with a space replacing the + sign.

I’m using this through Node.js with unirest.
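One plausible cause, not confirmed anywhere in this thread, is form encoding: in an `application/x-www-form-urlencoded` body a literal `+` stands for a space, so a server that decodes an unencoded body would turn "NAD+" into "NAD ". The sketch below reproduces that behavior locally and shows that percent-encoding the value keeps the plus sign. The field name `text` and both helper functions are assumptions made for illustration; they are not taken from the API's documentation.

```javascript
// Hypothetical helper: build a form-encoded body for one field.
// The field name "text" is an assumption, not confirmed by the thread.
function buildFormBody(text) {
  // URLSearchParams encodes "+" as "%2B" and a space as "+".
  return new URLSearchParams({ text }).toString();
}

// Hypothetical helper: mimic how a form decoder reads a raw value,
// treating "+" as a space before percent-decoding.
function decodeFormValue(raw) {
  return decodeURIComponent(raw.replace(/\+/g, ' '));
}

const sample = 'precursor of NAD+ in 2004';

// Sending the text unencoded: the "+" is read back as a space.
const lost = decodeFormValue(sample);

// Sending the text encoded: the "+" survives the round trip.
const safeBody = buildFormBody(sample);
const kept = new URLSearchParams(safeBody).get('text');

console.log(JSON.stringify(lost));
console.log(safeBody);
console.log(JSON.stringify(kept));
```

If this is what is happening, passing the encoded body (or letting the HTTP client encode a plain object of fields) instead of a hand-built string should preserve the symbol. Whether that applies here depends on how the request was assembled, which the thread does not show.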

textanalysis Commented 7 years ago

There are four sentence segmenter APIs; which one did you use for this example? I tested it on the online demo (http://textanalysisonline.com/) and found no problem. Can you tell me which one you used?
