Codeq Natural Language Processing

By Codeq | Updated 9 days ago | Text Analysis



Codeq Natural Language Processing Overview



Codeq’s proprietary NLP technology offers advanced Deep Learning models and linguistic analysis tools to extract rich representations from your textual data. Our suite of NLP modules can be customized to your needs; we hand-tune the parameters of our solution until our API produces accurate and concise results.

Extract rich insight into what users are expressing in your application by identifying a wide range of language utterances. Our API provides the back-end to take your application to the next level.

Powerful customization

Define your own NLP pipeline based on the linguistic tools that you require for your application. Codeq offers modules you will not find in other NLP APIs like our Speech Act, Question, Emotion, and Sarcasm classifiers as well as Sentence Compression and Semantic Role Labeling.

In addition to these exclusives, our API features the following modules:

High Level: Named Entity Recognition, Named Entity Linking, Named Entity Salience, Sentiment Classifier, Coreference Resolution, Date Resolution, Task Extraction, Key Phrase Extraction, and Summarization

Low Level: Language Identifier, Tokenization, Sentence Splitting, Stopword Removal, Stemming, True Casing, Detrue Casing, Part of Speech Tagging, Lemmatization, Dependency Parser, and Chunker

Understand your users

Extract rich insight into what users are expressing by identifying a wide range of language utterances. Codeq’s API offers a granular approach to identify not only sentiment from texts, but also emotions, giving you a richer way to understand your users. Our sarcasm detection module works in combination with the sentiment and emotion detectors, something you won’t find anywhere else.

Identify the relevant content and avoid information overload

Codeq’s comprehensive offering of summarization and information technology allows you to encapsulate the important sentences and key phrases of your texts, as well as to detect and disambiguate the most relevant Named Entities. We enrich your content with semantic knowledge, so you can dissect the subject easily.

Extract actionable content that needs your attention

Our API can analyze your unstructured textual data and extract important tasks and commitments. Tasks are presented in a rich format that simplifies integration within your applications, including a concise representation of priority patterns and suggestions of possible actions to take to complete the tasks.


The Codeq Natural Language Processing API can be accessed at one of the following endpoints:

NLP Pipeline
Receives one text and returns a list of analyzed sentences based upon the specific NLP Annotators requested.

Text Similarity
Receives two texts and returns a text similarity score.

Endpoint: NLP Pipeline

By default, when you call the NLP Pipeline endpoint you will retrieve a text fully analyzed by our complete set of NLP Annotators. Alternatively, you can specify a custom pipeline depending on your needs.

For example, if you are only interested in getting the speech act and sentiment labels of a text, you can declare a “pipeline” key as a parameter and send as its value a comma-separated list of the Annotators you need:
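A minimal sketch of building such a request follows. The `pipeline` key and the Annotator keys `speechact` and `sentiment` come from this page; the `text` key and the helper name `build_pipeline_request` are assumptions made for illustration. The endpoint URL and authentication are not given here, so the sketch only builds the payload and does not send it.

```python
import json


def build_pipeline_request(text, annotators=None):
    """Build a payload for the NLP Pipeline endpoint.

    `pipeline` is the documented parameter; the `text` key is an assumption.
    """
    payload = {"text": text}
    if annotators:
        # The pipeline value is a comma-separated list of Annotator keys.
        payload["pipeline"] = ",".join(annotators)
    return payload


payload = build_pipeline_request(
    "This model is an expensive alternative with useless battery life.",
    ["speechact", "sentiment"],
)
print(json.dumps(payload, indent=2))
```

Leaving `annotators` empty omits the `pipeline` key, which matches the default behavior described above of running the complete set of Annotators.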

Sample Response Body:

{
  "document": {
    "language": "English",
    "raw_text": "This model is an expensive alternative with useless battery life.",
    "sentences": [
      {
        "position": 0,
        "raw_sentence": "This model is an expensive alternative with useless battery life.",
        "tokens": ["This", "model", "is", "an", "expensive", "alternative", "with", ... ],
        "pos_tags": ["DT", "NN", "VBZ", "DT", "JJ", "NN", "IN", "JJ", "NN", "NN", "."],
        "speech_acts": ["Statement"],
        "sentiments": ["Negative"],
        "emotions": ["Disgust/Dislike"],
        ...
      }
    ]
  }
}
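As a rough illustration of consuming this output, the sketch below reads the classifier labels of each sentence. It assumes the fragment above is delivered as a regular JSON object, and it uses a trimmed copy of the sample limited to the fields it reads. The helper name `sentence_labels` is hypothetical, and the fallback to an empty list assumes that Annotators left out of the pipeline are simply absent from the response.

```python
import json

# Trimmed copy of the sample response, limited to the fields used below.
response_body = """
{
  "document": {
    "language": "English",
    "sentences": [
      {
        "position": 0,
        "raw_sentence": "This model is an expensive alternative with useless battery life.",
        "speech_acts": ["Statement"],
        "sentiments": ["Negative"],
        "emotions": ["Disgust/Dislike"]
      }
    ]
  }
}
"""


def sentence_labels(body):
    """Return one dict of classifier labels per analyzed sentence."""
    document = json.loads(body)["document"]
    rows = []
    for sentence in document.get("sentences", []):
        rows.append({
            "position": sentence["position"],
            # Missing keys fall back to an empty list (assumed behavior
            # when an Annotator was not requested).
            "speech_acts": sentence.get("speech_acts", []),
            "sentiments": sentence.get("sentiments", []),
            "emotions": sentence.get("emotions", []),
        })
    return rows


labels = sentence_labels(response_body)
print(labels[0]["sentiments"])
```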

Pipeline Annotators

The following sections list the Annotators of our NLP API grouped by topic, including the key you can use as a value of the pipeline parameter, as well as a description of each Annotator’s JSON output.



Summarization

Generates an extractive summary with the most relevant sentences of the input text.
KEY: summarize
ATTR: document.summary

Sentence Compression

Provides, where applicable, a shortened version of a sentence that gives its main point without extraneous clauses. It uses the output of the dependency parser Annotator to determine parts of the sentence that serve to modify, explain, or embellish the main points and strips them off, leaving only the core information provided by the sentence.
KEY: compress
ATTR: sentence.compressed_sentence

Summarization with compression

Generates an extractive summary with the most relevant sentences of the input text in their compressed form, regardless of whether the compress Annotator is specified in the pipeline.
KEY: summarize_compress
ATTR: document.compressed_summary

Keyphrase Extraction

Generates a list of keyphrases to capture the topics covered by the document, in order from most to least relevant. Keyphrases can be retrieved with or without relevance scores.
KEY: keyphrases
ATTR: document.keyphrases
ATTR: document.keyphrases_scored

Text Classification

Speech Act Classifier

Generates a list of tags indicating the predicted speech acts of a sentence.
KEY: speechact
ATTR: sentence.speech_acts

Output Labels:

Question Classifier

Generates a list of tags indicating the predicted type of question, if a sentence is classified as such.
KEY: question
ATTR: sentence.question_type

Output Labels:
Yes/No question (qy)
Wh- question (qw)
Open-ended question (qo)
Or question (qr)
Declarative question (d)
Tag question (g)
Rhetorical question (qh)

Sentiment Classifier

Generates a list of values for each sentence indicating the predicted sentiment label.
KEY: sentiment
ATTR: sentence.sentiments

Output Labels:

Emotion Classifier

Generates a list of values for each sentence indicating the predicted emotion label.
KEY: emotion
ATTR: sentence.emotions

Output Labels:
No emotion

Sarcasm Classifier

Generates a label predicting if a sentence is sarcastic or not.
KEY: sarcasm
ATTR: sentence.sarcasm

Output Labels:

Abuse Detection

Generates a list of values for each sentence indicating the types of abuse detected.
KEY: abuse
ATTR: sentence.abuse

Output Labels:
Hate speech/racist
Unknown abuse

Task Extraction

Generates different values including whether a sentence is predicted to be a task, and if so, it returns a list of tags indicating its predicted task type and a list of tuples indicating suggested task actions.
KEY: task
ATTR: sentence.is_task
ATTR: sentence.task_subclassification
ATTR: sentence.task_actions

Named Entities

Named Entity Recognition

Produces a list of named entities found in a sentence, containing the tokens of the entity, its type and its span positions.
KEY: ner
ATTR: sentence.named_entities

Output Labels:
PER Person
LOC Location
ORG Organization
MISC Miscellaneous
PHONE Phone number
EMAIL Email address
TWITTERNAME Twitter name
TRACKINGNUMBER Tracking number
AIRLINECODE Airline code
AIRLINENAME Airline name
AIRPORTCODE Airport code
AIRPORTNAME Airport name

Named Entity Linking

Produces a list of disambiguated named entities with links to their Wikipedia pages. Each linked entity is a dictionary containing the label of the disambiguated entity, its probability score, a description, and links to its Wikipedia and Wikidata entries.
KEY: nel
ATTR: sentence.named_entities_linked

Named Entity Salience

Produces a list of tuples indicating the salience of named entities, that is, how central they are to the input content. Each tuple contains a boolean indicating whether the entity is salient and its salience score.
KEY: salience
ATTR: sentence.named_entities_salience

Date resolution

Generates a list of tuples for each sentence with all date entities resolved relative to a reference date (by default, today). The output includes the date entity, its token span, and the resolved timestamp.
KEY: date
ATTR: sentence.dates

Coreference resolution

Generates a list of resolved pronominal coreferences. Each coreference is a dictionary that includes: mention, referent, first_referent, where each of those elements is a tuple containing a coreference id, the tokens and the span of the item. Additionally, each coreference dict contains a coreference chain (all the ids of the linked mentions) and the first referent of a chain.
KEY: coreference
ATTR: sentence.coreferences

Linguistic Features

Language Identifier

Generates a label indicating the language of the text and its probability.
KEY: language
ATTR: document.language
ATTR: document.language_probability

Supported languages:
Afrikaans Albanian Arabic Basque Bulgarian Catalan Chinese Croatian Czech Danish Dutch English Esperanto Estonian Finnish French Galician
German Greek Hebrew Hindi Hungarian Icelandic Italian Japanese Korean Latvian Lithuanian Norwegian Pashto Polish Portuguese Romanian Russian
Serbian Slovak Slovenian Spanish Swahili Swedish Tamil Telugu Thai Turkish Ukrainian Urdu Vietnamese Welsh Wolof Yiddish


Tokenization

Generates a list of words from raw text.
KEY: tokenize
ATTR: document.tokens
ATTR: sentence.tokens

Sentence segmentation

Generates a list of sentences from a raw text.
KEY: ssplit
ATTR: document.sentences

Stopword Removal

Produces a list of tokens after removing common stopwords from the text.
KEY: stopword
ATTR: sentence.tokens_filtered


Stemming

Generates a list of the stemmed forms of the tokens.
KEY: stem
ATTR: sentence.stems

True casing

Produces a string with the true case of sentence tokens.
KEY: truecase
ATTR: sentence.truecase_sentence

Detrue casing

Produces a string with the predicted original case of the tokens.
KEY: detruecase
ATTR: sentence.detruecase_sentence

Part of Speech Tagging

Generates a list containing the PoS-tag for each sentence token.
KEY: pos
ATTR: sentence.pos_tags

Output Labels:
Penn Treebank Reference


Lemmatization

Generates a list containing the lemma for each sentence token.
KEY: lemma
ATTR: sentence.lemmas

Dependency parser

Generates a list of dependencies in 3-tuples consisting of: head, dependent and relation. Head and dependent are in the format “token@@@position”. Positions are 1-indexed, with 0 being the index for the root.
KEY: parse
ATTR: sentence.dependencies

Output Labels:
Stanford Dependencies version 3.5.2 Reference
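The “token@@@position” format can be unpacked with a few lines of code. The sketch below is only illustrative: the sample triples are hypothetical rather than actual API output, the `ROOT` token name for position 0 is an assumption, and the names `Dependency`, `split_node`, and `parse_dependency` are made up for this example.

```python
from typing import NamedTuple

SEPARATOR = "@@@"


class Dependency(NamedTuple):
    head: str
    head_position: int
    dependent: str
    dependent_position: int
    relation: str


def split_node(node):
    """Split "token@@@position" into (token, position)."""
    # rpartition splits on the last separator, so a token that itself
    # contains "@@@" is still handled.
    token, _, position = node.rpartition(SEPARATOR)
    return token, int(position)


def parse_dependency(triple):
    """Convert a (head, dependent, relation) 3-tuple into a Dependency."""
    head, dependent, relation = triple
    head_token, head_position = split_node(head)
    dep_token, dep_position = split_node(dependent)
    return Dependency(head_token, head_position, dep_token, dep_position, relation)


# Hypothetical triples; "ROOT" as the token at position 0 is an assumption.
sample = [
    ["ROOT@@@0", "alternative@@@6", "root"],
    ["alternative@@@6", "model@@@2", "nsubj"],
    ["model@@@2", "This@@@1", "det"],
]

for triple in sample:
    dep = parse_dependency(triple)
    print(dep.relation, dep.head, "->", dep.dependent)
```

Because positions are 1-indexed, subtract 1 from a position before using it as an index into sentence.tokens; position 0 refers to the root and has no token of its own.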


Chunker

Groups the tokens of the sentence into small, non-overlapping groups based on prominent parts of speech, such as NP chunks (“the tall person”) or VP chunks (“will leave”).
KEY: chunk
ATTR: sentence.chunks

Output Labels:
CONLL 2000 Reference

Semantic Role Labeling

Generates a list of dictionaries containing the retrieved predicates of each sentence, their lemmas, the constituents of the sentence found to be arguments of each predicate, and the classified argument type.
KEY: semantic_roles
ATTR: sentence.semantic_roles

Twitter Preprocessing

Removes artifacts like user mentions and URLs, segments hashtags and generates a list of words from raw text.
KEY: twitter_preprocess
ATTR: sentence.tokens_clean

Endpoint: Text Similarity

The output of this endpoint is a number indicating the similarity between the two texts submitted in the request.

Sample Response Body:

{
  "text_similarity_score": 4.6
}

Text Similarity Output Score

The following table lists the range of text similarity scores, which is based on the SemEval (Semantic Textual Similarity) tasks.

Score Meaning
5 The two sentences are completely equivalent, as they mean the same thing.
4 The two sentences are mostly equivalent, but some unimportant details differ.
3 The two sentences are roughly equivalent, but some important information differs or is missing.
2 The two sentences are not equivalent, but share some details or are on the same topic.
1 The two sentences are completely dissimilar.
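Since the endpoint can return fractional scores such as 4.6, a client may want to map a score back to one of the descriptions above. The sketch below is one possible way to do that. It assumes a fractional score belongs to the band of the integer below it, which this page does not specify, and the name `describe_similarity` is hypothetical.

```python
import math

# Meanings taken from the score table above.
SCORE_MEANINGS = {
    5: "completely equivalent",
    4: "mostly equivalent, but some unimportant details differ",
    3: "roughly equivalent, but some important information differs or is missing",
    2: "not equivalent, but share some details or are on the same topic",
    1: "completely dissimilar",
}


def describe_similarity(score):
    """Return the table description for a text similarity score."""
    # Assumption: round down to the nearest band, then clamp to 1..5.
    band = min(5, max(1, math.floor(score)))
    return SCORE_MEANINGS[band]


print(describe_similarity(4.6))
```

Rounding to the nearest integer instead would place 4.6 in band 5, so the choice should follow how strict the application needs to be.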