Natural language processing (NLP) is one of the fastest evolving branches in machine learning and among the most fundamental. It has applications in diplomacy, aviation, big data sentiment analysis, language translation, customer service, healthcare, policing and criminal justice, and countless other industries.
NLP is the reason we’ve been able to move from CTRL-F searches for single words or phrases to conversational interactions about the contents and meanings of long documents. We can now ask computers questions and have them answer.
Algorithmia hosts more than 8,000 individual models, many of which are NLP models and complete tasks such as sentence parsing, text extraction and classification, as well as translation and language identification.
Allen Institute for AI NLP Models on Algorithmia
The Allen Institute for Artificial Intelligence (Ai2), is a non-profit created by Microsoft co-founder Paul Allen. Since its founding in 2013, Ai2 has worked to advance the state of AI research, especially in natural language applications. We are pleased to announce that we have worked with the producers of AllenNLP—one of the leading NLP libraries—to make their state-of-the-art models available with a simple API call in the Algorithmia AI Layer.
Among the algorithms new to the platform are:
- Machine Comprehension: Input a body of text and a question based on it and get back the answer (strictly a substring of the original body of text).
- Textual Entailment: Determine whether one statement follows logically from another
- Semantic role labeling: Determine “who” did “what” to “whom” in a body of text
These and other algorithms are based on a collection of pre-trained models that are published on the AllenNLP website.
Algorithmia provides an easy-to-use interface for getting answers out of these models. The underlying AllenNLP models provide a more verbose output, which is aimed at researchers who need to understand the models and debug their performance—this additional information is returned if you simply set debug=True.
The Ins and Outs of the AllenNLP Models
Machine Comprehension: Create natural-language interfaces to extract information from text documents.
This algorithm provides the state-of-the-art ability to answer a question based on a piece of text. It takes in a passage of text and a question based on that passage, and returns a substring of the passage that is guessed to be the correct answer.
This model could feature into the backend of a chatbot or provide customer support based on a user’s manual. It could also be used to extract structured data from textual documents, such as a collection of doctors’ reports could be turned into a table that says (for every report) the patient’s concern, what the patient should do, and when they should schedule a follow-up appointment.
Entailment: This algorithm provides state-of-the-art natural language reasoning. It takes in a premise, expressed in natural language, and a hypothesis that may or may not follow up from. It determines whether the hypothesis follows from the premise, contradicts the premise, or is unrelated. The following is an example:
The input JSON blob should have the following fields:
- premise: a descriptive piece of text
- hypothesis: a statement that may or may not follow from the premise of the text
Any additional fields will pass through into the AllenNLP model.
The following output field will always be present:
- contradiction: Probability that the hypothesis contradicts the premise
- entailment: Probability that the hypothesis follows from the premise
- neutral: Probability that the hypothesis is independent from the premise
Semantic role labeling: This algorithm provides state-of-the-art natural language reasoning—decomposing a sentence into a structured representation of the relationships it describes.
The concept of this algorithm is considering a verb and the entities involved in it as its arguments (like logical predicates). The arguments describe who or what does the action of this verb, to whom or what it is done, etc.
NLP Moving Forward
NLP applications are rife in everyday life, and applications will only continue to expand and improve because the possibilities of a computer understanding written and spoken human language and executing on it are endless.