NLP

NOTE: The download below requires the latest VisualText, so sign up for VisualText Pro now.
Download TAIParse here (geared to VisualText 2, approximately 7 MB).

Also, ask us about a version of TAIParse that performs part-of-speech tagging at 94% accuracy.

Natural language processing (NLP) generally refers to the complete linguistic and conceptual processing of a text. To facilitate the construction of natural language processing products, TAI is now making available some general text analysis prototypes that can be used as a starting point for a host of applications, such as information extraction, categorization, summarization, and question parsing.

TAIParse is a general analyzer that emphasizes the minimal use of knowledge ("just-in-time" knowledge) to perform part-of-speech tagging, entity extraction, and shallow parsing. It is an excellent starting point for customizing your own text analysis capabilities: its knowledge base includes a full lexicon with part-of-speech information, and it shows the latest features of the NLP++® language in action. TAIParse also illustrates how easily NLP systems can be implemented with the VisualText® IDE (SDK, tools, etc.). TAIParse includes these capabilities and more:

  • Zoning and "parsing-per-line" to characterize regions and formats in text
  • Dynamic and context-dependent part-of-speech assignment and parsing
  • Successive segmentation of text in a "divide-and-conquer" strategy
  • Treatment of unknown words
  • Noun phrase extraction
  • A semantic and discourse processing framework that ties into an ontology and dynamic representation of the analysis within the knowledge base
  • Processing INDEPENDENT of capitalization, so that, for example, all-uppercase text regions can be analyzed
  • Robust analysis in the face of errors, misspellings, and ungrammatical text
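
To make a few of the capabilities above concrete, here is a minimal Python sketch of capitalization-independent lexicon lookup, a fallback for unknown words, and context-dependent part-of-speech assignment. It is not TAIParse code and does not use NLP++. The lexicon, the suffix rules, the tag names, and the functions `guess_unknown` and `tag` are all hypothetical, chosen only to illustrate the ideas.

```python
import re

# Hypothetical toy lexicon: lowercase word -> candidate part-of-speech tags.
# A real lexicon (such as the one in TAIParse's knowledge base) is far larger.
LEXICON = {
    "the": ["DET"],
    "a": ["DET"],
    "dog": ["NOUN"],
    "report": ["NOUN", "VERB"],
    "to": ["PART"],
    "will": ["AUX"],
    "quickly": ["ADV"],
}

# Hypothetical suffix heuristics for words that are missing from the lexicon.
SUFFIX_GUESSES = [
    ("ly", "ADV"),
    ("ing", "VERB"),
    ("ed", "VERB"),
    ("tion", "NOUN"),
    ("s", "NOUN"),
]


def guess_unknown(word):
    """Guess a tag for an unknown lowercase word from its suffix."""
    for suffix, guessed_tag in SUFFIX_GUESSES:
        if word.endswith(suffix) and len(word) > len(suffix) + 1:
            return guessed_tag
    return "NOUN"  # assumed default: treat unknown words as nouns


def tag(text):
    """Return a list of (token, tag) pairs for the alphabetic tokens in text."""
    tagged = []
    for token in re.findall(r"[A-Za-z]+", text):
        word = token.lower()  # lookup ignores capitalization
        candidates = LEXICON.get(word)
        if candidates is None:
            chosen = guess_unknown(word)
        elif len(candidates) == 1:
            chosen = candidates[0]
        else:
            # Ambiguous word: let the previous tag decide.
            prev = tagged[-1][1] if tagged else None
            if prev in ("PART", "AUX") and "VERB" in candidates:
                chosen = "VERB"
            elif prev == "DET" and "NOUN" in candidates:
                chosen = "NOUN"
            else:
                chosen = candidates[0]
        tagged.append((token, chosen))
    return tagged


print(tag("THE DOG WILL REPORT"))
# [('THE', 'DET'), ('DOG', 'NOUN'), ('WILL', 'AUX'), ('REPORT', 'VERB')]
```

In this sketch the all-uppercase input is tagged the same way as lowercase text, and the ambiguous word "report" becomes a verb after an auxiliary but a noun after a determiner. A production analyzer would replace these hand-written rules with much richer lexical and contextual knowledge.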

 

 
