What is a thesaurus? To understand it better, let’s look at a simple example. Consider the word “house.” A dictionary defines it as “a building in which people live,” and alongside the definition you can also consult the word’s etymology, spelling and pronunciation. If you look up the same word in a thesaurus, it suggests synonyms (“home, habitation, place of residence, homestead,” etc.) but it doesn’t explain the meaning of the words.
A dictionary gives you a word’s definition and its part of speech (noun, verb, adjective, etc.); when you want to find words with a similar meaning, you have to look in a thesaurus. A thesaurus sometimes also lists antonyms, words with the opposite meaning.
So, to sum up the difference between a dictionary and a thesaurus: a thesaurus is not an ordinary dictionary, that is, a list of single terms and their definitions in alphabetical order. Instead, it contains “clusters” of words with similar meanings grouped together.
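To make the contrast concrete, here is a minimal Python sketch of the two structures. The entries, the definition of “home” and the helper name `synonyms` are illustrative assumptions, not taken from any real dictionary or thesaurus.

```python
# Dictionary: each headword maps to its own entry (illustrative data only).
dictionary = {
    "house": {"pos": "noun", "definition": "a building in which people live"},
    "home": {"pos": "noun", "definition": "the place where one lives"},
}

# Thesaurus: words with similar meanings are grouped into clusters.
clusters = [
    {"house", "home", "habitation", "place of residence", "homestead"},
]


def synonyms(word):
    """Return the other members of every cluster that contains `word`."""
    found = set()
    for cluster in clusters:
        if word in cluster:
            found |= cluster - {word}
    return sorted(found)


print(dictionary["house"]["definition"])
print(synonyms("house"))
```

The dictionary lookup returns an explanation of one word, while the thesaurus lookup returns related words and no definition at all.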
The importance of a thesaurus for text analytics
Any text analytics tool needs a detailed thesaurus to be able to understand and identify all the concepts and relevant data. An organization’s thesaurus includes and describes the objects and relationships—products, materials, geographies, people, etc.—that are essential to its business. However, available thesauri can hardly cover 100% of the terminology related to a specific domain, either because it would be too costly to identify all possible forms of a term, or because the way these forms occur cannot be predicted, let alone their various misspellings.
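The coverage problem can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the mini-thesaurus, the `lookup` helper and the 0.85 similarity cutoff are all hypothetical, and the fuzzy matching uses the standard library’s `difflib` rather than anything specific to a commercial tool.

```python
import difflib

# Hypothetical mini-thesaurus: preferred concept -> known surface forms.
THESAURUS = {
    "polyethylene": ["polyethylene", "polythene", "PE"],
    "stainless steel": ["stainless steel", "inox"],
}

# Reverse index: lower-cased surface form -> preferred concept.
FORM_TO_CONCEPT = {
    form.lower(): concept
    for concept, forms in THESAURUS.items()
    for form in forms
}


def lookup(term, cutoff=0.85):
    """Map a term to a concept: exact match first, then a fuzzy fallback."""
    key = term.lower()
    if key in FORM_TO_CONCEPT:
        return FORM_TO_CONCEPT[key]
    # Fallback for misspellings; the cutoff value is an assumption.
    close = difflib.get_close_matches(key, list(FORM_TO_CONCEPT), n=1, cutoff=cutoff)
    return FORM_TO_CONCEPT[close[0]] if close else None


print(lookup("Polythene"))     # listed variant form
print(lookup("polyethilene"))  # misspelling, resolved by fuzzy matching
print(lookup("titanium"))      # not covered by the thesaurus
```

Listed variants resolve directly, a near miss is caught by the fuzzy fallback, and a term outside the thesaurus is simply not recognized, which is the gap that enriching the thesaurus is meant to close. Very short forms such as abbreviations are risky with fuzzy matching, so a real system would likely treat them separately.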
Based on the Expert System Cogito technology, Cogito Studio Express is the ontology editor that coordinates the workflows and resources required to create, manage, enrich and apply thesauri and ontologies to content for semantic enrichment.
Cogito Studio Express helps companies efficiently leverage domain-specific thesauri for entity extraction and extend the thesaurus terminology without sacrificing quality, and it allows multiple users to work daily on multilingual thesauri. It allows you to enrich existing thesauri or import new ones, and it shows immediately how they apply to content. Compared to other applications, Cogito Studio Express makes it simple for end users to design and maintain their taxonomy or ontology and to develop powerful text analytics solutions for categorization or extraction. Thanks to these features, organizations can simplify and accelerate their development while significantly reducing their operating costs.