Great content is an invaluable asset. However, compelling content is the expectation for information providers, not the exception. The true content battle is won and lost in content discoverability. Consider the old adage: if a tree falls in the woods and nobody is around to hear it, does it make a sound? If you produce an amazing piece of content but consumers can’t locate it on your site, is it really amazing?
Whether consumers come to you for academic research, financial trends or something else entirely, they want to find content that’s relevant to them. Their ability to find that content quickly and easily will determine if and when they return to you.
The objective is clear, but how can you ensure that your content is discoverable? The answer lies in your content metadata.
The Influence of Content Metadata
Simply put, metadata is descriptive data about your content. It enables you to classify and categorize content by the information most valuable to you and your audience. This information can include common fields such as:
- Primary topic
- Subject matter (individual, team, etc.)
- Author
- Publish date
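To make the idea concrete, here is a minimal sketch of how fields like these might be stored and queried. The class name, field names and sample records are illustrative assumptions, not a prescribed schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass(frozen=True)
class ContentMetadata:
    """Descriptive data about one piece of content (illustrative fields)."""
    title: str
    primary_topic: str
    author: str
    publish_date: date
    subjects: Tuple[str, ...] = ()


def find_by_topic(records, topic):
    """Return titles whose primary topic matches (case-insensitive), newest first."""
    matches = [r for r in records if r.primary_topic.lower() == topic.lower()]
    matches.sort(key=lambda r: r.publish_date, reverse=True)
    return [r.title for r in matches]


# Made-up sample records, for illustration only.
catalog = [
    ContentMetadata("Q2 market outlook", "Finance", "A. Rivera", date(2023, 7, 3)),
    ContentMetadata("Team cohesion study", "Psychology", "J. Chen", date(2022, 5, 9)),
    ContentMetadata("Q3 market outlook", "Finance", "A. Rivera", date(2023, 10, 2),
                    subjects=("interest rates",)),
]

print(find_by_topic(catalog, "finance"))
```

Even this small example shows why consistent labels matter: a record tagged "Financial markets" instead of "Finance" would simply not be found.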
Metadata is subjective though, so what works for one organization may be useless for another. While the customizable nature of metadata has its perks, it can also complicate matters, especially as you label more and more content. Here we get to the crux of the issue (one of them, at least) in content labeling.
Labeling and Annotation
Labeling and annotating content is not a difficult task in itself. However, it becomes an extremely tedious and time-consuming task as your content volume grows. The same can be said as you add new and more subjective metadata fields.
It’s this manual slog that prevents information providers from maximizing their content potential. Artificial intelligence can help automate the process via entity extraction, but its success depends on the AI approach and technology you choose.
Standard natural language processing techniques such as keyword recognition can provide useful surface-level information. However, this approach is limited: it lacks the contextual understanding needed to tell apart terms that have multiple meanings and to group synonyms together.
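The limitation is easy to see in a toy example. The sketch below is a deliberately naive keyword tagger. The keyword list and tag names are invented for illustration and do not represent any particular product.

```python
import re

# Hypothetical keyword-to-tag lookup, for illustration only.
KEYWORDS = {"jaguar": "Animals", "car": "Automotive"}


def keyword_tags(text):
    """Tag text by exact keyword match, with no notion of context or synonyms."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return sorted({tag for keyword, tag in KEYWORDS.items() if keyword in tokens})


# A story about the car maker is tagged "Animals" (ambiguity), and
# "automobile" is never connected to "car" (missed synonym).
print(keyword_tags("Jaguar unveils a new electric automobile"))
print(keyword_tags("The jaguar is the largest cat in the Americas"))
```

Both sentences receive the same single tag, even though only one of them is about an animal.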
Metadata is only as effective as it is accurate. By taking a symbolic approach to extracting entities from content, you bring a level of knowledge and common sense to your model. That knowledge helps you identify and capture keywords and their synonyms, and establish semantic relationships between terms within a single document or across multiple documents (i.e., a taxonomy or ontology). This ability to enrich metadata, and to do so without additional hours of manual work, is critical to creating a more robust content database that users can easily navigate.
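As a rough illustration of grouping synonyms and relating terms, the sketch below uses a tiny hand-written thesaurus in which each concept has a set of synonyms and an optional broader concept. The structure and entries are assumptions made for this example. A real knowledge-based system would also handle inflection, context and disambiguation, which this toy does not.

```python
import re

# Toy thesaurus: concept -> synonyms and an optional broader concept.
THESAURUS = {
    "automobile": {"synonyms": {"automobile", "car", "motor vehicle"},
                   "broader": "transportation"},
    "bicycle": {"synonyms": {"bicycle", "bike"}, "broader": "transportation"},
    "transportation": {"synonyms": {"transportation", "transport"},
                       "broader": None},
}


def tag_concepts(text):
    """Map any synonym found in the text to its preferred concept."""
    padded = " " + " ".join(re.findall(r"[a-z]+", text.lower())) + " "
    return {concept for concept, entry in THESAURUS.items()
            if any(f" {synonym} " in padded for synonym in entry["synonyms"])}


def with_broader(concepts):
    """Add every ancestor concept by walking the 'broader' links."""
    expanded = set(concepts)
    for concept in concepts:
        parent = THESAURUS[concept]["broader"]
        while parent is not None:
            expanded.add(parent)
            parent = THESAURUS[parent]["broader"]
    return expanded


tags = tag_concepts("She sold her car and bought a bike.")
print(sorted(with_broader(tags)))
```

Here "car" and "bike" resolve to their preferred concepts, and both roll up to "transportation", so a search on the broader topic would surface the document even though the word never appears in it.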
How Metadata Transforms Discoverability
At our recent webinar with Outsell and SAGE Publishing, we took a deep dive into metadata and discoverability. Outsell discussed the opportunities available to content providers (discoverability included), while SAGE shared its experience capitalizing on enriched metadata and how that metadata affects the user experience.
SAGE’s platform, called SAGE Knowledge, encompasses content in various media formats including books, videos, business cases and more. The challenge was that the content was siloed and lacked metadata. As a result, the platform lacked accurate and dynamic discoverability of content. The solution? Create a domain-specific thesaurus to standardize the way content is tagged.
Rather than rely solely on the unscalable process of manual tagging, SAGE turned to the expert.ai Platform to augment it. They started by extracting key entities (e.g., people, places, organizations, concepts and numerical expressions) and topics from their trove of existing content to assemble the baseline for their thesaurus. By using a knowledge graph to process the content, they were able to identify similar topics and terms (rather than basic keywords) that could establish more connections across their content.
More importantly, as new content is added to their content library, new terms can be extracted and built into the existing thesaurus, adding breadth and depth to this resource without requiring additional labor or expertise.
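The incremental part of this workflow can be sketched in a few lines. The example below assumes an upstream extraction step has already produced (concept, term) pairs from new content. That step, the function name and the sample data are hypothetical, and this is not a description of how SAGE or the expert.ai Platform implements it.

```python
def merge_terms(thesaurus, extracted):
    """Fold extracted (concept, term) pairs into a copy of the thesaurus.

    Returns the merged thesaurus and the list of pairs that were actually new.
    """
    merged = {concept: set(terms) for concept, terms in thesaurus.items()}
    added = []
    for concept, term in extracted:
        term = term.strip().lower()
        terms = merged.setdefault(concept, set())
        if term not in terms:
            terms.add(term)
            added.append((concept, term))
    return merged, added


# Made-up baseline thesaurus and newly extracted pairs.
baseline = {"machine learning": {"machine learning", "ml"}}
new_pairs = [
    ("machine learning", "ML"),                    # already known
    ("machine learning", "statistical learning"),  # new synonym
    ("open access", "Open Access"),                # new concept
]

merged, added = merge_terms(baseline, new_pairs)
print(added)
```

Returning the list of additions alongside the merged thesaurus gives subject matter experts a short review queue instead of a full retagging job.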
With more robust and accurate annotation, SAGE established a platform that users could rely upon to discover relevant and valuable content. They also laid the groundwork for additional metadata use cases which you can learn more about in our webinar.
Your Opportunity Awaits
Traditionally, enriching content metadata has been a labor-intensive process that requires significant domain expertise. This has consistently proven unsustainable for information providers, especially as they develop content at increasing rates. However, taking a knowledge-based NLU approach to labeling content and enriching metadata can alleviate the burden on subject matter experts by scaling their domain expertise in a reliable way.
Don’t let good content go to waste. Enable your users to discover the content that’s most relevant to them, and you’ll reap the reward of better discoverability.