A lot has been said about the remarkable opportunities of Generative AI (GenAI), and some of us have also been extremely vocal about the risks associated with using this transformative technology.
The rise of GenAI presents significant challenges to the quality of information, public discourse, and the general open web. GenAI’s power to predict and personalize content can be easily misused to manipulate what we see and engage with.
Generative AI search engines are contributing to the overall noise, and rather than helping people find the truth and forge unbiased opinions, they tend (at least in their present implementation) to promote efficiency over accuracy, as highlighted by a recent study by Jigsaw, a unit inside Google.
Despite the hype surrounding SEO alligator parties and content goblins, our generation of marketers and SEO professionals has spent years working towards a more positive web environment.
We’ve shifted the marketing focus from manipulating audiences to empowering them with knowledge, ultimately aiding stakeholders in making informed decisions.
Creating an ontology for SEO is a community-led effort that aligns perfectly with our ongoing mission to shape, improve, and provide directions that truly advance human-GenAI interaction while preserving content creators and the Web as a shared resource for knowledge and prosperity.
Traditional SEO practices in the early 2010s focused heavily on keyword optimization. This included tactics like keyword stuffing, link schemes, and creating low-quality content primarily intended for search engines.
Since then, SEO has shifted towards a more user-centric approach. The Hummingbird update (2013) marked Google’s transition towards semantic search, which aims to understand the context and intent behind search queries rather than just the keywords.
This evolution has led SEO pros to focus more on topic clusters and entities than individual keywords, improving content’s ability to answer multiple user queries.
Entities are distinct items like people, places, or things that search engines recognize and understand as individual concepts.
By building content that clearly defines and relates to these entities, organizations can enhance their visibility across various platforms, not just traditional web searches.
This approach ties into the broader concept of entity-based SEO, which ensures that the entity associated with a business is well-defined across the web.
Fast-forward to today, static content that aims to rank well in search engines is constantly transformed and enriched by semantic data.
This involves structuring information so that it is understandable not only by humans but also by machines.
This transition is crucial for powering Knowledge Graphs and AI-generated responses like those offered by Google’s AI Overviews (AIO) or Bing Copilot, which provide users with direct answers and links to relevant websites.
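To make "understandable by machines" concrete, here is a minimal sketch of how an entity could be described with Schema.org markup and serialized as JSON-LD. The organization, its name, and all URLs are hypothetical placeholders, and the selection of properties is our own assumption rather than a recommended template.

```python
import json

# Hypothetical organization details, used only for illustration.
organization = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "@id": "https://example.com/#organization",
    "name": "Example Wines",
    "url": "https://example.com/",
    # sameAs points to other pages describing the same entity (placeholder URL).
    "sameAs": ["https://example.org/profiles/example-wines"],
}


def to_json_ld(entity: dict) -> str:
    """Serialize an entity description as a JSON-LD string."""
    return json.dumps(entity, indent=2)


json_ld = to_json_ld(organization)
print(json_ld)
```

The resulting string is what would typically be embedded in a page inside a `script` element of type `application/ld+json`, so the same entity is readable by people (the page) and by machines (the markup).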
As we move forward, the importance of aligning content with semantic search and entity understanding is growing.
Businesses are encouraged to structure their content in ways that are easily understood and indexed by search engines, thus improving visibility across multiple digital surfaces, such as voice and visual searches.
The use of AI and automation in these processes is increasing, enabling more dynamic interactions with content and personalized user experiences.
Whether we like it or not, AI will help us compare options faster, run deep searches effortlessly, and make transactions without passing through a website.
The future of SEO is promising. The SEO service market size is expected to grow from $75.13 billion in 2023 to $88.91 billion in 2024 – a staggering CAGR of 18.3% (according to The Business Research Company) – as it adapts to incorporate reliable AI and semantic technologies.
These innovations support the creation of more dynamic and responsive web environments that adeptly cater to user needs and behaviors.
However, the journey hasn’t been without challenges, especially in large enterprise settings. Implementing AI solutions that are both explainable and strategically aligned with organizational goals has been a complex task.
Building effective AI involves aggregating relevant data and transforming it into actionable knowledge.
This knowledge differentiates an organization from competitors using similar language models or development patterns, such as conversational agents or retrieval-augmented generation copilots, and enhances its unique value proposition.
Imagine an ontology as a giant instruction manual for describing specific concepts. In the world of SEO, we deal with a lot of jargon, right? Topicality, backlinks, E-E-A-T, structured data – it can get confusing!
An ontology for SEO is a giant agreement on what all those terms mean. It’s like a shared dictionary, but even better. This dictionary doesn’t just define each word. It also shows how they all connect and work together. So, “queries” might be linked to “search intent” and “web pages,” explaining how they all play a role in a successful SEO strategy.
Imagine it as untangling a big knot of SEO practices and terms and turning them into a clear, organized map – that’s the power of ontology!
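The "dictionary that also shows how terms connect" idea can be sketched with a handful of subject, relationship, object statements. The term and relationship names below are illustrative assumptions of ours, not the published SEOntology vocabulary.

```python
from collections import defaultdict

# Hypothetical terms: illustrative names, not the published SEOntology vocabulary.
TRIPLES = [
    ("Query", "hasSearchIntent", "SearchIntent"),
    ("Query", "isAnsweredBy", "WebPage"),
    ("WebPage", "hasInternalLink", "WebPage"),
    ("WebPage", "mentions", "Entity"),
]


def related_concepts(concept: str) -> dict[str, list[str]]:
    """Return the relationships that start from a concept, grouped by name."""
    relations = defaultdict(list)
    for subject, predicate, obj in TRIPLES:
        if subject == concept:
            relations[predicate].append(obj)
    return dict(relations)


print(related_concepts("Query"))
# {'hasSearchIntent': ['SearchIntent'], 'isAnsweredBy': ['WebPage']}
```

Even at this toy scale, asking for "Query" returns its links to search intent and web pages, which is the map-like behavior a plain glossary cannot offer.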
While Schema.org is a fantastic example of a linked vocabulary, it focuses on defining specific attributes of a web page, like content type or author. It excels at helping search engines understand our content. But what about how we craft links between web pages?
What about the query a web page is most often searched for? These are crucial elements in our day-to-day work, and an ontology can be a shared framework for them as well. Think of it as a playground where everyone is welcome to contribute on GitHub, similar to how the Schema.org vocabulary evolves.
The idea of an ontology for SEO is to augment Schema.org with an extension, similar to what GS1 did by creating its own vocabulary. So, is it a database, a collaboration framework, or something else? It is all of these things together. The SEO ontology operates like a collaborative knowledge base.
It acts as a central hub where everyone can contribute their expertise to define key SEO concepts and how they interrelate. By establishing a shared understanding of these concepts, the SEO community plays a crucial role in shaping the future of human-centered AI experiences.
SEOntology: a snapshot (see an interactive visualization here).
Screenshot from WebVOWL, August 2024
The Data Interoperability Challenge In The SEO Industry
Let’s start small and review the benefits of a shared ontology with a practical example (here is a slide taken from Emilija Gjorgjevska’s presentation at this year’s ZagrebSEOSummit).
Image from Emilija Gjorgjevska, ZagrebSEOSummit, August 2024
Imagine your colleague Valentina uses a Chrome extension to export data from Google Search Console (GSC) into Google Sheets. The data includes columns like “ID,” “Query,” and “Impressions” (as shown on the left). But Valentina collaborates with Jan, who’s building a business layer using the same GSC data. Here’s the problem: Jan uses a different naming convention (“UID,” “Name,” “Impressionen,” and “Klicks”).
Now, scale this scenario up. Imagine working with n different data partners, tools, and team members, all using various languages. The effort to constantly translate and reconcile these different naming conventions becomes a major obstacle to effective data collaboration.
Significant value gets lost in just trying to make everything work together. This is where an SEO ontology comes in. It is a common language, providing a shared name for the same concept across different tools, partners, and languages.
By eliminating the need for constant translation and reconciliation, an SEO ontology streamlines data collaboration and unlocks the true value of your data.
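The Valentina and Jan scenario can be sketched in a few lines: each local column name is mapped once to a shared term, and every export is renamed through that mapping. The `seo:` term names are hypothetical, the sample rows are made up, and treating Jan's "Name" column as the query text is our assumption.

```python
# Hypothetical shared terms; the real SEOntology property names may differ.
SHARED_TERMS = {
    "ID": "seo:identifier",
    "UID": "seo:identifier",
    "Query": "seo:query",
    "Name": "seo:query",  # assumption: Jan's "Name" column holds the query text
    "Impressions": "seo:impressions",
    "Impressionen": "seo:impressions",
    "Klicks": "seo:clicks",
}


def harmonize(row: dict) -> dict:
    """Rename known columns to shared terms; keep unknown columns unchanged."""
    return {SHARED_TERMS.get(column, column): value for column, value in row.items()}


# Made-up sample rows that follow the two naming conventions.
valentina_row = {"ID": 1, "Query": "seo ontology", "Impressions": 120}
jan_row = {"UID": 1, "Name": "seo ontology", "Impressionen": 120, "Klicks": 8}

print(harmonize(valentina_row))
print(harmonize(jan_row))
```

After harmonization, both rows share the same keys for identifier, query, and impressions, so they can be joined or compared without a per-partner translation step.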
The Genesis Of SEOntology
In the last year, we have witnessed the proliferation of AI Agents and the wide adoption of Retrieval Augmented Generation (RAG) in all its different forms (Modular, Graph RAG, and so on).
RAG represents an important leap forward in AI technology, addressing a key limitation of traditional large language models (LLMs) by letting them access external knowledge.
Traditionally, LLMs are like libraries with one book – limited by their training data. RAG unlocks a vast network of resources, allowing LLMs to provide more comprehensive and accurate responses.
RAG improves factual accuracy and context understanding, potentially reducing bias. While promising, RAG faces challenges in data security, accuracy, scalability, and integration, especially in the enterprise sector.
For successful implementation, RAG requires high-quality, structured data that can be easily accessed and scaled.
We’ve been among the first to experiment with AI Agents and RAG powered by the Knowledge Graph in the context of content creation and SEO automation.
Screenshot from Agent WordLift, August 2023
Knowledge Graphs (KGs) Are Indeed Gaining Momentum In RAG Development
Microsoft’s GraphRAG and solutions like LlamaIndex demonstrate this. Baseline RAG struggles to connect information across disparate sources, hindering tasks requiring a holistic understanding of large datasets.
KG-powered RAG approaches like the one offered by LlamaIndex in conjunction with WordLift address this by creating a knowledge graph from website data and using it alongside the LLM to improve response accuracy, particularly for complex questions.
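As a rough illustration of the idea, the sketch below retrieves facts from a toy knowledge graph by following links from the entities named in a question, then places those facts in the prompt. It is not the API of GraphRAG, LlamaIndex, or WordLift; the graph data, function names, and the two-hop default are all assumptions made for the example, and no LLM is actually called.

```python
# Toy knowledge graph as (subject, predicate, object) triples; all data is made up.
GRAPH = [
    ("Barolo", "is a", "red wine"),
    ("Barolo", "is produced in", "Piedmont"),
    ("Barolo", "is made from", "Nebbiolo"),
    ("Nebbiolo", "is a", "grape variety"),
]


def retrieve_facts(question: str, hops: int = 2) -> list[str]:
    """Collect facts about entities named in the question, following links for `hops` steps."""
    lowered = question.lower()
    # Start from graph subjects that are literally mentioned in the question.
    frontier = {s for s, _, _ in GRAPH if s.lower() in lowered}
    facts, seen = [], set()
    for _ in range(hops):
        next_frontier = set()
        for subject, predicate, obj in GRAPH:
            if subject in frontier and (subject, predicate, obj) not in seen:
                seen.add((subject, predicate, obj))
                facts.append(f"{subject} {predicate} {obj}.")
                next_frontier.add(obj)  # the object may be a subject elsewhere
        frontier = next_frontier
    return facts


def build_prompt(question: str) -> str:
    """Combine the retrieved facts and the question into a prompt for any LLM."""
    context = "\n".join(retrieve_facts(question))
    return f"Answer using only these facts:\n{context}\n\nQuestion: {question}"


print(build_prompt("Which grape is Barolo made from?"))
```

The second hop is what pulls in the fact about Nebbiolo even though the question never mentions it, which is the kind of cross-source connection that retrieval over isolated text chunks tends to miss.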
Image from author, August 2024
We have tested workflows with clients in different verticals for over a year.
From keyword research for large editorial teams to the generation of questions and answers for ecommerce websites, from content bucketing to drafting the outline of a newsletter or revamping existing articles, we’ve been testing different strategies and learned a few things along the way:
1. RAG Is Overhyped
It is simply one of many development patterns that achieve a goal of higher complexity. A RAG (or Graph RAG) is meant to help you save time finding an answer. It’s brilliant but doesn’t solve any marketing tasks a team must handle daily. You need to focus on the data and the data model.
While there are good RAGs and bad RAGs, the key differentiation is often represented by the “R” part of the equation: the Retrieval. Primarily, the retrieval differentiates a fancy demo from a real-world application, and behind a good RAG, there is always good data. Data, though, is not just any type of data (or graph data).
It is built around a coherent data model that makes sense for your use case. If you build a search engine for wines, you need to get the best dataset and model the data around the features a user will rely on when looking for information.
So, data is important, but the data model is even more important. If you are building an AI Agent that has to do things in your marketing ecosystem, you must model the data accordingly. You want to represent the essence of web pages and content assets.
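Here is a minimal sketch of what "modeling the data accordingly" could look like for web pages. The class and field names are hypothetical choices of ours, picked to reflect concepts mentioned in this article (queries, entities, internal links); a real model would follow the shared ontology.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Query:
    text: str
    intent: str  # e.g. "informational" or "transactional"
    impressions: int = 0


@dataclass
class WebPage:
    url: str
    title: str
    entities: list[str] = field(default_factory=list)
    queries: list[Query] = field(default_factory=list)
    internal_links: list[str] = field(default_factory=list)

    def top_query(self) -> Optional[Query]:
        """Return the query with the most impressions, or None if there are none."""
        return max(self.queries, key=lambda q: q.impressions, default=None)


# Made-up example page.
page = WebPage(
    url="https://example.com/barolo-guide",
    title="A Guide to Barolo",
    entities=["Barolo", "Nebbiolo", "Piedmont"],
    queries=[
        Query("what is barolo", "informational", impressions=900),
        Query("buy barolo online", "transactional", impressions=250),
    ],
)
print(page.top_query().text)  # what is barolo
```

With a model like this, an agent can answer operational questions ("which query does this page mostly serve?") directly from the data instead of guessing from raw text.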
Image from author, August 2024
2. Not Everyone Is Great At Prompting
Expressing a task in written form is hard. Prompt engineering is moving at full speed towards automation (here is my article on going from prompting to prompt programming for SEO), as only a few experts can write the prompt that brings us to the expected outcome.
This poses several challenges for the design of the user experience of autonomous agents. Jakob Nielsen has been very vocal about the negative impact of prompting on the usability of AI applications:
“One major usability downside is that users must be highly articulate to write the required prose text for the prompts.”
Even in rich Western countries, statistics provided by Nielsen tell us that only 10% of the population can fully utilize AI!
Simple Prompt Using Chain-of-Thought (CoT)
“Explain step-by-step how to calculate the area of a circle with a radius of 5 units.”

More Sophisticated Prompt Combining Graph-of-Thought (GoT) and Chain-of-Knowledge (CoK)
“Using the Graph-of-Thought (GoT) and Chain-of-Knowledge (CoK) techniques, provide a comprehensive explanation of how to calculate the area of a circle with a radius of 5 units. Your response should:
Start with a GoT diagram that visually represents the key concepts and their relationships, including: Circle, Radius, Area, Pi (π), Formula for circle area.
Follow the GoT diagram with a CoK breakdown that:
  a) Defines each concept in the diagram
  b) Explains the relationships between these concepts
  c) Provides the historical context for the development of the circle area formula
Present a step-by-step calculation process, including:
  a) Stating the formula for the area of a circle
  b) Explaining the role of each component in the formula
  c) Showing the substitution of values
  d) Performing the calculation
  e) Rounding the result to an appropriate number of decimal places
Conclude with practical applications of this calculation in real-world scenarios.
Throughout your explanation, ensure that each step logically follows the previous one, creating a clear chain of reasoning from basic concepts to the final result.”

This improved prompt incorporates GoT by requesting a visual representation of the concepts and their relationships. It also employs CoK by asking for definitions, historical context, and connections between ideas. The step-by-step breakdown and real-world applications further enhance the depth and practicality of the explanation.
3. You Shall Build Workflows To Guide The User
The lesson learned is that we must build detailed standard operating procedures (SOPs) and written protocols that outline the steps and processes to ensure consistency, quality, and efficiency in executing particular optimization tasks.
We can see empirical evidence of this in the rise of prompt libraries, like the one offered to users of Anthropic models, and in the incredible success of projects like AIPRM.
In reality, we learned that what creates business value is a series of steps that help users translate the context they are navigating into a consistent task definition.
We can start to think of marketing tasks like conducting keyword research as a Standard Operating Procedure that guides the user across multiple steps (here is how we envision the SOP for keyword discovery using Agent WordLift).
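One way to picture an SOP in code is as an ordered list of small steps, each enriching a shared context until a consistent task definition emerges. The sketch below is purely illustrative: the step names, the default market, and the wording of the task are our assumptions, not the actual SOP used by Agent WordLift.

```python
from typing import Callable

Context = dict
Step = Callable[[Context], Context]


def collect_seed_topic(ctx: Context) -> Context:
    # A real workflow would ask the user; here the topic comes from the initial context.
    return {**ctx, "seed": ctx["topic"].strip().lower()}


def choose_market(ctx: Context) -> Context:
    # Fall back to a default market when the user has not picked one (assumed default).
    return {**ctx, "market": ctx.get("market", "en-US")}


def define_task(ctx: Context) -> Context:
    task = f"Find keyword ideas for '{ctx['seed']}' in market {ctx['market']}."
    return {**ctx, "task": task}


# Hypothetical SOP for keyword discovery: the order of the steps is the procedure.
KEYWORD_DISCOVERY_SOP: list[Step] = [collect_seed_topic, choose_market, define_task]


def run_sop(steps: list[Step], ctx: Context) -> Context:
    """Run each step in order, passing the enriched context along."""
    for step in steps:
        ctx = step(ctx)
    return ctx


result = run_sop(KEYWORD_DISCOVERY_SOP, {"topic": "  Knowledge Graphs "})
print(result["task"])
# Find keyword ideas for 'knowledge graphs' in market en-US.
```

The user never has to write the final prompt: they answer a few guided questions, and the workflow assembles the task definition for them.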
4. The Great Shift To Just-in-Time UX
In traditional UX design, information is pre-determined and can be organized in hierarchies, taxonomies, and pre-defined UI patterns. As AI becomes the interface to the complex world of information, we’re witnessing a paradigm shift.
UI topologies tend to disappear, and the interaction between humans and AI remains predominantly dialogic. Just-in-time assisted workflows can help the user contextualize and improve a workflow.