This means that someone who’s interested in the word “joker” as it applies to contract clauses will have to do an additional search to find what they’re looking for (e.g., “meaning of joker when referring to contract clauses”).
Which is better?
Well, it depends.
If searchers interested in playing cards or in people who tell lots of jokes make up more than 90% of the people who enter this query, then Google’s result might be the better of the two.
As it is, I scored the ChatGPT search result a bit higher than Google’s for this query.
Another form of disambiguation failure is not addressing the ambiguity at all. Consider the example query: “where is the best place to buy a router?”
Here is how ChatGPT search addressed it:
You might think this result is perfect, but “router” also refers to a tool used in woodworking projects.
I use one frequently as a part of building furniture from scratch (true story).
There is a large enough audience of people who use these types of routers that I hope to see recognition of this in the SERPs.
Here is Google’s response to the query:
This part of the SERP is followed by:
Google focuses on the internet router to the same degree as ChatGPT.
For this class of queries, I assigned these scores:
ChatGPT search: 6.00.
Google: 5.29.
Query type: Maintaining context in query sequences
Another interesting aspect of search is that users tend to enter queries in sequences.
Sometimes those query sequences contain information that helps clarify the searcher’s intent.
An example query sequence is as follows:
What is the best router to use for cutting a circular table top?
Where can I buy a router?
As we’ve seen, the default assumption when people speak about routers is that they mean networking devices that connect other devices to a single internet source.
However, a different type of tool, also called a router, is used in woodworking.
In the query sequence above, the reference to cutting a circular table top should make it clear that the user’s interest is in the woodworking type of router.
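The failure here isn’t in either answer taken alone; it’s in carrying the session context forward. As a purely hypothetical illustration, even a crude keyword-overlap heuristic over the session history would pick the right sense of “router” in this case (the sense keywords below are invented for the example; real search systems use far richer session and behavioral signals):

```python
# Hypothetical sketch: disambiguating "router" from earlier queries in the
# same session. The sense keywords are invented for illustration only.

SENSE_KEYWORDS = {
    "networking router": {"wifi", "internet", "modem", "network", "ethernet"},
    "woodworking router": {"cutting", "wood", "table", "circular", "edge", "bit"},
}

def disambiguate(session_queries: list[str]) -> str:
    """Pick the sense whose keywords overlap most with the session so far."""
    words = set(" ".join(session_queries).lower().split())
    return max(SENSE_KEYWORDS, key=lambda sense: len(SENSE_KEYWORDS[sense] & words))

session = [
    "What is the best router to use for cutting a circular table top?",
    "Where can I buy a router?",
]
print(disambiguate(session))  # -> "woodworking router"
```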
ChatGPT’s response to the first query was to mention two specific models of routers and the general characteristics of different types of woodworking routers.
Then the response to “where can I buy a router” was a map with directions to Staples and the following content:
All of the context from the first query was lost.
Sadly, Google only performed slightly better.
It identified three locations, two of which focused on networking routers and one of which focused on woodworking routers (Home Depot):
For this query, I scored the tools this way:
ChatGPT search: 2.00.
Google: 3.00.
Query type: Assumed typos
Another interesting example is queries where your search term is relatively rare but has a spelling similar to a much more common word.
For this issue, my search was: “Please discuss the history of the pinguin.”
The Pinguin was a commerce raider used by the German Navy in World War 2. It just has a spelling very similar to “penguin,” which is an aquatic flightless bird.
Both ChatGPT and Google simply assumed that I meant “penguin” and not “pinguin.”
Here is the result from ChatGPT:
The result continues beyond what I’ve shown here, but it keeps focusing on the bird, not the ship.
Google makes the same mistake:
After the AI Overview and the featured snippet I’ve shown here, the SERPs continue to show more results focused on our flightless friends.
To be fair, I’ve referred to this as a mistake, but the reality is that the percentage of people who enter “pinguin” as a simple misspelling of “penguin” is probably far greater than the percentage who actually mean the German Navy’s WW2 commerce raider.
However, you’ll notice that Google does one thing just a touch better than ChatGPT here.
At the top of the results, it acknowledges that it corrected “pinguin” to “penguin” and allows you to change it back.
The other way I addressed the problem was to run a second query: “Please discuss the history of the pinguin in WW2.” This time, both ChatGPT and Google gave results on the WW2 commerce raider.
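The frequency argument above is essentially the noisy-channel intuition behind “did you mean”: a candidate’s score is roughly its prior probability (how common the word is) times the probability of the observed typo. Here’s a minimal sketch, with invented frequencies, showing why “penguin” can win even though “pinguin” matches the input exactly:

```python
# Illustrative noisy-channel sketch: score = word frequency x typo likelihood.
# The frequencies below are invented for illustration only.

def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                  # delete
                            curr[j - 1] + 1,              # insert
                            prev[j - 1] + (ca != cb)))    # substitute
        prev = curr
    return prev[-1]

# Hypothetical relative frequencies: "penguin" is vastly more common.
FREQ = {"penguin": 1_000_000, "pinguin": 50}

def best_correction(typed: str) -> str:
    # Each edit is heavily penalized; frequency acts as the prior.
    return max(FREQ, key=lambda w: FREQ[w] * (0.001 ** edit_distance(typed, w)))

print(best_correction("pinguin"))  # -> "penguin", despite the exact match
```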
For this query, I assigned these scores:
ChatGPT search: 2.00.
Google: 3.00.
Query type: Multiple options are a better experience
There are many queries where a single response, however well thought out, is probably not what the searcher is looking for.
Consider, for example, a query like: “smoked salmon recipe.”
Even though the query is in the singular, there is little chance that anyone serious about cooking wants to see a single answer.
This type of searcher is looking for ideas and wants to look at several options before deciding what they want to do.
They may want to combine ideas from multiple recipes before they have what they want.
Let’s look at the response from ChatGPT search:
I’ve included the first three screens of the response (out of four). Here, you can see that ChatGPT search provides one specific recipe from a site called Honest Food.
In addition, I see some things that don’t align with my experience.
For example, this write-up recommends cooking the salmon to 140 degrees Fahrenheit, a temperature at which the salmon is already beginning to dry out a bit.
From what I see on the Honest Food site, they suggest a range of possible temperatures starting as low as 125.
In contrast, Google offers multiple recipes that you can access from the SERPs:
This is an example of a query that I scored in Google’s favor, as having multiple options is what I believe most searchers will want.
The scores I assigned were:
ChatGPT search: 4.00.
Google: 8.00.
Types of problems
Next, we’ll examine the types of things that can go wrong. I looked for these issues while scoring the results.
The analysis noted where I found problems that generative AI tools are known for, as well as potential areas of weakness in Google’s SERPs.
These included:
Errors.
Omissions.
Weaknesses.
Incomplete coverage.
Insufficient follow-on resources.
Problem type: Errors
This is what the industry refers to as “hallucinations,” meaning that the information provided is simply wrong.
Sometimes errors don’t occur in “Your Money or Your Life” (YMYL) situations, but they still give the user incorrect information.
Consider how ChatGPT search responds to a query asking about the NFL’s overtime rules:
Notice the paragraph discussing how Sudden Death works. Unfortunately, it’s not correct.
It doesn’t account for the case where the team with the first possession kicks a field goal. The second team then gets a possession: if it fails to score, the first team wins, but if it kicks a field goal, the game is tied.
Only after that tying field goal does the next score win the game.
This nuance is missed by ChatGPT search.
Note: The information on the NFL Operations page that ChatGPT search used as a source is correct.
Google’s AI Overview also has an error in it:
In the second line, where Google outlines “some other NFL overtime rules,” it notes that the game ends if the first team to possess the ball scores a touchdown.
This is true for regular season games but not true in the postseason, where both teams always get an opportunity to possess the ball.
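To make both nuances concrete, here’s a rough sketch encoding the sudden-death logic as described above. It’s a deliberate simplification: it models only each team’s opening possession and ignores safeties, the clock, and other edge cases.

```python
# Rough sketch of the two NFL overtime nuances discussed above.
# Possession outcomes are limited to "TD", "FG", or "none".

POINTS = {"TD": 6, "FG": 3, "none": 0}

def overtime_state(first: str, second: str, postseason: bool) -> str:
    """Outcome after each team's opening overtime possession."""
    # Regular season only: a touchdown on the opening possession ends it.
    # In the postseason, both teams always get a possession (the detail
    # Google's AI Overview missed).
    if first == "TD" and not postseason:
        return "first team wins immediately"
    if POINTS[first] > POINTS[second]:
        return "first team wins"
    if POINTS[second] > POINTS[first]:
        return "second team wins"
    # Tied after both opening possessions (e.g., FG then FG): only now is
    # it true sudden death (the detail ChatGPT search missed).
    return "next score wins"

print(overtime_state("FG", "FG", postseason=False))   # -> "next score wins"
print(overtime_state("TD", "none", postseason=True))  # -> "first team wins"
```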
Scores were as follows:
ChatGPT search: 3.00.
Google: 4.00.
Problem type: Omissions
This type of issue arises when important information that belongs in the response is left out.
Here is an example where ChatGPT search does this:
Under Pain Management, there is no mention of Tylenol as a part of a pain management regimen.
This is an unfortunate omission, as many people use only a mix of Tylenol and ibuprofen to manage the pain after a meniscectomy.
Scores were as follows:
ChatGPT search: 6.00.
Google: 5.00.
Problem type: Weaknesses
I used “weaknesses” to cover cases where aspects of the result could have been more helpful to the searcher, but where the issue couldn’t properly be called an error or omission.
Here is an example of an AI Overview that illustrates this:
The weakness of this outline is that charging the battery makes the most sense as the first step.
Since charging takes up to six hours to complete, it’s not that useful to set up the app before that step is done.
Here is how I scored these two responses:
ChatGPT search: 3.00.
Google: 5.00.
Problem type: Incomplete coverage
I used this category to identify results that failed to cover a significant user need for a query.
Note that “significant” is subjective, but I tried to use this only when many users would need a second query to get what they were looking for.
Here is an example of this from a Google SERP.
The results are dominated by Google Shopping (as shown above).
Below what I’ve shown, Google has two ads offering online buying opportunities and two pages from the Riedl website.
This result will leave a user who needs the glasses today and therefore wants to shop locally without an answer to their question.
ChatGPT search did a better job with this query as it listed both local retailers and online shopping sites.
Scores for this query:
ChatGPT search: 6.00.
Google: 4.00.
Problem type: Insufficient follow-on resources
As discussed in “How query sessions work” earlier in this article, it’s quite common that users will try a series of queries to get all the information they’re looking for.
As a result, a great search experience will facilitate that process.
This means providing a diverse set of resources that makes it easy for users to research and find what they want or need. When those resources aren’t easily accessible, the experience is a poor one.
As an example, let’s look at how ChatGPT search responds to the query “hotels in San Diego”:
While this provides 11 hotels as options, there are far more than this throughout the San Diego area.
It’s also based on a single source: Kayak.
The user can click through to the Kayak site to get a complete list, but other resources aren’t made available to the user.
In contrast, Google’s results show many different sites that users can draw on to find what they want. The scores I assigned to the competitors for this one were:
ChatGPT search: 3.00.
Google: 6.00.
The winner?
It’s important to note that this analysis is based on a small sample of 62 queries, which is far too limited to draw definitive conclusions about all search scenarios.
A broader takeaway can be gained by reviewing the examples above to see where each platform tends to perform better.
Here’s a breakdown of category winners:
1. Informational queries
Queries: 42
Winner: Google
Google’s average score: 5.83
ChatGPT search’s average score: 5.19
Google’s slight edge aligns with its strong track record for informational searches.
However, ChatGPT search performed respectably, despite challenges with errors, omissions, and incomplete responses.
2. Content gap analysis
Winner: ChatGPT search
ChatGPT search’s average score: 3.25
Google’s average score: 1.00
ChatGPT search excels in content gap analysis and related tasks, making it particularly useful for content creators. Winning use cases include:
Content gap analysis
Standalone content analysis
Comparing direct or indirect SERP competitors
Suggesting article topics and outlines
Identifying facts/statistics with sources
Recommending FAQs for articles
While ChatGPT search outperformed Google in this category, its lower overall score highlights areas where improvements are needed, such as accuracy.
3. Navigational queries
Navigational queries were excluded from the test since they typically don’t require detailed text responses.
Google’s dominance in this category is assumed based on its straightforward, website-focused results.
4. Local search queries
Winner: Google
Google’s average score: 6.25
ChatGPT search’s average score: 2.00
Google’s extensive local business data, combined with tools like Google Maps and Waze, ensures its superiority in this category.
5. Commercial queries
Winner: Google
Google’s average score: 6.44
ChatGPT search’s average score: 3.81
This category, comprising 16 queries, favored Google due to its stronger capabilities in showcasing product and service-related results.
6. Disambiguation queries
Winner: ChatGPT search
ChatGPT search’s average score: 6.00
Google’s average score: 5.29
ChatGPT search edged out Google by more effectively presenting multiple definitions or interpretations of ambiguous terms, providing users with greater clarity.
These scores are summarized in the following table:

Query category               Google   ChatGPT search   Winner
Informational (42 queries)   5.83     5.19             Google
Content gap analysis         1.00     3.25             ChatGPT search
Navigational                 n/a      n/a              Google (assumed)
Local search                 6.25     2.00             Google
Commercial (16 queries)      6.44     3.81             Google
Disambiguation               5.29     6.00             ChatGPT search
Summary
After a detailed review of 62 queries, I still see Google as the better solution for most searches.
ChatGPT search is surprisingly competitive when it comes to informational queries, but Google edged ChatGPT search out here too.
Note that 62 queries is a tiny sample when considered against the scope of all search.
Nonetheless, as you consider your search plans going forward, I’d advise you to do a segmented analysis like the one I did here before deciding which platform is the better choice for your projects.
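If you want to run that kind of segmented analysis yourself, the scoring rollup is straightforward. Here’s a minimal sketch; the category names mirror the ones used above, but the per-query scores are placeholders you’d replace with your own:

```python
# Minimal sketch: roll per-query scores up into category averages, as in
# the segmented comparison above. All scores here are placeholders.
from collections import defaultdict
from statistics import mean

# (category, google_score, chatgpt_score) for each test query.
results = [
    ("informational", 7.0, 6.0),
    ("informational", 5.0, 5.5),
    ("commercial", 6.5, 4.0),
    ("disambiguation", 5.0, 6.0),
]

by_category: dict[str, list[tuple[float, float]]] = defaultdict(list)
for category, google, chatgpt in results:
    by_category[category].append((google, chatgpt))

for category, scores in by_category.items():
    g = mean(s[0] for s in scores)
    c = mean(s[1] for s in scores)
    winner = "Google" if g > c else "ChatGPT search" if c > g else "tie"
    print(f"{category}: Google {g:.2f} vs. ChatGPT search {c:.2f} -> {winner}")
```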
Contributing authors are invited to create content for Search Engine Land and are chosen for their expertise and contribution to the search community. Our contributors work under the oversight of the editorial staff and contributions are checked for quality and relevance to our readers. The opinions they express are their own.