The world of SEO has gotten increasingly nuanced. If you’re here, you know that merely writing an article and buying links to it hasn’t worked in a long while.
When I’m working on my own articles or with authors, I start with outlines.
After all, it’s easier to optimize a piece that’s built to rank from the start than to reverse-engineer one after it’s written.
I’ve often focused a lot of my attention on understanding the content and structures of the top-ranking pages. Specifically, what questions they answered and what entities they contained.
I did a lot of this manually. And it took a lot of time.
Thankfully, with the rise of LLMs, a couple of APIs and a new scraper, we’re able to automate a lot of this. This can reduce the amount of time spent creating content structures that will rank. You can spend that saved time adding more of the insights that only you, as a human, can provide.
This article will walk you through a script to create article outlines based on:
The keywords you’re targeting.
The type of article you want to write.
The top entities that appear on the top-ranking websites.
The top questions answered by the top-ranking sites.
Summaries of the top-ranking sites.
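To make that concrete, here’s a rough sketch of the script’s final step: combining those signals into a single prompt for GPT-4o. The function and variable names are illustrative placeholders, not the exact code from the Colab:

```python
from openai import OpenAI  # pip install openai

client = OpenAI()  # reads OPENAI_API_KEY from your environment


def generate_outline(keyword, article_type, entities, questions, summaries):
    """Combine the collected signals into one outline request for GPT-4o."""
    question_list = "\n".join(questions)
    prompt = (
        f"Create an outline for a {article_type} targeting the keyword '{keyword}'.\n"
        f"Work in these entities where relevant: {', '.join(entities)}.\n"
        f"Make sure the outline answers these questions:\n{question_list}\n"
        f"For context, here are summaries of the top-ranking pages:\n{summaries}"
    )
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content
```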
And if you’re not interested in coding yourself, I’ve even written a Google Colab you can use. You only need to sign up for the APIs (easy) and click some play buttons.
You’ll find the Colab here.
Before we dive into the code, here’s why I base my article outlines on these features.
Why use summaries, entities and top questions for article outlines?
We could use a number of features to inspire our outlines. I chose these three because they form the foundation of what should be included in content inspired by the top 10. Here’s why:
Summaries
Summaries help you distill what the top-ranking pages are doing well, including how they address search intent.
Keen eyes will notice that I’ve focused the summaries around the heading tags in the scripts. This ensures that when we teach the system to create an outline, it will look at the heading structures of the top-ranking sites.
We’ll be using a newer library called Firecrawl, which returns page content in Markdown format. This allows us to base our understanding of the content on more advanced structural elements, including headings. This is a big leap forward.
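As a rough illustration, here’s what a Firecrawl scrape looks like with the firecrawl-py SDK. The exact call signature and response shape vary between SDK versions, so treat this as a sketch and check the current docs:

```python
from firecrawl import FirecrawlApp  # pip install firecrawl-py

app = FirecrawlApp(api_key="fc-...")  # your Firecrawl API key

# Scrape one of the top-ranking pages; this assumes the response is a
# dict with a "markdown" key holding the page content as Markdown.
result = app.scrape_url("https://example.com/top-ranking-page")
markdown = result.get("markdown", "")

# Headings survive the conversion as #, ## and ###, so we can pull
# them out directly to study the page's structure.
headings = [line for line in markdown.splitlines() if line.lstrip().startswith("#")]
```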
Entities
Entities are specific, well-defined concepts, such as “artificial intelligence,” “New York City” or even an idea like “trust.” Google recognizes entities to better understand content at a deeper level, beyond just keywords.
For SEO, this means Google connects entities within your content to related ideas, creating a web of associations. Including the top entities relevant to your topic helps you align your article with what Google sees as contextually important and related.
I rank the entities by their salience scores, which indicate how important they are to the page’s content.
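With the Cloud Natural Language client library, that looks roughly like this (it assumes your service account credentials are already configured, which we’ll cover below):

```python
from google.cloud import language_v1  # pip install google-cloud-language

client = language_v1.LanguageServiceClient()

page_text = "Your scraped page content goes here."
document = language_v1.Document(
    content=page_text,
    type_=language_v1.Document.Type.PLAIN_TEXT,
)
response = client.analyze_entities(request={"document": document})

# Rank entities by salience, i.e., how central each one is to the page.
top_entities = sorted(response.entities, key=lambda e: e.salience, reverse=True)
for entity in top_entities[:10]:
    print(f"{entity.name}: {entity.salience:.3f}")
```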
Top questions
Top questions are another key piece because they represent what real users want to know.
My thought is that, by extension, they’re also the questions Google wants answered. By answering these questions within the article, you will likely satisfy search intent more fully.
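One simple way to surface those questions is to hand the scraped Markdown to GPT-4o and ask for them. Here’s a sketch; the prompt wording and variable names are placeholders:

```python
from openai import OpenAI

client = OpenAI()

markdown_pages = ["...markdown of page 1...", "...markdown of page 2..."]
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "List the most important questions these pages answer, "
                   "one per line:\n\n" + "\n\n---\n\n".join(markdown_pages),
    }],
)
questions = response.choices[0].message.content.splitlines()
```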
Getting started
You can run this script locally on your machine using an IDE (Integrated Development Environment).
The remainder of this article is written for that environment.
The great news? You can have this all set up and running on your machine in just a few minutes.
If you’ve never used an IDE before, you can start by installing Anaconda. It does many things, but we only want to use Jupyter Notebook for now.
Download and install Anaconda here, after which you can simply launch “jupyter” to get going.
Once you’ve done that, you can jump into the tutorial.
This route will give you an easier overall experience. You can save your API keys locally and not have to enter them each time. Plus, you can edit the prompts and other variables more easily.
Alternatively, you can simply run the Colab. This is also an easy way to test whether it’s worth setting up locally.
Step 1: Getting your API keys
First, sign up for the API keys you’ll need.
What is an API?
APIs (application programming interfaces) are essentially pipelines that let different software systems communicate and share data.
Think of them as a structured way to request specific data or functions from a service (like Google or OpenAI) and get back exactly what you need, whether that’s analyzing entities on a page, generating text or scraping content.
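For example, a single HTTPS request to Google’s Custom Search JSON API returns the top results for a query as structured JSON. The key and engine ID below are placeholders you’ll create in a moment:

```python
import requests  # pip install requests

params = {
    "key": "YOUR_API_KEY",            # Google API key (created below)
    "cx": "YOUR_SEARCH_ENGINE_ID",    # custom search engine ID (created below)
    "q": "best trail running shoes",  # the keyword you're targeting
    "num": 10,                        # ask for the top 10 results
}
response = requests.get("https://www.googleapis.com/customsearch/v1", params=params)
urls = [item["link"] for item in response.json().get("items", [])]
```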
The APIs we will need are:
Custom Search and Cloud Natural Language APIs (see below): These let you manage your custom search engine and extract entities from pages. They’re paid, but even using them a lot, my bill is just a couple of dollars per month.
OpenAI: This is for content analysis and creation via GPT-4o. The API is low-cost (a dollar or two per month under regular use) and has a free trial.
Firecrawl: This is for scraping webpages. At this usage level, the free tier should work just fine, but if you start using it heavily, they have inexpensive paid options available.
Weights & Biases (disclosure: I’m the head of SEO at Weights & Biases): Sign up and obtain your API key. The free option will do everything we need.
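Authenticating with Weights & Biases from Python takes just a couple of lines. Exactly what the script tracks is beyond this overview, so treat the project name and logged values here as hypothetical:

```python
import wandb  # pip install wandb

wandb.login(key="YOUR_WANDB_API_KEY")  # or run `wandb login` once in a terminal

# Start a run to track your outline generations (project name is hypothetical).
run = wandb.init(
    project="seo-article-outlines",
    config={"keyword": "best trail running shoes"},
)
run.finish()
```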
Custom Search and Google Cloud Natural Language APIs
I found getting these APIs set up a bit non-intuitive, so I wanted to save you the 10 minutes of figuring it out. Here’s a step-by-step guide to getting set up with the Google APIs:
To set up your search engine, just follow the quick instructions at https://developers.google.com/custom-search/docs/tutorial/creatingcse.
Google API key
The API key gives you access to the search engine. Again, it’s easy to set up, and you can do so at https://support.google.com/googleapi/answer/6158862?hl=en.
When you’re in the console, simply click Enable APIs and services.
And you’ll want to enable the Custom Search API and Cloud Natural Language API.
You’ll also need to set up your credentials. Click Create credentials, then select API key from the drop-down.
Copy the key to a Notepad file or similar. We’ll need it again in a moment, so I generally just paste mine into an unsaved doc.
For good measure, I recommend clicking on the API key you just created. It will have an orange triangle beside it, indicating that it’s unrestricted.
You can click it, set the key to restricted, and give it access to just the Custom Search API to help safeguard against misuse.
Google service account
While you’re on this screen, you can set up the service account.
Again, you’ll click on Create credentials, but instead of API key, you’ll click Service account.
You’ll just need to name your project and select the role.
As I’m the only one with access to my projects and machine, I just set the role to Owner. You may want to choose otherwise. You can find out more about the roles here.
Once you’ve created the service account, you need to create a key for it. If you’re not automatically prompted to do so, simply click on the service account you just created.
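Once you’ve downloaded the service account’s JSON key, the easiest way to wire it up is the standard GOOGLE_APPLICATION_CREDENTIALS environment variable. The file path below is a placeholder for wherever you save yours:

```python
import os
from google.cloud import language_v1

# Point the client library at the JSON key you just downloaded.
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "path/to/service-account-key.json"

client = language_v1.LanguageServiceClient()  # picks up the credentials automatically
```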