Avoid non-descriptive anchor text, such as “here” or “this article”; instead, give users and search engines some context about the page being linked to.
Avoid internal links without context, such as automatically linking the first or second instance of a word or phrase on each page to one specific page.
Use Ahrefs’ Internal Link Opportunities tool or Google search (site:[yourdomain.com] “keyword”) to find linking opportunities.
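To audit anchor text at scale, a minimal sketch using requests and BeautifulSoup can flag non-descriptive anchors on a page. The URL and the anchor list below are illustrative assumptions; tailor both to your own site:

```python
import requests
from bs4 import BeautifulSoup

# Anchor texts that give search engines (and users) no context.
GENERIC_ANCHORS = {"here", "click here", "this article", "read more", "learn more"}

def flag_generic_anchors(url: str) -> list[tuple[str, str]]:
    """Return (anchor_text, href) pairs whose anchor text is non-descriptive."""
    html = requests.get(url, timeout=10).text
    soup = BeautifulSoup(html, "html.parser")
    flagged = []
    for link in soup.find_all("a", href=True):
        text = link.get_text(strip=True).lower()
        if text in GENERIC_ANCHORS:
            flagged.append((text, link["href"]))
    return flagged

# Example usage (hypothetical URL):
for text, href in flag_generic_anchors("https://www.example.com/blog/post"):
    print(f"Generic anchor '{text}' -> {href}")
```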
Image Optimization
Many overlook image SEO, but optimizing images can improve page load speeds and, if it matters to your business, your visibility within image search. A good SOP should include:
Using descriptive file names, and not keyword stuffing them.
Writing alt text that accurately describes the image for accessibility, without including sales messaging.
Choosing the right file format and compressing images to improve load speed.
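As a rough illustration of the last point, here is a minimal sketch using Pillow to re-encode JPEGs as WebP. The directory paths and quality setting are assumptions to adapt to your own pipeline:

```python
from pathlib import Path
from PIL import Image

def compress_to_webp(src: Path, dest_dir: Path, quality: int = 80) -> Path:
    """Re-encode an image as WebP at the given quality to cut file size."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    out_path = dest_dir / (src.stem + ".webp")
    with Image.open(src) as img:
        img.save(out_path, "WEBP", quality=quality)
    return out_path

# Example: compress every JPEG in ./images into ./images/optimized.
for jpeg in Path("images").glob("*.jpg"):
    print(compress_to_webp(jpeg, Path("images/optimized")))
```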
URL Structures
Ensure URLs are optimized for search engines and users by making them clear, concise, and keyword-relevant. The SOP should cover:
Removing unnecessary stop words, punctuation, and white space characters (%20).
Using hyphens instead of underscores.
Not keyword stuffing the URLs.
Using parameters that don’t override the source or trigger a new session within Google Analytics 4.
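A minimal sketch of what the first three points can look like in practice, assuming Python; the stop-word list is illustrative, not exhaustive:

```python
import re

# A small stop-word list for illustration; tailor it to your own SOP.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for"}

def clean_slug(raw: str) -> str:
    """Build a short, hyphenated, lowercase slug from a raw title or path segment."""
    words = re.findall(r"[a-z0-9]+", raw.lower())        # strip punctuation/whitespace
    words = [w for w in words if w not in STOP_WORDS]    # drop stop words
    return "-".join(words)                               # hyphens, not underscores

print(clean_slug("A Guide to the Best Running Shoes!"))  # guide-best-running-shoes
```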
Technical Auditing Nuances
One of the more complex elements of performing a technical audit on any enterprise website with a large number of URLs is crawling.
There are a number of ways you can tackle enterprise website crawling, but two common nuances I come across are the need to perform routine sample crawls and the challenge of crawling a multi-stack domain.
Sample Crawling
Sample crawling is an efficient way to diagnose large-scale SEO issues without the overhead of a full crawl.
By using strategic sampling methods, prioritizing key sections, and leveraging log data, you can gain actionable insights while preserving crawl efficiency.
Your sample should be large enough to reflect the site’s structure but small enough to be efficient.
I typically work to the following guidelines, based on the size of the website (or of the subdomain or subfolder being audited).
| Size | Number of URLs | Sample Size |
| --- | --- | --- |
| Small | <10,000 | Crawl all, or 90%+, of the URLs. |
| Medium | 10,000 to 500,000 | 10% to 25%, depending on which end of the spectrum your number of URLs falls on. |
| Large | >500,000 | A 1-5% sample, focusing on key sections. |
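One way to translate those guidelines into code is a small helper that picks a sample size and draws a random sample. The linear scaling between 25% and 10%, and the 3% figure for large sites, are my own assumptions within the ranges above:

```python
import random

def sample_size(total_urls: int) -> int:
    """Turn the guideline table above into a URL count (thresholds are guidelines, not rules)."""
    if total_urls < 10_000:
        return total_urls                      # small site: crawl everything
    if total_urls <= 500_000:
        # Scale from 25% at the small end down to 10% at the large end (an assumption).
        fraction = 0.25 - 0.15 * (total_urls - 10_000) / (500_000 - 10_000)
        return int(total_urls * fraction)
    return int(total_urls * 0.03)              # large site: 1-5% sample; 3% as a midpoint

def draw_sample(urls: list[str]) -> list[str]:
    """Pick a random sample sized according to the guidelines."""
    return random.sample(urls, sample_size(len(urls)))
```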
You also want to choose your samples strategically, especially when your number of URLs enters the hundreds of thousands or millions. There are four main types of sampling:
Random Sampling: Select URLs randomly to get an unbiased overview of site health.
Stratified Sampling: Divide the site into key sections (e.g., product pages, blog, category pages) and sample from each to ensure balanced insights.
Priority Sampling: Focus on high-value pages such as top-converting URLs, high-traffic sections, and newly published content.
Structural Sampling: Crawl the site based on the internal linking hierarchy, starting with the homepage and main category pages.
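As an example of the stratified approach, here is a minimal sketch that groups URLs by their top-level folder and samples from each group; the per-section cap is an arbitrary assumption:

```python
import random
from collections import defaultdict
from urllib.parse import urlparse

def stratified_sample(urls: list[str], per_section: int = 500) -> list[str]:
    """Group URLs by their top-level path segment (e.g., /blog/, /products/)
    and sample up to `per_section` URLs from each group."""
    sections: dict[str, list[str]] = defaultdict(list)
    for url in urls:
        path = urlparse(url).path.strip("/")
        section = path.split("/")[0] if path else "(root)"
        sections[section].append(url)
    sample = []
    for section_urls in sections.values():
        k = min(per_section, len(section_urls))
        sample.extend(random.sample(section_urls, k))
    return sample
```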
Crawling Multi-Stack Websites
Crawling websites built on multiple stacks requires a strategy that accounts for different rendering methods, URL structures, and potential roadblocks like JavaScript execution and authentication.
This also means you can’t just crawl the website in its entirety and make broad, sweeping recommendations for the “whole website.”
The following is a top-line checklist that covers the key areas and “bases” you are likely to encounter:
Identify and map out which parts of the site are server-rendered vs. client-rendered.
Determine which areas require authentication, such as user areas.
If sections require login (e.g., product app), use session cookies or token-based authentication in Playwright/Puppeteer (see the sketch after this checklist).
Set crawl delays if rate-limiting exists.
Check for lazy-loaded content (scrolling or clicking).
Check if public API endpoints offer easier data extraction.
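For the authentication and lazy-loading points above, a minimal sketch using Playwright’s Python API; the cookie values and URLs are hypothetical placeholders:

```python
from playwright.sync_api import sync_playwright

def fetch_rendered_html(url: str, session_cookies: list[dict]) -> str:
    """Render a JavaScript-heavy, login-gated page and return its final HTML."""
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        context = browser.new_context()
        context.add_cookies(session_cookies)          # reuse an authenticated session
        page = context.new_page()
        page.goto(url, wait_until="networkidle")      # wait for network-driven content
        page.mouse.wheel(0, 5000)                     # nudge scroll-triggered lazy loading
        html = page.content()
        browser.close()
    return html

# Example: a hypothetical session cookie captured after logging in manually.
cookies = [{"name": "session_id", "value": "abc123", "domain": ".example.com", "path": "/"}]
print(len(fetch_rendered_html("https://app.example.com/dashboard", cookies)))
```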
A good example of this is a website I worked on for a number of years. It had a complex stack that required different crawling methods to identify issues at a meaningful scale.
| Stack Component | Approach |
| --- | --- |
| Nuxt | If using SSR or SSG, standard crawling works. If using client-side hydration, enable JavaScript rendering. |
| Ghost | Typically SSR, so a normal crawl should work. If using its API, consider pulling structured data for better insights. |
| Angular | Needs JavaScript rendering. Tools like Puppeteer or Playwright help fetch content dynamically. Handle infinite scrolling or lazy loading carefully. |
| Zendesk | Zendesk often has bot restrictions. Check for API access, or RSS feeds for help center articles. |
The above are extreme approaches to crawling. If your crawling tool allows you to render webpages, use it rather than reaching for tools like Puppeteer to fetch content.
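Before reaching for that heavier tooling, it can also help to check which sections actually need JavaScript rendering at all. A rough heuristic sketch, assuming requests and Playwright are available; the 0.7 threshold is an arbitrary assumption:

```python
import requests
from playwright.sync_api import sync_playwright

def needs_js_rendering(url: str, threshold: float = 0.7) -> bool:
    """Heuristic: if the raw HTML is much smaller than the rendered DOM,
    the page likely relies on client-side rendering."""
    raw_len = len(requests.get(url, timeout=10).text)
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")
        rendered_len = len(page.content())
        browser.close()
    return raw_len < rendered_len * threshold
```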
Final Thought
Working on technical SEO for large organizations presents unique challenges, but it also offers some of the most rewarding experiences and learning opportunities in the field, ones that not all SEO professionals are fortunate enough to have.
Making a lot of the “day-to-day” more manageable, and gaining buy-in from as many client stakeholders as possible, can lead to a better client-agency relationship and lay the foundations for strong SEO campaigns.