How To Conduct a Blog Content Audit

A blog is a great way to regularly produce content that helps new people discover your site and keep returning users engaged. Over time, however, you may find your blog falling into the trap of producing repetitive, disengaging, and, dare I say it, useless content.

If this does happen, you may wind up with multiple pages targeting the same keywords, competing with each other in search, and ultimately cannibalizing your site's visibility for those keywords altogether.

I'm going to walk through how I conduct a blog content audit to uncover which blog posts are yielding great results, and to build a plan of action for the ones that are underperforming, or possibly even dragging down organic search results altogether.

I Heard a Rumor That You Use Screaming Frog

I’m pretty sure I mention Screaming Frog just about every time I write a blog post. In case you’re not familiar with it, it’s a fantastic tool that allows you to crawl websites and see an abundance of technical information to help with audits and analysis. As you might have guessed, we’re going to be using Screaming Frog to tackle this blog content audit, so go ahead and open it up – we’ve got some Custom Extraction rules to set up.

Limit The Crawl to the Blog

More often than not, all of a site's blog posts will fall under a specific folder (e.g. /blog/). In that case, we can use Screaming Frog to limit our site crawl to just the blog by navigating to Configuration > Include, and adding the following line:

https://www.yourwebsite.com/blog/.*

Be sure to update the URL specifically for your blog. Here’s what it looks like for UpBuild’s blog:

https://www.upbuild.io/blog/.*

If your blog posts don't happen to live under a specific hierarchy, this step won't apply to you. But don't worry: the next step might help you work around that.
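If you want to sanity-check your Include rule before crawling, the pattern is just a regular expression. Here's a quick sketch in Python showing which URLs a rule like the one above would keep in scope (the URLs are made up for illustration):

```python
import re

# The same kind of pattern you'd give Screaming Frog's Include setting.
# It's treated as a regex, so the dots in the domain are escaped here
# for an exact match on the path prefix.
include_pattern = re.compile(r"https://www\.upbuild\.io/blog/.*")

urls = [
    "https://www.upbuild.io/blog/content-audit/",
    "https://www.upbuild.io/about/",
]

# Only URLs under /blog/ survive the filter.
in_scope = [u for u in urls if include_pattern.match(u)]
print(in_scope)
```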

Extracting The Blog Post Publication Date

When it comes to cleaning up a blog, I can understand the thought process behind wanting to quite simply delete “all the old stuff”, but this can be a really big misstep. If your blog post contains content that searchers still find useful (which is what search engines care about most), then there’s not necessarily any reason to delete it just because it was published a few years ago.

Understanding the publication date for each blog post is going to help us with this audit, but not for mass deletion.

Extracting the blog post date is going to be different for everyone, but the following method, using the Google Chrome browser, should help you find yours:

  1. Open up any blog post on your site in Google Chrome.
  2. Locate and highlight the publication date that is displayed on the page.
  3. Right-click on the highlighted date, and choose "Inspect" from the menu.

This will open a DevTools window, and you should find that the HTML for the date is highlighted (see screenshot below). Next, simply right-click on the highlighted HTML and choose Copy > Copy XPath from the menu (as shown):

This copies the XPath for the publication date from the blog post, and we can use it to locate the publication dates for all of our blog posts, since they likely all use the same page template. So let's add it to Screaming Frog by navigating to Configuration > Custom > Extraction:

Now change the drop-down option in the first row from Inactive to XPath, and paste the XPath value that you copied earlier into the main text field. As you can see in the screenshot below, I've labeled this "Date Published":

I also recommend using the same method to uncover the Category for each blog post. This will help you quickly segment your blog posts by topic later on.

If all goes well, you should see something that looks similar to the screenshot below (remember the XPath values will look different for you because every site and element path is different).
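Under the hood, a Custom Extraction rule simply applies your copied XPath to each crawled page. As a minimal sketch of what that looks like, here's the same idea using Python's standard library, run against a made-up snippet of blog markup (your actual XPath and HTML will differ, which is why you copy the path from your own template via DevTools):

```python
import xml.etree.ElementTree as ET

# Hypothetical page markup -- a real page's structure will differ.
page = """<html><body>
  <article>
    <time class="published">March 3, 2020</time>
    <span class="category">SEO</span>
  </article>
</body></html>"""

tree = ET.fromstring(page)

# ElementTree supports a subset of XPath, which is enough for this sketch.
date_published = tree.find(".//time[@class='published']").text
category = tree.find(".//span[@class='category']").text
print(date_published, category)
```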

Now let’s continue setting up Screaming Frog.

Connecting to Google Analytics and Search Console APIs

One of my absolute favorite features of Screaming Frog is the ability to seamlessly connect with Google Analytics and Google Search Console APIs. Connecting to these services helps to bridge the gap between the blog posts and the invaluable information needed to conduct this audit.

When you connect to each of these API services, you'll want to make a couple of specific configuration changes.

For Google Analytics, make sure that you're set up to look at the correct Property and View containing your blog post data. It sounds obvious, but it can be an easy thing to overlook. You'll also want to ensure that you change the segment to crawl All Users, not the default setting, Organic Traffic. The key metric we want from Google Analytics is the most important one of all – Goal Completions – and we don't want to restrict that to any one channel. If the blog post is valuable, we want to keep it.

Next, you’ll want to make sure the date range settings match for both Google Analytics and Google Search Console. I’d recommend pulling two sets of data here. One for the last 16 months of data, and then another just looking at the last 3 months. This is going to help you take a broad look at the performance of your blog posts over the course of a substantial timeframe, and help you compare that to a recent active window.

Crawl and Export the Google Analytics and Search Console Data

With your date range set to 16 months, you can start the crawl for your site. Once the crawl is complete, export the data from the Google Analytics and Google Search Console tabs respectively.

Remember to save your crawl, then change the date range to the last three months, for both Google Analytics and Google Search Console, and run the crawl again. Once that’s finished, export the data from the Google Analytics and Google Search Console tabs again.

Export the Custom Extraction Data

Under the Custom Extraction tab, you should now see all of the blog posts along with the date that each of them was published and their assigned categories:

Next, to make cross-referencing the data easier, I recommend copying the data from all of the CSV files into respective tabs in the same worksheet (I like to use Google Sheets).

Compiling the Data

The goal at this stage is to end up with two sheets. One for the 16-month date range, and the other for the 3-month date range. A lot of data is going to export in the CSV files, but here are the columns of data I want to know for each URL:

  • Date Published (Screaming Frog)
  • Clicks (GSC)
  • Impressions (GSC)
  • Average CTR (GSC)
  • Average Position (GSC)
  • Goal Completions (GA)

When you put it all together you should have something that looks like the screenshot below. I recommend adding some conditional formatting to help you quickly identify top-performing and low-performing pages. The thresholds that you set here may differ based on what you consider high or low performing. I personally like to start by highlighting blog posts with an average position of 30 or under, or with an average CTR of 3% or higher, as 'top performing'.
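To make the 'top performing' rule concrete, here's a small sketch of how it could be applied to the merged rows in code rather than with conditional formatting. The column names and values are assumptions based on typical GSC/GA export headers:

```python
# Rows merged from the GSC and GA exports (values are made up).
rows = [
    {"url": "/blog/post-a/", "avg_position": 12.4, "avg_ctr": 0.051, "goal_completions": 3},
    {"url": "/blog/post-b/", "avg_position": 58.0, "avg_ctr": 0.004, "goal_completions": 0},
]

def is_top_performing(row):
    # Average position of 30 or under, OR average CTR of 3% or higher.
    return row["avg_position"] <= 30 or row["avg_ctr"] >= 0.03

for row in rows:
    row["top_performing"] = is_top_performing(row)
```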

Checking for Assisted Conversions

Although Google Analytics shows us the blog posts that directly converted, it's important to also know which blog posts helped lead to a conversion. Here are a couple of Google Analytics reports to help you uncover that information:

  • Conversions > Goals > Reverse Goal Path
    • Use the Filter to display only URLs containing “blog” (or comparable).
      • Export the report to CSV and make a note to keep each of the blog posts listed in your main worksheet.
  • Conversions > Multi-Channel Funnels > Assisted Conversions
    • Under the Primary Dimension MCF Channel Grouping, set the Secondary Dimension to Landing Page URL.
    • Set the Advanced Filter to only include Landing Page URLs containing “blog” (or comparable).
      • Export the report to CSV and make a note to keep each of the blog posts listed in your main worksheet.

Note: I’d go ahead and set the date range for these reports to the last 16 months.

Gather Keywords and Ranking Positions For Your Blog Posts

We’re now at the final stage of the data gathering process.

Knowing the Average Position is going to help you understand how a page has been performing in general, but you'll also want to know the current positions for all of your blog posts.

There are many tools out there that can help you accomplish this, but I like to use the URL Organic Search Keywords report via the SEMRush API.

I use the formula below, replacing the {API-KEY} with one supplied from an SEMRush Business Plan, and ensuring all of the blog posts being analyzed are listed vertically, starting in cell A1.

=transpose(importdata("https://api.semrush.com/?type=url_organic&key={API-KEY}&display_limit=10&export_columns=Ph,Po,Nq&url="&A1&"&database=us&display_sort=po_desc"))

For each blog post URL, this formula will display horizontally, across the next 10 columns, the top 10 keywords that the post ranks for (if any), along with the position and the estimated search volume for each keyword, sorted in descending order.

It will look something like this (Keyword; Ranking Position; Search Volume):

I recommend using the SPLIT function here to help you separate the data in each of these cells into their own columns.

=SPLIT(A2,";")

For example, this formula will split the information in a single cell, in this example A2, using the semicolon as the separator, spreading it across three columns instead:

After you’ve uncovered the keywords and ranking positions for all of your blog posts, I recommend organizing it all into four simple columns; the blog post URL, the Keyword the Page is Ranking For, the Position, and the Search Volume:

It’s a good idea to copy and paste the values once the information has populated. This will avoid the API call in the formula running again, helping you to save valuable API credits.
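If you'd rather make the same call outside of Sheets, the formula above translates fairly directly to Python: build the request URL with the same parameters, then split each "Keyword;Position;Volume" row the API returns. The sample row below is invented for illustration; a real response requires a valid API key:

```python
from urllib.parse import urlencode

def build_request_url(api_key, page_url):
    # Mirrors the parameters used in the Sheets formula above.
    params = {
        "type": "url_organic",
        "key": api_key,
        "display_limit": 10,
        "export_columns": "Ph,Po,Nq",
        "url": page_url,
        "database": "us",
        "display_sort": "po_desc",
    }
    return "https://api.semrush.com/?" + urlencode(params)

def parse_row(line):
    # The equivalent of the SPLIT step: one "Keyword;Position;Volume" row.
    keyword, position, volume = line.split(";")
    return {"keyword": keyword, "position": int(position), "volume": int(volume)}

sample = "content audit;7;1300"  # hypothetical API row
print(parse_row(sample))
```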

The Consolidation Process

With all of the information that we’ve gathered, it can be overwhelming knowing where to even begin. So I like to use some pretty hard and fast rules to begin the process. The general criteria that I use here to start the decision process for which blog posts I want to keep are as follows:

  • I keep any blog post that was published in the last full calendar year onward.
  • I keep any blog post that is ranking in the top 40 for a relevant keyword.
  • I keep any blog post that converted or assisted with a conversion.
    • It would be worth investigating and taking note of when these conversions happened. If a blog post converted one time, 16 months ago, the week it was published, and never again, you may have more work to do. I'd recommend evaluating posts like these to see what you can do to refresh them and promote them again.

The blog posts that did not meet these criteria (i.e. were published before our targeted calendar year, are ranking lower than position 40, never led to or assisted with a conversion in the last 16 months, or did not receive a click in the last 16 months), are now eligible for evaluation.
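The keep/evaluate rules above can be sketched as a simple filter. The field names and sample values here are assumptions; adjust them to match your worksheet:

```python
from datetime import date

# Keep anything published from the last full calendar year onward.
LAST_FULL_YEAR = date.today().year - 1

def should_keep(post):
    published_recently = post["published"].year >= LAST_FULL_YEAR
    ranking_well = post["best_position"] is not None and post["best_position"] <= 40
    converted = post["conversions"] > 0 or post["assisted_conversions"] > 0
    return published_recently or ranking_well or converted

posts = [
    {"published": date(2016, 5, 1), "best_position": 12, "conversions": 0, "assisted_conversions": 0},
    {"published": date(2015, 2, 1), "best_position": 85, "conversions": 0, "assisted_conversions": 0},
]

# The first post ranks in the top 40, so it's kept; the second is old,
# ranks poorly, and never converted, so it's up for evaluation.
eligible_for_evaluation = [p for p in posts if not should_keep(p)]
```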

What To Do Next

At this point you should have a pretty good idea of which blog posts you’re planning on keeping and which ones you want to…well…what do we do with these other posts?

  • Evaluate the Keyword – If a page is ranking for a keyword, but lower than position 40, check to see if the keyword is unique (as compared to the blog posts that you’re planning on keeping) and if that keyword is relevant to your business. If that’s the case then the blog post may be worth refreshing and re-optimizing for the ranking keyword.
  • Multiple blog posts ranking for the same keyword – If one blog post ranks significantly for a keyword, you might consider taking the content from other blog posts that are ranking lower for the same keyword and consolidating that content into the top-performing post. If that isn’t an option, you might want to 301 redirect those other posts to the post that is performing well.
  • Consolidate Multiple Blog Posts – Leverage the Categories column to break down and tackle each blog post via topic. Can you consolidate any of the underperforming posts to create a new, more comprehensive blog post? Or better yet, can you combine multiple posts together to build a new guide/resource?
  • Time-Sensitive Blog Posts – Do you have blog posts that are focused on particular years? (e.g. “What to Expect in 2014”, “Best of 2016”, “2017 Predictions”). Great! These are perfect excuses to write ‘throwback’ posts.
    • Create a new blog post looking back at your predictions and your best-of lists. Write about whether your predictions came true, and provide "where are they now" updates for the best-of lists – also a great excuse to write a new best-of list. Retrospective posts can be really interesting, and they help to highlight that you're keeping pace with the industry.

Cleaning Up Your Site

Now that you have made the decision for which blog posts will be removed, you’ll want to make sure that they’re addressed the correct way.

301 Redirects

For all of the blog posts that will no longer exist, you’ll want to make sure that you implement 301 redirects pointing each non-existent blog post URL to its nearest analog.

A common practice that I must warn against is taking the path of least resistance by applying a blanket 301 redirect to the home page for all of these now non-existent posts. In short, this is not recommended, and I encourage you to read up on why you should never blanket-redirect all of your invalid URLs.

  • If you’re consolidating a blog post’s content into an existing blog post that you’re keeping, then you’ll want to point your 301 redirect to that existing post.
  • If you’re compiling multiple blog posts into a single new blog post, then you’ll want to 301 redirect all of them to the new blog post.
  • If you’re not consolidating or compiling, then you’ll want to leverage the Categories data that we pulled to help you find the nearest topical analog and 301 redirect it to that.
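Before translating these decisions into your server's redirect rules, it can help to keep them as a simple old-URL to new-URL map. Here's a minimal sketch (all URLs are hypothetical):

```python
# Each removed post points at its nearest analog -- a consolidated post,
# a new compiled guide, or the closest topical match. None of them
# blanket-redirect to the home page.
redirect_map = {
    "/blog/keyword-tips-part-1/": "/blog/complete-keyword-guide/",
    "/blog/keyword-tips-part-2/": "/blog/complete-keyword-guide/",
    "/blog/old-predictions-2014/": "/blog/predictions-retrospective/",
}

def resolve_redirect(path):
    # Returns the 301 target for a removed post, or None if the path
    # wasn't removed and should be served normally.
    return redirect_map.get(path)
```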

Internal Links

Finally, you'll want to make sure that all of your internal links pointing to posts that were removed (URLs which should all theoretically be 301 redirects at this point) are updated to point to the new pages instead, or removed altogether. You may even find that you need to change some of the context around an internal link so it makes sense for the new landing page.

You can use Screaming Frog to find all of the internal links in three easy steps:

  1. Change the Mode to List
  2. Upload or paste the list of all the removed, 301-redirecting blog post URLs.
  3. Once complete, simply Bulk Export > All Inlinks

That’s it!

Every Blog is Different

This is by no means a quick, easy strategy, but if done correctly you should have plenty of opportunity to create new, useful content and clean up your current site in the process.

It’s important for me to state here that each blog is different and I treat the criteria in this blog post as a general guide and not strict mandates. Overall I hope that this has provided some helpful insight into how I generally approach blog content consolidation and perhaps highlighted some helpful ways to approach underperforming content.


Written by
James McNulty was born in Sidcup, Kent England in 1985. James now lives in North Richland Hills, Texas with his wife Megan, and two dogs Colin and Davey. James has been building websites since 1999 and is currently a Senior Marketing Strategist at UpBuild.
