This blog post is going to provide you with step-by-step instructions on how to run detailed page-level Lighthouse reports on multiple pages at once (even an entire site).
First of all, what is Lighthouse?
Developed by Google, Lighthouse is an open-source, automated tool for improving the performance, quality, and correctness of web applications.
When auditing a page, Lighthouse runs a barrage of tests against a page and then generates a report on how well the page did. From there you can use the failing tests as indicators on areas to improve.
I should note here that Lighthouse reports are not solely referencing page load time. In fact, there are a lot of page speed metrics beyond what the end user experiences, and those metrics are very important to Google. These reports are very detailed, and when Google releases a tool like this, it helps us see through Google’s lens and understand the metrics they consider important.
What You’ll Need
Before you look at this laundry list, rest assured: you can be up and running with this whole thing in less than 20 minutes, and I’ll show you exactly how we use each of these items to make it less daunting.
Let’s get a couple of quick setup tools out of the way.
- If you aren’t using it already, go ahead and download Google Chrome.
- Now open the following link in Google Chrome to download the Lighthouse Chrome Extension.
- Click the ‘Add to Chrome’ button.
Once that’s done, we can dive into the details. We’ll be gathering the rest of our items throughout the tutorial, but here they are to give you an idea:
- Lighthouse Batch Reporter
- An FTP program (I recommend FileZilla)
- Google Sheets
- And last but not least, a place you can host and access files online.
Great, let’s get started!
1. Install Node.js
Go to https://nodejs.org/en/download/ and click the installer that matches the OS you’re using. Once the program has downloaded, follow the instructions to install it onto your computer.
Once that’s done we’ll need a package that can run on node.js to generate multiple Lighthouse reports.
2. Install Lighthouse Batch Reporter
The Lighthouse Batch Reporter supports executing Lighthouse analysis over a number of sites in sequence and generating a summary report including all of their scores.
- If you’re on Mac:
- Open Terminal (command + spacebar, then type Terminal)
- If you’re on Windows:
- Open a Command Prompt session (Win-R, then type cmd and press Enter)
To install Lighthouse Batch Reporter, simply type or copy/paste the line below into Terminal (on Windows, omit the sudo prefix):
sudo npm install lighthouse-batch -g
3. Run Reports on Multiple URLs
See, this isn’t so bad, right?
Next, we need to gather all of the page URLs we want to run a report on and place them into a comma-delimited list (no spaces), so the whole batch can run in one single execution. To extract a list of URLs for an entire domain, you could use a crawler such as Screaming Frog.
Once you have all of those URLs handy you can head on down to Convert Town to put them into one large comma delimited list. It’s worth pointing out here that there are many ways to achieve a comma-delimited list, but after stumbling upon the Convert Town website the opportunity to use it in a rhyme was too good not to take up.
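If you’d rather skip the website step, the same comma-delimited list can be built with a one-line command (a sketch, assuming your crawl export is saved as a hypothetical urls.txt file with one URL per line):

```shell
# urls.txt is a hypothetical crawler export: one URL per line
printf 'https://url.com\nhttps://url.com/page-one\n' > urls.txt

# Join the lines into a single comma-delimited list with no trailing comma
paste -sd, urls.txt
```

The output can be pasted straight into the command we assemble below.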
Once we have our comma-delimited list of URLs, we are going to simply prepend it with:
lighthouse-batch -s
What we’re doing here is preparing to run the lighthouse-batch package, with the -s option telling the package to analyze a comma-delimited list of URLs.
The final result should look like a longer version of this, depending on how many URLs you have added:
lighthouse-batch -s https://url.com,https://url.com/page-one
Remember to make sure you don’t have a comma after the last URL. Once that entire command is compiled and ready, we can plug it into Terminal and begin to generate these reports:
- Open Terminal (or Command Prompt if you’re on Windows), then:
- Copy and paste the whole final result that you have compiled.
- Give it a hot minute to handle everything you’ve just pasted, and once the Terminal is idle, hit Return.
You should notice that a new window will open and close for each URL in your comma-delimited list. That means it’s working!
It’ll take some time, depending on how many URLs you included. For context, each URL takes, on average, around 6 seconds. Just let it do its thing in the background — we’ve got some more work to do.
Preparing to Read the JSON Lighthouse Reports
The Lighthouse Batch Reporter that we are running in the background writes out JSON reports for each page, and stores them in a folder on your computer.
Look under /report/lighthouse on your computer and you should see the JSON report files for each URL being added one by one, looking something like this:
and so on…
Warning: Running the command in Terminal (or Command Prompt if you’re on Windows) on another batch of URLs will overwrite everything in this directory. So be sure to move them to another folder on your machine before finding out the hard way, as I did.
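To avoid losing a finished batch that way, one option (a sketch, assuming the default report/lighthouse output location) is to copy the folder to a dated backup before starting the next run:

```shell
# Stand-in for a finished batch; the real folder holds one .report.json per URL
mkdir -p report/lighthouse
touch report/lighthouse/example.report.json

# Copy the whole folder to a dated backup before the next batch overwrites it
backup="report/lighthouse-$(date +%Y-%m-%d)"
cp -r report/lighthouse "$backup"
echo "Backed up to $backup"
```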
You will also notice a summary.json file, which calculates the average scores across the five sections of the Lighthouse report (Performance, PWA, Accessibility, Best Practices, and SEO) for all of the URLs included in your batch analysis. We won’t be utilizing the summary.json file, since we want page-level findings. But before we dive into that, we need to put all of our JSON files online.
4. Upload the JSON files to an Online Server
Once the Lighthouse Batch Reporter has finished running analyses on all of the URLs that you’ve entered, and all of your JSON files have downloaded to your computer, we can move onto extracting the findings. That’s right, we’re really close to the good stuff!
At first, I tried to extract the information from these JSON files while they were still local on my machine, using Google Sheets. Unfortunately, that’s just not how Google Sheets works. So I had to find a way to put all of these JSON files online, in order for Google Sheets to interact with them.
Hopefully, you already have access to an online server where you can easily upload these files. If you don’t, you can find monthly hosting plans for less than a cup of coffee from a variety of providers. You don’t need anything special; you just need to be able to access these files easily online.
Once you have your server set up, you will want to use an FTP (File Transfer Protocol) program to transfer the files from your computer to the server. Most hosting companies have FTP tools built into their systems (look in your account under the cPanel of your hosting provider). Otherwise, you can use a program like FileZilla (pictured below); all you’ll need is your FTP credentials in order to use it. If you’re having trouble here, reach out to your hosting company and they should be able to provide you with your FTP credentials.
If you’re not familiar with FTP, it’s really quite simple.
- You connect to your specific server by filling out the host, username and password.
- Once connected, you will be able to browse your computer in the left pane, and the server in the right pane.
- We are going to want to locate the JSON files from our computer (under /report/lighthouse), then drag and drop them onto our server. You can drag the whole lighthouse folder or the JSON files individually, it won’t make a difference.
5. Add ImportJSON.gs to Google Sheets
Stay with me now, this is the final piece of the puzzle! You got this!
Now that all our JSON files are online, we can use a Google Script within Google Sheets to reference all of the glorious report data from these JSON files using formulas.
- Open up a new Google Sheets document.
- Go to Extensions > Apps Script and replace the code that is there with the following ImportJSON.gs script.
- Rename the project and the file (currently named Code.gs) to: ImportJSON.
- Remember to Save the file.
It should look something like this:
6. Add the JSON URLs to Google Sheets
Next we need to gather all of the online URLs for our JSON files and add them to our Google Sheet.
For example, let’s assume that we are hosting our files at mynewserver.com, and we dropped our entire lighthouse folder into the root directory. We would be able to access our JSON files at the following URL:
and so on…
Place these URLs in column A, starting with row 1. It should look something like this:
7. Use Formulas to Find Report Data
It’s not the Krabby Patty formula, but we can finally use other formulas to reference our JSON data!
This formula is compiled from four key components:
- The script name.
- The cell we want to reference.
- The path structure to the data we want.
- The parameters to format the data.
Here is how an example of the formula is structured:
=ImportJSON(CELL, "PATH STRUCTURE", "PARAMETERS")
Let’s walk through how we compiled this:
The Script Name
First, each formula will always begin with the name of the script that we added: =ImportJSON.
The Cell Reference
Since our list of URLs is in column A, we’ll put our formulas in column B. Then we’ll reference the cell that contains the URL we want to reference, for example, A1.
The Path Structure
This next part uses a hierarchical path structure to help us dig down and extract the exact value that we want from the JSON report.
In order to understand what that path structure looks like, we need to take a look at one of these JSON reports. If you open one up, you’ll see a whole lot of information. In the screenshot below we can see that the Lighthouse report found that the URL we analyzed received a score of 0.31 as far as render-blocking resources go.
To give you a barometer here, a score of 1 is essentially 100%. So the result below is telling us that this page scored 31% in the render-blocking resources category.
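The conversion is nothing more than multiplying the score by 100; a quick sketch:

```shell
# Convert a Lighthouse score (a value from 0 to 1) into a percentage
score=0.31
awk -v s="$score" 'BEGIN { printf "%.0f%%\n", s * 100 }'  # prints 31%
```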
You can’t see the parent piece of this JSON structure in the screenshot above, but in order to extract this particular score, our path structure would be /audits/render-blocking-resources/score.
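To make that nesting concrete, here is a heavily trimmed, illustrative fragment of a report (the score and displayValue match the example in this post; the item values are made up, and real reports contain far more fields):

```json
{
  "audits": {
    "render-blocking-resources": {
      "score": 0.31,
      "displayValue": "9 resources delayed first paint by 2,341 ms",
      "details": {
        "items": [
          { "url": "https://url.com/styles.css", "totalBytes": 14000, "wastedMs": 512 }
        ]
      }
    }
  }
}
```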
To better understand how you can build a path structure to find data, you can utilize this online JSON viewer.
Simply copy and paste an entire JSON report into the Text tab:
Then simply click Viewer in the top left corner to get an easy view of how the report is nested.
In the example above you can see the JSON report hierarchy of parent and child sections more easily, and how we can dig down three levels to extract the path structure for whether or not the page uses HTTPS:
The Parameters
Lastly, we will want to use some parameters in order to extract only the score value from the report. By default, ImportJSON returns a header row (with each column header indicating the path to that data in the JSON report) followed by the data rows, and it transforms the data so it looks more like a normal data import: values longer than 256 characters get truncated, slashes in headers are converted to spaces, common prefixes are removed, and the resulting text is converted to title case. To change this behavior, we can pass in these optional parameters:
- noInherit – Don’t inherit values from parent elements
- noTruncate – Don’t truncate values
- noHeaders – Don’t include headers, only the data
- rawHeaders – Don’t prettify headers
- debugLocation – Prepend each value with the row & column it belongs in
Here are the three that I recommend we use for now: noInherit, noTruncate, and noHeaders.
The Final Formula
Here is how our final formula should look once compiled:
=ImportJSON(A1, "/audits/render-blocking-resources/score", "noInherit,noTruncate,noHeaders")
This formula says: For the data in cell A1, import the value from /audits/render-blocking-resources/score, without inheriting values from parent elements, truncating the value, or including headers.
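If a formula comes back empty or with an error, a quick local sanity check (a sketch, using a tiny stand-in file; real reports are far larger) can confirm the score actually sits where the path says it does:

```shell
# Minimal stand-in for a downloaded Lighthouse report (illustrative only)
printf '{"audits":{"render-blocking-resources":{"score":0.31}}}' > report.json

# Quick-and-dirty check that the score field exists before blaming the formula
grep -o '"score":[0-9.]*' report.json
```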
Reading the JSON Lighthouse Reports
The Lighthouse reports cover five key areas:
- Performance
- Progressive Web App (PWA)
- Accessibility
- Best Practices
- SEO
When a Lighthouse audit is run on any given URL, a report will generate a score out of 100 for each section. That score is based on optimization factors for each category.
For example, if you run the Lighthouse Chrome Extension in the browser, you can see that the factors affecting Performance include things like Unused CSS rules, Unoptimized images, etc.
Using this batch processing of JSON Lighthouse reports we can extract these factors at scale and uncover a whole lot of information using formulas.
Once you have your list of JSON report URLs in column A, and the formulas set in row 1, you can drag the formula all the way down your Google Sheet, and after applying some nifty conditional formatting, you can see in one easy overview how each page is impacted:
Helping you put together actionable strategies to fix problems on a page level quickly:
Digging Into the Details:
If we take another look at the render-blocking resources example from before we can see that under displayValue there are 9 resources delaying the first paint by 2341 milliseconds.
The great thing about this report is we can extract that list of resources into our Google Sheet. Let’s place our report URL in A2, and use the following formula in B2:
=ImportJSON(A2, "/audits/render-blocking-resources/details/items/", "noInherit,noTruncate,noHeaders")
Be sure to make enough room for 9 rows of data. And just like in the scenario below, make sure you have a couple of empty columns too, since the formula for this particular piece of data will also pull the total bytes of each resource and how many milliseconds the page wastes retrieving them:
Remember this is a page-level report. Being able to extract this information for each page on an entire site can really help you drill down and find sitewide optimization opportunities. In the example above for render-blocking resources alone, we can add up all of those wasted milliseconds to a total of 4.6 seconds. That’s a lot of time in the world of performance optimization.
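As a quick sketch of that math (the nine wastedMs values below are hypothetical, chosen only to illustrate how the individual delays add up to roughly 4.6 seconds):

```shell
# Hypothetical wastedMs values for nine render-blocking resources
printf '%s\n' 512 488 730 615 402 590 388 475 400 |
  awk '{ total += $1 } END { printf "%.1f seconds wasted\n", total / 1000 }'
```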
If you’re lost and have no clue how these render-blocking resources impact your page, take some time to read Page Speed Metrics Beyond Page Load Time, which dives into metrics such as First Paint, First Meaningful Paint, Time to Interactive, and more. These are the metrics Google is looking at when it considers sites for ranking performance in regard to speed, and these Lighthouse reports are key to uncovering the details and factors that impact them.
For the sake of this blog post, below are some ideal goals as far as these metrics go:
| Metric | Ideal Time per Page |
| --- | --- |
| First Contentful Paint | 1.5 seconds or less |
| First Meaningful Paint | 1.6 seconds or less |
| Time to Interactive | 5 seconds or less |
| Time to First Byte | 500 ms or less |
| Perceptual Speed Index | 1.25 seconds or less |
Quickstart Template with Formulas
There is definitely a lot of information to unpack within these JSON reports and I want to leave you with a template with a few formulas to get you started.
When you open the document above, you’ll notice we ran a test Lighthouse audit on the marvel.com home page. One thing you can see in the findings, as outlined in the SEO tab, is a lack of descriptive links. So if you navigate to the SEO Action Items tab, you will see the formula we used to dive deeper into the report and list out the particular links and anchor text that could be more descriptive:
There are many data points to extract from a JSON Lighthouse report, and taking the time to build a template with formulas able to extract and compile legible page-level analysis for an entire site can be invaluable.
I hope that this tutorial and template (to make your own editable version, simply click File then Make a Copy) will help give you a jumpstart to compiling the data you need to optimize your pages for these important metrics and more.