06 Nov

Unlocking the Real Potential of PageRank with Data

Google’s main approach to evaluating content – and the technology that supports the process – is smart, sophisticated, and precise. While PageRank is perhaps one of the most well-known metrics involved in the analysis of your website, it is more sophisticated than it may seem at first glance.

Broadly speaking, Google checks the number of links to your website, and the perceived quality of those references, to understand how important your website is. However, it’s not quite that simple. PageRank isn’t a quick and easy rating that tells you whether you’re performing well or not. So how can you make sense of what your PageRank really means – and its potential impact on your search ranking?

The answer comes from Google themselves, a company founded on mathematical precision and robust data science. For precise results in your marketing, you should take the same precise approach to decision making.

A precise definition of PageRank

PageRank is the core algorithm invented by Google’s co-founder Larry Page, which calculates the probability of people finding your content for a given keyword. The calculation is broadly based on:

  • the number of links from websites containing similar content
  • the age of those links
  • the number of unique pages with relevant content

As a result, the more dedicated content pages you have on your website and the more inbound links (internal and external) that point to them, the higher the probability your customers will find your web page – and the higher your PageRank for searches for your product or service.

Most search engines use PageRank-like algorithms to precisely evaluate your ranking potential. While each search engine has its own version, these algorithms calculate the probability that a user will find content that is relevant to their search.

According to the patent, Google will make its measurements and calculations, then assign a PageRank score indicating a website’s authority relative to other websites on the internet. The highest score is 10 (most important) and the lowest is 0 (least important).

For example, a website that is regularly linked to from The Huffington Post, the BBC, and the UK Government is likely to be more important than a website that once received a single link from a local retailer.

However, your PageRank isn’t simply good or bad, high or low. The impact it has on your search performance is a little more complicated.

The real implications of PageRank on SEO

A top-level definition omits important details that could be the difference between search engine success and failure.

Given that your PageRank score is your site’s measure of importance on the web, Google will use it to determine how often they visit your site pages. This means your PageRank will affect how long it takes for changes you make to be noticed by Google and factored into search rankings.

PageRank also represents the potential of your site pages to rank for searches. This would explain why the home page is usually a site’s strongest page in SEO, as the home page is often the most linked-to page on your website.

Perhaps the most important detail is that a site’s overall PageRank is allocated based on the number of indexed pages on the website. This is best illustrated with an example.

Example PageRank allocation

 

A website has a PageRank of 53. There are 100 pages on the website that are currently indexed by Google. As a result, we can say that the average PageRank is 0.53 per page.

However, not all pages are the same. Some pages quite simply matter more than others.

Not every page is made equal

Most web pages matter to your organisation. Some may handle the heavy lifting of transactions. Others may present search results to make navigation more convenient. But not every page should be made accessible to search engines.

From a marketing perspective, your priority is the pages you’d want a visitor to see first: the essential information that conveys what you do, how you do it, and – crucially – why someone would want to do business with you. If your goal is to attract, inform, and engage with visitors, a page committed to your privacy policy probably doesn’t cut it.

So while every page matters, not every page is a marketing priority. That’s what needs to be reflected in the way PageRank is allocated.

Improving your PageRank allocation

PageRank is a pie and, by default, every page on your website gets a slice. A more considered approach to PageRank allocation ensures that the most important pages – the ones people will be searching for – are best fed.

You can look at two key areas to identify the most important pages:

  • Traffic Volumes: Which pages are most popular with your visitors? Where do they tend to land, whether they’re coming from search engines or direct input? These pages tend to be important parts of the customer experience through your website.
  • Engagement: How are pages being used? Where do visitors spend a lot of time, and where do they merely rest before clicking away to somewhere else? Pages with good engagement are clearly working; it makes sense to increase the search engine traffic that finds them since the content is likely to satisfy the user’s search, which helps you keep your rankings.
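As a rough illustration of combining these two signals, the sketch below ranks pages by a blended score. The `PageStats` fields, the equal 50/50 weighting, and the sample figures are all assumptions made for the example, not values from a real analytics export.

```python
from dataclasses import dataclass


@dataclass
class PageStats:
    url: str
    visits: int                 # traffic volume for the page
    avg_seconds_on_page: float  # a simple stand-in for engagement


def priority_score(page, max_visits, max_seconds):
    """Blend normalised traffic and engagement into a 0..1 score.

    The equal weighting is an assumption; adjust it to suit your site.
    """
    traffic = page.visits / max_visits if max_visits else 0.0
    engagement = page.avg_seconds_on_page / max_seconds if max_seconds else 0.0
    return 0.5 * traffic + 0.5 * engagement


def rank_pages(pages):
    """Return pages sorted from highest to lowest priority score."""
    max_visits = max(p.visits for p in pages)
    max_seconds = max(p.avg_seconds_on_page for p in pages)
    return sorted(
        pages,
        key=lambda p: priority_score(p, max_visits, max_seconds),
        reverse=True,
    )


# Hypothetical pages and figures, for illustration only.
pages = [
    PageStats("/services/", 4200, 95.0),
    PageStats("/privacy-policy/", 120, 12.0),
    PageStats("/search?q=shoes", 900, 8.0),
    PageStats("/case-studies/", 1500, 140.0),
]

ranked = rank_pages(pages)
for page in ranked:
    print(page.url)
```

Pages that fall to the bottom of a list like this are candidates for review, not automatic exclusion; a low score only suggests where to look.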

As you begin to understand the pages you should prioritise, you will soon discover pages that you should not. These URLs, or entire folders and URL patterns, can be disallowed in your robots.txt file, which instructs search engine crawlers not to crawl those pages so that they are left out of the search engine.
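Before a robots.txt change goes live, it can help to check which URLs the new rules would block. The sketch below uses Python’s standard `urllib.robotparser` for that; the rules and the example.com URLs are hypothetical, and real crawlers may interpret some patterns differently from this parser.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules: block internal search results and the privacy policy.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Disallow: /privacy-policy/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# Hypothetical URLs to test against the rules above.
urls = [
    "https://www.example.com/services/",
    "https://www.example.com/search?q=shoes",
    "https://www.example.com/privacy-policy/",
]

for url in urls:
    allowed = parser.can_fetch("Googlebot", url)
    print(f"{url} -> {'allowed' if allowed else 'blocked'}")
```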

The PageRank allocated to these pages is ‘freed up’ – and, instead of being spread evenly over every resource on your website, your PageRank is put to work where it matters most.
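To make the arithmetic concrete, here is a minimal sketch built on the earlier example of a site with a PageRank of 53 across 100 indexed pages. It assumes the simplified model used in this article, where the site total stays fixed and is shared evenly; the figure of 40 blocked pages is hypothetical.

```python
def average_pagerank(total_pagerank, indexed_pages):
    """Average PageRank per page under an even-allocation assumption."""
    if indexed_pages <= 0:
        raise ValueError("indexed_pages must be positive")
    return total_pagerank / indexed_pages


total = 53           # site PageRank from the earlier example
indexed = 100        # pages currently indexed
blocked = 40         # hypothetical number of low-priority pages disallowed

before = average_pagerank(total, indexed)
after = average_pagerank(total, indexed - blocked)

print(f"before: {before:.2f} per page")
print(f"after:  {after:.2f} per page")
```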

The practical impact of PageRank reallocation

Google uses maths to rank your website, so we use maths to make informed decisions about PageRank allocation. As you’d expect, that means we also use maths and statistical analysis to predict and measure the results.

Let’s use an example. A website could include 11,889 pages with a potential PageRank of 2.93 per page, weighted by the number of visits to assign more importance to more popular pages. After some investigation, we find that only 2,208 of those pages should be made available to Google.

As a result, we can reclaim the PageRank from across these pages and put it to better use on a smaller, more focused group of URLs.

A t-test is a common statistical technique used in many fields to help scientists test a hypothesis. For example, in pharmaceuticals, a t-test would be used to see whether a new drug had a treatment effect or not. If there is a 95% chance or higher that the averages of the two test groups are different, then the treatment effect is deemed to be statistically significant.

In our SEO context, we’re interested in the treatment effect on the site’s new average PageRank as a result of the reallocation exercise.

Using a t-test, we can determine the effect of blocking these pages. We calculate a 99.6% probability (+/- 0.7% margin of error) that the ranking potential rises to roughly 7.5 times its current level, from 2.93 to 22 points per URL – a dramatic increase.
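For readers who want to see the mechanics, the sketch below computes Welch’s t statistic for two groups of per-page scores. It is an illustration only, not the analysis behind the figures above: the data is synthetic, the function names are our own, and the p-value uses a normal approximation that is only reasonable for large samples.

```python
import math
import random
from statistics import NormalDist, mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    na, nb = len(sample_a), len(sample_b)
    va = variance(sample_a) / na  # squared standard error of mean A
    vb = variance(sample_b) / nb  # squared standard error of mean B
    t = (mean(sample_b) - mean(sample_a)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return t, df


def approx_two_sided_p(t):
    """Two-sided p-value via a normal approximation (large samples only)."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


# Synthetic per-page scores, generated purely for illustration.
rng = random.Random(42)
current = [rng.gauss(3.0, 1.0) for _ in range(200)]
reallocated = [rng.gauss(5.0, 2.0) for _ in range(200)]

t_stat, df = welch_t(current, reallocated)
p_value = approx_two_sided_p(t_stat)

print(f"t = {t_stat:.2f}, df = {df:.1f}, approx p = {p_value:.4f}")
print("significant at the 5% level" if p_value < 0.05 else "not significant")
```

A p-value below 0.05 corresponds loosely to the 95% threshold described above. For small samples, a proper t distribution should replace the normal approximation.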

After disallowing ‘nonsense’ pages, PageRank is almost 3 standard deviations better than for the current site, i.e. a material improvement in average ranking potential per page.

We can also predict the impact of our changes on engagement, presented in a box plot that estimates the increased engagement when visitors are directed to valuable, prioritised pages.

[Figure: box plot of average time distribution]

Single recommendation, massive impact

The above analysis typically results in a single SEO recommendation (usually an amendment of the robots.txt file). In practice, we have observed an impact as large as a rise of 68 ranking positions to the first page of Google. This shows that it’s not the number of recommendations but their quality that counts.

Data science makes things simple

It’s easy to be misled by the simplified explanations of PageRank. It’s tempting to underestimate the sophistication of search engine algorithms. But it’s only with a scientific, accurate understanding of what search engines are doing that you can get the same understanding of potential outcomes.

For all of the complexity of t-tests and box plots, the truth is that science actually makes things easier – because optimisation and marketing become more tangible. The research is exhaustive. The process is practical and robust. And the results are easier to predict, assess, and analyse.

We can help you bring the advantage of data-driven optimisation to your organisation, improving revenue by swiftly delivering measurable improvements. Visit https://artios.io/seo/ to learn more.

Andreas Voniatis
Data Scientist