A comprehensive SEO glossary of 220+ SEO terms with definitions, including useful resources.
Above the fold: Any webpage content viewed in a user’s browser before scrolling. Once the user scrolls down, the content would be considered “below the fold.” Having critical content above the fold is important as it makes it immediately obvious to users that they are in the right place and that the content is available. Google addresses pages that lack content above the fold with the Page Layout Algorithm.
Above the line: Advertising campaigns that are offline or non-digital, e.g. Press/PR, Outdoor.
Absolute URLs: The fully qualified URL which includes protocol (http/https), the optional subdomain (e.g. www), hostname, top level domain extension and path (which includes the subdirectory and slug). Highly recommended for specifying locators for resource files and canonical URLs.
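To make the parts concrete, here is a minimal Python sketch using the standard `urllib.parse` module. The helper names `describe_url` and `to_absolute` and the `/blog/post` path are hypothetical, chosen only for illustration; the glossary URL is the one used elsewhere on this page.

```python
from urllib.parse import urljoin, urlsplit


def describe_url(url: str) -> dict:
    """Split an absolute URL into the parts named in the definition."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,
        "hostname": parts.hostname,
        "path": parts.path,
    }


def to_absolute(base_url: str, link: str) -> str:
    """Resolve a (possibly relative) link against a base URL."""
    return urljoin(base_url, link)


# Hypothetical example values, for illustration only.
print(describe_url("https://artios.io/seo-glossary"))
print(to_absolute("https://artios.io/seo-glossary", "/blog/post"))
```

Resolving every relative link to its absolute form before auditing is one way to avoid ambiguity about which resource or canonical URL is being referenced.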
Algorithm Updates: Updates search engines make to their algorithms to improve the quality of the search results for their users.
ALT tags: The ALT attribute is the HTML attribute used to specify alternative text (alt text) that is to be rendered when the image to which it is applied cannot be rendered.
AMP: Accelerated Mobile Pages (AMP), created by Google, are stripped-down versions of webpages designed to load quickly on mobile devices. While AMP pages load 4 times faster than their non-AMP counterparts, they come with detriments:
- Brands are highly restricted on the functionality of those pages, such as not being able to implement CSS overlays
- As of September 2021, AMP is no longer a requirement to rank in the ‘Top Stories’ SERP Feature (see guide)
- Any backlinks earned by AMP pages don’t pass link equity (see guide) to the brand domain
Therefore AMP is not recommended, especially with the anticipated 5G connectivity which will render AMP irrelevant.
Anchor Text: Visible and clickable text that displays a hyperlink on a website, which is used to describe to users and search engines what the destination page is potentially about. For example, the anchor text ‘Click here’ is not as relevant as ‘SEO Consultant’ when pointing to the content.
Aside: The <aside> element is used to identify content that is related to the primary content of the webpage, but is not integral to the primary/body content of the page, i.e. it’s indirectly related. Asides are frequently presented as sidebars or call-out boxes. Content like author information, glossary definitions, related links, related content, and advertisements would be examples of content that is found in the <aside> HTML element.
Authority: Authority is the process by which search engines determine the relative importance of sites and their content. Pioneered by Google, link equity is the concept of passing value between websites and pages through links.
Backlinks: A hyperlink from an external site (known as the referring domain) to a brand site page (known as the link destination). For example, Bustle links to Artios.io and is therefore one of Artios’ backlinks.
Backlinks are used by search engines as a signal of advocacy, and therefore authority. Google created PageRank by attributing value to ‘seed’ authority sites (like the BBC). This value would then be distributed via links to other pages, which in turn would pass this value on to other websites.
Baidu: China’s search engine of choice and the 4th most popular site (Alexa 500).
<base>: HTML tag used to specify a base URI, or URL, for relative links. It is set in the <head> section using the fully qualified domain URL as the ‘base’, so that all subsequent relative links use that base URL as their starting point.
Bait and Switch: Bait-and-switch is a form of fraud used in retail sales but also employed in other contexts. An example in SEO would be where a webmaster presents a site for inclusion in a directory such as Yahoo! and, once the site is included, changes it to its true content.
Bing: Launched in 2009 and owned by Microsoft, Bing has the second largest search user base in the world thanks to the prevalence of the Windows operating system and its pre-installed Internet Explorer (IE) browser. The search engine was repositioned as a ‘decision engine’ to help users find products they were looking to buy.
Black-Hat: Unethical method of SEO used to improve a site’s ranking in a search query. This is executed through a variety of methods including (but by no means limited to): keyword stuffing, cloaking (selected IP delivery), paid (but not declared/marked up) guest posting, buying links, and Private Backlink Networks (PBNs).
Blog: Short for ‘weblog’, these are regularly updated website articles that feature content intended for the general public or a targeted audience relevant to a specific market. WordPress started out as a blog content management system (CMS), before becoming a more holistic CMS. Blogs are the main content channel of websites for ‘brand journalism’.
Bounce Rate: A website bounce happens when a user visits the site but leaves without taking any further action. An alternative definition is a visit that ends with a single page visit. The bounce rate is the ratio of users who bounce off the site relative to the total site traffic:
Bounce Rate (%) = (Bounces / Visitors) × 100
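As a quick worked example of the ratio, the following Python sketch computes a bounce rate as a percentage. The function name `bounce_rate` and the visit figures are made up for illustration and do not come from any analytics tool.

```python
def bounce_rate(bounces: int, visitors: int) -> float:
    """Return the bounce rate as a percentage of total visitors."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return bounces / visitors * 100


# Hypothetical figures: 450 bounces out of 1,000 visitors.
print(f"{bounce_rate(450, 1000):.1f}%")
```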
Bot: An agent of a search engine (or other software programme trawling the internet) also known as a “spider” or a “crawler,” which downloads content from websites to provide relevant links or gather information either for the purpose of responding to user search queries (search engines) or gathering intelligence (other).
Brand Keywords: Search queries containing trademark names of the website.
Brand Mentions: Reference to a company, brand or online service through another website or news media. These can be hyperlinked or un-hyperlinked.
Breadcrumb: Found at the top of the website or under the navigation bar, these allow users to track their position on a webpage and how far they are from the website’s homepage. A breadcrumb trail on a page indicates the page’s position in the site hierarchy, and it may help users, and search engines alike, understand and explore a site effectively. A user can navigate all the way up in the site hierarchy, one level at a time, by starting from the last breadcrumb in the breadcrumb trail. See guide on Structured Data.
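Breadcrumb trails are commonly exposed to search engines as schema.org `BreadcrumbList` structured data. The sketch below is one possible way to generate that JSON-LD in Python; the function name `breadcrumb_jsonld` and the example trail are assumptions for illustration, not part of any tool mentioned in this glossary.

```python
import json


def breadcrumb_jsonld(trail: list) -> str:
    """Build schema.org BreadcrumbList JSON-LD from (name, url) pairs.

    The trail is ordered from the homepage down to the current page.
    """
    items = [
        {"@type": "ListItem", "position": position, "name": name, "item": url}
        for position, (name, url) in enumerate(trail, start=1)
    ]
    data = {
        "@context": "https://schema.org",
        "@type": "BreadcrumbList",
        "itemListElement": items,
    }
    return json.dumps(data, indent=2)


# Hypothetical two-level trail.
print(breadcrumb_jsonld([
    ("Home", "https://artios.io/"),
    ("SEO Glossary", "https://artios.io/seo-glossary"),
]))
```

The resulting JSON would typically be placed in a `<script type="application/ld+json">` element on the page.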
Broken Links: Broken links are links referencing pages that don’t exist anymore. These usually result in a 404 server status being served to both search engines and users. Redirecting links are links to URLs that redirect to other URLs. Both link types are detrimental because they present an adverse user experience (UX). The website also appears to be poorly managed, hosting non-value-adding content and thereby wasting page authority.
Broken Link Building: Finding websites that link to brand content via a broken or outdated URL, and asking them to fix the link or update it to a current URL.
Cache: Snapshot of a webpage that has been crawled/indexed by bots and stored by search engines, which can help a developer analyse the webpage. For example, the Google cache can show when the page was last visited by Google and what Google sees when the visual elements are stripped out (known as Text Cache). It can be accessed either by clicking on the cache option in the dropdown against the URL in the SERP result or by searching for cache:[URL] in Google.
Cannibals / Cannibalisation: Where multiple URLs from the same site rank for the same search query (usually together and on the 2nd page of the SERPs). This can usually be remedied by aligning one of the URLs to the other, including setting the canonical URL.
Canonical URL: This is a URL recommended to search engines as the accurate URL if there are duplicate versions on a site. Also known as the “Primary URL”. Canonical tags specify the canonical URL which helps search engines to determine which URL version should be included in the index and where the page authority should ultimately be accrued. One of the main benefits of a canonical tag is that the search engines are redirected while leaving the user experience intact (i.e. users are not redirected).
The canonical tag is placed inside the head section and is set as follows:
<link rel="canonical" href="https://artios.io/seo-glossary" />
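When auditing pages, it can help to check which canonical URL a page actually declares. The following is a minimal sketch using Python's standard `html.parser`; the class and function names (`CanonicalFinder`, `find_canonical`) are hypothetical, and the sample markup simply reuses the tag shown above.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Record the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        rel_values = (attributes.get("rel") or "").lower().split()
        if tag == "link" and "canonical" in rel_values and self.canonical is None:
            self.canonical = attributes.get("href")


def find_canonical(html: str):
    """Return the declared canonical URL, or None if there is none."""
    finder = CanonicalFinder()
    finder.feed(html)
    return finder.canonical


sample = '<head><link rel="canonical" href="https://artios.io/seo-glossary" /></head>'
print(find_canonical(sample))
```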
Chat-GPT: Released in November 2022, ChatGPT is an artificial intelligence (AI) chatbot developed by OpenAI using their GPT-3.5 and GPT-4 large language models. Microsoft, a major investor in OpenAI, has since integrated OpenAI’s models into Bing. ChatGPT is used by the SEO community for content ideation and for content evaluation according to Google’s search rater EEAT guidelines, among other applications.
Citations: In academia, the number of citations an author has is considered a measure of their authority – if the work is judged as credible enough to be cited by other scholars, logic follows it must be reputable. If a well-known scholar cites a paper, it’s an even higher accolade, even more so if they are in exactly the same field.
Search engines interpret links along the same principle in that, the more links from reputable sources a brand has, the more authoritative they are. A single link from a high authority source will add more value to the brand than hundreds of links from zero to low authority sources.
Click Bait: Marketing technique where web pages have links that are intended to boost a page’s clickthrough rate regardless of whether the content is valuable or truthful.
Click Through Rate (CTR): The ratio of user clicks from the SERP to Impressions, expressed as a percentage. This data can be extracted from Google Search Console’s Performance reports.
Client side: Code or SEO implementation that is executed on the user’s computer, most usually in the browser, such as a meta robots tag. Client-side rendering (CSR) of web content is also completed through the user’s browser.
Cloaking: Also known as selective IP delivery, where websites serve different content based on whether the visitor is a user or a search engine, based on the IP address of the request, which is in violation of Google Webmaster Guidelines.
Citation Flow (CF): A metric created by Majestic that measures a site’s URL importance based on how many websites are linking to it.
Co-Citations: Co-citation is the frequency with which two documents are cited together by other documents. An important aspect of co-citation is the significance (for Google) of the words that surround links. However, anchor texts according to John Mueller still carry more weight.
Competitive Analysis: Research method to understand the ranking factors (and their benchmarks) that can explain the variation in competitor ranks and therefore what brands/businesses must do to rank higher.
Content: Web content refers to the textual, aural, or visual content published on a website. Content means any creative element, for example, text, applications, images, videos etc. In SEO, this is the target object of a user’s search query: information that is one or all of useful, entertaining or informative.
Content Management System (CMS): An application that helps users (typically marketers) create, edit, publish and manage content on the website without requiring technical knowledge of web development coding. CMS works by combining a set of templates for different content types (such as blogs, landing pages) using different content modules while storing the content in a database. The most popular content management systems include WordPress, Shopify, Webflow, and Hubspot. Other CMS include Drupal and Magento, although these are less popular.
Content Delivery Network (CDN): Refers to a geographically distributed group of servers which work together to provide fast delivery of Internet content.
A properly configured CDN may also help protect websites against some common malicious attacks, such as Distributed Denial of Service (DDoS) attacks.
(Broad) Core Updates: One of the changes Google makes to their algorithms to improve their overall search results in terms of their relevance to users searching online. These typically take 1-2 weeks to fully roll out, which means that search results are subject to more variation than usual. This update most likely took its origins from Google’s Panda algorithm which was made a permanent ongoing feature of Google’s core search system in 2016. Core Updates have been introduced by Google since 2018 and the most recent update took place in May 2022.
Core Web Vitals: Core Web Vitals (CWV) is a set of metrics introduced by Google on 5th May 2020, to measure the user experience (UX) quality of website pages on the internet. The CWV metrics encompass the core user experience needs, which include loading experience, interactivity, and visual stability of page content. The announcement was made in anticipation of the Page Experience update, which would take CWV into account in website rankings in Google’s search results.
The CWV metrics comprise 3 components:
- Largest Contentful Paint (LCP) measures (in seconds) perceived load speed and marks the point in the page load timeline when the page’s largest content element has likely loaded.
- First Input Delay (FID) measures (in milliseconds) responsiveness and quantifies the experience users feel when trying to first interact with the page. Google replaced FID with Interaction to Next Paint (INP) in March 2024.
- Cumulative Layout Shift (CLS) measures (as proportion of space shifted) visual stability and quantifies the amount of unexpected layout shift of visible page content.
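The three metrics above can be bucketed into ratings with a few lines of Python. This is only a sketch: the threshold values are the "good" and "poor" boundaries Google has published for these metrics (they are not stated in this glossary), and the names `THRESHOLDS` and `rate_metric` are hypothetical.

```python
# (good_upper_bound, needs_improvement_upper_bound) per metric.
# Assumed from Google's published guidance, not from this glossary.
THRESHOLDS = {
    "LCP": (2.5, 4.0),   # seconds
    "FID": (100, 300),   # milliseconds
    "CLS": (0.1, 0.25),  # unitless layout shift score
}


def rate_metric(metric: str, value: float) -> str:
    """Classify a Core Web Vitals measurement into a rating bucket."""
    good, needs_improvement = THRESHOLDS[metric]
    if value <= good:
        return "good"
    if value <= needs_improvement:
        return "needs improvement"
    return "poor"


# Hypothetical measurements for a single page.
for metric, value in [("LCP", 2.1), ("FID", 250), ("CLS", 0.3)]:
    print(metric, value, rate_metric(metric, value))
```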
Crawl: The process by which search engines access web content using an agent known as a bot. Because bots are depicted as spiders, when the agent accesses web content, the content is ‘crawled’.
Crawl Budget: The fixed amount of resources allocated to a website by a search engine for crawling.
Crawl Errors: Where a search engine is not able to access content. These may be discovered by or reported on using Google Search Console (GSC) and/or site auditing tools (crawlers).
Crawlability: A search engine’s ability to access content and follow links on a page. Hyperlinks are visited (crawled) by search engines to access content and discover new content.
Crawler: Software program used to collect data from the internet to find new and updated content. The main tools are:
Desktop based: Screaming Frog, Sitebulb
Cloud based: Botify, OnCrawl, DeepCrawl
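At its simplest, a crawler downloads a page, extracts its links and queues them for the next visit. The sketch below shows only the link-extraction step with Python's standard library; it is an illustrative assumption and not how any of the tools listed above are implemented. The names `LinkExtractor` and `extract_links` and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href of every <a> tag, resolved to an absolute URL."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))


def extract_links(html: str, base_url: str) -> list:
    """Return all hyperlinks found in the HTML, in document order."""
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


page = '<a href="/about">About</a> <a href="https://example.com/">Elsewhere</a>'
print(extract_links(page, "https://artios.io/"))
```

A page for which this returns an empty list would be a dead-end page in the sense used later in this glossary.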
CSS Overlay: Also known as a lightbox, this is a visual layer presented to the user on top of the web page. These can be used for interstitial ads, cookie notices or other purposes.
Data Driven: Data-driven SEO is an empirical approach to SEO that is less reliant on anecdotal rationalist best practices and more reliant on a statistical analysis of the data generated from the SEO process, be it crawl, search results, visitor analytics, other or a combination of. Read the guide on how to practise data-driven SEO here.
DMCA: The Digital Millennium Copyright Act is a 1998 United States copyright law (passed under Bill Clinton) that implements two 1996 treaties of the World Intellectual Property Organization. It criminalises the production and dissemination of technology, devices, or services intended to circumvent measures that control access to copyrighted works. In 2012, Google added a filter to respond to clear and specific notices of alleged copyright infringement (i.e. piracy). To initiate the process of delisting content from Search results, a copyright owner who believes a URL points to infringing content sends Google a takedown notice for that allegedly infringing material. When Google receives a valid takedown notice, their teams carefully review it for completeness and other criteria. If the notice is complete and Google finds no other issues, Google delists the URL from Search results. However, it has been documented that the DMCA process has been abused, such that Moz and Search Engine Land have erroneously had their sites delisted by the DMCA takedown process. Webmasters can dispute such requests and have them reversed. DMCA requests to Google may be submitted here.
Dead-End Page: Webpage with no outgoing links.
Deep-Link: A hyperlink that allows websites to build traffic to pages deeper within their site.
De-indexed: To have a site temporarily or permanently removed from a search engine’s index, so that it no longer appears in the SERPs.
Direct Traffic: Traffic sourced from users who type or paste in a URL to visit a site directly, click a bookmark, or arrive by some other untracked means.
Directories: Sites containing business listings that include company name, address, website address and phone numbers.
Disavow: A facility in Google Search Console (GSC) that allows webmasters to negate links from their backlink profile the webmaster believes will negatively affect a website’s authority.
Disavow File: A text document that allows webmasters to choose links for deactivation from the website’s backlink profile.
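Google's disavow file is a plain text file with one entry per line: either a full URL, or a whole domain written as `domain:example.com`, with lines starting with `#` treated as comments. The Python sketch below assembles such a file; the function name `build_disavow_file` and the example domains are hypothetical.

```python
def build_disavow_file(domains, urls, note=""):
    """Return disavow file contents: an optional comment, then one entry per line."""
    lines = []
    if note:
        lines.append(f"# {note}")
    # Whole domains use the "domain:" prefix; duplicates are removed.
    lines.extend(f"domain:{domain}" for domain in sorted(set(domains)))
    # Individual pages are listed as full URLs.
    lines.extend(sorted(set(urls)))
    return "\n".join(lines) + "\n"


# Hypothetical entries, for illustration only.
print(build_disavow_file(
    domains=["spam.example"],
    urls=["https://bad.example/page.html"],
    note="Links we could not get removed",
))
```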
DMOZ: A multilingual open-content directory of sites. The community who maintained it were also known as the Open Directory Project which was owned by AOL but constructed and maintained by a community of volunteer editors. This used to be a must ‘get into’ directory for all sites due to the value of the links and the difficulty of being included.
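Do-follow Links: Links without a rel="nofollow" attribute, which signal to bots and search engines that they may follow the link and pass link equity to the destination, such as a brand’s website or blog.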
Domain: String of text that maps to a numeric IP address, used to access a website from a user’s browser or other client software. The domain appears as a website address without the subdomain or URL slug.
Domain Age: The age of a domain, calculated as the time period difference between today’s date and the domain registration date. It is usually measured in days, and the registration date can be found using WHOIS data.
Domain Authority (DA): Invented by Moz and based on Google’s PageRank, Domain Authority (DA) is a measure of a site’s relevance within a specific subject area or industry. Based on a scale of 0 to 100, where 100 is the best/most authoritative, the authority is generally based on number of inbound hyperlinks from other websites on the internet. The scale is logarithmic so a domain with a DA 30 is 10 times more authoritative than a domain with a 20 DA score.
Doorway Page: Where content is created for every single keyword. These are frowned upon as they are seen as delivering a poor UX for users. Sometimes these are combined with redirects to ‘money’ pages, i.e. targeted merchant pages where affiliates earn commissions should a purchase or a targeted interaction take place.
DuckDuckGo: A privacy-focused search engine with the mantra “no tracking, no ad targeting, just searching”, which means they don’t track their users, integrate with their social media, store their information, nor use their search history to target them with ads. The search engine was founded by Gabriel Weinberg and is headquartered in Pennsylvania, United States.
Duplicate Content: Observed where website pages are highly similar or identical, which may arise because of:
Internal search result pages
Duplicate pages waste crawl budget and dissipate page authority, leading to lower rankings and traffic. In general, the canonical tag is the preferred solution. Other solutions include Noindex (meta robots or X-robots tag) and 301 (permanent) redirects, where the outcome is to consolidate page level authority.
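One simple way to flag "highly similar" pages during an audit is to compare overlapping word sequences (shingles) between two texts. The Python sketch below uses Jaccard similarity over three-word shingles; this is an illustrative technique chosen here as an assumption, not a description of how search engines detect duplicates, and the function names are hypothetical.

```python
def shingles(text: str, size: int = 3) -> set:
    """Return the set of overlapping word sequences of the given size."""
    words = text.lower().split()
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a: str, text_b: str) -> float:
    """Jaccard similarity of the two texts' shingle sets, from 0.0 to 1.0."""
    set_a, set_b = shingles(text_a), shingles(text_b)
    if not set_a and not set_b:
        return 1.0
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical page copy: the two texts differ only in the last word.
print(similarity("red shoes for sale online today",
                 "red shoes for sale online now"))
```

Pairs of URLs scoring close to 1.0 would be candidates for a canonical tag, noindex or redirect.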
Dwell Time: Also known as Time on Page (ToP), this is the amount of time users spend viewing a specific page or screen or set of pages or screens. The metric is an average, i.e.
Dwell Time = total amount of time spent on page / total number of visits
Dwell time shows how well content is performing, such that a higher average time spent on a specific page means the content is performing well. Conversely, a low ToP reveals false leads (and potential bot visitors) among users who view the page but leave quickly.
E-A-T Content: Expertise, Authoritativeness and Trustworthiness (E-A-T) is a factor brought in by Google for YMYL (Your Money or Your Life) content published by regulated industries such as finance and law. E-A-T optimisation requires:
- positive reviews on authoritative 3rd party websites
- lack of overt advertising on content pages
- detailed disclosure on the company including content authors
- citation of credible sources
- scientific evidence for medical content
Ecommerce: The facilitation of buying and selling products online.
Editorial Link: A backlink given to a site voluntarily by a site editor on a meritocratic basis of the content’s value and not because it was paid for.
.edu Link: Link given by an academic institute that has .edu as a TLD. These are high value despite having been historically manipulated by institutional insiders that created pages and sold the links for SERP manipulation.
Entity: A metaphorical object that is “singular, unique, well-defined, and distinguishable”, which would take the form of people, places, organisations, websites, events, groups, facts, and other things, making it linkable to a knowledge graph. This could be considered the more structured form of a keyword, which is ambiguous and less concrete. Entities make it possible for search engines to connect all the world’s information together, regardless of the user’s language, and therefore provide better search results.
Entity SEO: The SEO optimisation of websites for their content to appear for Knowledge Panel results. SEOs may attempt this by seeking gaps between website content not marked up for structured data where it could be using the Google Natural Language API. Note that Google will only list keywords in its API and will therefore rely on other methods to detect entities on the web which will be cross referenced with Wikipedia.
Evergreen Content: Content that remains continually relevant over long periods of time. Labelled as evergreen because the content is considered sustainable and lasting.
External Link: Outbound link to a URL that resides on a different domain entirely.
Faceted navigation: The type of navigation found on category pages of (typically ecommerce) sites that have a high volume of options. The navigation changes the content according to the user selection and will usually load a new URL upon selection. This can often cause duplicate URL issues.
Featured Snippet: An extracted quote from a website complete with the cited source URL (as opposed to an ‘Answer Box’ which has information without a cited source). Also known as Position (P) Zero results.
First Link Priority: Where two links to the same URL destination are present on the same page, Google will give more weight and thus prioritise the first link including its anchor text.
Footer: A section common to all pages of a website that is located at the bottom, underneath the main content section.
Footer Link: A site-wide link that appears in the footer section of the website page.
Freshness: The age of the content online. Google will sometimes give priority to content that has a more recent publishing date (by virtue of being more up to date), depending on the subject matter. This is sometimes manipulated by webmasters who change the content’s ‘updated’ date daily without having actually updated the content.
Gated Content: Content that is primarily meant for lead generation. Typically blocked from bots, this content usually requires filling out contact information in a form to access.
Glossary: Is a glossary good for SEO? Evidently yes, because glossaries act as a sitemap and help other content get crawled. Plus, as content in their own right, glossaries are likely to get a significant volume of searches.
Google Analytics: A website analytics tool provided by Google to help digital marketers track, analyse and report performance data for their website on users, content and traffic sources. Google has a premium offering known as Google Analytics 360.
Google Alerts: Notification service that alerts users via email when new content, relevant to a topic of a user’s choosing, is found on websites indexed by Google.
Google Bomb: Google Bombing (or Google Washing) is the practice of causing a website to rank highly in web search engine results for a targeted search term, usually by intensive link building. At one stage George W. Bush was Google bombed to appear as the top result for searches on ‘born loser’.
Googlebot: The name of Google’s search engine crawling agent
Google Dance: Coined around 2002, this term describes the highly changeable periodic episode during which Google updated its search index every month.
Google Sandbox: A waiting period applied by Google to new sites targeting competitive and high volume search phrases.
Google My Business Listing: Product offered by Google that will show users using a SERP where and how to find a business. Formerly known as Google Local and Google Places.
Google Search Console (GSC): Tool and reporting suite offered by Google to webmasters to submit XML sitemaps and manually request crawls of URLs, as well as performance reporting on visibility, Core Web Vitals, mobile usability and index coverage.
Google Search Quality Rater Guidelines: Guidelines based on feedback from third-party Search Quality Raters appointed by Google. Their feedback helps Google understand which changes make their search results more useful. Raters also help Google categorise information to improve Google’s systems and to evaluate changes, without directly impacting Google’s search result positions.
Google Tag Manager (GTM): A tag management system (TMS) offered by Google to quickly and easily update measurement codes and related code fragments collectively known as tags on a website or mobile app. SEOs sometimes use this to serve meta tags.
Google Trends: A website tool by Google that analyses the popularity of top search queries in Google Search, which can be segmented by regions, languages, topical categories and search result types (web, images, news, YouTube). The outcome metrics are on a scale from 0 to 100 rather than search volume data, which makes them useful for comparing search queries. Data is available from 2004. Although the data is not available in API form, CSV downloads are available by date.
Google Webmaster Guidelines: General SEO best practices issued by Google to help sites appear in Google Search, as well as quality guidelines that, if not followed, can cause your page or site to be omitted from Search.
.GOV Links: Links from a .gov TLD, which denotes a government website and would be considered high value for SEO due to its authority and exclusiveness.
Guest Posts/Blogging: A blog article published on a website that is written by someone who is not on the website’s staff, i.e. a guest, usually for the purpose of achieving a backlink to the guest’s website.
Header: The top section of the web page common to all pages on a website that typically includes the logo and navigation menu.
Heading: A heading is a core HTML element which separates content into sections, based on importance, where H1 is the most important. In general, the main headline should be tagged as H1, contain the main target search phrase, be visible above the fold and be as consistent as possible with the title tag. Multiple H1s according to HTML5 conventions may be used however this isn’t recommended.
Head Term: A search phrase of 1 to 2 tokens (keywords) that has a comparatively high search volume. These won’t usually convert as well and are much harder to rank for than middle body or long tail keywords.
Hidden Text: A deceptive practice where invisible or unreadable text is used to show search engines content that is invisible to users, so that the content ranks higher without affecting the user experience. Search engines have probably overcome this by measuring the hexadecimal distance between the font colour and its background colour to infer the colour contrast.
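The colour-distance idea can be sketched in Python as follows. This is a speculative illustration of the technique described above, not a known search engine implementation: it treats each hex colour as an RGB point and flags pairs that are very close together. The function names and the threshold of 30 are assumptions.

```python
import math


def hex_to_rgb(colour: str) -> tuple:
    """Convert '#rrggbb' or '#rgb' into an (r, g, b) tuple of integers."""
    colour = colour.lstrip("#")
    if len(colour) == 3:
        colour = "".join(ch * 2 for ch in colour)
    return tuple(int(colour[i:i + 2], 16) for i in (0, 2, 4))


def colour_distance(foreground: str, background: str) -> float:
    """Euclidean distance between two colours in RGB space."""
    return math.dist(hex_to_rgb(foreground), hex_to_rgb(background))


def looks_hidden(foreground: str, background: str, threshold: float = 30.0) -> bool:
    """Flag text whose colour is nearly identical to its background."""
    return colour_distance(foreground, background) < threshold


print(looks_hidden("#fefefe", "#ffffff"))  # near-white text on white
print(looks_hidden("#000000", "#ffffff"))  # black text on white
```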
Hilltop: An algorithm used to find documents relevant to a particular keyword topic in news search. Created by Krishna Bharat while he was at Compaq Systems Research Center and George A. Mihăilă of the University of Toronto, it was acquired by Google for use in its news results in February 2003.
HITS Algorithm: An abbreviation of Hyperlink-Induced Topic Search, based on Jon Kleinberg’s paper “Authoritative sources in a hyperlinked environment”. The algorithm identifies authorities and hubs for a topic by assigning two numbers to a page: an authority weight and a hub weight, which are defined recursively. The idea is that high authority websites will have common hyperlinks, therefore SEOs that acquire links from these hubs will help their sites become authorities.
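To illustrate the recursive definition, here is a small Python sketch of the hub/authority iteration on a toy link graph. It follows the scheme as it is commonly described (a page's authority is the sum of the hub scores of pages linking to it, its hub score is the sum of the authority scores of pages it links to, and both are normalised each round). The function name `hits`, the fixed iteration count and the example graph are assumptions for illustration.

```python
import math


def hits(graph: dict, iterations: int = 20):
    """Return (authority, hub) score dicts for a graph of page -> linked pages."""
    pages = set(graph) | {t for targets in graph.values() for t in targets}
    hub = {page: 1.0 for page in pages}
    auth = {page: 1.0 for page in pages}
    for _ in range(iterations):
        # Authority: sum of the hub scores of every page linking in.
        auth = {page: 0.0 for page in pages}
        for source, targets in graph.items():
            for target in targets:
                auth[target] += hub[source]
        # Hub: sum of the authority scores of every page linked to.
        hub = {page: sum(auth[t] for t in graph.get(page, [])) for page in pages}
        # Normalise both score sets so the values stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for page in scores:
                scores[page] /= norm
    return auth, hub


# Hypothetical graph: two hub pages linking out to two content pages.
toy_graph = {"hub1": ["a", "b"], "hub2": ["a"], "a": [], "b": []}
authority, hubs = hits(toy_graph)
print(max(authority, key=authority.get), max(hubs, key=hubs.get))
```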
Homepage: Welcome page on a brand’s website that highlights important tools or information throughout a brand’s website for easy user navigation.
.htaccess: (or “distributed configuration files”) provide a means to make server configuration changes on a per-directory basis. A file, containing one or more configuration directives, is placed in a particular document directory, and the directives apply to that directory, and all subdirectories thereof. SEOs use these to deploy server level redirects.
HTTP: Hypertext Transfer Protocol (HTTP) is the basis of the World Wide Web, and uses hypertext links to load web pages. HTTP is an application layer protocol designed to transfer information between networked devices and runs on top of other layers of the network protocol stack. A flow over HTTP involves a client machine making a request to a server, which then sends a response message.
HTTPS: The secure, encrypted version of HTTP. It is a secure way to send data between a web server and a web browser, protecting it from interception and tampering. In 2014, Google announced HTTPS would be used as a ranking signal.
Hub Page: The main page of a brand’s website pertaining to a certain topic or keyword or a site that links to authoritative sites
Hummingbird: Search algorithm announced by Google in September 2013, after it had been deployed for a month. Hummingbird was launched to better comprehend the full context of queries using semantic search as opposed to keywords, by relying more on entities, in order to provide better results.
Inbound Link: Usually referring to backlinks (i.e. from an external site), these are links on different websites that lead a user to a brand/business website.
Indexing/Indexed Pages: Where search engines try to understand the meaning of the content extracted from the crawling process. Much of the content is reduced to text form from content tags and attributes, such as <title> elements and alt attributes, images, videos, and more. Pages that are indexed are those included in a search engine’s database, ready to be served to users in response to a search query.
Image Carousels: Interactive image experience that allows search engine users to see images in a slideshow format which is a feature of universal search.
Image Links: Images on a website that are hyperlinked to another URL.
Impressions: Number of user views of a site’s SERP listing in a given time period. This could be considered the actual search volume though not the maximum potential.
Infographic: Visual graphic content usually with some data with the purpose of being entertaining and/or informative. These were a key link building tactic.
Information Architecture (IA): The intersection of user experience, content and its context. Information Architecture (IA) helps by defining a website’s structural design according to its content labels and organisation, which directly impacts the user navigation and accessibility to search engines. The purpose is to make website information available, comprehensible and useful to its audiences in the most effective way possible i.e. with the minimum of friction.
Information Retrieval: The process and science of obtaining information system resources that are relevant to an information need (think search) from a collection of those resources (i.e. stored data). Searches can be based on full-text or other content-based indexing. Prabhakar Raghavan, a senior vice president at Google, co-wrote and published a book on Information Retrieval.
Intent: The main goal a user has when using a search engine. The SERPs and the top ranking content hint at the search intent, i.e. the kind of content the user is seeking for a given keyword. Intents are categorised as transactional, navigational, informational or commercial, and can be derived from the keywords themselves or provided by SERP tracking software.
Internal Links: The main means by which search engines access and discover content within a website. Internal links also signal to search engines the relative importance and semantic relevance of the pages they point to.
IP Address: An Internet Protocol address is a unique numerical label such as 192.0.2.1 (IPv4) that identifies each computer using the Internet Protocol to communicate over a network. Websites are hosted on servers that have an IP address. IPv4 addresses are split into four octets (separated by a period mark), where each octet has a value between 0 and 255. These IP addresses thus take the form A.B.C.D where A, B, C and D are octets. Private Blog Networks (PBNs) used to deploy sites on different C class IP addresses (the 3rd octet of an IP address) so that the sites in the PBN appeared non-nepotistic, feigning the independence of the links that would inflate the popularity of the sites they were linking to. PBNs are now more likely to use separate and distinct A or B class octets.
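As a small illustration of the A.B.C.D form, the sketch below uses Python’s standard ipaddress module to split an IPv4 address into its octets and to check whether two addresses share their first three octets. The helper names are hypothetical, and the example addresses come from ranges reserved for documentation.

```python
import ipaddress

def octets(ip: str) -> list[int]:
    """Return the four octets (A, B, C, D) of an IPv4 address as integers."""
    # IPv4Address validates the input; .packed exposes the 4 raw bytes.
    return list(ipaddress.IPv4Address(ip).packed)

def same_first_three_octets(ip_a: str, ip_b: str) -> bool:
    """True when two addresses share A.B.C and differ, at most, in D."""
    return octets(ip_a)[:3] == octets(ip_b)[:3]

print(octets("192.0.2.1"))
print(same_first_three_octets("192.0.2.1", "192.0.2.77"))
print(same_first_three_octets("192.0.2.1", "198.51.100.1"))
```

Sites whose addresses differ only in the last octet are trivially easy to group together, which is why link networks tried to vary the earlier octets.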
Keyword: The word(s) that constitute a search phrase (representing topics and ideas) which are used to optimise pages so search engines can comprehend which searches these should appear for.
Keyword Clustering: SEO practice of grouping keywords by search intent and/or latent semantic meaning for the purpose of mapping the correct keywords to the same landing page. This helps prevent SERPs cannibalisation.
Keyword Density: Proportion of times a keyword or phrase appears on a web page to the total number of words on the page, expressed as a percentage. In the SEO context, keyword density was used to determine whether a web page is relevant to a specified keyword or keyword phrase. However, this metric is outdated as search engines rely on other factors such as user signals, inbound links and the content of those backlink pages to inform content relevance to keywords.
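The calculation described above can be sketched in a few lines of Python. This is a minimal illustration that follows the definition given here (occurrences of the keyword or phrase divided by total words, as a percentage); the function name and the simple tokenisation rule are assumptions, and tools may count phrases differently.

```python
import re

def keyword_density(text: str, keyword: str) -> float:
    """Occurrences of `keyword` as a percentage of all words in `text`."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    phrase = re.findall(r"[a-z0-9']+", keyword.lower())
    if not words or not phrase:
        return 0.0
    n = len(phrase)
    # Slide a window of the phrase length across the page's words.
    hits = sum(1 for i in range(len(words) - n + 1) if words[i:i + n] == phrase)
    return 100.0 * hits / len(words)

sample = "SEO tools help SEO teams audit sites"
print(round(keyword_density(sample, "SEO"), 1))  # 2 of 7 words -> 28.6
```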
Keyword Research: Keyword research and selection aims to anticipate the content requirements of target audiences by researching the phrases these users are likely to use to find the products and services of brands. These phrases can be discovered using multiple tools.
Keyword Stemming: The practice of reducing words to their stem, i.e. the form of a word before any inflectional affixes are added. In English, most stems also qualify as words. The term base is commonly used by linguists to refer to any stem (or root) to which an affix is attached. Search engines, including Google, may use stems to treat variations of a word as equivalent according to their common stem and thereby save computing resources.
Keyword Stuffing: A web-spam or spam-dexing technique, in which keywords are loaded into a web page’s meta tags, visible content, and/or backlink anchor text in an attempt to gain an illegitimate rank advantage in search engines.
Knowledge Graph: Used by Google Search to help users discover information faster and more easily by using entities, which are cross-referenced with Wikipedia on a daily basis. This helps Google:
Display knowledge panels for the entities searched for
Refine the results of other Google services based on users’ interests.
Knowledge Panel: A subset and visual extension of Knowledge Graphs, except Google will only use data from Google Maps or My Business listings. Thus, knowledge panels are shown for queries about brands, businesses, or organizations. Knowledge panels will usually include images, facts, social media links, and related searches. To optimise for Knowledge panel listing, a website must be an authority on a given subject.
KPI: Key Performance Indicators. From an SEO perspective, KPIs are measures of:
Organic search channel performance and
Its contributions to the wider digital marketing mix
SEO campaigns and activities performance
These KPIs help both digital marketers and SEO practitioners get insights to understand:
The value of Organic search to brands/businesses
How much SEO is contributing to that value
What the drivers of performance are i.e. what’s working and not working
What to do next i.e. the highest priority activities and immediate targets
Opportunities in non-SEO channels, such as keywords for Paid Search, for greater multi-channel integration.
Landing Page: The first web page that a visitor lands on after opening an email, link or other. In SEO this is usually the ranking URL and performance data can be gathered in Google Analytics using the landing page report.
Latent Semantic Indexing (LSI): Also known as Latent Semantic Analysis (LSA), this is a Natural Language Processing (NLP) technique from the field of Information Retrieval (IR), which discovers patterns in the relationships between the terms and concepts contained in an unstructured collection of text. LSI is particularly useful for dealing with different words that have similar meanings (synonymy) and words that have multiple meanings (polysemy) in complex datasets. SEO best practice is to use synonyms to help the web page be visible for as many different search variations as possible.
Large Language Models (LLMs): A form of Artificial Intelligence (AI) using a neural network trained on millions of documents (mainly text based, such as books and web articles) that is capable of generating ‘human-like’ responses to questions and content. OpenAI’s ChatGPT, Google’s Bard and X’s Grok are examples of LLMs. We have built our own LLM around GPT-2 to optimise meta titles and descriptions for web pages.
Lazy Loading: Lazy loading is the practice of delaying load or initialisation of resources or objects until they’re actually needed to improve performance and save system resources. For example, if a web page has an image that the user has to scroll down to see, you can display a placeholder and lazy load the full image only when the user arrives at its location.
The benefits of lazy loading include:
- Reduces initial load time – Lazy loading a webpage reduces page weight, allowing for a quicker page load time.
- Bandwidth conservation – Lazy loading conserves bandwidth by delivering content to users only if it’s requested.
Linkable Assets / Link Bait: Content of value created to attract inbound links from external websites (i.e. backlinks), something that Artios can help with.
Link Acquisition/Building/Development: Process of acquiring backlinks from other websites to a brand’s website. Tactics include:
PR & Influencer campaigns
Creating link bait
Social media promotion
Broken link building
Reclaim lost links
Link Equity / Juice: Measures the authority of a site based on the number of quality inbound links. By measuring the authority passed through to certain pages, link authority allows search engines to gauge page popularity and rank them accordingly. The value passed through links is known as link equity.
Link Farm: Black-Hat method where a website is created for the sole purpose of adding backlinks to increase SERP positions of sites listed.
Link Profile: The collection of backlinks a site has which can be analysed and compared (between sites) to understand the tactics and site sources used to acquire links.
Link Reclamation: Finding broken or removed links to your site and fixing or replacing them with updated URLs, which maximises the authority flow of a site’s link profile and thereby its potential SERP positions.
Link Schemes: Any links intended to manipulate domain authority or a site’s ranking in a search engine’s results may be considered part of a link scheme and in violation of Google’s Webmaster Guidelines. This includes any behaviour that manipulates links to a website or outgoing links from said website.
Link Velocity: The speed at which backlinks to a domain or website are added over a specific period of time (usually a month). Ahrefs keeps track of new links acquired and existing links that are lost for a more accurate and granular measure.
Local Link: Typically found on business directories, these links are created to rank a local business or service in search engines.
Local Pack: A SERP Feature showing a list of local businesses in response to search queries containing “near me” or “near (location)”. These are typically businesses that have a “Google My Business” listing.
Log File Analysis: An advanced site audit that uses server logs to understand the actual interaction of search engines with websites among other purposes.
Long-tail: A search phrase of 5+ tokens (words in the search phrase) that has a low search volume individually. These are usually higher converting and easier to rank for compared to head term keywords. A whole book was published on this subject by Chris Anderson.
LSI Keywords: Synonym keywords and search modifiers that are related to a main keyword and are perceived as semantically relevant. These can be sourced in Google Search Console (GSC) by filtering on the page URL and looking at the query.
Machine Learning (ML): A sub-discipline of Artificial Intelligence, ML is the more flexible approach of using computers to analyse and learn from data. As opposed to writing code that uses explicit rules in the form of if else statements, Machine Learning constructs a model of the data by looking for statistically verifiable patterns, which can be more reliable and use much less code than traditional computing methods. ML is used by search engines to help organise their content indices and deliver more useful search results. Widely used ML algorithms include tree-based methods such as Decision Tree, Random Forest and AdaBoost. ML is not limited to search engines, as SEOs can use it to understand search engines better, including the user intent behind a query, and to conduct statistical SEO analyses more efficiently and reliably.
Manual Action: Google issues a manual action against a site when a human reviewer at Google’s Search Quality Team has determined that the pages on a site are not compliant with Google’s webmaster quality guidelines. Most manual actions address attempts to manipulate Google’s search index. Notifications are given to the site’s webmasters in their Google Search Console (GSC).
Meta Description: Meta descriptions are the visible piece of text below the page titles in search results. Unlike page titles, they are not used as a direct ranking signal; however, they will be used by search engines if they provide a more accurate description than the page content. They also prompt users to click through. In the source code they are specified using the meta HTML tag as follows:
<meta name="description" content="[Page content abstract. Call to action.]" />
Google provides examples of meta description writing best practices here.
Meta Keywords: A core HTML element residing inside the <head> section listing the keywords that relate to the page content with the purpose of assisting search engines on what searches the page should appear for. Google does not use the keywords meta tag in web ranking.
Meta Tags: Code snippets used in HTML documents to provide structured metadata about a web page, specified in the web page’s head section. These include the title, keywords, description and robots.
Metrics: Used in SEO to measure performance across SEO processes (Technical, Content/UX and Authority) and outcomes (SERPs, User Engagement). The data is obtained from a variety of 1st Party (analytics, Google Search Console) and 3rd Party sources.
Migration: Site migrations are changes to a website that involve changes of URLs. They happen mainly because of changes of:
Content Management System (CMS) platforms
HTTP to HTTPS
Natural Link: Backlinks given to a site when other webmasters, bloggers or website owners link to the site’s content (blogs, images, products, videos etc) because the content is useful to their readers and/or adds value to their websites or pages.
Negative SEO: Malicious tactics aimed at sabotaging search rankings of a competitor’s website, usually by getting the content URL(s) removed, lowered, marked adversely or otherwise.
Niche: A marketing segment or industry pertaining to the needs of a particular audience.
noarchive: Tag directive instructing search engines not to store and serve a cached copy of the web page in search results. Otherwise, Google et al may generate a cached page and users may access it through the search results.
noindex: Tag directive to not show this page, media, or resource in search results. Otherwise, the page, media, or resource may be indexed and shown in search results.
nofollow: Attribute of links instructing search engines not to follow the links on this page. Otherwise, Google et al may use the links on the page to discover content on those linked pages.
nosnippet: Directive to not show a text snippet or video preview in the search results for that page. A static image thumbnail (if available) may still be visible in Google SERPs, across all forms of search results (i.e. Search, Images, Discover), if Google deems the outcome to be a better user experience. Otherwise, Google may generate a text snippet and video preview based on information found on the page.
not provided: In 2013, Google Analytics (GA) removed the keyword dimensional data from URL reports and replaced it with “not provided”. The main impact was that webmasters could no longer attribute revenue, conversions and other commercial performance data to keywords. Webmasters can still find keyword data on URLs via performance reports in Google Search Console (GSC).
not set: Where Google Analytics couldn’t precisely define which request brought organic traffic to a website which may have been caused by old search systems that include keyword data, not setting the keyword in a campaign with manual UTM tagging or email referral traffic.
Off-Page SEO: The interaction between the website and other sites on the internet. While this is not an area where SEOs have direct control, it can be influenced by research to plan and execute campaigns. This covers site evaluation, selection and strategy for backlinks.
On-Page SEO: The interaction between users and the website’s content, ensuring the content is not only optimised to satisfy user queries but also has online search demand. This covers:
Keyword selection, research and mapping
Content gap analysis, creation and tagging
Core Web Vitals (which impacts both users and search engines)
Organic Search: also known as natural search, refers to unpaid web search results.
Orphan Page: Orphaned pages are live website pages that do not have an eventual link path from the home page of the same website hence the name ‘orphaned’ because the page has no parent URL. These can occur usually due to a poor internal linking structure – for example, where the design of the site does not allow for older content to be found.
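Open Graph: A protocol, originally introduced by Facebook, that uses meta tags (such as og:title, og:description and og:image) in a page’s <head> section to control how the URL is displayed when shared on social media. This is not to be confused with the email ‘open rate’, a marketing metric that measures the percentage of emails that are opened.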
Outbound Link: Links that direct users, search engines and others to another URL. These can be internal or external.
Outreach: The activity of webmasters to request backlinks to a website domain and its pages.
Page Authority (PA): A metric created by Moz, this is an index between 0-100 that scores the importance of a page relative to other pages on the web based on its backlink profile and accrued authority (equity).
Page Layout Algorithm: Launched in 2012 to target websites that had an excessive number of ads above the fold.
PageRank: Google’s system for measuring domain authority (importance) is known as PageRank. It is named after both the term “web page” and co-founder Larry Page. PageRank measures the probability a random internet user will stumble upon a website’s content. The latest patent may be found here and full explanation is given here. The PageRank increases if the hyperlinking site is:
- Topically similar
- Receives relatively high levels of traffic
- Makes the hyperlinks easily found
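The “random internet user” idea can be sketched with the classic power-iteration form of PageRank from the originally published formulation. This is an illustration only: it is not Google’s production system, it does not model the topical, traffic or link-placement factors listed above, and the function name, damping value of 0.85 and toy link graph are assumptions.

```python
def pagerank(links: dict[str, list[str]], damping: float = 0.85,
             iterations: int = 50) -> dict[str, float]:
    """Estimate the probability that a random surfer is on each page."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # With probability (1 - damping) the surfer jumps to a random page.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Otherwise the surfer follows one of the page's links at random.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # A page with no outlinks spreads its rank across all pages.
                for other in pages:
                    new_rank[other] += damping * rank[page] / n
        rank = new_rank
    return rank

# Hypothetical three-page site: both inner pages link back to the home page.
graph = {"home": ["about", "blog"], "about": ["home"], "blog": ["home"]}
ranks = pagerank(graph)
print(max(ranks, key=ranks.get))  # home
```

The scores sum to 1 because they form a probability distribution over pages, so a page gains rank only as other pages’ links point to it.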
PageSpeed Insights: PageSpeed Insights (PSI) reports on the performance of a page on both mobile and desktop devices, with suggestions on how that page may be improved. PSI provides both lab and field data about a page. Lab data is useful for debugging performance issues as it is collected in a controlled environment, but it may not capture real-world bottlenecks. Field data captures true, real-world user experience but has a more limited set of metrics. PSI has an API to help automate data collection.
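A minimal sketch of automating PSI data collection is shown below, assuming version 5 of the public PageSpeed Insights API and its url, strategy and key query parameters. The helper names are hypothetical, and the fetch function is defined but not called here because it needs network access (and, for regular use, an API key).

```python
import json
import urllib.parse
import urllib.request

PSI_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"

def build_psi_url(page_url, strategy="mobile", api_key=None):
    """Build the request URL for one page and one device strategy."""
    params = {"url": page_url, "strategy": strategy}
    if api_key:
        params["key"] = api_key
    return PSI_ENDPOINT + "?" + urllib.parse.urlencode(params)

def fetch_psi(page_url, strategy="mobile", api_key=None):
    """Request a PSI report and return the parsed JSON response."""
    request_url = build_psi_url(page_url, strategy, api_key)
    with urllib.request.urlopen(request_url, timeout=60) as response:
        return json.load(response)

print(build_psi_url("https://example.com/"))
```

In the JSON response, the lab data sits under the lighthouseResult key and the field data under loadingExperience, so a script can store both per URL and strategy.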
Paid Link: When a website pays a third party domain for a followed backlink that points back to their domain, which is in violation of Google Webmaster Guidelines.
Paid Search: A digital advertising channel of online search results where the listings have been marked as Ad or Sponsored. These are usually shown at the top of, bottom of and side of organic search results. These are charged on a Pay Per Click basis.
Panda: Named after Google engineer Navneet Panda and launched in February 2011 as part of Google’s quest to lower rankings on low-quality sites known as ‘content farms’ (i.e. “sites which are low-value add for users, copy content from other websites or sites that are just not very useful”) that relied on black hat SEO tactics and webspam. This impacted search rankings for 11.8% of queries in the U.S. In terms of signals Panda targets:
Content lacking substance or plagiarised incl. doorway pages, thin content
Poor website architecture e.g. duplicate content (e.g. not using HREFLANG)
Panda was fully incorporated into Google’s core algorithm set in Jan 2016.
Penguin: Google’s algorithm for ignoring or neutralising spammy links. 4 iterations of the algorithm were released since April 2012 until it was fully incorporated into Google core algorithm set in September 2016.
People Also Ask (PAA): A Google SERP feature that typically displays the top 4 most searched for related questions with corresponding answers. Each answer is sourced from a web page (accessed by clicking the downward chevron), which is displayed in Google’s SERP with a clickable link for attribution.
Personalised Search: Web search experiences that are tailored specifically to an individual’s interests by incorporating information about the individual beyond the specific query provided which may use search history, browsing history, cookies, demographic and psychographic data.
Pillar Page: A web page that covers the overall topic in some depth and links to the clusters of related content, which is also known as a hub page.
Pogo-Sticking: Where a user browses several of the pages in the SERP results before settling on the final page. The final page gets promoted for that user and logged as a statistic for that user’s profile if logged in. RankBrain also logs the statistics and uses the aggregate results to reorder the SERPs.
Private Blog Network (PBNs): A network of websites that are used to build backlinks for target websites, usually favoured by industries (such as gaming) that would struggle to attract links from digital PR campaigns.
Query: A phrase or keyword combination that users actually enter in search engines to find things of interest. These can be surfaced via Google Search Console (GSC) reports.
Query Deserves Freshness (QDF): Invented by Amit Singhal and introduced into Google in 2007. Search results get reordered if the query sees a sudden rise in relevancy/mentions (news reports) and traffic (search volume), for the duration of the “public interest”, based on blogs and magazines, news portals and search requests. More detail is provided here.
RankBrain: RankBrain is a part of Google’s algorithm that uses machine-learning (ML) and AI on user signals to better understand the intent of a search query, helping return the most relevant search results to users.
Implemented into the core Google algorithm in 2015, RankBrain was initially only applied to the 15% of never before seen queries it received (around 450 million searches per day). Once RankBrain was able to filter results better than the then incumbent Google Search engineering team by 10% on internal tests, RankBrain became a part of every search query.
Ranking Factor: Any variable that a search engine uses to decide the best ordering of relevant, indexed results returned for a search query. These are likely to be factors that, based on its search logs, a search engine can use internally to reliably explain the variation in content quality.
Reciprocal Links: Any set of hyperlinks between two websites that point both ways. In general, reciprocal links happen when two webmasters agree to each host a link on their own website, which points to the other webmaster’s website. In practice, websites operating this link scheme, would have a ‘links’ page which listed all the links to sites they were reciprocating with. A variation on this theme were ‘triangular’ or ‘one-way’ reciprocal links where site A would link to site B, which would link to site C and site C would link to site A. In 2007-8, Google took a hard stance against reciprocal links and started adding filters to sites that participated in such link schemes.
Reconsideration Request: Also known as a reinclusion request, these are requests to have Google review a website after its webmaster fixes the problems identified in a manual action or security issues notification, which is given in the webmaster’s Google Search Console (GSC) account. Fundamentally, Google wants to know that any spam on the site is gone or fixed, and that it’s not going to happen again. Matt Cutts recommended penalised webmasters provide a short explanation of what happened from their perspective including what actions may have led to any penalties and any corrective action that the webmaster has taken to prevent any spam in the future. If an agency was involved then show good faith providing details of said firm and what they did to help Google evaluate reinclusion requests. Note that mostly-affiliate sites may need to provide more evidence of good faith before a site will be reincluded given such sites should already be quite familiar with Google’s quality guidelines.
Redirects: The process of redirecting requests for an existing URL to a different one, effectively signalling visitors and search engines that a page has a new location. Redirects divert both users and search engines to the other URL, usually for the purposes of:
moved content: the URL address of a page or an entire section or website has been updated such that a single URL will map 1-to-1 onto another single URL
multiple paths: to reach the same content for example:
non www to www (or vice versa)
non secure to secure
merged content: the content has been shifted onto another URL such that two (or more) URLs (and their content) are combined onto a single URL.
Redirects can be executed at the server side or client side.
Redirect Chain: Occurs when there are multiple redirects between the initially requested URL (A) and the final destination URL (C). For example, A redirects to B, which in turn redirects to C. This results in the users and search engines going through multiple hops (which are processed by the site) and waiting longer for content to load. John Mueller of Google recommends that such chains contain fewer than 5 hops.
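The A to B to C example can be sketched as a walk over a redirect map, such as one exported from a crawl. The function name, the max_hops safety limit and the sample paths are hypothetical.

```python
def redirect_chain(start, redirects, max_hops=10):
    """Follow a {source: destination} redirect map from `start` to its final URL."""
    chain = [start]
    seen = {start}
    while chain[-1] in redirects:
        next_url = redirects[chain[-1]]
        if next_url in seen:
            raise ValueError("redirect loop detected at " + next_url)
        chain.append(next_url)
        seen.add(next_url)
        if len(chain) - 1 > max_hops:
            raise ValueError("too many hops starting from " + start)
    return chain

redirect_map = {"/a": "/b", "/b": "/c"}
chain = redirect_chain("/a", redirect_map)
print(chain)           # ['/a', '/b', '/c']
print(len(chain) - 1)  # 2 hops
```

A chain is usually fixed by pointing every source URL straight at the final destination, so that each request takes a single hop.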
Referrer: The address of the source web page where a person clicked a link that sent them to the destination web page. Technically, an optional field that is transmitted via the HTTP header when a web page is requested from a server. The main function of this field is to determine which page a given user was using before landing on the current page.
Referring Domains: Number of backlinking domains a website has. This KPI measures a site’s popularity on the web based on the link equity and relevance of its content. Referring Domains should be tracked across URLs and aggregated to show:
- Total Referring Domains
- Average Referring Domains per URL
- Link Velocity: Referring Domains added/lost per time period
Data sources include Google Search Console (GSC), Ahrefs, Majestic, SEMrush and BuzzSumo.
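The first two aggregations can be sketched from a list of backlinks. This is a rough illustration: the function name and sample data are hypothetical, and the hostname of the linking page is used as a stand-in for its domain, so subdomains are not collapsed into a single registered domain.

```python
from urllib.parse import urlsplit

def referring_domain_stats(backlinks):
    """Return (total referring domains, average referring domains per URL).

    `backlinks` is a list of (source_url, target_url) pairs.
    """
    by_target = {}
    for source, target in backlinks:
        by_target.setdefault(target, set()).add(urlsplit(source).hostname)
    if not by_target:
        return 0, 0.0
    all_domains = set().union(*by_target.values())
    average = sum(len(domains) for domains in by_target.values()) / len(by_target)
    return len(all_domains), average

sample_backlinks = [
    ("https://news.example.org/a", "/guide"),
    ("https://news.example.org/b", "/guide"),  # same domain, counted once
    ("https://blog.example.net/x", "/guide"),
    ("https://blog.example.net/y", "/pricing"),
]
print(referring_domain_stats(sample_backlinks))  # (2, 1.5)
```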
Reinclusion: Where Google removes any spam penalty, which usually follows a successful reconsideration request. Matt Cutts, Google’s former Head of Webspam, gives more details here.
Related Searches: Search suggestions that are displayed on search result pages (SERP) and zero result search pages. When a user performs a search, Related Searches places associated search queries within the SERPs, above the pagination links located at the bottom of the SERPs, or both.
Relevance: An indication of how relevant web content is in relation to a particular search query. The higher the likelihood a URL’s content satisfies the search query (aggregated over a statistically significant sample of users), the more relevant the content is.
Reputation Management: A discipline that makes use of SEO, public relations and legal means to improve the portrayal of brands (companies, people, products, services, political entities) in the SERPs. The first step is to audit the brand term SERPs and the sentiment of the top 30 results for each keyword, which will give an overview of the brand’s online search reputation. The higher the average sentiment, the more positive the online reputation. SEO techniques would involve promoting positive and neutral URLs above content URLs of negative sentiment.
Resource Pages: Page on a website that includes helpful links regarding a particular topic. Historically used by participants in Reciprocal Linking schemes.
Rich Snippet: Visual layer to an existing organic web search result such as reviews and prices which usually improves click through. This can be optimised using Schema.org markup.
Robots.txt: A TXT file uploaded at the root level of the server, used to tell search engines which content can and cannot be crawled, and to reference XML sitemap locations.
Robots meta tag: Used to control how individual HTML pages are shown in search results or to make sure that they are not shown. Robots meta tags are mainly used to prevent content from appearing in search results by including a noindex meta tag. When a search engine next crawls that page and sees the tag or header, it will drop that page entirely from search results, regardless of its backlinks. Robots meta tags as a method are:
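To check how a set of robots.txt rules would be interpreted, Python’s standard urllib.robotparser module can be used. The rules and URLs below are made up for illustration, and search engines may apply their own matching logic that differs in detail.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for www.example.com.
rules = """
User-agent: *
Disallow: /checkout/
Sitemap: https://www.example.com/sitemap.xml
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

blocked = parser.can_fetch("Googlebot", "https://www.example.com/checkout/basket")
allowed = parser.can_fetch("Googlebot", "https://www.example.com/blog/")
sitemaps = parser.site_maps()

print(blocked)   # False: the path is disallowed for all user agents
print(allowed)   # True
print(sitemaps)  # ['https://www.example.com/sitemap.xml']
```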
- Easy: to implement by the content community, directly in the Content Management System (CMS), requiring less technical knowledge
- HTML only: and cannot be deployed on other non HTML content such as PDFs
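A quick way to audit pages for a noindex directive is to read the robots meta tag from the HTML. The sketch below uses Python’s standard html.parser module; the class and function names are hypothetical, and it only inspects the meta tag, not the X-Robots-Tag HTTP header.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Collect the directives found in <meta name="robots"> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "robots":
            content = attributes.get("content") or ""
            self.directives.update(
                part.strip().lower() for part in content.split(",") if part.strip()
            )

def is_indexable(html: str) -> bool:
    """False when the page carries a noindex (or none) robots directive."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return not ({"noindex", "none"} & parser.directives)

page = '<html><head><meta name="robots" content="noindex, nofollow"></head></html>'
print(is_indexable(page))                                      # False
print(is_indexable("<html><head><title>Hi</title></head></html>"))  # True
```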
RSS: RSS stands for RDF Site Summary or “Really Simple Syndication”. These are web feeds that allow users and applications to access updates to websites in a standardized XML format.
Schema Markup: A collaborative community founded by Google, Bing and Yahoo! in 2011, with a mission to create, maintain, and promote schemas for structured data on the Internet, on web pages, in email messages, and beyond. Schema.org vocabulary is the language of structured data which covers entities, relationships between entities and actions, and can easily be extended thanks to their comprehensive model. Schema.org is used to markup web pages and email messages helping to convert web content from strings to actual objects search engines can understand, have more contextual understanding of and respond with richer experiences. The shared vocabulary was designed to make it easier for webmasters and developers to decide on a schema and get the maximum benefit for their efforts, and thereby become the industry standard for the web. A useful tool for marking up web content may be found here.
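One common way to add Schema.org markup to a page is as a JSON-LD script element. The sketch below builds one in Python; the helper name and the organisation details are hypothetical, and only a handful of the properties available for the Organization type are shown.

```python
import json

SCRIPT_OPEN = '<script type="application/ld+json">'
SCRIPT_CLOSE = "</script>"

def json_ld_script(data: dict) -> str:
    """Serialise a Schema.org object as a JSON-LD script element."""
    return SCRIPT_OPEN + json.dumps(data, indent=2) + SCRIPT_CLOSE

# Hypothetical entity described with the Schema.org vocabulary.
organisation = {
    "@context": "https://schema.org",
    "@type": "Organization",
    "name": "Example Ltd",
    "url": "https://www.example.com/",
}

snippet = json_ld_script(organisation)
print(snippet)
```

The output would be placed in the page’s HTML, turning the plain strings on the page into a typed entity with named properties.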
Scrape: Using manual or most commonly automated means to extract or make copies of website content. One way to achieve this in Python is to use the BeautifulSoup API package.
Search Operators: Search commands used to find SEO information. For example, the ‘site:’ operator will display the pages of a website that are in the search engine’s index, usually with the most authoritative listed first. Other commands include ‘inurl:’ and ‘intitle:’.
Search engine optimisation (SEO): The art and science of optimising websites to be visible on the search engines.
Search Generative Experience (SGE): A new feature of search engines (such as Google) to generate rich, fully formed and highly targeted content in the search results. The implication is that the value of an organic web result will be much higher, as there will be fewer positions listed on the 1st search results page. SEOs can best respond by opening content to discovery as much as possible and enriching it with schema so that it becomes the preferred LLM source. Google’s SGE is powered by its own LLM known as Bard.
Search Intent: While users have different words to express the same idea, the intent is singular. From a keyword research and selection perspective, there are 4 main search intentions: Informational, Commercial, Transactional, Navigational. Before allocation however, keywords should be grouped by search intent because audience users will have variations on a core search phrase when searching for the same object. Thus, content has a 1-to-many relationship with keywords in that a single URL can rank for multiple variations of a core keyword providing they share the same search intent. Clustering keywords by search intent can be executed by comparing the search results of both keywords for similarity.
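Comparing the search results of two keywords can be sketched as an overlap score between their ranking URLs. The example below uses Jaccard similarity as one possible measure; the function names, the 0.4 threshold and the sample URLs are assumptions rather than an established standard.

```python
def serp_similarity(urls_a, urls_b):
    """Jaccard overlap between the ranking URLs of two keywords (0.0 to 1.0)."""
    set_a, set_b = set(urls_a), set(urls_b)
    if not set_a or not set_b:
        return 0.0
    return len(set_a & set_b) / len(set_a | set_b)

def same_intent(urls_a, urls_b, threshold=0.4):
    """Treat two keywords as sharing an intent when their SERPs overlap enough."""
    return serp_similarity(urls_a, urls_b) >= threshold

# Hypothetical top results for two keyword variations.
serp_one = ["example.com/a", "example.org/b", "example.net/c", "example.com/d"]
serp_two = ["example.org/b", "example.net/c", "example.com/d", "example.io/e"]

print(serp_similarity(serp_one, serp_two))  # 3 shared of 5 distinct URLs -> 0.6
print(same_intent(serp_one, serp_two))      # True
```

Keywords that pass the threshold would be mapped to the same landing page, while those that do not would be treated as separate intents.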
Search Volume: The maximum number of search queries for a specific search term in a search engine such as Google per month. This statistic is used by SEOs to evaluate the maximum traffic potential of a keyword for campaign targeting and optimisation purposes. The data, which is usually extracted from Google Ads Keyword Planner, can be subject to auction popularity (long tail keywords may return zero according to Google Ads as there aren’t enough bidders in the auction), as well as seasonal, regional and thematic fluctuations. The best source of actual search volume data is Google Search Console’s Impressions metric; however, this needs to be considered with respect to the ranking position.
Seed Keywords: Words or phrases that SEOs can use as the starting point in the keyword research and selection process to discover more keywords, thus serving as the foundations of keyword research be it in GSC or Google Ads Keyword Planner.
SERPs: Search Engine Results Page. Copies of the SERP may be tracked using SERP tracking tools such as SEO Monitor, getSTAT, Advanced Web Ranking (AWR), Accuranker, DataForSEO’s SERP API and many others. DataForSEO also offers a historic SERP API which allows you to go back in time which is potentially useful if you are trying to forensically investigate a recent site migration or Google algorithm update.
SERP Features: Search Engine Result Pages (SERP) formats are supplemental results shown in addition to the organic web results. They take the form of bespoke formats, used to enhance the SERP UX. The most common organic SERP Features are:
Rich Snippets: which add a visual layer to an existing organic web search result such as reviews and prices.
Featured Snippets: an extracted quote from a website complete with the cited source URL (as opposed to an ‘Answer Box’ which has information without a cited source).
Universal Results: that appear in addition to organic results (e.g., images, videos, featured snippets)
Knowledge Graph: data which appears as panels or boxes (e.g., weather, Celebrity Knowledge Panel)
Server Logs: A log file (or several files) automatically created and maintained by a server, recording the activities it performed, such as a history of page requests, transactions, errors and intrusions. A log entry comprises the timestamp, user information and event information. These are mainly used by IT and Dev operations; however, more advanced SEOs will analyse these for opportunities.
Server Logs File Analysis: Analysis of log files to forensically examine the actual interaction of search engines with a website for technical auditing purposes. This can help SEOs understand precisely what a search engine reads, how often the search engine reads it and how much crawl resource is used in terms of the time spent (ms) accessing a URL.
Server side: Code or SEO implementation items executed at the server level, such as an X-Robots-Tag header. Server-side rendering (SSR) is the process of rendering web pages through 1st party servers.
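A basic version of this analysis can be sketched by parsing each log line and counting requests per URL for a given crawler. The example assumes the widely used combined log format; the pattern, function name and sample lines are illustrative, and matching on the user agent string alone does not verify that a request really came from Google.

```python
import re
from collections import Counter

# Assumed layout: ip, identity, user, [time], "request", status, bytes, "referrer", "agent"
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) \S+ "[^"]*" "(?P<agent>[^"]*)"'
)

def googlebot_hits(lines):
    """Count requests per path where the user agent mentions Googlebot."""
    hits = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match and "Googlebot" in match.group("agent"):
            hits[match.group("path")] += 1
    return hits

BOT = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
sample_log = [
    f'192.0.2.10 - - [10/Oct/2023:13:55:36 +0000] "GET /blog/ HTTP/1.1" 200 5120 "-" "{BOT}"',
    '192.0.2.20 - - [10/Oct/2023:13:56:01 +0000] "GET /blog/ HTTP/1.1" 200 5120 "-" "Mozilla/5.0"',
    f'192.0.2.10 - - [10/Oct/2023:14:02:11 +0000] "GET /products/ HTTP/1.1" 200 734 "-" "{BOT}"',
    f'192.0.2.10 - - [10/Oct/2023:15:40:09 +0000] "GET /blog/ HTTP/1.1" 304 0 "-" "{BOT}"',
]
print(googlebot_hits(sample_log))
```

Extending the pattern with a response-time field, where the server is configured to log one, would allow the time spent per URL to be measured as well.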
Shopify: An e-commerce content management system (CMS) platform enabling businesses to sell products and services online, released by Shopify Inc, a Canadian company listed on the New York Stock Exchange as SHOP. Read our guide to Shopify SEO.
Site audit: An analysis of factors that affect a website’s visibility in search engines giving insight into a website’s overall traffic and any individual page’s performance. A technical site audit uses crawler tools to simulate the interaction of search engines with the site, identifying potential issues that might prevent content from being accessed and included in the search engine index. The audit will include an explanation of the issue, the diagnosis, impact and solution with a level of priority and difficulty.
Site links: Links from the same domain that are clustered together under a web result. Search engine systems analyse the link structure of a website to find navigational shortcuts that will save users time and allow them to quickly find the information they’re looking for. Queries that trigger SERPs including site links are categorised as having ‘Navigational’ search intent.
Site Architecture: Also known as website architecture, this is a hierarchical structure of a brand’s website’s pages.
(HTML) Sitemap: Page providing users with information about the content as an extended menu, usually within topical categories. While these were once in vogue, HTML sitemaps are now rarely used as an extensive menu will normally suffice.
Site-wide links: Links that are located in a (boilerplate) section common to all web pages where the effect is to have all pages on a website linking to the destination URL. For early blog websites, this was known as a ‘blogroll’.
Social media signals: A webpage’s collective shares, likes and overall social media visibility as perceived by search engines. These activities contribute indirectly to a page’s organic search ranking and are seen by SEOs as another form of citation, similar to backlinks. These are more likely to drive a higher volume of brand searches which increases the authority metrics of a site.
Source code: The human-readable text of a computer program, which is compiled or interpreted into instructions that a machine can execute. The HTML code of a website is also called source code.
Spam/Spamdexing: Also known as search engine spam, search engine poisoning, black-hat SEO, search spam or web spam – the deliberate manipulation of search engine indexes. It involves a number of methods, such as link building and repeating unrelated phrases, to manipulate the relevance or prominence of resources indexed, in a manner inconsistent with the purpose of the indexing system.
Spiders: A type of bot that is typically operated by search engines to access, evaluate and index the content of websites all across the Internet so that those web URL contents may appear in search engine results.
Split AB testing: Method of statistically comparing two versions of a webpage or app against each other to determine which one performs better. A is the control group (with no change) and B is the treatment (the group with change/hypothesis being tested). Users are exposed at random to one of the variants, and statistical analysis is conducted to determine which variation performs better for a given goal – usually 2 standard deviations difference in performance between groups.
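To make the statistical comparison concrete, here is a minimal sketch of a two-proportion z-test, one common way to compare conversion rates between the control and the treatment. The function name, the sample figures and the use of this particular test are assumptions for illustration; the threshold of 2 follows the rule of thumb above.

```python
import math


def two_proportion_z(conv_a, n_a, conv_b, n_b):
    """Return the z-score for the difference in conversion rate (B minus A)."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B perform the same.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    return (p_b - p_a) / std_err


# Hypothetical figures: 5,000 users per variant.
z = two_proportion_z(conv_a=200, n_a=5000, conv_b=260, n_b=5000)
significant = abs(z) >= 2  # roughly "2 standard deviations"
print(round(z, 2), significant)
```

A z-score beyond roughly 2 in either direction suggests the difference is unlikely to be due to chance alone, although sample size and test duration should be fixed before the test starts.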
Sponsored: Both Google and the Advertising Standards Authority (ASA) have a very strong stance on paid partnerships and backlinks. The ASA’s guidelines state that all paid partnerships between a brand and third party should be clearly disclosed within all content. Google’s guidelines state that any ‘link scheme’ could have a negative impact on search results for both parties, urging webmasters to add nofollow tags to ‘paid links’ in these cases. While it’s against Google’s guidelines to pay for links or content that includes links, there may still be an exchange of payment as long as the third party:
reserves full editorial control of their content
discloses the sponsorship
marks up the link using the rel="sponsored" attribute
Srcset: An attribute of the <img> element and of the <source> element (used inside <picture>) that specifies the URLs of the images for the browser to choose from in different situations. Most commonly used with the sizes attribute to create responsive images that adjust according to browser conditions.
SSL Certificate: SSL is an abbreviation of Secure Sockets Layer. An SSL certificate is a digital certificate that authenticates a website’s identity and enables an encrypted connection for HTTPS.
(HTTP) Status Codes: Generated by the server hosting the website when it responds to a request made by a user’s browser or a search engine. Search engines use HTTP status codes to understand the status of content to potentially update their index.
Stop words: A set of commonly used words in any language that are removed as ‘noise’ to help computers in NLP and text mining applications get the gist of textual content. For example, in English, “the”, “is” and “and” would qualify as stop words. A number of Python libraries can be used to achieve this purpose: NLTK, SpaCy and Gensim.
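The idea can be sketched without any third-party library. The stop word set below is a tiny hand-picked assumption for illustration only; the libraries named above ship much fuller lists.

```python
import re

# Tiny illustrative stop word set (an assumption, not a library's list).
STOP_WORDS = {"the", "is", "and", "a", "of", "to", "in"}


def remove_stop_words(text):
    """Lowercase the text, split it into words and drop the stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [token for token in tokens if token not in STOP_WORDS]


print(remove_stop_words("The anchor text is visible and clickable"))
# ['anchor', 'text', 'visible', 'clickable']
```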
Structured Data: Structured data is a standardised format for providing information about a page and classifying the page content. Search engines use structured data that they find on the web to understand the content of the page, as well as to gather information about the web and the world in general. Google Search also uses structured data to enable special search result features and enhancements. Structured data is coded using in-page markup on the page that the information applies to. The structured data on the page describes the content of that page. Most search engine structured data uses schema.org vocabulary.
Subdomain: A piece of additional information preceding the website’s domain name. It allows websites to separate and organise content for a specific function — such as a blog or an online store — from the rest of the website. Search engines will normally treat subdomains as separate sites from those hosted on other subdomains or the root domain.
Subfolder / Subdirectory: A type of website hierarchy under a subdomain that uses folders to organise content on a website. A subdirectory is the same as a subfolder and the names can be used interchangeably. Sites organised under subdirectories as opposed to separate subdomains generally perform better in terms of accruing and distributing authority among content.
Syndicated Content: Web-based content that is re-published by a third-party website, which finds the content using RSS feeds. Note that according to Google the canonical link element is not recommended for avoiding duplication by syndication partners, as the pages are often very different and that the most effective solution is for partners to block indexing of the syndicated content.
Taxonomy: The practice of organising and classifying items based on similarities which typically follows the user research and content inventory processes.
Technical SEO: The discipline of optimising the website build code and configuration to ensure search engine crawlers and indexers can:
access the site
find all pages and content
extract the content
To meet the above objectives, Technical SEO covers:
Text to HTML code: A ratio used for measuring the amount of text content on the web page in comparison to the amount of HTML code required for displaying it, providing an indication of the code’s efficiency for delivering the web content experience. The Text to HTML ratio is commonly known as a text to code ratio.
TF-IDF: Term frequency–inverse document frequency is a numerical statistic that is intended to reflect how important a word is to a document in a collection (known as a corpus). Used in information retrieval, text mining, and general Natural Language Processing (NLP) to help computers understand what a document is mainly about.
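A minimal sketch of how a text to code ratio might be calculated, using only the Python standard library. Dividing visible text characters by total HTML characters is one plausible definition and is assumed here; tools differ in exactly what they count, and the class and function names are made up for the example.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, ignoring the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def text_to_html_ratio(html):
    """Return visible text length divided by total HTML length."""
    if not html:
        return 0.0
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace so indentation does not inflate the text.
    text = " ".join(" ".join(parser.parts).split())
    return len(text) / len(html)


page = "<html><head><style>p{color:red}</style></head><body><p>Hello world</p></body></html>"
print(f"{text_to_html_ratio(page):.1%}")
```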
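The calculation can be sketched in plain Python. This uses the basic variant (term frequency as a share of the document's words, inverse document frequency as the natural log of total documents over documents containing the term); libraries often apply smoothing or normalisation, so their numbers will differ. The corpus and function name are assumptions for illustration.

```python
import math
from collections import Counter


def tf_idf(corpus):
    """Return one {term: score} dict per document in the corpus."""
    docs = [doc.lower().split() for doc in corpus]
    n_docs = len(docs)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in docs:
        counts = Counter(tokens)
        scores.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


corpus = ["seo audit checklist", "seo link building", "seo audit tools"]
scores = tf_idf(corpus)
for term, score in sorted(scores[0].items(), key=lambda item: -item[1]):
    print(term, round(score, 3))
```

In this toy corpus "seo" scores zero because it appears in every document, while "checklist" scores highest in the first document because it appears nowhere else.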
Thin Content: Content that has little or no value to the user which could be doorway pages, low-quality affiliate pages, or autogenerated. URLs containing such content should either be refurbished or migrated to other URLs. Google Panda targeted thin content.
Title Tag: Page titles are a ranking signal and used by search engines to semantically identify content on a webpage. In the source code they are encapsulated by the title HTML tag, placed inside the <head> tag. These can be autogenerated using an AI Recurrent Neural Network model, see our guide.
Top Level Domain (ccTLD and gTLD): Country code Top Level Domains (ccTLDs) are two-letter domains assigned to specific countries. For example, .uk is for the United Kingdom. Generic Top Level Domains (gTLDs or just TLDs) refer to domain extensions with three or more characters. These TLDs are maintained by the Internet Assigned Numbers Authority (IANA). Common examples include .com, .gov and .org.
Toxic links: Backlinks that weaken the organic placement of a website, which means less organic traffic from the search engines, usually because of negative SEO.
Traffic: Number of visits to a brand’s website from organic or paid-search results.
Transactional: Intent where the user is ‘buyer ready’ and searches for products or services usually with modifier keywords such as ‘buy’, ‘for sale’, ‘cheap’, ‘price’
Transport Layer Security (TLS): an Internet Engineering Task Force (IETF) standard cryptographic protocol designed to provide communications security over a computer network by providing authentication, privacy and data integrity between two communicating computer applications.
Trust Flow (TF): A metric from the SEO software company Majestic that measures the perceived trustworthiness of a website based on the authority of its backlinks. In essence this is an alternative measure of domain authority.
User-Generated Content (UGC): aka User-Created Content (UCC), is a form of content posted by users on websites such as blog comments, social media, discussion forums and wikis. Where links are generated from UGC, these should be marked up with the attribute rel="ugc".
Unlinked Mention: Mention of a brand on a website with no hyperlink to the said brand’s website. Sometimes the result of PR campaign coverage.
Universal Search: Results that appear in addition to organic results (e.g., images, videos, featured snippets)
Unnatural Link: An artificially created link, used to manipulate search results. These fall foul of Google’s Webmaster Guidelines.
URL: Uniform Resource Locator, known as a web address, and used to reference a web resource by specifying its location on a computer network and a mechanism for retrieving it. A URL is a specific type of Uniform Resource Identifier (URI). These will contain the HTTP protocol (http|https), the subdomain if present (e.g. www), the hostname and the top level domain (TLD) extension.
URL Parameter: Also known as “query strings”, these are a method of structuring additional information for a given URL, for purposes such as:
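The parts of a URL can be inspected with Python's standard urllib.parse module. The example address is made up; note that the library reports the subdomain, domain and TLD together as the hostname.

```python
from urllib.parse import urlsplit

# Hypothetical example address.
parts = urlsplit("https://www.example.com/blog/seo-glossary?sort=asc")

print(parts.scheme)    # https
print(parts.hostname)  # www.example.com
print(parts.path)      # /blog/seo-glossary
print(parts.query)     # sort=asc
```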
Sorting and Filtering
Parameters are appended after ‘?’ symbol in the URL, where multiple parameters can be added and separated by the ‘&’ symbol.
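As a short sketch, Python's standard urllib.parse module can read parameters out of a URL and build a query string from key and value pairs. The URL and parameter names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical filtered and sorted category page.
url = "https://www.example.com/shoes?colour=red&sort=price_asc"

# Everything after '?' is the query string; '&' separates the parameters.
params = parse_qs(urlsplit(url).query)
print(params)  # {'colour': ['red'], 'sort': ['price_asc']}

# Going the other way: build a query string from a dict.
query = urlencode({"colour": "red", "sort": "price_asc"})
print(query)  # colour=red&sort=price_asc
```

parse_qs returns a list for each key because the same parameter may legitimately appear more than once in a URL.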
URL Slug: The portion of the URL after the last forward slash.
Usability: Also known as User Interface Design. From a website perspective, usability is a user-centric design process to ensure that websites/applications are efficient and easy to use for their target audience. The principles of usability are:
availability – content on the website is easy to access including server uptime and device friendliness
clarity – the design makes it easy for users to complete tasks
recognition – the design uses conventions that require little to no learning on the user’s part to navigate and use
relevance – content that engages
User Agent: A browser user agent is a string identifying the browser and operating system to the web server. A search engine user agent is a string identifying a search engine. Requests by all user agents are stored in server logs.
User Experience (UX): A design process teams use to create products that provide meaningful and relevant experiences to users, which concerns the entire process of acquiring and integrating a product/website which encompasses aspects of branding, design, usability and function.
Vertical Search: The process of content indexation and exposition for a given industry or market place, i.e. sites dedicated to serving search results for users looking for something in a single product/market category such as flights, hotels, restaurants, clothing, jobs, mortgages, insurance. Vertical experiences are channel specific such as mobile apps.
Video: Google and other search engines also display dedicated results to video content by way of Universal search, Google Video search and Featured snippets. Note that adding a description to your videos in structured data is no longer required although Google still recommends doing it.
Visibility: The overall volume and rating of a site’s value in organic search. It is a function of its ranking positions in the search results (rating) and the top 10 coverage extent of the target keywords (by volume).
Voice Search: Combines speech recognition technology with search engine keyword queries to enable users to speak questions rather than type them. Voice search mirrors the rise of mobile search, being present on smartphones, tablets and home assistant devices like Amazon’s Echo, which means audiences can find optimised content from anywhere.
Webpage: A hypertext document on the World Wide Web which is delivered by a web server to the user and displayed in an internet browser. A web page is accessed by typing the URL into the browser’s address bar.
Webmaster Guidelines: Information about how to create and maintain a website in a way that helps search engines to find, index and rank a brand’s website.
Website: Group of related web pages under a domain.
Webspam: The deliberate manipulation of search engine indexes using techniques such as manufacturing links and keyword stuffing, to manipulate the relevance or prominence of web pages.
White-Hat: Techniques of SEO which adhere to the search engine webmaster guidelines and use sustainable user centric practices to publish content, manage how search engines interact with the website and acquire backlinks.
Word Count: The number of words in the body copy which excludes words in the boilerplate sections such as the header, footer and sidebar. The optimal word count depends on the competition and the content expected for the user query.
WordPress: A free and open-source content management system written in PHP and paired with a MySQL or MariaDB database, with support for HTTPS, that started life as a blogging platform. The platform can be extended to customise the appearance using themes and the functionality using plugins.
XML Sitemap: Sitemaps uploaded to the server in XML format containing a list of desired URLs for search engines to crawl with supplemental information on when each was last updated and any alternate language versions of the page. XML sitemaps may be referenced from the robots.txt file or submitted to a search engine’s webmaster tools console. Additional sitemaps may be submitted for news, videos and images.
X Robots Tag: The server-level version of the robots meta tag, used to control how content is shown in search results or to make sure that it’s not shown. Instead of a meta tag, the robots directive can be set at server level using the X-Robots-Tag header with a value of either noindex or none in the response. This method is especially useful for non-HTML resources, such as PDFs, video files, and image files.
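A minimal sketch of generating a basic XML sitemap with Python's standard library. The URLs, dates and the build_sitemap helper are assumptions for the example; only the loc and lastmod fields are shown, and alternate language versions are left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"


def build_sitemap(entries):
    """Build sitemap XML from (url, last_modified_date) pairs."""
    urlset = ET.Element("urlset", xmlns=SITEMAP_NS)
    for loc, lastmod in entries:
        url = ET.SubElement(urlset, "url")
        ET.SubElement(url, "loc").text = loc
        ET.SubElement(url, "lastmod").text = lastmod
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages and last-updated dates.
sitemap_xml = build_sitemap([
    ("https://www.example.com/", "2023-10-01"),
    ("https://www.example.com/blog/seo-glossary", "2023-10-10"),
])
print(sitemap_xml)
```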
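As a hedged sketch of the logic only, the hypothetical helper below decides which response header to attach based on the file extension of the requested path. In practice this is usually configured in the web server or application framework rather than written by hand, and the extension list is an assumption.

```python
import posixpath

# Assumed set of non-HTML file types to keep out of the index.
NOINDEX_EXTENSIONS = {".pdf", ".mp4", ".png"}


def robots_headers(url_path):
    """Return extra response headers for the given URL path."""
    extension = posixpath.splitext(url_path.lower())[1]
    if extension in NOINDEX_EXTENSIONS:
        return {"X-Robots-Tag": "noindex"}
    return {}


print(robots_headers("/files/Brochure.PDF"))  # {'X-Robots-Tag': 'noindex'}
print(robots_headers("/blog/post"))           # {}
```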
Yandex: Russia’s most popular search engine and the largest technology company in Russia.
YMYL pages: “Your Money or Your Life”. Content that could have a significant negative impact on the quality of people’s lives and/or their finances, usually content concerning products sold in regulated industries i.e. “pages or topics [that] could potentially impact a person’s future happiness, health, financial stability, or safety”. Google offers the following categories of YMYL:
- News and current events
- Health and safety
- Groups of people