Duplicate content is content that appears on the Internet at more than one URL. When several identical pieces of content exist, search engines find it difficult to decide which version is most relevant to a given search query. To provide the best search experience, search engines rarely show multiple duplicates, so they are forced to choose the version that is most likely to be the original or the best match for the query.
In general, there are a few ways of resolving this issue:
- Declare the preferred version as the canonical URL (also known as the primary URL) with a canonical tag, a link element with rel="canonical", in the head section of each duplicate page.
- Permanently redirect (HTTP 301) or rewrite the duplicate URLs to their canonical URL.
- Ensure each of these pages has unique content:
  - Unique user reviews that differentiate them from other, similar product pages
  - Original product descriptions for every product on the site (prioritise the highest-margin product pages first)
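The first two fixes above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the list of ignored query parameters, the normalisation rules (lower-casing the host, dropping the trailing slash and the fragment) and the function names are hypothetical, and a real site would pick rules that match its own URL structure.

```python
from html import escape
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical set of query parameters that do not change the page content.
IGNORED_PARAMS = {"utm_source", "utm_medium", "utm_campaign", "sessionid", "ref"}


def canonical_url(url: str) -> str:
    """Map a duplicate URL onto one preferred (canonical) form."""
    parts = urlsplit(url)
    # Keep only the parameters that affect content, in a stable order.
    kept = sorted(
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in IGNORED_PARAMS
    )
    # Treat "/shoes/" and "/shoes" as the same page; keep "/" for the root.
    path = parts.path.rstrip("/") or "/"
    # The fragment is dropped because it never reaches the server.
    return urlunsplit(
        (parts.scheme.lower(), parts.netloc.lower(), path, urlencode(kept), "")
    )


def canonical_link_tag(url: str) -> str:
    """Build the canonical tag to place in the page's head section."""
    return f'<link rel="canonical" href="{escape(canonical_url(url), quote=True)}">'


def needs_redirect(url: str) -> bool:
    """True when the requested URL differs from its canonical form."""
    return canonical_url(url) != url
```

For example, canonical_url("https://Example.com/shoes/?utm_source=news&color=red") gives "https://example.com/shoes?color=red", and needs_redirect reports that the original URL is a candidate for a permanent redirect to that address.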
Thin content refers to a lack of text-based content on your website. It can be something like product descriptions taken from a feed that also appears on many other sites, or simply a page that has little content on it other than elements like the navigation, such as a doorway page.
Pages like these don’t provide users with substantially unique or valuable content, and they violate Google’s Webmaster Guidelines. Here are a few common examples of pages that often have thin content with little or no added value:
- Automatically generated content (such as paginated pages, unless value is added)
- Thin affiliate pages
- Content from other sources, for example scraped content or low-quality guest blog posts
- Doorway pages
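As a rough way to spot candidates for review, the amount of text outside the navigation and similar elements can be counted. The sketch below is a crude heuristic and not how any search engine measures quality: the skipped tags, the 150-word threshold and the function names are assumptions chosen for illustration, and a low word count only flags a page for a human to look at.

```python
from html.parser import HTMLParser

# Hypothetical threshold; tune it against pages you already know are fine.
MIN_WORDS = 150


class _TextExtractor(HTMLParser):
    """Collects words that sit outside boilerplate elements."""

    # Assumed boilerplate containers whose text should not count.
    SKIPPED = {"script", "style", "nav", "header", "footer"}

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.words: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Only count text that is not inside a skipped element.
        if self._skip_depth == 0:
            self.words.extend(data.split())


def main_word_count(html: str) -> int:
    """Count words in the page, ignoring navigation, scripts and styles."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return len(parser.words)


def looks_thin(html: str, min_words: int = MIN_WORDS) -> bool:
    """Flag a page for manual review when it has few words of its own."""
    return main_word_count(html) < min_words
```

Running looks_thin over each page in a crawl gives a shortlist to check by hand against the examples listed above.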