You are probably familiar with the principle of a sitemap. If not, do not worry, I will briefly explain it. But too often I see that good use is not yet made of all the options that come with sitemaps, and that not all of their requirements are considered.
Why I am going to talk about sitemaps
As the web evolves, so do Google and SEO. This means that what is considered best practice is often in motion. What was good advice yesterday may no longer be good advice today. This is especially true for sitemaps, which are almost as old as SEO itself.
The problem is that once everybody has talked about them on forums, published recommendations on blogs and reinforced opinions on social media, it takes time to separate valuable advice from incorrect information. This creates a body of misinformation, and I want to counter it.
So although most of you probably already know that submitting a sitemap in Google Search Console is important, you may not realize how much care it takes to implement one in a way that supports meaningful SEO performance indicators.
Let me discuss the confusion about best practices regarding sitemaps and clear up the misunderstandings once and for all.
What is a Sitemap?
A sitemap is a page (or file) with a list of all pages that can be found on a website.
Simply put, an XML sitemap is a list of the URLs of your website. It acts as a roadmap to show search engines what content is available and how it can be accessed.
But why are they useful? Let me explain with a simple example of a website with nine pages:
- With a sitemap, a search engine will find all nine pages with one visit to the XML sitemap file.
- Without a sitemap, it will have to follow five internal links through the website to find all nine pages.
So there is a clear difference between how a search engine can move through the site with a sitemap and how it, like a visitor, has to move without one in order to see everything.
This ability of an XML sitemap to assist crawlers with faster indexing is especially important for websites that:
- Have thousands of pages and/or a deep website architecture.
- Often add new pages.
- Often change the content of existing pages.
- Suffer from weak internal linking and/or orphaned pages (by orphaned pages I mean pages that cannot be reached in a normal, logical way).
Search engines can technically find your URLs without a sitemap, but by including pages in an XML sitemap you indicate that you consider them to be high-quality pages yourself. There is no guarantee that an XML sitemap will get your pages crawled, let alone indexed or ranked, but submitting one certainly improves your chances, because you simply make things clearer and easier for a search engine.
Important attributes for a sitemap
If you look at the structure of an XML sitemap, there are a number of elements that will not be clear to everyone. These tags do have a certain semantic value, so it’s good to know what they do and whether you are using them properly.
Loc (Location) Tag
This mandatory tag contains the absolute, canonical version of the URL location. It must accurately display site protocol (http or https) and whether you have chosen to include or exclude www.
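To make the loc requirement concrete, here is a minimal sketch of a check you could run before a URL goes into the sitemap. The scheme and host values and the function name `is_canonical_loc` are assumptions for illustration; replace them with the protocol and www choice of your own site.

```python
from urllib.parse import urlsplit

# Hypothetical site settings: the protocol and host you chose as canonical.
CANONICAL_SCHEME = "https"
CANONICAL_HOST = "www.example.com"


def is_canonical_loc(url):
    """Return True if url is absolute and uses the chosen scheme and host."""
    parts = urlsplit(url)
    return parts.scheme == CANONICAL_SCHEME and parts.netloc == CANONICAL_HOST


print(is_canonical_loc("https://www.example.com/blog/"))  # True
print(is_canonical_loc("http://www.example.com/blog/"))   # False: wrong protocol
print(is_canonical_loc("https://example.com/blog/"))      # False: www is missing
print(is_canonical_loc("/blog/"))                         # False: relative URL
```

This only checks the protocol and the host. Whether a URL is really the canonical version of a page still depends on the canonical tag on the page itself.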
Hreflang / language
For international websites it is worth implementing hreflang handling here as well.
Using the xhtml:link element to specify the language and region variants for each URL keeps that markup out of your pages. This reduces page load time, which implementing the link elements in the <head> cannot do.
Yoast has written an extensive post about how to deal with hreflang.
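As an illustration of hreflang in a sitemap, the sketch below builds a small sitemap with Python's standard library. The example.com URLs and the function name `build_hreflang_sitemap` are made up for this example. Each url entry lists every language variant, including the page itself.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("xhtml", XHTML_NS)


def build_hreflang_sitemap(variants):
    """variants maps an hreflang code to the URL of that language version."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url in variants.values():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        # Every variant gets the full set of alternates, itself included.
        for lang, alt_url in variants.items():
            ET.SubElement(
                url,
                f"{{{XHTML_NS}}}link",
                {"rel": "alternate", "hreflang": lang, "href": alt_url},
            )
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical English and Dutch versions of the same page.
print(build_hreflang_sitemap({
    "en": "https://www.example.com/en/",
    "nl": "https://www.example.com/nl/",
}))
```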
Lastmod (Last modified) Tag
Officially this tag is optional, but I still recommend it very strongly. It is used to communicate the date and time the URL was last changed. This is important for search engines, because it gives them a good idea of how recent the content is.
Search engines use this as a signal to rank your pages.
It is also advisable to communicate freshness, but remember to update the change date only if you have made major changes.
For small adjustments that are not noteworthy, a short edit note in the post or on the page itself is enough. Changing the date without a real change can also look like an attempt to mislead a search engine, so be careful with that.
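One possible way to apply this rule automatically is sketched below. The 10% threshold for what counts as a "major" change and both function names are my own assumptions, not anything Google prescribes. The date is written in the W3C datetime format that the sitemap protocol expects.

```python
from datetime import datetime, timezone
from difflib import SequenceMatcher

# Assumption: at least 10% difference in the text counts as a major change.
MAJOR_CHANGE_THRESHOLD = 0.10


def is_major_change(old_text, new_text, threshold=MAJOR_CHANGE_THRESHOLD):
    """Compare two versions of a page and report whether the edit is major."""
    similarity = SequenceMatcher(None, old_text, new_text).ratio()
    return (1.0 - similarity) >= threshold


def updated_lastmod(current_lastmod, old_text, new_text, now=None):
    """Return a new lastmod value only when the content changed substantially."""
    if not is_major_change(old_text, new_text):
        return current_lastmod  # small edit: keep the existing date
    now = now or datetime.now(timezone.utc)
    return now.isoformat(timespec="seconds")


# A fixed typo keeps the old date.
print(updated_lastmod(
    "2023-01-15T09:00:00+00:00",
    "Sitemaps help crawlers find pages.",
    "Sitemaps help crawlers find pages!",
))
```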
Changefreq (Change frequency) Tag
This optional tag gives the search engines an indication of how often the content of the URL is expected to change.
Google, and in particular John Mueller, has stated that “change frequency does not play so much of a role with sitemaps” and that “it is much better to simply give the time indication directly”. I would therefore recommend that you follow Google's advice on this.
Priority Tag
This optional tag is meant to tell the search engines how important a page is relative to your other URLs, on a scale from 0.0 to 1.0. However, several people from Google, including Gary Illyes, have indicated that they do not do anything with it.
Your website needs an XML sitemap, but not necessarily the metadata about priority and change frequency, since the search engines hardly look at them.
Use the lastmod tag accurately and focus your attention on making sure you include the correct URLs.
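Putting this section together, a sitemap needs little more than loc and lastmod. The sketch below generates such a file with Python's standard library. The function name `build_sitemap` and the example.com URLs are placeholders for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
ET.register_namespace("", SITEMAP_NS)


def build_sitemap(pages):
    """pages is a list of (loc, lastmod) tuples; returns the sitemap as text."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for loc, lastmod in pages:
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = loc
        ET.SubElement(url, f"{{{SITEMAP_NS}}}lastmod").text = lastmod
        # changefreq and priority are left out on purpose.
    body = ET.tostring(urlset, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


# Hypothetical pages with their last real change date.
print(build_sitemap([
    ("https://www.example.com/", "2024-05-01"),
    ("https://www.example.com/blog/sitemaps/", "2024-04-12T08:30:00+00:00"),
]))
```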
Different forms of sitemaps
Besides the general sitemap, there are several other sitemaps that you may need to use. Think of sitemaps for images, videos and news and, for example, mobile sitemaps.
If you do a lot with images, use an image sitemap.
If you do a lot with videos, use a video sitemap.
If you work specifically with news items (press releases), use a news sitemap.
If you occasionally do one of the above, it is not necessary to devote a complete sitemap to it. You may also have very specific entities, such as a website with recipes. You can still have the normal sitemap but you could also use a specific recipes sitemap. This way you can work in a more modular way.
A mobile sitemap is not necessarily a must even if you think so. It is advisable to use this if the mobile experience and desktop experience differ completely or greatly from each other.
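For the image case mentioned above, here is a rough sketch of what generating an image sitemap could look like. It uses Google's image sitemap namespace. The function name `build_image_sitemap` and the example URLs are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
IMAGE_NS = "http://www.google.com/schemas/sitemap-image/1.1"
ET.register_namespace("", SITEMAP_NS)
ET.register_namespace("image", IMAGE_NS)


def build_image_sitemap(pages):
    """pages maps a page URL to the list of image URLs shown on that page."""
    urlset = ET.Element(f"{{{SITEMAP_NS}}}urlset")
    for page_url, image_urls in pages.items():
        url = ET.SubElement(urlset, f"{{{SITEMAP_NS}}}url")
        ET.SubElement(url, f"{{{SITEMAP_NS}}}loc").text = page_url
        # One image:image entry per image that appears on the page.
        for image_url in image_urls:
            image = ET.SubElement(url, f"{{{IMAGE_NS}}}image")
            ET.SubElement(image, f"{{{IMAGE_NS}}}loc").text = image_url
    return ET.tostring(urlset, encoding="unicode")


# Hypothetical recipe page with two photos.
print(build_image_sitemap({
    "https://www.example.com/recipes/apple-pie/": [
        "https://www.example.com/img/apple-pie-1.jpg",
        "https://www.example.com/img/apple-pie-2.jpg",
    ],
}))
```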
Now we come to the fun part. How do you get the most out of XML sitemaps?
Only include relevant SEO pages in XML sitemaps
An XML sitemap is a list of pages that you recommend for crawling, which is not necessarily every page of your website. The XML sitemap indicates that you find the included URLs more important than the URLs that are not in the sitemap.
You actually use your sitemap to let the search engines know: “I would really appreciate it if you would focus on these URLs in particular.”
Basically, you help the search engines use your crawl budget effectively. The crawl budget is something that you build up over time by consistently handling content creation properly.
By including only SEO-relevant pages, you help search engines crawl your site in a smarter way to reap the benefits of better indexing.
What you should be excluding from your sitemaps
- Non-canonical URLs
- Duplicate pages
- URLs with pagination
- URLs with parameters / session ID
- Pages with search results
- Comment URLs
- Share via email URLs.
- Filtering URLs
- Archive pages.
- All redirects (3xx), missing pages (4xx) or server error pages (5xx).
- Pages blocked by robots.txt.
This way you keep your crawl budget as clean as possible, because you do not add any useless pages to your sitemap.
It is therefore advisable to regularly check your crawl statistics in the Google Search Console.
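The exclusion list above can be turned into a simple filter. The sketch below is only an outline: the `Page` fields, the path fragments and the function name `belongs_in_sitemap` are assumptions, and your own site will need its own rules for things like comment or share URLs.

```python
from dataclasses import dataclass
from urllib.parse import urlsplit


@dataclass
class Page:
    url: str
    status: int               # HTTP status code returned for the URL
    canonical: str            # canonical URL declared on the page
    blocked_by_robots: bool = False


# Hypothetical path fragments for search results, pagination and archives.
EXCLUDED_PATH_PARTS = ("/search", "/page/", "/archive/")


def belongs_in_sitemap(page):
    """Return True only for pages that pass every exclusion rule."""
    parts = urlsplit(page.url)
    if page.status != 200:            # redirects, missing pages, server errors
        return False
    if page.blocked_by_robots:        # blocked by robots.txt
        return False
    if page.canonical != page.url:    # duplicates and non-canonical URLs
        return False
    if parts.query:                   # parameters, session IDs, filters
        return False
    if any(part in parts.path for part in EXCLUDED_PATH_PARTS):
        return False
    return True


article = Page("https://www.example.com/blog/sitemaps/", 200,
               "https://www.example.com/blog/sitemaps/")
filtered = Page("https://www.example.com/shop/?color=red", 200,
                "https://www.example.com/shop/")
print(belongs_in_sitemap(article))   # True
print(belongs_in_sitemap(filtered))  # False
```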
BONUS: Sitemaps and de-indexing
Did you know that you can also use a sitemap to get pages out of the Google index?
Sometimes it happens that despite the best intentions you get pages in the Google Index while that is not desired.
Google Search Console offers the option to remove URLs, and there is also the disavow file ( https://www.google.com/webmasters/tools/disavow-links-main?pli=1 ), although that one deals with links pointing to your site rather than with your own pages.
But your sitemap has the ability to get the same done without having to rely on separate tools.
Submitting a sitemap with noindex URLs can speed up de-indexing. This can be more efficient than removing URLs in Google Search Console if a lot has to be de-indexed.
Please note: use this with care, and make sure that you only add such URLs to your sitemaps temporarily.
So you see sitemaps are important
Proper use of a sitemap therefore involves much more than just creating it and submitting it to the search engines. For a sitemap to function optimally, you have to look at what should and should not be included.
Maybe you will now better understand the header image of this item!
As you can see, I go fairly deep into even the smallest aspects, because I want you to get the most out of your medium. Want to talk through your sitemap? Feel free to let me know!