You’re working on a website built with the Jekyll static site generator, and you need to automatically generate a sitemap.xml to submit to Google’s Search Console. There are plugins for this, but building the sitemap yourself is straightforward and lets you tailor the output exactly to your liking. In this tutorial, we’ll look at how Jekyll can generate a sitemap with simple Liquid tags.

Generating a basic sitemap.xml with Jekyll

Create a new file named sitemap.xml in your Jekyll root directory. Give it some front matter, so that Jekyll will run the file through Liquid. We’ll be adding Liquid tags in a minute, and without front matter the file would be copied to the output as-is, tags and all:

---
layout: null
---

Let’s start with a quick framework for our sitemap. It must have an XML header and a <urlset> element. Inside that element, we loop through all of our posts:

---
layout: null
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  {% for post in site.posts %}
  {% endfor %}
</urlset>

You can have Jekyll build your site right away and see the sitemap.xml file appear in the _site folder.

Now let’s add a <url> element for each post, skipping any post marked published: false:

<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  {% for post in site.posts %}
    {% unless post.published == false %}
    <url>
      <loc>{{ site.url }}{{ post.url }}</loc>
      <lastmod>{{ post.date | date_to_xmlschema }}</lastmod>
      <changefreq>monthly</changefreq>
      <priority>0.5</priority>
    </url>
    {% endunless %}
  {% endfor %}
</urlset>

This will generate a sitemap of all of our posts, using each post’s publication date as the value for the <lastmod> element, and setting the change frequency and priority to monthly and 0.5, respectively.
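One caveat with building <loc> from site.url and post.url: if the site is served from a subpath, the baseurl setting is left out, and characters such as & in a URL are not escaped for XML. Here is a minimal sketch of an alternative, assuming Jekyll 3.3 or later (where the absolute_url filter is available) and that url (and baseurl, if you use one) is set in _config.yml:

```liquid
{% comment %}
  Sketch only: absolute_url prepends site.url and site.baseurl,
  and xml_escape escapes characters such as & and < for XML output.
{% endcomment %}
{% for post in site.posts %}
  {% unless post.published == false %}
  <url>
    <loc>{{ post.url | absolute_url | xml_escape }}</loc>
    <lastmod>{{ post.date | date_to_xmlschema }}</lastmod>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
  {% endunless %}
{% endfor %}
```

If your site lives at the domain root and your URLs contain only plain characters, the simpler version above produces the same result.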

We can add static pages to our sitemap.xml, too. While doing this, we’ll strip index.html from page URLs, so that the home page is listed as / rather than /index.html:

  {% for page in site.pages %}
  <url>
    <loc>{{ site.url }}{{ page.url | remove: "index.html" }}</loc>
    {% if page.date %}
      <lastmod>{{ page.date | date_to_xmlschema }}</lastmod>
    {% else %}
      <lastmod>{{ site.time | date_to_xmlschema }}</lastmod>
    {% endif %}
    <changefreq>monthly</changefreq>
    <priority>0.3</priority>
  </url>
  {% endfor %}

Note that pages don’t usually have dates associated with them, so unless a date is present, we fall back to site.time, which is the time at which the site was built.
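The same date fallback can be written more compactly with Liquid’s default filter, which substitutes its argument when the input is nil, false, or empty. Here is a minimal sketch, assuming a missing date is the only case you need to handle:

```liquid
{% comment %}
  Sketch only: use page.date when it is set,
  otherwise fall back to the build time in site.time.
{% endcomment %}
<lastmod>{{ page.date | default: site.time | date_to_xmlschema }}</lastmod>
```

This one line would stand in for the whole if/else block inside the page loop. The explicit form is easier to extend if you add more conditions later.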

Generating a sitemap.xml with custom post inclusion, customizable change frequency and priority

There may be some published posts or pages that we don’t want to show up in our sitemap. Also, we may want to give certain content a higher priority. Ideally, these things are configured through the YAML front matter of each post and page:

---
sitemap:
  lastmod: 2018-05-25
  priority: 0.7
  changefreq: 'weekly'
---

or

---
sitemap:
  exclude: 'yes'
---

We can add some conditional logic to our sitemap.xml to use these attributes, if present. Note that we add an exclude attribute to the sitemap’s own front matter too, to prevent it from including itself.

---
layout: null
sitemap:
  exclude: 'yes'
---
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  {% for post in site.posts %}
    {% unless post.published == false or post.sitemap.exclude == "yes" %}
    <url>
      <loc>{{ site.url }}{{ post.url }}</loc>
      {% if post.sitemap.lastmod %}
        <lastmod>{{ post.sitemap.lastmod | date: "%Y-%m-%d" }}</lastmod>
      {% elsif post.date %}
        <lastmod>{{ post.date | date_to_xmlschema }}</lastmod>
      {% else %}
        <lastmod>{{ site.time | date_to_xmlschema }}</lastmod>
      {% endif %}
      {% if post.sitemap.changefreq %}
        <changefreq>{{ post.sitemap.changefreq }}</changefreq>
      {% else %}
        <changefreq>monthly</changefreq>
      {% endif %}
      {% if post.sitemap.priority %}
        <priority>{{ post.sitemap.priority }}</priority>
      {% else %}
        <priority>0.5</priority>
      {% endif %}
    </url>
    {% endunless %}
  {% endfor %}
  {% for page in site.pages %}
    {% unless page.sitemap.exclude == "yes" or page.name == "feed.xml" %}
    <url>
      <loc>{{ site.url }}{{ page.url | remove: "index.html" }}</loc>
      {% if page.sitemap.lastmod %}
        <lastmod>{{ page.sitemap.lastmod | date: "%Y-%m-%d" }}</lastmod>
      {% elsif page.date %}
        <lastmod>{{ page.date | date_to_xmlschema }}</lastmod>
      {% else %}
        <lastmod>{{ site.time | date_to_xmlschema }}</lastmod>
      {% endif %}
      {% if page.sitemap.changefreq %}
        <changefreq>{{ page.sitemap.changefreq }}</changefreq>
      {% else %}
        <changefreq>monthly</changefreq>
      {% endif %}
      {% if page.sitemap.priority %}
        <priority>{{ page.sitemap.priority }}</priority>
      {% else %}
        <priority>0.3</priority>
      {% endif %}
    </url>
    {% endunless %}
  {% endfor %}
</urlset>

Note, also, that I’ve added feed.xml as a special case of a file to exclude from the sitemap. This file is generated by the jekyll-feed plugin rather than from a file of our own, so we have no YAML front matter in which to give it an exclude attribute.
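The <url> bodies of the post and page loops differ mainly in the variable name and the default priority, so the shared part could be moved into an include. Here is a minimal sketch, assuming a hypothetical file _includes/sitemap-entry.xml. The file name and the parameter names item and default_priority are my own choices for illustration, not something Jekyll provides:

```liquid
{% comment %}
  _includes/sitemap-entry.xml (hypothetical include, sketch only).
  Expects include.item (a post or page) and include.default_priority.
  Note: unlike the listing above, index.html is stripped from post URLs too.
{% endcomment %}
<url>
  <loc>{{ site.url }}{{ include.item.url | remove: "index.html" }}</loc>
  {% if include.item.sitemap.lastmod %}
    <lastmod>{{ include.item.sitemap.lastmod | date: "%Y-%m-%d" }}</lastmod>
  {% else %}
    <lastmod>{{ include.item.date | default: site.time | date_to_xmlschema }}</lastmod>
  {% endif %}
  <changefreq>{{ include.item.sitemap.changefreq | default: "monthly" }}</changefreq>
  <priority>{{ include.item.sitemap.priority | default: include.default_priority }}</priority>
</url>
```

The two loops in sitemap.xml would then shrink to something like this, with the default priority passed as a quoted string:

```liquid
{% for post in site.posts %}
  {% unless post.published == false or post.sitemap.exclude == "yes" %}
    {% include sitemap-entry.xml item=post default_priority="0.5" %}
  {% endunless %}
{% endfor %}
{% for page in site.pages %}
  {% unless page.sitemap.exclude == "yes" or page.name == "feed.xml" %}
    {% include sitemap-entry.xml item=page default_priority="0.3" %}
  {% endunless %}
{% endfor %}
```

Whether this is worth it depends on taste: the single-file version is easier to read at a glance, while the include keeps the two loops from drifting apart.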

Happy developing!