
Search engine optimisation (SEO) best practices

Why SEO matters

Search engine optimisation (SEO) is the practice of adapting a website to ensure that search engines (Google, DuckDuckGo, etc.) can easily index it. Good SEO means that search users are more likely to find the site through relevant searches, and that their search results are more likely to contain useful information, which may include media as well as text. SEO can make a real difference to the number of customers and clients a small business is able to reach.

However, simply chasing search rankings is not necessarily a good idea. Firstly, search engines adapt their algorithms quickly, and constantly tweaking a site without adding real value for the reader is not useful in itself. Secondly, not all websites really need good SEO. We do not make a special effort to improve the SEO of our internal Beautiful Canoe websites, for example, because their main audience is internal to the company.

Nevertheless, for client websites (or our own company website) which do need good rankings, it is worth following a few simple best practices. The list on this page should not be taken to be exhaustive, and will likely fall out of date. In general, you should use tools to measure SEO, and think carefully before following their feedback. If you are not sure whether a particular change is worth implementing, please ask on the relevant Slack channel.

Effective metadata

Clearly, the content of each page on a website makes a significant difference to SEO rankings. However, the best way to improve page content for SEO is to write clearly and provide real value to the reader. These two tasks are not addressed here, and for client websites, it is likely to be the client that provides content for the site.

The more technical side of SEO is related to the way that pages are served, and metadata. Metadata (data about data) is held in the <head></head> section of each page, and can be improved in a variety of ways.

If you are using a templating system to generate a site, look for plugins that can manage some SEO tasks automatically, or for ways to template your metadata. For example, see the metadata support in mkdocs-material.

robots.txt

A robots.txt file instructs search engines that some paths on a site should not be crawled or indexed. There is no guarantee that search engines will respect the file, so anything genuinely sensitive should be hidden behind site authentication. However, if there are pages that are served but not useful to index, they should be noted in robots.txt.
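
For example, a robots.txt served from the site root might look like this (the paths here are placeholders for whatever you do not want crawled):

User-agent: *
Disallow: /drafts/
Disallow: /internal/

Sitemap: https://www.example.com/sitemap.xml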

Sitemaps

A sitemap is an XML document which describes which URLs are served on a site, with some metadata about each page. Most web frameworks come with a way of automatically generating sitemaps; you should not usually need to write code to create them yourself.
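
A minimal sitemap has this shape (the URL and date are illustrative):

<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
    <url>
        <loc>https://www.example.com/</loc>
        <lastmod>2020-01-01</lastmod>
    </url>
</urlset>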

Descriptions

Description metadata looks like this:

<meta name="description" content="Description goes here...">

This is the text that appears underneath a search result when someone finds your website through a search engine. Each page should have a distinct description. The description should be reasonably long (around 160 characters) and should be a full, grammatically correct sentence.

Titles

Title metadata looks like this:

<title>Beautiful Canoe | Aston's student-led software enterprise</title>

This is the text that appears in the title bar or tab of a desktop web browser. Each page should have a distinct title. Titles should also be grammatically correct, and should clearly describe each page.

Canonical URLs

A canonical URL looks like this:

<link rel="canonical" href="http://localhost:4000/">

Some pages can be reached via more than one address; for example, on many sites / and /index.html return the same page. A canonical URL tells the search indexer that the two URLs return the same data, which will improve search results for the site.

Keywords (and not taking them too seriously)

Keyword metadata looks like this:

<meta name="keywords" content="software, software development, software development agile, ...">

There is a whole industry devoted to helping developers choose keywords, even though Google does not use them. It is worth adding a small number of keywords (fewer than 30) to your site metadata. These should be variations on the search terms that you expect to lead users to your site. However, it is not worth spending much time generating large numbers of them.

JSON-LD

JSON-LD is a JSON-based format for linked data, which should appear on the index page of your website. There are many ways to use linked data, but a common use is to associate your site with social media identities owned by the same organisation.

For example:

<script type='application/ld+json'>
{
    "@context": "http://www.schema.org",
    "@type": "...",
    "name": "...",
    "url": "...",
    "logo": "...",
    "description": "...",
    "address": {
        "@type": "PostalAddress",
        "streetAddress": "...",
        "addressLocality": "...",
        "addressRegion": "...",
        "postalCode": "...",
        "addressCountry": "..."
    },
    "sameAs" : [
        "https://www.facebook.com/...",
        "http://www.twitter.com/...",
        "https://www.linkedin.com/company/..."
    ]
}
</script>

Manifests

Web app manifests are JSON files which are mostly used by progressive web apps: websites that can be installed on mobile devices.

For other sites, it may still be useful to generate a manifest with one of a number of online tools, such as get manifest or the dunplab manifest generator. However, some of the data (splash screens, service workers, etc.) will not be relevant.
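
As a sketch, a minimal manifest for an ordinary (non-PWA) site might look like this (the name, colours and icon path are placeholders):

{
    "name": "Beautiful Canoe",
    "short_name": "BC",
    "start_url": "/",
    "display": "browser",
    "background_color": "#ffffff",
    "theme_color": "#ffffff",
    "icons": [
        {
            "src": "/android-chrome-192x192.png",
            "sizes": "192x192",
            "type": "image/png"
        }
    ]
}

The manifest is then referenced from the head of each page with a tag such as <link rel="manifest" href="/site.webmanifest">.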

To validate a manifest, you can go to Application -> Manifest in Chrome Developer Tools.

Favicons

A favicon is the icon that appears in a browser tab. Different favicons at different sizes are needed for different mobile devices. Rather than trying to generate these yourself, you should create one reasonably large favicon and use a tool to generate the rest. There are a number of free tools online, such as Real favicon generator.
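
Whichever tool you use, the output is a set of icon files plus link tags for the page head, along these lines (the file names are typical generator output, not fixed):

<link rel="icon" type="image/png" sizes="32x32" href="/favicon-32x32.png">
<link rel="icon" type="image/png" sizes="16x16" href="/favicon-16x16.png">
<link rel="apple-touch-icon" sizes="180x180" href="/apple-touch-icon.png">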

Viewports

The viewport metatag looks like this:

<meta name="viewport" content="width=device-width, initial-scale=1.0, maximum-scale=2" />

A viewport is the area of a browser window where the user can see page content. On a mobile device, this may well be smaller than the rendered page, in which case the browser will provide scrollbars.

The metadata tag enables the developer to communicate how the page should be rendered and at what zoom level. Viewports are not simple, so it is a good idea to read Mozilla's documentation. You should also test your site with the Chrome developer tools device mode.

Open Graph Protocol

The Open Graph protocol enables you to define rich metadata about your pages, including metadata about media. This is an example from the Open Graph website:

<meta property="og:audio" content="https://example.com/bond/theme.mp3" />
<meta property="og:description" content="Sean Connery found fame and fortune as the suave, sophisticated British agent, James Bond." />
<meta property="og:determiner" content="the" />
<meta property="og:locale" content="en_GB" />
<meta property="og:locale:alternate" content="fr_FR" />
<meta property="og:locale:alternate" content="es_ES" />
<meta property="og:site_name" content="IMDb" />
<meta property="og:video" content="https://example.com/bond/trailer.swf" />

If you do implement OGP metadata, each page should have distinct data, which you should template or auto-generate. Think carefully about whether it is useful to add images or videos to your metadata (in most cases it may not be).

Twitter cards

Twitter cards define how a page will look on Twitter if someone Tweets a link to it. If you implement Twitter cards, each page on your site should have a different card, with a different description.

If you look through the documentation for Twitter cards, you will see some overlap between cards and Open Graph markup. When Twitter crawls your page, it will first look for Twitter card metadata; if a particular value cannot be found, it will fall back on any Open Graph data you have defined. Use this to avoid duplication and adhere to DRY.
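
For example, a page that already defines og:title, og:description and og:image only needs the Twitter-specific tags (the site handle is a placeholder):

<meta name="twitter:card" content="summary" />
<meta name="twitter:site" content="@..." />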

Google-specific metadata

There are a number of metadata keys that are specific to Google. For most sites, it will not be worth defining many of these. However, it is a good idea to briefly look through the list and see if any are relevant to your site.

Once your site is publicly available

Once you have followed some basic good practices, and your site is publicly available, it is a good idea to test your SEO and set up some basic analytics.

SEO measurement tools

There are a number of free tools that can do this, including seoptimer, sitechecker.pro and seobility. Rather than relying on one tool, it is a good idea to use several, as they all look for slightly different metadata. Think carefully about the advice you get from these tools. There is a trade-off between "better" SEO and time spent, and once you have covered most best practices you will experience diminishing returns.

You can also validate your web app manifest with a number of online tools.

Site verification

Some search engines will allow you to submit your site to them for verification. This usually involves placing some metadata on the site which the search engine can detect. The most useful verification comes from Google and Bing.
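
Both engines support verification via a meta tag in the page head, for example (each content value is the token the search engine issues to you):

<meta name="google-site-verification" content="..." />
<meta name="msvalidate.01" content="..." />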

Visitor tracking

Site tracking allows you to see how many visitors your site has received, and will give you some statistical insight into their behaviour.

Google Analytics is worth setting up for most sites.
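
Setting it up amounts to pasting the snippet that the Analytics console generates for you into the head of every page; it looks roughly like this (G-XXXXXXX is a placeholder for your own measurement ID):

<script async src="https://www.googletagmanager.com/gtag/js?id=G-XXXXXXX"></script>
<script>
    window.dataLayer = window.dataLayer || [];
    function gtag(){dataLayer.push(arguments);}
    gtag('js', new Date());
    gtag('config', 'G-XXXXXXX');
</script>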

For sites where the client wants to pay for Facebook adverts, Facebook pixel may be useful.

For more detailed usability data that can help you improve the user experience of the site, try HotJar.

All of these tracking solutions require cookies. To comply with UK and EU law, you must add a cookie consent button. Osano is one option here, but again, you should not implement a solution by hand.

Improving page speeds

Improving page speeds, especially on mobile platforms, will increase audience retention. There are a number of metrics you can track to determine your page speed, such as:

  • Time to interactive
  • Time to first paint
  • Total blocking time

Google PageSpeed Insights and Lighthouse will give you a number of these metrics and provide advice on improving your page speed.
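
Lighthouse can also be run from the command line; assuming you have Node available, something like this will audit a page and open the report (the URL is a placeholder):

npx lighthouse https://www.example.com --view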

Image compression

There are a number of algorithms for compressing images without losing (much) detail. Optimizilla image compressor is one, but there are many others.

Serving compressed pages

Most web browsers can render pages (and other objects) that the web server sends in gzipped format. If you cannot enable this via a setting in your web host, you should gzip all of your pages in your CI/CD pipeline.

For example, this pipeline job is for GitLab pages:

pages:
    stage: deploy
    image: ...
    before_script:
        - ...
    script:
        - ...
        - find _site -type f -regex '.*\.\(htm\|html\|txt\|text\|js\|css\)$' -exec gzip -f -k {} \;
        - mv _site public
    artifacts:
        paths:
            - public

Minification

Minifying web pages means removing unnecessary whitespace (and often comments) from the text files that the browser will read (usually HTML, CSS and JavaScript). Do not try to minify files by hand (you may cause issues for browser parsers). Instead, look for a well-maintained library or plugin for the stack you are using.
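
As one sketch, assuming a Node-based stack and the html-minifier package, a pipeline step could minify a page like this (check the package documentation for the full set of flags):

npx html-minifier --collapse-whitespace --remove-comments index.html > index.min.html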

Adding custom 404 and other error pages

You should add customised (and branded) pages for 404 errors, and ideally other errors. There are some very creative examples of error pages on the web. However, something that is clean and clearly branded is enough for most sites.

Further reading