Tomi Hiltunen

Enthusiastic app developer in Google Go & HTML5

Monday, 18 March 2013

Using URL slugs

What are slugs in URLs?

Slugs are the part of a URL that makes it more human-readable and search-engine friendly. A URL with a slug gives the viewer a better idea of what is behind the link before even clicking on it.

Let's say that you have an article named "Tower of London". Without a slug, the unique URL for this article could be:

http://example.com/articles/21

Now with a slug generated from the title of the article:

http://example.com/articles/21/tower_of_london

Looks a lot more informative, doesn't it? In the second example the topic is obvious from the URL alone.
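To make the URL layout concrete, here is a minimal sketch in Go of building and reading such a link. It assumes that the numeric ID is what identifies the article and that the slug is only there for the reader. The names articleURL and articleID are made up for this illustration.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// articleURL builds a path of the form /articles/<id>/<slug>.
// The path layout is an assumption taken from the example URLs above.
func articleURL(id int, slug string) string {
	return "/articles/" + strconv.Itoa(id) + "/" + slug
}

// articleID reads the numeric ID from such a path and ignores the slug.
func articleID(path string) (int, error) {
	parts := strings.Split(strings.Trim(path, "/"), "/")
	if len(parts) < 2 || parts[0] != "articles" {
		return 0, fmt.Errorf("not an article path: %q", path)
	}
	return strconv.Atoi(parts[1])
}

func main() {
	u := articleURL(21, "tower_of_london")
	id, err := articleID(u)
	fmt.Println(u, id, err)
}
```

Under that assumption only the ID is used for the lookup, so an old link would keep working even after the title, and with it the slug, has changed.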

Essentials in generating a slug

In this article I will discuss just the idea of generating a slug from the title of your content. Some also suggest including other important keywords in the slug for further SEO friendliness. I'm happy using just the title without extra keywords, but feel free to include them in your own app.

The key is that slugs should consist only of characters that do not need to be URL encoded.
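One way to check that rule in Go is to compare a string with its escaped form. The sketch below uses url.PathEscape from the standard library. needsEscaping is a hypothetical helper name, not part of any library.

```go
package main

import (
	"fmt"
	"net/url"
)

// needsEscaping reports whether s would change when escaped
// for use as a single URL path segment.
func needsEscaping(s string) bool {
	return url.PathEscape(s) != s
}

func main() {
	for _, s := range []string{"tower_of_london", "Tower of London", "señor"} {
		fmt.Printf("%q needs escaping: %v\n", s, needsEscaping(s))
	}
}
```

Of these three, only "tower_of_london" passes through unchanged. The space and the accented letter would both be percent-encoded.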

There are different variations for generating slugs. I find it more visually pleasing to replace all white-space characters with underscores. Some replace other punctuation with a dash or an underscore, but I'd rather leave it out of the slug altogether.
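The choice of separator is easy to try out. As a rough sketch (joinWords is a name invented here, and it only deals with white-space, not punctuation), strings.Fields splits on any run of white-space and strings.Join glues the words back together with the separator of your choice:

```go
package main

import (
	"fmt"
	"strings"
)

// joinWords splits title on runs of white-space, dropping leading and
// trailing white-space, and joins the words with sep.
func joinWords(title, sep string) string {
	return strings.Join(strings.Fields(title), sep)
}

func main() {
	fmt.Println(joinWords("  Tower of   London ", "_")) // Tower_of_London
	fmt.Println(joinWords("  Tower of   London ", "-")) // Tower-of-London
}
```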

How I do it

Here's the algorithm I use for slugging titles:
  • Check for all non-URL-friendly characters with the reg-exp [^A-Za-z0-9\s_-]
    • Non-friendly character found!
    • See if it can be transliterated
      • Yes?
        • Replace it with its Latin counterpart
      • No?
        • Replace it with an empty string
  • Trim all leading and trailing white-space characters
  • Replace each run of one or more white-space characters with a single underscore, using the reg-exp \s+
  • Convert the string to lower-case
  • Done!

In Go

import (
    "strings"
    "regexp"
)

/*
 *  Dictionary of accented characters and their transliterations.
 *  Don't consider this dictionary complete!
 */
var dictionary = map[string]string {
    "Š": "S",
    "š": "s",
    "Đ": "Dj",
    "đ": "dj",
    "Ž": "Z",
    "ž": "z",
    "Č": "C",
    "č": "c",
    "Ć": "C",
    "ć": "c",
    "À": "A",
    "Á": "A",
    "Â": "A",
    "Ã": "A",
    "Ä": "A",
    "Å": "A",
    "Æ": "A",
    "Ç": "C",
    "È": "E",
    "É": "E",
    "Ê": "E",
    "Ë": "E",
    "Ì": "I",
    "Í": "I",
    "Î": "I",
    "Ï": "I",
    "Ñ": "N",
    "Ò": "O",
    "Ó": "O",
    "Ô": "O",
    "Õ": "O",
    "Ö": "O",
    "Ø": "O",
    "Ù": "U",
    "Ú": "U",
    "Û": "U",
    "Ü": "U",
    "Ý": "Y",
    "Þ": "B",
    "ß": "Ss",
    "à": "a",
    "á": "a",
    "â": "a",
    "ã": "a",
    "ä": "a",
    "å": "a",
    "æ": "a",
    "ç": "c",
    "è": "e",
    "é": "e",
    "ê": "e",
    "ë": "e",
    "ì": "i",
    "í": "i",
    "î": "i",
    "ï": "i",
    "ð": "o",
    "ñ": "n",
    "ò": "o",
    "ó": "o",
    "ô": "o",
    "õ": "o",
    "ö": "o",
    "ø": "o",
    "ù": "u",
    "ú": "u",
    "û": "u",
    "ý": "y",
    "þ": "b",
    "ÿ": "y",
    "Ŕ": "R",
    "ŕ": "r",
    "ü": "u",
}

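// Both patterns are compiled once at package level instead of on every call.
var (
    // Matches any character that is not an ASCII letter, a digit, white-space,
    // an underscore or a hyphen. The hyphen comes last so it is read literally.
    invalidChars = regexp.MustCompile(`[^A-Za-z0-9\s_-]`)
    // Matches a run of one or more white-space characters.
    whiteSpaces = regexp.MustCompile(`\s+`)
)

/*
 * Creates a lower-case trimmed string with underscores for white-spaces.
 *
 *      - Converts applicable accented characters to non-accented and removes invalid ones.
 *      - Trims the leading/trailing white-spaces.
 *      - Converts leftover white-spaces to underscores regardless of type.
 *      - Converts to lower case.
 */
func Slug(original string) (edited string) {
    // Transliterate or remove invalid characters
    edited = invalidChars.ReplaceAllStringFunc(original, convertAccent)
    // Trim leading and trailing white-space
    edited = strings.TrimSpace(edited)
    // Convert each run of white-space to a single underscore
    edited = whiteSpaces.ReplaceAllString(edited, "_")
    // All done!
    return strings.ToLower(edited)
}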

/*
 * Converts an accented character if it is found in the dictionary.
 * Otherwise replaces the character with an empty string.
 */
func convertAccent(found string) string {
    if newValue, ok := dictionary[found]; ok {
        return newValue
    }
    return ""
}
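
To see the function at work, the sketch below repeats the same steps in a self-contained program. It is a cut-down illustration rather than the code above verbatim: slugify and sample are names made up here, the dictionary holds only four entries, and the hyphen in the character class is moved to the end, which matches the same set of characters.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// sample holds a few entries only; the full dictionary is listed above.
var sample = map[string]string{"Š": "S", "é": "e", "ö": "o", "ß": "Ss"}

var (
	unfriendly = regexp.MustCompile(`[^A-Za-z0-9\s_-]`)
	spaces     = regexp.MustCompile(`\s+`)
)

// slugify follows the same steps as Slug, using the sample dictionary.
func slugify(title string) string {
	s := unfriendly.ReplaceAllStringFunc(title, func(c string) string {
		return sample[c] // a missing key yields "", which drops the character
	})
	s = spaces.ReplaceAllString(strings.TrimSpace(s), "_")
	return strings.ToLower(s)
}

func main() {
	titles := []string{
		"Tower of London",   // tower_of_london
		"  Café   Škoda!  ", // cafe_skoda
		"Straße 21",         // strasse_21
	}
	for _, t := range titles {
		fmt.Printf("%q -> %q\n", t, slugify(t))
	}
}
```

Note that a title made up entirely of characters missing from the dictionary would come out as an empty string, so it may be worth checking for that before building the URL.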
