String slugification in Python
There is a Python package named python-slugify, which does a pretty good job of slugifying:
pip install python-slugify
Works like this:
from slugify import slugify
# Plain asserts are used here so the snippet runs outside a unittest.TestCase.
txt = "This is a test ---"
r = slugify(txt)
assert r == "this-is-a-test"
txt = "This -- is a ## test ---"
r = slugify(txt)
assert r == "this-is-a-test"
txt = 'C\'est déjà l\'été.'
r = slugify(txt)
assert r == "cest-deja-lete"
txt = 'Nín hǎo. Wǒ shì zhōng guó rén'
r = slugify(txt)
assert r == "nin-hao-wo-shi-zhong-guo-ren"
txt = 'Компьютер'
r = slugify(txt)
assert r == "kompiuter"
txt = 'jaja---lol-méméméoo--a'
r = slugify(txt)
assert r == "jaja-lol-mememeoo-a"
See the project's documentation for more examples.
This package does a bit more than what you posted (take a look at the source, it's just one file). The project is still active: it had been updated two days before I originally answered, and more than seven years later (last checked 2020-06-30) it still gets updated.
Careful: there is a second package around, named slugify. If you have both installed, you may run into problems, because they use the same name for import. The one named just slugify didn't pass everything I quick-checked: "Ich heiße" became "ich-heie" (it should be "ich-heisse"), so be sure to pick the right one when using pip or easy_install.
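Because both packages are imported as slugify, it can help to check which distribution is actually installed. The following is only a minimal sketch using the standard library's importlib.metadata (Python 3.8+); the helper name installed_slugify_distributions and the CANDIDATES tuple are made up for illustration and are not part of either package.

```python
from importlib import metadata

# PyPI distribution names mentioned above; both provide a module named "slugify".
CANDIDATES = ("python-slugify", "slugify")


def installed_slugify_distributions(candidates=CANDIDATES):
    """Return {distribution name: version} for every candidate that is installed."""
    found = {}
    for name in candidates:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # This distribution is not installed in the current environment.
            continue
    return found


if __name__ == "__main__":
    found = installed_slugify_distributions()
    if len(found) > 1:
        print("Conflict: several distributions provide 'slugify':", found)
    else:
        print("Installed:", found)
```

If more than one name shows up, uninstalling the unwanted distribution and reinstalling python-slugify is probably the safest way to make sure the import resolves to the one you expect.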
Install unidecode for Unicode support:
pip install unidecode
import re
import unidecode

def slugify(text):
    # Transliterate to ASCII, then lowercase.
    text = unidecode.unidecode(text).lower()
    # Replace each run of non-word characters and underscores with one hyphen.
    return re.sub(r'[\W_]+', '-', text)

text = "My custom хелло ворлд"
print(slugify(text))
# prints: my-custom-khello-vorld
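If installing a third-party package is not an option, a rougher variant can be sketched with the standard library alone. Note that this swaps unidecode's transliteration for Unicode NFKD normalization, which is a different and weaker technique: it only strips accents and silently drops anything without an ASCII decomposition. The function name slugify_ascii is made up for this illustration.

```python
import re
import unicodedata


def slugify_ascii(text):
    """Dependency-free sketch: strip accents, lowercase, join words with hyphens."""
    # NFKD splits an accented letter such as "é" into "e" plus a combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    # Encoding to ASCII with "ignore" drops the combining marks, and also every
    # character that has no ASCII decomposition (for example "ß" or Cyrillic).
    ascii_text = decomposed.encode("ascii", "ignore").decode("ascii").lower()
    # Collapse each run of non-alphanumeric characters into a single hyphen,
    # then trim hyphens left over at either end.
    return re.sub(r"[^a-z0-9]+", "-", ascii_text).strip("-")


print(slugify_ascii("This -- is a ## test ---"))  # this-is-a-test
print(slugify_ascii("jaja---lol-méméméoo--a"))    # jaja-lol-mememeoo-a
print(slugify_ascii("Ich heiße"))                 # ich-heie (the "ß" is lost)
```

The last line shows the limitation: with this approach "ß" disappears and a purely Cyrillic input yields an empty string, which is why a transliteration library such as unidecode is the better choice for non-Latin text.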
There is a Python package named awesome-slugify:
pip install awesome-slugify
Works like this:
from slugify import slugify
slugify('one kožušček') # one-kozuscek
See the awesome-slugify GitHub page.
It works well in Django, so I don't see why it wouldn't be a good general-purpose slugify function.
Are you having any problems with it?