Convert int to ASCII and back in Python
I'm working on making a URL shortener for my site, and my current plan (I'm open to suggestions) is to use a node ID to generate the shortened URL. So, in theory, node 26 might be short.com/z, node 1 might be short.com/a, node 52 might be short.com/Z, and node 104 might be short.com/ZZ. When a user goes to that URL, I need to reverse the process (obviously).
I can think of some kludgy ways to go about this, but I'm guessing there are better ones. Any suggestions?
Solution 1:
ASCII to int:
ord('a')
gives 97
And back to a string:
- in Python 2 (plain chr(97) also works there for values 0 to 255):
str(unichr(97))
- in Python 3:
chr(97)
gives 'a'
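The same two calls extend to whole strings. The following is a minimal Python 3 sketch; the helper names to_codes and from_codes are made up for illustration and are not part of any library.

```python
def to_codes(text):
    """Return the code point of every character in text."""
    return [ord(ch) for ch in text]


def from_codes(codes):
    """Rebuild a string from a sequence of code points."""
    return "".join(chr(code) for code in codes)


codes = to_codes("abc")
print(codes)              # [97, 98, 99]
print(from_codes(codes))  # abc
```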
Solution 2:
>>> ord("a")
97
>>> chr(97)
'a'
Solution 3:
If multiple characters are packed into a single integer/long, one byte per character, as was my issue:
s = '0123456789'
nchars = len(s)
# string to int or long. Type depends on nchars
x = sum(ord(s[byte])<<8*(nchars-byte-1) for byte in range(nchars))
# int or long to string
''.join(chr((x>>8*(nchars-byte-1))&0xFF) for byte in range(nchars))
This yields '0123456789', with x = 227581098929683594426425L (the trailing L marks a Python 2 long; Python 3 shows the same digits without it).
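In Python 3 the same packing can probably be written more directly with int.from_bytes and int.to_bytes. This is a sketch that assumes pure-ASCII input and big-endian byte order (matching the shifts above); pack and unpack are hypothetical helper names.

```python
def pack(text):
    """Pack an ASCII string into one integer, first character most significant."""
    data = text.encode("ascii")
    return int.from_bytes(data, byteorder="big"), len(data)


def unpack(value, nchars):
    """Recover the nchars-long ASCII string that was packed into value."""
    return value.to_bytes(nchars, byteorder="big").decode("ascii")


value, nchars = pack("0123456789")
print(value)                  # 227581098929683594426425
print(unpack(value, nchars))  # 0123456789
```

The length has to be carried alongside the integer, because leading zero bytes would otherwise be lost when converting back.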
Solution 4:
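What about Base58-encoding the node ID to build the URL? Flickr, for example, does this.
# The alphabet leaves out characters that are easy to confuse:
# zero, uppercase O, uppercase I and lowercase l.
BASE58 = '123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ'

def shorten(node_id):  # function name chosen here for illustration
    url = ''
    # Peel off base-58 digits, least significant first.
    while node_id >= 58:
        node_id, mod = divmod(node_id, 58)
        url = BASE58[mod] + url
    # The remaining value (< 58) is the most significant digit.
    return 'http://short.com/%s' % (BASE58[node_id] + url)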
Turning that back into a number isn't a big deal either.
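One way the reverse direction might look is sketched below. The function names encode_base58 and decode_base58 are assumptions made for this example, and the encoder simply restates the loop from the answer so the round trip can be checked.

```python
BASE58 = '123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ'


def encode_base58(node_id):
    """Encode a non-negative integer as a Base58 string."""
    code = ''
    while node_id >= 58:
        node_id, mod = divmod(node_id, 58)
        code = BASE58[mod] + code
    return BASE58[node_id] + code


def decode_base58(code):
    """Decode a Base58 string back to an integer.

    Raises ValueError if code contains a character outside the alphabet.
    """
    node_id = 0
    for ch in code:
        # Shift the running value by one base-58 digit, then add this digit.
        node_id = node_id * 58 + BASE58.index(ch)
    return node_id


print(encode_base58(58))    # 21
print(decode_base58('21'))  # 58
```

When handling a request, only the path segment after short.com/ would be passed to decode_base58; a ValueError then signals a malformed short URL.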