How to explain the str.maketrans function in Python 3.6?
I am currently taking a Udacity course that teaches programming with Python. One of the projects has students rename photo files in a directory (removing any digits from the names) so that the files sort alphabetically, after which a secret message is spelled out. For instance, if a file name is "48athens", the program removes the digits, leaving only "athens" as the file name.
I am using Python 3.6, while the course instructor is using Python 2.7. I should likely be using Python 2.7 so as to simplify the learning process. However, for now I will keep using Python 3.6.
The instructor renames the files using the .translate method, which takes two arguments in Python 2.x but only one in Python 3.x. It removes any digits (0 through 9) from the file names.
import os
def rename_files(): #Obtain the file names from a folder.
    file_list = os.listdir(r"C:\Users\Dennis\Desktop\OOP\prank\prank")
    print (file_list)
    saved_path = os.getcwd()
    os.chdir(r"C:\Users\Dennis\Desktop\OOP\prank\prank")
    for file_name in file_list: #Rename the files inside of the folder.
        os.rename(file_name, file_name.translate(None, "0123456789"))
    os.chdir(saved_path)
rename_files()
However, this does not work in Python 3.x, as it says that:
TypeError: translate() takes exactly one argument (2 given)
Thankfully, I found another way with someone's help. However, I'm not really sure how it works. Can someone explain the str.maketrans function to me, and what the first two blank arguments in quotes are for? My thought is that it's saying: for the first two characters in the file name, remove any numbers (0 through 9). Is that correct? For instance, in "48athens", remove the first two characters (4 and 8) if they are numbers between 0 and 9.
import os
def rename_files(): #Obtain the file names from a folder.
    file_list = os.listdir(r"C:\Users\Dennis\Desktop\OOP\prank\prank")
    print(file_list)
    saved_path = os.getcwd()
    os.chdir(r"C:\Users\Dennis\Desktop\OOP\prank\prank")
    for file_name in file_list: #Rename the files inside of the folder.
        os.rename(file_name, file_name.translate(str.maketrans('','','0123456789')))
    os.chdir(saved_path)
rename_files()
My Understanding of the Documentation:
static str.maketrans(x[, y[, z]])
This static method returns a translation table usable for str.translate().
It's saying that the arguments passed to str.maketrans, along with the actual function str.maketrans, will make a table that says, "If this character appears, replace it with this character." However, I'm not sure what the brackets are for.
If there is only one argument, it must be a dictionary mapping Unicode ordinals (integers) or characters (strings of length 1) to Unicode ordinals, strings (of arbitrary lengths) or None. Character keys will then be converted to ordinals.
It's saying that it can only change integers, or characters in strings of length one, to other integers or strings (of any length you want). But I believe I have three arguments, not one.
If there are two arguments, they must be strings of equal length, and in the resulting dictionary, each character in x will be mapped to the character at the same position in y. If there is a third argument, it must be a string, whose characters will be mapped to None in the result.
I have three arguments ('', '', '0123456789'). I think x is the first '', and y is the second ''. I have the third argument, which is a string '0123456789', but I don't understand what it means to be mapped to 'None'.
str.maketrans builds a translation table, which is a mapping of integers or characters to integers, strings, or None. Think of it like a dictionary where the keys represent characters in the input string and the values they map to represent characters in the output string.
We go through the string to translate and replace everything that appears as a key in the mapping with whatever its value in the map is, or remove it if that value is None.
You can build a translation table with one, two, or three arguments (I think this may be what's confusing you). With one argument:
str.maketrans({'a': 'b', 'c': None})
You give the function a mapping that follows the rules for translation tables and it returns an equivalent table for that mapping. Things that map to None are removed.
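As a small illustration of the one-argument form, here is a sketch using the same dictionary as above (the input string "abcabc" is just a made-up example):

```python
# One-argument form: a dict whose keys are single characters (or ordinals).
table = str.maketrans({'a': 'b', 'c': None})

# Character keys are converted to their ordinals in the returned table.
print(table)  # {97: 'b', 99: None}

# 'a' becomes 'b', 'c' is deleted, everything else is left untouched.
result = "abcabc".translate(table)
print(result)  # bbbb
```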
With two arguments:
str.maketrans('abc', 'xyz')
You give it two strings. Each character in the first string is replaced by the character at that index in the second string. So 'a' maps to 'x', 'b' to 'y', and 'c' to 'z'.
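A quick sketch of the two-argument form in action (the input string is an arbitrary example, not something from the question):

```python
# Two-argument form: characters are paired up by position.
table = str.maketrans('abc', 'xyz')

# Both keys and values are stored as ordinals.
print(table)  # {97: 120, 98: 121, 99: 122}

# Characters not listed in the first string pass through unchanged.
result = "aabbcc-d".translate(table)
print(result)  # xxyyzz-d
```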
The one you're using, with three arguments, works the same as two arguments, but has a third string.
str.maketrans('abc', 'xyz', 'hij')
This is the same as the two-argument version, except that the characters from the third string are removed, as if they were mapped to None. So your table, str.maketrans('', '', '0123456789'), is saying "Don't replace anything, but remove the characters that show up in this string".
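To check this against the guess in the question (that only the first two characters are affected), here is a minimal sketch. The variable name digits_removed and the extra file names are made up for illustration:

```python
# Empty first and second strings: nothing is replaced.
# Third string: every digit is deleted, wherever it appears.
digits_removed = str.maketrans('', '', '0123456789')

names = ["48athens", "ath3ns9", "12345rome"]
cleaned = [name.translate(digits_removed) for name in names]

print(cleaned)  # ['athens', 'athns', 'rome']
```

The position of a digit in the name does not matter, and neither does how many digits there are; the two empty strings are not a count of characters.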
From the documentation on str.maketrans:
If there is a third argument, it must be a string, whose characters will be mapped to None in the result.
This is what str.maketrans is doing; it is taking each element in the third argument and creating a map (a Python dictionary) that maps each ordinal value of the characters in the string to None:
>>> str.maketrans('', '', '0123456789')
{48: None,
49: None,
50: None,
51: None,
52: None,
53: None,
54: None,
55: None,
56: None,
57: None}
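The keys 48 through 57 are simply the Unicode ordinals of the characters '0' through '9'. A short sketch to confirm that with ord and chr:

```python
table = str.maketrans('', '', '0123456789')

# Each key is the ordinal (code point) of one digit character.
keys_as_chars = [chr(k) for k in sorted(table)]
print(keys_as_chars)  # ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']

print(ord('0'), ord('9'))  # 48 57
```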
If extra values exist as the first and second arguments, they are added to this mapping as additional characters to be translated (this is why the author selected '' and ''; he doesn't want extra characters to be translated):
>>> str.maketrans('a', 'A', '0123456789')
{48: None,
49: None,
50: None,
51: None,
52: None,
53: None,
54: None,
55: None,
56: None,
57: None,
97: 65} # map ord('a') to ord('A')
If you apply this to your string now, it'll also capitalize 'athens' to 'Athens' due to the extra 'a', 'A' we've provided to maketrans. Not the finest of translations, but it suffices to grasp the functionality.
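Applying that table can be sketched as follows (the second file name is a made-up example to show that every 'a' is affected, not just the first):

```python
table = str.maketrans('a', 'A', '0123456789')

first = "48athens".translate(table)
second = "16banana".translate(table)

print(first)   # Athens
print(second)  # bAnAnA
```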
str_obj.translate will then perform look-ups on this dictionary for every character in str_obj, replacing each character with the value found in the mapping. If a character is not in the mapping, it is left as-is; if its value is None, it is removed. This is stated in the documentation for str.translate:
When indexed by a Unicode ordinal (an integer), the table object can do any of the following: return a Unicode ordinal or a string, to map the character to one or more other characters; return None, to delete the character from the return string; or raise a LookupError exception, to map the character to itself.
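To see the three behaviours from that passage side by side, here is a rough sketch with a hand-written table object. DemoTable is a hypothetical class made up for this illustration; in practice the dictionary returned by str.maketrans does the same job, since a missing key raises KeyError, which is a subclass of LookupError.

```python
class DemoTable:
    """Hypothetical table object; translate() only needs __getitem__."""

    def __getitem__(self, ordinal):
        ch = chr(ordinal)
        if ch.isdigit():
            return None          # delete the character
        if ch == 'a':
            return 'A'           # map to another character (ord('A') also works)
        raise LookupError(ch)    # leave the character as it is


result = "48athens".translate(DemoTable())
print(result)  # Athens

# A plain dict behaves the same way for keys it does not contain.
print(issubclass(KeyError, LookupError))  # True
```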
(Emphasis mine)