Deleting consonants from a string in Python
Correcting your code
The line
if char == vowels:
is wrong. It has to be
if char in vowels:
This is because you need to check whether that particular character is present in the list of vowels. Apart from that, you need
print(char, end='')
(in Python 3) to print the output as iiii all on one line.
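To see why the equality test never matches, here is a minimal sketch that compares the two conditions on a single character. The variable names mirror the ones in the question; the sample character is only an illustration.

```python
vowels = ['a', 'e', 'i', 'o', 'u']
char = 'i'

# Equality compares the one-character string with the whole list,
# and a str is never equal to a list, so this is always False.
print(char == vowels)   # False

# Membership asks whether the character is one of the list's elements.
print(char in vowels)   # True
```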
The final program will be like
def eliminate_consonants(x):
    vowels = ['a', 'e', 'i', 'o', 'u']
    for char in x:
        if char in vowels:
            print(char, end="")

eliminate_consonants('mississippi')
And the output will be
iiii
Other ways include
-
Using in with a string
def eliminate_consonants(x):
    for char in x:
        if char in 'aeiou':
            print(char, end="")
As simple as it looks, the statement if char in 'aeiou' checks whether char is present in the string aeiou.
-
A list comprehension
''.join([c for c in x if c in 'aeiou'])
The list comprehension builds a list containing only the characters of x that are in aeiou, and ''.join concatenates them into a string.
-
A generator expression
''.join(c for c in x if c in 'aeiou')
The generator expression yields only the characters of x that are in aeiou, and ''.join concatenates them into a string.
-
Regular Expressions
You can use re.findall to find only the vowels in your string (remember to import re first). The code
re.findall(r'[aeiou]', "mississippi")
will return a list of the vowels found in the string, i.e. ['i', 'i', 'i', 'i']. You can then pass that list to str.join:
''.join(re.findall(r'[aeiou]', "mississippi"))
-
str.translate and maketrans
For this technique you need a map that matches each of the non-vowels to None. For this you can use string.ascii_lowercase (after import string). The code to make the map is
m = str.maketrans({i: None for i in string.ascii_lowercase if i not in "aeiou"})
Store the mapping in a variable (here m, for map) and pass it to translate:
"mississippi".translate(m)
This removes all the lowercase ASCII consonants from the string. Characters outside string.ascii_lowercase, such as uppercase letters, digits, and punctuation, are left untouched.
-
Using dict.fromkeys
You can use dict.fromkeys along with sys.maxunicode to build the translation table directly. But remember to import sys first!
m = dict.fromkeys(i for i in range(sys.maxunicode+1) if chr(i) not in 'aeiou')
Now use str.translate:
'mississippi'.translate(m)
-
Using bytearray
As mentioned by J.F.Sebastian in the comments below, you can create a bytearray of every byte value except the lowercase vowels by using
non_vowels = bytearray(set(range(0x100)) - set(b'aeiou'))
Using this we can translate the word,
'mississippi'.encode('ascii', 'ignore').translate(None, non_vowels)
which will return b'iiii'. This can easily be converted to str by using decode, i.e. b'iiii'.decode("ascii").
-
Using bytes
bytes returns a bytes object, which is the immutable version of bytearray. (It is Python 3 specific.)
non_vowels = bytes(set(range(0x100)) - set(b'aeiou'))
Using this we can translate the word,
'mississippi'.encode('ascii', 'ignore').translate(None, non_vowels)
which will return b'iiii'. This can easily be converted to str by using decode, i.e. b'iiii'.decode("ascii").
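Several of the approaches in the list above can be wrapped as functions and compared side by side. The following is a rough sketch, not a definitive implementation: the function names, the VOWELS constant, and the sample string are made up for illustration. It also shows one difference worth knowing: the maketrans map only deletes lowercase ASCII consonants, while the other approaches keep nothing but lowercase vowels.

```python
import re
import string

VOWELS = 'aeiou'  # hypothetical constant, for illustration only

def with_list_comprehension(text):
    return ''.join([c for c in text if c in VOWELS])

def with_generator(text):
    return ''.join(c for c in text if c in VOWELS)

def with_regex(text):
    return ''.join(re.findall(r'[aeiou]', text))

# Built once: maps every lowercase ASCII consonant to None (delete it).
_CONSONANT_MAP = str.maketrans(
    {c: None for c in string.ascii_lowercase if c not in VOWELS})

def with_maketrans(text):
    return text.translate(_CONSONANT_MAP)

# Built once: every byte value 0..255 except the lowercase vowels.
_NON_VOWEL_BYTES = bytes(set(range(0x100)) - set(b'aeiou'))

def with_bytes(text):
    # Non-ASCII characters are dropped by encode(..., 'ignore').
    encoded = text.encode('ascii', 'ignore')
    return encoded.translate(None, _NON_VOWEL_BYTES).decode('ascii')

sample = 'Mississippi, USA!'
for func in (with_list_comprehension, with_generator, with_regex,
             with_maketrans, with_bytes):
    print(func.__name__, repr(func(sample)))
# with_list_comprehension 'iiii'
# with_generator 'iiii'
# with_regex 'iiii'
# with_maketrans 'Miiii, USA!'
# with_bytes 'iiii'
```

For plain lowercase input such as 'mississippi', all five functions return 'iiii'.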
Timing comparison
Python 3
python3 -m timeit -s "text = 'mississippi'*100; non_vowels = bytes(set(range(0x100)) - set(b'aeiou'))" "text.encode('ascii', 'ignore').translate(None, non_vowels).decode('ascii')"
100000 loops, best of 3: 2.88 usec per loop
python3 -m timeit -s "text = 'mississippi'*100; non_vowels = bytearray(set(range(0x100)) - set(b'aeiou'))" "text.encode('ascii', 'ignore').translate(None, non_vowels).decode('ascii')"
100000 loops, best of 3: 3.06 usec per loop
python3 -m timeit -s "text = 'mississippi'*100;d=dict.fromkeys(i for i in range(127) if chr(i) not in 'aeiou')" "text.translate(d)"
10000 loops, best of 3: 71.3 usec per loop
python3 -m timeit -s "import string; import sys; text='mississippi'*100; m = dict.fromkeys(i for i in range(sys.maxunicode+1) if chr(i) not in 'aeiou')" "text.translate(m)"
10000 loops, best of 3: 71.6 usec per loop
python3 -m timeit -s "text = 'mississippi'*100" "''.join(c for c in text if c in 'aeiou')"
10000 loops, best of 3: 60.1 usec per loop
python3 -m timeit -s "text = 'mississippi'*100" "''.join([c for c in text if c in 'aeiou'])"
10000 loops, best of 3: 53.2 usec per loop
python3 -m timeit -s "import re;text = 'mississippi'*100; p=re.compile(r'[aeiou]')" "''.join(p.findall(text))"
10000 loops, best of 3: 57 usec per loop
The timings in sorted order (usec per loop)
translate (bytes) | 2.88
translate (bytearray)| 3.06
List Comprehension | 53.2
Regular expressions | 57.0
Generator exp | 60.1
dict.fromkeys | 71.3
translate (unicode) | 71.6
As you can see, the final method, using bytes, is the fastest in this run.
Python 3.5
python3.5 -m timeit -s "text = 'mississippi'*100; non_vowels = bytes(set(range(0x100)) - set(b'aeiou'))" "text.encode('ascii', 'ignore').translate(None, non_vowels).decode('ascii')"
100000 loops, best of 3: 4.17 usec per loop
python3.5 -m timeit -s "text = 'mississippi'*100; non_vowels = bytearray(set(range(0x100)) - set(b'aeiou'))" "text.encode('ascii', 'ignore').translate(None, non_vowels).decode('ascii')"
100000 loops, best of 3: 4.21 usec per loop
python3.5 -m timeit -s "text = 'mississippi'*100;d=dict.fromkeys(i for i in range(127) if chr(i) not in 'aeiou')" "text.translate(d)"
100000 loops, best of 3: 2.39 usec per loop
python3.5 -m timeit -s "import string; import sys; text='mississippi'*100; m = dict.fromkeys(i for i in range(sys.maxunicode+1) if chr(i) not in 'aeiou')" "text.translate(m)"
100000 loops, best of 3: 2.33 usec per loop
python3.5 -m timeit -s "text = 'mississippi'*100" "''.join(c for c in text if c in 'aeiou')"
10000 loops, best of 3: 97.1 usec per loop
python3.5 -m timeit -s "text = 'mississippi'*100" "''.join([c for c in text if c in 'aeiou'])"
10000 loops, best of 3: 86.6 usec per loop
python3.5 -m timeit -s "import re;text = 'mississippi'*100; p=re.compile(r'[aeiou]')" "''.join(p.findall(text))"
10000 loops, best of 3: 74.3 usec per loop
The timings in sorted order (usec per loop)
translate (unicode) | 2.33
dict.fromkeys | 2.39
translate (bytes) | 4.17
translate (bytearray)| 4.21
Regular expressions | 74.3
List Comprehension | 86.6
Generator exp | 97.1
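The command-line measurements above can also be approximated from a script with the timeit module. This is only a sketch: the candidates dictionary and the benchmark function are hypothetical names, the lambdas add a small call overhead that the command-line runs do not have, and the numbers will differ by machine and Python version.

```python
import re
import timeit

text = 'mississippi' * 100
non_vowels = bytes(set(range(0x100)) - set(b'aeiou'))
table = dict.fromkeys(i for i in range(127) if chr(i) not in 'aeiou')
pattern = re.compile(r'[aeiou]')

# Hypothetical registry of the statements timed above.
candidates = {
    'translate (bytes)': lambda: text.encode('ascii', 'ignore')
                                     .translate(None, non_vowels)
                                     .decode('ascii'),
    'dict.fromkeys': lambda: text.translate(table),
    'List Comprehension': lambda: ''.join([c for c in text if c in 'aeiou']),
    'Generator exp': lambda: ''.join(c for c in text if c in 'aeiou'),
    'Regular expressions': lambda: ''.join(pattern.findall(text)),
}

def benchmark(number=1000, repeat=3):
    """Return the best time per call, in microseconds, for each candidate."""
    results = {}
    for name, func in candidates.items():
        best = min(timeit.repeat(func, number=number, repeat=repeat))
        results[name] = best / number * 1e6
    return results

if __name__ == '__main__':
    # Print fastest first, in the same layout as the tables above.
    for name, usec in sorted(benchmark().items(), key=lambda kv: kv[1]):
        print('{:<21}| {:.2f}'.format(name, usec))
```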
You can try a more Pythonic way, like this:
In [1]: s = 'mississippi'
In [3]: [char for char in s if char in 'aeiou']
Out[3]: ['i', 'i', 'i', 'i']
As a function:
In [4]: def eliminate_consonants(x):
   ...:     return ''.join(char for char in x if char in 'aeiou')
   ...:
In [5]: print(eliminate_consonants('mississippi'))
iiii