Determining how many times a substring occurs in a string in Python

I am trying to figure out how many times a string occurs in a string. For example:

nStr = '000123000123'

Say the string I want to find is 123. Obviously it occurs twice in nStr, but I am having trouble implementing this logic in Python. What I have at the moment:

pattern = '123'
count = a = 0
while pattern in nStr[a:]:
    a = nStr[a:].find(pattern)+1
    count += 1
return count

The answer it should return is 2. I'm stuck in an infinite loop at the moment.

I was just made aware that count is a much better way to do it but out of curiosity, does anyone see a way to do it similar to what I have already got?

Use str.count:

>>> nStr = '000123000123'
>>> nStr.count('123')
2

A working version of your code:

nStr = '000123000123'
pattern = '123'
count = 0
flag = True
start = 0

while flag:
    a = nStr.find(pattern, start)  # find() returns -1 if the pattern is not found;
    # start is the index at which the search begins (default value is 0)
    if a == -1:          # pattern not found: set flag to False to end the loop
        flag = False
    else:               # pattern found: increase count and resume the search at a + 1
        count += 1
        start = a + 1
print(count)
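
As for why the original loop never ends: nStr[a:].find(pattern) returns an index relative to the slice, not to nStr, so assigning it to a keeps resetting the offset instead of advancing it. A minimal sketch of the smallest change that keeps the original structure, wrapped in a function so that return is valid (the name count_occurrences is only for illustration, and a non-empty pattern is assumed):

```python
def count_occurrences(text, pattern):
    # Assumes pattern is non-empty; an empty pattern would loop forever.
    count = a = 0
    while pattern in text[a:]:
        # find() on a slice is relative to that slice, so add it to the offset
        a += text[a:].find(pattern) + 1
        count += 1
    return count

print(count_occurrences('000123000123', '123'))  # 2
```

Because the offset only moves one character past the start of each match, this version also counts overlapping matches.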

The problem with count() is that it only counts non-overlapping occurrences, which matters when matches can overlap.

For example: "aaaaaa".count("aaa") returns 2

If you want it to return 4 [(aaa)aaa, a(aaa)aa, aa(aaa)a, aaa(aaa)] you might try something like this:

def count_substrings(string, substring):
    string_size = len(string)
    substring_size = len(substring)
    count = 0
    for i in range(string_size - substring_size + 1):  # xrange in Python 2
        if string[i:i+substring_size] == substring:
            count += 1
    return count

count_substrings("aaaaaa", "aaa")
# 4

Not sure if there's a more efficient way of doing it, but I hope this clarifies how count() works.
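
One possible alternative, sketched here without any benchmarking, is to let str.find do the scanning and restart the search one character after each hit, which avoids building a slice at every index. The function name count_overlapping is made up for this example:

```python
def count_overlapping(string, substring):
    # An empty substring "matches" everywhere, so reject it explicitly.
    if not substring:
        raise ValueError("substring must be non-empty")
    count = 0
    start = string.find(substring)
    while start != -1:
        count += 1
        # Resume one character after the previous match so overlaps are counted
        start = string.find(substring, start + 1)
    return count

print(count_overlapping("aaaaaa", "aaa"))  # 4
```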

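import re

string = '000123000123'
pattern = '123'

# re.escape keeps regex metacharacters in the pattern from being interpreted
n = re.findall(re.escape(pattern), string)

The substring 'pattern' then appears len(n) times in 'string' (here 2). Like str.count, this counts non-overlapping matches only.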
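
Note that re.findall consumes each match, so overlapping occurrences are not counted. If they are needed, one option is a zero-width lookahead, which matches at every position where the substring starts without consuming any characters. A minimal sketch (the helper name count_overlapping_re is only illustrative):

```python
import re

def count_overlapping_re(string, substring):
    # (?=...) is a lookahead: it succeeds wherever substring begins
    # but consumes nothing, so overlapping starts are all found.
    regex = re.compile('(?=' + re.escape(substring) + ')')
    return sum(1 for _ in regex.finditer(string))

print(count_overlapping_re('aaaaaa', 'aaa'))  # 4
```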

In case you are looking for a way to solve this problem for overlapping cases:

s = 'azcbobobegghaklbob'
sub = 'bob'  # renamed from str to avoid shadowing the built-in
results = 0
sub_len = len(sub)
for i in range(len(s)):
    if s[i:i+sub_len] == sub:
        results += 1
print(results)

Will result in 3 because: [azc(bob)obegghaklbob] [azcbo(bob)egghaklbob] [azcbobobegghakl(bob)]

I'm pretty new, but I think this is a good solution? maybe?

def count_substring(string, sub_string):
    count = 0
    for i in range(len(string)):
        # compare a window as long as the substring, not a fixed 2 characters
        if string[i:i+len(sub_string)] == sub_string:
            count += 1
    return count
