How to count word frequencies within a file in Python
I have a .txt file with the following format,
C
V
EH
A
IRQ
C
C
H
IRG
V
Although obviously it's a lot bigger than that, this is essentially it. Each string is on a separate line, so the file is literally C\nV\nEH\n and so on. I'm trying to count how many times each individual string appears in the file. However, when I convert the file into a list and then use the count method, the strings get separated into individual characters, so 'IRQ' becomes ['I', 'R', 'Q', '\n'], and when I count I get the frequency of each individual letter rather than of each string.
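The splitting described above happens because `+=` on a list treats the right-hand string as an iterable of characters. A minimal demonstration (the sample strings are taken from the file format above):

```python
# `+=` on a list extends it element-wise, so a string is
# broken into its individual characters:
s = []
s += "IRQ\n"
print(s)   # ['I', 'R', 'Q', '\n']

# Appending the whole (stripped) line keeps each string intact:
s2 = []
for line in ["C\n", "IRQ\n"]:
    s2.append(line.strip())
print(s2)  # ['C', 'IRQ']
```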
Here is the code that I have written so far,
def countf():
    fh = open("C:/x.txt", "r")
    fh2 = open("C:/y.txt", "w")
    s = []
    for line in fh:
        s += line
    for x in s:
        fh2.write("{:<s} - {:<d}".format(x, s.count(x)))
What I want to end up with is an output file that looks something like this
C 10
V 32
EH 7
A 1
IRQ 9
H 8
Solution 1:
Use Counter() from the collections module, and use strip() to remove the trailing \n from each line:
from collections import Counter

with open('x.txt') as f1, open('y.txt', 'w') as f2:
    c = Counter(x.strip() for x in f1)
    for x in c:
        print(x, c[x])  # use f2.write() here if you want to write them to f2
output:
A 1
C 3
EH 1
IRQ 1
V 2
H 1
IRG 1
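If you also want the counts written to the output file in the "string - count" format from the question, here is one possible sketch. It creates a small x.txt first (mirroring the sample data in the question) so the example is self-contained; Counter.most_common() is used to order results by descending frequency, which is an assumption about the desired ordering:

```python
from collections import Counter

# Create a sample input file mirroring the question's format (for illustration)
with open('x.txt', 'w') as f:
    f.write("C\nV\nEH\nA\nIRQ\nC\nC\nH\nIRG\nV\n")

with open('x.txt') as f1, open('y.txt', 'w') as f2:
    # strip() removes the trailing newline so 'IRQ\n' is counted as 'IRQ';
    # the `if` skips any blank lines
    c = Counter(line.strip() for line in f1 if line.strip())
    # most_common() yields (string, count) pairs, highest count first
    for word, count in c.most_common():
        f2.write("{} - {}\n".format(word, count))
```

With the sample data above, y.txt starts with "C - 3" and "V - 2", followed by the strings that appear once.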