Basic Python text extraction scenario

I am currently working with a text file that looks like this.

NUMBER = 6367283940 |  FOOD = PASTA | NAME = JOHN WALKER
NUMBER = 6367283940 |  FOOD = PASTA | NAME = JOHN WALKER
NUMBER = 6367283940 |  FOOD = PASTA | NAME = JOHN WALKER

I would like to extract the number (just the integers) and save them all to a text file that would read:

6367283940
6367283940
6367283940

How would I go about doing this?

I am brand new.

There are a few ways you might approach this.

Regex

A simple regex pattern should work.

import re
text = """\
NUMBER = 6367283940 |  FOOD = PASTA | NAME = JOHN WALKER
NUMBER = 6367283940 |  FOOD = PASTA | NAME = JOHN WALKER
NUMBER = 6367283940 |  FOOD = PASTA | NAME = JOHN WALKER
"""
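# Raw string so \d is not treated as a string escape.
# re.MULTILINE makes ^ match at the start of every line, not only at the start of the text.
pattern = r'^NUMBER = (\d+)'

for number in re.findall(pattern, text, flags=re.MULTILINE):
    print(number)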

6367283940
6367283940
6367283940

For an explanation of the regex, see this regex101 link.
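
The pattern can also be spelled out piece by piece in the code itself. The following is only a sketch: it uses re.VERBOSE so each part can carry a comment, and the second sample line (with the made-up values 12, RICE and JANE DOE) is an assumption added to show that different numbers are picked up.

```python
import re

# re.VERBOSE ignores whitespace in the pattern and allows comments,
# so literal spaces are written as [ ] here.
pattern = re.compile(
    r"""
    ^                  # start of a line (every line, thanks to re.MULTILINE)
    NUMBER[ ]=[ ]      # the literal text "NUMBER = "
    (\d+)              # capture group: one or more digits
    """,
    re.MULTILINE | re.VERBOSE,
)

# Hypothetical sample; the second line is invented for illustration.
sample = (
    "NUMBER = 6367283940 |  FOOD = PASTA | NAME = JOHN WALKER\n"
    "NUMBER = 12 |  FOOD = RICE | NAME = JANE DOE\n"
)

# With a single capture group, findall returns just the captured digits.
numbers = pattern.findall(sample)
print(numbers)
```

Because the pattern has exactly one capture group, findall returns a list of the digit strings rather than the whole matches.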

String splitting

A more rudimentary approach is to use ordinary string methods, such as .split().

with open('mytext.txt') as f:
    for line in f:
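        # Split the line on the pipe delimiter; the first field is "NUMBER = 6367283940 "
        fields = line.split('|')
        number_field = fields[0]
        _, number = number_field.split(' = ')
        # Strip the trailing space left over before the '|'
        print(number.strip())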
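
To save the numbers to a text file, as the question asks, the same splitting idea could be wrapped in a small function. This is a minimal sketch: the function name extract_numbers and the output file name numbers.txt are assumptions made for illustration, and lines that do not start with a numeric NUMBER field are simply skipped.

```python
def extract_numbers(in_path, out_path):
    """Write the digits of each NUMBER field in in_path to out_path, one per line.

    Returns how many numbers were written. (Hypothetical helper, not from the
    original answer.)
    """
    count = 0
    with open(in_path) as src, open(out_path, "w") as dst:
        for line in src:
            # The first pipe-separated field looks like "NUMBER = 6367283940 "
            number_field = line.split("|")[0]
            # partition never raises, even when "=" is missing
            key, _, value = number_field.partition("=")
            value = value.strip()
            # Skip blank or malformed lines instead of crashing on them
            if key.strip() != "NUMBER" or not value.isdigit():
                continue
            dst.write(value + "\n")
            count += 1
    return count
```

Calling extract_numbers('mytext.txt', 'numbers.txt') would then read the input file from the example above and produce a file with one number per line.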

Csv/pandas

Because your file is pipe-delimited, you could also use the csv module or pandas, as Nuno Carvalho's answer shows.
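
As a rough sketch of the csv route (this is not the code from the other answer), csv.reader can split each line on the pipe, after which every field can be broken into a key and a value. The sample text is held in an io.StringIO object here so the snippet runs on its own; with a real file you would pass the open file object instead.

```python
import csv
import io

# Stand-in for the real file so the example is self-contained
sample = io.StringIO(
    "NUMBER = 6367283940 |  FOOD = PASTA | NAME = JOHN WALKER\n"
    "NUMBER = 6367283940 |  FOOD = PASTA | NAME = JOHN WALKER\n"
)

records = []
for row in csv.reader(sample, delimiter="|"):
    record = {}
    for field in row:
        # Each field looks like " FOOD = PASTA "; split once on "=" and trim spaces
        key, _, value = field.partition("=")
        record[key.strip()] = value.strip()
    records.append(record)

# Pull out just the NUMBER column
numbers = [record["NUMBER"] for record in records]
print(numbers)
```

Parsing every field into a dictionary is more work than the question needs, but it makes the FOOD and NAME values available too if they are wanted later.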
