Reading contents of zip file without extracting

Example of what I am trying to achieve:

My text file (test1.txt) contains the following two lines:

John scored 80 in english

tim scored 75 in english

I have compressed this file to test1.zip and I am trying to read the contents with the following code:

import zipfile

f = 'test1.zip'
z = zipfile.ZipFile(f, "r")
zinfo = z.namelist()
for name in zinfo:
    # z.open() returns a binary file object, so readlines() yields bytes
    with z.open(name) as f1:
        fi1 = f1.readlines()
    for line in fi1:
        print(line)

But the result I am getting is

b'John scored 80 in english\r\n'

b'tim scored 75 in english\r\n'

How can I read the contents of this zip file so that I get the same output as the original file's content, that is:

John scored 80 in english

tim scored 75 in english

You are actually reading exactly what is in the file.

The \r\n sequence is the line ending on Windows. The question Difference between \n and \r? goes into a bit more detail, but what it comes down to is that Windows uses \r\n (carriage return followed by line feed) to mark the end of a line.

The b' prefix you are seeing comes from Python: z.open() returns a binary file object, so each line is a bytes object rather than a str. The question What does the 'b' character do in front of a string literal? does a good job of explaining why that happens, but the documentation quoted there is:

Bytes literals are always prefixed with 'b' or 'B'; they produce an instance of the bytes type instead of the str type. They may only contain ASCII characters; bytes with a numeric value of 128 or greater must be expressed with escapes.
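To make the distinction concrete, here is a minimal sketch of turning one of those bytes lines into the str you expected. The UTF-8 encoding is an assumption on my part; use whatever encoding the text file was actually saved with.

```python
# One line exactly as readlines() returns it from a binary file object.
raw = b'John scored 80 in english\r\n'

# Decode bytes to str (UTF-8 is assumed here), then drop the
# trailing Windows line ending.
text = raw.decode('utf-8').rstrip('\r\n')

print(text)
```

Decoding each line by hand works, but it means handling the line endings yourself.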

EDIT: I found a very similar answer you can draw on for reading without the extra characters: py3k: How do you read a file inside a zip file as text, not bytes?. The basic idea is to wrap the binary file object in a text wrapper, like this:

items_file = io.TextIOWrapper(items_file, encoding='your-encoding', newline='')
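Applied to the code in the question, that idea might look like the sketch below. The helper name read_zip_lines and the UTF-8 default are my own assumptions, not something from the linked answer. Note that it leaves newline at its default instead of passing newline='': with newline='' the wrapper returns line endings untranslated, so the \r would still be there, whereas the default translates \r\n to \n on reading.

```python
import io
import zipfile


def read_zip_lines(zip_source, encoding='utf-8'):
    """Return {member name: list of text lines} without extracting to disk.

    read_zip_lines is a hypothetical helper. zip_source may be a path or
    a binary file-like object; the UTF-8 default is an assumption.
    """
    result = {}
    with zipfile.ZipFile(zip_source, 'r') as z:
        for name in z.namelist():
            if name.endswith('/'):
                continue  # skip directory entries
            with z.open(name) as raw:
                # Wrap the binary stream so iteration yields str lines.
                # Default newline handling turns '\r\n' into '\n'.
                text = io.TextIOWrapper(raw, encoding=encoding)
                result[name] = [line.rstrip('\n') for line in text]
    return result
```

Calling read_zip_lines('test1.zip') and printing each line of each entry should then show the two lines without the b prefix or the trailing \r\n.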
