Iterate a large .xz file line by line in python
I faced the same question a few weeks ago. This snippet worked for me:
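    import lzma

    # Text mode ('rt') decompresses on the fly and yields decoded str lines
    with lzma.open('filename.xz', mode='rt') as file:
        for line in file:
            # each line keeps its trailing newline, so suppress print's own
            print(line, end='')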
This assumes that the text data in the compressed file is encoded in UTF-8 (which was the case for my data). Note that when no encoding is given, text mode uses your platform's default encoding, which is not UTF-8 everywhere. lzma.open() takes an encoding argument that lets you set the encoding explicitly if needed.
EDIT (after your own edit): try forcing encoding='utf-8' in lzma.open().
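For reference, here is a minimal sketch of the explicit-encoding variant wrapped in a generator. The helper name iter_xz_lines and its default encoding are my own choices for illustration, not part of the lzma module; only lzma.open() and its mode and encoding arguments come from the standard library.

```python
import lzma


def iter_xz_lines(path, encoding="utf-8"):
    """Yield decoded lines from an .xz file one at a time.

    Hypothetical helper: the file is decompressed incrementally, so the
    whole content is never held in memory at once.
    """
    with lzma.open(path, mode="rt", encoding=encoding) as fh:
        for line in fh:
            # Strip the trailing newline so callers get clean strings
            yield line.rstrip("\n")
```

You would use it as `for line in iter_xz_lines('filename.xz'): ...`, passing a different encoding (for example 'latin-1') if the data is not UTF-8.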