cat command doesn't show the lines of the text [duplicate]
The LibreOffice format stores the text in a compressed section of a binary file, so cat
doesn't work. There is an option: lowriter --convert-to txt example.odt
which converts the document to a plain-text file (example.txt), and there is a -p option for printing if that's what you wanted. man lowriter
is informative.
Why it does not work as you expected
cat works on text files. An odt file is technically (and very simplified) a zipped folder containing some XML files.
As such, cat cannot be used for this purpose. It works only on plain text.
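One quick way to see this for yourself is to look at the first bytes of the file: zip archives begin with the two characters PK. The following is only a minimal sketch; is_zip_container is a hypothetical helper name and file.odt is a placeholder.

```shell
#!/bin/bash
# is_zip_container is a hypothetical helper, not a standard command.
# Zip archives (and therefore .odt files) start with the bytes "PK".
is_zip_container() {
    [ "$(head -c 2 -- "$1" 2>/dev/null)" = "PK" ]
}

# Example use (file.odt is a placeholder name):
# is_zip_container file.odt && unzip -l file.odt
```

If the check succeeds, unzip -l lists the files packed inside the document.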
What you can do instead
You could of course extract it and parse the respective xml files, but I guess this is overkill for your purposes.
An alternative to what you are trying is:
odt2txt --stdout file.odt
This will provide the same output as cat on a txt file, but will take more time depending on the size of the file. You will need to have unoconv installed:
sudo apt install unoconv
The odt file is a zip package that includes formatting and other features for the document.
If you want to see the content of an odt file, you have to unzip it. The actual words contained in the document are in the content.xml
file.
Microsoft Word documents (*.docx) are the same type of package. The text of a Word document is in a file named document.xml inside the word subdirectory of the zip package (word/document.xml).
I wrote a script to perform text searches on my documents. The script takes two arguments (the directory to search and the text to find), extracts each file to a temp folder, greps the contents of the XML file, then displays the name of each file that matches the text searched.
Sample script to search all odt files in a directory and its subdirectories:
#!/bin/bash
# Search every .odt file under a directory for a string (case-insensitive).
# Check the arguments before doing anything else.
if [ $# -ne 2 ]; then
    echo "Parameter error... Usage: [Directory to Search] [String to search]"
    echo "Note: Use quotes if spaces are included in directory or search string."
    echo "Exiting..."
    exit 1
fi
directory="$1"
string="$2"
# Private temporary directory, removed automatically when the script exits.
tempdir=$(mktemp -d) || exit 1
trap 'rm -rf "$tempdir"' EXIT
echo "Searching directory [$directory] for [$string]"
echo "---------------------------------------------"
while IFS= read -r -d '' i; do
    # echo "Processing: $i"
    # Extract only content.xml, which holds the document text, then search it.
    if unzip -o "$i" -d "$tempdir" content.xml > /dev/null 2>&1 &&
       grep -qiE -- "$string" "$tempdir/content.xml"; then
        echo "Found in [$i]"
    fi
    # Remove the extracted file so it cannot match the next document.
    rm -f "$tempdir/content.xml"
done < <(find "$directory" -type f -name '*.odt' -print0)