Use OpenOffice from command line to convert HTML to RTF
I'm trying to build a bash script in Cygwin that will convert HTML files to RTF. On OS X this is trivial with textutil, but that doesn't exist on regular Linux or Cygwin. Instead, I'm trying to use OpenOffice from the command line.
I've read elsewhere that OpenOffice can run headlessly through a program normally installed as /usr/bin/ooffice, but in Cygwin under Windows this obviously doesn't work: the OpenOffice installer doesn't build native Cygwin symlinks and might not even install the Windows equivalent of ooffice.
How can I use OpenOffice from the command line in Cygwin to convert HTML files to RTF files?
Solution 1:
There is a really handy command-line script called unoconv (written in Python) that handles conversion between any of the file formats that OpenOffice/LibreOffice supports. You can read up about it on its site, and be sure to check out the man page. Many distros have packages for it that you can install easily, including, I believe, Cygwin.
Once you have it installed, usage in your case means specifying the output format with -f and then the input HTML file, like this:
unoconv -f rtf file.html
This writes file.rtf next to the input file. All done :)
Of course this could be scripted to handle multiple files as well. In bash or zsh, you could run something like this to convert a whole folder of HTML files:
for file in *.html; do
    unoconv -f rtf "$file"
done
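For a slightly more defensive take on the loop above, here is a minimal sketch of a reusable bash helper. The function names rtf_name and convert_dir are made up for illustration. It assumes unoconv is on the PATH, that -f selects the output format, and that the result is written next to the input file. Files whose RTF is already newer than the HTML source are skipped.

```shell
# Hypothetical helpers for batch conversion; not part of unoconv itself.

# Print the output name: only a trailing ".html" is replaced, so a name
# like "html-notes.html" becomes "html-notes.rtf".
rtf_name() {
    printf '%s\n' "${1%.html}.rtf"
}

# Convert every *.html file in a directory (default: current directory).
convert_dir() {
    local dir="${1:-.}"
    local file
    for file in "$dir"/*.html; do
        [ -e "$file" ] || continue                          # glob matched nothing
        [ "$(rtf_name "$file")" -nt "$file" ] && continue   # RTF is already newer
        unoconv -f rtf "$file"    # assumed to write the .rtf next to the input
    done
}
```

Calling convert_dir ~/docs would then convert every HTML file in that folder. Stripping the suffix with ${1%.html} avoids the surprise you get from a plain substitution when "html" also appears elsewhere in the file name.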
Solution 2:
I would suggest JODConverter. It is a Java wrapper around the OpenOffice.org API for document conversion. It allows you to convert files like this:
java -jar jodconverter-cli-2.2.0.jar foo.html foo.rtf
It's also available in Python.
You can use it instead of calling the OpenOffice SDK DocumentSaver class directly, like this:
java -classpath ".;./bin;\
$OO/program/classes/jurt.jar;\
$OO/program/classes/ridl.jar;\
$OO/program/classes/sandbox.jar;\
$OO/program/classes/unoil.jar;\
$OO/program/classes/juh.jar" \
DocumentSaver "uno:socket,host=localhost,port=8100;urp;StarOffice.ServiceManager" file:///C:/test/foo.html file:///C:/test/foo.rtf
(The classpath and the UNO URL are quoted because an unquoted ; ends the command in bash.)
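As far as I know, both JODConverter 2.x and the DocumentSaver example connect to an OpenOffice instance that is already listening on port 8100, and DocumentSaver wants file:/// URLs rather than Cygwin paths. Here is a rough sketch of both pieces. The helper names start_office_listener and to_file_url are my own, the install path is an assumption about a default OpenOffice.org 3 setup, and the flags are the single-dash OpenOffice.org 3.x forms.

```shell
# Assumed install location of OpenOffice.org 3 on Windows; adjust as needed.
SOFFICE="${SOFFICE:-/cygdrive/c/Program Files/OpenOffice.org 3/program/soffice.exe}"

# Start a headless OpenOffice in the background that accepts UNO
# connections on localhost:8100 (hypothetical helper name).
start_office_listener() {
    "$SOFFICE" -headless -nofirststartwizard \
        "-accept=socket,host=localhost,port=8100;urp;" &
}

# Turn a Cygwin path into a file:/// URL using cygpath
# (-a: absolute path, -m: Windows path with forward slashes).
# Spaces and other special characters are NOT percent-encoded here.
to_file_url() {
    printf 'file:///%s\n' "$(cygpath -a -m "$1")"
}
```

With that in place, to_file_url /cygdrive/c/test/foo.html should give the file:///C:/test/foo.html form used in the command above. Run start_office_listener once before the first conversion and give it a few seconds to come up.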
Solution 3:
I can help with the first part of your question. Here's an example of running OpenOffice from the Cygwin command line:
/cygdrive/c/Program\ Files/OpenOffice.org\ 3/program/soffice.exe -help
That will give you a list of command line arguments. I didn't see any that would convert file types or even "Save As", but I didn't research the API. Perhaps you can fill in that part. I have OpenOffice.org 3.2 320m12(Build:9483).
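To make this callable as ooffice from a script, one option is a small wrapper function in your ~/.bashrc or at the top of the script. This is only a sketch: the default path is the one from the command above and may differ on your machine, and the SOFFICE variable name is my own choice.

```shell
# Location of soffice.exe; override by setting SOFFICE beforehand.
SOFFICE="${SOFFICE:-/cygdrive/c/Program Files/OpenOffice.org 3/program/soffice.exe}"

# Stand-in for the missing /usr/bin/ooffice: forwards all arguments
# to soffice.exe, or fails with a message if it is not there.
ooffice() {
    if [ ! -x "$SOFFICE" ]; then
        echo "soffice.exe not found at: $SOFFICE" >&2
        return 1
    fi
    "$SOFFICE" "$@"
}
```

After that, ooffice -help behaves like the full command above, and the rest of a script written for Linux can stay unchanged.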