Why does clang generate unintelligible text when redirected?

I am trying to save the output of a command to a file. The command is:

clang -Xclang -ast-dump -fsyntax-only main.cpp > output.txt

However, when I open the resulting output.txt (in gedit and jEdit on Ubuntu), it looks like this:

[0;1;32mTranslationUnitDecl[0m[0;33m 0x4192020[0m <[0;33m<invalid sloc>[0m> [0;33m<invalid sloc>[0m
[0;34m|-[0m[0;1;32mTypedefDecl[0m[0;33m 0x4192558[0m <[0;33m<invalid sloc>[0m> [0;33m<invalid sloc>[0m implicit[0;1;36m __int128_t[0m [0;32m'__int128'[0m
[0;34m| `-[0m[0;32mBuiltinType[0m[0;33m 0x4192270[0m [0;32m'__int128'[0m
[0;34m|-[0m[0;1;32mTypedefDecl[0m[0;33m 0x41925b8[0m <[0;33m<invalid sloc>[0m> [0;33m<invalid sloc>[0m implicit[0;1;36m __uint128_t[0m [0;32m'unsigned __int128'[0m
[0;34m| `-[0m[0;32mBuiltinType[0m[0;33m 0x4192290[0m [0;32m'unsigned __int128'[0m
...

When it should really look like this:

TranslationUnitDecl 0x4e46020 <<invalid sloc>> <invalid sloc>
|-TypedefDecl 0x4e46558 <<invalid sloc>> <invalid sloc> implicit __int128_t '__int128'
| `-BuiltinType 0x4e46270 '__int128'
|-TypedefDecl 0x4e465b8 <<invalid sloc>> <invalid sloc> implicit __uint128_t 'unsigned __int128'
| `-BuiltinType 0x4e46290 'unsigned __int128'
...

I thought it might be an encoding problem, so I checked the file's encoding with file -bi output.txt, which outputs text/plain; charset=us-ascii.

I thought converting the output to UTF-8 might fix the problem, so I tried this:

clang -Xclang -ast-dump -fsyntax-only main.cpp | iconv -f us-ascii -t UTF-8 > output.txt

but it didn't make a difference.

What can I do to solve this problem?

The problem isn't that I'm trying to view the syntax-highlighted version (I didn't have a problem viewing it in the first place). I need to save the AST generated by clang to a file and then parse it, which would be difficult with the colour information left in.


It has nothing to do with code pages or encoding. Your output isn't plain text: it contains sequences like [0;1;32m. Each of these is preceded by an escape character (not shown in your editor), and together they are instructions telling the terminal to show text in bold, italics, various colors, and so on. This makes the output easier to read if your terminal supports it.
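To see the escape character for yourself, here is a small sketch. The sample line is typed in by hand to mimic the start of output.txt (it is not real clang output), and cat -v prints the otherwise invisible ESC byte as ^[.

```shell
# Hand-built sample: ESC (octal 033) followed by a colour code, some text,
# then ESC followed by the "reset" code. This only imitates clang's output.
sample=$(printf '\033[0;1;32mTranslationUnitDecl\033[0m')

# cat -v makes non-printing bytes visible; ESC is shown as ^[
printf '%s\n' "$sample" | cat -v
# prints: ^[[0;1;32mTranslationUnitDecl^[[0m
```

Running cat -v output.txt on your own file should show the same ^[ in front of every bracketed sequence.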

There should be an option that tells clang to emit plain text instead of beautifying the output. Check the manual. (I don't have one handy, so I can't tell you what the exact command would be.)
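As a starting point, two possibilities, both to be treated as a sketch rather than a verified recipe. clang does have a -fno-color-diagnostics flag; that it also switches off the colouring of -ast-dump is an assumption you should confirm on your version. Failing that, the colour sequences can be filtered out after the fact. The strip_ansi helper below is a hypothetical name of my own, and it only removes colour/style sequences of the form ESC [ ... m, which is what appears in your sample.

```shell
# strip_ansi (hypothetical helper): read stdin, delete every
# ESC [ <digits and semicolons> m sequence, write the rest to stdout.
strip_ansi() {
    esc=$(printf '\033')            # the literal ESC byte
    sed "s/${esc}\[[0-9;]*m//g"
}

# Option 1 (assumption: the flag also governs the -ast-dump colouring):
#   clang -Xclang -ast-dump -fsyntax-only -fno-color-diagnostics main.cpp > output.txt
#
# Option 2: leave clang alone and filter its output:
#   clang -Xclang -ast-dump -fsyntax-only main.cpp | strip_ansi > output.txt
```

Option 1 is preferable if it works for you, since nothing has to be cleaned up afterwards; option 2 also works on a file you have already saved, e.g. strip_ansi < output.txt > plain.txt.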


Alternatively, instead of removing the colours from the output, you can view the coloured output in your terminal by using the raw option of less:

less -r output.txt