Removing ANSI color codes from text stream
The characters ^[[37m and ^[[0m are part of the ANSI escape sequences (CSI codes).
See also these specifications.
Using GNU sed
sed 's/\x1b\[[0-9;]*m//g'
- \x1b (or \x1B) is the escape special character (sed does not support the alternatives \e and \033)
- \[ is the second character of the escape sequence
- [0-9;]* is the color value(s) regex
- m is the last character of the escape sequence
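To see each part of the expression at work, the sketch below feeds a few hand-written sequences through the same substitution: a two-parameter color, a three-parameter 256-color code, and a reset. The helper name strip_sgr and the sample text are made up for illustration, and GNU sed is assumed.

```shell
# Hypothetical helper wrapping the expression above (GNU sed assumed).
strip_sgr() {
  sed 's/\x1b\[[0-9;]*m//g'
}

# \033[1;31m      two parameters separated by ';'
# \033[38;5;196m  three parameters (256-color foreground)
# \033[0m         reset
printf '\033[1;31mred\033[0m \033[38;5;196mbright\033[0m plain\n' | strip_sgr
# prints: red bright plain
```

Because [0-9;]* accepts any run of digits and semicolons, the number of parameters does not matter.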
⚠ On macOS, the default sed command does not support special characters like \e, as pointed out by slm and steamer25 in the comments. Use gsed instead, which you can install using brew install gnu-sed.
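Where installing another sed is not an option, one possible workaround is to let the shell produce the ESC byte and splice it into the expression, so that sed never has to interpret an escape such as \x1b itself. This is only a sketch and does not come from the comments; the names esc and strip_colors are illustrative.

```shell
# Build a literal ESC byte with printf (POSIX printf understands octal \033).
esc=$(printf '\033')

# Double quotes let the shell expand $esc before sed sees the expression;
# the backslash before '[' is passed through to sed unchanged.
strip_colors() {
  sed "s/${esc}\[[0-9;]*m//g"
}

printf '\033[37mABC\033[0m\n' | strip_colors
# prints: ABC
```

The expression itself uses only basic regular expression syntax, so it should not depend on GNU extensions.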
Example with OP's command line: (OP means Original Poster)
perl -e 'use Term::ANSIColor; print color "white"; print "ABC\n"; print color "reset";' |
sed 's/\x1b\[[0-9;]*m//g'
Tom Hale suggests removing all other escape sequences by using [a-zA-Z] instead of just the letter m, which is specific to the graphics mode (color) escape sequence. But [a-zA-Z] may be too wide and could remove too much. Michał Faleński and Miguel Mota propose removing only some escape sequences, using [mGKH] and [mGKF] respectively. Britton Kerin indicates that K must be used in addition to m to remove the colors from gcc errors/warnings (do not forget to redirect stderr: gcc 2>&1 | sed...).
sed 's/\x1b\[[0-9;]*m//g' # Remove color sequences only
sed 's/\x1b\[[0-9;]*[a-zA-Z]//g' # Remove all escape sequences of this form (any final letter)
sed 's/\x1b\[[0-9;]*[mGKH]//g' # Remove color and move sequences
sed 's/\x1b\[[0-9;]*[mGKF]//g' # Remove color and move sequences
Last escape sequence character   Purpose
------------------------------   -----------------------------------------
m                                Graphics Rendition Mode (including Color)
G                                Horizontal cursor move
K                                Horizontal deletion
H                                New cursor position
F                                Move cursor to previous n lines
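The difference between the m-only expression and one that also lists K shows up on output that pairs each color change with an "erase to end of line" sequence, as described above for gcc. The sample line below is hand-written to imitate that style (it is not captured gcc output), the function name sample is illustrative, and GNU sed is assumed.

```shell
# Hand-written sample: every color change (ESC [ ... m) is followed by ESC [ K.
sample() {
  printf '\033[01m\033[Kmain.c:3:\033[m\033[K \033[01;31m\033[Kerror:\033[m\033[K oops\n'
}

# Removing only the "m" sequences leaves every ESC [ K behind
# (cat -v displays the ESC byte as ^[).
sample | sed 's/\x1b\[[0-9;]*m//g' | cat -v
# prints: ^[[Kmain.c:3:^[[K ^[[Kerror:^[[K oops

# Adding K to the final-character class removes them as well.
sample | sed 's/\x1b\[[0-9;]*[mK]//g'
# prints: main.c:3: error: oops
```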
Using perl
The version of sed installed on some operating systems may be limited (e.g. macOS). The perl command has the advantage of being generally easier to install/update on more operating systems. Adam Katz suggests using \e (same as \x1b) in PCRE.
Choose your regex depending on how many commands you want to filter:
perl -pe 's/\e\[[0-9;]*m//g' # Remove colors only
perl -pe 's/\e\[[0-9;]*[mG]//g'
perl -pe 's/\e\[[0-9;]*[mGKH]//g'
perl -pe 's/\e\[[0-9;]*[a-zA-Z]//g'
perl -pe 's/\e\[[0-9;]*m(?:\e\[K)?//g' # Adam Katz's trick
Example with OP's command line:
perl -e 'use Term::ANSIColor; print color "white"; print "ABC\n"; print color "reset";' \
| perl -pe 's/\e\[[0-9;]*m//g'
Usage
As pointed out by Stuart Cardall's comment, this sed command line is used by the project Ultimate Nginx Bad Bot (1000 stars) to clean up the email report ;-)
I have found a better escape sequence remover if you're using macOS. Check this:
perl -pe 's/\x1b\[[0-9;]*[mG]//g'
What is displayed as ^[ is not ^ and [; it is the ASCII ESC character, produced by Esc or Ctrl+[ (the ^ notation means the Ctrl key).
ESC is 0x1B hexadecimal or 033 octal, so you have to use \x1B or \033 in your regexes:
perl -pe 's/\033\[37m//g; s/\033\[0m//g'
perl -pe 's/\033\[\d*(;\d*)*m//g'
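A quick way to check that ^[ really stands for a single byte is to dump a sequence in hexadecimal. This is just an illustrative check using od and cat -v, with printf generating the sequence from its octal escape.

```shell
# The sequence ESC [ 3 7 m is five bytes long; the first one is 0x1b.
printf '\033[37m' | od -An -tx1 | tr -d ' \n'; echo
# prints: 1b5b33376d

# cat -v renders that same ESC byte in caret notation.
printf '\033[37mABC\033[0m\n' | cat -v
# prints: ^[[37mABC^[[0m
```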
ansi2txt
https://unix.stackexchange.com/a/527259/116915
cat typescript | ansi2txt | col -b
- ansi2txt: remove ANSI color codes
- col -b: remove ^H or ^M
Update: how col handles tabs and spaces (mentioned by @DanielF)
col -bx replaces '\t' with spaces,
col -bh replaces spaces with '\t'.
// It seems col can't keep spaces/tabs as they are, which is a pity.
0. orig string
$ echo -e '        ff\tww' | hd
00000000  20 20 20 20 20 20 20 20  66 66 09 77 77 0a        |        ff.ww.|
1. -h replaces spaces with a tab
$ echo -e '        ff\tww' | col -b | hd
00000000  09 66 66 09 77 77 0a                              |.ff.ww.|
$ echo -e '        ff\tww' | col -bh | hd
00000000  09 66 66 09 77 77 0a                              |.ff.ww.|
$ echo -e '        ff\tww' | col -bxh | hd
00000000  09 66 66 09 77 77 0a                              |.ff.ww.|
2. -x replaces a tab with spaces
$ echo -e '        ff\tww' | col -bx | hd
00000000  20 20 20 20 20 20 20 20  66 66 20 20 20 20 20 20  |        ff      |
00000010  77 77 0a                                          |ww.|
$ echo -e '        ff\tww' | col -bhx | hd
00000000  20 20 20 20 20 20 20 20  66 66 20 20 20 20 20 20  |        ff      |
00000010  77 77 0a                                          |ww.|