Removing ANSI color codes from text stream

The character sequences ^[[37m and ^[[0m are ANSI escape sequences (CSI codes). See also these specifications.

Using GNU sed

sed 's/\x1b\[[0-9;]*m//g'
  • \x1b (or \x1B) is the ESC character
    (GNU sed does not accept the alternatives \e and \033; its octal form is \o033)
  • \[ is the second character of the escape sequence
  • [0-9;]* is the color value(s) regex
  • m is the last character of the escape sequence

⚠ On macOS, the default sed does not support escapes such as \x1b and \e, as pointed out by slm and steamer25 in the comments. Use gsed instead, which you can install with brew install gnu-sed.
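
If installing gsed is not an option, one workaround is to let the shell produce the ESC byte and hand it to sed literally, so that sed never has to interpret \x1b itself. A minimal sketch, assuming a POSIX shell with printf; the names ESC and strip_colors are illustrative only and not part of any tool:

```shell
# Build a literal ESC byte; POSIX printf understands octal escapes.
ESC=$(printf '\033')

# strip_colors is a hypothetical helper name used only for this sketch.
# Inside double quotes the shell keeps "\[" as-is, so sed still sees an escaped bracket.
strip_colors() {
    sed "s/${ESC}\[[0-9;]*m//g"
}

# Usage: the color codes around ABC are removed.
printf '\033[37mABC\033[0m\n' | strip_colors
```

Because the pattern contains only a raw byte and basic regex syntax, it should behave the same with GNU sed and with the sed shipped on macOS.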

Example with the OP's command line (OP means Original Poster):

perl -e 'use Term::ANSIColor; print color "white"; print "ABC\n"; print color "reset";' | 
      sed 's/\x1b\[[0-9;]*m//g'

Tom Hale suggests removing all other escape sequences as well by matching [a-zA-Z] instead of only the letter m, which is specific to the graphics mode (color) escape sequence. However, [a-zA-Z] may be too broad and could remove too much. Michał Faleński and Miguel Mota propose removing only some escape sequences, using [mGKH] and [mGKF] respectively. Britton Kerin notes that K must be matched in addition to m to remove the colors from gcc errors and warnings (do not forget to redirect stderr: gcc 2>&1 | sed...).

sed 's/\x1b\[[0-9;]*m//g'           # Remove color sequences only
sed 's/\x1b\[[0-9;]*[a-zA-Z]//g'    # Remove all escape sequences
sed 's/\x1b\[[0-9;]*[mGKH]//g'      # Remove color and move sequences
sed 's/\x1b\[[0-9;]*[mGKF]//g'      # Remove color and move sequences

Final
character   Purpose
---------   ------------------------------------------------
m           Select Graphic Rendition (including color)
G           Move cursor to a given column
K           Erase part or all of the current line
H           Move cursor to a given row and column
F           Move cursor up n lines, to the start of the line
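
To see why the choice of final character matters, the sketch below compares an m-only filter with an [mK] filter. The sample line is an assumption made up for illustration: it only resembles colored compiler output, where each color code is followed by an erase-in-line (K) sequence. The helper names are hypothetical, and ESC is built with printf so the example does not depend on GNU sed:

```shell
ESC=$(printf '\033')

# Hypothetical sample: bold (m) + erase-in-line (K), text, reset (m) + erase-in-line (K).
sample() {
    printf '\033[01m\033[Kwarning:\033[m\033[K unused variable\n'
}

strip_m()  { sed "s/${ESC}\[[0-9;]*m//g"; }      # colors only
strip_mK() { sed "s/${ESC}\[[0-9;]*[mK]//g"; }   # colors and erase-in-line

# cat -v renders a leftover ESC byte as ^[ so it becomes visible.
sample | strip_m  | cat -v    # the K sequences survive as ^[[K
sample | strip_mK | cat -v    # prints: warning: unused variable
```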

Using perl

The version of sed installed on some operating systems (e.g. macOS) may be limited. Perl has the advantage of generally being easier to install and update across operating systems. Adam Katz suggests using \e (same as \x1b), which Perl regular expressions support.

Choose your regex depending on how many escape sequences you want to filter:

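perl -pe 's/\e\[[0-9;]*m//g'           # Remove colors only
perl -pe 's/\e\[[0-9;]*[mG]//g'        # Remove colors and horizontal cursor moves
perl -pe 's/\e\[[0-9;]*[mGKH]//g'      # Remove colors, cursor moves and line erasure
perl -pe 's/\e\[[0-9;]*[a-zA-Z]//g'    # Remove all escape sequences
perl -pe 's/\e\[[0-9;]*m(?:\e\[K)?//g' # Adam Katz's trick: colors, plus a \e[K right after one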

Example with OP's command line:

perl -e 'use Term::ANSIColor; print color "white"; print "ABC\n"; print color "reset";' \
      | perl -pe 's/\e\[[0-9;]*m//g'

Usage

As Stuart Cardall pointed out in a comment, this sed command line is used by the project Ultimate Nginx Bad Bot (1000 stars) to clean up the email report ;-)


I found an escape sequence remover that works better if you're using macOS:

perl -pe 's/\x1b\[[0-9;]*[mG]//g'


What is displayed as ^[ is not the two characters ^ and [; it is the ASCII ESC character, produced by Esc or Ctrl+[ (the ^ notation means the Ctrl key).

ESC is 0x1B hexadecimal or 033 octal, so you have to use \x1B or \033 in your regexes:

perl -pe 's/\033\[37m//g; s/\033\[0m//g'

perl -pe 's/\033\[\d*(;\d*)*m//g'
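
To confirm that ^[ really is a single ESC byte, and to check whether a filter left any behind, the bytes can be inspected directly. A small sketch, assuming a POSIX shell with od and grep; has_esc is a hypothetical helper name introduced here:

```shell
# Dump the bytes of a colored string; od -c shows each ESC as octal 033.
printf '\033[37mABC\033[0m\n' | od -c

# has_esc is a hypothetical helper: it succeeds if stdin still contains an ESC byte.
has_esc() {
    grep -q "$(printf '\033')"
}

# Usage: report whether any escape character is present in the input.
if printf '\033[37mABC\033[0m\n' | has_esc; then
    echo "escape sequences present"
fi
```

Piping the output of any of the filters above into has_esc gives a quick check that nothing was missed.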

ansi2txt

https://unix.stackexchange.com/a/527259/116915

cat typescript | ansi2txt | col -b
  • ansi2txt: remove ANSI color codes
  • col -b: remove ^H or ^M


Update: how col handles tabs and spaces (mentioned by @DanielF)

Summary

col -bx replaces tabs with spaces; col -bh replaces runs of spaces with tabs (judging by the output below, plain col -b does the same).

Unfortunately, col does not seem able to keep spaces and tabs as they are.


0. Original string

$ echo -e '        ff\tww' | hd
00000000  20 20 20 20 20 20 20 20  66 66 09 77 77 0a        |        ff.ww.|

1. -h replaces spaces with tabs

$ echo -e '        ff\tww' | col -b | hd
00000000  09 66 66 09 77 77 0a                              |.ff.ww.|
$ echo -e '        ff\tww' | col -bh | hd
00000000  09 66 66 09 77 77 0a                              |.ff.ww.|
$ echo -e '        ff\tww' | col -bxh | hd
00000000  09 66 66 09 77 77 0a                              |.ff.ww.|

2. -x replaces tabs with spaces

$ echo -e '        ff\tww' | col -bx | hd
00000000  20 20 20 20 20 20 20 20  66 66 20 20 20 20 20 20  |        ff      |
00000010  77 77 0a                                          |ww.|
$ echo -e '        ff\tww' | col -bhx | hd
00000000  20 20 20 20 20 20 20 20  66 66 20 20 20 20 20 20  |        ff      |
00000010  77 77 0a                                          |ww.|

3. It seems col cannot keep spaces and tabs as they are.