How to clean up output of linux 'script' command
I'm using the Linux 'script' command (http://www.linuxcommand.org/man_pages/script1.html) to track some interactive sessions. The output files it produces contain unprintable characters, including my backspace keystrokes.
Is there a way to tidy these output files up so they only contain what was displayed on screen?
Or is there another way to record an interactive shell session (input and output)?
Solution 1:
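If you want to view the file, you can send the output through col -bp. The -b option resolves the backspaces, and -p passes any other control sequences through unchanged. You can then pipe the result through less -R, which renders the remaining color escape sequences, if you like.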
col -bp typescript | less -R
On some systems col won't accept a filename argument; use this syntax instead:
col -bp <typescript | less -R
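To see what col -b does with a backspace in isolation, here is a minimal sketch. The sample string is made up for illustration (it is not real 'script' output), and it assumes col is installed on your system.

```shell
# Hypothetical sample: "lx", a backspace, then "s -l",
# as if "x" had been typed by mistake and then corrected.
sample=$(printf 'lx\bs -l')

# col -b emits only the last character written to each column,
# so the backspace and the overwritten "x" are dropped.
cleaned=$(printf '%s\n' "$sample" | col -b)
printf '%s\n' "$cleaned"
```

Real typescript files usually mix backspaces with escape sequences, so this only illustrates the backspace handling.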
Solution 2:
cat typescript | perl -pe 's/\e([^\[\]]|\[.*?[a-zA-Z]|\].*?\a)//g' | col -b > typescript-processed
Here is how to read the string passed to perl:
- s/pattern//g substitutes every match of pattern with nothing. The g option means replace every match on each input line instead of stopping at the first one.

Here is how to read the regex pattern:
- \e matches the special "escape" control character (ASCII 0x1B).
- ( and ) are the beginning and end of a group.
- | means the group can match one of N patterns, where the N patterns are [^\[\]] or \[.*?[a-zA-Z] or \].*?\a
- [^\[\]] matches any single character that is not [ or ].
- \[.*?[a-zA-Z] matches a string starting with [ and then a non-greedy .*? up to the first alphabetic character.
- \].*?\a matches a string starting with ] and then a non-greedy .*? until you hit the special control character called "the alert (bell) character" (ASCII 0x07).