Best way to escape and unescape strings in Ruby?
Does Ruby have any built-in method for escaping and unescaping strings? In the past, I've used regular expressions; however, it occurs to me that Ruby probably does such conversions internally all the time. Perhaps this functionality is exposed somewhere.
So far I've come up with these functions. They work, but they seem a bit hacky:
def escape(s)
s.inspect[1..-2]
end
def unescape(s)
eval %Q{"#{s}"}
end
Is there a better way?
Solution 1:
Ruby 2.5 added String#undump
as a complement to String#dump
:
$ irb
irb(main):001:0> dumped_newline = "\n".dump
=> "\"\\n\""
irb(main):002:0> undumped_newline = dumped_newline.undump
=> "\n"
With it:
def escape(s)
s.dump[1..-2]
end
def unescape(s)
"\"#{s}\"".undump
end
$irb
irb(main):001:0> escape("\n \" \\")
=> "\\n \\\" \\\\"
irb(main):002:0> unescape("\\n \\\" \\\\")
=> "\n \" \\"
Solution 2:
There are a bunch of escaping methods, some of them:
# Regexp escapings
>> Regexp.escape('\*?{}.')
=> \\\*\?\{\}\.
>> URI.escape("test=100%")
=> "test=100%25"
>> CGI.escape("test=100%")
=> "test%3D100%25"
So, its really depends on the issue you need to solve. But I would avoid using inspect for escaping.
Update - there is a dump, inspect uses that, and it looks like it is what you need:
>> "\n\t".dump
=> "\"\\n\\t\""
Solution 3:
Caleb function was the nearest thing to the reverse of String #inspect I was able to find, however it contained two bugs:
- \\ was not handled correctly.
- \x.. retained the backslash.
I fixed the above bugs and this is the updated version:
UNESCAPES = {
'a' => "\x07", 'b' => "\x08", 't' => "\x09",
'n' => "\x0a", 'v' => "\x0b", 'f' => "\x0c",
'r' => "\x0d", 'e' => "\x1b", "\\\\" => "\x5c",
"\"" => "\x22", "'" => "\x27"
}
def unescape(str)
# Escape all the things
str.gsub(/\\(?:([#{UNESCAPES.keys.join}])|u([\da-fA-F]{4}))|\\0?x([\da-fA-F]{2})/) {
if $1
if $1 == '\\' then '\\' else UNESCAPES[$1] end
elsif $2 # escape \u0000 unicode
["#$2".hex].pack('U*')
elsif $3 # escape \0xff or \xff
[$3].pack('H2')
end
}
end
# To test it
while true
line = STDIN.gets
puts unescape(line)
end