How to trim a std::string?
I'm currently using the following code to right-trim all the std::strings
in my programs:
std::string s;
s.erase(s.find_last_not_of(" \n\r\t")+1);
It works fine, but I wonder whether there are edge cases where it might fail.
Of course, answers with elegant alternatives and a left-trim solution are also welcome.
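The edge cases can be checked directly. The sketch below wraps the one-liner in a hypothetical helper, rtrim_ws (the name is not from the question), and exercises an empty string, an all-whitespace string, and a string with nothing to trim. The case that looks risky is find_last_not_of returning std::string::npos. Since npos is the largest value of an unsigned type, npos + 1 wraps around to 0, so erase(0) simply clears the string.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper wrapping the one-liner from the question.
inline void rtrim_ws(std::string &s) {
    // find_last_not_of returns npos when the string is empty or consists
    // only of the listed characters; npos + 1 wraps to 0 (unsigned
    // arithmetic), so erase(0) clears the string instead of throwing.
    s.erase(s.find_last_not_of(" \n\r\t") + 1);
}

// Exercises the edge cases discussed above.
inline void check_rtrim_ws_edge_cases() {
    std::string empty;
    rtrim_ws(empty);
    assert(empty.empty());

    std::string blank = " \t\r\n";
    rtrim_ws(blank);
    assert(blank.empty());

    std::string clean = "abc";  // nothing to trim: erase(size()) is a no-op
    rtrim_ws(clean);
    assert(clean == "abc");

    std::string mixed = "  a b \n";  // leading whitespace is left alone
    rtrim_ws(mixed);
    assert(mixed == "  a b");
}
```

Note that the character list in the question omits form feed (\f) and vertical tab (\v), so those would not be trimmed.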
Solution 1:
EDIT: In C++17, some parts of the standard library used in the original answer below (such as std::ptr_fun) were removed. Fortunately, since C++11 we have lambdas, which are a superior solution.
#include <algorithm>
#include <cctype>
#include <locale>
#include <string>

// trim from start (in place)
static inline void ltrim(std::string &s) {
    s.erase(s.begin(), std::find_if(s.begin(), s.end(), [](unsigned char ch) {
        return !std::isspace(ch);
    }));
}

// trim from end (in place)
static inline void rtrim(std::string &s) {
    s.erase(std::find_if(s.rbegin(), s.rend(), [](unsigned char ch) {
        return !std::isspace(ch);
    }).base(), s.end());
}

// trim from both ends (in place)
static inline void trim(std::string &s) {
    ltrim(s);
    rtrim(s);
}

// trim from start (copying)
static inline std::string ltrim_copy(std::string s) {
    ltrim(s);
    return s;
}

// trim from end (copying)
static inline std::string rtrim_copy(std::string s) {
    rtrim(s);
    return s;
}

// trim from both ends (copying)
static inline std::string trim_copy(std::string s) {
    trim(s);
    return s;
}
Thanks to https://stackoverflow.com/a/44973498/524503 for bringing up the modern solution.
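A short usage sketch may help show the difference between the in-place and copying variants. The functions are repeated in condensed form only so the example stands alone; demo_trim_usage is a hypothetical name. The lambda takes unsigned char because passing a negative char value to std::isspace is undefined behavior.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Same technique as in the answer, condensed so the example is self-contained.
inline void ltrim(std::string &s) {
    s.erase(s.begin(), std::find_if(s.begin(), s.end(),
        [](unsigned char ch) { return !std::isspace(ch); }));
}

inline void rtrim(std::string &s) {
    s.erase(std::find_if(s.rbegin(), s.rend(),
        [](unsigned char ch) { return !std::isspace(ch); }).base(), s.end());
}

inline std::string trim_copy(std::string s) {
    ltrim(s);
    rtrim(s);
    return s;
}

// Hypothetical demo function contrasting the two styles.
inline void demo_trim_usage() {
    std::string line = "\t  hello world \r\n";

    // Copying variant: the argument is passed by value, so `line` is untouched.
    const std::string copy = trim_copy(line);
    assert(copy == "hello world");
    assert(line == "\t  hello world \r\n");

    // In-place variants modify the string they are given.
    ltrim(line);
    rtrim(line);
    assert(line == "hello world");

    // An all-whitespace string becomes empty.
    std::string blank = "   ";
    rtrim(blank);
    assert(blank.empty());
}
```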
Original answer:
I tend to use one of these 3 for my trimming needs:
#include <algorithm>
#include <functional>
#include <cctype>
#include <locale>
#include <string>

// trim from start
static inline std::string &ltrim(std::string &s) {
    s.erase(s.begin(), std::find_if(s.begin(), s.end(),
            std::not1(std::ptr_fun<int, int>(std::isspace))));
    return s;
}

// trim from end
static inline std::string &rtrim(std::string &s) {
    s.erase(std::find_if(s.rbegin(), s.rend(),
            std::not1(std::ptr_fun<int, int>(std::isspace))).base(), s.end());
    return s;
}

// trim from both ends
static inline std::string &trim(std::string &s) {
    return ltrim(rtrim(s));
}
They are fairly self-explanatory and work very well.
EDIT: BTW, I have std::ptr_fun in there to help disambiguate std::isspace, because there is actually a second overload (declared in <locale>) that supports locales. This could have been a cast just the same, but I tend to like this better.
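For comparison, the cast alternative might look roughly like the sketch below. The name ltrim_cast is hypothetical, and the negation is done with a lambda rather than std::not1, which is a substitution made here so the example does not depend on std::ptr_fun. The static_cast to int (*)(int) is what selects the one-argument <cctype> version of std::isspace over the locale-aware template.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical variant of ltrim that disambiguates std::isspace with a cast.
inline std::string &ltrim_cast(std::string &s) {
    // The target type int (*)(int) picks the <cctype> overload; the
    // <locale> template taking (charT, const std::locale&) does not match.
    int (*is_space)(int) = static_cast<int (*)(int)>(std::isspace);

    // The lambda parameter is unsigned char so that negative char values
    // are never passed to isspace.
    s.erase(s.begin(), std::find_if(s.begin(), s.end(),
        [is_space](unsigned char ch) { return is_space(ch) == 0; }));
    return s;
}
```

Newer standards restrict taking the address of most standard library functions, so wrapping the call in a lambda, as the updated answer does, is arguably the safer choice.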
EDIT: To address some comments about accepting a parameter by reference, modifying it, and returning it: I agree. An implementation that I would likely prefer would be two sets of functions, one that works in place and one that makes a copy. A better set of examples would be:
#include <algorithm>
#include <functional>
#include <cctype>
#include <locale>
#include <string>

// trim from start (in place)
static inline void ltrim(std::string &s) {
    s.erase(s.begin(), std::find_if(s.begin(), s.end(),
            std::not1(std::ptr_fun<int, int>(std::isspace))));
}

// trim from end (in place)
static inline void rtrim(std::string &s) {
    s.erase(std::find_if(s.rbegin(), s.rend(),
            std::not1(std::ptr_fun<int, int>(std::isspace))).base(), s.end());
}

// trim from both ends (in place)
static inline void trim(std::string &s) {
    ltrim(s);
    rtrim(s);
}

// trim from start (copying)
static inline std::string ltrim_copy(std::string s) {
    ltrim(s);
    return s;
}

// trim from end (copying)
static inline std::string rtrim_copy(std::string s) {
    rtrim(s);
    return s;
}

// trim from both ends (copying)
static inline std::string trim_copy(std::string s) {
    trim(s);
    return s;
}
I am keeping the original answer above, though, for context and so that the highly voted answer remains available.
Solution 2:
Using Boost's string algorithms would be easiest:
#include <boost/algorithm/string.hpp>
std::string str("hello world! ");
boost::trim_right(str);
str is now "hello world!". There's also trim_left and trim, which trims both sides.
If you add the _copy suffix to any of the above function names, e.g. trim_copy, the function will return a trimmed copy of the string instead of modifying it through a reference.
If you add the _if suffix to any of the above function names, e.g. trim_copy_if, you can trim all characters satisfying your custom predicate, as opposed to just whitespace.
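To illustrate the predicate-based idea without requiring Boost, here is a standard-library-only sketch. The function trim_copy_if_sketch is a hypothetical name and is not part of Boost; it only mimics the behavior described above (return a copy with leading and trailing characters that satisfy the predicate removed).

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Hypothetical standard-library analogue of a predicate-based trimmed copy.
template <typename Pred>
std::string trim_copy_if_sketch(const std::string &s, Pred pred) {
    // First character that does NOT satisfy the predicate (i.e. is kept).
    std::string::const_iterator first =
        std::find_if_not(s.begin(), s.end(), pred);

    // Search backwards, but stop at `first` so the resulting range stays
    // valid even when every character is trimmed.
    std::string::const_iterator last =
        std::find_if_not(s.rbegin(),
                         std::string::const_reverse_iterator(first),
                         pred).base();

    return std::string(first, last);
}
```

For example, calling it with a predicate that matches 'x' on "xxhixx" would yield "hi", and characters matching the predicate in the middle of the string are left alone.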
Solution 3:
What you are doing is fine and robust. I have used the same method for a long time and I have yet to find a faster method:
const char* ws = " \t\n\r\f\v";

// trim from end of string (right)
inline std::string& rtrim(std::string& s, const char* t = ws)
{
    s.erase(s.find_last_not_of(t) + 1);
    return s;
}

// trim from beginning of string (left)
inline std::string& ltrim(std::string& s, const char* t = ws)
{
    s.erase(0, s.find_first_not_of(t));
    return s;
}

// trim from both ends of string (right then left)
inline std::string& trim(std::string& s, const char* t = ws)
{
    return ltrim(rtrim(s, t), t);
}
By supplying the characters to be trimmed you have the flexibility to trim non-whitespace characters and the efficiency to trim only the characters you want trimmed.
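As a small usage sketch of the custom character set, the functions above are repeated so the example stands alone. The demo function name and the sample strings are made up for illustration.

```cpp
#include <cassert>
#include <string>

const char* ws = " \t\n\r\f\v";

// trim from end of string (right)
inline std::string& rtrim(std::string& s, const char* t = ws) {
    s.erase(s.find_last_not_of(t) + 1);
    return s;
}

// trim from beginning of string (left)
inline std::string& ltrim(std::string& s, const char* t = ws) {
    s.erase(0, s.find_first_not_of(t));
    return s;
}

// trim from both ends of string (right then left)
inline std::string& trim(std::string& s, const char* t = ws) {
    return ltrim(rtrim(s, t), t);
}

// Hypothetical demo of trimming with custom character sets.
inline void demo_custom_trim_set() {
    // Strip surrounding double quotes instead of whitespace.
    std::string quoted = "\"value\"";
    assert(trim(quoted, "\"") == "value");

    // Strip leading zeros only.
    std::string padded = "000420";
    assert(ltrim(padded, "0") == "420");

    // Default set: an all-whitespace string becomes empty.
    std::string blank = " \t ";
    assert(trim(blank).empty());
}
```

Because the functions return a reference to the string they modified, the calls can be chained or used directly in expressions, as in the assertions above.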