Is there a way to highlight all the special accented characters in Sublime Text or any other text editor?
I am using the "HTML encode special characters" plugin in Sublime Text to convert special characters into their HTML codes. I have a lot of accented characters in different parts of the file, so it would be great if I could select all the special characters and then use the plugin to convert them all at once!
Is there a regex that helps select all special characters only?
Solution 1:
Yes.
Sublime Text supports regular expressions, so you can select all non-ASCII characters (code point > 127). This regex should be enough for you:
[^\x00-\x7F]
Just search and replace.
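If you would rather do the whole conversion outside the editor, the same character class works in a script. Below is a minimal Python sketch, not the plugin's actual code: the helper name `encode_non_ascii` is made up for illustration, and it assumes you want a named entity where the standard `html.entities` table has one and a numeric entity otherwise.

```python
import re
from html.entities import codepoint2name

# Same character class as in the answer: anything outside the ASCII range.
NON_ASCII = re.compile(r"[^\x00-\x7F]")


def encode_non_ascii(text: str) -> str:
    """Replace each non-ASCII character with an HTML entity (hypothetical helper)."""

    def to_entity(match: re.Match) -> str:
        code_point = ord(match.group())
        name = codepoint2name.get(code_point)
        # Prefer a named entity such as &eacute;, else fall back to a numeric one.
        return f"&{name};" if name else f"&#{code_point};"

    return NON_ASCII.sub(to_entity, text)


sample = "caf\u00e9 \u00e0 la carte"
print(NON_ASCII.findall(sample))   # the characters the regex would select
print(encode_non_ascii(sample))    # caf&eacute; &agrave; la carte
```

ASCII text passes through untouched, so running it twice on the same input is harmless.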
That said, if you are HTML-encoding characters by hand in the first place, you are probably doing it wrong. Save your files with UTF-8 encoding (the Sublime Text 2 default) and make sure your web server also serves those files as UTF-8. Then no conversion or entity encoding is needed.
Solution 2:
Just as further reference (or as complement):
The Sublime Text 2/3 package named Highlighter can (as its name says) highlight characters that match a regex...
"You can also add a custom regex for characters to highlight."
So, with this package plus @Mikko Ohtamaa's answer, we can edit the file highlighter.sublime-settings - User and include the proposed regex (expressed here as [^\\x00-\\x7F], with the backslashes escaped for JSON) to end up with something like this:
{
"highlighter_regex": "(\t+ +)|( +\t+)|[^\\x00-\\x7F]|[\u2026\u2018\u2019\u201c\u201d\u2013\u2014]|[\t ]+$"
}
The result is that any non-ASCII characters (code point > 127) in the file are highlighted automatically.
Note that this will not select those characters; it only highlights them so you can easily spot whether you have any.
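To see what the combined pattern above would catch before putting it in the settings file, you can try it in plain Python. This is only a rough sketch: it assumes the package applies the pattern roughly the way Python's `re` module does with `re.MULTILINE` (so that `$` matches at every line end), which I have not verified, and `highlighted_spans` is a made-up helper name.

```python
import re

# The same pattern as in the settings file. The JSON escapes (\t, \\x00,
# \u2026, ...) happen to mean the same thing inside a normal Python string.
HIGHLIGHT = re.compile(
    "(\t+ +)|( +\t+)|[^\\x00-\\x7F]|[\u2026\u2018\u2019\u201c\u201d\u2013\u2014]|[\t ]+$",
    re.MULTILINE,  # assumption: '$' should match at the end of each line
)


def highlighted_spans(text: str) -> list:
    """Return (start, end, matched_text) for every region the pattern matches."""
    return [(m.start(), m.end(), m.group()) for m in HIGHLIGHT.finditer(text)]


print(highlighted_spans("a\u00e9b"))   # the accented character
print(highlighted_spans("x \t\ny"))    # mixed space + tab
print(highlighted_spans("ab  \ncd"))   # trailing whitespace
```

Besides the non-ASCII characters, the pattern also flags mixed tabs and spaces and trailing whitespace, which is why it is longer than the regex from Solution 1.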
Solution 3:
Another plugin option
I recently wrote a plugin dedicated to highlighting non-ASCII characters: https://github.com/TuureKaunisto/highlight-dodgy-chars
Exactly the same functionality can be achieved with Highlighter, but with the less generic Highlight Dodgy Chars plugin you don't need to write a regular expression: you just list, in the settings, the non-ASCII characters you don't want highlighted. European special characters are whitelisted by default.
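The whitelist idea itself is simple enough to sketch in a few lines of Python. This is not the plugin's code: the `WHITELIST` contents and the `dodgy_chars` name are placeholders chosen for illustration, and the plugin's real default list is not reproduced here.

```python
# Hypothetical whitelist of accepted non-ASCII characters (a, o with
# diaeresis, a with ring, e with acute, u with diaeresis).
WHITELIST = set("\u00e4\u00f6\u00e5\u00e9\u00fc")


def dodgy_chars(text: str, whitelist: set = WHITELIST) -> list:
    """Return (index, char) for each non-ASCII character not in the whitelist."""
    return [
        (i, ch)
        for i, ch in enumerate(text)
        if ord(ch) > 127 and ch not in whitelist
    ]


# The accented letters are allowed; the no-break space and en dash are flagged.
print(dodgy_chars("p\u00e4iv\u00e4\u00a0\u2013 ok"))
```

Compared with the regex approach, the only thing you maintain is the list of characters you consider acceptable.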