Save all files in Visual Studio project as UTF-8
I wonder if it's possible to save all files in a Visual Studio 2008 project in a specific character encoding. I have a solution with mixed encodings and I want to make them all the same (UTF-8 with signature).
I know how to save single files, but how about all files in a project?
Since you're already in Visual Studio, why not simply write the code?
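    // Needs: using System.IO; using System.Text;
    foreach (var f in new DirectoryInfo(@"...").GetFiles("*.cs", SearchOption.AllDirectories)) {
        // A BOM, if present, decides the source encoding; otherwise fall back to
        // Encoding.Default (the ANSI code page on .NET Framework).
        string s = File.ReadAllText(f.FullName, Encoding.Default);
        // Encoding.UTF8 writes the UTF-8 signature (BOM).
        File.WriteAllText(f.FullName, s, Encoding.UTF8);
    }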
Only three lines of code! I'm sure you can write this in less than a minute :-)
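For reference, "UTF-8 with signature" just means the file starts with the three-byte byte-order mark EF BB BF, which is what Encoding.UTF8 emits when writing in the loop above. A minimal Python sketch of that byte layout, using the standard utf-8-sig codec (the sample string is made up for illustration):

```python
import codecs

text = "caf\u00e9"  # hypothetical sample content with one non-ASCII character

with_sig = text.encode("utf-8-sig")   # UTF-8 with signature (BOM)
without_sig = text.encode("utf-8")    # plain UTF-8, no BOM

# The signature is exactly the bytes EF BB BF placed in front of the payload.
assert codecs.BOM_UTF8 == b"\xef\xbb\xbf"
assert with_sig == codecs.BOM_UTF8 + without_sig

# Decoding with utf-8-sig strips the signature if present and
# behaves like plain utf-8 otherwise.
assert with_sig.decode("utf-8-sig") == text
assert without_sig.decode("utf-8-sig") == text
```

So a file converted this way grows by three bytes, and any tool that reads it with a BOM-aware decoder sees the same text.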
This may be of some help.
(Link removed because the original reference was defaced by a spam site.)
Short version: open one file and select File -> Advanced Save Options. Instead of changing UTF-8 to ASCII, change it to UTF-8. Edit: make sure you select the option without a byte-order mark (BOM).
Set the code page and hit OK. The setting seems to persist beyond the current file.
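Since this route goes through the IDE file by file, it may help to know which files actually differ from the rest. A rough Python sketch, assuming the only thing worth checking is whether the UTF-8 signature is present; the function name and the ".cs" default are illustrative and not part of the original answer:

```python
import codecs
import os


def files_missing_signature(root, extensions=(".cs",)):
    """Return the paths under root that do not start with the UTF-8 BOM."""
    missing = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # Only look at files with one of the requested extensions.
            if not name.lower().endswith(tuple(extensions)):
                continue
            path = os.path.join(dirpath, name)
            # Reading the first three bytes is enough to spot the signature.
            with open(path, "rb") as fh:
                head = fh.read(len(codecs.BOM_UTF8))
            if head != codecs.BOM_UTF8:
                missing.append(path)
    return sorted(missing)
```

Calling files_missing_signature on the project directory lists the files saved without a signature. This says nothing about what encoding those files are in, only that they do not begin with EF BB BF.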
In case you need to do this in PowerShell, here is my little move:
    Function Write-Utf8([string] $path, [string] $filter='*.*')
    {
        [IO.SearchOption] $option = [IO.SearchOption]::AllDirectories;
        [String[]] $files = [IO.Directory]::GetFiles((Get-Item $path).FullName, $filter, $option);
        foreach($file in $files)
        {
            "Writing $file...";
            [String]$s = [IO.File]::ReadAllText($file);
            [IO.File]::WriteAllText($file, $s, [Text.Encoding]::UTF8);
        }
    }
I would convert the files programmatically (outside VS), e.g. using a Python script:
    import codecs
    import glob

    for f in glob.glob("*.py"):
        with open(f, "rb") as fh:
            data = fh.read()
        if data.startswith(codecs.BOM_UTF8):
            # Already UTF-8 with signature
            continue
        # else assume ANSI code page ("mbcs" is only available on Windows)
        data = data.decode("mbcs")
        data = codecs.BOM_UTF8 + data.encode("utf-8")
        with open(f, "wb") as fh:
            fh.write(data)
This assumes that every file not already in "UTF-8 with signature" is in the ANSI code page, which is apparently also what VS 2008 assumes. If you know that some files use other encodings, you would have to specify what those encodings are.
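If some files are known to use other encodings, one way to "specify" them is an ordered list of candidates to try. A minimal sketch under stated assumptions: the function name is made up, "cp1252" stands in for the ANSI code page (substitute "mbcs" on Windows to match the script above), and UTF-32 input is not handled:

```python
import codecs


def to_utf8_with_signature(data, fallbacks=("cp1252",)):
    """Re-encode raw file bytes as UTF-8 with signature.

    The fallbacks are tried in order when the bytes carry no BOM
    and are not valid UTF-8.
    """
    if data.startswith(codecs.BOM_UTF8):
        return data  # already in the target form
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # The utf-16 codec reads the BOM to pick the byte order and drops it.
        text = data.decode("utf-16")
    else:
        text = None
        for encoding in ("utf-8",) + tuple(fallbacks):
            try:
                text = data.decode(encoding)
                break
            except UnicodeDecodeError:
                continue
        if text is None:
            raise ValueError("none of the candidate encodings fit")
    return codecs.BOM_UTF8 + text.encode("utf-8")
```

Trying strict UTF-8 first keeps files that are already UTF-8 without signature from being decoded as ANSI and garbled. Note that a single-byte fallback such as "latin-1" accepts any byte sequence, so anything listed after it would never be reached.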
Using C#:
1) Create a new console application, then install Mozilla Universal Charset Detector (the Ude library used in the code below)
2) Run code:
    using System;
    using System.IO;
    using System.Text;

    class Program
    {
        static void Main(string[] args)
        {
            const string targetEncoding = "utf-8";
            foreach (var f in new DirectoryInfo(@"<your project's path>").GetFiles("*.cs", SearchOption.AllDirectories))
            {
                var fileEnc = GetEncoding(f.FullName);
                // Rewrite only files whose detected charset differs from the target.
                if (fileEnc != null && !string.Equals(fileEnc, targetEncoding, StringComparison.OrdinalIgnoreCase))
                {
                    var str = File.ReadAllText(f.FullName, Encoding.GetEncoding(fileEnc));
                    File.WriteAllText(f.FullName, str, Encoding.GetEncoding(targetEncoding));
                }
            }
            Console.WriteLine("Done.");
            Console.ReadKey();
        }

        // Returns the charset name reported by Ude, or null if detection failed.
        private static string GetEncoding(string filename)
        {
            using (var fs = File.OpenRead(filename))
            {
                var cdet = new Ude.CharsetDetector();
                cdet.Feed(fs);
                cdet.DataEnd();
                if (cdet.Charset != null)
                    Console.WriteLine("Charset: {0}, confidence: {1} : {2}", cdet.Charset, cdet.Confidence, filename);
                else
                    Console.WriteLine("Detection failed: " + filename);
                return cdet.Charset;
            }
        }
    }