Remove a BOM character in a file

I have a BOM character in my HTML file and I want to remove it. I have searched a lot and tried many scripts, but none of them worked. I have also downloaded Notepad++, but there is no "UTF-8 without BOM" option in its Encoding menu. How can I delete that BOM character? Thanks.

The screenshot of my notepad++


Look in the same Encoding menu and click "Convert to UTF-8", then save the file.
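To confirm whether the BOM is really gone after converting, you can inspect the first three bytes of the file. The following is a minimal sketch, assuming a UTF-8 BOM (bytes EF BB BF) and a POSIX-like shell such as Git Bash or Cygwin; the function name `has_bom` is made up for illustration.

```shell
# has_bom FILE: succeed (exit status 0) if FILE starts with the UTF-8 BOM bytes EF BB BF.
# Hypothetical helper; relies on head -c, which GNU and BSD head both provide.
has_bom() {
    # Dump the first three bytes as hex and strip the whitespace od adds.
    first_bytes=$(head -c 3 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$first_bytes" = "efbbbf" ]
}
```

Usage: `has_bom index.html && echo "BOM still present"`, where `index.html` stands for your own file.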



You can solve the problem with vim, which is easy to get on Windows through MinGW-w64 (if you have installed Git, it comes along) or Cygwin.

The key is to use these options:

  • The option -s, which makes vim read a script file and interpret its characters as if you had typed them.
  • The option -b, which opens your file in binary mode, where you'll see those awkward BOM bytes.
  • The option -n, which is very important! It disables swap files, so all your work runs in memory. This matters for large files, where swap files can interfere with the process.

That said, let's go to the code!

  1. First, create a simple file, here named 'script', which will hold the vim commands:

    echo 'gg"+gPggdtCZZ' > script
    

    ...this odd-looking string tells vim: go to the beginning of the file (gg), put the contents of the "+ (clipboard) register before the cursor ("+gP), go back to the beginning (gg), delete everything up to, but not including, the next character 'C' (dtC), then save the file and quit (ZZ).

    Note: If the real content of your file starts with a character other than 'C', you have to put that character in place of 'C'. If your files start with different characters, you can follow the same logic and write a bash script that reads the first character and substitutes it into the snippet above.

  2. Run the vim command:

    vim -n -b <the_file> -s script
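If you would rather not depend on the file's first character, the same result can be reached with plain shell tools. This is not the vim method above but an alternative sketch, assuming a UTF-8 BOM (bytes EF BB BF) and a POSIX-like shell such as the one that ships with Git or Cygwin; the function name `strip_bom` is made up for illustration.

```shell
# strip_bom FILE: remove a leading UTF-8 BOM (EF BB BF) from FILE in place, if present.
# Hypothetical helper; files without a BOM are left untouched.
strip_bom() {
    file=$1
    # Read the first three bytes as hex, without the whitespace od adds.
    first_bytes=$(head -c 3 "$file" | od -An -tx1 | tr -d ' \n')
    if [ "$first_bytes" = "efbbbf" ]; then
        tmp=$(mktemp) || return 1
        # tail -c +4 prints everything from the fourth byte onward.
        # Writing back with cat keeps the original file's permissions.
        tail -c +4 "$file" > "$tmp" && cat "$tmp" > "$file"
        rm -f "$tmp"
    fi
}
```

Usage: `strip_bom index.html`, where `index.html` stands for your own file. Working on a copy first is a sensible precaution, since the file is rewritten in place.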