git log output encoding issues on Windows 10 CLI terminal

Problem

How can I make git log output display properly in a Windows CLI terminal?

Example

git commands sequence leading to the problem

As you can see, I can type diacritical characters properly, but in the git log output they are somehow escaped. According to the UTF-8 encoding table, the codes between angle brackets (< and >) in the output correspond to the characters in the git config parameters I typed earlier.
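To check that the bracketed codes really are UTF-8 bytes, here is a minimal Python sketch that turns runs of <XX> escapes back into text. The sample string is hypothetical (the original screenshot is not reproduced here), and the function name is made up for illustration.

```python
import re

# One escaped byte as shown in the log output, e.g. <C5>
HEX_ESCAPE = r"<([0-9A-Fa-f]{2})>"

def decode_hex_escapes(text):
    """Replace each run of <XX> escapes with the text those bytes encode,
    assuming the bytes are UTF-8 (undecodable bytes become U+FFFD)."""
    def replace_run(match):
        hex_pairs = re.findall(HEX_ESCAPE, match.group(0))
        raw = bytes(int(pair, 16) for pair in hex_pairs)
        return raw.decode("utf-8", errors="replace")
    # Match one or more consecutive escapes so multi-byte characters
    # are decoded as a unit.
    return re.sub(r"(?:<[0-9A-Fa-f]{2}>)+", replace_run, text)

# Hypothetical author name containing the Polish letter U+0142,
# whose UTF-8 encoding is the two bytes C5 82.
sample = "Micha<C5><82>"
print(decode_hex_escapes(sample))
```

If the decoded text matches what was typed into git config, the stored data is fine and only the display step is escaping it.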

I tried setting the LESSCHARSET environment variable to utf-8, as suggested in one of the answers to a similar question, but then the output is garbled:

git log output after setting LESSCHARSET=utf8

I know .git/config is properly encoded as UTF-8, because gitk displays it as expected.

Proper gitk output

Here is the output of the locale command, in case it is needed:

LANG=
LC_CTYPE="C.UTF-8"
LC_NUMERIC="C.UTF-8"
LC_TIME="C.UTF-8"
LC_COLLATE="C.UTF-8"
LC_MONETARY="C.UTF-8"
LC_MESSAGES="C.UTF-8"
LC_ALL=

EDIT:

The output is the same in plain Git Bash:

git log output in Git Bash

So I believe the problem is shell-independent and relates to Git or its configuration.


Solution 1:

Okay, I experimented a bit and found out that Windows Git commands actually need UNIX variables like LC_ALL in order to display Polish (or other UTF-8) characters correctly. Just try this command:

set LC_ALL=C.UTF-8

Then enjoy the result. Here is what happened on my console (font "Consolas", no chcp necessary):

Windows console CMD


Update:

  • Well, in order for Windows commands like type (display a file on the console) to work correctly, you do need chcp 65001.
  • And if you prefer Git Bash commands like cat, you benefit from the aforementioned set LC_ALL=C.UTF-8.

Windows console CMD, part 2


Update 2: How to make the changes permanent

As user mono blaine said, create an environment variable LC_ALL and assign it the value C.UTF-8, either globally or for your own user profile only (sorry for the German screenshot):

Create environment variable

Next time you open a command processor console (cmd.exe) you should see the variable value when issuing the command echo %LC_ALL%. In PowerShell you should see it when issuing $env:LC_ALL.

The simplest way to make the UTF-8 code page permanent is to open regedit, add a new value named Autorun of type string under the key HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Command Processor, and assign it the value chcp 65001.

Registry Editor

Henceforth, this command will be executed each time you open a new cmd.exe console. You even see its output in the new window: "Aktive Codepage: 65001." (or similar in your respective language).

Oh, by the way: In order to display a UTF-8 encoded file correctly in PowerShell you can use Get-Content -encoding UTF8 file.txt or cat -encoding UTF8 file.txt (cat being an alias for Get-Content in PowerShell).

Solution 2:

If anyone is interested in the PowerShell equivalent of set LC_ALL=C.UTF-8, that is:

$env:LC_ALL='C.UTF-8'

However, this works only for the current session. To make it permanent, there are two possibilities:

  • create an environment variable named LC_ALL with the value C.UTF-8
  • or put $env:LC_ALL='C.UTF-8' in your $Profile file
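To check whether a given environment actually carries a UTF-8 locale hint, a small Python sketch can help. It assumes the rule described in the less manual: when LESSCHARSET is not set, less looks for "UTF-8" or "utf8" in LC_ALL, LC_CTYPE, or LANG. The function name is hypothetical, and LESSCHARSET itself is deliberately not modelled here.

```python
import os

# Spellings the less manual lists as marking a UTF-8 locale
UTF8_MARKERS = ("UTF-8", "UTF8", "utf-8", "utf8")
# Variables inspected, highest precedence first
LOCALE_VARS = ("LC_ALL", "LC_CTYPE", "LANG")

def utf8_locale_source(env=None):
    """Return the name of the first locale variable whose value mentions
    UTF-8, or None if none of them does. Uses os.environ by default."""
    env = os.environ if env is None else env
    for name in LOCALE_VARS:
        value = env.get(name, "")
        if any(marker in value for marker in UTF8_MARKERS):
            return name
    return None

print(utf8_locale_source({"LC_ALL": "C.UTF-8"}))  # LC_ALL
print(utf8_locale_source({}))                     # None
```

Calling utf8_locale_source() with no argument in a fresh console shows whether the permanent setting has taken effect for new processes.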

Solution 3:

I am using Git via PowerShell Core v7.0.3 inside Windows Terminal on Windows 10.

I have been browsing through answers and tried many of them. The solutions that worked for me were:

  • Change a Git setting: git config --global core.pager 'less --raw-control-chars'
  • Add $env:LC_ALL = 'C.UTF-8' to the current PowerShell profile

Each of these works on its own. I chose the Git setting because the problem seems to be related to Git, and it keeps the PowerShell profile cleaner.

Solution 4:

I use Git Bash on Windows 10. For me, the following settings made the output look as expected.

  • Environment settings. Set the environment variables LC_ALL=C.UTF-8 and LESSCHARSET=UTF-8 globally.

  • Git config. Run git config --global i18n.logOutputEncoding utf-8.

  • Git Bash setting. Set Options -> Text -> Character set to UTF-8, or set both Locale and Character set to default; it is smart enough to choose the correct encoding.

Done.

Solution 5:

I had this problem on Linux. The cause was that I had not generated the locales, so the output of locale showed "C" everywhere, without UTF-8. To solve this, I uncommented en_US.UTF-8 and ru_RU.UTF-8 in /etc/locale.gen, ran localectl set-locale LANG=ru_RU.UTF-8, rebooted, and logged in again. After that, Cyrillic was displayed normally.