# Fix Git Blame Showing Wrong Encoding for Non-ASCII Characters

You run git blame on a file containing non-ASCII characters (Chinese, Japanese, emojis, accented characters), and the output shows garbled text:

```bash
git blame src/i18n/messages.py
```

```text
abc1234 (Author 2026-04-01 10:00:00 +0000  15) message = "пÑ\200ивеÑ\u0082 миÑ\u0080"
def5678 (Author 2026-04-02 14:30:00 +0000  16) greeting = "ä½ å¥½ä¸\u0096ç\u0095\u008c"
```

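The commit hashes and dates are correct, but the line content (and sometimes the author names) is garbled. Git stores and prints file content as raw bytes, so this happens when the encoding of those bytes does not match the encoding your terminal uses to display them.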

## Step 1: Check Your Terminal Encoding

First, verify your terminal supports UTF-8:

```bash
echo $LANG
# Should show: en_US.UTF-8 or similar
```

If it shows C or POSIX, set the correct locale:

```bash
export LANG=en_US.UTF-8
export LC_ALL=en_US.UTF-8
```

Add both lines to `~/.bashrc` or `~/.zshrc` to make the setting persistent.
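`LANG` is only one of the variables that decide the character set: `LC_ALL` overrides `LC_CTYPE`, which in turn overrides `LANG`. A minimal sketch of that precedence check follows; the function names `effective_ctype` and `locale_is_utf8` are illustrative helpers, not part of any standard tool.

```bash
# Hypothetical helpers: report the locale that actually governs character
# handling, following the POSIX precedence LC_ALL > LC_CTYPE > LANG.
effective_ctype() {
    # An empty variable counts as unset, so fall through to the next one.
    printf '%s\n' "${LC_ALL:-${LC_CTYPE:-${LANG:-C}}}"
}

locale_is_utf8() {
    # Accept the common spellings of the UTF-8 codeset suffix.
    case "$(effective_ctype)" in
        *.UTF-8|*.utf-8|*.UTF8|*.utf8) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   locale_is_utf8 || echo "Locale is not UTF-8: $(effective_ctype)"
```

This also explains a common surprise: if `LC_ALL=C` is set somewhere in your shell startup files, changing `LANG` alone has no effect.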

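## Step 2: Configure Git Encoding

Git stores file content as raw bytes and never converts it, but commit metadata (author names and commit messages) does have an encoding, which Git assumes is UTF-8 unless told otherwise. Some repositories use a different one. To make the UTF-8 assumption explicit for all your repositories:

```bash
git config --global i18n.commitEncoding utf-8
git config --global i18n.logOutputEncoding utf-8
```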

These settings tell Git which encoding commit messages are stored in and which encoding to convert author names and commit messages to on output. They do not change how file content is displayed.
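To see which output encoding Git will use for commit metadata in a given repository, you can query the two settings in order. The sketch below assumes the documented fallback (`i18n.logOutputEncoding`, then `i18n.commitEncoding`, then UTF-8); the function name `effective_log_encoding` is illustrative.

```bash
# Hypothetical helper: print the encoding Git converts author names and
# commit messages to on output, for the repository in the current directory.
effective_log_encoding() {
    git config --get i18n.logOutputEncoding 2>/dev/null \
        || git config --get i18n.commitEncoding 2>/dev/null \
        || echo "UTF-8"
}

# Example:
#   cd path/to/repo && effective_log_encoding
```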

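## Step 3: Fix Blame Output for Specific Encodings

If the file was committed in a different encoding (e.g., GBK for Chinese) and your terminal uses that same encoding, have Git emit author names and commit summaries in it as well, so the whole output is consistent:

```bash
git -c i18n.logOutputEncoding=GBK blame src/i18n/messages.py
```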

Or configure it per-repository:

```bash
git config i18n.logOutputEncoding GBK
git blame src/i18n/messages.py
```
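If the terminal itself is UTF-8, one alternative is to let Git emit everything in the file's legacy encoding and convert the whole stream afterwards. The sketch below relies on `git blame --encoding`, which sets the encoding of author names and commit summaries, and on `iconv`. The wrapper name `blame_as_utf8` is an assumption for illustration, and it presumes every blamed line really is in the stated encoding.

```bash
# Hypothetical wrapper: blame a legacy-encoded file and show it on a
# UTF-8 terminal. Usage: blame_as_utf8 <encoding> <git blame arguments...>
blame_as_utf8() {
    local encoding=$1
    shift
    # Metadata is emitted in $encoding, file content already is in $encoding,
    # so the combined stream can be converted to UTF-8 in one pass.
    git blame --encoding="$encoding" "$@" | iconv -f "$encoding" -t UTF-8
}

# Example:
#   blame_as_utf8 GBK src/i18n/messages.py
```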

## Step 4: Detect the Actual File Encoding

If you do not know the file's encoding, detect it:

```bash
file -i src/i18n/messages.py
# Output: text/plain; charset=utf-8

# Or use enca
enca -L zh src/i18n/messages.py
# Output: Universal transformation format 8 bits; UTF-8
```
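Both tools guess from the content, so for a plain yes/no answer it can help to ask `iconv` to validate the bytes instead. A minimal sketch, assuming an `iconv` that exits non-zero on invalid input (as the glibc and libiconv versions do); the function name `is_valid_utf8` is illustrative.

```bash
# Hypothetical helper: succeed only if the file is well-formed UTF-8.
is_valid_utf8() {
    # A UTF-8 to UTF-8 conversion fails on the first invalid byte sequence.
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Example:
#   is_valid_utf8 src/i18n/messages.py || echo "not UTF-8, check for a legacy encoding"
```

Note that pure ASCII passes this check too, since ASCII is a subset of UTF-8. A failure reliably rules UTF-8 out; a pass only tells you the bytes are well-formed.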

If the file shows a different encoding than expected, convert it:

```bash
iconv -f GBK -t UTF-8 src/i18n/messages.py > src/i18n/messages_utf8.py
mv src/i18n/messages_utf8.py src/i18n/messages.py
git add src/i18n/messages.py
git commit -m "Convert file encoding from GBK to UTF-8"
```
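If the source encoding was guessed wrong, `iconv` can stop partway through and leave a truncated result. A slightly more defensive sketch replaces the original only when the whole conversion succeeds. `convert_to_utf8` is a hypothetical helper name, and `mktemp` is assumed to be available.

```bash
# Hypothetical helper: convert a file to UTF-8 in place, leaving it
# untouched if the conversion fails. Usage: convert_to_utf8 <from> <file>
convert_to_utf8() {
    local from=$1 file=$2 tmp
    tmp=$(mktemp "${file}.XXXXXX") || return 1
    if iconv -f "$from" -t UTF-8 "$file" > "$tmp"; then
        # Writing through cat keeps the original file's permissions.
        cat "$tmp" > "$file" && rm -f "$tmp"
    else
        rm -f "$tmp"
        echo "conversion from $from failed; $file left unchanged" >&2
        return 1
    fi
}

# Example:
#   convert_to_utf8 GBK src/i18n/messages.py && git add src/i18n/messages.py
```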

## Step 5: Git Blame With Porcelain Output

For programmatic access to blame data with correct encoding:

```bash
git blame --porcelain src/i18n/messages.py
```

This outputs machine-readable data in which each source line is prefixed with a TAB and emitted as raw, unconverted bytes. Your script can then decode those lines with the correct encoding.
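Because content lines are the only ones that start with a TAB, they are easy to separate from the metadata. The sketch below reads porcelain output on stdin, keeps only the content lines, and decodes them. The function name and the choice of `iconv` for decoding are assumptions for illustration.

```bash
# Hypothetical helper: extract source lines from `git blame --porcelain`
# output on stdin and decode them. Usage: ... | porcelain_content_as_utf8 <encoding>
porcelain_content_as_utf8() {
    local encoding=$1 tab
    tab=$(printf '\t')
    # LC_ALL=C makes sed treat the input as plain bytes; only lines that
    # begin with a TAB are printed, with that TAB removed.
    LC_ALL=C sed -n "s/^${tab}//p" | iconv -f "$encoding" -t UTF-8
}

# Example:
#   git blame --porcelain src/i18n/messages.py | porcelain_content_as_utf8 GBK
```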

## Step 6: Terminal Font Support

Even with correct encoding, the terminal font must support the characters. Check if your font supports the required character set:

```bash
# Test Chinese characters
echo "你好世界"

# Test Japanese characters
echo "こんにちは"

# Test emojis
echo "🚀 🔥 ✅"
```

If these show as boxes or question marks, install a font with broader Unicode support:

```bash
# Noto Fonts cover almost all Unicode characters
sudo apt install fonts-noto fonts-noto-cjk fonts-noto-color-emoji
```

Then configure your terminal emulator to use one of the Noto fonts.
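When characters render as boxes, it can be worth confirming that the bytes themselves are right before changing fonts. A minimal sketch, assuming `od` is available and the shell session is UTF-8; `utf8_hex` is a hypothetical helper name.

```bash
# Hypothetical helper: print the bytes of a string as space-separated hex.
utf8_hex() {
    printf '%s' "$1" | od -An -tx1 | tr -s ' \n' ' ' | sed 's/^ //; s/ $//'
}

# Example:
#   utf8_hex "你"    # prints: e4 bd a0
```

If the bytes match the expected UTF-8 sequence but the glyph is still a box, the problem is the font rather than the encoding.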

## Step 7: VS Code Git Blame Extension

If you use the GitLens or Git Blame extension in VS Code, the extension handles encoding independently of the terminal. Check the file encoding settings in your VS Code `settings.json`:

```json
{
    "files.encoding": "utf8",
    "files.autoGuessEncoding": true
}
```

The `files.autoGuessEncoding` setting makes VS Code attempt to detect each file's encoding automatically when it is opened.

## Step 8: Blame With Line Encoding Detection

For files with mixed encoding (some lines UTF-8, others GBK), blame each section individually:

```bash
# Blame a specific line range
git blame -L 10,20 src/i18n/messages.py

# Blame a specific function (the name is a regex and must be prefixed with a colon)
git blame -L :my_function src/i18n/messages.py
```

This helps isolate which lines have encoding issues.
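To decide which ranges to pass to `-L`, you first need the numbers of the lines that are not valid UTF-8. A minimal sketch, assuming `iconv` is available and exits non-zero on invalid input; the function name `invalid_utf8_lines` is illustrative, and the per-line loop is slow on large files.

```bash
# Hypothetical helper: print the number of every line that is not
# well-formed UTF-8. Usage: invalid_utf8_lines <file>
invalid_utf8_lines() {
    local n=0 line
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        printf '%s' "$line" | iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1 || echo "$n"
    done < "$1"
}

# Example: blame each suspicious line on its own
#   for n in $(invalid_utf8_lines src/i18n/messages.py); do
#       git blame -L "$n,$n" src/i18n/messages.py
#   done
```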

## Step 9: Core Quoting Path

If file paths with non-ASCII characters are showing as quoted escape sequences:

```bash
git config --global core.quotepath false
```

This disables the quoting of non-ASCII characters in file paths. After this change:

```bash
git status
# Before: "src/\344\270\255\346\226\207/file.py"
# After:  src/中文/file.py
```

## Verifying the Fix

After applying the encoding fixes:

```bash
git blame src/i18n/messages.py
```

The output should show correct characters:

```text
abc1234 (张三 2026-04-01 10:00:00 +0000  15) message = "привет мир"
def5678 (李四 2026-04-02 14:30:00 +0000  16) greeting = "你好世界"
```

Author names, dates, and line content should all display correctly.