Not seeing Unicode support in file differences

MikeTheGeek
New Member
Posts: 3
Joined: Fri Mar 10, 2017 5:01 pm

Not seeing Unicode support in file differences

Post by MikeTheGeek »

Greetings,

I'm working on a project that involves C++ code that is mostly in English, but contains some Korean characters in comments. I just purchased ExamDiff Pro today because of the advertised Unicode support.

However, when I diff two of these files, I see all the Korean characters turn to gibberish.

Strangely, if I view the same files in Notepad++, the Korean characters are displayed correctly. If it helps, Notepad++ identifies the file encoding as EUC-KR.

I've tried manually forcing the encoding to any and all of the Unicode options in the open file dialog. Some turn the entire file into what look like Chinese characters, but with UTF-8 (which I gather is the correct choice) the file is correctly displayed with English characters while the Korean characters are gibberish.

Neither of the two files has a byte-order mark, which I suppose means EDP has to make some assumptions. Does it also make assumptions about the code page to use?
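For readers unfamiliar with byte-order marks, here is a minimal Python sketch of what checking for one can look like. It only illustrates the general idea and says nothing about how ExamDiff Pro actually detects encodings; the function name `sniff_bom` is made up for this example.

```python
import codecs

# Longer marks are checked first so that the UTF-32 LE mark (FF FE 00 00)
# is not mistaken for the UTF-16 LE mark (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


# An EUC-KR file carries no such marker, so a tool has to guess or be told.
euc_kr_sample = "한글".encode("euc_kr")
detected = sniff_bom(euc_kr_sample)  # None: nothing to go on
```

When the result is `None`, the bytes alone do not say which encoding or code page was used, which is why a program has to fall back on a default or a heuristic.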

It's a little disconcerting that the only font options I can select are mono-spaced fonts that have a "script" pull-down. The pull-down only offers some alphabets (Hebrew, Greek, Cyrillic, etc.). How these fonts (specifically the default Courier New in Western script) end up displaying what look like Chinese characters with the forced Unicode encoding is beyond me.

I understand there is more to displaying foreign characters than just using Unicode (the Notepad++ site has a whole page dedicated to explaining character encoding) but I'm wondering if I'm missing something or if ExamDiff Pro simply doesn't support Unicode characters in the file comparison view.

I'm not here to push Notepad++; I'm just using it for comparison, since everything else (document file, OS, fonts, etc.) is the same except the software I'm using to display these documents. Notepad++ shows me the Korean characters; EDP does not.
psguru
Site Admin
Posts: 2228
Joined: Sat May 15, 2004 4:23 pm
Location: California

Re: Not seeing Unicode support in file differences

Post by psguru »

In order to display Unicode files you need two things:

1. Make sure that the files are recognized as Unicode (you can see if they are by looking at the file status bars). If not, force Unicode by going to the Browse For File button in the Compare dialog and selecting Unicode in a combo-box at the bottom.

2. Use a font that has a script for your language. Even though ExamDiff Pro's file comparison forces fixed-pitch fonts, there are some fonts available that provide the necessary scripts.
psguru
PrestoSoft
MudGuard
Expert Member
Posts: 69
Joined: Mon Jun 07, 2004 12:42 am

Re: Not seeing Unicode support in file differences

Post by MudGuard »

If the encoding really is EUC-KR (as Notepad++ assumes), then it is no wonder that you get gibberish when you try to display it as if it were UTF-8: EUC-KR is a different encoding, and its byte sequences for Korean characters are not the ones UTF-8 uses.
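A short Python sketch of the mismatch, using a made-up two-character sample. It also shows a likely explanation for the "Chinese characters" seen earlier, on the assumption that the other Unicode options in the dialog are UTF-16 variants: pairs of ASCII bytes read as UTF-16 code units tend to land in the CJK range.

```python
# Korean text as it would be stored in an EUC-KR file (sample string is made up).
text = "한글"
raw = text.encode("euc_kr")

# Decoding with the matching codec recovers the original text.
round_trip = raw.decode("euc_kr")

# Decoding the same bytes as UTF-8 does not: invalid sequences are
# substituted with U+FFFD, the replacement character.
as_utf8 = raw.decode("utf-8", errors="replace")

# Plain ASCII source text read as UTF-16 LE: the two bytes 0x69 0x6E ("in")
# are combined into the single code unit U+6E69, a CJK ideograph.
as_utf16 = b"in".decode("utf-16-le")

print(repr(round_trip), repr(as_utf8), hex(ord(as_utf16)))
```

ASCII characters have the same byte values in EUC-KR and UTF-8, which is why the English parts of the file still display correctly when UTF-8 is forced.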
MikeTheGeek
New Member
Posts: 3
Joined: Fri Mar 10, 2017 5:01 pm

Re: Not seeing Unicode support in file differences

Post by MikeTheGeek »

Thanks, MudGuard and psguru, you are correct. After posting, I looked into the EUC-KR encoding and discovered that it, like ANSI, is a legacy encoding that predates Unicode. I tried to reply with my findings, but my post had not been made public yet.

Guess I'll look into how I can convert the encoding of these files from EUC-KR to Unicode.

Assuming I do, though, can anyone explain how I will see Korean characters in ExamDiff Pro if it only supports certain fixed-pitch fonts with certain scripts? I don't see any options in the scripts drop-down that would suggest Chinese, Japanese, or Korean characters are supported. Is there a way I can add fonts?
psguru
Site Admin
Posts: 2228
Joined: Sat May 15, 2004 4:23 pm
Location: California

Re: Not seeing Unicode support in file differences

Post by psguru »

See my point 2: yes, you will need to find a fixed-pitch font that has Korean script. After googling this for a bit, there appear to be some free fonts available, although I didn't dig much into this.
psguru
PrestoSoft
MikeTheGeek
New Member
Posts: 3
Joined: Fri Mar 10, 2017 5:01 pm

Re: Not seeing Unicode support in file differences

Post by MikeTheGeek »

After converting the documents to UTF-8 with a BOM, ExamDiff Pro now shows me the Korean characters. (I didn't do anything other than convert the documents.)
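For anyone who needs to do the same conversion in bulk, here is a minimal Python sketch. The post does not say which tool was used, so this is only one possible approach; the function name and the sample file names are hypothetical.

```python
import tempfile
from pathlib import Path


def reencode_to_utf8_bom(src: Path, dst: Path) -> None:
    """Re-encode an EUC-KR text file as UTF-8 with a byte-order mark."""
    # Work on raw bytes so that line endings pass through unchanged.
    text = src.read_bytes().decode("euc_kr")
    # The "utf-8-sig" codec prepends the UTF-8 BOM (EF BB BF) when encoding.
    dst.write_bytes(text.encode("utf-8-sig"))


# Small self-contained demonstration on a throwaway file.
sample = "int x; // 한글\r\n"
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "sample_euckr.cpp"   # hypothetical file names
    dst = Path(tmp) / "sample_utf8.cpp"
    src.write_bytes(sample.encode("euc_kr"))
    reencode_to_utf8_bom(src, dst)
    converted = dst.read_bytes()
```

Decoding raises `UnicodeDecodeError` if a file is not actually EUC-KR, which is a useful safety check before overwriting anything.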

So I'm a happy camper. Thanks for the help.

Guess I'm still curious as to how it can do that when the available fonts don't seem to support Korean characters, but I guess that can be homework for me.