Bug: Silently changing UTF-8 characters

Alexo
Expert Member
Posts: 154
Joined: Fri Oct 22, 2004 10:18 am
Location: Canada

Bug: Silently changing UTF-8 characters

Post by Alexo »

Version: EDP 10.0.1.17 64-bit
Use plug-ins: off
Use document type-settings: on

Scenario:
Comparing 2 directories.
Comparing one of the changed .XML files by double-clicking on it.
The file has some non-encoded UTF-8 emojis (specifically: F0-9F-91-91), which are shown correctly in Notepad++, Windows 10 Notepad, and other editors.
Saving the file from EDP silently changes the character from F0-9F-91-91 to ED-A0-BD-ED, which breaks the application that uses the file, since it considers the result to be a different string.
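
For readers wondering where bytes like ED-A0-BD could come from: one plausible explanation, which is an assumption and not something confirmed in this thread, is that the character is held internally as a UTF-16 surrogate pair and each half is then written out as its own 3-byte sequence (CESU-8 style) instead of being combined into one 4-byte UTF-8 sequence. A minimal Python sketch of that effect follows; the helper name `cesu8_style` is hypothetical and has nothing to do with EDP's actual code.

```python
# Illustration only: reproduces the reported byte pattern under the ASSUMPTION
# that each UTF-16 surrogate was encoded separately. Not taken from EDP.

original = bytes.fromhex("F09F9191")   # the bytes from the report
text = original.decode("utf-8")        # one code point, U+1F451


def cesu8_style(s: str) -> bytes:
    """Encode every UTF-16 code unit of s on its own (hypothetical helper)."""
    utf16 = s.encode("utf-16-be")
    out = bytearray()
    for i in range(0, len(utf16), 2):
        unit = int.from_bytes(utf16[i:i + 2], "big")
        # 'surrogatepass' lets Python emit the 3-byte form of a lone surrogate
        out += chr(unit).encode("utf-8", "surrogatepass")
    return bytes(out)


broken = cesu8_style(text)
print(broken.hex("-").upper())         # ED-A0-BD-ED-B1-91
```

The first four bytes of that output match the ED-A0-BD-ED given in the report. A strict UTF-8 decoder rejects such a sequence, which fits an application treating it as a different string.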

I found no setting related to UTF-8 that will allow me to save the file without the silent changes.

Result:
It is impossible to reconcile the files with EDP.

Expected behaviour:
EDP should not change any parts of the file behind the user's back.

Note:
The file is not a "real" XML file, it is just a settings file used by the application which has an XML structure. It has no BOM and can include non-encoded UTF-8 characters. I cannot make arbitrary changes to the file (like adding a BOM) since it will likewise break the application.

Note2: Treating the files as binary is not feasible.

Please fix ASAP!
psguru
Site Admin
Posts: 2228
Joined: Sat May 15, 2004 4:23 pm
Location: California

Re: Bug: Silently changing UTF-8 characters

Post by psguru »

We'll need your pair of files to debug this issue. Is it possible for you to attach them here? If not, please email to examdiffpro [at] prestosoft.com.
psguru
PrestoSoft
Alexo
Expert Member
Posts: 154
Joined: Fri Oct 22, 2004 10:18 am
Location: Canada

Re: Bug: Silently changing UTF-8 characters

Post by Alexo »

Those files contain private information but I'll see if I can extract just the relevant parts.
Alexo
Expert Member
Posts: 154
Joined: Fri Oct 22, 2004 10:18 am
Location: Canada

Re: Bug: Silently changing UTF-8 characters

Post by Alexo »

Files and settings sent in PM.

The "You cannot make another post so soon after your last." is annoying as hell.
MSpagni
Expert Member
Posts: 537
Joined: Mon Mar 30, 2009 12:53 am
Location: Italy

Re: Bug: Silently changing UTF-8 characters

Post by MSpagni »

Maybe (but only maybe) I had the same problem, but I was in a hurry and I didn't investigate.
Alexo wrote: The "You cannot make another post so soon after your last." is annoying as hell.
I agree!
psguru
Site Admin
Posts: 2228
Joined: Sat May 15, 2004 4:23 pm
Location: California

Re: Bug: Silently changing UTF-8 characters

Post by psguru »

Alexo wrote: Sun Nov 10, 2019 3:18 pm Files and settings sent in PM.
There were no attachments in your PM.
Alexo wrote: The "You cannot make another post so soon after your last." is annoying as hell.
It's for spam protection. I changed the setting to 60 seconds.
psguru
PrestoSoft
psguru
Site Admin
Posts: 2228
Joined: Sat May 15, 2004 4:23 pm
Location: California

Re: Bug: Silently changing UTF-8 characters

Post by psguru »

Got the files, thanks. Yes: because the files contain UTF-8 markers (Notepad also detects them as UTF-8), EDP treats them as UTF-8 and tries to save them as such. However, a bug in UTF-8 saving, triggered by the specific circumstances your files present, causes them to be saved incorrectly.

The fix will appear in the next builds of 10.0 and 11.0 Beta.
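
Until a fixed build is installed, a saved file can be checked outside EDP before it goes back to the application. The sketch below assumes the damage takes the form of UTF-8-encoded surrogates (bytes ED A0..BF followed by a continuation byte), which never occur in valid UTF-8; the function names are hypothetical and the repair step is a workaround idea, not an official PrestoSoft tool.

```python
import re

# In valid UTF-8, the lead byte ED is only ever followed by 80..9F, so
# ED A0..BF marks an encoded surrogate (U+D800..U+DFFF).
SURROGATE_RE = re.compile(rb"\xED[\xA0-\xBF][\x80-\xBF]")


def find_surrogate_sequences(data: bytes) -> list[int]:
    """Return the byte offsets of every encoded surrogate in data."""
    return [m.start() for m in SURROGATE_RE.finditer(data)]


def repair_surrogate_pairs(data: bytes) -> bytes:
    """Re-join surrogate pairs into proper 4-byte UTF-8 sequences.

    Raises UnicodeDecodeError if a surrogate is unpaired or the data
    contains other invalid UTF-8.
    """
    text = data.decode("utf-8", "surrogatepass")
    # A round trip through UTF-16 merges each high/low pair into one code point.
    joined = text.encode("utf-16-le", "surrogatepass").decode("utf-16-le")
    return joined.encode("utf-8")


if __name__ == "__main__":
    damaged = b"<v>" + bytes.fromhex("EDA0BDEDB191") + b"</v>"
    print(find_surrogate_sequences(damaged))
    print(repair_surrogate_pairs(damaged).hex("-").upper())
```

An empty list from `find_surrogate_sequences` means this particular kind of damage is absent; it does not prove the file is otherwise identical to the original.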
psguru
PrestoSoft
Alexo
Expert Member
Posts: 154
Joined: Fri Oct 22, 2004 10:18 am
Location: Canada

Re: Bug: Silently changing UTF-8 characters

Post by Alexo »

Thanks!

I hope it arrives soon; I have to work around that problem often.
psguru
Site Admin
Posts: 2228
Joined: Sat May 15, 2004 4:23 pm
Location: California

Re: Bug: Silently changing UTF-8 characters

Post by psguru »

We are targeting tomorrow.
psguru
PrestoSoft
psguru
Site Admin
Posts: 2228
Joined: Sat May 15, 2004 4:23 pm
Location: California

Re: Bug: Silently changing UTF-8 characters

Post by psguru »

psguru
PrestoSoft
MSpagni
Expert Member
Posts: 537
Joined: Mon Mar 30, 2009 12:53 am
Location: Italy

Re: Bug: Silently changing UTF-8 characters

Post by MSpagni »

psguru wrote: Mon Nov 11, 2019 12:37 pm It's for spam protection.
I had no doubt about it! :D