Comma/Tab delimited text: Detail level for difference ident

Bill
New Member
Posts: 4
Joined: Wed Jul 18, 2007 11:33 am
Location: Tennessee

Post by Bill »

I frequently compare Comma/Tab delimited text files with "Detail level for difference identification" set to "Lines and words". The resulting highlighting breaks numeric values apart at signs and decimal points.

I would like an option along the lines of "Lines and Comma/Tab delimited columns", so that the entire text of a column is highlighted if it differs.

Writing a plug-in that aligns all the columns might help somewhat, but the option above would suit my purpose best.
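
To illustrate the column-level comparison requested above, here is a minimal Python sketch. It is not ExamDiff Pro code; the function name `differing_columns` and the sample rows are made up for illustration, and it assumes the delimiter never appears inside a value.

```python
def differing_columns(left, right, delimiter=","):
    """Return (index, left_value, right_value) for each column that differs."""
    left_cols = left.split(delimiter)
    right_cols = right.split(delimiter)
    width = max(len(left_cols), len(right_cols))
    # Pad the shorter row so a missing column is reported as a difference.
    left_cols += [None] * (width - len(left_cols))
    right_cols += [None] * (width - len(right_cols))
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(left_cols, right_cols))
        if a != b
    ]


# The whole numeric column is reported, sign and decimal point included.
print(differing_columns("A1,-12.50,ok", "A1,+12.75,ok"))
```

Because each column is compared as a unit, a changed number is reported whole rather than in pieces.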
psguru
Site Admin
Posts: 2246
Joined: Sat May 15, 2004 4:23 pm
Location: California

Post by psguru »

It's actually almost possible today. Under Options | Misc | "Word separators" you can specify custom separators, and they are used to define what a word means for comparison purposes as well as for word navigation. The only limitation in the current version is that the SPACE and the TAB characters are always implied. I think that simply adding the ability to define all separating characters, including SPACE and TAB, would achieve the goal.

In the meantime, you could experiment with this setting to see whether it gives you meaningful comparison results.
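
As a rough model of how a separator list defines "words", the Python sketch below splits a line on a configurable set of characters while always adding SPACE and TAB, mirroring the limitation described above. The function `split_words` is hypothetical, and the separator sets shown are examples only, not ExamDiff Pro's defaults.

```python
import re


def split_words(line, separators=","):
    """Split a line into words on the given separator characters.

    Space and tab are always treated as separators, as in the current
    version described above. Runs of separators collapse, so empty
    columns produce no word.
    """
    chars = set(separators) | {" ", "\t"}
    pattern = "[" + "".join(re.escape(c) for c in chars) + "]+"
    return [word for word in re.split(pattern, line) if word]


# With "," as the only custom separator the number stays in one piece.
print(split_words("A1,-12.50,ok", ","))
# Treating "." and "-" as separators breaks it apart.
print(split_words("A1,-12.50,ok", ",.-"))
# The implied SPACE still splits a column that contains a space.
print(split_words("New York,5", ","))
```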
psguru
PrestoSoft

Post by Bill »

Setting the "Word separators" to "," solved my problem because my data contains no spaces. The ability to define all separating characters would provide a complete solution.

The only additional enhancement I can think of is an option to ignore separators embedded in quoted strings.

Post by psguru »

Bill wrote: "The only additional enhancement I can think of is an option to ignore separators embedded in quoted strings."

This one is tricky, especially considering nested quotes. But we'll add a way to select ALL word separators in the next version.

Post by Bill »

The option to ignore separators embedded in quoted strings would be useful mainly for comparing consistently formatted tabular data, such as the output of a spreadsheet (e.g. a CSV file exported from Excel) or of a database management system. A quoted string would start with a single or double quote and end with the same kind of quote. An embedded quote of the same kind would be represented by two quotes in a row. A starting quote without a matching closing quote should not occur, and should be ignored if it does.

As you say, this one is tricky. If the data being compared does not follow the above rules (e.g. punctuated English text with quotes and apostrophes), the quoted-string handling would probably cause confusion rather than produce a useful comparison.
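
The quoting rules above are close to what Python's standard `csv` module applies, so it can serve as a quick illustration. This is only a sketch: `parse_row` is a made-up helper, the sample rows are invented, and `csv` accepts one quote character per reader, whereas the rules above allow either kind.

```python
import csv


def parse_row(line, delimiter=",", quotechar='"'):
    """Parse one delimited line, honouring quoted fields and doubled quotes."""
    reader = csv.reader(
        [line], delimiter=delimiter, quotechar=quotechar, doublequote=True
    )
    return next(reader)


old = parse_row('1,"Smith, John","He said ""hi"""')
new = parse_row('1,"Smith, Jane","He said ""hi"""')

# The comma inside the quoted name does not start a new column.
print(old)

# Indexes of the columns whose text differs between the two rows.
changed = [i for i, (a, b) in enumerate(zip(old, new)) if a != b]
print(changed)
```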

Post by psguru »

OK, I think this can be added as well. The only thing to note is that the option will apply only during comparison, not during word-by-word text navigation, word selection, or search ("Match whole word only").

Also, for performance reasons, quotes will be assumed to be matched: once a normal (not embedded) quote opens, it is expected to close later in the line. If it never closes, all separators after that quote will be ignored until the end of the line. All of this, of course, applies only when the new option to ignore separators embedded in quoted strings is enabled.
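
The behavior described above can be sketched as a single left-to-right pass that never looks ahead for a closing quote. This is a guess at the general approach in Python, not the actual ExamDiff Pro implementation; `split_outside_quotes` and its parameters are hypothetical.

```python
def split_outside_quotes(line, separators=",", quotes="\"'"):
    """Split on separators, ignoring those inside quoted strings.

    A doubled quote inside a quoted string is an embedded quote. An
    opening quote that never closes swallows the rest of the line.
    """
    words, current, open_quote = [], [], None
    i = 0
    while i < len(line):
        ch = line[i]
        if open_quote is None:
            if ch in quotes:
                open_quote = ch          # a quoted string starts here
                current.append(ch)
            elif ch in separators:
                words.append("".join(current))
                current = []
            else:
                current.append(ch)
        elif ch == open_quote:
            if i + 1 < len(line) and line[i + 1] == open_quote:
                current.append(ch * 2)   # embedded quote, stay inside
                i += 1
            else:
                open_quote = None        # the quoted string ends here
                current.append(ch)
        else:
            current.append(ch)           # separators are ignored in quotes
        i += 1
    words.append("".join(current))
    return words


print(split_outside_quotes('a,"b,c",d'))
print(split_outside_quotes('a,"b,c,d'))  # unclosed quote
```

Note that under these rules an apostrophe in ordinary text also opens a quoted string, which is the confusion with punctuated English text mentioned earlier in the thread.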

Post by Bill »

Thank you.

The option to "ignore separators embedded in quoted strings" should work well for comparing Excel XLS data, whether saved as CSV files or converted with the "XLS to CSV" plug-in that came with ExamDiff Pro.