Compare split lines with non-split lines

General questions about using ExamDiff Pro, ideas for new features, bug reports, and usage tips.
fiddyschmitt
New Member
Posts: 3
Joined: Wed Jul 03, 2019 10:14 pm

Compare split lines with non-split lines

Post by fiddyschmitt »

Hi,

I'm comparing a Word document (left) with a PDF file (right). The PDF file was generated from a newer revision of the Word document.

The PDF has hard line breaks in places where the Word document just visually wraps to the next line. As a result, ExamDiff concludes that the two files are very different.

Is there a way to get ExamDiff to examine sentences rather than lines?

I've tried:
  • Replacing \n with spaces in ExamDiff
  • Playing with the wrap type and width settings in ExamDiff
  • Converting the Word document to PDF, but this produces a PDF that wraps slightly differently from the other PDF
  • Playing with the margins in Microsoft Word to achieve the same PDF wrapping
but none of that worked...

Thanks,
Fidel
Attachments
screenshot.JPG (250.02 KiB) Viewed 6965 times
psguru
Site Admin
Posts: 2228
Joined: Sat May 15, 2004 4:23 pm
Location: California

Re: Compare split lines with non-split lines

Post by psguru »

No, there's really no way to do it in EDP. If there were a tool that converted line breaks within sentences to spaces, it could be used as a plug-in in EDP, but I don't know of such a tool.
psguru
PrestoSoft
fiddyschmitt
New Member
Posts: 3
Joined: Wed Jul 03, 2019 10:14 pm

Re: Compare split lines with non-split lines

Post by fiddyschmitt »

No worries, thanks guru.

In the end I used this process to make the files comparable:

- From ExamDiff, copy the text from the left pane into Notepad++
- In Notepad++
   //The following removes the line breaks (\r\n), joining the wrapped lines into one
   Press Ctrl+H
      Find what: \r\n
      Replace with: (leave this blank)
      Search mode: Regular Expression
      Click 'Replace All'

   //The following places each sentence on its own line
   Press Ctrl+H
      Find what: \.
      Replace with: \n
      Search mode: Regular Expression
      Click 'Replace All'

- Do the same for the PDF content from ExamDiff
- Now create a new ExamDiff window and compare the two texts from Notepad++
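
For anyone who would rather script this than click through Notepad++, here is a minimal Python sketch of the same idea. It is not exactly the two replacements above: it swaps each line break for a single space (so words at the ends of lines don't fuse together) and it keeps the period at the end of each sentence. The function name one_sentence_per_line is made up for illustration.

```python
import re


def one_sentence_per_line(text: str) -> str:
    """Undo hard wrapping, then put each sentence on its own line.

    Assumption: a sentence ends at a period followed by whitespace.
    Abbreviations such as "e.g. " will therefore also cause a split.
    """
    # Collapse every run of whitespace (including \r\n and \n) to one space.
    flat = " ".join(text.split())
    # Start a new line after each period that is followed by a space.
    return re.sub(r"\. ", ".\n", flat)


if __name__ == "__main__":
    word_text = "The quick brown\r\nfox jumps. It lands\r\nsafely."
    pdf_text = "The quick brown fox\njumps. It\nlands safely."
    # Both inputs wrap differently but normalize to the same two lines.
    print(one_sentence_per_line(word_text))
    print(one_sentence_per_line(word_text) == one_sentence_per_line(pdf_text))
```

Because both texts go through the same function, differences in where the lines were wrapped disappear and only real wording changes remain for ExamDiff to show.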
psguru
Site Admin
Posts: 2228
Joined: Sat May 15, 2004 4:23 pm
Location: California

Re: Compare split lines with non-split lines

Post by psguru »

If it's the process you described, you could write two scripts, one for .DOC files and the other for .PDF, using, say, sed, and use them as additional plug-ins for these respective file types (Options | Tools | Plug-ins).
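
As a rough illustration of what such a script might look like, here is a small sketch in Python rather than sed. It assumes the text has already been extracted from the .DOC or .PDF file, and that the script is called with an input path and an output path; how ExamDiff Pro actually passes file names to a plug-in is not covered in this thread, so check the plug-in settings before relying on that calling convention. The names normalize and convert are hypothetical.

```python
import re
import sys
from pathlib import Path


def normalize(text: str) -> str:
    """Collapse all whitespace, then start a new line after each '. '."""
    flat = " ".join(text.split())
    return re.sub(r"\. ", ".\n", flat)


def convert(src: Path, dst: Path) -> None:
    """Read already-extracted text from src and write the normalized form to dst."""
    text = src.read_text(encoding="utf-8", errors="replace")
    dst.write_text(normalize(text) + "\n", encoding="utf-8")


# Assumed usage: python normalize_text.py <input file> <output file>
if __name__ == "__main__" and len(sys.argv) == 3:
    convert(Path(sys.argv[1]), Path(sys.argv[2]))
```

The same script can serve both file types, since it only touches whitespace and sentence boundaries in plain text.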
psguru
PrestoSoft
fiddyschmitt
New Member
Posts: 3
Joined: Wed Jul 03, 2019 10:14 pm

Re: Compare split lines with non-split lines

Post by fiddyschmitt »

Brilliant, thanks guru