Neil's News


Transposing Diffs

26 June 2007

Diff is notorious for returning results that, while technically correct, don't make sense. Take this example:

Text 1: The cat.
Text 2: The cow and the cat.

The sensible diff would be this (insertions are marked {+like this+}):

Good Diff: The {+cow and the +}cat.

However, the following is also a possibility and may be returned if the diff algorithm feels so inclined:

Bad Diff: The c{+ow +}a{+nd +}t{+he cat+}.

It is possible to transform a bad diff into a good diff. The trick is to slide an edit (an insertion or deletion) sideways whenever it sits next to an equality and the whole text of that equality also appears at the opposite end of the edit. Sliding the edit across the equality brings it up against the neighbouring edit, and the two merge. To illustrate:

Diff 1: The c{+ow +}a{+nd +}t{+he cat+}.
Diff 2: The c{+ow +}a{+nd the ca+}t.
Diff 3: The c{+ow and the c+}at.

Thus the silly three-edit diff has been reduced down to a completely equivalent single-edit diff.
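
The sliding step can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code in the Diff, Match and Patch libraries: the (operation, text) tuple format and the names merge_adjacent and slide_edits are made up for this example, and only edits that have an equality on both sides are considered.

```python
EQUAL, INSERT, DELETE = "=", "+", "-"  # hypothetical operation markers


def merge_adjacent(diff):
    """Join neighbouring tuples that share an operation; drop empty ones."""
    merged = []
    for op, text in diff:
        if not text:
            continue
        if merged and merged[-1][0] == op:
            merged[-1] = (op, merged[-1][1] + text)
        else:
            merged.append((op, text))
    return merged


def slide_edits(diff):
    """Slide edits across equalities until no further slide is possible."""
    diff = merge_adjacent(diff)
    changed = True
    while changed:
        changed = False
        for i in range(1, len(diff) - 1):
            (prev_op, prev), (op, edit), (next_op, nxt) = diff[i - 1 : i + 2]
            if op == EQUAL or prev_op != EQUAL or next_op != EQUAL:
                continue  # only look at an edit with an equality on each side
            if edit.endswith(prev):
                # Slide left: the preceding equality hops over the edit.
                diff[i - 1 : i + 2] = [
                    (op, prev + edit[: -len(prev)]),
                    (EQUAL, prev + nxt),
                ]
                changed = True
                break
            if edit.startswith(nxt):
                # Slide right: the following equality hops over the edit.
                diff[i - 1 : i + 2] = [
                    (EQUAL, prev + nxt),
                    (op, edit[len(nxt):] + nxt),
                ]
                changed = True
                break
        # Edits that now touch each other collapse into one.
        diff = merge_adjacent(diff)
    return diff


# One possible three-edit diff of "The cat." -> "The cow and the cat."
three_edits = [
    (EQUAL, "The c"), (INSERT, "ow "), (EQUAL, "a"), (INSERT, "nd "),
    (EQUAL, "t"), (INSERT, "he cat"), (EQUAL, "."),
]
print(slide_edits(three_edits))
# [('=', 'The c'), ('+', 'ow and the c'), ('=', 'at.')]
```

Every slide replaces three tuples with two, so the loop always terminates. The result still describes exactly the same pair of texts; only the position of the edit has changed.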

However, this is still not ideal. To increase the diff's human readability, a further transposition to line the edits up with word boundaries would be preferable.

Diff 4: The {+cow and the +}cat.
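
One way this second transposition might work, sketched in Python: slide the edit as far left as it will go, then step it right one character at a time and keep the position whose two ends sit best on word boundaries. The names boundary_score and align_to_words and the crude scoring rule (a join counts as a boundary unless both neighbouring characters are alphanumeric) are assumptions made for this illustration, not the libraries' implementation.

```python
def boundary_score(left, right):
    """Return 1 if the join between left and right looks like a word boundary."""
    if not left or not right:
        return 1  # the very start or end of the text counts as a boundary
    return 0 if left[-1].isalnum() and right[0].isalnum() else 1


def align_to_words(before, edit, after):
    """Shift one edit (with equalities on both sides) onto word boundaries."""
    # Slide the edit as far left as it can go.
    while before and edit and before[-1] == edit[-1]:
        ch = before[-1]
        before, edit, after = before[:-1], ch + edit[:-1], ch + after
    best = (before, edit, after)
    best_score = boundary_score(before, edit) + boundary_score(edit, after)
    # Step right one character at a time, remembering the best position.
    while edit and after and edit[0] == after[0]:
        ch = edit[0]
        before, edit, after = before + ch, edit[1:] + ch, after[1:]
        score = boundary_score(before, edit) + boundary_score(edit, after)
        # ">=" lets ties go to the rightmost position, so the edit keeps
        # trailing rather than leading whitespace.
        if score >= best_score:
            best, best_score = (before, edit, after), score
    return best


# A single-edit diff of "The cat." -> "The cow and the cat." that cuts
# through the middle of two words.
print(align_to_words("The c", "ow and the c", "at."))
# ('The ', 'cow and the ', 'cat.')
```

Both "The{+ cow and the+} cat." and "The {+cow and the +}cat." score equally under this rule; breaking the tie towards the right is what selects the second form.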

The first type of post-diff transposition (Diff 3) has been added to the Diff, Match and Patch libraries. The second type (Diff 4) will follow soon.

I've also made a large number of updates to my Diff Strategies paper. One of the intended changes was to move from hard-coded syntax highlighting of sample code to client-side rendering. Unfortunately the best library I could find was way too buggy and the developer appears to be unresponsive. So I reverted all that work.


Someone Digged my image-to-HTML converter, resulting in a panicked shutdown of the script by Digital Routes when the load average spiked. I quickly reprogrammed the script to cache the common case of converting Tux and installed a rate-limiter for non-cached conversions.


While cycling up Shepherd Canyon in Oakland on Sunday, I came across this humorously modified sign. The view from the top was pretty sweet too.
