Diff, Match and Patch

Demo of Match

Match looks for a pattern within a larger text. This implementation of match is fuzzy, meaning it can find a match even if the pattern contains errors and doesn't exactly match what is found in the text. This implementation also accepts an expected location, near which the match should be found. The candidate matches are scored based on a) the number of spelling differences between the pattern and the text and b) the distance between the candidate match and the expected location. The match distance parameter sets the relative importance of these two metrics.
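The scoring described above can be sketched as follows. This is a minimal illustration, not the library's code: it assumes the score is the fraction of mismatched characters plus the offset from the expected location divided by the match distance, with lower scores being better. The function and parameter names are hypothetical.

```python
def match_score(errors, pattern_len, candidate_loc, expected_loc, match_distance):
    """Score a candidate match; 0.0 is perfect and lower is better.

    Assumed formula (illustrative): error fraction + offset / match_distance.
    """
    accuracy = errors / pattern_len                # a) spelling differences
    proximity = abs(expected_loc - candidate_loc)  # b) distance from expected location
    if match_distance == 0:
        # No positional tolerance: any offset counts as a complete mismatch.
        return 1.0 if proximity else accuracy
    return accuracy + proximity / match_distance


# A larger match distance makes the same offset cost less.
print(match_score(0, 10, 150, 50, 1000))
print(match_score(0, 10, 150, 50, 100))
```

Under this assumption, raising the match distance shifts weight toward spelling accuracy, while lowering it shifts weight toward location.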

Text:

Fuzzy pattern:


Approximate pattern to search for in the text. Due to limitations of the Bitap algorithm, the pattern has a limited length.

Fuzzy location:


Approximately where in the text is the pattern expected to be found?

Match distance:


Determines how close the match must be to the fuzzy location (specified above). An exact letter match that is 'distance' characters away from the fuzzy location would score as a complete mismatch. A distance of '0' requires the match to be at the exact location specified. A distance of '1000' would require a perfect match to be within 800 characters of the fuzzy location to be found using a 0.8 threshold.
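The arithmetic behind the 800-character figure can be checked with a short sketch. It assumes a candidate is accepted when its error fraction plus offset / distance stays at or below the threshold. The helper name `max_offset` is hypothetical.

```python
def max_offset(errors, pattern_len, match_distance, threshold):
    """Largest offset from the fuzzy location at which a candidate with
    `errors` mismatched characters still scores at or below `threshold`.

    Returns -1 if the candidate cannot be accepted at any location.
    Assumes score = errors / pattern_len + offset / match_distance.
    """
    budget = threshold - errors / pattern_len  # score left over for location
    if budget < 0:
        return -1
    return int(budget * match_distance)


print(max_offset(0, 10, 1000, 0.8))  # perfect match: 800
print(max_offset(0, 10, 0, 0.8))     # distance 0: must sit exactly at the location, 0
```

Each spelling error uses up part of the threshold, so an imperfect match must sit closer to the fuzzy location than a perfect one.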

Match threshold:


The point at which the match algorithm gives up. A threshold of '0.0' requires a perfect match (of both letters and location); a threshold of '1.0' would match anything.
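The effect of the threshold can be seen in a small brute-force search. This is a sketch only: it is not the Bitap algorithm, it counts substitutions but not insertions or deletions, and the name `fuzzy_find` and its defaults are hypothetical.

```python
def fuzzy_find(text, pattern, expected_loc, match_distance=1000, threshold=0.5):
    """Brute-force stand-in for a fuzzy match (not Bitap).

    Scores every alignment of `pattern` in `text` and returns the start
    index of the best one, or -1 if none scores at or below `threshold`.
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    best_loc, best_score = -1, None
    for start in range(len(text) - len(pattern) + 1):
        window = text[start:start + len(pattern)]
        # Substitutions only (Hamming distance), to keep the sketch short.
        errors = sum(a != b for a, b in zip(pattern, window))
        offset = abs(expected_loc - start)
        if match_distance:
            score = errors / len(pattern) + offset / match_distance
        else:
            score = 1.0 if offset else errors / len(pattern)
        if score <= threshold and (best_score is None or score < best_score):
            best_loc, best_score = start, score
    return best_loc


text = "The quick brown fox"
print(fuzzy_find(text, "quikc", 4, threshold=0.5))   # two of five letters differ
print(fuzzy_find(text, "brown", 0, threshold=0.0))   # exact letters, wrong location
print(fuzzy_find(text, "brown", 10, threshold=0.0))  # exact letters, exact location
```

With a threshold of 0.0 the second call fails even though "brown" appears verbatim, because the offset from the expected location alone pushes the score above zero.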

