It's nice to look at, but it doesn't actually work that well, because it has no real way of doing stream alignment other than simple equality. (Stream alignment means figuring out the points where two streams become equal; this is what the edit distance calculation gives most diff algorithms.)
So various forms of lines that have been moved will always be shown as replacements, for example (I see someone else discovered this).
Most of the cost and complexity of existing diff algorithms is essentially stream alignment (that's what the dynamic programming problem they solve tells them), so yes, removing stream alignment will in fact make your diff algorithm "simple" :)
The fix function is also just a hack for having no real way to align streams on a per-character basis, so it has no way of, for example, maximally extending matches until it's done.
You could avoid a lot of what it does by using characters, rather than lines, as your basis.
The direct translation here would be:
1. Build segments by starting at the beginning of both files, incrementing one file until they match, and then incrementing the other until they stop matching. This gives you one or two segments: the mismatch segment, which may be empty but represents the part where they don't match, and the matched segment, which will be maximal instead of split on line boundaries.
This is O(min of the two file sizes), just like the current hash building. You do have to keep track of line numbers, but that doesn't change the time bounds.
2. Hash/index the segments the same way you are hashing/indexing lines.
3. Do the rest of the described algorithm.
The only upside/downside is that you end up with partially changed lines (i.e., there may be more than one segment per line), but here you can detect that immediately (you can even mark where it has happened while you build segments) and transform those into replacements if that's what you want, instead of trying to notice later that this is what happened.
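A minimal sketch of step 1 in Python, under the assumption that on a mismatch you simply scan one side forward until it lines up again (the function and names here are invented for illustration, not anyone's actual code):

```python
def build_segments(a: str, b: str):
    """Greedy character-level segmenting: emit alternating matched and
    mismatched spans. Matched spans are extended maximally instead of
    being split on line boundaries. A sketch, not a real diff."""
    i = j = 0
    out = []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            # extend the match maximally
            s_i, s_j = i, j
            while i < len(a) and j < len(b) and a[i] == b[j]:
                i, j = i + 1, j + 1
            out.append(('match', a[s_i:i]))
        else:
            # advance one side until it lines up with the other again
            s_i = i
            while i < len(a) and a[i] != b[j]:
                i += 1
            out.append(('mismatch', a[s_i:i]))
    # whatever is left over on either side is a final mismatch
    if i < len(a) or j < len(b):
        out.append(('mismatch', a[i:] + b[j:]))
    return out
```

On "hello world" vs "hello there", for example, this yields one maximal matched segment "hello " followed by the two mismatched tails; tracking line numbers alongside `i`/`j` is a constant-factor addition.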
If you wanted something super simple, but more accurate than this, you could use the Rabin–Karp algorithm: https://en.wikipedia.org/wiki/Rabin%E2%80%93Karp_algorithm
Split file2 into x-sized blocks, and use them as patterns you search for in file1. Simple, and the same worst case as naive edit distance.
Note that if you had a good enough rolling hash, you could do the same thing in O(N) time by not doing the equality check in the hashtable, and instead just issuing a replacement if it turned out you were wrong :)
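A hedged sketch of that rolling-hash variant (the hash parameters and function names are mine, not from any particular tool): hash fixed-size blocks of file2, then slide a polynomial rolling hash over file1 in O(N), trusting the hash rather than doing a final equality check. A collision would just surface as a spurious "match" that a real tool could emit as a replacement.

```python
BASE, MOD = 257, (1 << 61) - 1  # arbitrary rolling-hash parameters (assumption)

def block_matches(file1: str, file2: str, x: int):
    """Return {block_index_in_file2: position_in_file1} for x-sized blocks."""
    blocks = {}  # hash -> block index in file2
    for k in range(0, len(file2) - x + 1, x):
        h = 0
        for c in file2[k:k + x]:
            h = (h * BASE + ord(c)) % MOD
        blocks[h] = k // x

    found = {}
    if len(file1) < x:
        return found
    # rolling hash over file1: O(1) update per position
    h = 0
    for c in file1[:x]:
        h = (h * BASE + ord(c)) % MOD
    top = pow(BASE, x - 1, MOD)  # weight of the outgoing character
    for pos in range(len(file1) - x + 1):
        if h in blocks and blocks[h] not in found:
            found[blocks[h]] = pos  # trusting the hash: no equality check
        if pos + x < len(file1):
            h = ((h - ord(file1[pos]) * top) * BASE + ord(file1[pos + x])) % MOD
    return found
```

With the equality check added back inside the `if h in blocks` branch, this degrades to the same worst case as the naive version; without it, it stays O(N).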
I don't think it is -- it's full of the kind of features (complicated condition expressions, long chains of elses, hard-coded numbers) that, when I find myself writing them, suggest that I'm approaching the problem in the wrong way, because previous experiences make me associate that kind of code with lack of generality and corner-case bugs.
which is what most diff tools use. It is a simple and clear algorithm with no special cases once you get your head around it. It works in O(NxM) time, which seems like a reasonable lower bound (you need to compare everything to everything else to have a chance of getting the best alignment), although there are ways to do better with constraints.
(I remember one for gene alignment which broke N and M in two, recursed 4 times on each pairing, and then had a quick-ish way to put those together again. Can't remember the details though!)
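For reference, the O(NxM) dynamic program being described can be sketched in a few lines of Python; `a` and `b` here would be the lines of the two files, and this is the textbook edit-distance table, not any particular tool's code:

```python
def edit_distance(a, b):
    """dp[i][j] is the cost of aligning the first i items of a with the
    first j items of b; filling the table is O(len(a) * len(b))."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        dp[i][0] = i                 # delete everything: i deletions
    for j in range(len(b) + 1):
        dp[0][j] = j                 # insert everything: j insertions
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            dp[i][j] = min(
                dp[i - 1][j] + 1,    # delete a[i-1]
                dp[i][j - 1] + 1,    # insert b[j-1]
                dp[i - 1][j - 1] + (a[i - 1] != b[j - 1]),  # match/replace
            )
    return dp[len(a)][len(b)]
```

Tracing back through the table (which cell each minimum came from) is what turns the cost into the actual alignment a diff displays.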