Cursor Preservation

by Neil Fraser, August 2009

Collaborative editors have to combine input coming from the local user and from remote users.  One of the significant user interface challenges is cursor behaviour.  When a remote user's change is merged into the document, what happens to the local user's cursor?  Likewise, what happens to a selection which the local user has made?

Failure to preserve the cursor and selection can result in the cursor jumping to the top of the document and selections becoming unselected every time a remote change occurs.  This behaviour can make such an editor completely unusable.

First, let's define the related concepts of cursors and selections.  A text cursor indicates one point in the document where user keystrokes will be inserted.  A selection indicates the highlighted range between two points in the document.  Under most (but not all) environments there can be at most one cursor and one selection.  Furthermore, the cursor would normally be locked to either the left or the right side of the selection (if there is one).  Thus the cursor and selection can be completely described with two points (the selection start and finish) and a Boolean (which end the cursor is located at).  If there is no selection, then the start and end points are the same (often described as 'collapsed') and the Boolean is not relevant.

Absolute Referencing

The most common cursor preservation technique is to perform absolute referencing.  Under this model the start and finish locations are recorded in terms of character offsets from the document start, then the remote changes are applied while keeping track of deltas on both points.  If an insertion is made ahead of one of these points, then the point's location is incremented by the length of the insertion.  Likewise if a deletion is made ahead of one of these points, then the point's location is decremented by the length of the deletion.  If a deletion is made which encompasses the point, then the point's location is decremented by the distance between it and the start of the deletion.  Insertions and deletions made behind a point have no effect on the point's location.

Here is an example of absolute referencing.  The cursor is currently at offset 24, just before the word "slithy":

`Twas brillig, and the ^slithy toves

The following edits arrive from a remote user (strike-through represents a deletion, underline represents an insertion):

`Twas brillig, and& the slithy toves

Three characters were deleted and one was added.  If there were no deltas made to the cursor offset, the cursor would shift by two characters:

`Twas brillig, & the sl^ithy toves

By subtracting three and adding one to the cursor location, the cursor is moved to the expected location:

`Twas brillig, & the ^slithy toves

Here's another example involving a selection:

`Twas brillig, and the slithy toves

The same two edits arrive:

`Twas brillig, and& the slithy toves

Without any deltas, the six-character selection would slip sideways:

`Twas brillig, & the slithy toves

The start of the selection was inside a deleted chunk, so its location gets decremented by the number of characters between the start of the deletion and the selection start.  The following insertion of one character occurs after the selection start thus has no effect:

`Twas brillig, & the slithy toves

Finally the end of the selection is moved back by three and forwards by one, making a total delta of negative two:

`Twas brillig, & the slithy toves

Had the same edits arrived in the opposite order:

`Twas brillig, &and the slithy toves

Then the final result would have been different, but equally acceptable:

`Twas brillig, & the slithy toves

This technique is relatively simple and produces a reasonably good user experience.  Google Wave is an editor which performs absolute referencing.

Absolute Referencing Problems

While good (and depending on the application, potentially "good enough"), absolute referencing is not perfect.  Consider the following case:

`Twas brillig, and the slithy toves
  Did gyre and ^gimble in the wabe:
All mimsy were the borogoves,
  And the mome raths outgrabe.

If the middle two lines were swapped, the difference might look like this:

`Twas brillig, and the slithy toves
  Did gyre and gimble in the wabe:
All mimsy were the borogoves,
  Did gyre and gimble in the wabe:
  And the mome raths outgrabe.

Given that the original position of the cursor was in the middle of line 2, and that line 2 was completely deleted, then absolute referencing would set the new location of the cursor at the beginning of line 2:

`Twas brillig, and the slithy toves
^All mimsy were the borogoves,
  Did gyre and gimble in the wabe:
  And the mome raths outgrabe.

However the user would expect their cursor to appear in the middle of line 3.

A potential solution would be to extend the differencing algorithm to include not only insertions and deletions, but also moves.  Thus if a block containing a cursor or selection point is moved to a new location, then that point is moved along with it.  However this solution fails to address the case of moves with modifications:

`Twas brillig, and the slithy toves
  Did gyre and gimble in the wabe:
All mimsy were the borogoves,
  Did Gyre & Gimble in the Wabe:
  And the mome raths outgrabe.

No known differencing algorithm would interpret this as a move plus some edits.

Context Matching

A completely different approach to restoring cursor and selection locations is to use fuzzy matching.  For each point of interest a number of characters before and after the point are recorded, as well as the absolute offset.  Once the remote edits are applied, a fuzzy matching algorithm is used to search for the best match for the previously recorded context, weighted around the area where the result is expected.  If such a match is found, then the point's new location has been established.

In the above case the prefix might be "yre and " while the suffix might be "gimble i" (in practice the context would be longer, but this example only uses eight characters for brevity).  The offset from the start of the document would be 52.  After the edits are applied, the matching algorithm start looking for "yre and gimble i" at around position 52.  The best match would be "yre & Gimble i" (a Levenshtein distance of 4) at position 80 (an offset distance of 28).

Once a match is obtained, an exact position for the point needs to be determined.  The original point was in the middle of the original context string, but where is it located in the newly matched string?  To find out, a detailed difference between the two strings is computed:

yre and& gGimble i

Using the difference as framework, we can now map the old cursor location to the new string:

`Twas brillig, and the slithy toves
All mimsy were the borogoves,
  Did Gyre & ^Gimble in the Wabe:
  And the mome raths outgrabe.

One issue with context matching is that a point at the start of the document will not obtain a contextual prefix.  Likewise a point at the end of the document will not obtain a contextual suffix.  The result is that the following matching effort will be significantly weaker since there is less context to match against.  This is unfortunate since it is extremely common for a cursor to be at the start or end of a document.  One solution is to increase the size of the context on the opposite side such that the total amount of context remains constant.  Another solution is to generate a short string of random characters (preferably unprintable Unicode) and append it to the beginning and end of the text before acquiring and restoring the points, such that the match algorithm has something to match against.

Context matching relies on a rather complicated fuzzy matching algorithm.  Good results have been obtained from the efficient Bitap algorithm, modified to weight results by both the Levenshtein distance and the offset distance.  Google Documents is an editor which performs context matching.

Context Matching Problems

While context matching robustly solves the move (with potential modification) case which absolute referencing failed at, context matching has an issue of its own.

Consider the case of repeated strings:

  Did gyre and gimble in the wabe:
  Did gyre and ^gimble in the wabe:

In this case there are two lines, both identical with the cursor is in the middle of the second line.  Next, a remote edit arrives adding a new line to the start:

`Twas brillig, and the slithy toves
  Did gyre and gimble in the wabe:
  Did gyre and gimble in the wabe:

The matching algorithm starts looking for "yre and gimble i" at around position 51.  And it quickly finds an excellent match with Levenshtein distance of 0 and an offset distance of just 1:

`Twas brillig, and the slithy toves
  Did gyre and ^gimble in the wabe:
  Did gyre and gimble in the wabe:

The result is that the user's cursor jumped up one line.  Had the matching algorithm been given a larger context (e.g. "abe:\n  Did gyre and gimble in the wabe:\0") there would have been sufficient information to make the second match yield a higher score despite being further away.  However, increasing the length of the context not only reduces the efficiency of the matching algorithm, it also reaches a point where accuracy is reduced too -- such as an inability to detect moves like the one which the absolute referencing algorithm failed at above.

Context Matching with Absolute Referenced Offsets

Merging the two algorithms allows each to solve the other's problems.  The strategy is conceptually simple:

  1. Capture the context and absolute offset of each point (the first step of context matching).
  2. Apply the insertions and deletions while updating the absolute offsets of each point (the absolute referencing algorithm).
  3. Restore the points on the new text based on the context and delta adjusted offsets (second step of context matching).

Injecting the absolute referencing algorithm in the middle of the context matching algorithm corrects the offsets such that matches are targeted to the expected location in the new text rather than the previous location in the old text.

Consider an escalation of the previous case:

  Did gyre and gimble in the wabe:
  Did gyre and ^gimble in the wabe:

The context recorded as "yre and gimble i" and the offset is 51.  If a first line is added and the last line is edited, the difference would be:

`Twas brillig, and the slithy toves
  Did gyre and gimble in the wabe:
  Did gGyre and& gGimble in the wWabe:

The absolute referencing algorithm would add 36 to the offset to compensate for the insertion of the first line, would both subtract and add 1 for the case change on the letter 'g', and would subtract 3 and add 1 for the change of 'and'.  The result would be an offset of 85.

Next the fuzzy matching locates two potential matches.  One is "yre and gimble i" at offset 52 (a Levenshtein distance of 0 and a delta offset of 33).  The other is "yre & Gimble i" at offset 85 (a Levenshtein distance of 3 and a delta offset of 0).

Based on how the scoring formula in the matching algorithm is tuned, either answer might be chosen.  If the score ranks a small Levenshtein distance over a small offset distance, then the first match would be chosen, otherwise the second match would be chosen.  Indeed the expected result in this case is unclear.  Thus the tuning of the scoring formula becomes a matter of usability studies.

The combined algorithm is more complex, but significantly more accurate than either algorithm alone.  MobWrite is an editor which uses context matching with absolute referenced offsets.

Scrollbar Preservation

Once the cursor has been properly restored, the next challenge is restoration of the scrollbar.  It is not useful for the cursor to be restored to the correct location if the scrollbar snaps the to the top each time an update is made.

Successful scrollbar restoration involves setting the scroll position such that the text which the user is looking at does not move.  Left-right movement is unavoidable in a line-wrapped document, but vertical movement is controllable in the event that the document is longer than the widget it is being displayed in.  The challenge is to identify what the user is looking at.  In the absence of eye-trackers, there are a set of heuristics which provide decent usability:

  1. If the scrollbar was at the upper or lower limits, it should be restored to those limits.  This allows one to stick to the ends and either view the start or track a document in real time as collaborators append to it.
  2. If the cursor is on screen, the user is likely looking at the cursor and the scrollbar should be set to ensure that the restored cursor appears at the same height.
  3. In the absence of any useful reference points, one can only pick a point at the start of the middle line visible on screen and restore the scrollbar such that the chosen point reappears at the same height.

The horizontal scrollbar (if there is one) ought to be treated in the same manner.

-------------------------------------
Last modified: 12 August 2009