My research group has been writing a lot of papers recently. Like any reasonable set of CS folks, we use LaTeX as our typesetting tool of choice. Of course, like any modern set of collaborators, we also all want to be editing the paper at the same time, changing words as others are writing, and completing each-others sentences.
To accomplish this, we have generally turned to Overleaf, which -- like LaTeX itself -- isn't a great solution to the problem, but it is the best solution we have. One of the interesting things about Overleaf is that it supports two completely different models of collaboration, which -- for the purposes of this post and without reference to any external sources -- I'll term "interactive" and "generational".
Interactive Collaboration
Interactive collaboration models a shared workspace. All changes are made simultaneously and with fine-grained (real-time) synchronization. Interactive collaboration seems to work best when edit operations are "local" -- generally modeled by granting participants a notion of a cursor or selection around which their edits are centered.
Interactive collaboration doesn't work very well with dependent or global edits (e.g., trying to make an overarching change to the order of sections). And -- as often happens on Overleaf -- it really doesn't work well when the synchronization system begins to bog down.
Generational Collaboration
Generational collaboration models... well, I'm not exactly sure what it models. Perhaps a series of independent artifacts, individually crafted, but with reference to other previously created artifacts? Regardless, changes are made individually, and merged non-real-time into a final artifact.
Generational collaboration works well for changes that don't overlap and (thus) can be merged automatically. It is, thus, a natural fit for global but distinct changes (e.g., changing all references to a figure; doing a global re-ordering of paragraphs in a the "Results" section while another collaborator works on the "Introduction" section).
Further, generational collaboration can operate in a decentralized and disconnected way, which are distinct advantages, especially on a paper deadline when nobody has time to wait on a slow server. (Or, e.g., on an airplane.)
Combining the Models
So, how do we combine the models to get the benefits of both?
Overleaf's approach is to support the generational collaboration model via git push/pull support, but to require that no interactive edits happen between a git pull and the subsequent git push. This, effectively, makes generational collaboration lower priority than interactive, making it incredibly awkward to leverage this model while others are using the interactive model.
My preference here would, instead, be to start with a decentralized generational model and build in some notion of interactivity. I tend to like disconnected/disjoint tools, so perhaps the way to do this is to have the notion of a multiplayer editor protocol that can work (optionally, in parallel) with a git repository.
Here's a sketch of how that might work:
When a multiplayer (text?) editor opens a file, it would check in the same folder (and parent folders) for a .multiplayer
subdirectory and contained config
file. (Similar to how git
looks for a .git
subdirectory.)
The .multiplayer
config file would contain either the address of a responsible server and a project token to identify the project to that server, or a list of peers to contact directly.
Any multiplayer-capable editor would -- after prompting the user -- get the list of peers to coordinate editing with (possibly via connecting to a server, possibly via direct peer entries).
The server could also handle STUN to make peer-peer communications easier.
Regardless, said multiplayer-capable editor would now be connected to a peer cloud and could use peer-to-peer multicast techniques to keep the document synchronized, publish cursor state, and so on.
So that takes care of the interactive collaboration. How about the generational part?
We could make it so the .multiplayer
directory contains a reference to the most recent point in shared edit history at which an editor had synchronized to the peer cloud.
If -- when your multiplayer editor connects -- your current file matches the recent version in the .multiplayer
directory, then all changes between that version and the current peer cloud version could be automatically applied.
If, on the other hand, your current file differs from the reference version in the .multiplayer
directory, then there's a generational change to merge.
This is the classic "three-way merge" problem that (e.g.) git solves reasonably well.
In the interactive setting, it might even make sense to allow you to request others stop for a second and help with the merge.
Of course, if your text editor of choice doesn't support multiplayer, you could still edit single-player and use a command-line utility to deal with the three-way merges. (Perhaps with a "hang on, let's do the merge together everyone" option to avoid the same problem that overleaf has.)
So, Who's Building It?
Perhaps the most surprising thing for me in my (scant) research around this topic is that nobody has built an interoperable interactive collaboration system that actually supports, e.g., cross-editor cursor sharing; at least not one that is still in action today (I seem to recall that google docs at some point had an API with cursors).
Perhaps the notion of a cursor is so editor-specific that it takes special sauce to do it. Perhaps the idea that you need to run a server makes people want to lock in users to finance said server. And perhaps I just haven't looked hard enough.
In the latter case, let me know! I'd love to try out something different with my group when we next have a big crop of papers ready to go.
No comments:
Post a Comment