A text editor becomes far more useful when it supports collaborative editing. Many tools already do this very well, such as Google Docs and Microsoft Word.
I wanted to design and build such a system, one that handles basic text editing and can scale when needed. So I built a POC (proof of concept) that handles real-time editing, and explored some ideas for making the solution scalable.
There is certainly room for improvement, so please share any ideas that could offer better insight.
Requirements
Let’s define some project requirements to start with:
Basic editing
Realtime collaboration
Basic operational transform
List of joined users
Add / Remove users based on event
Tools
CKEditor 4
Plain and simple JavaScript
Node.js & Express
Socket.io
For operational transformation, I’ve used the CKEditor 5 diff library to apply changes to the editor.
Initial setup
First, we need a Node.js server to host our application and serve the front end from the root endpoint.
I’ve added a redirect at the root endpoint that generates a document ID and then loads the HTML. If documents were stored per user, this step would be unnecessary.
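The redirect step described above could look roughly like the sketch below. It is not the project's actual server code: `generateDocumentId`, `redirectToNewDocument`, and `makeServeEditor` are hypothetical names, the ID scheme is just one simple option, and the handlers are written as plain functions so they can be mounted on an Express app as shown in the trailing comment.

```javascript
// Hypothetical helper: a short, non-empty random id.
// Not collision-proof; good enough for a proof of concept.
function generateDocumentId() {
  return Date.now().toString(36) + Math.random().toString(36).slice(2, 8);
}

// Express-style handler for "/": send the visitor to a fresh document URL.
function redirectToNewDocument(req, res) {
  res.redirect('/' + generateDocumentId());
}

// Express-style handler for "/:documentId": serve the editor page.
// `htmlPath` is an assumed absolute path to the front-end HTML file.
function makeServeEditor(htmlPath) {
  return function serveEditor(req, res) {
    res.sendFile(htmlPath);
  };
}

// Wiring, assuming `app` is an Express application and `path` is Node's path module:
//   app.get('/', redirectToNewDocument);
//   app.get('/:documentId', makeServeEditor(path.join(__dirname, 'index.html')));
```

On the client, the document ID is then simply the path of the page URL, which is what the `writer.js` snippet reads.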
// writer.js
const documentId = new URL(window.location.href).pathname.substring(1);
const editor = CKEDITOR.instances.textarea;
// Get data
editor.getData();
// Set data
editor.setData("Hello world");
Collaborative editing
To add collaboration to our editor, there are different approaches we can take.
Asynchronous collaborative editing
In this approach, the editor synchronizes content in response to a user event. For example, after the user manually saves the document or triggers an event, the content is delivered to the other clients associated with that document. Publishing local content requires a manual action.
Real-time editing
The content of the editor is synchronized with the other clients in real time.
We will be moving forward with the real-time sync approach.
Socket
In this phase, we are going to add a socket implementation to our project.
    // Let all the clients know about new user to the document
    socket.to(room).emit('register', { id: socket.id, name: data.handle });
    // Post all the user list registered to this document
    io.in(room).emit('members', roomMembers[room]);
  });
  socket.on('disconnect', function (data) {
    console.log("Disconnected")
    let room = socketRoom[socket.id];
    if (room) {
      roomMembers[room] = roomMembers[room].filter(function (item) {
        return item.id !== socket.id
      });
      if (roomMembers[room].length == 0) {
        delete roomMembers[room];
      }
      delete socketRoom[socket.id];
      socket.to(room).emit('user_left', { id: socket.id });
    }
  });
});
...
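The start of the listing above is not shown, so the following is only a sketch of how the membership bookkeeping it relies on could be isolated into two small functions. `roomMembers` and `socketRoom` are the names used in the listing; `addMember` and `removeMember` are hypothetical helpers introduced here for illustration, not part of the original code.

```javascript
// In-memory bookkeeping, mirroring the names used in the listing.
const roomMembers = {}; // room id   -> [{ id, name }, ...]
const socketRoom = {};  // socket id -> room id

// Hypothetical helper: record that a socket joined a room.
// Returns the current member list of that room.
function addMember(room, id, name) {
  if (!roomMembers[room]) roomMembers[room] = [];
  roomMembers[room].push({ id, name });
  socketRoom[id] = room;
  return roomMembers[room];
}

// Hypothetical helper: forget a socket.
// Returns the room it was in, or null if it was never registered.
function removeMember(id) {
  const room = socketRoom[id];
  if (!room) return null;
  roomMembers[room] = roomMembers[room].filter((member) => member.id !== id);
  if (roomMembers[room].length === 0) delete roomMembers[room]; // drop empty rooms
  delete socketRoom[id];
  return room;
}
```

Keeping this state handling separate from the socket callbacks makes it easy to check on its own; the `disconnect` handler would then only need to call `removeMember(socket.id)` and emit `user_left` to the returned room.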
Collaborative editing raises several kinds of issues. The most critical is conflict resolution: when multiple users change the same text at the same time, their edits can conflict, and that conflict has to be resolved for real-time collaboration to work. Several algorithms address this to some extent, each with different pros and cons depending on the situation. The most popular is operational transformation.
Operational transformation (OT) is a technology for supporting a range of collaboration functionalities in advanced collaborative software systems. OT was originally invented for consistency maintenance and concurrency control in collaborative editing of plain text documents.
Implementing OT is complex and takes a lot of time to get right. There is a well-known quote from Joseph Gentle, one of the early implementers of OT:
Unfortunately, implementing OT sucks. There’s a million algorithms with different tradeoffs, mostly trapped in academic papers. The algorithms are really hard and time consuming to implement correctly. […] Wave took 2 years to write and if we rewrote it today, it would take almost as long to write a second time.
Let’s look at an example of operational transformation. Say we have the text CA and two different users each perform an operation:
String --> CA
User 1 --> CAT (operation O1)
User 2 --> HAT (operation O2)
O1 = [insert T at position 2]
O2 = [insert T at position 2, delete C at position 0, insert H at position 0]
For User 1, the local operation is O1 and the incoming operation is O2.
For User 2, the local operation is O2 and the incoming operation is O1.
To transform, we have to apply both sets of changes to the string. The transformation looks like this:
For User 1, String --> O2 --> O1
String --> CA
O2 --> HAT
O1 --> HAT (Result)
For User 2, String --> O1 --> O2
String --> CA
O1 --> CAT
O2 --> HAT (Result)
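The states are synchronized for both users. Note that O1 and O2 both insert T at position 2, so whichever is applied second has to be transformed to skip that insert; replaying it verbatim would produce HATT.
This way we get the same result at the end of the transformation, and every change we make is applied on top of the last synced value.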
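To make the idea of transforming one operation against another more concrete, here is a deliberately small sketch. It is not the code used in this project and not a general OT implementation: it only handles single-character inserts and deletes, and the operation shape (`type`, `pos`, `ch`, `site`) is an assumption made for this example. It also uses a simpler pair of concurrent operations than O1 and O2 above: User 1 inserts T at position 2 while User 2 inserts H at position 0.

```javascript
// Toy operational transform for single-character operations on a string.
// Assumed operation shape:
//   { type: 'insert', pos, ch, site }  or  { type: 'delete', pos, site }
// `site` is a hypothetical user id, used only to break ties between inserts.

function apply(text, op) {
  if (op === null) return text; // null means "nothing left to do"
  if (op.type === 'insert') {
    return text.slice(0, op.pos) + op.ch + text.slice(op.pos);
  }
  return text.slice(0, op.pos) + text.slice(op.pos + 1);
}

// Rewrite `op` so that it can be applied AFTER the concurrent operation `other`.
function transform(op, other) {
  if (op === null || other === null) return op;
  if (other.type === 'insert') {
    // A character was inserted at or before our position: shift right.
    const shifts =
      op.type === 'insert'
        ? other.pos < op.pos || (other.pos === op.pos && other.site < op.site)
        : other.pos <= op.pos;
    return shifts ? { ...op, pos: op.pos + 1 } : op;
  }
  // `other` is a delete.
  if (op.type === 'delete' && op.pos === other.pos) return null; // already removed
  return other.pos < op.pos ? { ...op, pos: op.pos - 1 } : op;
}

const base = 'CA';
const o1 = { type: 'insert', pos: 2, ch: 'T', site: 1 }; // User 1
const o2 = { type: 'insert', pos: 0, ch: 'H', site: 2 }; // User 2

// Each user applies their own operation first, then the transformed remote one.
const atUser1 = apply(apply(base, o1), transform(o2, o1));
const atUser2 = apply(apply(base, o2), transform(o1, o2));
// Both users end up with "HCAT".
```

The key point is that the remote operation is never replayed as-is: its position is adjusted for what has already happened locally, which is what lets both sides converge on the same text.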
For this project, I used CKEditor 5’s diffToChanges util to compute the changes (operations) relative to the sync state and then apply them in the editor.
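As a rough illustration of what applying such changes involves, the sketch below replays a list of changes on a string. The change shape (`type`, `index`, `values`, `howMany`) is modelled on what CKEditor 5 documents for diffToChanges, but treat it as an assumption here: `applyChanges` is a hypothetical helper, and the change list is written by hand rather than produced by the library.

```javascript
// Apply a list of changes to `text`, one after another.
// Assumed change shape:
//   { type: 'insert', index, values: [...] }  or  { type: 'delete', index, howMany }
// Each index refers to the text as it looks after the previous changes.
function applyChanges(text, changes) {
  const chars = Array.from(text);
  for (const change of changes) {
    if (change.type === 'insert') {
      chars.splice(change.index, 0, ...change.values);
    } else if (change.type === 'delete') {
      chars.splice(change.index, change.howMany);
    }
  }
  return chars.join('');
}

// Hand-written changes that turn "CA" into "HAT".
const changes = [
  { type: 'delete', index: 0, howMany: 1 },    // "CA" -> "A"
  { type: 'insert', index: 0, values: ['H'] }, // "A"  -> "HA"
  { type: 'insert', index: 2, values: ['T'] }, // "HA" -> "HAT"
];
const result = applyChanges('CA', changes);
```

Sending a compact change list like this over the socket, instead of the whole document, is what keeps each update small.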
The most difficult part of OT is not the code, but the difficulty in proving that your system is correct. Therefore, maintaining OT code is difficult. Either you need to prove your code correct repeatedly (and historically people make mistakes), or you need a powerful testing infrastructure for concurrent/distributed systems (which is also hard to write)1
Now we need to apply the transformation to the editor content. In the code, I publish changes to the socket after the user has stopped typing. At that point, the client compares its content with the sync state, applies the resulting changes to the sync state, and publishes them to the socket.
When a change is received, the client first applies it to the sync state. It then reapplies any local changes on top of that state and sets the result as the editor’s value. Any remaining local changes are then published to the socket again.
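The "publish after the user has stopped typing" step is essentially a debounce. A minimal sketch is shown below; `debounce` is a hypothetical helper, the 500 ms delay and the `'change'` socket event name in the usage comment are assumptions, and the `timers` parameter exists only so the behaviour can be exercised without actually waiting.

```javascript
// Returns a wrapper that runs `fn` only after `delayMs` without further calls.
// `timers` is injectable so the sketch can be driven by fake timers.
function debounce(
  fn,
  delayMs,
  timers = {
    set: (callback, ms) => setTimeout(callback, ms),
    clear: (handle) => clearTimeout(handle),
  }
) {
  let handle = null;
  return function debounced(...args) {
    if (handle !== null) timers.clear(handle); // user is still typing: restart the wait
    handle = timers.set(() => {
      handle = null;
      fn(...args); // only the most recent call gets through
    }, delayMs);
  };
}

// Usage sketch, assuming `editor` is the CKEditor 4 instance and `socket`
// is a connected Socket.IO client:
//   const publish = debounce(() => socket.emit('change', editor.getData()), 500);
//   editor.on('change', publish);
```

A longer delay means fewer messages but a longer wait before other users see an edit, so the value is a trade-off rather than a fixed rule.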