Programming Musings: November 2012

In this post I'm going to talk about projectional editors, how they work and what are they good for. I also present a simple prototype (implemented in Common Lisp) and a raw demonstration video.

Let me talk a little about text editors first

We all know that text editors are great tools for editing arbitrary text. The simplest data structure used by a text editor (e.g. Microsoft Notepad) is an ordered sequence of characters including line breaks. A somewhat more sophisticated data structure extends this with style related data such as colors and fonts. A word processing editor (e.g. Microsoft Word) further extends the data structure with chapters, bulleted lists, tables, pictures, etc.

A projectional editor differs from a text editor in that it is a general purpose editor with respect to the data structure being edited, and also how the data structure is edited. According to this view, a text editor is basically a projectional editor specialized for the text document data structure. The simplest projection that a text editor provides is that it writes characters to the screen from left to right. If line breaks are also included in the document, or if word wrapping is used, then the editor also writes characters to the screen from top to bottom. There are several text editors that provide more complicated projections, but only a few of them provide multiple different projections.

The term structure(d) editor is often used to describe editors that are cognizant of the underlying data structure that is being edited. I think this terminology is somewhat misleading in that it suggests that text editors are not structured editors. People might say that a text editor is not cognizant of the edited data structure, because the actual meaning is hidden in the user's mind. I think this distinction is wrong, because no matter how clever an editor is, there can always be an extra projection in the user's mind. The editor has no way of knowing what's in the user's mind. According to this view a text editor is a perfectly valid structured editor for the text document data structure.

So what is a projectional editor then?

It's a general purpose editor (framework) that provides multiple ways (notations) to edit arbitrary data structures. The difficulty lies in the implementation of the projectional editor, because it needs to be in par with the very high usability of text editors. In my terminology a projectional editor provides all the features and usability of text editors and word processing editors.

It's called projectional editor, because it provides its editing capabilities using bidirectional multistage projections between problem domains. The more domains it knows, the more kind of documents it can edit. The more projections it knows, the more alternative ways of editing and notations it provides.

I'd like to stress the importance of that it should be able to provide the same textual editing behavior that standard text editors have. This should be possible for all kind of structured documents that have textual representation. For example, it should provide editing for XML structured documents in the same way as text editors do as far as the user is concerned. That is, it should support invalid syntax, incomplete syntax, etc. while still having all the benefits of projectional editors.

What does the projection editor offer compared to text editors?

It allows to reuse existing generic projections that can be applied on the documents of any problem domain. The important thing to note here is that the projection is not simply applied to the document only once. It is rather applied to the document continuously while the document is being edited. Here are some examples:

filtering - it may be used to focus only some parts of the document while the rest is hidden (e.g. show the chapter titles only from a text document, show some elements from an XML document, etc.)
ordering - it may be used to sort the content (e.g. sort the functions in alphabetical order in Java code document, sort the calendar events in chronological order, etc.)
grouping - it may be used to arrange the content into groups (e.g. group the rows of a table according to the value in the cells of the first column, etc.)

It allows to reuse existing projections by building on top of them. For example, there might be multiple different tree projections that allows editing the tree data strcutures through textual (e.g. using separators and delimiters) or graphical notations. The new projection can reuse them allowing to use the multiple different notations for the new problem domain.

It allows to reuse existing generic documents as part of the documents of any new problem domain.

text - stylized text similar to word processing documents
table - two dimensional arrangements
tree - collapsible hierarchical drawings
graph - drawings with nodes and edges

It allows to reuse existing domains by mixing them in the new problem domain. For example, there might be documents and projections for word processing documents with text style and word wrapping. The new document can reuse them by composition allowing the already known notations to be part of the new problem domain.

How does the editor work internally?

It communicates with the user in a similar way to the well known and widely used read-eval-print loop. Simply speaking the REPL is a very powerful and generic way to talk to the computer, but I think Wikipedia explains it better than me. The main difference from the standard REPL is that it has to handle different kind of input and output devices simultaneously. These IO devices include among others a keyboard, a mouse and a graphical display as a bare minimum.

The standard REPL prints characters to a single character output device. The generalized REPL prints (projects) the editor's state to all output devices simultaneously. For example, printing to a graphical display needs a projection from the domain being edited to the graphics domain (the domain of the output device).

The standard REPL reads characters from a single character input device. The generalized REPL reads (projects) from all input devices simultaneously to the editor's state. For example, reading from a keyboard needs a projection from the keyboard domain (the domain of the input device) to the domain being edited.

What is a domain after all?

It describes the data structures of documents, selections and operations that belong to a problem domain. Domains are mainly important, because they can be analyzed, described and implemented independently. Here are some problem domain examples:

data structures - boolean, integer, number, string, pair, tuple, list, tree, graph, set, map, etc.
data formats - XML, JSON, ASN.1, etc.
programming languages - C code, Java code, Lisp code, etc.
text related - paragraphs, styled text, word processing with chapters, spreadsheet, presentation, graphics, etc.
databases - relational data, object oriented data, etc.

You might surprised that integer is listed as a domain. In fact, an integer is also a compound data structure consisting of digits or bits if you like, and as such it can be edited.

What is a document data structure?

It describes the possible states of domain specific documents. Here are some document data structure examples:

integer domain - an integer consists of a sequence of bits
string domain - a string consists of a sequence of characters
list domain - a list consists of a sequence of arbitrary elements
XML domain - an XML element consists of a list of XML attributes and a list of XML child elements, an XML attribute consists of a name string and a value string, etc.
graphics domain - a point consists of a 2D coordinate and a color, a line segment consists of two 2D coordinates, a color and a thickness, etc.

What is a document?

It's a piece of data that is being edited. Here are some document examples:

true, 12, "", "hello", "hello world", (1, 1, 2, 3, 5), etc.
, {"name": "levy" birthday: "08/05"}, etc.
public static void main(String[] args) { }, int main() { return 0; }, etc.

What is a selection data structure?

It describes the possible states of domain specific selections. Here are some selection data structure examples:

string domain - a character position selection consists of a string reference and an integer in the range [0, length]
XML domain - an element range selection consists of a parent element reference and two integers in the range [0, child count)
Java code domain - a method selection consists of a method reference

What is a selection?

It specifies some part of a document that is being edited. Here are some selection examples:

integer domain - a bit selection referring to the 1st bit of the integer 42
string domain - a character position selection referring to the 3rd position (between the 'l' characters) of the string "hello world"
graph domain - a subgraph selection referring to vertices A, B and C including the edges between them

What is an operation data structure?

It describes the possible states of domain specific operations. Here are some operation data structure examples:

integer domain - bit negation operation consists of a bit selection
string domain - delete character operation consists of a character selection
list domain - replace element range operation consists of an element range selection and a replacement list document

What is an operation?

It specifies a way of modifying a document using selections and other documents. Here are some operation examples:

integer domain - a negate bit operation of a bit referred by a bit selection.
graph domain - an add edge operation between two vertices referred by two vertex selections.

What is a projection?

Simply speaking it's a transformation from one domain to another. A projection transforms the domain specific data structures of documents, selections and operations.

The projection is used by both the printer and the reader in two distinct algorithm. These algorithms are purely functional, they don't have any side effect on the document, or the input, or the output devices. This aspect is very important, because it makes them very good candidates for various optimizations such as constraint based change propagation and parallel execution.

There are two different kind of projections: primitive and compound projections. Primitive projections always directly transform between two problem domains. Here are a few examples:

styled string to graphics - transforms a styled string to graphics document directly related to API calls
tree to string - transforms a tree of nodes to a flat string using delimiters and separators
XML to tree - transforms an XML document to a tree with attributes and elements being nodes

On the other hand, compound projections require other projections to combine them (primitive or compound).

sequential - applies projections one after the other
parallel - applies projections on the same input resulting in a list of outputs
recursive - applies one projection recursively
type dispatching - applies the selected projection based on the type of the input
predicate dispatching - applies the selected projection based on an arbitrary predicate on the input
nested - applies projections one after the other nestedly

Prototype

The projectional editor prototype is implemented in Common Lisp. If you want to take a look at it you can browse the source online. If you want to actually get the source, then you will need the git version control system.

Demonstration

The following video demonstrates the projectional editor for several different problem domains. The currently supported domains are:

string - a sequence of characters similar to the primitive types found in programming languages
styled text - a sequence of characters with colors and fonts organized into a list of paragraphs
graphics - point, line, rectangle, circle, text, image similar to what can be found in the graphics APIs of programming languages
list - a one dimensional array of arbitrary objects
table - a two dimensional array of arbitrary objects
XML - element, attribute, text, etc.
JSON - null, boolean, number, string, array, object, etc.
Java code - class, method, argument, variable, for statement, function call, etc.
Common Lisp code - function definition, function call, argument, lexical binding, etc.
and combinations of the above

The currently supported projections are:

string to styled string
styled string to string
styled string to graphics
tree of styled string to flat styled string
table of styled string to flat styled string
walked lisp form to lisp form
lisp form to tree of styled string
XML to tree of styled string
JSON to tree of styled string
Java code to tree of styled string
combining multiple projections sequentially
nesting multiple projections

This video demonstrates how the editor keeps track of the selection while navigating in the document. The domain specific selection is represented as a Common Lisp code fragment. If the selection is applied to the document, then it returns the actual thing (or a representation in cases such as character position) that is selected. The document is projected to the graphics domain in a domain specific way.

Programming Musings

Thursday, 22 November 2012

Projectional Editor