What Sci-Fi Can Teach Us About Publishing: The Problem of Structured Authoring
Structured content authoring is a lot like great science fiction: the essential problem is rooted in the inherent differences between humans and machines.
Every business which needs to publish digital content today faces the same urgent imperative to create structured content. Our machines demand it. If we want our users to be able to access our content on their computers, on their phones, on their tablets, on whatever currently unimaginable device becomes the next big thing, then the content itself must be structured: that is, semantically marked up in code in a way that captures its meaning, irrespective of its exact final appearance on any particular device. If we want our machines to allow us to magically “create once and publish everywhere,” it’s not unreasonable that they demand structured content inputs from us.
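To make "structured" a little more concrete, here is a minimal sketch in Python. It assumes a hypothetical `Block` type with just two semantic roles ("title" and "paragraph"); neither the type nor the renderer functions come from any real publishing system. The point is only that the content records what each chunk means, while each output decides for itself how that meaning should look.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class Block:
    """One chunk of content, labeled by meaning rather than appearance."""
    role: str  # hypothetical semantic role: "title" or "paragraph"
    text: str


def render_html(blocks):
    """Render the blocks as HTML, mapping each role to a tag."""
    tags = {"title": "h1", "paragraph": "p"}
    return "\n".join(
        f"<{tags[b.role]}>{escape(b.text)}</{tags[b.role]}>" for b in blocks
    )


def render_plain(blocks):
    """Render the same blocks as plain text, upper-casing titles."""
    lines = [b.text.upper() if b.role == "title" else b.text for b in blocks]
    return "\n\n".join(lines)


article = [
    Block("title", "Structured Authoring"),
    Block("paragraph", "Machines need meaning, not just appearance."),
]

print(render_html(article))
print(render_plain(article))
```

The same `article` list feeds both renderers, which is the "create once and publish everywhere" idea in miniature.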
The problem is that (most) humans don’t think about content this way, don’t intuitively frame their creativity into consistently structured machine-readable chunks. If confronted with a blank screen and blinking cursor, any human could imagine starting a project by typing something like “A long time ago in a galaxy far, far away…” Very few humans would ever dream of launching instead straight into <body><header><p class="intro"> … and we shouldn’t expect them to. Humans need to be able to think like humans, not machines.
So most of us humans continue to do our creative work using unstructured tools: Microsoft Word, Google Docs, yellow legal pads, etc. Then our businesses must take the unstructured output we create and transform it into structured code that works for the machines, a tricky conversion task that costs both time and money and is almost certain to introduce errors.
Even worse, as Karen McGrane highlights in her brilliant and aptly titled essay, “WYSIWTF” (A List Apart, 2 May 2013), the tools we are using today to create unstructured content create a false sense of control over what that content will look like to our end user once it’s gone through the process of being structured into machine-readable code and rendered on a multitude of devices. Thirty years after the first Apple Macintosh revolutionized desktop publishing with its WYSIWYG interface, the humans who create content have an expectation that what they see on their authoring screens is exactly what their readers will get on their devices. In most cases of digital content production today, that’s just not true; these words are never going to look exactly the same to you, displayed on the tiny screen of your iPhone, as they do to me, while I’m writing them in Microsoft Word on a 27-inch widescreen monitor.
If I format a line in my Word doc by spacing in seven characters to create a perfectly normal-looking but totally non-semantic indent, who knows what that will look like inside your app… much less what that will sound like to a blind reader consuming the content through a text-to-speech accessibility tool?
Unstructured content wrecks our modern content machines, but humans tend to think creatively in unstructured ways, and the authoring tools we’ve spent a generation training ourselves to use push us even further in an unstructured direction. Even though the value of structured content has never been clearer, creating it remains bogged down by broken workflows and by tools optimized for the fixed-layout 1990s.
So how could a better publishing platform allow us to think like humans but create content that works for machines?
Some have abandoned the notion of WYSIWYG entirely, forcing content creators to work inside CMS tools where all authoring occurs, chunk by chunk, inside unstyled data entry fields. That model is good at enforcing structure, to be sure, but most people find it hard to write in.
Other tools, like the recently shut-down and widely mourned Editorially, seek a middle ground between rigid CMS structures and free-form text blobs by using Markdown to make structured code patterns more easily parsed by humans than the tag soup of regular HTML markup.
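As a rough illustration of that middle ground, the Python sketch below converts a deliberately tiny subset of Markdown (`#` headings, blank-line-separated paragraphs, and `*emphasis*`) into HTML. The function name `tiny_markdown_to_html` is made up for this example; it is not how Editorially or any real Markdown parser works, and it ignores most of the syntax.

```python
import re
from html import escape


def tiny_markdown_to_html(source):
    """Convert a tiny, illustrative subset of Markdown to HTML."""
    html_blocks = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", source.strip()):
        # Collapse internal whitespace and escape <, > and &.
        text = escape(" ".join(block.split()), quote=False)
        # *word* becomes emphasis.
        text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
        # One to six leading '#' characters mark a heading level.
        heading = re.match(r"(#{1,6}) (.*)", text)
        if heading:
            level = len(heading.group(1))
            html_blocks.append(f"<h{level}>{heading.group(2)}</h{level}>")
        else:
            html_blocks.append(f"<p>{text}</p>")
    return "\n".join(html_blocks)


draft = "# A New Hope\n\nA long time ago in a galaxy *far, far* away..."
print(tiny_markdown_to_html(draft))
```

The author types something close to ordinary prose, and the machine still receives headings and paragraphs it can recognize.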
Here at Inkling, the effort to bridge the gap between humans’ need for WYSIWYG (or at least WYSIWYM) authoring tools and machines’ demand for structured outputs deeply informs the core authoring experience in Habitat, our collaborative cloud publishing environment.
Authors in Habitat work inside an environment that offers a real-time editable preview of their content across multiple outputs; they can visualize, as they write and edit, what their content will look like when consumed on a desktop display, or a tablet, or a phone.
More technically inclined authors can also, optionally, view the HTML code itself in a live-updating side-by-side display.
Perhaps most powerfully, Habitat surfaces the notion of content patterns (each pattern is a semantically meaningful chunk of HTML code) as the core building blocks of content, without requiring nontechnical content creators to know anything about code. Authors can assemble complex pieces of content simply by dragging and dropping patterns onto the page; content authored within those patterns will be properly structured HTML from the start. Essentially, this means that authors are creating code at the same time as they’re creating content … without having to think “in code” at all.
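The general idea of a pattern can be sketched in a few lines of Python. This is not Habitat’s implementation: the `PATTERNS` table, the pattern names, and the `fill_pattern` and `assemble` helpers are all hypothetical, chosen only to show how an author could supply plain text while the pattern supplies the markup.

```python
from html import escape

# Hypothetical pattern library: each pattern is a named template of
# semantic HTML with slots for the author's text.
PATTERNS = {
    "intro": '<p class="intro">{text}</p>',
    "pull_quote": "<blockquote><p>{text}</p></blockquote>",
    "sidebar": "<aside><h2>{title}</h2><p>{text}</p></aside>",
}


def fill_pattern(pattern_name, **fields):
    """Drop the author's text into a pattern, escaping it for HTML."""
    safe = {key: escape(value) for key, value in fields.items()}
    return PATTERNS[pattern_name].format(**safe)


def assemble(page):
    """Build a page from an ordered list of (pattern name, fields) pairs."""
    return "\n".join(fill_pattern(name, **fields) for name, fields in page)


page = [
    ("intro", {"text": "A long time ago in a galaxy far, far away..."}),
    ("sidebar", {"title": "Note", "text": "Humans & machines differ."}),
]

print(assemble(page))
```

The author only chooses a pattern and fills in words; the structure, including correct escaping of characters like `&`, comes from the pattern itself.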
Habitat is just one example of how content creators and developers can bridge the gap between the way humans need to create content and the ways machines need to read it. We think we’re on the right track, but we know there’s a ton more work to do to keep improving our structured authoring tools. If you’re interested, you can learn more about our cloud publishing platform here. We’ve also recently published a free how-to eBook that gives you practical advice on structuring content for machines: “5 Mission-Critical Questions for Digital Content Creation.” It’s a great resource to read before you begin any new content project.
Otherwise, let us know what you think about structured authoring and where it’s going in the comments below!