Tidying up your free-form content with structured body field

2 Dec 2014 | Igor Vrdoljak

There has been an ongoing conversation in the web development and design community over the last few years. Fueled by the disruption that the mobile explosion has brought, it is a search for a more suitable content modeling approach, one that can cope with the conflicting requirements now placed on content.

Blobs and chunks

On the one hand, we all know that, if we want our content to be really ready for a multichannel world, we need to fight the "content blob" of unstructured HTML and define the content types and their structure in a pedantic and precise way. With this approach, we can optimize output to the delivery channel and reuse the content in a wide range of use cases, including those unimagined at the content creation time. This is also a natural way of thinking for information architects - one that describes different entities (article, frontpage, news, person) with relatively simple building blocks (fields such as title, image, text line and different kinds of metadata).

On the other hand, the "mechanical" approach to building pages out of predefined blocks can be limiting for designers and content producers. They often demand more freedom in composing complex content, content which doesn't conform to the static structure defined a long time ago by a CMS developer. Following this lead, many of today's content management systems allow entering raw HTML that can be formatted and interwoven with multimedia elements.

Each one of these approaches is perfectly legitimate and can serve well in specific use cases, but in order to build a scalable solution we often need something more, a solution that has both the purity of structured content and the expressiveness of a free-form body field.

Why not just use HTML?


OK, so we know we need to split our content into chunks over distinct content fields. The first instinct would be to introduce an HTML body field as one of said chunks and just let the editors rave on. But this is a trap! At its core, HTML is a markup language with predefined semantics (largely bound to presentation), and if we try to describe a complex, domain-specific concept with it, we will soon be stretching it too thin. Sure, we can use HTML5 custom data attributes to enrich our HTML, or go even further and introduce completely new tags (like Web Components and Polymer do), but this would probably still be a misuse of the tool.

There is a well-known cousin of HTML that was built with customization and extensibility in mind. With extensibility right in its name, XML can be (and is) used to describe practically anything. Building on it, the eZ Publish CMS provides a scalable and flexible solution for "structured unstructured" content.

The eZ way

Quite some time ago, even before the mobile-driven craze, the eZ people decided to ditch HTML as the format for the rich text field and introduced the ezxml datatype. Aside from all the usual tags and formatting options that come with HTML editors (links, headlines, lists, tables, bold and italic text etc.), it introduced two concepts that are important for this post: embedded objects and custom tags.

Embedded objects are used to interweave different kinds of content into the rich text field. As almost everything is an object in eZ, this is often used for multimedia elements such as images, image galleries, and video. The view in which these objects are rendered is customizable, so an image gallery can be rendered one way when accessed via a web browser and completely differently when viewed in an accompanying mobile application. In addition to multimedia, any kind of content object (related articles, categories, user profiles, domain-specific objects like a car or a vacation) can be embedded into the rich text. This opens up a wide range of options that would be impossible if we were only working with HTML tags.

<embed 
  view="embed" 
  href="ezobject://132" 
  class="big-car" 
  custom_attribute_view="3D" 
  custom_attribute_show_configurator="yes" />

The principle of embedding an object, instead of describing it directly in HTML, makes it much easier to change the output as technology and user expectations evolve. For example, since an image is nothing but an embedded content object of the image type, it is easy to change its output from a simple <img> tag to a more expressive <picture> element as browser support for it grows. Likewise, swapping Flash video players for HTML5-based ones is an easy task if you do not need to manually dig through tons of HTML, but only have to change the template that renders it.
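To make the template-swap idea concrete, here is a minimal sketch in Python. It is not eZ Publish code: the `ImageObject` class, the `variations` mapping and the two render functions are hypothetical stand-ins for an image content object and its view templates. The point it illustrates is that the stored content only references the object, so moving from <img> to <picture> is a one-line change in the rendering layer.

```python
from dataclasses import dataclass
from html import escape
from typing import Callable, Dict


@dataclass
class ImageObject:
    """Hypothetical stand-in for an embedded image content object."""
    object_id: int
    alt: str
    variations: Dict[int, str]  # width in pixels -> URL of that image variation


def render_img(image: ImageObject) -> str:
    """Older template: a single <img> pointing at the widest variation."""
    widest = max(image.variations)
    return f'<img src="{escape(image.variations[widest])}" alt="{escape(image.alt)}">'


def render_picture(image: ImageObject) -> str:
    """Newer template: <picture> with a <source> per variation and an <img> fallback."""
    widths = sorted(image.variations, reverse=True)
    # Every variation except the narrowest becomes a <source>.
    sources = "".join(
        f'<source media="(min-width: {w}px)" srcset="{escape(image.variations[w])}">'
        for w in widths[:-1]
    )
    # The narrowest variation is the fallback <img>.
    fallback = image.variations[widths[-1]]
    return (
        f"<picture>{sources}"
        f'<img src="{escape(fallback)}" alt="{escape(image.alt)}"></picture>'
    )


# The content itself never changes; only this assignment does.
IMAGE_TEMPLATE: Callable[[ImageObject], str] = render_picture


def render_embedded_image(image: ImageObject) -> str:
    """Render an embedded image with whichever template is currently configured."""
    return IMAGE_TEMPLATE(image)
```

Switching `IMAGE_TEMPLATE` back to `render_img` changes every embedded image on the site without touching a single stored article.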

The other interesting concept available in eZ Publish is a custom tag. Defined in configuration files, custom tags can be used for creating configurable elements that can render whatever output is needed. Combined with parameters that can be provided upon content entry, custom tags can be used for creating different kinds of elements, such as quote blocks and embedded menus, but also site-specific custom elements like "car comparator widget" or "vacation planner".

<custom 
  name="car_comparator_widget" 
  custom:car_object_id_0="132" 
  custom:car_object_id_1="234" 
  custom:interactive_mode="yes" 
  custom:dealerships_view="gmaps">
    (optional additional content text goes here....)
</custom>
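As a rough illustration of how such a tag could be turned into output, the following Python sketch parses the fragment above and dispatches on the tag name. It makes several assumptions: the `custom` prefix is taken to be bound to a namespace on the document root (the URI used here is only for illustration), and the `RENDERERS` registry, the function names and the generated markup are all hypothetical, not part of eZ Publish.

```python
import xml.etree.ElementTree as ET
from html import escape

# Assumption: the "custom" prefix is declared on the root element of the field.
CUSTOM_NS = "http://ez.no/namespaces/ezpublish3/custom/"

FRAGMENT = """
<section xmlns:custom="http://ez.no/namespaces/ezpublish3/custom/">
  <custom name="car_comparator_widget"
          custom:car_object_id_0="132"
          custom:car_object_id_1="234"
          custom:interactive_mode="yes"
          custom:dealerships_view="gmaps">Optional text</custom>
</section>
"""


def custom_tag_params(element):
    """Collect the custom:* attributes of an element into a plain dict."""
    prefix = "{" + CUSTOM_NS + "}"
    return {
        key[len(prefix):]: value
        for key, value in element.attrib.items()
        if key.startswith(prefix)
    }


def render_car_comparator(params, body):
    """Hypothetical renderer: emits a placeholder <div> for a front-end widget."""
    car_keys = sorted(
        (k for k in params if k.startswith("car_object_id_")),
        key=lambda k: int(k.rsplit("_", 1)[1]),  # order by numeric suffix
    )
    car_ids = ",".join(escape(params[k]) for k in car_keys)
    interactive = "true" if params.get("interactive_mode") == "yes" else "false"
    dealerships = escape(params.get("dealerships_view", ""))
    return (
        f'<div class="car-comparator" data-cars="{car_ids}" '
        f'data-interactive="{interactive}" data-dealerships="{dealerships}">'
        f"{escape(body)}</div>"
    )


# Registry mapping custom tag names to renderers (illustrative only).
RENDERERS = {"car_comparator_widget": render_car_comparator}


def render_custom_tags(xml_text):
    """Render every known <custom> tag found in the fragment."""
    root = ET.fromstring(xml_text.strip())
    output = []
    for element in root.iter("custom"):
        renderer = RENDERERS.get(element.get("name"))
        if renderer is None:
            continue  # unknown tags are skipped in this sketch
        output.append(renderer(custom_tag_params(element), (element.text or "").strip()))
    return output
```

Calling `render_custom_tags(FRAGMENT)` returns a list with one rendered widget placeholder; adding a new site-specific element is then a matter of registering another renderer.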

With these tools under her belt, our content editor can begin creating experiences such as the praised Snowfall piece on the NY Times site or similar examples by other publishers.

This way, text content can be enriched with expressive multimedia and custom widgets, but in a controlled manner, one that does not hinder reusability or prevent accessing content through multiple channels. For example, your special timeline widget can be rendered in its full interactive glory when accessed by a class A web browser, but can also be downscaled to simple comma-separated text when sent as part of a text message or in an email newsletter campaign.
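The timeline example can be sketched in a few lines of Python. Everything here is illustrative: the channel names, the `EVENTS` sample data and the renderer functions are assumptions made for the sake of the example, not an eZ Publish API. The same structured data is handed to a different renderer depending on the delivery channel.

```python
from html import escape

# Hypothetical structured data behind a "timeline" widget.
EVENTS = [("2012", "Kickoff"), ("2013", "Launch")]


def render_timeline_web(events):
    """Rich output: markup that a front-end script could turn into an interactive timeline."""
    items = "".join(
        f'<li data-date="{escape(date)}">{escape(label)}</li>'
        for date, label in events
    )
    return f'<ol class="timeline" data-interactive="true">{items}</ol>'


def render_timeline_text(events):
    """Downscaled output: plain comma-separated text for SMS or a text-only email."""
    return ", ".join(f"{date}: {label}" for date, label in events)


# Channel names are assumptions; unknown channels fall back to plain text.
TIMELINE_RENDERERS = {
    "web": render_timeline_web,
    "sms": render_timeline_text,
    "newsletter": render_timeline_text,
}


def render_timeline(events, channel):
    """Pick the renderer for the given channel, defaulting to the safest one."""
    renderer = TIMELINE_RENDERERS.get(channel, render_timeline_text)
    return renderer(events)
```

Falling back to plain text for unknown channels is a deliberate choice in this sketch: the least capable output is the one most likely to work everywhere.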

Win-win scenario.

What is (still) missing?

The scenarios mentioned above are easy to implement in the eZ Publish CMS today. One issue left over from the old days is the infamous WYSIWYG design of the rich text edit interface. Envisioned at a time when web editing tried to mimic desktop publishing software, the approach of designing the visual presentation of the content while entering it in a fixed-width container fails miserably when confronted with the chaotic modern web environment.

The main problem with WYSIWYG and its accompanying preview button is that it is deceitful: it gives the editor a false sense of complete control over the visual output of the page. A rich content edit interface better suited to the multichannel, multi-device, multi-everything world is one that focuses on content entry and leaves the presentation to the designer. Editors should concentrate on creating content and enriching it with metadata, while the output should be defined and designed according to the specifics of each channel. After all, what good is a preview button if we are using SMS as a delivery platform?

There are already some good examples of non-WYSIWYG edit interfaces, such as Sir Trevor or the editor used by Craft CMS for its Matrix datatype.

The Sir Trevor editor avoids the pitfalls of WYSIWYG

As the underlying XML format in eZ Publish already supports this kind of expressiveness, it should be easy to integrate a new editor (or modify the existing one) so that it better fits the way today's web editors work.

The future

Although the currently used custom XML format is potent enough for a lot of use cases, eZ plans to replace it soon with the well-known DocBook schema. The benefits of this switch are numerous, and most of them come from the widespread adoption of DocBook, which is supported by a wide range of existing tools. All the existing features remain and we get a more robust and better-supported tool. Great.

Changes to the edit interface are also planned, but it remains to be seen in which direction the development will go. The great thing about eZ Publish, as with all open source software, is that nothing stands in the way of implementing your own rich text editor. We at Netgen are also considering making this move, so stay tuned.

Conclusion

In my opinion, eZ Publish remains a largely undiscovered gem in the CMS world. It provides one of the most advanced and battle-tested content modeling engines, perfectly suited for modern web development. Coupled with the power of the Symfony framework at its core, eZ gives developers and end users a winning combination of flexibility and enterprise-level stability.

And the funny thing is that we had it all along, due to some smart decisions made a long time ago. It appears that good architecture goes a long way!

Some links for further reading:

Blobs and chunks

http://karenmcgrane.com/2012/09/04/adapting-ourselves-to-adaptive-content-video-slides-and-transcript-oh-my/
http://alistapart.com/article/battle-for-the-body-field

Rich article examples

http://www.nytimes.com/projects/2012/snow-fall/#/?part=tunnel-creek
http://www.theguardian.com/environment/2014/may/21/-sp-european-bison-europe-romania-carpathian-mountains
http://www.polygon.com/2014/11/26/7297125/world-of-warcraft-warlords-draenor-review-pc-mac-mmo

WYSIWYG fallacy

http://www.markboulton.co.uk/journal/wysiwtfftwomg
http://madebymany.github.io/sir-trevor-js/
https://buildwithcraft.com/features/matrix

DocBook

http://www.docbook.org/whatis

Photo credit: ThisParticularGreg / Foter / CC BY-SA
