Semantic Markup And Authoring Tools

It all started with a comment from [redacted] about a talk on Microformats and Semantic Markup.

Note to self: talks on semantics are boring.

Indeed. And this is partly because a lot of semantic markup is written in a vacuum. Let’s take a step back and start with the provocative statement that semantic markup is completely useless. Let’s ask ourselves: why would we write semantic (X)HTML?

Because search engines will benefit and information will become more usable. We know that this overstates the reality of the Web. Right now, the elephant in the room indexes only a few elements with meaning, giving them a better “semantics score”. Writing good HTML gives a potential long-term benefit for the community, if the tools are developed.
Because the people I care for, my communities, will benefit from it. There are several aspects to this. If you believe in a society with mutual rights and so on, you will want to create good markup so that all these people get a better experience (accessibility, for example). If good markup matters to your professional community, you might give in, especially if this attitude brings professional advantages.
Because it is directly and immediately useful to me.

Writing well has a cost, except for passionate people (whatever the passion is). Without showing what it brings (aka the benefits), the effort of creating good markup is mostly cumbersome. The winning argument is the “me” one: not a theoretical benefit, but one that has a direct, “visible” impact on my time and on how easily I can edit the document.

People used h1 and h2 to make text bigger, or ul to indent text, because the result was directly visible. It showed the meaning they wanted to convey, even if it was meaningless to machines. The feedback loop was very short, so the effort was quantifiable, palpable.

How do we encourage Semantic Markup?

Most of the authoring tools I know do not help you write good markup, be they text-only or WYSIWYG. Right now, the editing window for this article offers a series of icons for putting markup around text.

Editing UI

But is it really what I need? Why should I put a strong element here? Why use an h1 or an h2, or a blockquote for this piece of text? It doesn’t make my life easier in the long term. It doesn’t leverage the knowledge I have already passed through the tool.

Imagine a few cases such as these (there are probably hundreds more):

A visible table of contents is automagically built live in the UI as you edit the text. It doesn’t need to be in the final text; it is just a visual cue of what you are doing.
When you quote text from a blog, a template comes already filled in with the name of the author and the date, because you have quoted this author in the past from a similar URI.
Afterwards, you can even search through all your previous quotes and authors to find one you are not quite sure about anymore.
It remembers all the links you have entered in the past, and when you are about to insert a new link, a pop-up of all your previous entries appears and you can search it by keyword.
It creates a list of all the tables you have edited, with references to the places and/or files where you edited them, as a separate database from which you can insert a table again into another document.
When you are editing an image, it proposes a longdesc form box that will create the file and put its URI automagically in the image. When you edit similar content again, it will check what the previous content was.
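
To make the first case concrete, here is a minimal sketch of how an editor could derive a live outline from the headings of the draft. It is only an illustration, not how any existing tool works: the names HeadingCollector, build_outline and render_outline are made up for this example, and it assumes the draft is available as an HTML string that the editor re-parses after each change.

```python
from html.parser import HTMLParser

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingCollector(HTMLParser):
    """Hypothetical helper: collects (level, text) for each h1..h6 in a draft."""

    def __init__(self):
        super().__init__()
        self.headings = []   # list of (level, text) in document order
        self._level = None   # level of the heading currently being read
        self._buffer = []    # text fragments of the current heading

    def handle_starttag(self, tag, attrs):
        if tag in HEADING_TAGS:
            self._level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        # Keep text only while inside a heading, including nested inline tags.
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in HEADING_TAGS and self._level is not None:
            # Collapse runs of whitespace so the outline stays on one line.
            text = " ".join("".join(self._buffer).split())
            self.headings.append((self._level, text))
            self._level = None


def build_outline(html):
    """Return the headings of the draft as a list of (level, text) pairs."""
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings


def render_outline(headings):
    """Render the outline as indented plain text, two spaces per level."""
    return "\n".join("  " * (level - 1) + text for level, text in headings)


draft = (
    "<h1>Semantic Markup</h1><p>intro</p>"
    "<h2>Why <em>bother</em>?</h2><p>text</p>"
    "<h2>Tools</h2>"
)
print(render_outline(build_outline(draft)))
```

The point is the feedback loop: the author who picks an h2 sees the outline change at once, so the choice of element has a visible payoff instead of being an act of faith.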

Coming up with tools and examples that have a direct impact is what authoring tool developers should focus on, not only on how to throw tags around text.