What’s the point in XHTML?

So, why do people use XHTML?

Mark talks about the XHTML vs. RSS debate, in somewhat disparaging terms, stating that “Syndication is not Publication”.

The entire success of RSS is predicated on the principle that “you can keep doing whatever messed up stuff you’ve always done on your web pages… oh, and do this other thing too. Look, it’s simple, you can code it up in an hour with a few print statements and an escape function.” By contrast, this latest XHTML-as-syndication movement seems to be based on the principle that “syndication is so incredibly important that you must immediately stop whatever you’re doing with your web pages, upgrade to XHTML, validate your markup, restructure your home page to include all and only the content you’re willing to syndicate, and by the way, would you please unlearn that ugly nasty presentational page layout language you’ve been using for years and learn this wonderful happy structured semantic markup language instead?”
It should be obvious to any rational observer that this will go nowhere fast. A syndication format that requires valid semantic XHTML markup? Spare me. 9 out of 10 bloggers can’t even spell XHTML.

One of the issues here is that 9 out of 10 bloggers can’t spell RSS either. In fact, a reasonable proportion of bloggers don’t even know what RSS is, or that their weblogging tool generates it for them. That so many pages are invalid is not the fault of these 9 out of 10 bloggers; it’s the fault of their tool of choice. Imagine if a weblogging tool checked your built pages for XML well-formedness. (Obviously this only works for baked systems, rather than fried.) Not validation against a DTD or anything, just well-formedness. It’s not going to solve all problems overnight, I admit, but it would make life a lot easier to be able to use XHTML in pursuit of one of its major wins: that it’s XML as well, and can therefore be parsed by the vast suite of XML tools which insist on a minimum level of compliance (by which I mean well-formedness) and then derive power from that. (Or are there, say, XPath tools for HTML 4.01 that I don’t know about?) If your intention, O XHTML user, is to have your pages marked up in XHTML in pursuit of this goal, then why do you need RSS or similar formats? If your aim is not to use XHTML for this purpose, what is your purpose? Mark’s “It’s like…semantic and stuff” remark isn’t a flappy-handed meaningless contention on the part of XHTML pushers, it’s the point of the whole exercise.
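To make the well-formedness idea concrete, here is a minimal sketch of the kind of check a baked weblogging tool might run over each built page, followed by the sort of query that only becomes possible once the page parses as XML. It uses Python’s standard library parser; the function names and the two sample pages are hypothetical, invented purely for illustration, and a real tool would also have to cope with DOCTYPEs and named entities.

```python
import xml.etree.ElementTree as ET

# The XHTML namespace, bound to a prefix so the path query below can use it.
NAMESPACES = {"x": "http://www.w3.org/1999/xhtml"}

# Hypothetical sample pages, for illustration only.
GOOD_PAGE = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<p>Hi <a href="/about">about</a></p>'
    "</body></html>"
)
BAD_PAGE = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    "<p>Hi <br></p>"  # <br> is never closed, so this is not well-formed XML
    "</body></html>"
)


def is_well_formed(markup: str) -> bool:
    """Return True if the markup parses as XML. No DTD validation is done."""
    try:
        ET.fromstring(markup)
    except ET.ParseError:
        return False
    return True


def link_targets(markup: str) -> list:
    """Return the href of every XHTML <a> element that has one."""
    root = ET.fromstring(markup)
    # ElementTree supports a limited subset of XPath, enough for this query.
    return [a.get("href") for a in root.findall(".//x:a[@href]", NAMESPACES)]


print(is_well_formed(GOOD_PAGE))
print(is_well_formed(BAD_PAGE))
print(link_targets(GOOD_PAGE))
```

A tool could refuse to publish, or simply warn, whenever `is_well_formed` returns False; which of those is right is a policy question that the sketch does not try to answer.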

He does make a good point about bandwidth, I freely admit. That, however, reduces RSS from being the Right Thing to being an optimisation away from correctness in order to live in the real world. RSS makes your bandwidth bill cheaper; true. And that’s quite possibly a major motivating factor for almost everyone, because real world issues come above principles to some extent. It doesn’t make RSS right, though.

Moreover, the further points about corner cases such as Dorothea’s Latin dates are entirely valid. That backs up the core contention that publication and syndication are not the same thing, assuming that you actually need an accurate date on a post for anything. Those of you who use RSS newsreaders: do you tend to look at “all the latest posts from all my feeds” (in which case a date is obviously vital), or at “all feeds with new content” and then read each feed in isolation? In the latter case you don’t need a date at all, although the newsreader does need to cache what it last saw to check for updates, rather than caching the last time you read a feed to see whether there are posts since; an MD5 hash of the feed contents is not much larger than a stored date. Quoting Dean Allen’s use of hand-crafted descriptions in RSS misses the point; I fail to see how any hand-crafted description could be a better description of the post than the post itself is! I’m open to correction on this point. Mark quotes Shelley’s desire not to have her whole feed in RSS, to avoid republication. I can’t see how those who would want to republish her content couldn’t just rip it off her website, so I’m unclear on the point of this.
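The hash-instead-of-date idea can be sketched in a few lines. This is an illustration under assumptions, not how any particular newsreader works: the function names and the sample feed bytes are hypothetical, and a real reader would probably want to normalise the feed first so that trivial whitespace changes do not count as new content.

```python
import hashlib


def feed_fingerprint(feed_bytes: bytes) -> str:
    """Return the MD5 digest of the raw feed as 32 hex characters."""
    return hashlib.md5(feed_bytes).hexdigest()


def has_new_content(feed_bytes: bytes, last_seen_fingerprint: str) -> bool:
    """Compare the current feed against the fingerprint cached last time."""
    return feed_fingerprint(feed_bytes) != last_seen_fingerprint


# Hypothetical feed contents fetched on two occasions.
yesterday = b"<rss><channel><item>Old post</item></channel></rss>"
today = b"<rss><channel><item>New post</item><item>Old post</item></channel></rss>"

cached = feed_fingerprint(yesterday)
print(has_new_content(yesterday, cached))  # unchanged feed
print(has_new_content(today, cached))      # feed with an extra item
```

The cached value is a fixed 16 bytes (32 hex characters) however large the feed is, which is the sense in which it is not much larger than a stored date.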

I mean, I’ve set explicit excerpts on posts just for my RSS feed. But not because I think that this is a great way to summarise my message. Because I want to use entities like &mdash; in my HTML, and they don’t work in RSS. I have to remember to use &#8221; instead of &rdquo; in all my posts to avoid breaking my RSS feed. This is a hindrance, not the easy path to syndication.
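The entity problem is easy to reproduce. XML predefines only five named entities (`&amp;`, `&lt;`, `&gt;`, `&quot;` and `&apos;`), so an HTML name such as `&mdash;` is undefined in a feed that declares no DTD, while a numeric reference such as `&#8212;` is always fine. The sketch below shows the failure and one possible workaround that a tool could apply on the author’s behalf; the function names are hypothetical, and the `<description>` wrapper merely stands in for a real feed element.

```python
import re
import xml.etree.ElementTree as ET
from html.entities import name2codepoint

# The only named entities that XML itself defines.
XML_PREDEFINED = {"amp", "lt", "gt", "quot", "apos"}


def parses_as_xml(fragment: str) -> bool:
    """Return True if the fragment survives an XML parser inside an element."""
    try:
        ET.fromstring("<description>" + fragment + "</description>")
    except ET.ParseError:
        return False
    return True


def to_numeric_refs(text: str) -> str:
    """Rewrite HTML named entities as numeric character references."""
    def swap(match):
        name = match.group(1)
        # Leave XML's own entities, and any unknown names, untouched.
        if name in XML_PREDEFINED or name not in name2codepoint:
            return match.group(0)
        return "&#%d;" % name2codepoint[name]

    return re.sub(r"&([A-Za-z][A-Za-z0-9]*);", swap, text)


post = "Right &mdash; or &ldquo;right&rdquo; &amp; wrong"
print(parses_as_xml(post))                   # named HTML entities are undefined
print(to_numeric_refs(post))
print(parses_as_xml(to_numeric_refs(post)))  # numeric references parse fine
```

Running something like `to_numeric_refs` when the feed is generated would mean the author never has to remember the numeric form at all, which is rather the point about tools carrying the burden.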

This is in danger of descending into the age-old argument between attempting to change the world to make the Right Thing the thing that works, and stepping away from the Right Thing so that your implementation works in the world. Pushing the former too far leads to zealotry. Pushing the latter too far leads to Internet Explorer 4. The real truth is somewhere in the middle, and once again some people are trying to decide where the line they won’t step over actually lies.

