So much for the Semantic Web

As I have mentioned before, I’d like to use gravatars here but can’t, because I don’t capture commenters’ email addresses. This is because there is no reason for you to give me an email address, and frankly I wish more sites didn’t demand one; an awful lot of weblog sites seem to make the email address compulsory, which I do not understand at all. I mean, why?

Anyway, I had this additional idea, which was: let’s use the Semantic Web! People do leave their URL. I can connect to that URL, find the FOAF autodiscovery tag, and thus find their FOAF file! So I tried that, over all the URLs on all the comments here. Total number of FOAFs found: 18. Ha! One of which, interestingly, was a very complete LiveJournal FOAF file; the LJ people seem to actually be getting into all these standards and stuff, probably without making a big deal of it in the LJ community itself, because they have such a massive collection of metadata. If the Semantic Web works then it’ll work on LJ first, I tell you.

Anyway, my original plan was to grab images out of the FOAF file: there is a foaf:img tag and a foaf:depiction tag. Total number of FOAFs in my 18 that had these: 1 (one). Bah. I did also think that I could grab the sha1sum of the email address, which is in everyone’s FOAF, and then convince the gravatars guy to serve up a gravatar not only when you have the MD5 hash of an email address but also when you have the SHA-1 hash of it. Response to my email: 0 (zero). I was quite disappointed by that.

So, basically: so much for the Semantic Web in practice, at least so far. I shall wait a while before trying this sort of thing again. For those of you thinking, “I have a linked URL, and an autodiscoverable FOAF file, and an image in that FOAF file; am I the one?”, the answer is no, unless you are Gary Fleming. You may be thinking: huh? Why was mine not included?
The answer is: because tidy wouldn’t tidy your website into compliant XHTML properly, or the Python minidom parser wouldn’t parse that XHTML properly. Because I couldn’t be arsed to write a Python HTML parser, and because I was doing this with an (increasingly complex) one-liner command, I grabbed the HTML at the URL you specified with your comment, threw it through tidy, threw that through

python -c "from xml.dom import minidom;import sys;d=minidom.parseString(sys.stdin.read());print([n.getAttribute('href') for n in d.getElementsByTagName('link') if n.getAttribute('title') == 'FOAF'][0])"

and so on. Not sure what the problem was, but by eyeballing it there weren’t that many it failed for. (Failed for my own site, ironically enough. Aquarion’s, too.) This is my fault for being lazy, rather than your fault for site coding or tidy’s fault or minidom’s fault, but it did give me an indication that this images-from-FOAF trick is not worth coding if it’s going to provide one image every three years.
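
For anyone wanting to retry this without the tidy-then-minidom step, here is a rough sketch of the same lookup using Python 3's standard `html.parser`, which does not require well-formed XHTML. This is not what the one-liner above did; the names `FOAFLinkFinder` and `find_foaf_links` are made up for illustration, and it assumes, like the one-liner, that the autodiscovery `link` tag carries `title="FOAF"`.

```python
from html.parser import HTMLParser


class FOAFLinkFinder(HTMLParser):
    """Collect the href of every <link> tag whose title is 'FOAF'.

    Hypothetical helper: it mirrors the title == 'FOAF' check from the
    one-liner, but works on untidied HTML.
    """

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling us.
        # Self-closing tags (<link ... />) are routed here as well.
        if tag != "link":
            return
        attributes = dict(attrs)
        if attributes.get("title") == "FOAF" and attributes.get("href"):
            self.hrefs.append(attributes["href"])


def find_foaf_links(html_text):
    """Return all FOAF autodiscovery hrefs found in html_text (maybe empty)."""
    finder = FOAFLinkFinder()
    finder.feed(html_text)
    finder.close()
    return finder.hrefs


if __name__ == "__main__":
    # Deliberately sloppy markup: unclosed <b>, no </html>.
    sample = (
        '<html><head>'
        '<link rel="meta" type="application/rdf+xml" title="FOAF" '
        'href="http://example.org/foaf.rdf">'
        '</head><body><p>unclosed <b>markup</body>'
    )
    print(find_foaf_links(sample))
```

Returning a list rather than indexing `[0]` means a page with no FOAF link gives an empty result instead of an exception, which makes it easier to count failures separately from parse errors.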
