Wolff over at a klog apart offers the thought that we need a census
of blogspace. He asks a few questions, as well, about the size of the “blogsphere”:
- Do you have an educated guess?
- Not even remotely. No idea. I could pick a figure, but it could be
out by two orders of magnitude.
- Do you know of any prior work in this area?
- Not that I’m aware of, I must admit, although it could well have
passed me by.
- Can you think of a methodology or two to create useful measures of the number of bloggers and the number of weblogs?
- Google. Google is the best tool for questions about the net as a
whole, because it indexes about as much of the net as anything does.
You could get a rough estimate of the number of webloggers by making
one simplifying assumption: that every weblogger either has their own
domain or uses one of a few weblogging hosts (Blogspot, LiveJournal,
etc; it’s a fairly short list). Then get user counts from each of the
major hosts, search Google for the word “permalink”, count the unique
domains in the results, and add the figures together. That’ll be a
low estimate, because there are multiple weblogs on some domains, and
because not all weblogs use the word permalink, but it’d be a figure
to begin working with.
The alternative is to assume that all weblogs are interconnected
(see the next question), start at one place, and crawl the links
yourself, counting as you go. You’d need rules for what constitutes
a weblog, which isn’t well-defined even for a person looking at one,
never mind an automated process, but hey.
- What related questions would you want answered?
- How many different “islands” are there in the interconnected map of
weblogs? Can you navigate from any given blog to any other blog by
merely travelling links between weblogs? What does the map look
like? What’s the most connected node? Which node is at the centre of
the map? Lots of questions about the map of links, really.
- How might you use this information?
- Blimey, I dunno. It’d be interesting to look at :) I could do a “six
steps to as days pass by” thing, or something.
- Pitfalls to avoid?
- No idea, guv. At this stage, where there’s no data at all, any data
is better than none, so make assumptions, guess figures, and so on.
We can refine the data later.
- Would you join a BlogCensus.org to provide and share stats?
- Suppose so, but I always find that sort of thing fairly silly,
because the audience is self-selecting. The Linux Counter works on
much the same principle, and it’s pretty useless as a source of
information.
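
The counting and the map questions above can both be sketched in a few lines of Python. This is only a rough sketch under assumptions of mine, not anything Wolff proposed. The function names are made up, the URL list stands in for whatever a “permalink” search would return, and the link map is a hand-built dictionary rather than the output of a real crawl. Links are treated as two-way, so an “island” is a set of blogs joined by links in either direction, and the “centre” is taken to be the blog whose furthest neighbour within its island is nearest.

```python
from collections import deque
from urllib.parse import urlparse


def unique_domains(urls):
    """Return the distinct hostnames among some search-result URLs."""
    hosts = set()
    for url in urls:
        host = urlparse(url).hostname  # lower-cased, None if missing
        if host:
            if host.startswith("www."):
                host = host[4:]  # treat www.example.com as example.com
            hosts.add(host)
    return hosts


def undirected(links):
    """Turn {blog: [blogs it links to]} into a two-way adjacency map."""
    graph = {}
    for source, targets in links.items():
        graph.setdefault(source, set())
        for target in targets:
            if target == source:
                continue  # ignore a blog linking to itself
            graph.setdefault(target, set())
            graph[source].add(target)
            graph[target].add(source)
    return graph


def distances(graph, start):
    """Breadth-first search: link hops from start to each reachable blog."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def islands(graph):
    """Split the map into sets of blogs that can reach one another."""
    seen = set()
    found = []
    for start in graph:
        if start not in seen:
            island = set(distances(graph, start))
            seen |= island
            found.append(island)
    return found


def most_connected(graph):
    """The blog with the most distinct neighbours."""
    return max(graph, key=lambda node: len(graph[node]))


def centre(graph, island):
    """The blog in an island whose furthest neighbour is nearest."""
    return min(island, key=lambda node: max(distances(graph, node).values()))
```

Calling `undirected({"a": ["b"], "b": ["c", "d"], "e": ["f"]})` and passing the result to `islands` gives two islands, one of four blogs and one of two. Answering “can you get from any blog to any other by following links” properly would need the links kept one-way, which this sketch deliberately doesn’t do.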