This is as days pass by, by Stuart Langridge

Why not to use domain sockets for a desktop CouchDB

The obvious idea that pops into everyone's head, including mine, when talking about having a running CouchDB that's specific to me is: why use TCP for it? Why not just use a unix domain socket? Then you don't have to worry about other people on the same machine trying to access it. Everyone thinks this, and on balance it's not the way to go. This is why.
  1. You can't browse to a unix domain socket in your web browser. This, by itself, is enough to kill the idea for me. I genuinely love the idea that applications store their data and then I can see that data in my browser using Futon, the CouchDB web UI. I can edit that data. That's fantastic. Using unix sockets would break that.
  2. One other nice thing about CouchDB is that you can do replication between two different databases; I like this because I can have the same data on my laptop and my netbook if I want. That doesn't work if you use domain sockets, because the two CouchDBs can't see one another.
  3. As far as I can tell, Mochiweb, the underlying Erlang HTTP library that Couch uses, doesn't do domain sockets anyway. (Obviously fixing this is only a Small Matter Of Programming.)
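To make point 2 a bit more concrete: replication is itself kicked off over HTTP, by POSTing a small JSON document to CouchDB's `_replicate` endpoint. Here's a minimal sketch using only the Python standard library. The host names, port and database name are made up for illustration, and the request is only built, not sent.

```python
import json
import urllib.request


def build_replication_request(local_server, source_db, target_url):
    """Build (but do not send) a POST to CouchDB's _replicate endpoint.

    local_server, source_db and target_url are illustrative values,
    not taken from any real setup.
    """
    body = json.dumps({"source": source_db, "target": target_url})
    return urllib.request.Request(
        local_server.rstrip("/") + "/_replicate",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical example: push the laptop's "contacts" database to a netbook.
req = build_replication_request(
    "http://localhost:5984",
    "contacts",
    "http://netbook.local:5984/contacts",
)
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually start the replication, and that only works if each CouchDB can reach the other over the network, which is exactly what a unix domain socket rules out.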
This has been a public service broadcast on behalf of the Write These Reasons Down So I Have Them To Hand Next Time Someone Suggests Unix Domain Sockets party.

Using CouchDB to store contacts

One of the things I'm looking at is using CouchDB to store data for applications on your desktop as part of the desktop data/settings idea that Rodrigo's already written about. Obviously one of the great things here is that applications can collaborate on data stored in there; obviously one of the pre-requisites for collaboration is that everyone's speaking the same language! So various people working on a number of different mail clients for the Linux desktop and so on are working out what the schema for contact records in CouchDB should look like. Being able to browse around your database with a web browser is dead handy for writing this sort of thing, I have to say :-) At the moment, this is the sort of direction we're heading in. A CouchDB document is JSON, and an example contact looks like this:
{
   "_id": "362cbeae5f408d6863bb70892d5ba345",
   "_rev": "1-182987891",

   "record_type": "http://example.com/contact-record",
   "record_type_version": "1.0",

   "first_name": "Joshua",
   "last_name": "Molby",
   "birth_date": "1945-07-04",

   "addresses": {
       "85cf156f-fcf6-4901-9201-82ee90859213": {
           "city": "Bedford",
           "state": "",
           "description": "home",
           "country": "Scotland",
           "postalcode": "cw12 3hi",
           "address1": "Nicol Street",
           "address2": "",
           "pobox": ""
       },
       "d20f7364-e80b-47a2-a7e7-0677cb293745": {
           "city": "Bedford",
           "state": "",
           "description": "work",
           "country": "England",
           "postalcode": "dk12 3av",
           "address1": "Rush Street",
           "address2": "",
           "pobox": ""
       }
   },
   "phone_numbers": {
       "f0bac2a0-83a3-46f9-b079-d41533b87391": {
           "priority": 0,
           "number": "+84 63 6220 9178",
           "description": "work"
       },
       "cf01fc9c-703b-4ae4-b303-fcc6f8ce5a53": {
           "priority": 0,
           "number": "+91 99 6920 2837",
           "description": "home"
       },
       "f0c05bf4-de4a-48f2-bbaf-f9698e52d491": {
           "priority": 0,
           "number": "+97 52 9211 6455",
           "description": "other"
       }
   },
   "email_addresses": {
       "6e3178d8-fee6-45b1-b95a-2c76be090e2b": {
           "description": "home",
           "address": "Joshua1.Molby@uck.com"
       },
       "adb1fc2a-0468-4deb-bb6c-974db23ef7fd": {
           "description": "work",
           "address": "Joshua1.Molby@vkc.com"
       }
   },
   "application_annotations": {
       "Funambol": {
             "jobTitle": "Director",
             "company": "ACME Ltd"
       }
   }
}
Fields in this are as follows:
CouchDB fields
_id
Unique document ID, provided by CouchDB (or you can choose it explicitly if you want)
_rev
Revision number for this document. Managed by CouchDB.
Contact schema fields
The contact schema is the list of fields that are stored for a contact. Since this is a shared schema, everyone can rely on it. Fields that aren't in this list can be stored by applications in application_annotations, if an application cares about extra stuff.
  • first_name (string)
  • last_name (string)
  • birth_date (string, "YYYY-MM-DD")
  • addresses (MergeableSet of "address" dictionaries)
    • city (string)
    • address1 (string)
    • address2 (string)
    • pobox (string)
    • state (string)
    • country (string)
    • postalcode (string)
    • description (string, e.g., "home")
  • email_addresses (MergeableSet of "emailaddress" dictionaries)
    • address (string)
    • description (string)
  • phone_numbers (MergeableSet of "phone number" dictionaries)
    • number (string)
    • description (string)
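As a rough illustration of the field list above, here's a small Python sketch that builds a contact document in this shape. It assumes a MergeableSet is simply a dictionary keyed by UUID strings, which is how the example document represents it; the helper names `mergeable_set` and `make_contact` are made up for this sketch.

```python
import uuid

# Placeholder record type URL, copied from the example document.
CONTACT_RECORD_TYPE = "http://example.com/contact-record"


def mergeable_set(items):
    """Key each dictionary by a fresh UUID string (assumed MergeableSet shape)."""
    return {str(uuid.uuid4()): dict(item) for item in items}


def make_contact(first_name, last_name, birth_date="",
                 addresses=(), email_addresses=(), phone_numbers=()):
    """Return a contact document using the schema fields listed above."""
    return {
        "record_type": CONTACT_RECORD_TYPE,
        "record_type_version": "1.0",
        "first_name": first_name,
        "last_name": last_name,
        "birth_date": birth_date,  # "YYYY-MM-DD"
        "addresses": mergeable_set(addresses),
        "email_addresses": mergeable_set(email_addresses),
        "phone_numbers": mergeable_set(phone_numbers),
        "application_annotations": {},
    }


contact = make_contact(
    "Joshua", "Molby", "1945-07-04",
    phone_numbers=[{"number": "+84 63 6220 9178", "description": "work"}],
)
print(sorted(contact))
```

The `_id` and `_rev` fields are left out on purpose, since CouchDB fills those in when the document is saved.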
Basic "record schema" fields
The record schema is the basic format we're talking about for storing any data in CouchDB; it's a couple of fields that are in every record that everyone can rely on.
record_type
A URL which is a unique identifier for this type of record. It would be good if that URL had a page at it describing the record schema, but (importantly) this is not a reference to some sort of JSON DTD or anything.
record_type_version
Version of this record type schema (so you can make updated versions if you want to make changes to field names, etc).
application_annotations
The application_annotations section of the document is where apps put their own data that isn't part of the schema. For example, Funambol knows about "company" for a contact, but the contact schema doesn't directly include that field. So Funambol stores it on the contact record in a Funambol-specific section, so it can happily get it back later. If it turns out that everyone's storing their own version of the same field, then that field is probably a good candidate for being in the schema (making this sort of change is what the record_type_version field is for :))
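A minimal sketch of how an application might use these record schema fields from Python: check `record_type` before touching a document, and keep its own extras under its own name in `application_annotations`. The function names here are hypothetical, not part of any library.

```python
CONTACT_RECORD_TYPE = "http://example.com/contact-record"  # placeholder URL from the example


def is_record_type(doc, record_type):
    """True if the document declares the given record_type URL."""
    return doc.get("record_type") == record_type


def set_annotation(doc, app_name, key, value):
    """Store an app-specific value without touching the shared schema fields."""
    annotations = doc.setdefault("application_annotations", {})
    annotations.setdefault(app_name, {})[key] = value
    return doc


def get_annotation(doc, app_name, key, default=None):
    """Read an app-specific value back, or default if it was never stored."""
    return doc.get("application_annotations", {}).get(app_name, {}).get(key, default)


contact = {"record_type": CONTACT_RECORD_TYPE, "record_type_version": "1.0",
           "first_name": "Joshua", "last_name": "Molby"}

if is_record_type(contact, CONTACT_RECORD_TYPE):
    set_annotation(contact, "Funambol", "company", "ACME Ltd")

print(get_annotation(contact, "Funambol", "company"))   # ACME Ltd
print(get_annotation(contact, "SomeOtherApp", "company"))  # None
```

Because each application writes only under its own key, two apps storing "company" don't trample each other, and it's easy to spot when the same field keeps turning up and should move into the schema.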
Quick script to drop contacts in this schema into a CouchDB database: createCouchContacts.py. Requires python-couchdb (and Couch, obviously).
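That script isn't reproduced here. As a rough idea of what storing a record involves underneath, this sketch uses only the Python standard library (not python-couchdb) to build the HTTP request that would create a document. The server URL and the "contacts" database name are assumptions, and nothing is actually sent.

```python
import json
import urllib.request


def build_create_request(server, database, doc):
    """Build (but do not send) a POST that asks CouchDB to store doc.

    POSTing to the database URL lets CouchDB choose the _id itself.
    """
    return urllib.request.Request(
        "{}/{}".format(server.rstrip("/"), database),
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


contact = {
    "record_type": "http://example.com/contact-record",
    "record_type_version": "1.0",
    "first_name": "Joshua",
    "last_name": "Molby",
}

# Hypothetical local server and database name.
req = build_create_request("http://localhost:5984", "contacts", contact)
print(req.get_method(), req.full_url)
```

Handing `req` to `urllib.request.urlopen` against a running CouchDB would store the document and return the `_id` and `_rev` that CouchDB assigned.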

Filmage

So, me & Bill are making a film. It’s going to be along the same sort of lines as http://www.bbsdocumentary.com/. The plan at the moment is to document how we came to be interested in all things Internet and computers in general, and then move on to document our momentous trip to http://www.har2009.nl. I'm under strict instructions that I'm not allowed to help with the soundtrack because my music tastes suck. The experience of thinking about a script gives you a new appreciation for the work that scriptwriters do, though. Just thinking of how it all fits together is hard. Me, personally, I'd like it to feel like a Top Gear film (it won't look like it, since they have the best camera work in the industry, but it might have the same sort of atmosphere, if we're really really good). Didn't I see something about a script-writing program for Linux somewhere? At the moment we're using a Google document...

Working with CouchDB

I've been working with CouchDB as a database in which applications can store their data; there's an increasing trend recently for applications to start using databases to store their data rather than flat files, and I personally think it's a jolly good trend. There have been rumours for years about the idea of a database-backed filesystem, where instead of pathnames you use queries and so on; that's never happened (and I'm not convinced it will ever happen), but individual apps can get most of the benefits of that by using a database to store their data. I like CouchDB for this because it has a number of advantages over simple databases like SQLite; for example, CouchDB does replication, meaning that I can get all my data on all my machines. This is a genuinely lovely property. It's been really interesting talking to the Tomboy team about how this approach ties in with what they're doing; we've been working on their Snowy server and talking about its API and how it should work, too. Cool times ahead for this stuff. I'm really excited!
