Posts from April 2005.

Anatomy of a simple web application

I thought I’d put together some notes on how the Tory poster generator actually works, as an example of how I write a small web application. While it is small, a reasonable amount of thought went into its design, and those design principles are extendable (in a handwavy sort of sense) to larger and more complex web apps.

Introduction to the project

What the Generator does, for those of you who haven’t tried it, is create an image of a poster, by asking the user for some text to go on said poster and then creating a downloadable PNG with that text, in a particular font, superimposed over an existing image. The font is one of a selection of “handwriting” fonts which were available for free download; the user may choose any of these fonts, or one will be randomly selected.

Too many users spoil the broth

The key point I had in mind with this app is that image generation is pretty intensive. Since I had an idea in my head that the Generator might turn out to be popular, I was worried that it would overwhelm the machine that it’s on. The obvious way to build a web app like this would be to have the image generation happen “in process”; that is, the architecture looks like this:

Stage 1: request poster text from user; Stage 2: hang while generating image with PHP image library; Stage 3: Display page and newly created image

The problem with that idea, where you generate the image in PHP code, is that the page hangs while you’re waiting for the image to be generated. Moreover, if a thousand people all request an image at once, then a thousand image generation PHP pages all run at once. That’s certain death to the server, unless you’ve got a pretty studly server, which I haven’t. So, time for a more efficient approach.

A better way

Instead of doing the generation “in process”, I decided to do it “out of process”. So the bit the user sees and uses and the bit that actually generates the images would be entirely different processes. The image generation process would just run in the background on the server, generating one image at a time, and the front end would just hand “requests” for image generation to it, and then wait around until it was done. Something like this:

Stage 1: request poster text from user; Stage 2: Add a request for image generation to the queue and wait until it's done; Stage 3: Display page and newly created image by retrieving the image from the storage area. Meanwhile, the back end daemon reads requests from the queue and places complete images in the storage area.

Separating the processes makes each part quick and easy. The front end simply adds requests to the queue and then loads a “wait page”: this page refreshes itself every five seconds and, on load, checks whether its particular request has been completed, by looking for the completed image in the storage area. If it has, the page redirects to the final page, which shows the completed image. Meanwhile, the back end, or server process, or daemon (call it what you will) checks every five seconds to see if there is a new request in the queue. If there is, then it stops checking the queue and starts generating that request. When finished, it adds the completed image to the storage area and then resumes its every-five-seconds check.

This approach entirely solves the worry of too much traffic; if a thousand people all generate an image at once then a thousand image requests go into the queue (which is not intensive) and then the back end just processes them one by one. This leaves all the people near the end of our thousand waiting around a lot at Stage 2, but, critically, they’re not killing the server by doing so. Running a thousand image generating processes at once would probably leave everyone waiting nearly as long, and would, not incidentally, max out the server while it was doing it. Not good for anyone, that.
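To make the queue-and-daemon arrangement a bit more concrete, here is a rough sketch in Python. It is not the real posterd.py: it assumes the queue is just a directory of small text files and the storage area is another directory, and the names add_request, is_done, process_next and run_daemon, the .req/.png file naming, and the render callback are all made up for illustration. It also takes one small liberty: the loop only sleeps when the queue is empty, rather than after every image.

```python
import os
import time
import uuid

POLL_SECONDS = 5  # the "every five seconds" check described above


def add_request(queue_dir, text):
    """Front end: drop a request into the queue and return its id."""
    request_id = uuid.uuid4().hex  # hypothetical id scheme
    tmp_path = os.path.join(queue_dir, request_id + ".tmp")
    with open(tmp_path, "w", encoding="utf-8") as f:
        f.write(text)
    # Rename last, so the daemon never sees a half-written request.
    os.rename(tmp_path, os.path.join(queue_dir, request_id + ".req"))
    return request_id


def is_done(storage_dir, request_id):
    """Wait page: a request is complete once its image exists."""
    return os.path.exists(os.path.join(storage_dir, request_id + ".png"))


def process_next(queue_dir, storage_dir, render):
    """Back end: handle the oldest queued request, if there is one."""
    pending = [n for n in os.listdir(queue_dir) if n.endswith(".req")]
    if not pending:
        return False
    pending.sort(key=lambda n: os.path.getmtime(os.path.join(queue_dir, n)))
    name = pending[0]
    request_path = os.path.join(queue_dir, name)
    with open(request_path, encoding="utf-8") as f:
        text = f.read()
    image_bytes = render(text)  # the expensive step, done one at a time
    request_id = name[:-len(".req")]
    tmp_path = os.path.join(storage_dir, request_id + ".tmp")
    with open(tmp_path, "wb") as f:
        f.write(image_bytes)
    os.rename(tmp_path, os.path.join(storage_dir, request_id + ".png"))
    os.remove(request_path)
    return True


def run_daemon(queue_dir, storage_dir, render):
    """Loop forever: work while there are requests, otherwise sleep."""
    while True:
        if not process_next(queue_dir, storage_dir, render):
            time.sleep(POLL_SECONDS)
```

The point to notice is that add_request and is_done are cheap file operations, so a thousand simultaneous visitors only cost a thousand tiny files, while the expensive render call only ever runs once at a time inside the daemon.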

Actually building it

So, two separate processes: one web-based, one a background daemon. The web-based process is a very simple sequence of web pages; a good solution here would be PHP. The server process needs to have good image-creation capabilities; use whatever language you feel most comfortable writing real programs in. I chose Python, which regular readers will not be surprised to hear, and the Python Imaging Library.

The back end

The back end daemon’s simple operation is described above; I’m not going to go into much detail about how it actually uses PIL and some TrueType fonts to write the requested details onto the poster image. You can browse the source for the daemon file posterd.py if you’re interested in that.

The front end

Since I was using PHP, something I don’t do all that often, this seemed a good opportunity to check out the idea of using templates to make HTML pages by using the Smarty template engine for the project.

The pages

The front end is composed of three stages, as shown in the diagram; the stages correspond to three PHP files: index.php, waitpage.php, and display.php. The design is conceptually like that shown in the diagram, but in implementation it ended up slightly different, because of a technique I tend to use for multi-page processes on the web. Imagine the simplest of these processes: ask the user for some data in a form, save the data in a database, and then display a thank-you message. Some people would have a two-page process as follows: page1 contains the form, which submits to page2; page2 saves the data and then displays the message. I don’t do it like that; I have the form on page1 submit to page1, and page1 is structured as follows:

if data has been submitted
  save the data in the database
  redirect to page 2
  end page
end if

display the form

So, in actuality, the “add a request for image generation” part is done in page1, and page2 doesn’t do anything but go round and round the refresh loop until the daemon has completed the request.
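For those who would rather see that as code than pseudocode, here is a sketch of the two pages' logic. The real pages are PHP; this is Python purely for illustration, and the "text" form field, the "id" query parameter, the FORM_HTML markup and the enqueue/is_done callbacks are all assumptions rather than what index.php and waitpage.php actually contain. Each function returns a (status, headers, body) triple instead of talking to a web server.

```python
# Hypothetical markup for the form; the real page uses a Smarty template.
FORM_HTML = (
    '<form method="post" action="index.php">'
    '<input name="text"><input type="submit">'
    "</form>"
)


def index_page(form, enqueue):
    """Page 1: if data was submitted, queue it and redirect; else show the form."""
    text = form.get("text", "").strip()
    if text:
        request_id = enqueue(text)  # "add a request for image generation"
        return 302, {"Location": "waitpage.php?id=" + request_id}, ""
    return 200, {"Content-Type": "text/html"}, FORM_HTML


def wait_page(request_id, is_done):
    """Page 2: go round the refresh loop until the daemon has finished."""
    if is_done(request_id):
        return 302, {"Location": "display.php?id=" + request_id}, ""
    body = (
        '<html><head><meta http-equiv="refresh" content="5"></head>'
        "<body><p>Generating your poster...</p></body></html>"
    )
    return 200, {"Content-Type": "text/html"}, body
```

Because page1 redirects after queueing, reloading the wait page never queues the same request twice; it only re-checks whether the image has turned up yet.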

Smarty

Smarty made this process really, really easy. The way it works, for those of you not familiar with it, maps very neatly onto simple projects like this, because it helps the pages separate out. The actual PHP page that the user visits (take index.php, Stage 1, as an example) just contains page logic. It doesn’t contain any HTML at all. This means that, basically, it looks exactly like the pseudocode I outlined above; it’s about ten lines of code. The “display the form” bit actually reads $smarty->display('frontpage.tpl');, which picks up frontpage.tpl, a plain HTML file, and displays it. This means that your HTML template files, *.tpl, look like HTML, and aren’t cluttered with code. Meanwhile, your PHP files look like PHP, and aren’t cluttered up with HTML. This separation is fantastic. On more complicated projects it’s less easy, because you end up having to create little blocks of HTML in the PHP code ready to be substituted into the template, and that’s not good, but for a simple project like this it was a real boon to use.
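The same split between logic and markup can be shown outside PHP. The sketch below is not Smarty: it uses Python's standard string.Template as a stand-in, and the template text, the $font placeholder and the render_front_page function are invented for the example. The idea is just that the markup lives in one place with a hole in it, and the page logic picks a value and fills the hole.

```python
import html
import random
from string import Template

# Stands in for frontpage.tpl: markup only, with one placeholder to fill in.
FRONTPAGE_TPL = Template(
    "<html><body>\n"
    '<form method="post">\n'
    '<input name="text">\n'
    '<select name="font"><option>$font</option></select>\n'
    '<input type="submit">\n'
    "</form>\n"
    "</body></html>\n"
)


def render_front_page(fonts, choose=random.choice):
    """Page logic: pick a font, then hand it to the template. No HTML here."""
    font = choose(fonts)
    return FRONTPAGE_TPL.substitute(font=html.escape(font))
```

Calling render_front_page with a list of font names returns the finished page with one of them substituted in; the function itself never builds any markup.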

Conclusions

There’s lots more I could write about this, like how I make sure the daemon stays running, and how I clear out old generated images, and how index.php randomly picks a font and substitutes it into the template, and how everything to do with the queue (adding new requests, seeing if a request is completed, ad nauseam) is separated out into a small library that other PHP code (in index.php, waitpage.php, etc) can call, but I think I’ll stop here. The point is that tiny projects like my poster creator give you a chance to try out these techniques; it’s all too common to think that, well, this is a quick knock-off application, so I won’t bother to apply technique to it, I’ll just code from the hip without structure. But then when you come to do a real proper project, applying the techniques is complicated and awful and so you don’t do it then either, because you’re not really sure how. Hone your skills and your approach by doing all this complex stuff on projects that don’t really need it (for example, the Generator has a spec (albeit a pretty simple one)), so that when you do need it you’ll be comfortable and confident with the techniques.

Firefox countdown

Since everyone’s loving the Firefox counter and it was written in response to a request to do cool things with the counter data from FF themselves, we present the Firefox countdown to fifty million downloads. Very similar approach to the counter; this just works out how long there is to go and counts down for you. (It’s also got a bit of Ajax in there to make sure it doesn’t go too far off track, plus, well, you gotta have some Ajax or you’re uncool, I think :))
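The arithmetic behind a countdown like this is small enough to sketch. What follows is an illustration in Python rather than the countdown's actual script, and it assumes downloads arrive at a roughly steady rate between checks; the function names and the idea of re-estimating the rate from two samples are my assumptions about what "not going too far off track" might involve.

```python
def rate_from_samples(old_count, old_time, new_count, new_time):
    """Estimate downloads per second from two (count, timestamp) samples."""
    return (new_count - old_count) / (new_time - old_time)


def seconds_remaining(count, rate_per_second, target=50_000_000):
    """How long until the target is reached at the current rate."""
    if count >= target:
        return 0.0
    if rate_per_second <= 0:
        raise ValueError("rate must be positive to reach the target")
    return (target - count) / rate_per_second
```

The page can count down locally from seconds_remaining, and each time a fresh count comes back from the server it recomputes the rate and the time left, which keeps the local clock from drifting.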

At time of posting there are eight and a half hours until it hits fifty million. Wow.

The book update, with a poster

DHTML Utopia, my book, which is going to be excellent by the way, went to print yesterday. So you, my beloved audience, should be able to buy it from SitePoint in a few weeks! Plus, if you buy it from SP you get this really cool A2 poster:
DOM and JavaScript Quick Reference Guide poster, with sections on the Document Object Model and lots of other stuff that you deliberately can't see in the image

So excited. So excited.

IE7 User Agent String published

The IE7 hackers (the ones at Microsoft hacking on the real IE7, that is, not Dean Edwards :) ) have revealed that IE7’s user agent string will be "Mozilla/4.0 (compatible; MSIE 7.0b; Windows NT 6.0)".

Those of you who are thinking, so? who cares about user agent … [Stuart Langridge]

Syndicated from the SitePoint Stylish Scripting weblog

-----

GreaseMonkey compiler

Adrian Holovaty is a glorious genius, and has written a GreaseMonkey compiler that takes GreaseMonkey user scripts and makes them into a Firefox extension that doesn’t require GreaseMonkey. This is brilliant—it means that people can prototype extensions as GM scripts and then compile them into an extension for distribution. Bloody well done, Adrian.

-----

Linked to by the Times

The Conservative poster generator goes from strength to strength, it seems: now The Times are feeling the love. Nice.
Screenshot of the Times coverage

Been busy

I haven’t written much recently, because I’ve been busy. Work is pretty hectic, and I’ve been trying to get some hacking done; I am painfully aware that I haven’t hacked on anything and released it for a little while now, and I don’t like that.

Unfortunately, the main thing I’ve been hacking on is an attempt to make my Zaurus be a media player (and not a PIM), and it hasn’t worked. After lots, and lots, and lots of trying, I have reluctantly come to the conclusion that it is not possible to write an application on OpenZaurus 3.5.3 which is the only app running on the Zaurus (or provides the only UI) and is in Python (because that’s what I can write code in). Note that I do not mean “an app running inside Opie or inside GPE”, I mean “it is the app running and that’s it; no desktop environment”. There seem to be three potential approaches:

  • python-qt. To be the only app, you have to get launched instead of Opie. This requires python-quicklauncher to work, and it just doesn’t work at all.
  • python-gtk. This involves starting an X server and then running your PyGtk app, so it’s basically the only thing running. This doesn’t work either; I can create a GtkWindow, but if I put any widget in that window, python-gtk segfaults on window.show().
  • pygame. Doesn’t work at all; pygame has been patched to do something with the qt libraries which isn’t required any more, so it fails on a symbol import.

I’m really disappointed in this. I don’t want to rag on the OZ team too much, because they do a tough job (I’ve done a little work on the OZ website in order to make it a bit clearer, but I’m not part of the project) but, as far as I can tell, the packages aren’t tested much; pygame fails on “import pygame”, which means that it went into the feed broken and wasn’t tested. I’m not sure how to get around this without having unit tests for every package, mind, but when the alternative is “compile a new version myself”, involving setting up a cross-compiling toolchain, I’ve just knocked it on the head.

I have lots of other projects on the go, though. Hopefully some of them will happen, although more of them are involved with the Linux end of the world rather than the web end of the world, which is probably at odds with the relative proportions of people reading this, most of whom (I suspect) will be web people rather than Linux people.

However, whichever you are, you should be attending LugRadio Live 2005 because it’s going to be superb. There’s stuff for Linux people and web people and everyone in between. Go and look, you know it makes sense.

Book time approaches

The book goes to print very soon indeed—next week. This means that you should be able to buy it about three weeks from now! Your way to DHTML Utopia becomes clearer :-)

A little HTML test

Below lies the code for an HTML page. See if you can tell me why the form submits fine in Firefox, but the submit button doesn’t do anything in Internet Explorer.


<html>
<head>
<title>Test page</title>
</head>
<body>
<form>
<textarea>
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
</textarea>
<textarea>
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
1234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890
</textarea>
<input type="submit">
</form>
</body>
</html>

The answer is not “it’s a bug in IE! IE sux0rs!”. I’m not clear what correct behaviour here is, but IE is defensibly doing the right thing, at least possibly. Your thoughts? And I’ll give you the answer later.

Google maps in the UK!

Woo! Google Maps UK! Now I can do all this cool overlay shit that everyone else is doing!

Ten good practices for writing JavaScript in 2005

Bobby van der Sluis has put together a guide to Ten Good Practices for Writing Javascript in 2005. I suspect that most of my readers here will already know that we should be doing this stuff: Bobby talks of making your pages accessible using unobtrusive Javascript, writing scrip … [Stuart Langridge]

Syndicated from the SitePoint Stylish Scripting weblog

-----

Making a dropdown with DOM scripting

Aaron Gustafson is putting together a series of articles on how to make select elements stylable. In essence, what his work does is take the select out of the DOM and replace it with a ul, and then add script and CSS to make that ul work like a dropdown list. The advantage this … [Stuart Langridge]

Syndicated from the SitePoint Stylish Scripting weblog

-----

Clearing out the postbox

Lots of posts circling above Heathrow, none of which I’m going to have time to write, so: a linkdump. Bah. Wish I wasn’t so busy.

  • On Call Bald does a short series on VoIP: Part 1 Part 2 Part 3. He and I had a play with SIP last night; the UI for all SIP clients is shite. I’d like to fix it; I thought about hacking on Shtoom but I can’t even get that to do a call. KPhone was successful, but I’m a much better Python hacker, hence Shtoom. Maybe when it can connect properly I’ll start hacking on it.
  • Andy Budd pimps the SonyEricsson K750i as a replacement for the K700i that both he and I own. I think I want one of these.
  • Hixie bitches about Ajax and other names like “DHTML”, when you could just say “HTML + script”. I think he’s wrong, partially for marketing reasons which Angry Matt is better at explaining than I, and partially because “Ajax” doesn’t just mean the technologies, but also the mental structure and approach. But this argument has played out all over the place already.
  • Glom looks pretty cool.
  • Jonathan Riddell’s KDE4 wishlist. I’ve said before (on LugRadio) that I think that Jonathan seems to see the same issues with KDE that I do, and he’s trying to fix them. That’s really good.
  • If you have a physical copy of a program you can legally modify it, according to a recent court decision (in the US). Good news.

The BBC Creative Archive is nearly here

Auntie Beeb has launched the Creative Archive Licence to cover the Creative Archive when it happens.

They’re doing it! Yay!

No DRM. The Creative Archive licence is permissive; it allows remixes, sharing, and so on. It’s very Creative-Commons-ish, with a couple of exceptions: there’s a “no endorsement and no derogatory use” clause, and a “UK only” bit. I can see the point on the no endorsement part; no derogatory use is also fine with me, assuming that “things the Beeb don’t agree with” doesn’t count as “derogatory”, but they’ve said “treat others and their work in the way that you’d expect them to treat you and your work…with respect!”, so it doesn’t look like it.

The UK-only thing…well, tough shit on the rest of the world, really. When the Yanks who will inevitably complain about this start getting my town into Google Maps, I’ll start giving a toss. The BBC is ours. It’s one of the greatest things there is about this little island I live on (the NHS is another), and we pay for it. So if it’s limited to us…unlucky the world, I say. Ha! I love the Beeb. Love it.

They have a project timetable which illustrates what will happen and when. During the pilot, they intend to release 100 hours of television and music. Channel 4 and the BFI are also involved.

It’s a bit worrying that it says “this site will keep you informed of where we have got to and how we are addressing the complexities of clearing copyright, digitising and making available some of the great factual material stored in the BBC’s archive”, mind. What about all the non-factual material? I hope it doesn’t just become a way to download worthy documentaries for free but none of the good stuff.

So, gang, when it comes out, grab that content and (this is the important bit) do something cool with it, so the BBC Board of Governors see that people aren’t just looking to leech free stuff, but instead want it because it makes their lives better. I’d like to incorporate some of the music into LugRadio but I think that our CC licence won’t permit it, since we have a No Derivative Works clause. Maybe we should look at changing that? LugRadio listeners can vote on whether we should change or not, although we don’t promise to actually do it.

Update: it occurred to me on the train that Top Gear is a “factual programme”. Wicked.

The chattering classes

Blimey. My Tory poster creator was linked to by the Observer (well, the Observer blog):

Am I now officially middle class?

Conservative posters

If you’re in the UK you’ll doubtless have seen the Conservative Party’s “are you thinking what we’re thinking” poster campaign. Did you, when you looked at those posters, think: I wish I could change some of the text on those, to show the Tories what I think of them and their ideas?

Well, now you can, a little bit. Get your own Conservative poster.

Rawdog

Joey Hess mentions the Rawdog aggregator, which is for setting up Planet-style sites. I normally use spycyroll for this (that’s what Planet Wolves uses, for example) because the idea of requiring a database to do this is just brain dead. It is. Don’t deny it. However, I think I might start using rawdog for that purpose in the future. First, it looks maintained. Secondly, Adam Sampson, the hacker who built it, wrote a plugin for it to use Vellum templates! Cor. Every now and again I see Vellum pop up somewhere; I think its plugin design might have been influenced by Vellum’s, too, which is really cool. Nice work, Adam, anyway.

Hotkeys

The hotkeys program is cool. apt-get install hotkeys.

That is all.

Vote for Jono

Jono’s band, Seraphidian, have been selected in the top ten unsigned bands by Kerrang magazine. As part of the Snickers Unsigned competition, the public (that’s you) votes for their favourite of those ten bands. The winner plays at the Download metal festival (which used to be called Donington Monsters of Rock), gets a single released, gets a video done, and is basically in a really good position to get signed and become True Gods Of Metal.
Vote for Jono and Seraphidian. You know it makes sense. The voting closes on Monday.

The 22 Immutable Laws of Marketing

On LugRadio we’ve talked quite a bit about marketing, what it is, and how it applies to open-source software. Eric Sink has put together a set of commentaries on the 22 immutable laws of marketing which is a most interesting read, and Dave Neary notes the application of this work to Gnome.

Linux Desktop Hacks

The bearded Bacon bastard gets his book out before I get mine out, damn him.
Nonetheless, you should go and buy it. It’s called Linux Desktop Hacks and I am in it too—at least, a hack that I wrote is.

Nice one Jono!

-----

Full-screen video conferencing

What software is there that allows you to do video conferences with multiple people and can show the people as just the images on a black screen, with the whole screen used? I don’t want a little window (or multiple little windows) on my desktop; instead, I want something like my mockup of tilted pictures from the other day (although don’t worry about the tilting or anything like that). The point of this is: while you’re in a video conference, it’s all you’re doing, so it’s fullscreen. It’s not there so you can see someone’s face while you still write stuff in Word. Can any video conferencing software go fullscreen? Important point: I don’t care about text chat, I don’t care about audio. Just video. The audio will be transmitted by another method. Yes, yes, I know iChat in Tiger can do it, but that doesn’t help me; I’m looking at this for work, and work is all Windows boxes (so no GnomeMeeting either). It should be Free Software too, ideally. Trillian does video chat, but it needs to be the Pro version, which is a bit expensive. I’m looking for suggestions…

Spatial AbiWord

All the usual All Fools Day jokes today, blah blah blah. Except for Spatial AbiWord. Problem with that is: it sounds excellent. Wish they’d actually do it.

-----