This is as days pass by, by Stuart Langridge. Here I write about many things. In the past I wrote about other things but the past is past. I write code for people to play with, I write about my life on Twitter, and I write here.

On I wrote Sanitising HTML, on the subject of CouchDB, JavaScript and the DOM, and JavaScript.

It was pointed out to me that comments on my old posts were showing as raw HTML (you know, a sort of <p>this is a comment</p> sort of thing). I knew this. It was like that because it occurred to me, about five minutes after releasing thort, the engine that now runs this place, that comment HTML was just displayed. Unsanitised.

Cross-site scripting, anyone? Oops.

So I just threw an "escape" filter into my comment template (which uses the great Trimpath JavaScript templating engine) so that I couldn't be brutally pwnt by anyone posting a comment.
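
For anyone wondering what an "escape" filter of that sort boils down to, here is a minimal sketch. The function name escapeHtml is made up for illustration and this is not Trimpath's actual implementation; it only shows the general idea of turning markup characters into entities so the browser displays them as text instead of interpreting them.

```javascript
// Hypothetical helper for illustration; not taken from Trimpath.
// Replaces the characters that HTML treats specially with entities.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;") // must run first so later entities are not re-escaped
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}
```

With this, escapeHtml("<p>this is a comment</p>") gives "&lt;p&gt;this is a comment&lt;/p&gt;", which is safe to display but is also exactly why the old comments showed up as raw HTML.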

Finally this evening I thought: I'd better do something about that. Two minutes of Googling brought me to Caja's HTML sanitizer, written in JavaScript. It was the work of but a moment to throw that into the CouchDB view that generates comments so that outputted comment HTML was sanitized. It was the work of but one more moment to also throw that into the client-side JavaScript that displays a posted comment. It's really nice being able to use exactly the same code on client and server.
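
A rough sketch of how sharing one piece of code between the two sides can look. Everything here is an assumption for illustration: the function names, the document fields (type, postId, body), and the way the sanitizer is passed in as a plain function. It is not the code that runs this site, and the real sanitizer would be whatever library is in use (Caja's, in this case).

```javascript
// Build a renderer around whichever sanitizer is in use. The sanitize
// argument is any function from an HTML string to a safe HTML string.
function makeCommentRenderer(sanitize) {
  return function renderComment(doc) {
    return '<div class="comment">' + sanitize(String(doc.body)) + "</div>";
  };
}

// Server side: builds a function shaped like a CouchDB map function.
// Inside CouchDB, emit is a global and the map function takes only doc;
// emit is a parameter here so the sketch runs anywhere.
function makeCommentMap(renderComment, emit) {
  return function (doc) {
    if (doc.type === "comment") {
      emit(doc.postId, renderComment(doc));
    }
  };
}

// Client side: render a just-posted comment with the same function, so it
// looks the same before and after a page refresh.
function renderPostedComment(renderComment, body) {
  return renderComment({ type: "comment", body: body });
}
```

Because both paths go through the same renderComment, a comment comes out identically whether it arrives from the view or is drawn straight after posting.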

mike

Why is this needed client side too?

Would browser w/o JS see the malicious HTML?

sil

It's not really needed client-side, tbh, but posting here uses client-side JS (you can't post a comment if you don't have scripting support; this was a deliberate design decision). Thus, it would be weird for posting to show the unescaped HTML but then refreshing the page (and getting your comment from the server) to show something different.

No'

I don't trust Couchapps... They look like a nice toy, but mixing data and applications messes with my brain for some reason. Maybe I'm a bit too "old-fashioned", but I tend to think that a database (even a document-oriented one) is meant to store data, not code. And by making the split between data and apps, I feel more secure.

Larry Hosken

Thanks for the link!

Derek Sorensen

Just a thought - and you might already be defending against this - but what if someone bypasses your client-side JavaScript (for example, by faking the form elsewhere) and their evil HTML is viewed by someone with JavaScript disabled, thus not getting the benefit of the JS sanitisation?

If you aren't also sanitising server-side, and preferably on input rather than output, then perhaps this decision is worth revisiting?

sil

Derek: the same code runs both server-side and client-side.

This website belongs to Stuart Langridge. Contact details are available. Don't eat yellow snow. Valid HTML5, at least in theory, except for the bits that aren't because I'm that futuristic that I'm ahead of the spec, oh yes. HTML5 help from Bruce Lawson, among others. Fonts from the superb FontSquirrel. End.