Bridge to engine room, geek factor nine. Those of you who read this for
musings about the world, stop reading now. Alex at work has just alerted
me to the existence of non-capturing groups in regular expressions. I
had no idea these existed, and they're pretty useful if you're doing RE
matching. If you're trying to match a string which might be "fish, chips
and ketchup", might be "fish, chips and peas", and might not contain the
", chips" at all, and what you care about is what's last on the list
(the "peas" or "ketchup") then I'd have used a regexp like
/fish(, chips)? and (.*)$/. Matching that against "fish, chips and
peas" will give you back a three-item tuple,
("fish, chips and peas", ", chips", "peas"). You need the brackets
around ", chips" in the regexp
because you want to treat it as a group. However, it ends up in the
results, and that's really irritating. Now I know about non-capturing
groups, I'd do this: /fish(?:, chips)? and (.*)$/. The ?: after the
opening bracket of the group means "don't capture this group". So now
the results you get back are ("fish, chips and peas", "peas") -- the
chips, which we don't care about, are not mentioned! Another useful
little trick to add to my toolbox.
Cheers, Alex. Everyone who is reading this and thinking "I knew about
this ages ago", why didn't you tell me?
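
For anyone who wants to try both versions side by side, here is a minimal sketch in JavaScript. The two regexps are the ones from above; the variable names and the third sample string ("fish and peas", the no-chips case) are my own additions for illustration.

```javascript
// Capturing group: the optional ", chips" occupies slot 1 of the result.
const capturing = /fish(, chips)? and (.*)$/;
// Non-capturing group: (?:...) still groups, but is left out of the result.
const nonCapturing = /fish(?:, chips)? and (.*)$/;

// The third sample is an assumed example of the "no chips at all" case.
const samples = [
  "fish, chips and peas",
  "fish, chips and ketchup",
  "fish and peas",
];

// Array.from copies just the indexed entries of each match result,
// dropping the extra properties (index, input) that match() attaches.
const withCapture = samples.map((s) => Array.from(s.match(capturing)));
const withoutCapture = samples.map((s) => Array.from(s.match(nonCapturing)));

console.log(withCapture[0]);    // ["fish, chips and peas", ", chips", "peas"]
console.log(withoutCapture[0]); // ["fish, chips and peas", "peas"]
console.log(withCapture[2]);    // ["fish and peas", undefined, "peas"]
console.log(withoutCapture[2]); // ["fish and peas", "peas"]
```

Note the third line of output: when the chips are missing, the capturing version still reserves a slot for them and fills it with undefined, so the thing you care about is always at index 2. With the non-capturing version it is always at index 1 and there is nothing else to skip over.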
Non-capturing groups in a regexp