Non-capturing groups in a regexp

Bridge to engine room, geek factor nine. Those of you who read this for musings about the world, stop reading now.

Alex at work has just alerted me to the existence of non-capturing groups in regular expressions. I had no idea these existed, and they’re pretty useful if you’re doing RE matching.

Say you’re trying to match a string which might be “fish, chips and ketchup”, might be “fish, chips and peas”, and might not contain the “, chips” at all, and what you care about is what’s last on the list (the “peas” or “ketchup”). I’d have used a regexp like /fish(, chips)? and (.*)$/. Matching that against “fish, chips and peas” will give you back a three-item tuple, ("fish, chips and peas", ", chips", "peas"). You need the brackets around “, chips” in the regexp because you want to treat it as a group, so that the ? makes the whole thing optional. However, it ends up in the results, and that’s really irritating.

Now I know about non-capturing groups, I’d do this: /fish(?:, chips)? and (.*)$/. The ?: after the opening bracket of the group means “don’t capture this group”. So now the results you get back are ("fish, chips and peas", "peas"); the chips, which we don’t care about, are not mentioned!

Another useful little trick to add to my toolbox. Cheers, Alex. Everyone who is reading this and thinking “I knew about this ages ago”, why didn’t you tell me?
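For anyone who wants to try the two regexps side by side, here is a minimal JavaScript sketch. The variable names and the extra “fish and ketchup” test string (with no chips at all) are my own additions for illustration, not anything from Alex.

```javascript
// Capturing group: the optional ", chips" gets its own slot in the results.
const capturing = /fish(, chips)? and (.*)$/;
// Non-capturing group: (?:...) groups for the ? but is left out of the results.
const nonCapturing = /fish(?:, chips)? and (.*)$/;

const withChips = "fish, chips and peas";
const withoutChips = "fish and ketchup"; // hypothetical extra test string

// Array.from keeps just the indexed entries of the match result
// (whole match first, then one entry per capturing group).
const capturedWith = Array.from(withChips.match(capturing));
const nonCapturedWith = Array.from(withChips.match(nonCapturing));
const capturedWithout = Array.from(withoutChips.match(capturing));
const nonCapturedWithout = Array.from(withoutChips.match(nonCapturing));

console.log(capturedWith);       // [ 'fish, chips and peas', ', chips', 'peas' ]
console.log(nonCapturedWith);    // [ 'fish, chips and peas', 'peas' ]
console.log(capturedWithout);    // [ 'fish and ketchup', undefined, 'ketchup' ]
console.log(nonCapturedWithout); // [ 'fish and ketchup', 'ketchup' ]
```

With the capturing version, the position of the thing you care about stays at index 2 but you have to step over a slot that is either “, chips” or undefined. With the non-capturing version it is always at index 1, whether or not the chips were there.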

