I like regular expressions (or regex), but part of their longevity is due to being developed separately from mainstream languages. Regular expressions are a way of using pattern matching, and the regular expression language uses standard keyboard symbols as special “meta” characters. This looks strange at first sight.
Writing regular expressions is a bit of a chore, and I’ve seen many developers suggest using Copilot to help them out. I don’t use JavaScript as an everyday language, but I’ve talked about regular expressions quite a bit, so it seemed natural to look at magic-regexp, a JavaScript package that builds regular expressions from more English-friendly expressions.
You can pretty much guess the reasons for using a package to express regular expressions:
- Replacing the pattern with methods and code makes it type safe. Regular expressions are already native to JavaScript as a concept.
- Precompiled methods should be more efficient than live interpretation.
- Theoretically, it’s easier to read.
- Regular expression patterns carry meta-rules (flags) at the end that guide behavior, which can be tricky.
So a package like this could add a valuable new arrow to your quiver.
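The meta-rules in that last point are the flags that follow the closing slash of a pattern. As a minimal sketch in plain JavaScript (no package involved, and the sample text is made up for illustration), this is roughly how the g and m flags change a result:

```javascript
// Two lines of sample text (hypothetical, not taken from the article).
const text = "Alpha beta\nGamma delta";

// No flags: ^ anchors only at the start of the whole string,
// and match() reports just the first hit.
const plain = text.match(/^[A-Z]/);

// "g" collects every match; "m" lets ^ anchor at the start of each line.
const flagged = text.match(/^[A-Z]/gm);

console.log(plain[0]); // A
console.log(flagged);  // [ 'A', 'G' ]
```

The pattern is identical in both calls; only the trailing flags differ, which is why they are easy to overlook.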
There seem to be a myriad of ways to start a JavaScript project, but here’s a quick rundown. I vaguely brought my local environment up to date by running various upgrade dances on the command line.
    Now using node v16.13.0 (npm v8.1.0)

After that:

    > touch testmagicregex.js
    > npm install magic-regexp
    > npm init -y
Let’s go back to our previous article on regular expressions and look at an expression that simply captures the first word of a sentence, along with its operation on a sample of Shakespearean text.
To make sure the test captures single-letter words and skips anything mid-sentence, I intentionally mangled Shakespeare (again) and added one line from Ms. Gaynor to extend the test text.
Like that guy in The Matrix, I’m happy to read the regex pattern directly, but what if I wanted to transcribe the above?
So how would we describe that operation over the phone?
“Find the first uppercase letter, then any number of lowercase letters. Apply it to the entire multi-line text.”
Looking at the usage of magic-regexp it looks like this:
“From the beginning” = at.lineStart()
“find uppercase” = letter.uppercase
“any number of lowercase letters” = oneOrMore(letter.lowercase).optionally()
“until space” = .and(whitespace)
Strictly speaking you want ‘*’ for zero or more, but you can use oneOrMore and add optionally().
This is the testmagicregex.js file containing the code:
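To check that “one or more, optionally” really behaves like ‘*’, here is a small sketch using native JavaScript patterns only. The sample text is invented for illustration, and the second pattern is a hand-written approximation of the idea, not the package’s output:

```javascript
// "Zero or more lowercase letters", written directly with *.
const viaStar = /^[A-Z][a-z]*/gm;

// The same idea as an optional "one or more" group.
const viaOptionalPlus = /^[A-Z](?:[a-z]+)?/gm;

// Hypothetical sample: a single-letter word, a normal word,
// and a line that starts in lowercase and should be skipped.
const sample = "A horse\nMy kingdom\nnot this line";

const starMatches = sample.match(viaStar);
const optionalMatches = sample.match(viaOptionalPlus);

console.log(starMatches);     // [ 'A', 'My' ]
console.log(optionalMatches); // [ 'A', 'My' ]
```

Both patterns accept the single-letter word because the lowercase run is allowed to be empty.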
    import { createRegExp, letter, oneOrMore } from "magic-regexp";

    // An uppercase letter at the start of a line,
    // optionally followed by a run of lowercase letters.
    const regExp = createRegExp(
      letter.uppercase
        .at.lineStart()
        .and(oneOrMore(letter.lowercase).optionally()),
      ["g", "m"]
    );
    console.log(regExp);
And this will return the regex on the command line:
    new stack> node testmagicregex.js
    /^[A-Z](?:(?:[a-z])+)?/gm
And it works.
But what is the extra cruft? The question mark and colon pair, “?:”, marks a non-capturing group. This means using parentheses purely to group things, as you normally would when you want to apply an operator to everything inside, without recording what was matched. As it happens, the default use of parentheses in regular expressions is a capture group.
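The difference is easy to see in plain JavaScript. The following is a minimal sketch with hand-written patterns and a made-up input string, comparing what match() returns with capturing and non-capturing parentheses:

```javascript
// Same grouping, once with capturing parentheses and once with (?: ... ).
const capturing = /^([A-Z])([a-z]+)?/;
const nonCapturing = /^(?:[A-Z])(?:[a-z]+)?/;

const word = "Hello world"; // hypothetical input

const withGroups = word.match(capturing);
const withoutGroups = word.match(nonCapturing);

// Capturing groups add one extra entry per group after the full match.
console.log(withGroups.length);            // 3
console.log(withGroups[1], withGroups[2]); // H ello

// Non-capturing groups leave only the full match.
console.log(withoutGroups.length);         // 1
console.log(withoutGroups[0]);             // Hello
```

Both patterns match the same text; the “?:” only stops the engine from storing the pieces.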
The magic-regexp package seems only partially successful in its purpose. You still have to think in regular expressions. (It reminds me of the 1982 Clint Eastwood movie Firefox, in which the main character has to steal a plane from the USSR, but to operate it he has to think in Russian.)
The Shakespeare examples are fine, but they are unlikely to be useful in a real JavaScript project. A more practical example is checking for a valid email address.
Again, how would you define a valid email format over the phone?
“Start with a name that can contain a dot, dash, or underscore as long as it starts and ends with a word character. It must be followed by an at sign (@), followed by two bunches of characters separated by a dot. Stop ringing me now!”
We all know there are dedicated packages for this, so I wouldn’t roll my own for production. But a quick one doesn’t hurt.
So from that conversation you will need at least:
- oneOrMore(wordChar)
- and(anyOf(".", "_", "-"))
- and(oneOrMore(wordChar))
- and(exactly("@"))
- and(oneOrMore(wordChar))
- and(exactly("."))
- and(oneOrMore(wordChar))
I added some validity tests using assert. The list above won’t do everything on its own, but after a little work, here’s what it looks like:
    import assert from "node:assert";
    import { createRegExp, exactly, wordChar, oneOrMore, anyOf } from "magic-regexp";

    const regExp = createRegExp(
      oneOrMore(wordChar)
        .at.lineStart()
        .and(anyOf(".", "_", "-").optionally())
        .and(oneOrMore(wordChar))
        .and(exactly("@"))
        .and(oneOrMore(wordChar))
        .and(exactly("-").optionally())
        .and(oneOrMore(wordChar))
        .and(exactly("."))
        .and(oneOrMore(wordChar).times.atLeast(2)),
      ["i"]
    );
    console.log(regExp);

    // Invalid and valid local parts (before the @).
    assert.equal(regExp.test("abc-@mail.com"), false);
    assert.equal(regExp.test("abc..def@mail.com"), false);
    assert.equal(regExp.test(".abc@mail.com"), false);
    assert.equal(regExp.test("abc#def@mail.com"), false);
    assert.equal(regExp.test("abc-d@mail.com"), true);
    assert.equal(regExp.test("abc.def@mail.com"), true);
    assert.equal(regExp.test("abc@mail.com"), true);
    assert.equal(regExp.test("abc_def@mail.com"), true);

    // Invalid and valid domains (after the @).
    assert.equal(regExp.test("abc.def@mail.c"), false);
    assert.equal(regExp.test("abc.def@mail#archive.com"), false);
    assert.equal(regExp.test("abc.def@mail"), false);
    assert.equal(regExp.test("abc.def@mail..com"), false);
    assert.equal(regExp.test("abc.def@mail.cc"), true);
    assert.equal(regExp.test("abc.def@mail-archive.com"), true);
    assert.equal(regExp.test("abc.def@mail.org"), true);
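For comparison, here is a rough sketch of the same check as a straight pattern in native JavaScript. The pattern is hand-written to mirror the chain above, so treat it as an assumption about the intent and not the exact expression the package generates. The emailPattern and cases names are introduced here for illustration only:

```javascript
// Hand-written approximation of the chain above (not the package's output):
// word chars, optional . _ or -, word chars, @, word chars, optional -,
// word chars, a literal dot, then at least two word chars.
const emailPattern = /^\w+[._-]?\w+@\w+-?\w+\.\w{2,}/i;

// A subset of the addresses tested above, with the expected verdicts.
const cases = [
  ["abc-@mail.com", false],
  ["abc..def@mail.com", false],
  [".abc@mail.com", false],
  ["abc.def@mail.c", false],
  ["abc.def@mail", false],
  ["abc-d@mail.com", true],
  ["abc_def@mail.com", true],
  ["abc.def@mail-archive.com", true],
];

// Compare the pattern's verdict with the expected one for each address.
const results = cases.map(([address, expected]) => ({
  address,
  ok: emailPattern.test(address) === expected,
}));

for (const { address, ok } of results) {
  console.log(address, ok ? "ok" : "MISMATCH");
}
```

Like the chained version, this sketch has no end-of-line anchor, so it only checks that an address starts in a valid way.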
So the final concern for professional developers is which one is easier to maintain. Is it a straight regex pattern, or the descriptive approach shown above? I feel that, on top of the quirks of regex, you now have to deal with the quirks of another package.
However, with a little more development, this could be a solid way to avoid staring at the Matrix.