Using RegEx Validation

The Participants Database plugin offers a few ways to validate the data that is submitted when a record is created or updated on the frontend. Mostly, validation is a simple yes or no (the field is either required or it isn’t), but sometimes you need more control than that. This is where what’s known as a “regex” comes in.

If you just need to validate a single value, the simplest regex is just the word with the delimiters: /yes/ to validate on the word “yes.”

A point of primary importance I want to make before we start with the regex is that you must tell your users what you expect. Use the “help” field to let them know what will work. It is very frustrating to users if they have to guess what input will satisfy the validator!

Regex

Regex is a shortened name for “Regular Expression.” Regexes have been around a long time, but using them is still a fairly arcane art (meaning not many people know how to really use them) despite how widespread their use is. I will barely be scratching the surface in this short article because the subject is very broad. I will supply a few helpful links at the end of this article in case you need more explanation.

Mostly, I want to give you just enough explanation to use regexes in this plugin.

Let’s say you have a form where you are collecting a user’s phone number, and you want to make sure they enter it in the right format. This is a very simple task for a regex, but it will get us started on explaining how they work and how to create one. The phone number must be like this example: 012-224-2334 x214. The extension is optional, and can be 1-4 numerals long.

Delimiters

First, every regex must have delimiters. A delimiter is a character that tells the regex interpreter where the regex begins and ends. The requirements are:

  • it must be the first character
  • it can be any non-alphanumeric, non-backslash, non-whitespace character
  • it must not be used in the expression (unless escaped)
  • it must be the last character
  • common choices are: / # |

In the example above, /yes/, the / is the delimiter. The regex #yes# will work exactly the same.
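
As a rough illustration of those rules, here is a small Python sketch that checks whether a string is wrapped in a valid pair of delimiters and returns the expression between them. Python is used only for illustration (the plugin itself runs on PHP), the function name split_delimiters is made up for this example, and it deliberately ignores PHP’s bracket-style delimiters and any modifiers after the closing delimiter.

```python
def split_delimiters(pattern: str) -> str:
    """Return the expression inside a delimited regex such as /yes/.

    Hypothetical helper for illustration only; it covers the simple
    case where the same character opens and closes the regex.
    """
    if len(pattern) < 2:
        raise ValueError("a delimited regex needs at least two characters")
    delimiter = pattern[0]
    # The delimiter cannot be alphanumeric, a backslash, or whitespace.
    if delimiter.isalnum() or delimiter == "\\" or delimiter.isspace():
        raise ValueError(f"{delimiter!r} cannot be used as a delimiter")
    # The same character must also be the last character.
    if pattern[-1] != delimiter:
        raise ValueError("the regex must end with its delimiter")
    body = pattern[1:-1]
    # Inside the expression, the delimiter must be escaped with a backslash.
    i = 0
    while i < len(body):
        if body[i] == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if body[i] == delimiter:
            raise ValueError("unescaped delimiter inside the expression")
        i += 1
    return body


print(split_delimiters("/yes/"))
print(split_delimiters("#yes#"))
```

Both calls print the same inner expression, which is why /yes/ and #yes# behave identically.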

What kind of character?

One of the main features of regular expressions is something called “character classes,” which is a way of defining which characters you want to look at. There are several ways to represent character classes, but for our purposes, I will use the bracket form, which I think is easiest for beginners. If you need a character to be an alphanumeric, the bracket form for that is [a-zA-Z0-9]. This means: lowercase characters from a through z, uppercase A through Z, and numerals from 0 through 9. If you need it to be a numeral, you can use [0-9].
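
To see the two bracket classes in action, here is a minimal Python sketch. Python is only a stand-in for illustration (the plugin uses PHP’s regex engine), Python patterns are written without delimiters, and the variable names are made up for the example.

```python
import re

# Bracket-form character classes from the text above.
alphanumeric = re.compile(r"[a-zA-Z0-9]")
numeral = re.compile(r"[0-9]")

# Check a handful of single characters against each class.
for ch in ["k", "Q", "7", "-", " "]:
    is_alnum = alphanumeric.fullmatch(ch) is not None
    is_numeral = numeral.fullmatch(ch) is not None
    print(repr(ch), "alphanumeric:", is_alnum, "numeral:", is_numeral)
```

Letters and numerals satisfy the first class, only numerals satisfy the second, and the dash and the space satisfy neither.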

How many characters?

Once you’ve specified what kind of character you’re looking for, you can specify how many of them are acceptable. You can say 0 or more with an * or you can say 1 or more with a +. For our purposes, we need to say exactly how many are wanted, and for that you use a numeric range in braces: {}. This can be used to specify a range of acceptable numbers of characters, for instance {1,3} for a minimum of one and a maximum of 3 characters. If you want a specific number of characters, you just use the one number: {3}
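
The quantifiers can be compared side by side with a short Python sketch. As before, Python is used purely for illustration, the patterns are written without delimiters, and the helper name matches is an invention for this example.

```python
import re


def matches(pattern: str, text: str) -> bool:
    """Hypothetical helper: True when the whole text fits the pattern."""
    return re.fullmatch(pattern, text) is not None


# * allows zero or more numerals, so an empty string is accepted.
print(matches(r"[0-9]*", ""))
# + needs at least one numeral, so an empty string is rejected.
print(matches(r"[0-9]+", ""))
# {1,3} accepts one to three numerals.
print(matches(r"[0-9]{1,3}", "12"))
print(matches(r"[0-9]{1,3}", "1234"))
# {3} accepts exactly three numerals.
print(matches(r"[0-9]{3}", "123"))
print(matches(r"[0-9]{3}", "12"))
```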

Literal characters

If you are looking for a specific character in the data, you just use that character in the expression. If it happens to be a “metacharacter,” which is any of $ ^ * . + | \ ? ( ) { } [ ] then you’ll need to escape it with a backslash so it will be treated as a plain character rather than as having an operational meaning in the expression.
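
Here is a small Python sketch of why escaping matters, using a made-up price value as the data. Python stands in for the plugin’s PHP engine here, and the patterns are written without delimiters. The unescaped dot means “any character,” so it lets bad input through; the escaped version only accepts a real dot.

```python
import re

# Unescaped: the dot matches any character, not just a literal dot.
loose = re.compile(r"5.00")
# Escaped: the backslashes make the dollar sign and the dot plain characters.
strict = re.compile(r"\$5\.00")

print(loose.fullmatch("5x00") is not None)    # the stray "x" slips through
print(strict.fullmatch("$5.00") is not None)  # the literal text matches
print(strict.fullmatch("$5x00") is not None)  # the stray "x" is rejected

# Python can add the backslashes for you.
print(re.escape("$5.00"))
```

PHP has a comparable function, preg_quote, if you ever need to build an expression from literal text in code.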

Putting Our Example Together

These are the four main ingredients of our regular expression, so having explained those, we can start assembling it. This is a simple example. First, we have what must be a group of three numerals, so we use this:

[0-9]{3}

That means “3 and only 3 characters in the range 0-9.” After that, we need to see a dash, which will be in there as a literal character. Following that, we have another group of three, etc. You get the idea:

[0-9]{3}-[0-9]{3}-[0-9]{4}

Now, we have the extension, and this part is optional. We do that by creating a “capturing group,” which is set off with parentheses. We follow the group with a question mark, indicating that the whole group is optional.

( ?x[0-9]{1,4})?

At the beginning of the group, we have a space followed by ?x, which is how we match the “x” character while allowing for a space in front of it. The space and the x are literals, and the question mark says the space is optional.
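
The optional group can be tried on its own with a short Python sketch. Python is only used for illustration (the plugin runs on PHP), the pattern is written without delimiters, and fullmatch is used so the whole test string has to fit the fragment.

```python
import re

# The optional extension group by itself.
extension = re.compile(r"( ?x[0-9]{1,4})?")

samples = [
    "",         # no extension at all: fine, the group is optional
    " x214",    # space, x, three numerals
    "x9",       # the space is optional too
    " x12345",  # five numerals is one too many
    " X214",    # capital X is not the literal x
]

for text in samples:
    print(repr(text), extension.fullmatch(text) is not None)
```

The first three samples are accepted and the last two are rejected.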

We’re almost complete, but there is one more piece needed that I didn’t mention: anchors.

Anchoring the expression

Anchors serve to anchor the expression in the input string. There are two anchor characters: ^ and $. The caret is used to anchor the expression to the beginning of the incoming string (the data you’re validating) so that nothing squeaks by in “front” of the expression. The dollar sign does the same for the end of the expression. This is needed because without anchors, the expression will match any matching string in the data and basically ignore any characters outside of the match. If you want to make sure there is nothing outside of the matched string in the data, you must use anchors.
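
The difference anchors make is easy to see with the simple “yes” example from earlier. This is a minimal Python sketch for illustration only (the plugin uses PHP, and Python patterns are written without delimiters); search is used because it looks for the pattern anywhere in the string, which is the behavior the anchors are meant to rein in.

```python
import re

unanchored = re.compile(r"yes")
anchored = re.compile(r"^yes$")

for text in ["yes", "yesterday", "oh yes!"]:
    found_loose = unanchored.search(text) is not None
    found_strict = anchored.search(text) is not None
    print(repr(text), "unanchored:", found_loose, "anchored:", found_strict)
```

The unanchored expression accepts all three inputs, while the anchored one accepts only the exact word.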

Here is the complete expression, as you would enter it into the validation field:

/^[0-9]{3}-[0-9]{3}-[0-9]{4}( ?x[0-9]{1,4})?$/

Note it includes our delimiters and anchors.

Giving it a test

It is very important to test regular expressions because they can be tricky, difficult to understand, and have unintended effects. You must test them with a lot of different kinds of inputs to make sure they hold up. I like to use an online code tester. There are several to choose from, but this is the best one I’ve found: Online Regex Tester
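
If you prefer to test in code, a table of inputs and expected results works well. The following is a sketch in Python, which is not what the plugin runs (that is PHP), so treat it as a quick sanity check rather than a substitute for testing in the form itself. The pattern is the phone number expression without its delimiters, and the test inputs are made up for the example.

```python
import re

# The complete phone number expression, minus the / delimiters.
PHONE = re.compile(r"^[0-9]{3}-[0-9]{3}-[0-9]{4}( ?x[0-9]{1,4})?$")

# Each input is paired with whether it should validate.
cases = {
    "012-224-2334": True,
    "012-224-2334 x214": True,
    "012-224-2334x214": True,      # the space before x is optional
    "012-224-2334 x": False,       # extension needs at least one numeral
    "012-224-2334 x12345": False,  # extension is too long
    "012-224-2334 X214": False,    # capital X is not accepted
    "0122242334": False,           # dashes are required
    "call 012-224-2334": False,    # anchors reject extra text
}

for text, expected in cases.items():
    result = PHONE.search(text) is not None
    print(f"{text!r:25} expected={expected} got={result}")
    assert result == expected
```

Adding a new line to the table each time you think of an odd input is a cheap way to keep the expression honest.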

User interface issues

It’s very important to understand how the user is going to interact with your form, especially considering they may not understand what you’re looking for as well as you do. It’s always a good idea to tell your users what is expected, especially if you are validating the input like we are doing here.

Another thing to consider: the expression we put together here is not very forgiving of slight differences. It wouldn’t validate a capital “X,” for instance. If you can accept a wider range of inputs, you should craft your regex to allow it because that will make it easier for your users to complete the form. Users dislike being told the form is not completed correctly, so do what you can to help them get it right the first time. If the user can’t figure out how to enter the data correctly, they will blame the website…and they will be correct: it is the site designer’s job to make user interactions work smoothly.
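
As one possible way to loosen the expression, the literal x can be replaced with the character class [xX] so that either case is accepted. This is just a sketch of that single change, shown in Python for illustration (the plugin runs on PHP, and the patterns here are written without delimiters); what else you choose to allow depends on your own data.

```python
import re

# The original expression: only a lowercase x is accepted.
strict = re.compile(r"^[0-9]{3}-[0-9]{3}-[0-9]{4}( ?x[0-9]{1,4})?$")
# A more forgiving variant: [xX] accepts either case.
forgiving = re.compile(r"^[0-9]{3}-[0-9]{3}-[0-9]{4}( ?[xX][0-9]{1,4})?$")

for text in ["012-224-2334 x214", "012-224-2334 X214"]:
    print(
        repr(text),
        "strict:", strict.search(text) is not None,
        "forgiving:", forgiving.search(text) is not None,
    )
```

Entered into the validation field with delimiters, the forgiving variant would be /^[0-9]{3}-[0-9]{3}-[0-9]{4}( ?[xX][0-9]{1,4})?$/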

Learning more about regexes

The following links offer solid help and reference information on regexes. It is not a simple subject, so it will take some study to understand them well enough to make your own, but you’ll see plenty of examples to get you started. There are several programmers’ forums where practical help with specific regexes can be found. If you do post a question on such a board, include as much specific information as you can. Make it easy for people to answer and they’ll be more likely to help.

  • regular-expressions.info great tutorials, extensive site on regexes
  • PHP.net regex reference
  • Online Regex Tester this is an awesome way to test and learn to use regexes
  • PHP Live Regex – great for testing PHP-specific regex functions, also lets you test several possibilities at once
  • Stack Overflow great programmer’s forum where you can ask questions and get answers from people who know their stuff

4 thoughts on “Using RegEx Validation”

  1. Hello

    Is there any way to set a “regex” field as “not required”?

    Greetings

    1. Take a look at this article:

      Regexes with a Blank Alternative

      1. The main goal that i need is to don’t show the “required” message next to the label.

        Greetings

        1. Fran, you can do this with a CSS rule that hides it. There isn’t any way to prevent it from showing for a required field.
