Best way to map preg_match() to an HTML5 input pattern – PHP – SitePoint Forums




I’m trying to validate a password field like this:

It must contain at least 1 uppercase and 1 lowercase letter, at least 1 number, and at least 1 character from a limited set of special characters. It must also be between 8 and 30 characters long.

I want to do this both client side for speed and good user experience (HTML5) and server side for increased security (PHP).

The following two work, but I’m not sure why they are so different or if they can be improved/streamlined/fixed.

Client-side HTML5 = pattern="^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[!@#$%^&*_]).{8,30}$"

Server-side PHP = if(!preg_match('/^(?=.*\d)(?=.*[A-Z])(?=.*[a-z])(?=.*[!@#$%_])[0-9A-Za-z!@#$%_]{8,30}$/', $string)) {

Well, I’m not an expert in regular expressions, but it seems to me that on the server side the last section of the regular expression, [0-9A-Za-z!@#$%_] (just before the length limits), should be redundant, yet it seems to be required by PHP and not by the HTML5 pattern. Without it, preg_match() fails: there is no error message, but it does not validate correctly.

Also, the server side needs no dot (.) before the length restriction {8,30}, whereas the HTML5 version uses .{8,30}

I just need to, well, make these two better :smiley:

Starting with the HTML pattern

"^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[!@#$%^&*_]).{8,30}$"

Breakdown
^
beginning of the line

(?=.*[a-z])
?= is a lookahead; .* matches any characters, up to and including one lowercase letter [a-z].

(?=.*[A-Z])
lookahead that matches any characters up to and including one uppercase letter [A-Z].

(?=.*[0-9])
lookahead that matches any characters up to and including one digit [0-9].

(?=.*[!@#$%^&*_])
lookahead that matches any characters up to and including one of the following characters: !@#$%^&*_
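To see what each lookahead contributes, here is a minimal PHP sketch that tests them one at a time. The function name failedRules and the rule labels are made up for this illustration; the four expressions are the lookaheads from the pattern above, each anchored with ^.

```php
<?php
// Each lookahead from the pattern, tested on its own.
// The array keys are labels invented for this example.
$lookaheads = [
    'lowercase' => '/^(?=.*[a-z])/',
    'uppercase' => '/^(?=.*[A-Z])/',
    'digit'     => '/^(?=.*[0-9])/',
    'special'   => '/^(?=.*[!@#$%^&*_])/',
];

// Hypothetical helper: returns the labels of the rules a candidate fails.
function failedRules(string $candidate, array $lookaheads): array
{
    $failed = [];
    foreach ($lookaheads as $name => $regex) {
        if (preg_match($regex, $candidate) !== 1) {
            $failed[] = $name;
        }
    }
    return $failed;
}

print_r(failedRules('abcdefgh', $lookaheads));   // uppercase, digit, special
print_r(failedRules('aBc_eFgh9', $lookaheads));  // empty array: all four pass
```

Reporting the failed rules separately like this can also be handy for telling the user which requirement is missing, which a single combined pattern cannot do.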

Lookahead

Quoting from https://www.regular-expressions.info/lookaround.html

The difference is that lookaround actually matches characters, but then gives up the match, returning only the result: match or no match. That is why they are called “assertions”. They do not consume characters in the string, but only assert whether a match is possible or not.

So the lookaheads first scout ahead to see if those matches exist in the string.
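The "does not consume characters" part can be checked with a small PHP sketch. The sample string 'abc9def' and the three-character capture group are assumptions chosen only to make the offsets easy to read.

```php
<?php
// The lookahead scans ahead to find the digit, but the capture group that
// follows it still starts at offset 0: nothing was consumed.
$found = preg_match('/^(?=.*[0-9])(.{3})/', 'abc9def', $m, PREG_OFFSET_CAPTURE);

var_dump($found);    // int(1)
var_dump($m[1][0]);  // string(3) "abc"
var_dump($m[1][1]);  // int(0)

// Without a digit anywhere in the string the lookahead fails,
// so the whole match fails.
$notFound = preg_match('/^(?=.*[0-9])(.{3})/', 'abcdef');
var_dump($notFound); // int(0)
```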

Finally

If all of the above lookaheads succeed, matching starts again from the beginning of the string and tries to match between 8 and 30 characters of any kind.

.{8,30}
matches any character . between 8 and 30 times {8,30}

$
end of the line
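Putting the pieces together, the same pattern can be tried in PHP by wrapping it in / delimiters. This is only a sketch: the constant and function names are invented here, and the sample passwords are assumptions.

```php
<?php
// The client-side pattern from the question, wrapped in PCRE delimiters.
const HTML_STYLE_PATTERN = '/^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[!@#$%^&*_]).{8,30}$/';

// Hypothetical helper name, used only for this illustration.
function matchesHtmlStylePattern(string $candidate): bool
{
    return preg_match(HTML_STYLE_PATTERN, $candidate) === 1;
}

var_dump(matchesHtmlStylePattern('aBc_eFgh9'));  // bool(true)
var_dump(matchesHtmlStylePattern('aBc_eF9'));    // bool(false): only 7 characters
var_dump(matchesHtmlStylePattern('abc_efgh9'));  // bool(false): no uppercase letter
```

One thing to keep in mind on the PHP side: by default $ in PCRE also matches just before a trailing newline, so a value ending in "\n" can still pass unless the D modifier or \z is used.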

Note: when I said that the lookahead matches every character up to and including the target, that’s not entirely true. .* is greedy: it first matches all of the text and then backtracks. Here’s a good explanation: https://javascript.info/regexp-greedy-and-lazy.
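A quick PHP sketch of the greedy versus lazy difference, using capture groups so the result is visible. The sample string 'a1b2c3' is an assumption picked because it contains several digits.

```php
<?php
// Greedy: .* grabs the whole string, then backtracks until a digit can
// follow, so it stops at the LAST digit.
preg_match('/^(.*)([0-9])/', 'a1b2c3', $greedy);
var_dump($greedy[1]);  // string(5) "a1b2c"
var_dump($greedy[2]);  // string(1) "3"

// Lazy: .*? expands one character at a time, so it stops at the FIRST digit.
preg_match('/^(.*?)([0-9])/', 'a1b2c3', $lazy);
var_dump($lazy[1]);    // string(1) "a"
var_dump($lazy[2]);    // string(1) "1"
```

Inside a lookahead both forms give the same yes/no answer; the difference is only in how much work the engine does to get there.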

I tried it myself and came up with the following
^(?=[^a-z]*[a-z])(?=[^A-Z]*[A-Z])(?=\D*\d)(?=.*?[!@#$%^&*_]).{8,30}$

I have used negated character sets several times, e.g. [^a-z], and a non-greedy variant of any character, .*?
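As a rough check that the negated-class version accepts and rejects the same inputs as the greedy one, both can be run over a handful of candidates in PHP. The sample list is an assumption, and agreeing on six strings is of course not a proof of equivalence.

```php
<?php
$greedy  = '/^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[!@#$%^&*_]).{8,30}$/';
$negated = '/^(?=[^a-z]*[a-z])(?=[^A-Z]*[A-Z])(?=\D*\d)(?=.*?[!@#$%^&*_]).{8,30}$/';

// One valid candidate, then one candidate per missing requirement.
$samples = ['aBc_eFgh9', 'ABCDEFG_9', 'abcdefg_9', 'aBcdeFgh_', 'aBcdeFgh9', 'aB_9'];

foreach ($samples as $sample) {
    $a = preg_match($greedy, $sample) === 1;
    $b = preg_match($negated, $sample) === 1;
    printf("%-10s greedy=%s negated=%s\n", $sample, var_export($a, true), var_export($b, true));
}
```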

Just a couple of points to be safe: I’m not using JavaScript, and I need the HTML pattern syntax and PHP preg_match() to accomplish the same task. Are you saying that this syntax works for both?

I tested my regular expressions with regex101

You can click on different variants of PHP, Javascript etc. Note that there is no HTML pattern option, so you’ll have to investigate this yourself.

The negated and non-greedy variants are more efficient and require fewer steps in the match than the greedy variants.

I would recommend taking a look at the links I included in my post.

You’re right, . (any character) does the trick. However, the special character sets are different. The HTML version looks for ^&* whereas the PHP version does not. The PHP version also uses the shorthand \d for a digit instead of [0-9].
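The practical effect of those differences can be seen by running both original patterns in PHP on the same inputs. This is a sketch with invented variable names and sample passwords; it also suggests that the trailing [0-9A-Za-z!@#$%_] in the PHP version is not redundant, because it limits which characters are allowed at all, whereas . allows anything.

```php
<?php
$htmlStyle = '/^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[!@#$%^&*_]).{8,30}$/';
$phpStyle  = '/^(?=.*\d)(?=.*[A-Z])(?=.*[a-z])(?=.*[!@#$%_])[0-9A-Za-z!@#$%_]{8,30}$/';

$candidates = [
    'aBc_eFgh9',   // accepted by both
    'aBcdeFgh9^',  // ^ is the only special character: HTML version only
    'aBc_eFgh9&',  // & is outside the PHP version's allowed characters
    'aBc_eF gh9',  // a space is matched by . but not by the PHP class
];

foreach ($candidates as $candidate) {
    printf(
        "%-12s html=%d php=%d\n",
        $candidate,
        preg_match($htmlStyle, $candidate),
        preg_match($phpStyle, $candidate)
    );
}
```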

I really appreciate your effort and time and will continue to explore your ideas. You gave me an alternative for my server side php, thank you very much. But my original question said I have two working options (1 for PHP and 1 for HTML) but they seem to be different even though both are meant to be regular expressions.

So can anyone out there tell me whether my HTML is okay, and is it a fact that the syntax for the two expressions is different?

@kerry14 See above, I replied to this question while you were writing.

Edit: I also gave you a link to regex101 where you can at least test your HTML version by clicking on the PHP option to check if it works in PHP

Thanks, just saw it. Can anyone else tell me if my HTML code is any good, or is there a better way? Thanks

I’ll leave it at that.

Test with the following string aBc_eFgh9

Your HTML version took 33 steps
^(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[!@#$%^&*_]).{8,30}$

Mine took 23 steps
^(?=[^a-z]*[a-z])(?=[^A-Z]*[A-Z])(?=\D*\d)(?=.*?[!@#$%^&*_]).{8,30}$

Sorry, I’m confused. So are you saying your version works with HTML? Because you already said so.

So folks, back to the original question: can someone provide me with a PHP preg_match AND an HTML 5 pattern that matches the criteria I first listed in the original question? Thanks
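One possible approach, sketched here under some assumptions, is to keep a single pattern body in PHP and use it for both the preg_match() call and the HTML pattern attribute, so the two cannot drift apart. The constant name, function names and input name are hypothetical, and the special character set !@#$%_ is taken from the PHP version in the question; swap in the longer set if ^&* should be allowed. The HTML pattern attribute is matched against the whole value, so the body carries no ^ or $ of its own.

```php
<?php
// Hypothetical constant: the pattern body without delimiters or anchors.
const PASSWORD_PATTERN = '(?=.*[a-z])(?=.*[A-Z])(?=.*[0-9])(?=.*[!@#$%_])[0-9A-Za-z!@#$%_]{8,30}';

// Server side: anchor with \A and \z so a trailing newline is not accepted.
// ~ is used as the delimiter because it does not occur in the pattern.
function isValidPassword(string $password): bool
{
    return preg_match('~\A(?:' . PASSWORD_PATTERN . ')\z~', $password) === 1;
}

// Client side: emit the same body into the pattern attribute.
function passwordInputHtml(): string
{
    return '<input type="password" name="password" required pattern="'
        . htmlspecialchars(PASSWORD_PATTERN, ENT_QUOTES)
        . '">';
}

var_dump(isValidPassword('aBc_eFgh9'));    // bool(true)
var_dump(isValidPassword("aBc_eFgh9\n"));  // bool(false)
echo passwordInputHtml(), "\n";
```

The server-side check still has to run on every submission, since the pattern attribute can be bypassed by anyone who edits the form or posts directly.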

What have you tried @kerry14?

@rpg_digital
Hello
I’ve been a bit busy and my two original attempts – HTML 5 and PHP – are working, but I plan to try your version with both the HTML 5 input pattern and PHP preg_match() within the next few days and will get back to you.
Thanks for following up and thanks again for all your help and effort :grin:




This topic was automatically closed 91 days after the last reply. New replies are no longer allowed.
