Here’s a fairly common code sample from Rails applications with some sort of authentication system:
If you’re experienced with regular expressions, this seems simple. If (like me when I first saw this) you aren’t, it takes a while to parse.
Sections 3.2.4 and 3.4.1 of the RFC go into the requirements for how an email address must be formatted and, well, there’s not much you can’t do in your email address when quotes or backslashes are involved. The local string (the part of the email address that comes before the @) can contain any of these characters:
is a valid email address. For this reason, for a time I began running any email address against the following regular expression instead:
Simple, right? This is often the most I do and, when paired with a confirmation field for the email address on your registration form, it can alleviate most problems with user error.
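As a rough illustration of how permissive such a check can be, here is a minimal sketch in plain Ruby. The pattern and the `plausible_email?` helper are hypothetical and not necessarily the expression referred to above; they only require a single @ with something on either side.

```ruby
# Hypothetical permissive pattern: exactly one "@", with at least one
# character that is neither whitespace nor "@" on each side.
# This is an assumption for illustration, not an RFC parser.
LOOSE_EMAIL = /\A[^@\s]+@[^@\s]+\z/

# Returns true when the input has the rough shape of an email address.
def plausible_email?(address)
  LOOSE_EMAIL.match?(address.to_s.strip)
end

plausible_email?("user@example.com") # => true
plausible_email?("no-at-sign")       # => false
```

Even a pattern this loose is stricter than the specification in places: a quoted local part containing a space or an @ would be rejected, which is part of why the format check is best treated as a typo catcher rather than a validity test.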
If you actually check the Google query I linked above, you’ll see that people have been writing (or trying to write) RFC-compliant regular expressions to parse email addresses for years.
These expressions can get ridiculously convoluted, as in the case above, and, measured against the specification, they are often too strict anyway.
But what if I told you there was a way to determine whether or not an email is valid without resorting to regular expressions at all? It’s surprisingly easy, and you’re probably already doing it anyway.
The activation email is a practice that’s been in use for years, but it’s often paired with complex validations that check whether the email is formatted correctly. If you’re going to send an activation email to users, why bother using a gigantic regular expression?
Think about it this way: I register for your website under the email address . That’s probably going to bounce off of the illustrious mail daemon, but the formatting is fine; it’s a valid email address.
To fix this problem, you implement an activation system where, after registering, I am sent an email with a link I must click.
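A minimal sketch of that activation flow in plain Ruby might look like the following. The `Registration` class, its method names, and the `/activate?token=` URL shape are all hypothetical; a real Rails app would keep the token on the user record and send the link through its mailer.

```ruby
require "securerandom"

# Hypothetical in-memory stand-in for a user record awaiting activation.
class Registration
  attr_reader :email, :activation_token

  def initialize(email)
    @email = email
    # Random, URL-safe token that is only ever sent to the given address.
    @activation_token = SecureRandom.urlsafe_base64(32)
    @activated = false
  end

  def activated?
    @activated
  end

  # Link to include in the activation email (the URL shape is an assumption).
  def activation_link(base_url)
    "#{base_url}/activate?token=#{activation_token}"
  end

  # Called when the link is clicked; activates only on a matching token.
  def activate!(token)
    @activated = true if token == @activation_token
    @activated
  end
end

registration = Registration.new("user@example.com")
registration.activate!(registration.activation_token)
registration.activated? # => true
```

The point of the sketch is that the account only becomes active if the token comes back, which can only happen if the email was actually delivered. No format check is involved at any step.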
At this point, why keep parsing email addresses for their format?
The result of sending an email to a badly formatted email address would be the same: it’ll get bounced. If your user enters a bad email address, they won’t get the activation email and they’ll try to register again if they really care about using your site.
If you really want to check email addresses right on the signup page, include a confirmation field so users have to type the address twice. Enterprising individuals will just copy and paste, but what it comes down to is this: if your user enters a bad email address, you shouldn’t make it more of a problem for yourself than you have to.
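The confirmation-field check can be sketched in a few lines of plain Ruby. The `email_confirmed?` helper is a hypothetical name used for illustration; it compares the two fields after trimming whitespace and makes no attempt to judge the format.

```ruby
# Returns true when both fields contain the same non-empty address,
# ignoring leading and trailing whitespace.
def email_confirmed?(email, confirmation)
  first = email.to_s.strip
  !first.empty? && first == confirmation.to_s.strip
end

email_confirmed?("user@example.com", "user@example.com") # => true
email_confirmed?("user@example.com", "usr@example.com")  # => false
```

In a Rails model, the built-in `validates :email, confirmation: true` validation covers the same ground by comparing `email` against an `email_confirmation` attribute.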
A complex regex validation on the email address doesn’t introduce an additional solution; it introduces an additional problem.
You may be interested in this regex, which is based on RFC 2822 and matches 99.99% of all email addresses in actual use, from this link