How to Validate Emails with Regex

Updated January 12, 2021


What is a regular expression?

Regular expressions, often referred to as regex, are text patterns designed to match other strings. They are particularly helpful when you need to find a string of characters that fits a certain pattern. These patterns can be simple, matching an exact string, or more complex, matching any string that follows a set of rules. One common use for regex is checking whether a user has correctly entered something into a form, such as an email address.
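
As a rough illustration of the difference between an exact pattern and a rule-based one, here is a minimal Python sketch. The sample text and the order-code format are made up for this example.

```python
import re

# Hypothetical sample text, used only for illustration.
text = "Order A-1042 shipped; order B-77 is pending."

# Simple pattern: matches one exact string.
exact = re.findall(r"A-1042", text)

# Rule-based pattern: one uppercase letter, a hyphen, then one or more digits.
by_rule = re.findall(r"[A-Z]-\d+", text)

print(exact)    # ['A-1042']
print(by_rule)  # ['A-1042', 'B-77']
```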

What are the benefits and drawbacks of using regex to validate emails?

Regex lets you validate the structure of an email address. It can be handled with one or two lines of code and can easily be tweaked to handle a wide range of requirements.

However, keep in mind that regex can only check the structure of an email address. There is no way, using regex by itself, to verify that the person is using a legitimate address. It cannot, for instance, look up the domain's MX records or query an SMTP server to confirm that the address exists. In other words, anyone can construct a string of characters that fits the rules and passes this validation, and regex cannot tell whether the address is actually real.
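
To make this limitation concrete, here is a small Python sketch. It assumes a simplified structural pattern of the kind used in this article, and the helper name looks_like_email is hypothetical. The first address is invented and sits on a reserved test domain, yet it passes, because only its shape is checked.

```python
import re

# Simplified structural pattern (an assumption for this sketch).
EMAIL_RE = re.compile(
    r"^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$"
)

def looks_like_email(address):
    """Return True if the address has a plausible structure (nothing more)."""
    return EMAIL_RE.match(address) is not None

# A made-up mailbox on a reserved test domain still passes.
print(looks_like_email("nobody-here-12345@example.invalid"))  # True
# A string with no "@" fails on structure alone.
print(looks_like_email("missing-at-sign.example.com"))        # False
```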

What should I keep in mind when validating emails with regex?

There is a wide range of possible combinations with emails. Up until a few years ago we could look for only 2- or 3-character top-level domains; however, ICANN has since opened up a large number of new TLDs, which can be much longer. It is also important to keep in mind country-code domains, where a country abbreviation forms part of the domain (e.g., example.co.uk). What this means from a regex perspective is that you need to account for several periods after the “@” symbol. Overall, there is no such thing as a perfect regex to capture all legitimate email addresses.
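
The effect of TLD length and multiple periods can be seen in a short Python sketch. Both patterns below are deliberately simplified illustrations rather than recommended expressions, and the names short_tld and any_labels are hypothetical.

```python
import re

# Older-style domain rule: labels, then a TLD limited to 2 or 3 letters.
short_tld = re.compile(r"^[^@\s]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*\.[a-zA-Z]{2,3}$")

# More permissive rule: one or more dot-separated labels of any length after the first.
any_labels = re.compile(r"^[^@\s]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)+$")

for address in ["user@example.com", "user@example.co.uk", "user@example.photography"]:
    print(address, bool(short_tld.match(address)), bool(any_labels.match(address)))
```

Under these assumptions, the first two addresses pass both rules, while the third passes only the permissive one because its TLD is longer than three letters.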

One thing to remember is that there are two basic reasons for validating email addresses. The first is to improve usability and make sure users don’t accidentally leave off an important part of their email address. The second is to make sure that people are not entering dummy addresses. What you don’t want to do is reject possibly legitimate email addresses; this goes against the basic principles of usability, so you need to take a measured approach.

For this reason, we will not be showing overly restrictive regex examples here.

How can I use a regex to validate emails?

Email addresses generally take a fairly standard structure (e.g. xxx@xxxxxx.xxx). There are, however, a wide range of other limitations. A basic email regex is defined in HTML5 for email input fields; a simplified form of it is the following expression:


/^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$/


This looks for any combination of the letters A-Z (both upper and lowercase) and digits, and allows a few specific special characters, namely the period and the following:


! # $ % & ' * + - / = ? ^ _ ` { | } ~


This is followed by the “@” symbol, and then a standard domain name and TLD. Special characters not included in the above list are forbidden, which is why we explicitly allow only a few of them here. The formal address rules add further restrictions: for example, a period cannot be the first or last character before the “@”, nor can it be repeated consecutively. The expression above does not enforce these.
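
One way to layer the first, last, and repeated-character rule on top of the pattern is a separate check. The Python sketch below assumes the rule is applied to periods in the part before the “@”. The helper names has_valid_dots and check are hypothetical, and the pattern is the simplified one from this section.

```python
import re

EMAIL_RE = re.compile(
    r"^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$"
)

def has_valid_dots(address):
    """Hypothetical helper: reject a leading, trailing, or doubled period before the '@'."""
    local_part = address.split("@", 1)[0]
    return not (local_part.startswith(".") or local_part.endswith(".") or ".." in local_part)

def check(address):
    # Structural match first, then the extra period rule.
    return EMAIL_RE.match(address) is not None and has_valid_dots(address)

print(check("first.last@example.com"))   # True
print(check(".first@example.com"))       # False
print(check("first..last@example.com"))  # False
```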

Below are methods for validating email addresses with regex in a variety of programming languages.

Email Regex in Python 

Here is a straightforward method for checking whether an email is valid in Python:


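import re

# Raw, double-quoted string: the backslash reaches the regex engine and the apostrophe needs no escaping.
regex = r"^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$"

def check(email):
    # The pattern is anchored with ^ and $, so re.search tests the whole string.
    if re.search(regex, email):
        print("Valid Email")
    else:
        print("Invalid Email")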
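
If you need a result you can act on rather than a printed message, a variant that returns a boolean may be more convenient. The sketch below is one possible approach, and the function name is_valid is hypothetical. It uses re.fullmatch because in Python the “$” anchor also matches just before a trailing newline, so re.search would accept an address that ends in a line break.

```python
import re

regex = r"^[a-zA-Z0-9.!#$%&'*+/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$"

def is_valid(email):
    """Return True or False instead of printing, so callers can act on the result."""
    # fullmatch requires the entire string, including any trailing newline, to match.
    return re.fullmatch(regex, email) is not None

print(is_valid("abc123@example.com"))    # True
print(is_valid("abc123@example.com\n"))  # False
print(bool(re.search(regex, "abc123@example.com\n")))  # True: "$" tolerates a final newline
```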

Email Regex in PHP

It is worth noting that PHP has a built-in method for validating email addresses. You can do this using the following:


if (filter_var($email, FILTER_VALIDATE_EMAIL)) {
    echo 'this email address is valid';
}


However, if you prefer to use regex, here is one basic method you can use, using the preg_match function:


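$email = "abc123@example.com";
// The apostrophe and the forward slash are escaped: one would end the PHP string, the other the regex delimiter.
$regex = '/^[a-zA-Z0-9.!#$%&\'*+\/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$/';
if (preg_match($regex, $email)) {
    echo $email . " is valid.";
}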

Email Regex in JavaScript

JavaScript strings provide a “match” method for applying regular expressions. Below is one method for handling this.


function ValidateEmail(inputText)
{
	var mailformat = /^[a-zA-Z0-9.!#$%&'*+\/=?^_`{|}~-]+@[a-zA-Z0-9-]+(?:\.[a-zA-Z0-9-]+)*$/;
	// match() returns null when the value does not fit the pattern.
	if(!inputText.value.match(mailformat))
	{
		alert("This is not a valid email address");
		return false;
	}
	return true;
}


Email Regex in Ruby

Ruby's standard library has email validation built in, so you can use the following:


URI::MailTo::EMAIL_REGEXP

However, if you would rather write the pattern yourself, here is a simpler hand-written alternative:


VALID_EMAIL_REGEX = /\A([\w+\-].?)+@[a-z\d\-]+(\.[a-z]+)*\.[a-z]+\z/i


Email Regex in Go (Golang)

Using Go, you can import a few standard library packages to handle this. Here is a variation on the standard regex that we have been using:


package main
import (
	"fmt"
	"regexp"
)
var emailRegex = regexp.MustCompile("^[a-zA-Z0-9.!#$%&'*+\\/=?^_`{|}~-]+@[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?(?:\\.[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?)*$")

func main() {
	e := "test@testing.com"
	if isEmailValid(e) {
		fmt.Println(e + " is a valid email")
	}
	if !isEmailValid("just text") {
		fmt.Println("not a valid email")
	}
}
func isEmailValid(e string) bool {
	if len(e) < 3 || len(e) > 254 {
		return false
	}
	return emailRegex.MatchString(e)
}

Email Regex in Java

Java requires that you go through a few extra steps but offers a few useful features, such as the ability to match a string using a case-insensitive pattern.


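import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Place these members inside a class. {2,} rather than {2,6}: newer TLDs can be longer than six letters.
public static final Pattern VALID_EMAIL_ADDRESS_REGEX = 
    Pattern.compile("^[A-Z0-9._%+-]+@[A-Z0-9.-]+\\.[A-Z]{2,}$", Pattern.CASE_INSENSITIVE);

public static boolean validate(String emailStr) {
        Matcher matcher = VALID_EMAIL_ADDRESS_REGEX.matcher(emailStr);
        return matcher.find();
}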


Email Regex in jQuery

If you prefer using jQuery over regular JavaScript, you can use the following structure:


var userinput = $(this).val();
// {2,} rather than {2,4}: newer TLDs can be longer than four letters. "+" is allowed before the "@".
var pattern = /^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,}$/i;

if(!pattern.test(userinput))
{
  alert('not a valid e-mail address');
}