
Basic Regular Expression in Java

A - Regular Expression Overview

A regular expression (regex) is a sequence of characters that defines a search pattern for strings. Regular expressions can be used to search, edit, and manipulate text and data.
The pattern defined by a regular expression can match zero, one, or several times in a string. The pattern is applied to the string from left to right. Once a character has been consumed by a match, it cannot be reused in a later match.

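For example, the regular expression 121 matches the string 12121212 only twice: at positions 0-2 and 4-6. The 121 at positions 2-4 is skipped because its first character was already consumed by the first match.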
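
The left-to-right, non-overlapping behaviour can be checked with a short program. This is only a minimal sketch: the class name `NonOverlappingMatches` and the helper `countMatches` are made up for illustration, and it relies on `Matcher.find()` resuming after the end of the previous match.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NonOverlappingMatches {
    // Counts how many times the regex is found, scanning left to right.
    // Each call to find() resumes after the end of the previous match.
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countMatches("121", "12121212")); // prints 2
    }
}
```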

B - Regular Expression in Java

In Java, the java.util.regex package provides 3 main classes for working with regular expressions:

-   Pattern: represents a compiled regular expression. To create a Pattern object, call the static method Pattern.compile(), which accepts the regular expression string as an argument.

[java]Pattern mPattern = Pattern.compile(pattern);[/java]

-   Matcher: interprets the pattern and performs match operations against an input string. A Matcher object is obtained by calling the matcher() method on a Pattern object.

[java]Matcher mMatcher = mPattern.matcher(EXAMPLE_TEST);[/java]

-   PatternSyntaxException: unchecked exception that indicates a syntax error in a regular expression pattern.
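
The three classes above could be used together roughly as follows. This is a sketch, not the only way to do it: the class name `RegexClassesDemo`, the helper names, and the sample strings are assumptions chosen for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

public class RegexClassesDemo {
    // Returns the first substring of input matched by regex, or null if there is none.
    static String firstMatch(String regex, String input) {
        Pattern pattern = Pattern.compile(regex);
        Matcher matcher = pattern.matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    // Returns true if the regex string compiles without a syntax error.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(firstMatch("[0-9]+", "order 42 shipped")); // prints 42
        System.out.println(isValidRegex("[0-9]+")); // prints true
        System.out.println(isValidRegex("[0-9"));   // prints false (unclosed character class)
    }
}
```

Because PatternSyntaxException is unchecked, the compiler does not force you to catch it; catching it is mainly useful when the pattern comes from user input.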

C - Regular Expression syntax

1 - Common matching symbols

| Regular Expression | Description |
| --- | --- |
| . | Matches any single character (by default, except line terminators) |
| ^kai | Matches kai at the start of the input |
| kai$ | Matches kai at the end of the input |
| [xyz] | Matches one character: x, y, or z |
| [abc][xyz] | Matches a, b, or c followed by x, y, or z |
| [^xyz] | Matches any single character except x, y, or z. The ^ inside [] negates the character class. |
| [a-z0-9] | Matches one lowercase letter from a to z or one digit from 0 to 9 |
| x\|y | Matches x or y |
| xy | Matches x followed by y |
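
To see these symbols in action, the sketch below tries a few of them with `Matcher.find()`, which looks for the pattern anywhere in the input. The class name `CommonSymbolsDemo`, the helper `found`, and the sample inputs are illustrative assumptions.

```java
import java.util.regex.Pattern;

public class CommonSymbolsDemo {
    // Returns true if the regex matches anywhere in the input.
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(found("^kai", "kaizen"));   // true: input starts with kai
        System.out.println(found("^kai", "akai"));     // false: kai is not at the start
        System.out.println(found("kai$", "akai"));     // true: input ends with kai
        System.out.println(found("[abc][xyz]", "by")); // true: b followed by y
        System.out.println(found("[^xyz]", "xyz"));    // false: every character is x, y, or z
        System.out.println(found("[a-z0-9]", "K9"));   // true: 9 is in the class
        System.out.println(found("x|y", "say"));       // true: contains y
    }
}
```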

2 - Metacharacters

| Regular Expression | Description |
| --- | --- |
| \d | Matches any digit, short for [0-9] |
| \D | Matches any non-digit, short for [^0-9] |
| \s | Matches any whitespace character, short for [ \t\n\x0B\f\r] |
| \S | Matches any non-whitespace character |
| \s+ | Matches one or more whitespace characters |
| \w | Matches any word character, short for [a-zA-Z_0-9] (note that the underscore is included) |
| \W | Matches any non-word character |
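
One detail worth showing in code: inside a Java string literal the backslash must be doubled, so \d is written as "\\d". The following sketch uses made-up helper names (`keepDigits`, `splitOnWhitespace`, `isWord`) and sample strings purely for illustration.

```java
import java.util.Arrays;

public class MetacharactersDemo {
    // Removes every non-digit character (\D), keeping only the digits.
    static String keepDigits(String input) {
        return input.replaceAll("\\D", "");
    }

    // Splits on runs of one or more whitespace characters (\s+).
    static String[] splitOnWhitespace(String input) {
        return input.trim().split("\\s+");
    }

    // True if the whole input consists of word characters (\w): letters, digits, underscore.
    static boolean isWord(String input) {
        return input.matches("\\w+");
    }

    public static void main(String[] args) {
        System.out.println(keepDigits("Room 42, floor 7"));                        // prints 427
        System.out.println(Arrays.toString(splitOnWhitespace("one  two\tthree"))); // prints [one, two, three]
        System.out.println(isWord("snake_case")); // prints true
        System.out.println(isWord("kebab-case")); // prints false
    }
}
```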

3 - Quantifiers

A quantifier applies to the element immediately before it.

| Regular Expression | Description |
| --- | --- |
| * | Repeats 0 or more times |
| + | Repeats 1 or more times |
| ? | Repeats 0 or 1 time |
| {x} | Repeats exactly x times |
| {x,y} | Repeats from x to y times |
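
The quantifiers can be tried out with `String.matches()`, which requires the whole input to match the pattern. This is a small sketch; the class name `QuantifiersDemo`, the helper `matchesFully`, and the sample patterns are assumptions for illustration.

```java
public class QuantifiersDemo {
    // True only if the entire input matches the regex.
    static boolean matchesFully(String regex, String input) {
        return input.matches(regex);
    }

    public static void main(String[] args) {
        System.out.println(matchesFully("ab*", "a"));          // true: b repeated 0 times
        System.out.println(matchesFully("ab+", "a"));          // false: + needs at least one b
        System.out.println(matchesFully("colou?r", "color"));  // true: u is optional
        System.out.println(matchesFully("colou?r", "colour")); // true
        System.out.println(matchesFully("\\d{4}", "2024"));    // true: exactly 4 digits
        System.out.println(matchesFully("\\d{2,3}", "1234"));  // false: 4 digits is too many
    }
}
```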

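4 - Grouping in Regular Expression

You can group part of a regular expression with parentheses and refer to what each group captured later. In a replacement string, groups are referenced with the $ character followed by the group number ($1, $2, ...).

For example, a regular expression that collapses extra whitespace between 2 words into a single space: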

[java]
// Backslashes must be doubled inside a Java string literal.
String pattern = "(\\S)(\\s+)(\\S)";
System.out.println(myString.replaceAll(pattern, "$1 $3"));
[/java]
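
A self-contained version of the grouping example might look like the sketch below. The class name `GroupingDemo`, the helpers `collapseWhitespace` and `swapNames`, and the sample strings are hypothetical; `swapNames` is added only to show that groups can also be read with `Matcher.group(n)`.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GroupingDemo {
    // Replaces the whitespace run between two non-whitespace characters with one space.
    // $1 and $3 put the surrounding characters back; group 2 (the whitespace) is dropped.
    static String collapseWhitespace(String input) {
        return input.replaceAll("(\\S)(\\s+)(\\S)", "$1 $3");
    }

    // Reads captured groups directly: turns "first last" into "last, first".
    static String swapNames(String fullName) {
        Matcher matcher = Pattern.compile("(\\w+)\\s+(\\w+)").matcher(fullName);
        if (!matcher.matches()) {
            return fullName; // leave unexpected input unchanged
        }
        return matcher.group(2) + ", " + matcher.group(1);
    }

    public static void main(String[] args) {
        System.out.println(collapseWhitespace("Hello    big   world")); // prints Hello big world
        System.out.println(swapNames("Ada Lovelace"));                  // prints Lovelace, Ada
    }
}
```

One caveat: because a character consumed by one match cannot be reused, this pattern leaves the second gap untouched when a word is a single character ("a  b  c" becomes "a b  c"). If that matters, replacing "\\s+" with a single space is a simpler alternative.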

5 - Negative lookahead

A negative lookahead lets you exclude a pattern: it asserts that the text at the current position is not followed by another pattern. The lookahead only checks the following text and does not consume it.

Negative lookaheads are written as (?!pattern).

For example, the following matches “kai” only if it is not immediately followed by “other”.

[java]kai(?!other)[/java]
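
To check which occurrences the lookahead accepts, the sketch below collects the start index of every match. The class name `NegativeLookaheadDemo`, the helper `matchStarts`, and the sample input are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class NegativeLookaheadDemo {
    // Returns the start index of every match of regex in input.
    static List<Integer> matchStarts(String regex, String input) {
        List<Integer> starts = new ArrayList<>();
        Matcher matcher = Pattern.compile(regex).matcher(input);
        while (matcher.find()) {
            starts.add(matcher.start());
        }
        return starts;
    }

    public static void main(String[] args) {
        // "kai" at index 0 is followed by "other", so it is rejected.
        // "kai" at index 9 (in "kaizen") and at index 16 (end of input) are accepted.
        System.out.println(matchStarts("kai(?!other)", "kaiother kaizen kai")); // prints [9, 16]
    }
}
```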

D - Some examples of Regular Expression in Java

Check whether the input text is a 24-bit or 32-bit hex color, with an optional leading # or 0x:

[java]
// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
public void checkValid24or32bitColorFormat() {
    // Optional "#" or "0x" prefix, then 3 or 4 pairs of uppercase hex digits.
    Pattern pattern = Pattern.compile("(?:#|0x)?(?:[0-9A-F]{2}){3,4}");
    Matcher matcher;
    String text;

    text = "0xF0F73611"; // 32 bit with 0x prefix: true
    matcher = pattern.matcher(text);
    System.out.println(text + " : " + matcher.matches());

    text = "#FF006C"; // 24 bit with # prefix: true
    matcher = pattern.matcher(text);
    System.out.println(text + " : " + matcher.matches());

    text = "99AAB7FF"; // 32 bit without prefix: true
    matcher = pattern.matcher(text);
    System.out.println(text + " : " + matcher.matches());

    text = "FFZZ08"; // Z is not a hex digit: false
    matcher = pattern.matcher(text);
    System.out.println(text + " : " + matcher.matches());
}
[/java]

Check whether the input text is a “slug” (here: lowercase letters, digits, and hyphens only):

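[java]
// Requires: import java.util.regex.Matcher; import java.util.regex.Pattern;
public void checkSlugText() {
    // One or more lowercase letters, digits, or hyphens.
    // matches() already tests the whole input, so ^ and $ are optional here.
    Pattern pattern = Pattern.compile("^[a-z0-9-]+$");
    Matcher matcher;
    String text;

    text = "a_b_123"; // underscore is not allowed: false
    matcher = pattern.matcher(text);
    System.out.println(text + " : " + matcher.matches());

    text = "a-b-123"; // true
    matcher = pattern.matcher(text);
    System.out.println(text + " : " + matcher.matches());
}
[/java]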

So, that’s all for basic regular expressions in Java. I’m new to regular expressions, so please let me know if I got something wrong in this post!

Thanks :)