Java StringTokenizer – Using RegEx Pattern example shows what happens when we use a regex pattern as the delimiter to tokenize a string.
Can we use a regex pattern as a delimiter to tokenize the string?
The StringTokenizer class in Java is used to break string content into tokens. We can provide the set of delimiter characters that should separate the tokens in the given string content.
One way to do this is to create a StringTokenizer object using the constructor given below.
```java
public StringTokenizer(String string, String delimiter)
```
This constructor creates a new StringTokenizer object that tokenizes the string content based on the set of delimiters we pass. Each character of the delimiter string is treated as a separate delimiter. Once the object is created, we can get the tokens from the string using the hasMoreTokens and nextToken methods as given in the example below.
```java
package com.javacodeexamples.stringtokenizerexamples;

import java.util.StringTokenizer;

public class StringTokenizerRegEx {

    public static void main(String[] args) {

        String str = "Hello world Java";

        // use a single space as the delimiter
        StringTokenizer st = new StringTokenizer(str, " ");

        while (st.hasMoreTokens()) {
            System.out.println(st.nextToken());
        }
    }
}
```
Output
```
Hello
world
Java
```
In the above example, I used a simple space character to create tokens from the string. However, we often need a more complex approach to generate tokens, for example, using a regex pattern.
The StringTokenizer class does not support regex patterns, though. Providing a regex pattern as the delimiter does not work, as shown below.
```java
String str = "Hello,world.java";

// "\\b" is not interpreted as a regex here
StringTokenizer st = new StringTokenizer(str, "\\b");

while (st.hasMoreTokens()) {
    System.out.println(st.nextToken());
}
```
Output
```
Hello,world.java
```
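As you can see from the output, the word boundary “\\b” regex pattern did not work and the string was not split into tokens. The StringTokenizer class treats every character of the delimiter string as a literal delimiter, so “\\b” simply means a backslash or the letter b, and neither of them occurs in the string.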
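The following is a minimal sketch of that behavior, not part of the original example. The class name DelimiterSetDemo and the helper method tokenize are hypothetical names introduced only for illustration. It compares the “\\b” delimiter with a “,.” delimiter to show that the delimiter string is read as a set of individual characters.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

public class DelimiterSetDemo {

    // Hypothetical helper: collects all tokens into a list so they are easy to inspect.
    static List<String> tokenize(String text, String delimiters) {
        List<String> tokens = new ArrayList<>();
        StringTokenizer st = new StringTokenizer(text, delimiters);
        while (st.hasMoreTokens()) {
            tokens.add(st.nextToken());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "\\b" is the two characters '\' and 'b'; neither occurs in the input,
        // so the whole string comes back as a single token.
        System.out.println(tokenize("Hello,world.java", "\\b")); // [Hello,world.java]

        // ",." means "a comma or a dot", not the character sequence ",.".
        System.out.println(tokenize("Hello,world.java", ",.")); // [Hello, world, java]

        // When the input does contain a 'b', it is consumed as a delimiter.
        System.out.println(tokenize("abc", "\\b")); // [a, c]
    }
}
```

The last call makes the point most directly: the letter b disappears from the result because it was used as a plain delimiter character, not as part of a regex.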
How to generate tokens based on regex pattern?
The StringTokenizer class does not support regex patterns, but you can use the String split method to do the same. The only difference is that the split method returns all the tokens at once in an array instead of handing them out one at a time.
Here is the same example using the split method.
```java
String str = "Hello,world.Java";

// split on word boundaries; the pattern is treated as a regex
String[] parts = str.split("\\b");

for (String part : parts) {
    System.out.println(part);
}
```
Output
```
Hello
,
world
.
Java
```
If you want to tokenize the string using a regex pattern, just use the split method instead of the StringTokenizer class.
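If the delimiters themselves should be dropped, or if you prefer to read tokens one at a time in the StringTokenizer style, the sketch below shows two possible approaches. It is only an illustration under stated assumptions: the class name RegexTokenDemo, the method names, and the delimiter pattern “[,.\\s]+” (one or more commas, dots, or whitespace characters) are hypothetical choices, not part of the original example. It uses the standard Pattern.split method and a Scanner with a regex delimiter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Scanner;
import java.util.regex.Pattern;

public class RegexTokenDemo {

    // Assumed delimiter pattern: one or more commas, dots, or whitespace characters.
    // Compiled once so it can be reused across calls.
    private static final Pattern DELIMITERS = Pattern.compile("[,.\\s]+");

    // Array-based approach: returns all the tokens at once.
    static String[] splitTokens(String text) {
        return DELIMITERS.split(text);
    }

    // Iterator-style approach: hasNext/next play the role of
    // hasMoreTokens/nextToken, but the delimiter is a real regex.
    static List<String> scanTokens(String text) {
        List<String> tokens = new ArrayList<>();
        try (Scanner scanner = new Scanner(text)) {
            scanner.useDelimiter(DELIMITERS);
            while (scanner.hasNext()) {
                tokens.add(scanner.next());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String str = "Hello, world. Java";
        System.out.println(Arrays.toString(splitTokens(str))); // [Hello, world, Java]
        System.out.println(scanTokens(str));                   // [Hello, world, Java]
    }
}
```

The Scanner version may be the more natural replacement when existing code is written around a hasMoreTokens and nextToken loop, while split is shorter when an array of tokens is all you need.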
If you want to learn more about the string tokenizer, please visit the Java StringTokenizer Tutorial.