
Java split String by words example

This Java split String by words example shows how to split a string into words in Java. It also shows how to break sentences into words using the split method.

How to split String by words?

The simplest way to split a string into words is to split it on the space character.
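A minimal sketch of this approach is given here. The test sentence, the class name SplitBySpaceSketch, and the helper method splitBySpace are illustrative assumptions; only the call to String.split with a single space comes from the description above.

```java
import java.util.Arrays;

class SplitBySpaceSketch {

    // Splits the sentence at every single space character.
    static String[] splitBySpace(String sentence) {
        return sentence.split(" ");
    }

    public static void main(String[] args) {
        // Hypothetical test sentence, chosen only for illustration.
        String sentence = "Java split string by words example";

        // Prints: [Java, split, string, by, words, example]
        System.out.println(Arrays.toString(splitBySpace(sentence)));
    }
}
```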

Output

As you can see from the output, it worked for the test sentence string. The sentence is broken down into words by splitting it using space.

Let’s try some other not-so-simple sentences.
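One such test might look like the sketch below. The sentence and the names SplitBySpacePunctuationSketch and splitBySpace are assumptions made for illustration; the split call is the same split by space as before.

```java
class SplitBySpacePunctuationSketch {

    // Same approach as before: split on a single space only.
    static String[] splitBySpace(String sentence) {
        return sentence.split(" ");
    }

    public static void main(String[] args) {
        // Hypothetical sentence containing punctuation marks.
        String sentence = "Hello, world! How are you?";

        // Prints each piece on its own line:
        // Hello,
        // world!
        // How
        // are
        // you?
        for (String word : splitBySpace(sentence)) {
            System.out.println(word);
        }
    }
}
```

With this input, the comma, the exclamation mark, and the question mark stay attached to the neighboring words.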

Output

As you can see from the output, our code did not work as expected. The reason is that a simple split by space is not enough to separate the words in a string. Words in a sentence may also be separated by punctuation marks such as periods, commas, and question marks.

To make the code handle all these punctuation marks and symbols, we will change our regular expression pattern from a single space to a character class that contains the space together with the punctuation marks and symbols.
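The change can be sketched as follows, using the character class discussed in this example written as a Java string literal. The sentence, the class name SplitByPunctuationSketch, the constant SEPARATORS, and the method splitIntoWords are hypothetical names introduced for illustration.

```java
import java.util.Arrays;

class SplitByPunctuationSketch {

    // Character class of space, punctuation marks, and symbols.
    // The trailing + matches one or more consecutive separators.
    static final String SEPARATORS = "[ !\"\\#$%&'()*+,-./:;<=>?@\\[\\]^_`{|}~]+";

    static String[] splitIntoWords(String sentence) {
        return sentence.split(SEPARATORS);
    }

    public static void main(String[] args) {
        // Hypothetical sentence containing punctuation marks.
        String sentence = "Hello, world! How are you?";

        // Prints: [Hello, world, How, are, you]
        System.out.println(Arrays.toString(splitIntoWords(sentence)));
    }
}
```

One caveat with this sketch: String.split drops trailing empty strings but not a leading one, so a sentence that begins with a separator still yields an empty first element.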

Output

This time we got the output we wanted. The regex pattern [ !\"\\#$%&'()*+,-./:;<=>?@\\[\\]^_`{|}~]+ includes almost all the punctuation marks and symbols that can appear in a sentence, as well as the space. The + at the end matches one or more consecutive separators, so a run such as a comma followed by a space counts as a single delimiter and does not produce empty words.

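Instead of this pattern, you can also use the \\P{L}+ pattern to extract words from the sentence. Here \\p{L} is the Unicode character property for letters, and the uppercase \\P negates it, so \\P{L} matches any character that is not a letter. The trailing + treats a run of such characters as a single separator. Note that digits are not letters, so they act as separators too. To use it, pass \\P{L}+ to the split method in place of the previous pattern.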

Please note that the \\P{L} expression works for both ASCII and non-ASCII letters (e.g. accented characters like “café” or “kākā”).
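A minimal sketch of the split with this Unicode property is given below. It uses \\P{L}+ (with a trailing +, an addition in this sketch) so that consecutive non-letters do not produce empty words. The sentence, the class name SplitByNonLettersSketch, and the method splitIntoWords are assumptions made for illustration.

```java
import java.util.Arrays;

class SplitByNonLettersSketch {

    // Splits on every run of characters that are not Unicode letters.
    static String[] splitIntoWords(String sentence) {
        return sentence.split("\\P{L}+");
    }

    public static void main(String[] args) {
        // Hypothetical sentence. Unicode escapes spell the accented letters
        // so that the source file stays ASCII-only.
        String sentence = "The caf\u00e9 serves k\u0101k\u0101-shaped cookies, right?";

        // The accented words are kept whole; the hyphen, comma, space,
        // and question mark all act as separators.
        System.out.println(Arrays.toString(splitIntoWords(sentence)));
    }
}
```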

This example is a part of the Java String tutorial with examples and the Java RegEx tutorial with examples.

