
Java split string by comma example

This example shows how to split a string by comma in Java. It also shows how to handle a CSV record that contains commas inside double quotes or parentheses.

How to split String by comma in Java?

You can split a string by comma in Java using the split method of the String class. The split method returns an array containing the parts of the string produced by the split operation.

Java split String by comma example
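A minimal sketch of the basic case follows. The class name, method name, and sample record are assumptions made for illustration; they are not taken from the original listing.

```java
import java.util.Arrays;

class SplitByCommaBasic {

    // Splits on every literal comma; spaces around the values are kept.
    static String[] splitByComma(String record) {
        return record.split(",");
    }

    public static void main(String[] args) {
        // Hypothetical sample record (assumption, not from the article).
        String record = "one, two ,three,four";
        String[] parts = splitByComma(record);

        // Brackets make the leftover spaces visible.
        for (String part : parts) {
            System.out.println("[" + part + "]");
        }
        System.out.println("Total parts: " + parts.length);
    }
}
```

With this sample record the second element is " two ", including the spaces on both sides.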

Output

As you may have noticed, some of the array elements contain spaces. You can use the "\\s*,\\s*" regular expression to split the string by comma and drop the whitespace around each element, where \\s* matches zero or more whitespace characters and , matches the literal comma.

So basically we are looking for a comma with zero or more leading and zero or more trailing whitespace characters. Once a match is found, the split method splits the string at the match. Since the whitespace is now part of the delimiter, it is not returned in the result (the split method does not return delimiters).
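The whitespace-tolerant pattern can be sketched as below. The class name, method name, and sample record are illustrative assumptions.

```java
import java.util.Arrays;

class SplitByCommaTrimmed {

    // The delimiter is a comma plus any whitespace on either side of it,
    // so that whitespace is consumed and never reaches the result.
    static String[] splitAndTrim(String record) {
        return record.split("\\s*,\\s*");
    }

    public static void main(String[] args) {
        // Hypothetical sample record (assumption, not from the article).
        String record = "one, two ,three ,  four";
        System.out.println(Arrays.toString(splitAndTrim(record)));
    }
}
```

Note that the pattern only removes whitespace next to a comma. Whitespace at the very start or end of the whole string is not next to a delimiter, so call trim() on the input first if that matters.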

Output

How to return empty fields while splitting a String with comma?

Let’s change our CSV values a bit, as given below.
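As a stand-in for the original listing, here is a sketch with a hypothetical record that has six values, the last two of them empty. The record contents and all names are assumptions.

```java
import java.util.Arrays;

class SplitTrailingEmptyDefault {

    // Plain split with the default limit of zero.
    static String[] splitDefault(String record) {
        return record.split(",");
    }

    public static void main(String[] args) {
        // Hypothetical record: six fields, the 5th and 6th are empty.
        String record = "one,two,three,four,,";
        String[] parts = splitDefault(record);

        System.out.println(Arrays.toString(parts));
        System.out.println("Total parts: " + parts.length);
    }
}
```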

Output

We have a total of 6 values in the string (the 5th and 6th are empty), but the split method returned only 4 parts. That is because, by default, the split method discards trailing empty strings. To have these values returned, we need to specify the limit parameter of the split method.

The limit parameter controls how many times the regex pattern is applied to the string. The default limit is zero, which applies the pattern as many times as possible but discards trailing empty strings. If the limit is positive, the pattern is applied at most limit - 1 times, so the array has at most limit elements and the last element holds the rest of the string. If the limit is negative, the pattern is applied as many times as possible and trailing empty strings are retained. Let’s apply a limit to the above code.
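The effect of the three kinds of limit can be compared with a small sketch. The record and the names used here are hypothetical.

```java
import java.util.Arrays;

class SplitWithLimit {

    // Thin wrapper so the limit variants are easy to compare.
    static String[] splitWithLimit(String record, int limit) {
        return record.split(",", limit);
    }

    public static void main(String[] args) {
        // Hypothetical record: six fields, the 5th and 6th are empty.
        String record = "one,two,three,four,,";

        // Zero: same as split(","), trailing empty strings are discarded.
        System.out.println(splitWithLimit(record, 0).length);

        // Negative: every field is kept, including trailing empty ones.
        System.out.println(splitWithLimit(record, -1).length);

        // Positive: at most 3 elements; the last one holds the unsplit rest.
        System.out.println(Arrays.toString(splitWithLimit(record, 3)));
    }
}
```

For CSV-like records where the number of columns matters, a negative limit is usually the variant you want.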

Output

How to split String by comma but ignore comma in parentheses?

Consider the input values given below for the code we just wrote.

And our expected output is:

Output

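Our pattern also matched the commas between parentheses. We want to match only the commas that are not between parentheses. We need to rewrite our pattern to ",(?![^()]*\\))" where , matches the comma and (?![^()]*\\)) is a negative lookahead that rejects the comma when it is followed by zero or more non-parenthesis characters and then a closing parenthesis.

Basically, we are looking for a comma that is not followed by a closing parenthesis without an opening parenthesis in between, thus ignoring the commas inside parentheses.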
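The sketch below contrasts the plain comma pattern with the lookahead version. The sample record and all names are assumptions, and the approach assumes the parentheses are balanced and not nested.

```java
import java.util.Arrays;

class SplitIgnoreParentheses {

    // Splits on commas that are not followed by ")" before any "(" or ")".
    // Assumption: parentheses are balanced and never nested.
    static String[] splitOutsideParentheses(String record) {
        return record.split(",(?![^()]*\\))");
    }

    public static void main(String[] args) {
        // Hypothetical sample record (assumption, not from the article).
        String record = "one,two(2a,2b),three,(4a,4b)";

        // Plain split also breaks the values inside the parentheses.
        System.out.println(Arrays.toString(record.split(",")));

        // Lookahead split keeps each parenthesized group in one piece.
        System.out.println(Arrays.toString(splitOutsideParentheses(record)));
    }
}
```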

Output

How to split String by comma but ignore comma in double quotes?

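Sometimes CSV record values are enclosed in double quotes. The values may themselves contain commas, which we need to ignore while splitting the values by comma. We are going to use the "\"(,\")?" pattern, where \" matches a double quote and (,\")? optionally matches a comma followed by another double quote.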

Here is the example program.
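A minimal sketch of such a program follows, assuming every value in the record is wrapped in double quotes. The record and the names are hypothetical.

```java
import java.util.Arrays;

class SplitAllQuoted {

    // Delimiter is either a lone quote (start or end of the record)
    // or the three-character sequence quote, comma, quote between values.
    static String[] splitQuoted(String record) {
        return record.split("\"(,\")?");
    }

    public static void main(String[] args) {
        // Hypothetical record in which all values are quoted.
        String record = "\"one,1\",\"two\",\"three,3\"";
        String[] parts = splitQuoted(record);

        // The opening quote is a delimiter at index 0, so the first
        // element of the array is an empty string.
        for (String part : parts) {
            System.out.println("[" + part + "]");
        }
    }
}
```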

Output

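Note: the first element of the returned array is an empty string, because the pattern matches the opening double quote at the very start of the string. Java 8 stopped producing a leading empty string only for zero-width matches at the start, so this applies to Java 8 and later as well as to Java 1.7 and below.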

The above regular expression works for a string in which all values are enclosed in double quotes, but it fails when some values are enclosed in double quotes and others are not.

Consider the example string given below.

Output

Here is a more accurate version of the regular expression that handles this scenario: ",(?=([^\"]*\"[^\"]*\")*[^\"]*$)". Basically, we are looking for a comma that is followed by either zero or an even number of double quotes up to the end of the string, which means the comma lies outside any quoted value.
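This pattern can be tried on a record that mixes quoted and unquoted values, as in the sketch below. The record and names are assumptions, and the sketch assumes the quotes are balanced and contain no escaped quotes.

```java
import java.util.Arrays;

class SplitMixedQuoted {

    // Splits on a comma only when the rest of the string contains an even
    // number of double quotes, i.e. the comma is outside a quoted value.
    static String[] splitOutsideQuotes(String record) {
        return record.split(",(?=([^\"]*\"[^\"]*\")*[^\"]*$)");
    }

    public static void main(String[] args) {
        // Hypothetical record mixing quoted and unquoted values.
        String record = "one,\"two,2\",three,\"four,4\"";
        String[] parts = splitOutsideQuotes(record);

        // Quoted values keep their surrounding double quotes.
        for (String part : parts) {
            System.out.println("[" + part + "]");
        }
    }
}
```

Unlike the previous pattern, the double quotes are not part of the delimiter here, so they remain in the returned values and have to be stripped separately if they are not wanted.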

Output


 
