Java split String by new line Example

This Java example shows how to split a string by new line. It also shows how to split a string by new line using a regular expression, how to ignore empty lines, and how to handle files created on Windows, Unix, and Mac OS.

How to split String by new line in Java?

While processing a file or text area input, you often need to split a string by new line characters to get each line. You can do this with a regular expression in Java.

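Different operating systems use different characters to represent a new line: Windows uses \r\n (carriage return followed by line feed), Unix and Linux use \n (line feed), and classic Mac OS (before OS X) uses \r (carriage return).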

While splitting a string by new line, we need to take care of all the possible new line characters. So basically, we need to look for zero or one \r followed by \n, or just \r. As a regex pattern in a Java string this becomes “\\r?\\n|\\r”, where “\\r?” matches an optional carriage return, “\\n” matches a line feed, and “|\\r” adds a lone carriage return as the alternative.

The above pattern covers the new line conventions of all three operating systems.
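A minimal sketch of the split with this pattern is given below. The class name, the method name, and the sample text are illustrative assumptions, not the original listing; the sample mixes all three new line conventions in one string.

```java
import java.util.Arrays;

class SplitByNewLineExample {

    // Matches \r\n (Windows), \n (Unix) or a lone \r (classic Mac OS)
    static final String NEW_LINE_REGEX = "\\r?\\n|\\r";

    static String[] splitLines(String text) {
        return text.split(NEW_LINE_REGEX);
    }

    public static void main(String[] args) {
        // Sample text that mixes all three new line conventions
        String text = "Line1\r\nLine2\nLine3\rLine4";

        System.out.println(Arrays.toString(splitLines(text)));
        // prints [Line1, Line2, Line3, Line4]
    }
}
```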


How to include empty trailing lines?

The split method, when called with only the above pattern, drops empty trailing lines at the end of the string or file.
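This behavior can be seen in a short sketch. The class name and the sample text (which ends with a new line) are assumptions made for illustration.

```java
class TrailingEmptyLinesDropped {

    static String[] splitLines(String text) {
        // split(regex) behaves like split(regex, 0), which discards trailing empty strings
        return text.split("\\r?\\n|\\r");
    }

    public static void main(String[] args) {
        // The text ends with a new line, so there is an empty string after the last separator
        String text = "Line1\r\nLine2\r\n";

        System.out.println(splitLines(text).length);
        // prints 2
    }
}
```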


The last empty line is not included in the output. You can include trailing empty lines by passing a negative limit, such as -1, as the second argument of the split method.
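A sketch of the two-argument form follows, again with an assumed class name and sample text. A negative limit tells split to apply the pattern as many times as possible and to keep trailing empty strings.

```java
import java.util.Arrays;

class TrailingEmptyLinesKept {

    static String[] splitKeepingTrailing(String text) {
        // A negative limit keeps trailing empty strings in the result
        return text.split("\\r?\\n|\\r", -1);
    }

    public static void main(String[] args) {
        String text = "Line1\r\nLine2\r\n";

        String[] lines = splitKeepingTrailing(text);
        System.out.println(lines.length);
        // prints 3
        System.out.println(Arrays.toString(lines));
        // prints [Line1, Line2, ]
    }
}
```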


How to ignore empty lines in between?

Sometimes a string or file contains empty lines between the content.
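The sketch below uses an assumed sample text with two empty lines between Line1 and Line2 to show what the basic pattern returns in that case. The class and method names are illustrative.

```java
import java.util.Arrays;

class EmptyLinesInBetween {

    static String[] splitLines(String text) {
        return text.split("\\r?\\n|\\r");
    }

    public static void main(String[] args) {
        // Two empty lines sit between Line1 and Line2
        String text = "Line1\n\n\nLine2\nLine3";

        String[] lines = splitLines(text);
        System.out.println(lines.length);
        // prints 5
        System.out.println(Arrays.toString(lines));
        // prints [Line1, , , Line2, Line3]
    }
}
```

Each empty line shows up as an empty string in the resulting array.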


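For example, the string “Line1\r\n\r\n\r\nLine2” has two empty lines between Line1 and Line2, and splitting it with the pattern above produces two empty strings. If you want to ignore empty lines between lines, you can use the “[\r\n]+” regex pattern, where “[\r\n]” is a character class that matches either a carriage return or a line feed. Do not put alternation or quantifiers inside the brackets: “[\r?\n|\r]+” would also treat the literal characters ? and | as separators, because they have no special meaning inside a character class.

The pattern matches one or more consecutive new line characters because “+” is applied to the character class.

Note: “\r” and “\n” are written with a single backslash here, so the Java compiler places the actual carriage return and line feed characters in the pattern and the regex engine matches them literally. The double-backslash form “[\\r\\n]+” works as well.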
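The following sketch splits on runs of new line characters using the character class “[\r\n]+”, which matches one or more carriage returns or line feeds in a row. The class name and sample text are assumptions; the second call shows that characters such as ? and | are left alone by this pattern.

```java
import java.util.Arrays;

class IgnoreEmptyLines {

    static String[] splitNonEmptyLines(String text) {
        // One or more consecutive \r or \n characters act as a single separator
        return text.split("[\r\n]+");
    }

    public static void main(String[] args) {
        String text = "Line1\r\n\r\n\r\nLine2\nLine3";
        System.out.println(Arrays.toString(splitNonEmptyLines(text)));
        // prints [Line1, Line2, Line3]

        // Question marks and pipes are ordinary content, not separators
        System.out.println(Arrays.toString(splitNonEmptyLines("a?b|c\n\nd")));
        // prints [a?b|c, d]
    }
}
```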


How to split string by new line in Java 8?

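If you are using Java 8 or later, you can use the “\R” linebreak matcher (written “\\R” in a Java string literal) instead of the “\\r?\\n|\\r” pattern. “\R” matches \r\n, \n, and \r, as well as other Unicode line terminators such as \u0085, \u2028, and \u2029.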
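A minimal sketch using the linebreak matcher is shown below. It assumes a Java 8 or later runtime; the class name and the sample text are illustrative.

```java
import java.util.Arrays;

class SplitWithLinebreakMatcher {

    static String[] splitLines(String text) {
        // \R is the linebreak matcher added in Java 8; it treats \r\n as a single separator
        return text.split("\\R");
    }

    public static void main(String[] args) {
        String text = "Line1\r\nLine2\nLine3\rLine4";

        System.out.println(Arrays.toString(splitLines(text)));
        // prints [Line1, Line2, Line3, Line4]
    }
}
```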


How to split string by new line using line.separator? (Not recommended)

line.separator is a system property that gives the default new line characters for the operating system. It can be used to split a string by passing the value of System.getProperty("line.separator") to the split method.
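A sketch of this approach follows, with an assumed class name and sample text. No expected output is shown because the result depends on the platform the code runs on.

```java
import java.util.Arrays;

class SplitWithLineSeparator {

    static String[] splitBySystemSeparator(String text) {
        // "\r\n" on Windows, "\n" on Unix-like systems
        String separator = System.getProperty("line.separator");
        return text.split(separator);
    }

    public static void main(String[] args) {
        // Text that uses Windows new lines regardless of where the code runs
        String text = "Line1\r\nLine2\r\nLine3";

        String[] lines = splitBySystemSeparator(text);

        // On Windows this yields three clean lines. On Unix the split happens on \n only,
        // so the first two elements keep a trailing \r.
        System.out.println(Arrays.toString(lines));
    }
}
```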


Why is it not recommended? Suppose the string contains \r\n as the new line and the program runs on a Windows machine. The “line.separator” property then returns the Windows new line, \r\n, and the program works as expected.

If the same code were run on Unix, “line.separator” would return \n, and each line split from that string would keep a stray trailing \r. This approach is not recommended for that reason: it is platform dependent, which means you cannot reliably process a file generated in a different environment.

