Jsoup preserve new lines example

This Jsoup preserve new lines example shows how to keep new lines while using Jsoup to parse HTML. It covers literal newline characters (\n) as well as the line breaks created by <br> and <p> tags.

How to preserve new lines while using Jsoup?

By default, Jsoup removes the newline character “\n” from the HTML. It also does not retain the new lines created by “<br>” or “<p>” tags. Consider the example given below.
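The default behavior can be reproduced with a minimal sketch like the following. The class name, method name, and sample HTML are illustrative and not from the original example. It assumes jsoup 1.14.2 or later, where Whitelist was renamed to Safelist; on older versions, Whitelist.none() should work the same way.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

class DefaultCleanDemo {

    // Strips every tag using jsoup's default output settings (pretty print on).
    // Hypothetical helper name, used only for this illustration.
    static String cleanWithDefaults(String html) {
        return Jsoup.clean(html, Safelist.none());
    }

    public static void main(String[] args) {
        String html = "<div>line1\nline2</div>";
        String cleaned = cleanWithDefaults(html);

        System.out.println(cleaned);
        // With pretty print on, the newline inside the text is normalized away.
        System.out.println("contains newline: " + cleaned.contains("\n"));
    }
}
```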

Output

As you can see from the output, Jsoup replaced “\n” with a space character. To prevent Jsoup from removing the newline characters, you can change the Document.OutputSettings used by Jsoup and turn pretty print off as given below.
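A minimal sketch of turning pretty print off while cleaning might look like this. The class and method names are illustrative, and it again assumes jsoup 1.14.2 or later (Safelist instead of Whitelist). The empty string is the base URI argument, which is not needed here.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Safelist;

class KeepNewlinesDemo {

    // Strips every tag but keeps literal "\n" characters by disabling pretty print.
    // Hypothetical helper name, used only for this illustration.
    static String cleanKeepingNewlines(String html) {
        Document.OutputSettings settings = new Document.OutputSettings().prettyPrint(false);
        return Jsoup.clean(html, "", Safelist.none(), settings);
    }

    public static void main(String[] args) {
        String html = "<div>line1\nline2</div>";
        String cleaned = cleanKeepingNewlines(html);

        System.out.println(cleaned);
        System.out.println("contains newline: " + cleaned.contains("\n"));
    }
}
```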

Output

We cleaned the input HTML using the clean method (clean HTML full example). We passed Whitelist.none() as the whitelist (the class was renamed Safelist in jsoup 1.14.2), so all the HTML tags were removed from the HTML string. In OutputSettings we set pretty print to false, which prevented Jsoup from removing the newline characters.

How to retain new lines created by <br> and <p> tags?

Often, new lines in HTML output are created by <br> and <p> tags rather than by newline characters. When the HTML is cleaned with Jsoup's clean method, the tags are removed and those new lines are lost. The example given below shows how to retain such new lines.
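One possible way to do this is sketched below; it is not necessarily the approach of the original example. The idea is to insert real newline text nodes next to the <br> and <p> elements before stripping the tags, with pretty print off in both steps so the inserted newlines survive. The class name, method name, and the choice of one newline per <br> and two per <p> are assumptions made for illustration. It assumes jsoup 1.14.2 or later.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;
import org.jsoup.nodes.TextNode;
import org.jsoup.safety.Safelist;

class TagNewlinesDemo {

    // Converts <br> and <p> boundaries into "\n" characters, then strips all tags.
    // Hypothetical helper name, used only for this illustration.
    static String cleanKeepingTagNewlines(String html) {
        Document doc = Jsoup.parseBodyFragment(html);
        doc.outputSettings().prettyPrint(false);

        // Add a newline text node right after every <br>.
        for (Element br : doc.select("br")) {
            br.after(new TextNode("\n"));
        }
        // Add a blank line before every <p>.
        for (Element p : doc.select("p")) {
            p.before(new TextNode("\n\n"));
        }

        // Serialize without pretty printing so the inserted newlines are kept,
        // then remove every tag, again with pretty print off.
        String marked = doc.body().html();
        Document.OutputSettings settings = new Document.OutputSettings().prettyPrint(false);
        return Jsoup.clean(marked, "", Safelist.none(), settings);
    }

    public static void main(String[] args) {
        String html = "<p>one</p><p>two<br>three</p>";
        System.out.println(cleanKeepingTagNewlines(html));
    }
}
```

The result may start with leading newlines (from the first <p>); call trim() on it if that is not wanted.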

Output

This example is a part of the Jsoup tutorial with examples.

About the author

RahimV

My name is RahimV and I have over 16 years of experience in designing and developing Java applications. Over the years I have worked with many Fortune 500 companies as an eCommerce Architect. My goal is to provide high-quality but simple-to-understand Java tutorials and examples for free.
