Jsoup preserve new lines example

This example shows how to preserve new lines when using Jsoup to parse HTML. It covers both literal new line characters (\n) and the line breaks created by <br> and <p> tags.

How to preserve new lines while using Jsoup?

By default, Jsoup does not keep the new line character “\n” when it outputs cleaned HTML. It does not retain the line breaks created by “<br>” or “<p>” tags either.
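The default behavior can be reproduced with a minimal sketch like the one below. The class name, method name, and sample input are hypothetical. It also assumes a recent Jsoup release in which the whitelist class is named Safelist; older releases expose the same none() factory on Whitelist.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

public class JsoupDefaultNewLines {

    // Strips every tag using Jsoup's default output settings (pretty print on).
    public static String cleanWithDefaults(String html) {
        return Jsoup.clean(html, Safelist.none());
    }

    public static void main(String[] args) {
        // Hypothetical input: one paragraph containing a literal new line character.
        String html = "<p>First line\nSecond line</p>";

        // With pretty print on, the "\n" is collapsed into a single space.
        System.out.println(cleanWithDefaults(html));
    }
}
```

For this input the cleaned string is "First line Second line", with the two lines joined by a space.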

With the default settings, Jsoup replaces “\n” with a space character in its output. To prevent this, change Jsoup's OutputSettings and turn pretty print off.
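A minimal sketch of turning pretty print off might look like this. The class and method names are hypothetical, and Safelist is again assumed (Whitelist in older Jsoup releases). The four-argument clean method takes a base URI, which is left empty here because the sample has no relative links to resolve.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Safelist;

public class JsoupPreserveNewLines {

    // Strips every tag but leaves whitespace in text untouched.
    public static String cleanPreservingNewLines(String html) {
        // Pretty print off: Jsoup no longer normalizes whitespace on output.
        Document.OutputSettings settings = new Document.OutputSettings().prettyPrint(false);
        return Jsoup.clean(html, "", Safelist.none(), settings);
    }

    public static void main(String[] args) {
        // Hypothetical input: one paragraph containing a literal new line character.
        String html = "<p>First line\nSecond line</p>";

        // The "\n" survives, so the two lines print separately.
        System.out.println(cleanPreservingNewLines(html));
    }
}
```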

The input HTML is cleaned using the clean method (see the clean HTML full example). The whitelist is specified as none (Whitelist.none(), or Safelist.none() in newer Jsoup releases), so all the HTML tags are removed from the HTML string. Setting pretty print to false in OutputSettings prevents Jsoup from replacing the new line characters.

How to retain new lines created by <br> and <p> tags?

Often, a new line in rendered HTML is created by a <br> or <p> tag rather than by a new line character. When the HTML is cleaned with Jsoup's clean method, these tags are removed and the line breaks they created are lost.
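One common way to retain these line breaks is to drop a temporary marker next to each <br> and <p> tag before cleaning, then turn the markers into real new lines. The sketch below illustrates that idea and is not necessarily the exact method this article originally used. The class name, method name, and marker are hypothetical, and it assumes the input text never contains the literal two-character sequence backslash-n.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.safety.Safelist;

public class JsoupRetainTagNewLines {

    // Literal backslash followed by 'n', used as a temporary marker.
    // Assumption: this sequence does not occur in the input text itself.
    private static final String MARKER = "\\n";

    public static String cleanRetainingLineBreaks(String html) {
        Document document = Jsoup.parse(html);
        // Pretty print off so html() does not reflow the text.
        document.outputSettings().prettyPrint(false);

        // Place a marker after every <br> and at the end of every <p>.
        document.select("br").after(MARKER);
        document.select("p").append(MARKER);

        // Swap the markers for real new line characters.
        String marked = document.body().html().replace(MARKER, "\n");

        // Strip all tags, again with pretty print off so the new lines survive.
        return Jsoup.clean(marked, "", Safelist.none(),
                new Document.OutputSettings().prettyPrint(false));
    }

    public static void main(String[] args) {
        String html = "<p>First paragraph</p><p>Second line<br>Third line</p>";
        System.out.println(cleanRetainingLineBreaks(html));
    }
}
```

Because a marker is appended to every paragraph, the result ends with a trailing new line; call trim() on it if that is not wanted.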

Refer to Jsoup examples for more tips and tricks. Please let us know your views in the comments section below.

About the author

rahimv

rahimv

rahimv has over 15 years of experience in designing and developing Java applications. His areas of expertise are J2EE and eCommerce. If you like the website, follow him on Facebook, Twitter or Google Plus.

2 Comments

Your email address will not be published. Required fields are marked *