
Jsoup clean HTML example

This Jsoup clean HTML example shows how to clean HTML using Jsoup. It also shows how to remove HTML tags from a String and how to retain specific tags by using a whitelist while cleaning the HTML.

How to remove HTML tags by cleaning the HTML using Jsoup?

You can remove HTML tags from a String using the clean method of the Jsoup class, which takes the HTML string and a whitelist.

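This method removes all HTML tags from the HTML string except the tags included in the specified whitelist. (Newer jsoup releases name this class Safelist instead of Whitelist; the built-in factory methods keep the same names.) Jsoup provides the following whitelists out of the box.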

1) none
All HTML tags are removed; only the text nodes are kept.

2) simpleText
This whitelist allows only the text formatting tags b, em, i, strong and u. All other tags are removed.

3) basic
The basic whitelist allows the a, b, blockquote, br, cite, code, dd, dl, dt, em, i, li, ol, p, pre, q, small, span, strike, strong, sub, sup, u and ul tags. All other tags are removed. It does not allow images.

4) basicWithImages
As the name suggests, this whitelist allows all tags included in the basic whitelist plus the img tag.

5) relaxed
This is the most accommodating whitelist. It allows the a, b, blockquote, br, caption, cite, code, col, colgroup, dd, div, dl, dt, em, h1, h2, h3, h4, h5, h6, i, img, li, ol, p, pre, q, small, span, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, u and ul tags.
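To compare the five built-in lists side by side, the sketch below cleans the same HTML with each of them. It assumes a jsoup release where the class is named Safelist (on older releases, swap in org.jsoup.safety.Whitelist, which has the same factory methods). The class name BuiltInListsExample, the helper cleanWithEach and the sample HTML are made up for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

import java.util.LinkedHashMap;
import java.util.Map;

class BuiltInListsExample {

    // Cleans the same HTML with each built-in list, keyed by the list name.
    static Map<String, String> cleanWithEach(String html) {
        Map<String, String> results = new LinkedHashMap<>();
        results.put("none", Jsoup.clean(html, Safelist.none()));
        results.put("simpleText", Jsoup.clean(html, Safelist.simpleText()));
        results.put("basic", Jsoup.clean(html, Safelist.basic()));
        results.put("basicWithImages", Jsoup.clean(html, Safelist.basicWithImages()));
        results.put("relaxed", Jsoup.clean(html, Safelist.relaxed()));
        return results;
    }

    public static void main(String[] args) {
        // Sample input with a heading, a paragraph, bold text and an image.
        String html = "<h1>Title</h1><p>Some <b>bold</b> text</p>"
                + "<img src=\"http://example.com/a.png\" alt=\"pic\">";
        cleanWithEach(html).forEach((name, cleaned) ->
                System.out.println(name + " -> " + cleaned));
    }
}
```

With this input, the heading should survive only under relaxed, the image only under basicWithImages and relaxed, and the b tag under every list except none.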

How to clean HTML using whitelist?

Create the appropriate whitelist object and pass it, together with the HTML, to the clean method. The cleaned HTML retains only the tags that the whitelist allows.
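A minimal sketch of that call is given below, using the basic list. It assumes a jsoup release where the class is named Safelist (use Whitelist with the same method names on older releases). The class CleanWithListExample, the method cleanBasic and the input string are illustrative only.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

class CleanWithListExample {

    // Removes every tag and attribute that the basic list does not allow.
    static String cleanBasic(String untrustedHtml) {
        Safelist safelist = Safelist.basic();
        return Jsoup.clean(untrustedHtml, safelist);
    }

    public static void main(String[] args) {
        // The input carries an event-handler attribute and a script element.
        String html = "<p onclick=\"steal()\">Read <a href=\"http://example.com/\">this</a></p>"
                + "<script>alert('x')</script>";
        System.out.println(cleanBasic(html));
    }
}
```

The p and a tags are kept, while the script element and the onclick attribute are dropped. The basic list also enforces rel="nofollow" on links, so the cleaned anchor carries that attribute even though the input did not.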


How to retain specific tags while cleaning the HTML document?

The default whitelists come with preconfigured tags. What if you want to retain only particular tags and remove all other HTML tags? The Whitelist class provides an addTags method that accepts any number of tag names to retain.

This method adds the given HTML tags to the whitelist and returns the same whitelist, so calls can be chained.
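As a sketch of chaining addTags onto a preset, the example below extends the basic list with two heading tags. It assumes a jsoup release where the class is named Safelist (older releases expose the same addTags method on Whitelist). The names AddTagsExample and cleanWithHeadings, and the sample HTML, are hypothetical.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

class AddTagsExample {

    static String cleanWithHeadings(String html) {
        // Start from the basic preset and additionally allow h1 and h2.
        Safelist safelist = Safelist.basic().addTags("h1", "h2");
        return Jsoup.clean(html, safelist);
    }

    public static void main(String[] args) {
        // h1 is now allowed; h3 is still not on the list.
        String html = "<h1>Title</h1><h3>Sub</h3><p>Body</p>";
        System.out.println(cleanWithHeadings(html));
    }
}
```

Here the h1 and p tags are retained, while the h3 tag is removed and only its text remains.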

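For example, to retain only <div> tags and remove all other HTML tags from an HTML String, one way is to start from an empty whitelist, such as the none whitelist, and add the div tag to it with addTags.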
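One possible way to retain only div tags is sketched below: start from the none list and add div. This is an assumption about how to build the list, not necessarily the original example's code, and it uses the Safelist class name from newer jsoup releases (Whitelist on older ones). RetainDivExample, keepOnlyDiv and the input are illustrative names.

```java
import org.jsoup.Jsoup;
import org.jsoup.safety.Safelist;

class RetainDivExample {

    static String keepOnlyDiv(String html) {
        // An empty list plus a single allowed tag: div.
        Safelist onlyDiv = Safelist.none().addTags("div");
        return Jsoup.clean(html, onlyDiv);
    }

    public static void main(String[] args) {
        String html = "<div class=\"box\"><p>Hello <b>world</b></p></div><span>bye</span>";
        System.out.println(keepOnlyDiv(html));
    }
}
```

Note that addTags allows the tag itself but none of its attributes, so the class attribute on the div is dropped along with the p, b and span tags.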

