[New Plugin] TNCC - Readability

Hi Bubblers!

I’ve just released a new plugin: TNCC - Readability.
It brings the Readability library to Bubble. Readability is the standalone version of the library that powers Firefox Reader View.

“Reader View is a Firefox feature that strips away clutter like buttons, ads, background images, and videos […]” (Firefox Reader View for clutter-free web pages | Firefox Help)

You can think of this plugin as a scraper for blog articles: it returns some metadata, a lightweight HTML version of the article, and a text-only version.

=== SETUP ===
Drop the element on your page

The plugin provides two actions:

Parse:
This action takes an HTML string as a parameter. It parses the string and removes all the HTML that is not needed to read the main content (buttons, ads, background images, and videos).
When parsing is done, the element triggers the “Parsed” event. Use that event to access the following results:

  • title: title of the web page
  • content: HTML content of the parsed web page
  • text: text content of the parsed web page (stripped of HTML tags)
  • length: length of the text, in characters
  • excerpt: text description or short excerpt from the text (from web page metadata)
  • author metadata: author of the article (from web page metadata)
  • siteName: name of the website
  • lang: language of the text
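
To give a feel for how "text" and "length" relate to the raw HTML, here is a minimal Python sketch. It is not the Readability algorithm and not the plugin's code: it is a deliberately naive stand-in that drops a fixed list of tags using only the standard library. The names ParsedArticle, parse_html, SKIPPED_TAGS and BLOCK_TAGS are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from html.parser import HTMLParser

# Tags whose whole subtree is discarded. This fixed list is an assumption for
# illustration; Readability scores content rather than using a blocklist.
SKIPPED_TAGS = {"script", "style", "button", "nav", "aside", "video", "iframe"}
# Tags that should act as word boundaries in the text-only output.
BLOCK_TAGS = {"p", "div", "li", "br", "tr", "h1", "h2", "h3", "h4"}


@dataclass
class ParsedArticle:
    """Hypothetical container mirroring three of the "Parsed" fields."""
    title: str
    text: str
    length: int


class _TextExtractor(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.skip_depth = 0   # > 0 while inside a skipped subtree
        self.in_title = False
        self.title_parts = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self.skip_depth += 1
        elif tag == "title":
            self.in_title = True
        elif tag in BLOCK_TAGS and self.skip_depth == 0:
            self.text_parts.append(" ")

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS:
            if self.skip_depth > 0:
                self.skip_depth -= 1
        elif tag == "title":
            self.in_title = False
        elif tag in BLOCK_TAGS and self.skip_depth == 0:
            self.text_parts.append(" ")

    def handle_data(self, data):
        if self.in_title:
            self.title_parts.append(data)
        elif self.skip_depth == 0:
            self.text_parts.append(data)


def parse_html(html):
    """Return the title, tag-free text, and text length of an HTML string."""
    extractor = _TextExtractor()
    extractor.feed(html)
    extractor.close()
    # Collapse whitespace so "length" counts readable characters only.
    text = " ".join("".join(extractor.text_parts).split())
    title = " ".join("".join(extractor.title_parts).split())
    return ParsedArticle(title=title, text=text, length=len(text))
```

For example, parse_html applied to a page whose body is a nav menu, one paragraph "Hello world." and a share button keeps only the paragraph text. The real library goes much further (scoring candidate nodes, keeping a cleaned "content" HTML, reading metadata for "excerpt" and "siteName"), so treat this only as a picture of the input and output.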

Parse URL:
This action takes a URL string as a parameter.
It returns the ‘html_body’ of the URL.
You can then use this result in the “Parse” action above to get its content.
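
Outside Bubble, the equivalent of this first step is simply downloading the page as a string. The Python sketch below shows one way to do that with the standard library. It is an assumption about what such a step looks like, not the plugin's actual implementation; fetch_html_body and the User-Agent value are hypothetical names chosen for the example.

```python
from urllib.request import Request, urlopen


def fetch_html_body(url, timeout=10.0):
    """Return the body served at `url`, decoded to text (hypothetical helper)."""
    # Some sites reject requests without a User-Agent; this value is arbitrary.
    request = Request(url, headers={"User-Agent": "readability-sketch/0.1"})
    with urlopen(request, timeout=timeout) as response:
        raw = response.read()
        # Prefer the charset declared in Content-Type, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
    # Replace undecodable bytes instead of failing on a mislabelled page.
    return raw.decode(charset, errors="replace")
```

The string it returns plays the role of ‘html_body’: it is the raw page, clutter included, and still has to go through “Parse” before you get the clean article.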

You can get the plugin here : TNCC - Readability Plugin | Bubble

And the Demo page here : The NoCode Company Demos

Any feedback is appreciated, and so are requests to improve the plugin :slight_smile:

Enjoy!