| boilerpipeR-package | Extract the main content from HTML files |
| ArticleExtractor | A full-text extractor which is tuned towards news articles. |
| ArticleSentencesExtractor | A full-text extractor which is tuned towards extracting sentences from news articles. |
| boilerpipe | Extract the main content from HTML files |
| CanolaExtractor | A full-text extractor trained on the 'krdwrd' Canola corpus (see 'https://krdwrd.org/trac/attachment/wiki/Corpora/Canola/CANOLA.pdf'). |
| content | WordPress-generated web page (retrieved from the Quantivity blog, https://quantivity.wordpress.com). The content is stored as a character string and is ready to be extracted. |
| DefaultExtractor | A quite generic full-text extractor. |
| Extractor | Generic extraction function which calls boilerpipe extractors |
| KeepEverythingExtractor | Marks everything as content. |
| LargestContentExtractor | A full-text extractor which extracts the largest text component of a page. |
| NumWordsRulesExtractor | A quite generic full-text extractor solely based upon the number of words per block (the current, the previous and the next block). |
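
The entries above can be combined into a short session. The following is a minimal sketch, assuming boilerpipeR and its Java dependency are installed; it runs one specific extractor and the generic `Extractor` on the bundled `content` dataset. The positional argument order of `Extractor` (extractor name first, then the HTML) is an assumption here and should be checked against the function's help page.

```r
# Sketch only: assumes boilerpipeR (and a working Java setup) is available.
library(boilerpipeR)

# Load the bundled 'content' dataset: raw HTML stored as a character string.
data(content)

# Call one of the specific extractors directly on the HTML.
article <- ArticleExtractor(content)

# Alternatively, go through the generic Extractor() and name the extractor.
# Argument order (name, then content) is assumed, not confirmed by the index.
largest <- Extractor("LargestContentExtractor", content)

# Print the start of the extracted text to inspect the result.
cat(substr(article, 1, 200), "\n")
```

Which extractor fits best depends on the page: `ArticleExtractor` is tuned towards news articles, while `DefaultExtractor` and `NumWordsRulesExtractor` are the more generic choices listed in the index.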