fast-tagsoup Fast parsing and extracting information from (possibly malformed) HTML/XML documents
https://github.com/vshabanov/fast-tagsoup
Fast TagSoup parser. Speeds of 20-200MB/sec were observed.
Works only with strict bytestrings.
This library is intended to be used in conjunction with the original tagsoup package:
import Text.HTML.TagSoup hiding (parseTags, renderTags)
import Text.HTML.TagSoup.Fast
Besides speed, fast-tagsoup correctly handles HTML <script> and <style> tags, converts tags to lower case, and can decode non-UTF-8 XML for you.
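As a rough illustration of the import pattern above, here is a minimal sketch that collects link targets from a strict bytestring. It assumes that parseTags from Text.HTML.TagSoup.Fast has the type ByteString -> [Tag ByteString]; the helper name links and the sample input are made up for this example. The uppercase <A> in the input is there because tag names are converted to lower case, so matching on "a" should be enough.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main where

import qualified Data.ByteString.Char8 as B
import Text.HTML.TagSoup (Tag (..))
import Text.HTML.TagSoup.Fast (parseTags)

-- | Hypothetical helper: collect the href attribute of every <a> open tag.
-- Assumes parseTags :: B.ByteString -> [Tag B.ByteString].
links :: B.ByteString -> [B.ByteString]
links html =
  [ href
  | TagOpen "a" attrs <- parseTags html   -- tag names arrive lower-cased
  , Just href <- [lookup "href" attrs]    -- skip anchors without an href
  ]

main :: IO ()
main =
  -- Sample input is illustrative only; real input would typically come
  -- from B.readFile or an HTTP response body.
  mapM_ B.putStrLn
    (links "<p><A href=\"https://example.com\">one</A> <a href=\"/two\">two</a></p>")
```

Since the parser works on strict bytestrings only, lazy input has to be converted first (for example with Data.ByteString.Lazy.toStrict). The remaining tagsoup helpers still come from the original Text.HTML.TagSoup module.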
This parser is used in production in BazQux Reader feeds and comments crawler.