dom-selector

DOM traversal by CSS selectors for the xml-conduit package

https://github.com/nebuta/

Latest on Hackage: 0.2.0.1


BSD-3-Clause licensed by Nebuta Lab
Maintained by [email protected]

CSS selector support for xml-conduit/html-conduit. This package supports compile-time checking of CSS selectors using quasiquotes. All DOM traversals are purely functional.

  • Quick start

-- The language pragmas must come first, before the module header.
{-# LANGUAGE OverloadedStrings, QuasiQuotes #-}

module Main (main) where

import Text.XML.Cursor (fromDocument)
import Text.HTML.DOM (parseLBS)
import qualified Data.Text.Lazy.IO as TI (putStrLn)

import Control.Monad (mapM_)

import Text.XML.Scraping (innerHtml)
import Text.XML.Selector.TH

import Network.HTTP.Conduit
import Data.Conduit.Binary

main :: IO ()
main = do
   root <- fmap (fromDocument . parseLBS) $ simpleHttp "https://news.google.com/"
   let cs = queryT [jq| h2 span.titletext |] root
   mapM_ (TI.putStrLn . innerHtml) cs

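You can traverse a DOM tree with elementary CSS selectors. The selector in the example above, h2 span.titletext, combines a type selector, a descendant combinator, and a class selector.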
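To make the meaning of such a selector concrete, here is a minimal sketch of the same traversal written by hand with the cursor axes from xml-conduit's Text.XML.Cursor. It is not how dom-selector implements its queries, and it does not use the jq quasiquoter. The helper names hasClass, titleSpans, and innerText, and the sample document, are hypothetical and exist only for this illustration. It assumes the xml-conduit and text packages are available.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module Main (main) where

import Control.Monad ((>=>))
import qualified Data.Text as T
import qualified Data.Text.IO as TIO
import qualified Data.Text.Lazy as TL
import Text.XML (def, parseText_)
import Text.XML.Cursor
  (Cursor, attribute, check, content, element, fromDocument, ($//), (&//))

-- Hypothetical helper: True when the element's class attribute contains
-- the given class name as one of its whitespace-separated tokens.
hasClass :: T.Text -> Cursor -> Bool
hasClass cls c = cls `elem` concatMap T.words (attribute "class" c)

-- Hand-written counterpart of the selector "h2 span.titletext":
-- every span with class titletext that is a descendant of an h2.
titleSpans :: Cursor -> [Cursor]
titleSpans root =
  root $// element "h2" &// element "span" >=> check (hasClass "titletext")

-- Hypothetical helper: concatenated text of all descendant text nodes.
innerText :: Cursor -> T.Text
innerText c = T.concat (c $// content)

-- Small in-memory document so the sketch runs without network access.
sample :: TL.Text
sample = TL.concat
  [ "<html><body>"
  , "<h2><span class=\"titletext lead\">First headline</span></h2>"
  , "<h2><span class=\"other\">Skipped</span></h2>"
  , "<div><span class=\"titletext\">Not inside h2</span></div>"
  , "<h2><a><span class=\"titletext\">Second headline</span></a></h2>"
  , "</body></html>"
  ]

main :: IO ()
main = do
  let root = fromDocument (parseText_ def sample)
  -- Prints "First headline" and "Second headline", one per line.
  mapM_ (TIO.putStrLn . innerText) (titleSpans root)
```

Compared with this hand-written version, the quasiquoted selector in the quick start states the same intent in one short expression, and the selector syntax is checked when the program is compiled.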

  • Other examples

https://github.com/nebuta/dom-selector/tree/master/examples

Changes:

Ver 0.2.1: Removed an inappropriate Safe Haskell pragma.

Ver 0.2: All scraping functions in Text.XML.Scraping now return lazy Text. They are implemented with a type class.