shpider

Web automation library in Haskell.

http://github.com/ozataman/shpider

Latest on Hackage:0.2.1.1

This package is not currently in any snapshots. If you're interested in using it, we recommend adding it to Stackage Nightly. Doing so will make builds more reliable, and allow stackage.org to host generated Haddocks.

BSD-3-Clause licensed by Johnny Morrice, Ozgun Ataman
Maintained by Ozgun Ataman

Shpider is a web automation library for Haskell. It allows you to quickly write crawlers, and for simple cases ( like following links ) even without reading the page source.

It has useful features such as turning relative links from a page into absolute links, options to authorize transactions only on a given domain, and the option to only download html documents.

It also provides a nice syntax for filling out forms.

An example:

runShpider $ do
     download "http://apage.com"
     theForm : _ <- getFormsByAction "http://anotherpage.com"
     sendForm $ fillOutForm theForm $ pairs $ do
           "occupation" =: "unemployed Haskell programmer"
           "location" =: "mother's house"