dlist
Difference lists
Version on this page: | 1.0@rev:1 |
LTS Haskell 22.39: | 1.0@rev:2 |
Stackage Nightly 2024-10-31: | 1.0@rev:2 |
Latest on Hackage: | 1.0@rev:2 |
dlist-1.0@sha256:55ff69d20ce638fc7727342ee67f2f868da61d3dcf3763f790bf9aa0b145e568,3812
Module documentation for 1.0
- Data
- Data.DList
- Data.DList.Unsafe
- Data.DList
Difference Lists
List-like types supporting O(1) append
and snoc
operations.
Installation
dlist
is a Haskell package available from Hackage.
It can be installed with cabal
or stack
.
See the change log for the changes in each version.
Usage
Here is an example of “flattening” a Tree
into a list of the elements in its
Leaf
constructors:
import qualified Data.DList as DList
data Tree a = Leaf a | Branch (Tree a) (Tree a)
flattenSlow :: Tree a -> [a]
flattenSlow = go
where
go (Leaf x) = [x]
go (Branch left right) = go left ++ go right
flattenFast :: Tree a -> [a]
flattenFast = DList.toList . go
where
go (Leaf x) = DList.singleton x
go (Branch left right) = go left `DList.append` go right
(The above code can be found in the benchmark.)
flattenSlow
is likely to be slower than flattenFast
:
-
flattenSlow
uses++
to concatenate lists, each of which is recursively constructed from theleft
andright
Tree
values in theBranch
constructor. -
flattenFast
does not use++
but constructs a composition of functions, each of which is a “cons” introduced byDList.singleton
((x :)
). The functionDList.toList
applies the composed function to[]
, constructing a list in the end.
To see the difference between flattenSlow
and flattenFast
, consider some
rough evaluations of the functions applied to a Tree
:
flattenSlow (Branch (Branch (Leaf 'a') (Leaf 'b')) (Leaf 'c'))
= go (Branch (Branch (Leaf 'a') (Leaf 'b')) (Leaf 'c'))
= go (Branch (Leaf 'a') (Leaf 'b')) ++ go (Leaf 'c')
= (go (Leaf 'a') ++ go (Leaf 'b')) ++ "c"
= ("a" ++ "b") ++ "c"
= ('a' : [] ++ "b") ++ "c"
= ('a' : "b") ++ "c"
= 'a' : "b" ++ "c"
= 'a' : 'b' : [] ++ "c"
= 'a' : 'b' : "c"
flattenFast (Branch (Branch (Leaf 'a') (Leaf 'b')) (Leaf 'c'))
= toList $ go (Branch (Branch (Leaf 'a') (Leaf 'b')) (Leaf 'c'))
= toList $ go (Branch (Leaf 'a') (Leaf 'b')) `append` go (Leaf 'c')
= unsafeApplyDList (go (Branch (Leaf 'a') (Leaf 'b'))) . unsafeApplyDList (go (Leaf 'c')) $ []
= unsafeApplyDList (go (Branch (Leaf 'a') (Leaf 'b'))) (unsafeApplyDList (go (Leaf 'c')) [])
= unsafeApplyDList (go (Branch (Leaf 'a') (Leaf 'b'))) (unsafeApplyDList (singleton 'c') [])
= unsafeApplyDList (go (Branch (Leaf 'a') (Leaf 'b'))) (unsafeApplyDList (UnsafeDList ((:) 'c')) [])
= unsafeApplyDList (go (Branch (Leaf 'a') (Leaf 'b'))) "c"
= unsafeApplyDList (UnsafeDList (unsafeApplyDList (go (Leaf 'a')) . unsafeApplyDList (go (Leaf 'b')))) "c"
= unsafeApplyDList (go (Leaf 'a')) (unsafeApplyDList (go (Leaf 'b')) "c")
= unsafeApplyDList (go (Leaf 'a')) (unsafeApplyDList (singleton 'b') "c")
= unsafeApplyDList (go (Leaf 'a')) (unsafeApplyDList (UnsafeDList ((:) 'b')) "c")
= unsafeApplyDList (go (Leaf 'a')) ('b' : "c")
= unsafeApplyDList (singleton 'a') ('b' : "c")
= unsafeApplyDList (UnsafeDList ((:) 'a')) ('b' : "c")
= 'a' : 'b' : "c"
The left-nested ++
in flattenSlow
results in intermediate list constructions
that are immediately discarded in the evaluation of the outermost ++
. On the
other hand, the evaluation of flattenFast
involves no intermediate list
construction but rather function applications and newtype
constructor wrapping
and unwrapping. This is where the efficiency comes from.
Warning! Note that there is truth in the above, but there is also a lot of
hand-waving and intrinsic complexity. For example, there may be GHC rewrite
rules that apply to ++
, which will change the actual evaluation. And, of
course, strictness, laziness, and sharing all play a significant role. Also, not
every function in the dlist
package is the most efficient for every situation.
Moral of the story: If you are using dlist
to speed up your code, check
to be sure that it actually does. Benchmark!
Design Notes
These are some notes on design and development choices made for the dlist
package.
Avoid ++
The original intent of Hughes’ representation of lists as first-class functions
was to provide an abstraction such that the list append
operation found in
functional programming languages (and now called ++
in Haskell) would not
appear in left-nested positions to avoid duplicated structure as lists are
constructed. The lesson learned by many people using list over the years is that
the append
operation can appear, sometimes surprisingly, in places they don’t
expect it.
One of our goals is for the dlist
package to avoid surprising its users with
unexpected insertions of ++
. Towards this end, there should be a minimal set
of functions in dlist
in which ++
can be directly or indirectly found. The
list of known uses of ++
includes:
DList
:fromList
,fromString
,read
DNonEmpty
:fromList
,fromNonEmpty
,fromString
,read
If any future requested functions involve ++
(e.g. via fromList
), the burden
of inclusion is higher than it would be otherwise.
Abstraction
The DList
representation and its supporting functions (e.g. append
, snoc
,
etc.) rely on an invariant to preserve its safe use. That is, without this
invariant, a user may encounter unexpected outcomes.
(We use safety in the sense that the semantics are well-defined and expected,
not in the sense of side of referential transparency. The invariant does not
directly lead to side effects in the dlist
package, but a program that uses an
unsafely generated DList
may do something surprising.)
The invariant is that, for any xs :: DList a
:
fromList (toList xs) = xs
To see how this invariant can be broken, consider this example:
xs :: DList a
xs = UnsafeDList (const [])
fromList (toList (xs `snoc` 1))
= fromList (toList (UnsafeDList (const []) `snoc` 1))
= fromList (toList (UnsafeDList (unsafeApplyDList (UnsafeDList (const [])) . (x :))))
= fromList (toList (UnsafeDList (const [] . (x :))))
= fromList (($ []) . unsafeApplyDList $ UnsafeDList (const [] . (x :)))
= fromList (const [] . (x :) $ [])
= fromList (const [] [x])
= fromList []
= UnsafeDList (++ [])
The invariant can also be stated as:
toList (fromList (toList xs)) = toList xs
And we can restate the example as:
toList (fromList (toList (xs `snoc` 1)))
= toList (UnsafeDList (++ []))
= []
It would be rather unhelpful and surprising to find (xs `snoc` 1)
turned out
to be the empty list.
To preserve the invariant on DList
, we provide it as an abstract type in the
Data.DList
module. The constructor, UnsafeDList
, and record label,
unsafeApplyDList
, are not exported because these can be used, as shown above,
to break the invariant.
All of that said, there have been numerous requests to export the DList
constructor. We are not convinced that it is necessary, but we are convinced
that users should decide for themselves.
To use the constructor and record label of DList
, you import them as follows:
import Data.DList.Unsafe (DList(UnsafeDList, unsafeApplyDList))
If you are using Safe Haskell, you may need to add this at the top of your module:
{-# LANGUAGE Trustworthy #-}
Just be aware that the burden of proof for safety is on you.
References
These are various references where you can learn more about difference lists.
Research
-
A novel representation of lists and its application to the function “reverse.” John Hughes. Information Processing Letters. Volume 22, Issue 3. 1986-03. Pages 141-144. PDF
This is the original published source for a representation of lists as first-class functions.
Background
Blogs and Mailing Lists
- Using Difference Lists. Douglas M. Auclair. 2008-08-13.
- A Sort of Difference. Edward Kmett. 2008-09-18.
- Reference for technique wanted. Richard O’Keefe, et al. 2010-10-31.
- 24 Days of Hackage: dlist. Oliver Charles. 2012-12-14.
- Constructing a list in a Monad. Joachim Breitner. 2013-11-13.
- Demystifying DList (Reddit). Tom Ellis. 2014-01-24.
- keepEquals with Difference Lists. Douglas M. Auclair. 2014-06-21.
Books
- Chapter 13. Data Structures. Real World Haskell. 2008-12-05.
License
BSD 3-Clause “New” or “Revised” License © Don Stewart, Sean Leather, contributors
Changes
Change Log
Unreleased
No unreleased changes at this time.
v1.0
Released on 2020-07-18, Nelson Mandela International Day.
Added
intercalate
forDList
(#43, Jacob Leach)Traversable
instance forDList
(#45, Veronika Romashkina)Data.DList.Internal
for theDList
implementation,Data.DList.Unsafe
for exporting theDList
constructorUnsafeDList
and record labelunsafeApplyDList
(#55, #59)Data.DList.DNonEmpty
(#60)- GitHub Action for uploading a release (#74)
dlist-bench
, a benchmark package (#71)
Changed
stimes
forDList
defined withstimesMonoid
(#46, Janek Spaderna)- Type of
tail
:DList a -> DList a
toDList a -> [a]
(#69) - GitHub Action for continuous integration testing to replace Travis-CI (#47, #50)
- GHC warning and error improvements (#72, #73)
- Improved documentation (#55, #70, #76, #77)
Removed
list :: b -> (a -> DList a -> b) -> DList a -> b
(#69)
v0.8.0.8
Released on 2020-04-02, World Autism Awareness Day.
Added
toList
in theFoldable
instance forDList
(#36, Ryan Scott)
Changed
QuickCheck
upper bound: 2.14 to 2.15 (a7ea60d
)
Fixed
- Documented time complexity of
head
forDList
(#35, Simon Jakobi)
v0.8.0.7
Released on 2019-08-05, Independence Day in Burkina Faso.
Added
MonadFail
instance forDList
(#32, Vanessa McHale)
Changed
deepseq
upper bound: 2 to 1.5 (#33, Herbert Valerio Riedel)
v0.8.0.6
Released on 2019-03-29, Martyrs’ Day in Madagascar.
Changed
QuickCheck
upper bound: 2.13 to 2.14 (242511c
)
v0.8.0.5
Released on 2018-09-13, Day of the Programmer.
Changed
QuickCheck
upper bound: 2.12 to 2.13 (0e2b3a5
)
v0.8.0.4
Released on 2018-01-19, Kokborok Day.
Added
{-# LANGUAGE Trustworthy #-}
inData.DList
(#31, Bertram Felgenhauer)
Changed
QuickCheck
upper bound: 2.11 to 2.12 (3d9c8ad
)QuickCheck
lower bound: 2.7/2.9 to 2.10 (4f92012
)Arbitrary
,Arbitrary1
instances forNonEmpty
in the test suite copied fromquickcheck-instances
(4f92012
)
v0.8.0.3
Released on 2017-07-04, Independence Day in the United States.
Added
quickcheck-instances
dependency in the test suite for theArbitrary
,Arbitrary1
instances forNonEmpty
(5b41d0d
)
Changed
QuickCheck
upper bound: 2.10 to 2.11 (b2f791a
)
Fixed
stimes
property in the test suite (#30, Oleg Grenrus)
v0.8.0.2
Released on 2016-09-04, World Sexual Health Day.
Fixed
- Missing module
OverloadedStrings
in the test suite (#29, Sergei Trofimovich)
v0.8.0.1
Released on 2016-07-29, the 58th Anniversary of the Creation of NASA.
Changed
QuickCheck
lower bound: 2.7 to 2.9 for GHC >= 8 (#28, Adam Bergmark)
v0.8
Released on 2016-07-17, Constitution Day in South Korea.
Added
- Pattern synonyms
Nil
andCons
(#15) Semigroup
instance forDList
(#25)- Canonical
Applicative
andMonad
instances forDList
(#23, Herbert Valerio Riedel)
Changed
IsString
instance forDList
is no longer flexible (#26, Baldur Blöndal)QuickCheck
upper bound: 2.9 to 2.10 (ef7eac5
)
v0.7.1.2
Released on 2015-08-23, International Day for the Remembrance of the Slave Trade and its Abolition.
Fixed
- Imports causing warnings in GHC 7.10 (#22, Mikhail Glushenkov)
v0.7.1.1
Released on 2015-03-19, St. Joseph’s Day.
Changed
v0.7.1
Released on 2014-06-28, the 100th Anniversary of the Assassination of Franz Ferdinand.
Added
IsList
instance forDList
(#13, Baldur Blöndal)
v0.7.0.1
Released on 2014-03-24, World Tuberculosis Day.
Changed
QuickCheck
upper bound: 2.7 to 2.8 (7494dbc
)
v0.7
Released on 2014-03-17, St. Patrick’s Day.
Added
NFData
instance forDList
(#10, Evan Laforge)IsString
instance forDList
(771a38d
)
Changed
base
lower bound: 2 to 4 (77f6898
)
Removed
DList
constructor and record label,maybeReturn
(62c0c09
)
v0.6.0.1
Released on 2013-12-01, World AIDS Day.
Changed
QuickCheck
lower bound: 2.6 to 2.5 (#9, Michael Snoyman)
v0.6
Released on 2013-11-29, Black Friday.
Added
apply
to replaceDList
record labelunDL
(#4)Eq
,Ord
,Show
, andAlternative
instances forDList
(#1, Bas van Dijk)Read
instance forDList
(58ef305
)Foldable
instance forDList
(5b1d09f
)- Travis-CI for continuous integration testing (#6, Herbert Valerio Riedel)
Changed
- Maintenance: Don Stewart to Sean Leather (#2, Bas van Dijk)
- Repository:
http://code.haskell.org/~dons/code/dlist/
tohttps://github.com/spl/dlist
base
lower bound: 0 to 2 (6e1d9e7
)
Fixed
- Test suite simplified and changed to use
cabal test
(9f58759
)
Deprecated
- Exported
DList
constructor and record label,maybeReturn
(#4)