Symparsec

Type-level string (Symbol) parser combinators. A Parsec-like for Symbols; thus, Symparsec! With many of the features you’d expect:

  • define parsers compositionally, largely as you would on the term level
  • pretty, detailed parse errors
  • decent performance (for simple parsers)

Parsers may also be reified and used at runtime with guaranteed identical behaviour via a healthy dose of singletons.

Requires GHC >= 9.6.

Examples

Define a type-level parser:

import Symparsec
type PExample = Skip 1 :*>: Isolate 2 NatHex :<*>: (Literal "_" :*>: TakeRest)

Use it to parse a type-level string (in a GHCi session):

ghci> :k! Run PExample "xFF_etc"
Run ...
= Right '( '(255, "etc"), "")

Use it to parse a different, term-level string:

ghci> import Singleraeh.Demote ( demote )
ghci> run' @PExample demote "abc_123"
Right ((188,"123"),"")
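
To make the behaviour of PExample concrete, here is a rough term-level equivalent written against base only. It is a hand-written sketch, not Symparsec code: pExample and its error strings are hypothetical, and the reading of each combinator (Skip drops characters, Isolate 2 NatHex parses exactly two hex digits, Literal matches a fixed string, TakeRest returns whatever is left) is inferred from the outputs shown above.

```haskell
import Data.List (stripPrefix)
import Numeric (readHex)

-- Hypothetical term-level counterpart of PExample (not part of Symparsec).
-- Returns ((hex value, rest of string), unconsumed input), like the runs above.
pExample :: String -> Either String ((Integer, String), String)
pExample input = do
  -- Skip 1: drop one character, whatever it is.
  afterSkip <- case input of
    (_ : rest) -> Right rest
    []         -> Left "Skip 1: unexpected end of input"
  -- Isolate 2 NatHex: parse exactly the next two characters as hex.
  let (hexDigits, afterHex) = splitAt 2 afterSkip
  n <- case readHex hexDigits of
    [(v, "")] | length hexDigits == 2 -> Right v
    _ -> Left ("Isolate 2 NatHex: bad hex digits " ++ show hexDigits)
  -- Literal "_": the next character must be an underscore.
  rest <- maybe (Left "Literal \"_\": no match") Right (stripPrefix "_" afterHex)
  -- TakeRest: everything left is the second result, so nothing remains.
  Right ((n, rest), "")

main :: IO ()
main = print (pExample "abc_123")
```

Running this prints Right ((188,"123"),""), since 0xbc is 188. The difference with Symparsec is that the same parse can also happen at compile time.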

Why?

Via GHC.Generics, we may inspect Haskell data types on the type level. Constructor names are Symbols. Ever reify these, then perform some sort of checking or parsing on the term level? Symparsec does the parsing on the type level instead. Catch bugs earlier, get faster runtime.
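
As a small illustration of where such Symbols come from, the sketch below pulls a constructor name out of a generic representation on the type level, using only base. Op and FirstConName are made-up names for this example and nothing here is Symparsec API. The resulting Symbol is the kind of input a type-level parser would consume.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}

import Data.Kind (Type)
import Data.Proxy (Proxy (..))
import GHC.Generics
import GHC.TypeLits (Symbol, symbolVal)

-- Hypothetical example type.
data Op = OpAdd | OpSub
  deriving (Generic)

-- Name of the first constructor in a generic representation.
-- Gets stuck on types with no constructors (no V1 equation).
type family FirstConName (rep :: Type -> Type) :: Symbol where
  FirstConName (D1 _ f)                    = FirstConName f  -- unwrap datatype metadata
  FirstConName (l :+: _)                   = FirstConName l  -- go down the left branch
  FirstConName (C1 ('MetaCons name _ _) _) = name            -- constructor metadata

-- Reify the Symbol only to show it; the point is that it exists at compile time.
firstConName :: String
firstConName = symbolVal (Proxy :: Proxy (FirstConName (Rep Op)))

main :: IO ()
main = putStrLn firstConName
```

This prints OpAdd. With a type-level parser, checks on that name (a required prefix, an embedded number) can be done before the program ever runs.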

Also, type-level Haskell authors deserve fun libraries too!

Design

The parser

A parser is a 3-tuple of:

  • a character parser; given a character and a state, returns
    • Cont s: keep going, here’s the next state s
    • Done r: parse successful with value r, do not consume character
    • Err e: parse error, details in the e (a structured error)
  • an end handler, which takes only a state, and can only return Done or Err
  • an initial state

Running such a parser is very simple:

  • initialize state
  • parse character by character until end of input, or Done/Err

Parsers may not communicate with the runner any other way. This means no general backtracking, chunking, etc. This is a conscious decision, made for simplicity. We’re still able to implement a good handful of parser combinators regardless, including a limited form of backtracking.

This is a rough overview. See the code & Haddocks for precise details.
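
For intuition, the design above can be modelled on the term level in a few lines. This is only a sketch under assumptions: the names Result, Parser, run and takeN are invented here, errors are plain strings rather than Symparsec's structured error type, and the real library does all of this with type families.

```haskell
-- | Outcome of feeding one character to a parser.
data Result s r
  = Cont s      -- keep going with the next state
  | Done r      -- success; the current character is NOT consumed
  | Err String  -- failure (a plain message in this sketch)

-- | A parser: character handler, end handler, initial state.
data Parser s r = Parser
  { onChar  :: Char -> s -> Result s r
  , onEnd   :: s -> Either String r  -- can only finish or fail
  , initial :: s
  }

-- | Run a parser, returning its result and the unconsumed input.
run :: Parser s r -> String -> Either String (r, String)
run p = go (initial p)
  where
    go s []       = (\r -> (r, "")) <$> onEnd p s
    go s (c : cs) = case onChar p c s of
      Cont s' -> go s' cs
      Done r  -> Right (r, c : cs)  -- hand the character back
      Err e   -> Left e

-- | Example parser: take exactly n characters.
-- State: how many characters are still wanted, plus those seen (reversed).
takeN :: Int -> Parser (Int, String) String
takeN n = Parser step end (n, "")
  where
    step c (k, acc)
      | k <= 0    = Done (reverse acc)
      | otherwise = Cont (k - 1, c : acc)
    end (k, acc)
      | k <= 0    = Right (reverse acc)
      | otherwise = Left ("takeN: expected " ++ show k ++ " more character(s)")

main :: IO ()
main = do
  print (run (takeN 2) "abc")
  print (run (takeN 2) "a")
```

The first run yields Right ("ab","c"): takeN only learns it is finished when it sees the third character, and because Done is non-consuming that character is handed back as remaining input. The second run fails in the end handler, since input ran out with one character still wanted.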

Contributing

I would gladly accept further combinators or other suggestions. Please add an issue or pull request, or contact me via email or whatever (I’m raehik everywhere).

License

Provided under the MIT license. See LICENSE for license text.

Changes

1.1.1 (2024-06-15)

  • add Apply combinator (effectively fmap)
  • add some more runners and utils (handy for generic-data-functions)

1.1.0 (2024-06-01)

  • add While combinator
  • add Count combinator
  • re-add :<|>: re-export in Symparsec.Parsers

1.0.1 (2024-05-27)

  • add TakeRest combinator
  • re-add :<|>: combinator with more accurate behaviour clarification

1.0.0 (2024-05-25)

  • small rewrite, changing how Done works (now non-consuming)
  • add singletons for all parsers
    • …except :<|>:, which is disabled for now due to complexity

0.4.0 (2024-05-12)

  • rebrand from symbol-parser to Symparsec
  • rename Drop -> Skip (more commonly used for monadic parsers)
  • document parsers
  • provide fixity declarations for infix binary combinators

0.3.0 (2024-04-20)

  • add new parsers: Take, :<|>:
  • tons of cleanup, renaming (RunParser -> Run)
  • add handful of tests (via type-spec)

0.2.0 (2024-04-19)

  • add two more combinators: End, Literal
  • remove some old code (Data.Type.Symbol, Data.Type.Symbol.Natural)
  • fix base lower bound (at least base-4.16, == GHC 9.2)
  • style: don’t tick promoted constructors unless necessary for disambiguation

0.1.0 (2024-04-17)

Initial release.

  • basic combinators: Drop, Isolate, NatHex (etc.), sequencing
  • acceptable error messages