TOML Parser

This package implements a validating parser for TOML 1.0.0.

This package uses an alex-generated lexer and happy-generated parser.

It also provides a pair of classes for serializing into and out of TOML.

Package Structure

---
title: Package Structure
---
stateDiagram-v2
    classDef important font-weight:bold;

    TOML:::important --> ApplicationTypes:::important : decode
    ApplicationTypes --> TOML : encode
    TOML --> [Token]: Lexer
    [Token] --> [Expr]: Parser
    [Expr] --> Table : Semantics
    Table --> ApplicationTypes : FromValue
    ApplicationTypes --> Table : ToValue
    Table --> TOML : Pretty

Most users will only need to import Toml or Toml.Schema. Other top-level modules are for low-level hacking on the TOML format itself. All modules below these top-level modules are exposed to provide direct access to library implementation details.

  • Toml - Basic encoding and decoding TOML
  • Toml.Schema - TOML schemas for application types
  • Toml.Semantics - Low-level semantic operations on TOML syntax
  • Toml.Syntax - Low-level parsing of text into TOML raw syntax

Examples

This file uses markdown-unlit to ensure that its code typechecks and stays in sync with the rest of the package.

{-# Language OverloadedStrings #-}
import Data.Text (Text)
import GHC.Generics (Generic)
import QuoteStr (quoteStr)
import Test.Hspec (Spec, hspec, it, shouldBe)
import Toml
import Toml.Schema

main :: IO ()
main = hspec (parses >> decodes >> encodes >> warns >> errors)

Using the raw parser

Consider this sample TOML text from the TOML specification.

fruitStr :: Text
fruitStr = [quoteStr|
[[fruits]]
name = "apple"

[fruits.physical]  # subtable
color = "red"
shape = "round"

[[fruits.varieties]]  # nested array of tables
name = "red delicious"

[[fruits.varieties]]
name = "granny smith"


[[fruits]]
name = "banana"

[[fruits.varieties]]
name = "plantain"
|]

Parsing using this package generates the following unstructured value

parses :: Spec
parses = it "parses" $
    forgetTableAnns <$> parse fruitStr
    `shouldBe`
    Right (table [
        ("fruits", List [
            Table (table [
                ("name", Text "apple"),
                ("physical", Table (table [
                    ("color", Text "red"),
                    ("shape", Text "round")])),
                ("varieties", List [
                    Table (table [("name", Text "red delicious")]),
                    Table (table [("name", Text "granny smith")])])]),
            Table (table [
                ("name", Text "banana"),
                ("varieties", List [
                    Table (table [("name", Text "plantain")])])])])])

Defining a schema

We can define a schema for our TOML format in the form of instances of FromValue, ToValue, and ToTable in order to read TOML directly into structured data form. This example manually derives some of the instances as a demonstration.

newtype Fruits = Fruits { fruits :: [Fruit] }
    deriving (Eq, Show, Generic)
    deriving (ToTable, ToValue, FromValue) via GenericTomlTable Fruits

data Fruit = Fruit { name :: String, physical :: Maybe Physical, varieties :: [Variety] }
    deriving (Eq, Show, Generic)
    deriving (ToTable, ToValue, FromValue) via GenericTomlTable Fruit

data Physical = Physical { color :: String, shape :: String }
    deriving (Eq, Show, Generic)
    deriving (ToTable, ToValue, FromValue) via GenericTomlTable Physical

newtype Variety = Variety String
    deriving (Eq, Show)

instance FromValue Variety where
    fromValue = parseTableFromValue (Variety <$> reqKey "name")
instance ToValue Variety where
    toValue = defaultTableToValue
instance ToTable Variety where
    toTable (Variety x) = table ["name" .= x]

We can run this example on the original value to deserialize it into domain-specific datatypes.

decodes :: Spec
decodes = it "decodes" $
    decode fruitStr
    `shouldBe`
    Success [] (Fruits [
        Fruit
            "apple"
            (Just (Physical "red" "round"))
            [Variety "red delicious", Variety "granny smith"],
        Fruit "banana" Nothing [Variety "plantain"]])

encodes :: Spec
encodes = it "encodes" $
    show (encode (Fruits [Fruit
            "apple"
            (Just (Physical "red" "round"))
            [Variety "red delicious", Variety "granny smith"]]))
    `shouldBe` [quoteStr|
        [[fruits]]
        name = "apple"

        [fruits.physical]
        color = "red"
        shape = "round"

        [[fruits.varieties]]
        name = "red delicious"

        [[fruits.varieties]]
        name = "granny smith"|]

Useful errors and warnings

This package takes care to preserve source information as much as possible in order to provide useful feedback to users. These examples show a couple of the message that can be generated when things don’t go perfectly.

warns :: Spec
warns = it "warns" $
    decode [quoteStr|
        name = "simulated"
        typo = 10|]
    `shouldBe`
    Success
        ["2:1: unexpected key: typo in <top-level>"] -- warnings
        (Variety "simulated")

errors :: Spec
errors = it "errors" $
    decode [quoteStr|
        # Physical characteristics table
        color = "blue"
        shape = []|]
    `shouldBe`
    (Failure
        ["3:9: expected string but got array in shape"]
        :: Result String Physical)

More Examples

A demonstration of using this package at a more realistic scale can be found in HieDemoSpec. The various unit test files demonstrate what you can do with this library and what outputs you can expect.

See the low-level operations used to build a TOML syntax highlighter in TomlHighlighter.

Changes

Revision history for toml-parser

2.0.1.0

  • Added ToValue UTCTime and FromValue UTCTime. These correspond to offset data-times with the timezone translated to UTC.

2.0.0.0

  • Pervasive annotations on the values added to allow for detailed positional error reporting throughout parsing and validation.
  • Replace uses of String with Text in the Value type and throughout the API
  • Reorganized almost all of the modules to minimize imports that upstream packages will actually need.

1.3.3.0

  • Added IsString Value instance.
  • Addded helpers for runMatcher for ignoring and failing on warning runMatcherIgnoreWarn and runMatcherFatalWarn

1.3.2.0

  • Added Toml.Generic to make instances easily derivable via DerivingVia.
  • Added GHC.Generics support for switching between product types and TOML arrays.

1.3.1.3

  • Bugfix: Previous fix admitted some invalid inline tables - these are now rejected

1.3.1.2

  • Bugfix: In some cases overlapping keys in inline tables could throw an exception instead instead of returning the proper semantic error value.

1.3.1.1

  • Ensure years are rendered zero-padded

1.3.1.0

  • Added Toml.Semantics.Ordered for preserving input TOML orderings
  • Added support for pretty-printing multi-line strings

1.3.0.0 – 2023-07-16

  • Make more structured error messages available in the low-level modules. Consumers of the Toml module can keep getting simple error strings and users interested in structured errors can run the different layers independently to get more detailed error reporting.
  • FromValue and ToValue instances for: Ratio, NonEmpty, Seq
  • Add FromKey and ToKey for allowing codecs for Map to use various key types.

1.2.1.0 – 2023-07-12

  • Added Toml.Pretty.prettyTomlOrdered to allow user-specified section ordering.
  • Added FromValue and ToValue instances for Text
  • Added reqKeyOf and optKeyOf for easier custom matching without FromValue instances.

1.2.0.0 – 2023-07-09

  • Remove FromTable class. This class existed for things that could be matched specifically from tables, which is what the top-level values always are. However FromValue already handles this, and both classes can fail, so having the extra level of checking doesn’t avoid failure. It does, however, create a lot of noise generating instances. Note that ToTable continues to exist because toTable isn’t allowed to fail, and when serializing to TOML syntax you can only serialize top-level tables.

  • Extracted Toml.FromValue.Matcher and Toml.FromValue.ParseTable into their own modules.

  • Add pickKey, liftMatcher, inKey, inIndex, parseTableFromValue to Toml.FromValue

  • Replace genericFromTable with genericParseTable. The intended way to derive a FromValue instance is now to write:

    instance FromValue T where fromValue = parseTableFromValue genericParseTable
    

1.1.1.0 – 2023-07-03

  • Add support for GHC 8.10.7 and 9.0.2

1.1.0.0 – 2023-07-03

  • Add Toml.FromValue.Generic and Toml.ToValue.Generic
  • Add Alternative instance to Matcher and support multiple error messages in Result
  • Add Data and Generic instances for Value

1.0.1.0 – 2023-07-01

  • Add ToTable and ToValue instances for Map
  • Refine error messages
  • More test coverage

1.0.0.0 – 2023-06-29

  • Complete rewrite including 1.0.0 compliance and pretty-printing.

0.1.0.0 – 2017-05-04

  • First version.