How to parse a primitive type to a more constrained type? e.g. how to parse an 'IP' from a text field? #411

Open
Qqwy opened this issue Sep 27, 2022 · 5 comments

Comments

@Qqwy

Qqwy commented Sep 27, 2022

I am looking to convert a value such as a string or text field, which follows certain additional rules, into a more restricted datatype.
For instance, consider the 'server' field in this example: https://github.com/kowainik/tomland/blob/main/test/examples/example.toml#L12.

We might want to parse this into a valid IP address data structure (or fail parsing with a descriptive error). Conversely, we of course want to be able to pretty-print this IP address back as a string.

So we have a Text -> Maybe IPAddress (or Text -> Either Text IPAddress) for the parsing, and IPAddress -> Text for the pretty-printing.

My hunch is that I'm looking for something that is very close to the signature of dimatch:

Toml.dimatch  :: (b -> Maybe a) -> (a -> b) -> Toml.TomlCodec a -> Toml.TomlCodec b

but with the opposite variance.

Toml.diparse  :: (b -> a) -> (a -> Maybe b) -> Toml.TomlCodec b -> Toml.TomlCodec a

(Of course, using an Either to keep track of the error messages is nicer than using Maybe here).

But maybe there is also a much simpler way that I'm missing?
I'm still very new to bidirectional parsing.
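
For concreteness, the function pair described above could look roughly like the following. This is only a sketch: the IPAddress type and the names parseIP and showIP are hypothetical (they are not part of tomland), and only dotted-quad IPv4 is handled.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module IPAddressSketch where

import Data.Char (isDigit)
import Data.Text (Text)
import qualified Data.Text as T
import Data.Word (Word8)
import Text.Read (readMaybe)

-- Hypothetical IPv4 type, for illustration only.
data IPAddress = IPAddress Word8 Word8 Word8 Word8
    deriving (Eq, Show)

-- Parsing direction: may fail with a descriptive message.
parseIP :: Text -> Either Text IPAddress
parseIP t = case T.splitOn "." t of
    [a, b, c, d] -> IPAddress <$> octet a <*> octet b <*> octet c <*> octet d
    _            -> Left ("expected four dot-separated octets, got: " <> t)
  where
    -- An octet is a non-empty run of ASCII digits with a value of at most 255.
    octet :: Text -> Either Text Word8
    octet o
        | not (T.null o) && T.all isDigit o
        , Just n <- (readMaybe (T.unpack o) :: Maybe Integer)
        , n <= 255 = Right (fromIntegral n)
        | otherwise = Left ("invalid octet: " <> o)

-- Pretty-printing direction: never fails.
showIP :: IPAddress -> Text
showIP (IPAddress a b c d) =
    T.intercalate "." (map (T.pack . show) [a, b, c, d])
```

The property one would want from such a pair is that parseIP (showIP ip) gives back Right ip for every ip, so that pretty-printing followed by parsing is lossless.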

@Qqwy
Author

Qqwy commented Sep 28, 2022

It turns out there were a few functions I had missed when looking through the documentation earlier.
Conclusion: this is already possible right now, especially when parsing data from a Text field.

  • If the format you want to parse/pretty-print happens to be the same as that of the Read and Show instances, there is Toml.Codec.Combinator.Custom.read (re-exported as Toml.read).
  • If you have a custom parser/pretty-printer function pair, you can use Toml.Codec.Combinator.Custom.textBy. This function essentially has the signature that I was looking for in the question above (specialized to Text),
    i.e. (a -> Text) -> (Text -> Either Text a) -> Key -> TomlCodec a.
  • There are also validate and validateIf, but these are less useful in my opinion because they do not follow the "parse, don't validate" principle. In other words, you can only use them to further refine an a using a validation function whose result is another a.
  • In some cases, such as when trying to parse TOML keys (which by themselves are String/Text) to a custom datatype, you might need to drop down to the BiMap layer, most likely using the prism function if only one of the directions might fail. (If both might fail, use the BiMap constructor directly. If neither can fail, use iso.)
    To ensure that your resulting BiMap is a TomlBiMap that you can use with other functions such as Toml.tableMap, wrap the error message (which is probably a Text, depending on what other parsing functionality you use) with the Toml.ArbitraryError constructor.
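
As a minimal sketch of the textBy and BiMap bullets, the 'server' field from the question could be handled as below. IPAddress, parseIP, showIP, Config, _IP, ipCodec and roundTrip are made-up names for illustration. The tomland pieces used are Toml.textBy, Toml.match, Toml._Text, the BiMap constructor and Toml.ArbitraryError; I am assuming a recent tomland in which Toml.decode takes a codec and a Text and the BiMap fields are reachable as Toml.forward and Toml.backward.

```haskell
{-# LANGUAGE OverloadedStrings #-}
module ServerCodecSketch where

import Control.Category ((>>>))
import Data.Bifunctor (first)
import Data.Char (isDigit)
import Data.Text (Text)
import qualified Data.Text as T
import Data.Word (Word8)
import Text.Read (readMaybe)
import Toml (TomlCodec, (.=))
import qualified Toml

-- Hypothetical IPv4 type and conversion pair (not part of tomland).
data IPAddress = IPAddress Word8 Word8 Word8 Word8
    deriving (Eq, Show)

parseIP :: Text -> Either Text IPAddress
parseIP t = case T.splitOn "." t of
    [a, b, c, d] -> IPAddress <$> octet a <*> octet b <*> octet c <*> octet d
    _            -> Left ("expected four dot-separated octets, got: " <> t)
  where
    octet o
        | not (T.null o) && T.all isDigit o
        , Just n <- (readMaybe (T.unpack o) :: Maybe Integer)
        , n <= 255 = Right (fromIntegral n :: Word8)
        | otherwise = Left ("invalid octet: " <> o)

showIP :: IPAddress -> Text
showIP (IPAddress a b c d) =
    T.intercalate "." (map (T.pack . show) [a, b, c, d])

-- Hypothetical config record with a single 'server' key.
newtype Config = Config { server :: IPAddress }
    deriving (Eq, Show)

-- Option 1: textBy takes the printer, the parser and the key.
configCodec :: TomlCodec Config
configCodec = Config <$> Toml.textBy showIP parseIP "server" .= server

-- Option 2: drop down to the BiMap layer. Only 'backward' can fail,
-- and its Text error is wrapped in ArbitraryError to get a TomlBiMap.
_IP :: Toml.TomlBiMap IPAddress Text
_IP = Toml.BiMap
    { Toml.forward  = Right . showIP
    , Toml.backward = first Toml.ArbitraryError . parseIP
    }

ipCodec :: Toml.Key -> TomlCodec IPAddress
ipCodec = Toml.match (_IP >>> Toml._Text)

-- Decode a document and render the server address back to text.
roundTrip :: Text -> Either Text Text
roundTrip input = case Toml.decode configCodec input of
    Left _    -> Left "decode failed"
    Right cfg -> Right (showIP (server cfg))
```

In this sketch, configCodec and ipCodec "server" are meant to behave the same for a single field; the BiMap version is mainly worth it when the same conversion has to be reused in places that expect a TomlBiMap.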

@Qqwy
Author

Qqwy commented Sep 28, 2022

I'm leaving this issue open, because I think that one (or multiple) of these bullet points might be useful to add to the README in the section about translating between TOML and your desired Haskell.

What do you think?
If you give the green light, I'd love to contribute a PR for this.

@CGenie

CGenie commented Feb 13, 2023

It would be nice to add this to the docs. I had to dig up this issue to see how to write a custom parser for a custom data type.

@meghfossa

Adding an example here for future readers, as I ran into this today:

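{-# LANGUAGE ImportQualifiedPost #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.Text (Text)
import Data.Text qualified as T
import Data.Void (Void)
import Text.Megaparsec (Parsec, errorBundlePretty, runParser)
import Toml (TomlCodec, (.=))
import Toml qualified

-- Expected TOML:
--   name = "somename"
--   special = ["<MyEntity>", "<MyEntity>", ...]

-- Megaparsec parser over Text with no custom error component.
type Parser = Parsec Void Text

data Example = Example
    { name :: Text
    , special :: Maybe [MyEntity]
    }
    deriving (Eq, Ord, Show)

exampleCodec :: TomlCodec Example
exampleCodec = Example
    <$> Toml.diwrap (Toml.text "name") .= name
    <*> Toml.dioptional (Toml.arrayOf myCodec "special") .= special

-- MyEntity is printed via its Show instance and parsed with Megaparsec.
myCodec :: Toml.TomlBiMap MyEntity Toml.AnyValue
myCodec = Toml._TextBy (T.pack . show) parseMyEntity

-- Run the Megaparsec parser and turn its error bundle into a Text message.
parseMyEntity :: Text -> Either Text MyEntity
parseMyEntity candidate = case runParser myEntityParser "" candidate of
    Left peb -> Left $ T.pack $ errorBundlePretty peb
    Right rr -> Right rr

myEntityParser :: Parser MyEntity
myEntityParser = .....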

@domenkozar

Here's a codec for uri-bytestring:

uri :: Toml.Key -> TomlCodec (URI.URIRef URI.Absolute)
uri = Toml.match (_URI2ByteString >>> Toml._ByteString)

_URI2ByteString :: Toml.TomlBiMap (URI.URIRef URI.Absolute) ByteString
_URI2ByteString = Toml.BiMap
    { forward  = Right . URI.serializeURIRef'
    , backward = left (Toml.ArbitraryError . show) . URI.parseURI URI.strictURIParserOptions
    }

Maybe we should start tomland-extras?
