with-utf8
Get your IO right on the first try
https://github.com/serokell/haskell-with-utf8#readme
LTS Haskell 23.1: 1.1.0.0@rev:1
Stackage Nightly 2024-12-27: 1.1.0.0@rev:1
Latest on Hackage: 1.1.0.0@rev:1
with-utf8-1.1.0.0@sha256:fa2572f401717243e4f0daa08c0d46270d8ff41f7528e3e91847e14324034ec8,3092
Module documentation for 1.1.0.0
- Data
- Data.Text
- Data.Text.IO
- Data.Text.Lazy
- Data.Text.Lazy.IO
- Main
- System
- System.IO
Reading files in Haskell is trickier than it could be due to the non-obvious interactions between file encodings and system locale. This library is meant to make it easy once and for all by providing “defaults” that make more sense in the modern world.
See this blog post for more details on why this library needs to exist and an explanation of some of the opinionated decisions it is based on.
Use
See the documentation on Hackage for details; this is a quick summary.
Step 1: Get it
The library is on Hackage; go ahead and add it to the dependencies of your project.
Step 2: Wrap your main
Import withUtf8 from Main.Utf8 and wrap it around your main:
    import Main.Utf8 (withUtf8)

    main :: IO ()
    main = withUtf8 $
      {- ... your main function ... -}
This will make sure that if your program reads something from stdin or outputs something to stdout/stderr, it will not fail with a runtime error due to encoding issues.
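As a fuller illustration, here is a minimal sketch of a complete program wrapped in withUtf8. It assumes the with-utf8 and text packages are both listed as dependencies; the helper name shout is made up for this example and is not part of the library.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as T
import Main.Utf8 (withUtf8)

-- | Hypothetical helper for this example: upper-case the input.
shout :: T.Text -> T.Text
shout = T.toUpper

main :: IO ()
main = withUtf8 $ do
  -- Both the read from stdin and the write to stdout happen inside
  -- the wrapper, so non-ASCII input should not cause an encoding error.
  input <- T.getContents
  T.putStr (shout input)
```

Keeping the whole body of main inside the wrapper means every later use of the standard handles is covered, rather than only the first one.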
Step 3: Read files using UTF-8
If you are going to read a text file (to be precise, if you are going to open a file in text mode), you’ll probably use withFile, openFile, or readFile. Grab the first two from System.IO.Utf8 or the last one from Data.Text.IO.Utf8.
Starting from text-2.1, Data.Text.IO.Utf8 is available in the text package itself, hence this module in with-utf8 is now deprecated.
Note: it is best to import these modules qualified.
Note: there is no System.IO.Utf8.readFile because it’s 2024 and you should not read Strings from files.
All these functions will make sure that the content will be treated as if it was encoded in UTF-8.
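A minimal sketch of reading a file this way, assuming the with-utf8 and text packages are available. The function name countLines and the path notes.txt are placeholders chosen for this example.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as T
import qualified System.IO as IO
import qualified System.IO.Utf8 as Utf8

-- | Count the lines of a text file, decoding it as UTF-8
-- regardless of the system locale.
countLines :: FilePath -> IO Int
countLines path =
  Utf8.withFile path IO.ReadMode $ \h -> do
    -- T.hGetContents is strict, so the whole file is read
    -- before withFile closes the handle.
    contents <- T.hGetContents h
    pure (length (T.lines contents))

main :: IO ()
main = countLines "notes.txt" >>= print
```

The handle is passed to the ordinary functions from Data.Text.IO; only the opening of the file goes through System.IO.Utf8.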
If, for some reason, you really need to use withFile/openFile from base, or you got your file handle from somewhere else, wrap the code that works with it in a call to withHandle from System.IO.Utf8:
    import qualified System.IO as IO
    import qualified System.IO.Utf8 as Utf8

    doSomethingWithAFile :: IO.Handle -> IO ()
    doSomethingWithAFile h = Utf8.withHandle h $ do
      {- ... work with the file ... -}
Step 4: Write files using UTF-8
When writing a file, either open it using withFile/openFile from System.IO.Utf8 or write to it directly with writeFile from Data.Text.IO.Utf8.
Starting from text-2.1, Data.Text.IO.Utf8 is available in the text package itself, hence this module in with-utf8 is now deprecated.
Note: it is best to import these modules qualified.
Note: there is no System.IO.Utf8.writeFile.
If, for some reason, you really need to use withFile/openFile from base, do the same as in the previous step.
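The writing side can be sketched the same way. The example below assumes the with-utf8 and text packages are available; roundTrip and greeting.txt are names invented for this illustration. It writes text through a handle opened with System.IO.Utf8.withFile and reads it back.

```haskell
import qualified Data.Text as T
import qualified Data.Text.IO as T
import qualified System.IO as IO
import qualified System.IO.Utf8 as Utf8

-- | Write some text to a file as UTF-8, then read it back.
roundTrip :: FilePath -> T.Text -> IO T.Text
roundTrip path txt = do
  -- The write handle is closed when the first withFile returns.
  Utf8.withFile path IO.WriteMode $ \h ->
    T.hPutStr h txt
  -- Strict read, so the contents are fully loaded before the handle closes.
  Utf8.withFile path IO.ReadMode T.hGetContents

main :: IO ()
main = do
  let original = T.pack "naïve café λ"
  result <- roundTrip "greeting.txt" original
  -- Report whether the text survived the round trip unchanged.
  print (result == original)
```

Because both handles are opened through System.IO.Utf8, the result should not depend on the locale the program happens to run under.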
Troubleshooting
Locales are pretty straightforward, but some people might have their terminals misconfigured for various reasons. To help troubleshoot any potential issues, this package comes with a tool called utf8-troubleshoot.
This tool outputs some basic information about locale settings in the OS and what they end up being mapped to in Haskell. If you are looking for help, please provide the output of this tool; if you are helping someone, ask them to run it and share the output.
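For a quick manual check without the tool, the encodings that Haskell has picked up can also be inspected with functions from base alone. The sketch below is not what utf8-troubleshoot does or prints; describeEncoding is a helper made up for this example.

```haskell
import GHC.IO.Encoding (getLocaleEncoding)
import qualified System.IO as IO

-- | Render the encoding of a handle; Nothing means binary mode.
describeEncoding :: Maybe IO.TextEncoding -> String
describeEncoding Nothing = "none (binary mode)"
describeEncoding (Just enc) = show enc

main :: IO ()
main = do
  -- The encoding GHC derived from the system locale.
  locale <- getLocaleEncoding
  putStrLn ("locale encoding: " ++ show locale)
  -- The encoding currently set on stdout.
  out <- IO.hGetEncoding IO.stdout
  putStrLn ("stdout encoding: " ++ describeEncoding out)
```

If the locale encoding shown here is not a UTF-8 one, that is a hint that the terminal or environment is configured in the way this library is meant to guard against.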
Contributing
If you encounter any issues when using this library or have ideas for improvement, please report them by opening an issue on GitHub. You are also very welcome to submit a pull request if you feel like doing so.
License
Changes
Changelog
1.1.0.0
Changed
- Allow newer versions of base and text to support GHC up to 9.8.
- Deprecate Data.Text.IO.Utf8.
1.0.2.4
Changed
- Allow base 4.17, 4.18 (GHC 9.4, 9.6).
- Allow text<2.1
1.0.2.3
Support GHC 9.2.1.
Changed
- Allow base 4.16 (GHC 9.2.1).
1.0.2.2
Windows support.
Changed
- Fix utf8-troubleshoot on Windows.
1.0.2.1
A technical clean-up release.
Changed
- Specify missing version bounds for dependencies.
1.0.2.0
Improve utf8-troubleshoot to make it useful for identifying tricky cases.
Changed
- utf8-troubleshoot: improve available locale detection
- utf8-troubleshoot: display raw results from C libraries
1.0.1.0
GHC 8.10 compatibility and a new troubleshooting tool.
Added
- utf8-troubleshoot: the troubleshooting tool
Changed
- Bump base for GHC 8.10
1.0.0.0
Initial release.
Added
- withUtf8
- withStdTerminalHandles
- setHandleEncoding
- withHandle
- setTerminalHandleEncoding
- withTerminalHandle
- openFile
- withFile
- readFile
- writeFile