haskell - Using pipes-csv to parse Latin-1 encoded content?

I'd like to use pipes-csv to parse large CSV files. It turns out these CSV files are Latin-1 encoded, and it turns out that pipes-csv, and the cassava library it depends on, assume UTF-8. This ends up producing parsing errors that I need to handle.

The way I've approached this is to duplicate the records that hold the CSV data, with the Text fields replaced by ByteString fields in the dup. I decode into the dup, manually translate the Latin-1 strings to UTF-8, and then create the final record. It's inelegant, to say the least.
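For illustration, here is a minimal sketch of that duplicate-record approach. The `RawPerson` and `Person` types and their fields are hypothetical stand-ins for the real records, and the cassava instances that would fill `RawPerson` are left out so the example needs only `bytestring` and `text`. It assumes `Data.Text.Encoding.decodeLatin1` is an acceptable way to interpret the raw bytes.

```haskell
import qualified Data.ByteString as B
import Data.ByteString (ByteString)
import Data.Text (Text)
import qualified Data.Text as T
import Data.Text.Encoding (decodeLatin1)

-- Hypothetical "dup" record: fields stay as raw bytes, so no UTF-8
-- decoding is attempted while the CSV row is parsed.
data RawPerson = RawPerson
  { rawName :: ByteString
  , rawCity :: ByteString
  }

-- Hypothetical final record with proper Text fields.
data Person = Person
  { name :: Text
  , city :: Text
  } deriving (Show, Eq)

-- Interpret each raw field as Latin-1 and build the final record.
fromRaw :: RawPerson -> Person
fromRaw r = Person
  { name = decodeLatin1 (rawName r)
  , city = decodeLatin1 (rawCity r)
  }

main :: IO ()
main = do
  -- 0xE9 is "é" in Latin-1; as a lone byte it is not valid UTF-8.
  let raw = RawPerson (B.pack [0x52, 0x65, 0x6E, 0xE9])
                      (B.pack [0x4E, 0x69, 0x63, 0x65])
  print (fromRaw raw == Person (T.pack "Ren\233") (T.pack "Nice"))
```

The cost is visible even in this small case: every record type needs a raw twin and a conversion function that must be kept in sync with it.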

Is there a better way?

Per Daniel's suggestion, here is what I have so far:

```haskell
import qualified Pipes.Text.Encoding as PTE
import qualified Pipes.ByteString as PB

withFile "file.csv" ReadMode $ \h -> do
  -- Decode the Latin-1 bytes to Text, then re-encode each Text chunk as UTF-8.
  let transcode = PTE.decodeIso8859_1 . PB.fromHandle ~> PTE.encodeUtf8
  runEffect $ decodeByName (void . transcode $ h) >-> process
```

It trades the unnecessary records for an unnecessary re-encoding of the text, but it's an improvement. I don't suppose there is a way to do this without doing either of these unnecessary things?
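One direction that might avoid both costs, offered only as a sketch: cassava passes the raw field bytes to `parseField`, so a wrapper type with its own `FromField` instance could decode Latin-1 directly. The `Latin1` newtype and the `Person` record below are hypothetical, and the example uses cassava's own `decodeByName` on an in-memory string rather than the pipes-csv one. Since pipes-csv decodes through the same cassava classes, the instances should carry over, but that is an assumption and has not been tested here.

```haskell
{-# LANGUAGE OverloadedStrings #-}
import Data.Csv (FromField (..), FromNamedRecord (..), decodeByName, (.:))
import Data.Text (Text)
import Data.Text.Encoding (decodeLatin1)

-- Hypothetical wrapper: a Text value whose CSV field is Latin-1 encoded.
newtype Latin1 = Latin1 { getLatin1 :: Text }

-- parseField receives the field as a raw ByteString, so the Latin-1
-- decode happens here with no intermediate UTF-8 step.
instance FromField Latin1 where
  parseField = pure . Latin1 . decodeLatin1

-- Hypothetical record for illustration.
data Person = Person { name :: Text, city :: Text }

instance FromNamedRecord Person where
  parseNamedRecord r =
    Person <$> (getLatin1 <$> r .: "name")
           <*> (getLatin1 <$> r .: "city")

main :: IO ()
main =
  -- The \233 in the literal becomes the single byte 0xE9 (Latin-1 "é").
  case decodeByName "name,city\r\nRen\233,Nice\r\n" of
    Left err          -> putStrLn err
    Right (_, people) -> mapM_ (print . name) people
```

Header names are still matched as raw bytes, so a column name containing non-ASCII characters would have to be given as its Latin-1 byte sequence.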
