Performance: the `friday` package is very slow
I’m writing a Haskell program that draws big maps from Knytt Stories world files. I use the `friday` package to make the image files, and I need to compose the many graphics layers that I put together from spritesheets. Right now, I use my own ugly function for this:

    import qualified Vision.Primitive as Im
    import qualified Vision.Image.Type as Im
    import qualified Vision.Image.Class as Im
    import Vision.Image.RGBA.Type (RGBA, RGBAPixel(..))

    -- Map a Word8 in [0, 255] to a Double in [0, 1].
    w2f :: Word8 -> Double
    w2f = (/255) . fromIntegral . fromEnum

    -- Map a Double in [0, 1] to a Word8 in [0, 255].
    f2w :: Double -> Word8
    f2w = toEnum . round . (*255)

    -- Compose two images into one. `bottom` is wrapped to `top`'s size.
    compose :: RGBA -> RGBA -> RGBA
    compose bottom top =
        let newSize = Im.manifestSize top
            bottom' = wrap newSize bottom
        in Im.fromFunction newSize $ \p ->
            let RGBAPixel rB gB bB aB = bottom' Im.! p
                RGBAPixel rT gT bT aT = top Im.! p
                aB' = w2f aB; aT' = w2f aT
                ovl :: Double -> Double -> Double
                ovl cB cT = (cT * aT' + cB * aB' * (1.0 - aT')) / (aT' + aB' * (1.0 - aT'))
                (~*~) :: Word8 -> Word8 -> Word8
                cB ~*~ cT = f2w $ w2f cB `ovl` w2f cT
                aO = f2w (aT' + aB' * (1.0 - aT'))
            in RGBAPixel (rB ~*~ rT) (gB ~*~ gT) (bB ~*~ bT) aO

It alpha-composites a bottom layer and a top layer. If the “bottom” layer is a texture, it is looped horizontally and vertically (by `wrap`) to fit the top layer’s size.
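For reference, the per-pixel blend described here is the standard “over” operator. The following is a minimal sketch of that operator on a single pixel, using only `base`. The names `Pixel`, `toUnit`, `fromUnit` and `over` are made up for illustration and are not part of `friday`. The guard for two fully transparent pixels is an addition that the function above does not have.

```haskell
import Data.Word (Word8)

-- One RGBA pixel as plain channels; a hypothetical stand-in for friday's RGBAPixel.
data Pixel = Pixel Word8 Word8 Word8 Word8
    deriving (Eq, Show)

-- Map a Word8 in [0, 255] to a Double in [0, 1].
toUnit :: Word8 -> Double
toUnit w = fromIntegral w / 255

-- Map a Double in [0, 1] back to a Word8 in [0, 255], clamping stray values.
fromUnit :: Double -> Word8
fromUnit d = fromIntegral (max 0 (min 255 (round (d * 255) :: Int)))

-- "Over" operator: composite `top` onto `bottom` (arguments in that order).
over :: Pixel -> Pixel -> Pixel
over (Pixel rB gB bB aB) (Pixel rT gT bT aT)
    | aOut == 0 = Pixel 0 0 0 0   -- both pixels fully transparent: avoid 0/0
    | otherwise = Pixel (blend rB rT) (blend gB gT) (blend bB bT) (fromUnit aOut)
  where
    aT'  = toUnit aT
    aB'  = toUnit aB * (1 - aT')  -- what is left of the bottom under the top
    aOut = aT' + aB'
    blend cB cT = fromUnit ((toUnit cT * aT' + toUnit cB * aB') / aOut)
```

An opaque top pixel hides the bottom one entirely, and a fully transparent top pixel leaves the bottom one unchanged, which makes for a quick sanity check of any variant of this function.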
Rendering a map takes far, far longer than it should. Rendering the map for the default world that comes with the game takes 27 minutes at `-O3`, even though the game itself can render each separate screen in less than a couple of milliseconds. (The smaller example output linked above takes 67 seconds, which is also far too long.)
The profiler (output here) says that the program spends 77% of its time in `compose`.
Cutting this down seems like a good first step. It seems like a very simple operation, but I can’t find a native function in `friday` that lets me do this. Supposedly GHC should be good at collapsing all of the `fromFunction` stuff, but I don’t know what’s going on. Or is the package just super slow?
As stated in my comment, the MCE I made performs fine and does not yield any interesting output:

    module Main where

    import qualified Vision.Primitive as Im
    import Vision.Primitive.Shape
    import qualified Vision.Image.Type as Im
    import qualified Vision.Image.Class as Im
    import Vision.Image.RGBA.Type (RGBA, RGBAPixel(..))
    import Vision.Image.Storage.DevIL (load, save, Autodetect(..), StorageError, StorageImage(..))
    import Vision.Image (convert)
    import Data.Word
    import System.Environment (getArgs)

    main :: IO ()
    main = do
        [input1,input2,output] <- getArgs
        io1 <- load Autodetect input1 :: IO (Either StorageError StorageImage)
        io2 <- load Autodetect input2 :: IO (Either StorageError StorageImage)
        case (io1,io2) of
            (Left err,_) -> error $ show err
            (_,Left err) -> error $ show err
            (Right i1, Right i2) -> go (convert i1) (convert i2) output
      where
        go i1 i2 output = do
            res <- save Autodetect output (compose i1 i2)
            case res of
                Nothing -> putStrLn "done compose"
                Just e  -> error (show (e :: StorageError))

    -- Wrap an image to a given size.
    wrap :: Im.Size -> RGBA -> RGBA
    wrap s im =
        let Z :. h :. w = Im.manifestSize im
        in Im.fromFunction s $ \(Z :. y :. x) -> im Im.! Im.ix2 (y `mod` h) (x `mod` w)

    -- Map a Word8 in [0, 255] to a Double in [0, 1].
    w2f :: Word8 -> Double
    w2f = (/255) . fromIntegral . fromEnum

    -- Map a Double in [0, 1] to a Word8 in [0, 255].
    f2w :: Double -> Word8
    f2w = toEnum . round . (*255)

    -- Compose two images into one. `bottom` is wrapped to `top`'s size.
    compose :: RGBA -> RGBA -> RGBA
    compose bottom top =
        let newSize = Im.manifestSize top
            bottom' = wrap newSize bottom
        in Im.fromFunction newSize $ \p ->
            let RGBAPixel rB gB bB aB = bottom' Im.! p
                RGBAPixel rT gT bT aT = top Im.! p
                aB' = w2f aB; aT' = w2f aT
                ovl :: Double -> Double -> Double
                ovl cB cT = (cT * aT' + cB * aB' * (1.0 - aT')) / (aT' + aB' * (1.0 - aT'))
                (~*~) :: Word8 -> Word8 -> Word8
                cB ~*~ cT = f2w $ w2f cB `ovl` w2f cT
                aO = f2w (aT' + aB' * (1.0 - aT'))
            in RGBAPixel (rB ~*~ rT) (gB ~*~ gT) (bB ~*~ bT) aO

This code loads two images, applies your `compose` operation, and saves the resulting image. This happens almost instantly:

    % ghc -O2 so.hs && time ./so /tmp/lambda.jpg /tmp/lambda2.jpg /tmp/output.jpg && o /tmp/output.jpg
    done compose
    ./so /tmp/lambda.jpg /tmp/lambda2.jpg /tmp/output.jpg  0.05s user 0.00s system 98% cpu 0.050 total

If you have an alternate MCE then please post it. Your complete code was too non-minimal for my eyes.