-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bad dropWhile performance #108
Comments
i suspect that its either 1 or 2, but if not, 3 might be useful. could you also tell us what OS / GHC version / Vector version you're running this with? |
also it'd be worth seeing if theres a performance difference if you used Data.Vector.Unboxed instead |
I tried it with unboxed vectors, and it speeds up all functions by a factor of 2, though the ratio remains more or less the same: module Main where
import Criterion.Main
import qualified Data.Vector.Unboxed as V
bNoDrop :: V.Vector Double -> Double
bNoDrop xs
| V.null xs = 0
| otherwise = V.last xs / V.head xs
bDropWhile' :: V.Vector Double -> Double
bDropWhile' xs
| V.null xs = 0
| otherwise = V.last xss / V.head xss
where xss = dropWhile' (< 10) xs
dropWhile' :: V.Unbox a => (a -> Bool) -> V.Vector a -> V.Vector a
dropWhile' p v = V.drop (go 0) v where
go n | n == V.length v = n
| p (V.unsafeIndex v n) = go (n + 1)
| otherwise = n
bDropWhile :: V.Vector Double -> Double
bDropWhile xs
| V.null xs = 0
| otherwise = V.last xss / V.head xss
where xss = V.dropWhile (< 10) xs
setupEnv :: IO (V.Vector Double)
setupEnv = return $ V.enumFromN 0 10000000
{-# NOINLINE setupEnv #-}
main :: IO ()
main = defaultMain [
env setupEnv $ \v ->
bgroup "funcs" [
bench "noDrop" $ nf bNoDrop v
, bench "V.dropWhile" $ nf bDropWhile v
, bench "dropWhile'" $ nf bDropWhile' v
]
]
|
ok, i think it has to do with sharing, i'll test it after i get back from lunch, stay tuned! |
my hypothesis i shall test this afternoon, based on reading the core , is if you put the |
I'm still trying to do some sleuthing, but so we can unpack whats going on, lets unwrap some definitions! the generic vector dropwhile is define like so:
then the bundle rep is defined as follows
then the stream version is defined as
|
Without digging deeply into this, I think the liftM in step' looks a bit
|
Returning the result of an if-then-else is also surprisingly (to me) lazy.
|
did this ticket have any actionable conclusions or was it ultimately just a good "tradeoffs are complicated" dialogue? :) |
i'm closing for now, please chime in if you think theres a concrete action we can take that improves perf for all |
I think removing the mysterious laziness is likely good for compiler On Jul 24, 2016 2:55 PM, "Carter Tazio Schonwald" [email protected]
|
Perhaps I am misunderstanding the implementation, but it seems that The fix would then seem to be to use the |
The `unstream` version results in a copy of the remainder of the vector which is rarely what you want to do. This commit uses `drop` directly after `findIndex` which results in a big speed gain if the remaining part of the vector is large. Cf. haskell#108 and haskell#141
Now dropWhile's implementation uses stream only in case the stream fusion is already running. This is expected to reduce unnecessary re-allocation. cf. haskell#141 haskell#108
* make dropWhile fusion adaptive Now dropWhile's implementation uses stream only in case the stream fusion is already running. This is expected to reduce unnecessary re-allocation. cf. #141 #108 * Adaptive dropWhile Implementation: Implementation comment modified Modified the comment for the RULE "dropWhile/unstream [Vector]" in order to make the intention of the implementation there clear.
* make dropWhile fusion adaptive Now dropWhile's implementation uses stream only in case the stream fusion is already running. This is expected to reduce unnecessary re-allocation. cf. #141 #108 * Adaptive dropWhile Implementation: Implementation comment modified Modified the comment for the RULE "dropWhile/unstream [Vector]" in order to make the intention of the implementation there clear.
I am filing this as a follow-up to http://stackoverflow.com/questions/35419908/efficient-alternative-to-data-vector-dropwhile
Compiling this with GHC 7.10.3 and
--make -O2
results in the following:The text was updated successfully, but these errors were encountered: