-
Notifications
You must be signed in to change notification settings - Fork 793
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] Manofstick seq composer squashed #2526
[WIP] Manofstick seq composer squashed #2526
Conversation
- Removed all dynamic casting - Split enumerable/enumerator - Implementations of map, filter, skip, pairwise Fixed output due to change in Seq.map delayed component creation to respect mutability - Added take, mapi - Added same exceptions for skip & take - Added composable skip & take - Added array source - Added source specific folding for functions like sum (required growing of surface area, so currently not exposed) Temporarily remove the ISeqEnumerable interface It was causing build issues as it was currently unused. Fixing halting on take Return current as match of match statement for perf *slight* performance improvement
Thanks to @liboz for reminding me of the performance hit here. This is noticeable when you have small collections to be iterated.
skipwhile/takewhile
A more direct calling process. Slows things down *slightly* when only a single item is being processed, but is a better model to build from going forward. Remove unused member Simplified ProcessNext call by creating Result object Due to the bottom-up build process I now have a consistent output signature, which allowed it to be wrapped in an single object rather than being passed up and down the chain of ProcessNext calls. default -> override OptimizedClosure for mapi Consolidated code in base class ensuring performance Retained MoveNext in derived class to ensure we didn't add an extra virtual call into the call stack. Added ComposableEnumerableFactoryHelper Sweeping up common functionality init(Infinite)? implementations Not as simple as it should be due to the original implementation deciding to evaluate Current in a lazy fashion. Comments have been splattered around hopefully describing the situation in enough detail. Split Result object in multi-leveled Signal The plan is to then implement fold like functionality in a Tail like object that we can expose out in a public interface, so I'm trying to minimize what would be needed to be visible externally. Rearranged Enumerator/Enumerable pairs together Fix bug in skipping an init seq Restoring to last successful build server build I probably don't have any more time today to bug issues init/initInfinite Try again, without any other clutter Bug fix; off by 1... Moved Enumerable/Enumerator pairs into modules map2 Fix OnComplete/OnDispose Hmmm... not 100% happy with this because it requires all links in the chain to ensure that that follow the protocol, but it isn't too bad I guess... Seq.append This implemention performs vastly better than the previous implementation, which appeared to be more interested in being theoretically important than actually being a reasonable implementation. Anyway, the previous version blew up with stack overflow if you appended too many things, which the new version doesn't. minor perf; assume "InProcess" Bug fix; ensure exception protocol is followed Seq.fold
bug fix: "truncate 0" was causing MoveNext on underlying seq
except indexed
This implemention performs vastly better than the previous implementation, which appeared to be more interested in being theoretically important than actually being a reasonable implementation. Anyway, the previous version blew up with stack overflow if you appended too many things, which the new version doesn't.
Removed old choose function localizing upto This is retained for compatibility cleaning up SeqComposer.Helpers - better comments - consistent casing Seq.map3 Seq.mapi2 Simplified map2 - removing the check of both types Seq.unfold Added an IdentityFactory Identity can be used to wrap basic containers into SeqComposer compatible types, but can safely be removed when composing the components.
Also removed some extra null checks fixed to removing the right null check seq.tail and a fix to takewhile to use avoidtailcall
Seq.rev Added brackets to disambiguate Lazy Seq.permute Seq.sort(By|With|ByDescending|Descending)? Factory helper create methods for less clutter Replaced Lazy<'T> with (unit->'T) The use of lazy changed the seq's funcitonality, as it would have only been calculated once, even if the sequence was iterated again. Renamed Tail to SetResult to disambiguate Added Iter Seq.iter & Seq.average
Updated NoNeedToTailcall for Seq.iter changes Added OnComplete calls to Iter and Folds Experimental ForEach Currently this is only implemented on Array. This adds some public surface to the SeqComposer which may be removed. Fixed signature file, so can now use object expression Provided all ForEach implementations Removed Fold Remove Iter
PE verify says no. Appease the NoNeedToTailcall file
singleton identityfactory using tocomposer implementing previously implementing seq functions using toComposer. Also fixes a bug with the boolean checking first fix bug with bool
Just a straight cast when EnumerableBase Simplified Identity Avoid creating extra ref objects Using average as an example of using a tuple-like, but mutable, value type to tie the data closer together and avoid allocation. Ensuring that OnDispose is called from ForEach
…duce, last, trylast cleanup for Seq.iter, fold Also moves the functions in the seq.fsi file to be alphabetical passing false on process next when ending. fix bug where unfold did not respect the Halted state on Result
Fixed tryPick Processed 1 past the end of the data, as demo'd here: seq { for i = 1 to 5 do yield i failwith "boom" } |> Seq.pick (fun x -> if x = 5 then Some true else None) |> fun result -> assert result Fixed tryFind, similar to tryPick Error seen with seq { for i = 1 to 5 do yield i failwith "boom" } |> Seq.find (fun x -> x = 5) |> fun result -> assert (result=5) cleaned up math functions Just fixing some more one past the end more consistency more consistency foreach now takes seq and calls toComposer Made ignoring return value a bit more explicit processNextInForeach was a bit verbose Consolidated ForEach functionality Seq.concat
fix bug with Concat when there are side effects
The following caused an exception, when it shouldn't: [1;2;3] |> Seq.take 100 |> Seq.takeWhile (fun _ -> false) |> Seq.iter (fun _ -> ()) I'm not 100% happy with how I'm allocating ids, nor really with the added ceremony of the solution, but I think the idea of how to resolve is basically the right one.
- Shrank public interface by removing ISeqPipeline from ForEach. - Renamed haltingIdx to pipelineDepth - Removed haltingIdx from where I could Remove mutable state in SeqComponentFactory - Changed "depth" to "idx" to better communicate the function Simplified use of OnComplete & OnDispose Pushed the chaining functionality into the interface and added extra methods on SeqConsumer. This means the consumer can ignore the interface and just implement their version, which means less likely to be an issue of the message not being chained correctly. It also has the advantage that in the object expressions we don't have to cast back to the base type, which was a potentital area for errors. Starting to finalise namespacing and comments Still playing around, happy for some input here...
Bit faster on 64 bit, as don't need to access the ref of e2ok oops, type-o ! Minimalist exposure of factory infrastructire - made SeqEnumerable<> into an interface (ISeq<>) - made SeqComponentFactory<> into an interface (ISeqFactory<>) Renaming recommentations base on @rmunn feedback Commented strange Unchecked.default usage Partial move to Composer module In the Composer module we use ISeq rather than seq. An end goal could be be publicly expose this module for enhanced performancy.
Names are unique and descriptive enough Update NoNeedToTailcall.output.test.bsl for Seq mod Adding types to try to appease test Adding the types doesn't work. Only appearing in portable build, so pondering if it is a compiler bug? Will need to get onto someone about it I think.
- removed ref vars, as can just us let mutable - renamed variables to more meaningful names - removed modulus because I can
- Shrank public interface by removing ISeqPipeline from ForEach. - Renamed haltingIdx to pipelineDepth - Removed haltingIdx from where I could Remove mutable state in SeqComponentFactory - Changed "depth" to "idx" to better communicate the function Simplified use of OnComplete & OnDispose Pushed the chaining functionality into the interface and added extra methods on SeqConsumer. This means the consumer can ignore the interface and just implement their version, which means less likely to be an issue of the message not being chained correctly. It also has the advantage that in the object expressions we don't have to cast back to the base type, which was a potentital area for errors. Starting to finalise namespacing and comments Still playing around, happy for some input here...
Bit faster on 64 bit, as don't need to access the ref of e2ok oops, type-o ! Minimalist exposure of factory infrastructire - made SeqEnumerable<> into an interface (ISeq<>) - made SeqComponentFactory<> into an interface (ISeqFactory<>) Renaming recommentations base on @rmunn feedback Commented strange Unchecked.default usage Partial move to Composer module In the Composer module we use ISeq rather than seq. An end goal could be be publicly expose this module for enhanced performancy.
Names are unique and descriptive enough Adding types to try to appease test Adding the types doesn't work. Only appearing in portable build, so pondering if it is a compiler bug? Will need to get onto someone about it I think.
- removed ref vars, as can just us let mutable - renamed variables to more meaningful names - removed modulus because I can
Probably shouldn't be exposed in that manor in the first place, but secondly they caused a error in ci_part1. Used this as a chance to rename the module as well. Modified item/tryItem to use skip Unit tests check that the item call object the lazy nature of the Seq.init Starting the exposure of the inlinable Composer - Still hidden via internal module - simplified PipeIdx, no need for optional now - Made ISeqFactory an abstract class instead of interface so as not to require a stub implementation of PipeIdx in every object expression (or alternatively if the abstract class was used with the interface, then explicit declaration of the interface as well) - filter and map changed to inline versions Fix incorrect pipeIdx Hack to stop tail calls on ICompletionChaining passing a reference as an argument in a funciton stops the F# compiler from outputting a tail instruction for that function. None of these functions will be significantly deep as to warrant the need for a tail call. mapi to inline version - added a mapi_adapt version for non-inlined
convert distinct revise distinct hide unnecessarily exposed surface area inline distinctBy inline skipWhile inline takeWhile inline skip inline scan inline take & inline truncate inline windowed inline tail pipeline formatting signature file formatting inline except inline pairwise
- Removed Default implementations - Renamed methods on ICompletionChain - Simplied defined hierarch make choosing correct level to implement easier
inline fold inline sum inline sumBy & inline average inline averageBy inline min & inline minBy inline max & inline maxBy inline reduce inline tryFindIndex inline tryLast inline exactlyOne remove adapt variants inline map2, inline mapi2, & inline map3 inline iter2 & inline iteri2 inline forall2 & inline exists2 inline fold2 add cleanup to folders and cleanup to folders inline compareWith elevate composers
Just made it into a abstract class at the top of the hierarchy Removed unnecessary SetResult type Wrapped it's functionality into Result type Fix iteri2 from seq to ISeq Renaming Value to State Modified Folder to contain Result - This simplified foreach to allow for some further optimizations Moved IOutOfBound into Folder - Avoid creating extra object - foreach implementations call StopFutureProcessing directly Added Upcast for IOutOfBand Added *Thin for when no transforms applied Moved OnComplete Dipose() to OnDispose - fixed some consistency around member & override "Inline" ForEach methods - via struct interface (i.e. inline at runtime) - basically speed parity for sum Shrinking signature file Removed foreach/compose helpers - the didn't really serve any purpose Renamed ForEach to Fold Removed PipeIdx from SeqFactory - Made management of it part of the ISeq classes - Removed internal Build function as well Made Fold iterators more consistent Renamed Consumer to Activity - didn't really consume anything - removed helper classes to slightly decrease surface area Removed errorString argument - Moved SR.GetString code into Composer better selection of inline function Simplified inheritence hierarchy - and finally decided to just go with Transform as the general name for processing Restored TransformWithPostProcessing hierarch - For symmetry with Folder - Fixed a spelling error More renaming - SeqFactory to TransformFactory - Compose to PushTransform - Create to Compose Fix Skipping logic on Fold - And removed unbox cast for performance Made "Upcast"s little safer Tring to make variable naming consistent Consistency work; naming, spacing Composer.groupBy(Ref|Val) Fixed constraint (applied on wrong argument) Added module for GroupBy - also renamed delayed as delay to match seq Moved Array based function to Composer sorts/rev/permute ToArray via Fold - and moved creating Value comparer to a helper function null checks handled in toComposer countBy Remove comented minValBy/maxValBy head/last into Composer - more consistency formatting
OK; in the process of doing a build off this branch, and then I'll:
But Play School is about to finish, so mightn't get this done right away, but I'll endevour to get it done ASAP... |
Thanks for getting this ready; as said I'll get this green ASAP (hopefully by next week - we have a long weekend) Maybe could you reference #1570 as the top, for easy linkage? |
1d46ac4
to
b82b1c3
Compare
source.PushTransform { new TransformFactory<'T,'T>() with | ||
override __.Compose _ _ next = | ||
upcast { new Transform<'T,'V,Lazy<HashSet<'T>>>(next,lazy(HashSet<'T>(itemsToExclude,HashIdentity.Structural<'T>))) with | ||
upcast { new Transform<'T,'V,HashSet<'T>>(next,HashSet<'T>(itemsToExclude,HashIdentity.Structural<'T>)) with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm don't think removing Lazy
here is the correct thing to do. The original code is:
[<CompiledName("Except")>]
let except (itemsToExclude: seq<'T>) (source: seq<'T>) =
checkNonNull "itemsToExclude" itemsToExclude
checkNonNull "source" source
seq {
use e = source.GetEnumerator()
if e.MoveNext() then
let cached = HashSet(itemsToExclude, HashIdentity.Structural)
let next = e.Current
if (cached.Add next) then yield next
while e.MoveNext() do
let next = e.Current
if (cached.Add next) then yield next }
Where it doesn't create the HashSet
until after the first call to MoveNext
. So if the code never enumerates then the it is never allocated. But this was done to fix a unit test? Which test was it?
[<TestFixture>] | ||
type SeqModule() = | ||
[<Parallelizable>][<Category "Collections Tests">][<Category "Seq Tests">] | ||
module SeqModuleTests = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally like the simpler module
and let
syntax, but I think we should do this as a seperate PR, as @dsyme has said on a number of occasions that he would prefer minimum changes so as to limit as much scope of review as possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this was a basic usability change so that it;d be easy to run the necessary tests in the VS Test Runner, as they are currently setup it's hard to select the right ones. We can remove that commit later if Don doesn't like it, but for now it's useful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some reason my code review isn't appearing; so I repeat it here:
I'm don't think removing Lazy
here is the correct thing to do. The original code is:
[<CompiledName("Except")>]
let except (itemsToExclude: seq<'T>) (source: seq<'T>) =
checkNonNull "itemsToExclude" itemsToExclude
checkNonNull "source" source
seq {
use e = source.GetEnumerator()
if e.MoveNext() then
let cached = HashSet(itemsToExclude, HashIdentity.Structural)
let next = e.Current
if (cached.Add next) then yield next
while e.MoveNext() do
let next = e.Current
if (cached.Add next) then yield next }
Where it doesn't create the HashSet
until after the first call to MoveNext
. So if the code never enumerates then the it is never allocated. But this was done to fix a unit test? Which test was it?
Something weird was going on with my test setup and it seemed like this fixed a test but it actually didn't, It's off now |
34f4f71
to
cecb26e
Compare
Thanks for making this branch, but I think I am going to start clean - history be damned. I have a new machine, VS2017 installed, a master build that actually works (my old machine was a little sad and sorry wreck...), finally a glimpse of how git organises things (barging in speaking Mercurial never helped!) and it's 7am and I've been up for the last 2.5 hrs getting to this step 0 position! Oh, and I've decided to rename composer to Anyway, between The BirdMan Rally and other Moomba festivities I'll still strive for that green build ASAP! (But on a new PR; so this and #1570 should probably be put to sleep...) |
@manofstick thanks for the update! looking forward for |
same as #2525 but I squashed all consecutive commits from each author.
The original PR #1570