Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(Seq.traverse/sequence*)!: Yield arrays #310

Merged
merged 11 commits into from
Mar 5, 2025

Conversation

bartelink
Copy link
Contributor

@bartelink bartelink commented Feb 17, 2025

Instead of yielding a [lazy] sequence as the bodies of the Error or Ok state arising from the traverse and sequence results, yields Arrays directly. This offers two advantages:

  • better pattern matching options - e.g. can match on Ok [||], Error [||], or use _.Length on the Error or Ok bodies (vs having to to when Seq.length etc)
  • more efficient implementation (the current uses all sorts of temps via use of Seq.rev etc

Resolves #254

Types of changes

  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

In general, the signatures can be used interchangeably, but its undeniable that this is a binary breaking change.

Checklist

  • Build and tests pass locally

@bartelink
Copy link
Contributor Author

bartelink commented Feb 17, 2025

cc @1eyewonder @TheAngryByrd My original impl was #255

I believe this correctly walks the sequences and combines successes and failures for the A variants (I reworded the docs as they suggested other semantics)

IMO this impl should be relatively efficient, but my goal in using and returing arrays is more about the usage experience, and intuitive ease of reading the impl than actual perf goals.

If you guys feel it's warranted, I'm not objecting to people tuning it a bit, though I can't say I'm personally that concerned as the Seq.rev being gone is already a huge win IMO


My main questiion/review comment would be whether you'd like me to apply this impl approach to the other variants - given they are relatively new, I guess now is the time, if it's ever to happen? I personally don't touch code that uses them so am not directly concerned (though consistency of approach is important for the lib in general of course)

@1eyewonder
Copy link
Contributor

1eyewonder commented Feb 17, 2025

@bartelink If you're looking to yield arrays directly, is there a reason why we are trying to implement in the Seq module as opposed to the existing Array module? The reason why I ended up converting this originally was due to needing lazy loading in very large sequences I was working with at the time. You can see the history shown here #277 if you didn't see this already. I'd also suggest adding benchmarks to the existing benchmark tests if we want to prove this is a more efficient implementation.

@bartelink
Copy link
Contributor Author

bartelink commented Feb 17, 2025

@1eyewonder sorry for the slow response - was porting the rest...

This is not about perf - I won't be surprised if it beats the current impl, but that's not the point - the point is a cleaner output contract. My references to perf in my comment above is more about the fact that I can well imagine there being some common cases worthy of special casing over the base impl (at the price of more code for people to review and/or traverse in perpetuity, which is why this impl keeps it simple for now)

The semantics should be 100% compatible with your cited need (lazy consumption of the input)


ASIDE: I don't see any point in having equivalents for Array, ResizeArray or anything else; this already covers what I see as the needs:

  1. lazy consumption, stopping when the outcome is decided
  2. minimal allocations and good general perf
  3. be able to pattern match and/or efficiently count the outputs as necessary

I tend to use immutable arrays over Lists, so that's less of a concern for me. Given the nature of how they work, it may well be that a specific implementation might be warranted for those if the requirement is for an optimal implementation (though in general having List.rev/Seq.rev and/or list comprehensions that are doing the moral equivalent of appending to the list is going to make that hard to achieve)

The TL;DR of all this is that the base impl in #255 is very boring and easy to read, and more complex implementations with lots of nesting, recursion and.or allocations are the ones that should have a burden of justifying that complexity

@TheAngryByrd
Copy link
Collaborator

Sorry for the delay, I'll get to look at this in the weekend.

@bartelink
Copy link
Contributor Author

Not a problem - take as long as you need

@bartelink
Copy link
Contributor Author

bartelink commented Mar 1, 2025

Separate question - is there a plan/place for a Seq.traverseTaskResult ?

I have a codebase that could use it and atm I have Seq.toList contortions


I implemented it in here but obviously if that's missing a point and/or making a mess, it can be removed.

I also added a sequenceTaskResult, but that might be a step too far and I'd definitely be happy to remove that - IMO even lazy taskresult seqs are liable to get people confused and/or broken

let actual = Seq.traverseOptionM tryTweetOption tweets

let actual =
Expect.wantSome actual "Expected result to be Some"
|> Seq.toList
Copy link
Contributor Author

@bartelink bartelink Mar 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note IMO needing to manipulate the result for it to be useful is a design smell (especially when there is no actual laziness, but yet you can't be 100% that you don't need to do a Seq.cache without seeing the impl as the type system does not guarantee it)

the one debatable thing is that if there was a first class immutable array, it would make sense to use it, but this is not that world as yet

Copy link
Collaborator

@TheAngryByrd TheAngryByrd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you seem to need to materialize the seq to know if it ends up being an Ok or an Error, it makes sense to just return the materialized array.

Only have a tiny nitpick, might as well use ArrayCollector and take care of any small perf benefits it could give.

@1eyewonder I think lazy consumption is still happening here, which is a desired trait. Does that seem like the case to you?

| Error e -> Error e
| Ok initialSuccesses ->

let oks = ResizeArray(initialSuccesses)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIT: I think we could go with an ArrayCollector to benefit any small enough sequences. Might as well squeeze out all the tiniest perf benefits.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point (my first time using it - it'd be nice to have a way to check for empty without the mutable flag abuse I added, though I suspect in terms of actual JITted IL this probably wins so not going to sweat it!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That ArrayCollector shadowing behavior is probably something there should be a warnAsErrors setting or somesuch trapping?

@bartelink
Copy link
Contributor Author

Will also add the A variants, incoming...

@bartelink bartelink force-pushed the sequenceresult-sig branch 3 times, most recently from 430c3d5 to 4752e9d Compare March 2, 2025 23:43
@bartelink bartelink force-pushed the sequenceresult-sig branch from 4752e9d to b017a0a Compare March 2, 2025 23:52
@bartelink
Copy link
Contributor Author

bartelink commented Mar 3, 2025

Had to revert ArrayCollector stuff for stuff that goes through Fable (and the struct semantics forcing mutable are a bit confusing, so I could be persuaded to revert the non-Fable usages too...)

It would seem that the Array functions are now superfluous (or at least they should delegate to the Seq ones). My instinct is to take the former approach. While it may be counter intuitive to not have them, for me there is a non-zero carrying cost to them as they are.

Similarly IMO the List ones are probably best off delegating to the Seq impls and mapping to List at the end, given the amount of allocations and temps involved in List.rev abuse in the current impls - ultimately this stuff all costs in compile/transpile/trim or execution time

Let me know if you want me to do any of that; if not, I'm happy for this to get merged if you are...

@bartelink
Copy link
Contributor Author

Finally, a question re TFMs (I have form on this, see serilog/serilog#1970)

In general for libs, I would try to obviously provide netstandard2.0 support where possible
net6.0 is also a useful TFM in that it has less messy dependency chains

However the incremental cost vs benefit of e.g. having a 9.0 one to my mind needs to be justified - unless you are actually calling net9.0-specific APs and/or otherwise benefitting from something new or better, the library itself should NOT be constantly updated to net8.0, net9.0, etc. It should instead either stick with net6.0, or not have a TFM-specific build

Obviously randomly dropping a TFM-specific package subset is not something to do lightly, as it can cause a mess, but it's a good idea if going to a new major to pick something minimal to support and stick with that

The test suites of course should follow a different policy - including at least testing with latest LTS, and optionally with the STS

@bartelink
Copy link
Contributor Author

And another spam - I dont know how to fix this with paket (and IMO paket is not a help for libs - the libs I maintain and/or co-maintain tend not to use paket, even if I use it for apps), but FSharp.Core versions prior to v7 (which I believe this lib should ideally reference) have xmldoc included that makes a mess of searching for things in Rider as it (correctly) shows the content. My workaround is: https://github.com/jet/equinox/blob/master/src/Equinox/Equinox.fsproj#L25-L28

@1eyewonder
Copy link
Contributor

@TheAngryByrd Yes, the lazy consumption is happening which is the ideal trait, however, I have a question.

While there is still lazy consumption occurring, we are now loading the sequence into memory by converting the output into an array (correct me if I'm thinking of this wrong). Does this decision make sense for consumers who may want to keep the sequence lazy for memory purposes?

If we do see the memory concern as valid, we could still attempt to help @bartelink 's concerns out by looking into creating some helper functions to change the output to the desired shape. It would end up being some form of Result.map Seq.toArray |> Result.mapError Seq.toArray I'd imagine.

If we don't see the memory concern as valid, we may want to look into our sibling functionality found in the List and Array modules to keep things consistent and discoverable. It may not feel fluid for users to have List.traverse/sequence returning lists while Seq.traverse/sequence returns arrays, for example.

@bartelink
Copy link
Contributor Author

@1eyewonder Note also my comments in #310 (comment)

I believe that the Array specific impls are now redundant (and I am pretty sure that the List ones should be implemented in terms of this. Ultimately while this is a binary breaking change, there are no cases where this signature is less useful for real usage.

@TheAngryByrd
Copy link
Collaborator

However the incremental cost vs benefit of e.g. having a 9.0 one to my mind needs to be justified - unless you are actually calling net9.0-specific APs and/or otherwise benefitting from something new or better, the library itself should NOT be constantly updated to net8.0, net9.0, etc. It should instead either stick with net6.0, or not have a TFM-specific build

I'm targeting TFM 9 because of F# Language Version 9 which brings in a lot of specific new nullable APIs. There was a lot of breaking Source overloads when I went up to LangVersion 9 so I'm constraining it to net9 only and having 8 be for lower. Also bumping language versions in F# can cause problems with consuming a library built with Lang9 and consuming from Lang8. See dotnet/fsharp#14313 for an example of this.

@TheAngryByrd
Copy link
Collaborator

Let me know if you want me to do any of that; if not, I'm happy for this to get merged if you are...

I think this is fine for now. Yeah lets get this merged.

@TheAngryByrd TheAngryByrd merged commit 73f728a into demystifyfp:master Mar 5, 2025
27 checks passed
@TheAngryByrd
Copy link
Collaborator

Thanks again for fixing this!

TheAngryByrd added a commit that referenced this pull request Mar 5, 2025
- BREAKING: [Remove Ply and update to FSharp 6](#248) Credits @TheAngryByrd
- BREAKING: [Remove MergeSources (and!) from some implementations like Result](#261)  Credits @TheAngryByrd
- BREAKING: [Merge TaskResult into Core library](#285) Credits @TheAngryByrd
- This means FsToolkit.ErrorHandling.TaskResult is no longer a separate package and will not be updated. It is now part of the core library.
- BREAKING: [Rename retn to singleton](#287) Credits @1eyewonder
- BREAKING: [Rename returnError to error + documentation](#311) Credits @tw0po1nt
- [Use Microsoft.Bcl.AsyncInterfaces in netstandard2.0 (Allows IAsyncDisposable and IAsyncEnumerable)](#250) Credits @TheAngryByrd
- [Build against Net8](#251) Credits @TheAngryByrd
- [Fix Overload Resolution to Align to Computation Expression used](#252) Credits @TheAngryByrd
- [refactor!: Seq.sequenceResultM returns Array instead of seq](#255) Credits @bartelink
- [feat(Seq): sequenceResultA](#255) Credits @bartelink
- [Updated uses of Seq.append](#290) Credits @1eyewonder
- [Add Option.traverseAsync and Option.sequenceAsync](#298 (comment)) Credits @tw0po1nt
- [Add Require and Check helper methods](#295) Credits @PI-Gorbo
- [Add new AsyncOption APIs and document all its other functions; minor fixes to documentation for Option module](#307) Credits @tw0po1nt
- [F# 9 support and nullness](#308) Credits @TheAngryByrd
- [Update IcedTasks 0.11.7](0a4cc7b) Credits @TheAngryByrd
- [Add TaskValidation module](#313) Credits @tw0po1nt
- [feat(Seq.traverse/sequence*)!: Yield arrays](#310) Credits @bartelink
@bartelink bartelink deleted the sequenceresult-sig branch March 5, 2025 06:12
@bartelink
Copy link
Contributor Author

Thanks for getting it in - the release notes should probably more prominently stress that the return types have changed (i.e. I have the standard https://keepachangelog.com/en/1.0.0 ! annotation implying the BREAKING you put in for some of the others. Any thoughts on alignment between Seq, Array and List impls? IMO

for Array, the impls are incomplete, and the desired signatures match the Seq ones, which means

  • add an Array facade that delegates to the Seq ones (but then would one sprout a ResizeArray or iarray equivalent if someone asked?
  • just remove it

While the List impl has a history and a track record, ultimately, for any interested number of results, reving a List and/or chaining a series of results in closures is just not a good idea (and the legibility difference with it being idiomatic recursive list manipulation does not outweigh that for me). However, there is a reasonable strong expectation of results being list.

This suggests to me picking from:

  • use the same impl, but replace ResizeArray.ToArray with List.ofSeq and ArrayCollector.Close with ArrayCollector.Close >> List.ofSeq (or implemented a CloseAsList, or migrate the temp to ResizeArray etc)
  • add some internal helpers in the Seq module that enable them to be written as oneliners (i.e. parameterise the Close part to allow people to have arbitrary result types)

If you pop your outline lan/desires into an issue I guess it can be debated and/or closed from there...

@TheAngryByrd
Copy link
Collaborator

annotation implying the BREAKING you put in for some of the others.

Yeah good point. I'll mark this.

for Array, the impls are incomplete, and the desired signatures match the Seq ones, which means

Yeah I think it makes sense to delegate to the Seq ones, just to make discoverability better.

se the same impl, but replace ResizeArray.ToArray with List.ofSeq and ArrayCollector.Close with ArrayCollector.Close >> List.ofSeq (or implemented a CloseAsList, or migrate the temp to ResizeArray etc)

For list specifically, there is a ListCollector that's very much in the same vein as ArrayCollector. If Fable doesn't support it, might have to #if.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

feat(Seq): Add sequenceResultA
3 participants