-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[barray] adopt the barray module #38
base: master
Are you sure you want to change the base?
Conversation
This is WIP, but I think we can do a quick review for it. A few things in this change: 1. I added the iterator module to replace the sequence module. The OCaml community seems have not reached an aggrement on how functional iterator would look like, we've got `gen`, `seq`, `Core.Sequence` etc. I picked this because it is purely functional and have a monadic interface, while `Sequence` is side-effect based. Dave suggested we use this as well. 2. Adopted the barray module, added a Collection interface Things to do: 1. add unit tests for barray
src/containers/barray.mli
Outdated
@@ -46,7 +46,7 @@ val append : 'a array -> 'a array -> 'a array | |||
val concat : 'a array list -> 'a array | |||
(** Same as [Array.append], but concatenates a list of arrays. *) | |||
|
|||
val sub : 'a array -> int -> int -> 'a array | |||
val slice : 'a array -> int -> int -> 'a array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're going to call this method slice, I think it should honor the same semantics as JS. See the immutable.re implementaiton: https://github.com/facebookincubator/immutable-re/blob/master/src/Immutable.rei#L2098
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also I think the array should be the last argument to enable chaining.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implemented slice
src/containers/barray.mli
Outdated
(** [Array.to_list a] returns the list of all the elements of [a]. *) | ||
|
||
val of_list : 'a list -> 'a array | ||
val fromList : 'a list -> 'a array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Personally I'd recommend not including functions that can't be implemented efficiently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let me do some benchmark and see how efficient they can be, I am more inclined to keep them because I don't want to force the user to write these functions
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue is that you can't make this efficient at scale. You either have to do an O(N) scan of the list to get its count, or you have to do multiple array copies as you discover the need for a bigger array. Either option is inefficient. The only reason OCaml developers do this is that they don't have better datastructures available. Immutable.re solves this problem by providing Vector, Deque and Stack, all that have O(1) count operations.
src/containers/barray.mli
Outdated
@@ -76,24 +76,23 @@ val mapi : (int -> 'a -> 'b) -> 'a array -> 'b array | |||
function is applied to the index of the element as first argument, | |||
and the element itself as second argument. *) | |||
|
|||
val fold_left : ('a -> 'b -> 'a) -> 'a -> 'b array -> 'a | |||
val foldLeft : ('a -> 'b -> 'a) -> 'a -> 'b array -> 'a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use the same terminology as Immutable.re (reduce, reduceReversed). I had a conversation with @jordwalke and he made a compelling argument against using left/right.
src/containers/barray.mli
Outdated
(** [Array.fold_right f a x] computes | ||
[f a.(0) (f a.(1) ( ... (f a.(n-1) x) ...))], | ||
where [n] is the length of the array [a]. *) | ||
|
||
external make_float: int -> float array = "caml_make_float_vect" | ||
external makeFloat: int -> float array = "caml_make_float_vect" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do these specialized arrays make sense in JS? Are they compiled to JS TypedArrays?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it currently does, but I seem to recall it having been mentioned as a possibility at least.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/containers/barray.mli
Outdated
@@ -113,59 +112,42 @@ val fast_sort : ('a -> 'a -> int) -> 'a array -> unit | |||
|
|||
(* The following is for system use only. Do not call directly. *) | |||
|
|||
external unsafe_get : 'a array -> int -> 'a = "%array_unsafe_get" | |||
external unsafe_set : 'a array -> int -> 'a -> unit = "%array_unsafe_set" | |||
external unsafeGet : 'a array -> int -> 'a = "%array_unsafe_get" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are these actually necessary in JS?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you think the automatic undefined -> option conversion makes this obsolete?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are only targeting JS then Barray.get
can make use of Js.undefined_to_opt
, but since we will support native later and Immu.re is more native sided, I would not introduce too much BS specific stuff.
src/core/collection.ml
Outdated
@@ -0,0 +1,16 @@ | |||
|
|||
module type VectorLike = sig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you pull this module (collection.ml) out of core and move it into containers. It's unlikely Imm.re will ever depend on this, it would be better to define core as the common shared subset, and containers vs. imm.re can be a user choice.
src/core/comparable.ml
Outdated
@@ -5,3 +5,9 @@ module type S = sig | |||
|
|||
val compare:t Comparator.t | |||
end | |||
|
|||
module type C = sig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would avoid adding this definition. I intentionally removed it from Imm.re, because the definition needs to be considered within the scope of modular implicits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi thanks for the review, sorry I was somehow busy today and just had the chance to look at it. Could you elaborate on the design if we have modular explicit? I am just curious how it would look like and why it would affect adding this interface. Thanks
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is an example of an implicit module signature for a parameterized type from the modular implicit paper:
implicit module Show_list { A : Show } = struct
type t = A . t list
let show l = "["^ String . concat ", " ( List . map A . show l ) ^"]"
end
The paper is here: http://www.lpw25.net/ml2014.pdf
For the most part, I think we should avoid defining any interfaces that result in unwieldy apis that will be resolved in a consistent fashion by using modular implicits.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi yeah, I have read the paper, I agree this interface is a bit unusual. I took the idea from Core, but it seems Haskell does not have the notation, nor any other language I can recall. I will remove it then.
src/core/equatable.ml
Outdated
@@ -2,4 +2,9 @@ | |||
module type S = sig | |||
type t | |||
val equals : t Equality.t | |||
end | |||
|
|||
module type C = sig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto here. I would not implement this until we have modular implicits.
src/core/iterator.ml
Outdated
@@ -0,0 +1,41 @@ | |||
|
|||
type +'a node = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we rename this module Sequence, include the Sequence type definition in core, but move your implementation of the combinators into containers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This iterator design seems pretty good to me. I prefer its explicitness over the simpler and structural function that raises on exception.
src/containers/barray.mli
Outdated
val get_safe : 'a t -> int -> 'a option | ||
(** [get_safe a i] returns [Some a.(i)] if [i] is a valid index | ||
@since 0.18 *) | ||
val getSafe : 'a t -> int -> 'a option |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getOption
might be a better name here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would encourage to follow the approach of always making the defaults safe, eg. get always returns an option, and then you can add getOrRaise for efficiency.
src/containers/barray.mli
Outdated
val fold : ('a -> 'b -> 'a) -> 'a -> 'b t -> 'a | ||
|
||
val foldi : ('a -> int -> 'b -> 'a) -> 'a -> 'b t -> 'a | ||
val foldi : f:('a -> int -> 'b -> 'a) -> acc:'a -> 'b t -> 'a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be reducei
. Or reduceWithIndex
to be really verbose about it?
src/containers/barray.mli
Outdated
(** Fold left on array, with index *) | ||
|
||
val fold_while : ('a -> 'b -> 'a * [`Stop | `Continue]) -> 'a -> 'b t -> 'a | ||
val foldWhile : ('a -> 'b -> 'a * [`Stop | `Continue]) -> 'a -> 'b t -> 'a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reduceWhile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can avoid the need for two functions, by using Imm.re's definition of reduce:
let reduce: while_::('acc => a => bool)? => ('acc => a => 'acc) => 'acc => t => 'acc;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also much prefer the while check returning a boolean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed it to val reduceWhile : ('a -> 'b -> 'a * bool) -> 'a -> 'b t -> 'a
where the boolean value indicate whether we should stop or not (true to stop, false to continue).
src/containers/barray.mli
Outdated
@since 0.8 *) | ||
|
||
val iter : ('a -> unit) -> 'a t -> unit | ||
indicated by the accumulator *) | ||
|
||
val iteri : (int -> 'a -> unit) -> 'a t -> unit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
forEachi
or forEachWithIndex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
forEachWithIndex
src/containers/barray.mli
Outdated
@since 0.8 *) | ||
|
||
val iter : ('a -> unit) -> 'a t -> unit | ||
indicated by the accumulator *) | ||
|
||
val iteri : (int -> 'a -> unit) -> 'a t -> unit | ||
|
||
val blit : 'a t -> int -> 'a t -> int -> int -> unit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mutates, so we might want to remove it, or at least mark it very clearly as being mutating.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would love to have the minimal mutable subset of array in core (blit and pals), and then in containers you can add whatever you like for containers to containers itself.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name is a bit weird, took me a while to understand what it does.
src/containers/barray.mli
Outdated
|
||
val pp_i: ?sep:string -> (int -> 'a printer) -> 'a t printer | ||
(** Print an array, giving the printing function both index and item *) | ||
val toIterator: 'a t -> 'a Iterator.t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An iterator is really just a "view" of or interface to the data structure, not itself a separate data structure, so I don't think toIterator
is a good name for this. getIterator
or makeIterator
, or perhaps just iterator
would be more fitting I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used toSequence in immutable.re.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
changed to toSequence
src/containers/barray.mli
Outdated
(** Map each element into another value, or discard it *) | ||
|
||
val flat_map : ('a -> 'b t) -> 'a t -> 'b array | ||
val flatMap : ('a -> 'b t) -> 'a t -> 'b array | ||
(** Transform each element into an array, then flatten *) | ||
|
||
val (>>=) : 'a t -> ('a -> 'b t) -> 'b t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These should be removed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the intention to remove non-alpha methods from the library to promote using the actual words (bind, pure etc.)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep, random symbols are pretty hostile to newcomers, and they can easily be defined in your own projects or in an auxillary library.
src/containers/barray.mli
Outdated
|
||
val except_idx : 'a t -> int -> 'a list | ||
val exceptIndex : 'a t -> int -> 'a list | ||
(** Remove given index, obtaining the list of the other elements *) | ||
|
||
val (--) : int -> int -> int t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This actually seems pretty damn useful, and even kind of intuitive, but an ordinary range
function is probably still better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suggest js slice which is not too hard to implement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
range can be just an alias of --^
, or the other way around.
src/core/collection.ml
Outdated
module type VectorLike = sig | ||
type 'a t | ||
|
||
val mem: 'a Equality.t -> 'a t -> 'a -> bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've used this recently, and struggled to find this, so I should be able to remember what this does. But even seeing the name and type signature I still could not. Lol, that's how bad this name is. contains
is a nice name, though javascript has named it includes
for some silly reason.
src/core/collection.ml
Outdated
@@ -0,0 +1,16 @@ | |||
|
|||
module type VectorLike = sig |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big fan of this name, or this interface. It doesn't seem very well defined. It seems kind of like an Iterable
, but an Iterable
would not have a property-like length
function, but instead use a verb like count
to imply that an action needs to be performed. It also should not have an isEmpty
function, but perhaps instead an any
function (which is probably a better name for exists
if the predicate is optional)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed this module for now.
I also pretty much concur with everything @bordoley said |
A couple of general comment (around the same theme)
|
src/containers/barray.mli
Outdated
val flatMap : ('a -> 'b t) -> 'a t -> 'b t | ||
(** Transform each element into an t, then flatten *) | ||
|
||
val (--) : int -> int -> int t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we put operators in 'Infix' or 'Ops' sub modules. The motivation for this is that operators are only useful when the module they are defined in is opened. However opening a module is usually a bad practice so in order to limit the scope of what is open operator can be placed in a sub module.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I will put them in a Infix
submodule.
val compare:t Comparator.t | ||
end | ||
|
||
val compare: t Comparator.t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this file does not start with a "b" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am trying to not add b when the module name does not conflict with the standard library. Do you think it's better to have a b for all the modules?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only name directly conflicts with stdlib will be prepended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might very well conflicts with other people file names. It's much safer (and consistent) to start all files with the same prefix. If a user once to change the name in its program he can simply use module aliases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have a top level namespace file called Collection
, so say if the user wants to use Sequence
he will probably will have to alias Collection.Sequence
. But I think for consistency I will change all the modules to start with a B
.
src/core/sequence.ml
Outdated
@@ -0,0 +1,41 @@ | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this file as well starts with "b"
|
||
(* The following is for system use only. Do not call directly. *) | ||
|
||
external unsafe_get : 'a array -> int -> 'a = "%array_unsafe_get" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Those are useful.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You still have them, as getOrRaise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unsafe_get don't raise exception in the standard library, no boundary check is performed, and behavior is left undefined if out of bound.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
val unsafeGet : int -> 'a t -> 'a
val unsafeSet : int -> 'a -> 'a t -> unit
Added them back with this signature for chained application style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we rename this unsafeGetUnchecked
, so it's a bit less vague?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would discourage exposing this in the API personally. There are are uses for unsafe access, but at that point, I'd assume any user would be advance enough to provide their own implementation around the ffi interface.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's also a lot of bad formatting in the tests, and very inconsistent use of piped vs. non-piped forms
src/containers/barray.mli
Outdated
If the value of [x] is a floating-point number, then the maximum | ||
size is only [Sys.max_array_length / 2].*) | ||
|
||
val init : int -> (int -> 'a) -> 'a t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be named makeWithInit
perhaps?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I stuck with 'init' in imm.re. I don't think it matters much though either way in this case.
src/containers/barray.mli
Outdated
val concat : 'a t list -> 'a t | ||
(** Same as [Barray.append], but concatenates a list of arrays. *) | ||
|
||
val slice : int -> int -> 'a t -> 'a t option |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why were the labels removed here? I think labels are even more useful here because it helps distinguish start
/end
from start
/length
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Completely agree! As a general rule if you have more than one function parameter of a type then you should use label.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's hard to figure out what labels should be preserved and what should be removed, there is no guideline on this as far as I can see. Perhaps we should have two modules, Array
and ArrayLabeled
as the stdlib/ocaml-containers? What's your opinion?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @mransan's rule is good. If the order is painfully obvious it might be ok to make an exception, but otherwise i'd just follow that rule.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
immutable.re uses "start" and "end_"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really dislike having two apis, one labelled one not. Smells of lack of decisiveness on the behalf of the API writer and is harder to learn.
I generally used name labels when they are needed to clarify the intent of the API signature. In this case, there is no way to know what the the various integer arguments do without labels.
src/containers/barray.mli
Outdated
|
||
val map : ('a -> 'b) -> 'a t -> 'b t | ||
|
||
val mapi : (int -> 'a -> 'b) -> 'a t -> 'b t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mapWithIndex
src/containers/barray.mli
Outdated
|
||
val count: ('a -> bool) -> 'a t -> int | ||
|
||
val iter : ('a -> unit) -> 'a t -> unit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
forEach
src/core/sequence.ml
Outdated
| Cons (x, r) -> (match p x with | ||
| Some c -> Cons (c, filterMap p r) | ||
| None -> filterMap p r () | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
begin
/end
would be better than parentheses here IMO
src/core/sequence.ml
Outdated
|
||
type +'a node = | ||
| Nil | ||
| Cons of 'a * 'a t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The indentation here is inconsistent with the rest of the pipes in this file
src/core/sequence.ml
Outdated
|
||
let return x () = Cons(x, empty) | ||
|
||
let rec map f i () = match i () with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not fond of having match
(or anything other than just function
really) at the end of the line here. When I see a function declaration followed by a line starting with a pipe, I expect the last argument of the function to be implicitly pattern matched, ie. = function
instead of = match ... with
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First, why match ... with
is bad here? I understand you mentioned it a couple of times before but I failed to grasp why this is a bad pattern. Second, I tried a bit, and found this one really hard to be rewritten using only function
since we have to match against i ()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I didn't mean it's bad to use match .. with
here, just that the formatting is unfortunate because it looks very similar to this:
let rec map f i () = function
| Nil -> Nil
| Cons (x, r) -> Cons (f x, map f r)
I think it's bad practice more generally to place code of any significance at what is effectively a different indent level, but match ... with
is especially unfortunate due to this similarity. So all I'm really asking is that you format it like this instead:
let rec map f i () =
match i () with
| Nil -> Nil
| Cons (x, r) -> Cons (f x, map f r)
src/containers/barray.ml
Outdated
|
||
let isEmpty a = length a = 0 | ||
|
||
external unsafe_sub : 'a array -> int -> int -> 'a array = "caml_array_sub" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be camelCased too, even if it's not exposed.
src/containers/barray.ml
Outdated
let length = Array.length | ||
|
||
let get i a = | ||
if i>=0 && i<Array.length a |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent spacing around the infix operators.
I think it's good design to make performance costs explicit, since it makes the implications apparent when reading the code. It's usually considered bad practice to optimize prematurely, and most people don't read the documentation for basic stuff like this anyway, so it's important to have this information at read-time, not just write-time. There might be better solutions than inconvenience of course, but I don't think documentation is a good substitute for poor design, and I don't think it's a good idea to optimize the design for scripting either. |
@glennsl "fromList" is not inefficient, it's just a convertor between 2 containers. If we're discussing 2 implementations differences of that function and one is more performant than another I would agree, but here we're talking whether converting a list to an array should be exposed at all. A simple case could be that you use an API (3rd party) that returns a list but then you need to perform a one-time algorithm which require random access. Why not simply converting one to the other using the fromList function ? Are there better alternatives ? If so, why the fromList is not implemented in terms of them ? |
@mransan Conversion between any two containers is supported via |
I lean towards not having |
@glennsl @dorafmon, so fromSequence is:
So now I need to convert my list to a sequence which gets converted back to a list in order to I agree the unification of all containers to a general concept as a sequence is elegant and theoretically pleasing, but sometimes I just think it defeats the purpose of simplicity. |
@mransan there is no reason that implementation can't be changed not to convert to a list. It's basically impossible to implement Array.fromList/Seq, efficiently because it's always going to involve numerous o(n) copies. Another option is to require the caller pass the size as an argument to the Array.from* function, and raise an exception if you can't pull enough items from the list/seq to fill the array. |
@mransan @bordoley If we are only considering the BS case, then we can implement let fromSequence i =
let open Bsequence in
let arr = ref [||] in
let rec aux i =
match i () with
| Nil -> !arr
| Cons(x, r) -> ignore @@ Js.Array.push x !arr; aux r
in aux i JS offers variable length array and thus it made it this possible. We only need 1 copy of the array in total. |
A summary of outstanding issues and what I am thinking:
Apart from these two can we get this PR merged if no other objection? I think the interface is nice now and I the tests should provide at least basic coverage. Thanks a lot! |
@dorafmon: Under the covers JSArray is effectively a C++ vector. There are optimizations for small arrays (and you can implement these yourself as well) but pushing into large arrays can also cause significant amounts of copying. You also sacrifice some array optimizations. This article has more details (old but still mostly accurate): https://gamealchemist.wordpress.com/2013/05/01/lets-get-those-javascript-arrays-to-work-fast/ See item 4. For large datasets, it's likely that pushing into an immutable vector such as those implemented by Imm.re/Immutable.js will be faster than pushing into a native js array. |
@bordoley out of curiosity I assume the immutable vector has a O(1) conversion to arrays? |
@dorafmon I'll eat my words a little. I ran some perf tests last night, and indeed Array.push is fast (in absolute time) for pushing large number of items (faster than I thought it would be), beating even push into a Transient Vector by 30 to 50%. I tested with Arrays where n = 7M on node. This surprised me. I looked into the BS stdlib, and I suspect that I could optimize some array CopyOnWriteArray functions to use native implementations that could achieve equal or better performance. That said push is still not "efficient" in absolute terms as pre-allocating a single array and then inserting elements into it should be faster since it avoids generating garbage. Imm.re vectors do not have an O(1) conversion to array. However, it's possible to implement an conversion via one allocation, and O(N) copies. This is due to the fact that Imm.re vectors have an O(1) count operation. |
@bordoley efficiency is relative and depends on the context. Take this example: you call a third party API (no control over it) which returns a list, you then want to transform it into an array to make an expensive computation which required random access of the elements. So you must convert a
So now in this example (which by the way is very common), the most efficient solution is a plain old |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dorafmon The formatting of the tests are still completely borked.
I really don't understand why you insist on using ocp-indent, since to me it seems to cause more problems than it solves. Whatever you use is fine by me as long as the end result is good, but right now the tests are pretty bad.
.ocp-indent
Outdated
# Number of spaces used in all base cases, for example: | ||
# let foo = | ||
# ^^bar | ||
base = 4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why 4? Isn't 2 spaces eessentially the standard indentation for OCaml projects? I don't think I've seen a project using 4 spaces yet.
src/containers/barray.mli
Outdated
|
||
val unsafeGet : int -> 'a t -> 'a | ||
|
||
val unsafeSet : int -> 'a -> 'a t -> unit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong opinion on whether or not to include these, but if they're included I'd like them to be named unsafeGetUnchecked
and unsafeSetUnchecked
|
||
let empty () = Nil | ||
|
||
let return x () = Cons(x, empty) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've renamed this make
in bopt
and bresult
. On the fence about that, but I don't think return
is a very good name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, return
reflects that a sequence is monadic, but I guess it is okay to rename it something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is a good point. While I understand wanting to avoid a lot of the category theory nomenclature, it might be good to keep the basics to it's clear what the structures being used actually are? return
, pure
or something like that are pretty standard in FP.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good for who? Should we optimize for the few that have already demonstrated a great amount of perseverance by learning category theory, or the many who have not?
As an example, to call an external function that accepts a nullable value, e.g external f : int Js.null -> unit = ...
you currently have to write (and read) f (Js.Null.return 0)
. That's not a great experience and severely undermines what's otherwise a damn good FFI system. Most also seem to get along just fine with the option monad without needing a return
function.
That's not to say it wouldn't be good to have a standard monad "interface"/module signature, but I don't see great benefit in that interface adopting standard terminology, compared to the benefit of avoiding incomprehensible terminology.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, I think we might be agreed then. I don't really care about the terminology, but I think the interface is very valuable, both the beginners and more experienced devs alike.
As long as there is a standard to adhere to that is consistent inside this repo, I can mentally replace bind and return with make and flatMap or andThen or whatever the best name is decided to be.
src/core/bsequence.ml
Outdated
match i () with | ||
| Nil -> Nil | ||
| Cons (x, r) -> begin | ||
match p x with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Too much indentation. I also think I'd prefer begin
on the same line as match
:
begin match p x with
| Some c -> ..
| None -> ...
end
Maybe that'll fix it for ocp-indent.
src/core/bsequence.ml
Outdated
| Nil -> acc | ||
| Cons (x, next) -> foldLeft trans (trans acc x) next | ||
|
||
let rec iter f i = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
forEach
src/core/bsequence.ml
Outdated
| Nil -> Nil | ||
| Cons (this, next) -> append (trans this) (flatMap trans next) | ||
|
||
let rec foldLeft trans acc iter = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reduce
@bordoley @mransan Did some benchmarking, the code is here. So tired |
@glennsl I reformated the code a bit, especially the test. Is it better now? |
@dorafmon The last half or so of barray_test is better. Not as good as before you started messing with it, but good enough. bsequence_test, int_test and the first half of barray_test is still pretty bad. There's also lots of remaining issues with 4-space indentation, even if you've fixed the config the code itself hasn't been reformatted. |
@glennsl I overhauled the whole formatting a bit. Should be a bit better. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks much better, but there's still some issues and a few files have gotten much worse for some reason.
); | ||
let _ = (try Bchar.getDigitOrRaise '1' | ||
with | ||
| Failure _ -> 1) in (); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't look very good.
Either
let _ = (
try
Bchar.getDigitOrRaise '1'
with
| Failure _ -> 1) in ();
or
let _ = (
try Bchar.getDigitOrRaise '1'
with Failure _ -> 1) in ();
or perhaps using begin
/end
instead of parens would be better
); | ||
runPerfTest "Sequence.foldLeft" (fun _ -> | ||
Bsequence.reduce (fun acc x -> x::acc) [] it | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These have actually gotten worse... Not sure why that would be
src/core/comparator.ml
Outdated
(if this < that | ||
then Ordering.Less | ||
else if this > that then Ordering.Greater else Ordering.Equal : | ||
Ordering.t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is very inconsistent.
src/core/comparator.ml
Outdated
(fun x -> | ||
fun y -> if (comparator x y) == Ordering.equal then true else false : | ||
'a Equality.t) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why this is explicitly curried, it would probably format better if it was not. Also weird spacing before the arrows.
|> Expect.toEqual '0'); | ||
test "fromIntOrRaise - 256" (fun () -> | ||
(fun () -> | ||
256 |> fromIntOrRaise) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This (and the next test) should have the body of the lambda indented one more level
[|1 ; 2; 3|] |> Barray.toSequence | ||
|> Barray.fromSequence | ||
|> Expect.toEqual [|1; 2; 3|] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the tests from this to the end of the file are still off
tests/int_test.ml
Outdated
test "neg - 42" (fun () -> | ||
42 |> neg | ||
|> Expect.toEqual (~-42) | ||
); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here again there's suddenly an explosion of indentation in this entire file...
(** Transform each element into an t, then flatten *) | ||
|
||
module Infix : sig | ||
val (--) : int -> int -> int t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file still has too much indentation. There's not very many places where it actually shows though.
This is WIP, but I think we can do a quick review for it.
A few things in this change:
community seems have not reached an aggrement on how functional
iterator would look like, we've got
gen
,seq
,Core.Sequence
etc. Ipicked this because it is purely functional and have a monadic
interface, while
Sequence
is side-effect based. Dave suggested we usethis as well.
Things to do: