[barray] adopt the barray module #38

ghost · 2017-04-22T13:43:16Z

This is WIP, but I think we can do a quick review for it.

A few things in this change:

I added the iterator module to replace the sequence module. The OCaml
community seems have not reached an aggrement on how functional
iterator would look like, we've got gen, seq, Core.Sequence etc. I
picked this because it is purely functional and have a monadic
interface, while Sequence is side-effect based. Dave suggested we use
this as well.
Adopted the barray module, added a Collection interface

Things to do:

add unit tests for barray

This is WIP, but I think we can do a quick review for it. A few things in this change: 1. I added the iterator module to replace the sequence module. The OCaml community seems have not reached an aggrement on how functional iterator would look like, we've got `gen`, `seq`, `Core.Sequence` etc. I picked this because it is purely functional and have a monadic interface, while `Sequence` is side-effect based. Dave suggested we use this as well. 2. Adopted the barray module, added a Collection interface Things to do: 1. add unit tests for barray

bordoley · 2017-04-22T17:21:52Z

src/containers/barray.mli

@@ -46,7 +46,7 @@ val append : 'a array -> 'a array -> 'a array
 val concat : 'a array list -> 'a array
 (** Same as [Array.append], but concatenates a list of arrays. *)

-val sub : 'a array -> int -> int -> 'a array
+val slice : 'a array -> int -> int -> 'a array


If you're going to call this method slice, I think it should honor the same semantics as JS. See the immutable.re implementaiton: https://github.com/facebookincubator/immutable-re/blob/master/src/Immutable.rei#L2098

Also I think the array should be the last argument to enable chaining.

Implemented slice

bordoley · 2017-04-22T17:23:26Z

src/containers/barray.mli

 (** [Array.to_list a] returns the list of all the elements of [a]. *)

-val of_list : 'a list -> 'a array
+val fromList : 'a list -> 'a array


Personally I'd recommend not including functions that can't be implemented efficiently.

Let me do some benchmark and see how efficient they can be, I am more inclined to keep them because I don't want to force the user to write these functions

The issue is that you can't make this efficient at scale. You either have to do an O(N) scan of the list to get its count, or you have to do multiple array copies as you discover the need for a bigger array. Either option is inefficient. The only reason OCaml developers do this is that they don't have better datastructures available. Immutable.re solves this problem by providing Vector, Deque and Stack, all that have O(1) count operations.

bordoley · 2017-04-22T17:24:20Z

src/containers/barray.mli

@@ -76,24 +76,23 @@ val mapi : (int -> 'a -> 'b) -> 'a array -> 'b array
    function is applied to the index of the element as first argument,
    and the element itself as second argument. *)

-val fold_left : ('a -> 'b -> 'a) -> 'a -> 'b array -> 'a
+val foldLeft : ('a -> 'b -> 'a) -> 'a -> 'b array -> 'a


Can we use the same terminology as Immutable.re (reduce, reduceReversed). I had a conversation with @jordwalke and he made a compelling argument against using left/right.

bordoley · 2017-04-22T17:25:05Z

src/containers/barray.mli

 (** [Array.fold_right f a x] computes
    [f a.(0) (f a.(1) ( ... (f a.(n-1) x) ...))],
    where [n] is the length of the array [a]. *)

-external make_float: int -> float array = "caml_make_float_vect"
+external makeFloat: int -> float array = "caml_make_float_vect"


do these specialized arrays make sense in JS? Are they compiled to JS TypedArrays?

I don't think it currently does, but I seem to recall it having been mentioned as a possibility at least.

@glennsl means here
We don't have conditional build (Native vs JS) yet, so let's leave it like this for now.

bordoley · 2017-04-22T17:25:29Z

src/containers/barray.mli

@@ -113,59 +112,42 @@ val fast_sort : ('a -> 'a -> int) -> 'a array -> unit

 (* The following is for system use only. Do not call directly. *)

-external unsafe_get : 'a array -> int -> 'a = "%array_unsafe_get"
-external unsafe_set : 'a array -> int -> 'a -> unit = "%array_unsafe_set"
+external unsafeGet : 'a array -> int -> 'a = "%array_unsafe_get"


Are these actually necessary in JS?

Do you think the automatic undefined -> option conversion makes this obsolete?

If we are only targeting JS then Barray.get can make use of Js.undefined_to_opt, but since we will support native later and Immu.re is more native sided, I would not introduce too much BS specific stuff.

bordoley · 2017-04-22T17:26:59Z

src/core/collection.ml

@@ -0,0 +1,16 @@
+
+module type VectorLike = sig 


Can you pull this module (collection.ml) out of core and move it into containers. It's unlikely Imm.re will ever depend on this, it would be better to define core as the common shared subset, and containers vs. imm.re can be a user choice.

bordoley · 2017-04-22T17:28:19Z

src/core/comparable.ml

@@ -5,3 +5,9 @@ module type S = sig

  val compare:t Comparator.t
 end
+
+module type C = sig


I would avoid adding this definition. I intentionally removed it from Imm.re, because the definition needs to be considered within the scope of modular implicits.

Hi thanks for the review, sorry I was somehow busy today and just had the chance to look at it. Could you elaborate on the design if we have modular explicit? I am just curious how it would look like and why it would affect adding this interface. Thanks

Here is an example of an implicit module signature for a parameterized type from the modular implicit paper:

implicit module Show_list { A : Show } = struct type t = A . t list let show l = "["^ String . concat ", " ( List . map A . show l ) ^"]" end

The paper is here: http://www.lpw25.net/ml2014.pdf

For the most part, I think we should avoid defining any interfaces that result in unwieldy apis that will be resolved in a consistent fashion by using modular implicits.

Hi yeah, I have read the paper, I agree this interface is a bit unusual. I took the idea from Core, but it seems Haskell does not have the notation, nor any other language I can recall. I will remove it then.

bordoley · 2017-04-22T17:28:49Z

src/core/equatable.ml

@@ -2,4 +2,9 @@
 module type S  = sig 
  type t 
  val equals : t Equality.t 
+end
+
+module type C = sig 


Ditto here. I would not implement this until we have modular implicits.

bordoley · 2017-04-22T17:29:24Z

src/core/iterator.ml

@@ -0,0 +1,41 @@
+
+type +'a node =


Can we rename this module Sequence, include the Sequence type definition in core, but move your implementation of the combinators into containers.

glennsl

This iterator design seems pretty good to me. I prefer its explicitness over the simpler and structural function that raises on exception.

glennsl · 2017-04-22T17:19:52Z

src/containers/barray.mli

-val get_safe : 'a t -> int -> 'a option
-(** [get_safe a i] returns [Some a.(i)] if [i] is a valid index
-    @since 0.18 *)
+val getSafe : 'a t -> int -> 'a option


getOption might be a better name here

I would encourage to follow the approach of always making the defaults safe, eg. get always returns an option, and then you can add getOrRaise for efficiency.

glennsl · 2017-04-22T17:23:26Z

src/containers/barray.mli

-val fold : ('a -> 'b -> 'a) -> 'a -> 'b t -> 'a
-
-val foldi : ('a -> int -> 'b -> 'a) -> 'a -> 'b t -> 'a
+val foldi : f:('a -> int -> 'b -> 'a) -> acc:'a -> 'b t -> 'a


Should be reducei. Or reduceWithIndex to be really verbose about it?

glennsl · 2017-04-22T17:23:42Z

src/containers/barray.mli

 (** Fold left on array, with index *)

-val fold_while : ('a -> 'b -> 'a * [`Stop | `Continue]) -> 'a -> 'b t -> 'a
+val foldWhile : ('a -> 'b -> 'a * [`Stop | `Continue]) -> 'a -> 'b t -> 'a


reduceWhile

you can avoid the need for two functions, by using Imm.re's definition of reduce:

let reduce: while_::('acc => a => bool)? => ('acc => a => 'acc) => 'acc => t => 'acc;

I also much prefer the while check returning a boolean

changed it to val reduceWhile : ('a -> 'b -> 'a * bool) -> 'a -> 'b t -> 'a where the boolean value indicate whether we should stop or not (true to stop, false to continue).

glennsl · 2017-04-22T17:24:07Z

src/containers/barray.mli

-    @since 0.8 *)
-
-val iter : ('a -> unit) -> 'a t -> unit
+    indicated by the accumulator *)

 val iteri : (int -> 'a -> unit) -> 'a t -> unit


forEachi or forEachWithIndex

forEachWithIndex

glennsl · 2017-04-22T17:25:27Z

src/containers/barray.mli

-    @since 0.8 *)
-
-val iter : ('a -> unit) -> 'a t -> unit
+    indicated by the accumulator *)

 val iteri : (int -> 'a -> unit) -> 'a t -> unit

 val blit : 'a t -> int -> 'a t -> int -> int -> unit


This mutates, so we might want to remove it, or at least mark it very clearly as being mutating.

I would love to have the minimal mutable subset of array in core (blit and pals), and then in containers you can add whatever you like for containers to containers itself.

The name is a bit weird, took me a while to understand what it does.

glennsl · 2017-04-22T17:30:04Z

src/containers/barray.mli

-
-val pp_i: ?sep:string -> (int -> 'a printer) -> 'a t printer
-(** Print an array, giving the printing function both index and item *)
+val toIterator: 'a t -> 'a Iterator.t


An iterator is really just a "view" of or interface to the data structure, not itself a separate data structure, so I don't think toIterator is a good name for this. getIterator or makeIterator, or perhaps just iterator would be more fitting I think.

I used toSequence in immutable.re.

changed to toSequence

glennsl · 2017-04-22T17:32:05Z

src/containers/barray.mli

 (** Map each element into another value, or discard it *)

-val flat_map : ('a -> 'b t) -> 'a t -> 'b array
+val flatMap : ('a -> 'b t) -> 'a t -> 'b array
 (** Transform each element into an array, then flatten *)

 val (>>=) : 'a t -> ('a -> 'b t) -> 'b t


These should be removed.

is the intention to remove non-alpha methods from the library to promote using the actual words (bind, pure etc.)?

yep, random symbols are pretty hostile to newcomers, and they can easily be defined in your own projects or in an auxillary library.

glennsl · 2017-04-22T17:34:11Z

src/containers/barray.mli


-val except_idx : 'a t -> int -> 'a list
+val exceptIndex : 'a t -> int -> 'a list
 (** Remove given index, obtaining the list of the other elements *)

 val (--) : int -> int -> int t


This actually seems pretty damn useful, and even kind of intuitive, but an ordinary range function is probably still better.

I suggest js slice which is not too hard to implement.

range can be just an alias of --^, or the other way around.

glennsl · 2017-04-22T17:42:57Z

src/core/collection.ml

+module type VectorLike = sig 
+    type 'a t 
+
+    val mem: 'a Equality.t -> 'a t -> 'a -> bool 


I've used this recently, and struggled to find this, so I should be able to remember what this does. But even seeing the name and type signature I still could not. Lol, that's how bad this name is. contains is a nice name, though javascript has named it includes for some silly reason.

glennsl · 2017-04-22T17:50:59Z

src/core/collection.ml

@@ -0,0 +1,16 @@
+
+module type VectorLike = sig 


Not a big fan of this name, or this interface. It doesn't seem very well defined. It seems kind of like an Iterable, but an Iterable would not have a property-like length function, but instead use a verb like count to imply that an action needs to be performed. It also should not have an isEmpty function, but perhaps instead an any function (which is probably a better name for exists if the predicate is optional)

Removed this module for now.

glennsl · 2017-04-22T18:08:42Z

I also pretty much concur with everything @bordoley said

mransan · 2017-05-11T13:59:07Z

A couple of general comment (around the same theme)

"fromList is inefficient": this is all relative and really up to the user to decide whether it meets its performance requirement of not. The documentation should specify the complexity of the operation but I believe we should still have such function even though there exist more efficient (and more complex) solution to the problem. What if I am writing a quick and simple program where performance does not matter (unit test, little script ...)? (It's perfectly fine btw to advertise better solutions in the documentation but the decision is eventually left to the user).
usage of unsafeGet/Set, I very much like those options to be available. There are many use cases that are perfectly fine using those unsafe API (generated code, algorithm where the boundary are check ahead of accessing the elements, etc). [@@undefined_to_opt] is still not as efficient as an unsafe access.
Can we rename collection to bcollection.

ghost · 2017-05-12T09:20:02Z

Hey sorry for the delay, was on holiday for a while, I think this is ready for another review, I cleaned up the interface and added more unit tests. @glennsl @bordoley

mransan · 2017-05-12T13:14:54Z

src/containers/barray.mli

+val flatMap : ('a -> 'b t) -> 'a t -> 'b t
+(** Transform each element into an t, then flatten *)
+
+val (--) : int -> int -> int t


Should we put operators in 'Infix' or 'Ops' sub modules. The motivation for this is that operators are only useful when the module they are defined in is opened. However opening a module is usually a bad practice so in order to limit the scope of what is open operator can be placed in a sub module.

Yes, I will put them in a Infix submodule.

mransan · 2017-05-12T13:16:02Z

src/core/comparable.mli

-  val compare:t Comparator.t
-end
-
+  val compare: t Comparator.t


Why does this file does not start with a "b" ?

I am trying to not add b when the module name does not conflict with the standard library. Do you think it's better to have a b for all the modules?

Only name directly conflicts with stdlib will be prepended.

You might very well conflicts with other people file names. It's much safer (and consistent) to start all files with the same prefix. If a user once to change the name in its program he can simply use module aliases.

We have a top level namespace file called Collection, so say if the user wants to use Sequence he will probably will have to alias Collection.Sequence. But I think for consistency I will change all the modules to start with a B.

mransan · 2017-05-12T13:16:31Z

src/core/sequence.ml

@@ -0,0 +1,41 @@
+


Shouldn't this file as well starts with "b"

mransan · 2017-05-12T13:21:45Z

src/barraylabels.mli

-
-(* The following is for system use only. Do not call directly. *)
-
-external unsafe_get : 'a array -> int -> 'a = "%array_unsafe_get"


Those are useful.

You still have them, as getOrRaise

unsafe_get don't raise exception in the standard library, no boundary check is performed, and behavior is left undefined if out of bound.

val unsafeGet : int -> 'a t -> 'a val unsafeSet : int -> 'a -> 'a t -> unit

Added them back with this signature for chained application style.

Could we rename this unsafeGetUnchecked, so it's a bit less vague?

I would discourage exposing this in the API personally. There are are uses for unsafe access, but at that point, I'd assume any user would be advance enough to provide their own implementation around the ffi interface.

glennsl

There's also a lot of bad formatting in the tests, and very inconsistent use of piped vs. non-piped forms

glennsl · 2017-05-13T15:53:15Z

src/containers/barray.mli

+    If the value of [x] is a floating-point number, then the maximum
+    size is only [Sys.max_array_length / 2].*)
+
+val init : int -> (int -> 'a) -> 'a t


This could be named makeWithInit perhaps?

I stuck with 'init' in imm.re. I don't think it matters much though either way in this case.

glennsl · 2017-05-13T15:54:08Z

src/containers/barray.mli

+val concat : 'a t list -> 'a t
+(** Same as [Barray.append], but concatenates a list of arrays. *)
+
+val slice : int -> int -> 'a t -> 'a t option


Why were the labels removed here? I think labels are even more useful here because it helps distinguish start/end from start/length.

Completely agree! As a general rule if you have more than one function parameter of a type then you should use label.

It's hard to figure out what labels should be preserved and what should be removed, there is no guideline on this as far as I can see. Perhaps we should have two modules, Array and ArrayLabeled as the stdlib/ocaml-containers? What's your opinion?

I think @mransan's rule is good. If the order is painfully obvious it might be ok to make an exception, but otherwise i'd just follow that rule.

immutable.re uses "start" and "end_"

I really dislike having two apis, one labelled one not. Smells of lack of decisiveness on the behalf of the API writer and is harder to learn.

I generally used name labels when they are needed to clarify the intent of the API signature. In this case, there is no way to know what the the various integer arguments do without labels.

glennsl · 2017-05-13T16:05:11Z

src/containers/barray.mli

+
+val map : ('a -> 'b) -> 'a t -> 'b t
+
+val mapi : (int -> 'a -> 'b) -> 'a t -> 'b t


mapWithIndex

glennsl · 2017-05-13T16:06:06Z

src/containers/barray.mli

+
+val count: ('a -> bool) -> 'a t -> int
+
+val iter : ('a -> unit) -> 'a t -> unit


glennsl · 2017-05-13T16:07:16Z

src/core/sequence.ml

+| Cons (x, r) -> (match p x with 
+    | Some c -> Cons (c, filterMap p r)
+    | None -> filterMap p r ()
+)


begin/end would be better than parentheses here IMO

glennsl · 2017-05-13T16:08:42Z

src/core/sequence.ml

+
+type +'a node =
+   | Nil
+   | Cons of 'a * 'a t


The indentation here is inconsistent with the rest of the pipes in this file

glennsl · 2017-05-13T16:12:22Z

src/core/sequence.ml

+
+let return x () = Cons(x, empty)
+
+let rec map f i () = match i () with 


I'm not fond of having match (or anything other than just function really) at the end of the line here. When I see a function declaration followed by a line starting with a pipe, I expect the last argument of the function to be implicitly pattern matched, ie. = function instead of = match ... with

First, why match ... with is bad here? I understand you mentioned it a couple of times before but I failed to grasp why this is a bad pattern. Second, I tried a bit, and found this one really hard to be rewritten using only function since we have to match against i ().

Sorry, I didn't mean it's bad to use match .. with here, just that the formatting is unfortunate because it looks very similar to this:

let rec map f i () = function | Nil -> Nil | Cons (x, r) -> Cons (f x, map f r)

I think it's bad practice more generally to place code of any significance at what is effectively a different indent level, but match ... with is especially unfortunate due to this similarity. So all I'm really asking is that you format it like this instead:

let rec map f i () = match i () with | Nil -> Nil | Cons (x, r) -> Cons (f x, map f r)

glennsl · 2017-05-13T16:14:06Z

src/containers/barray.ml

+
+let isEmpty a = length a = 0
+
+external unsafe_sub : 'a array -> int -> int -> 'a array = "caml_array_sub"


This should be camelCased too, even if it's not exposed.

glennsl · 2017-05-13T16:14:36Z

src/containers/barray.ml

+let length = Array.length
+
+let get i a =
+  if i>=0 && i<Array.length a


Inconsistent spacing around the infix operators.

glennsl · 2017-05-13T16:37:43Z

@mransan:

"fromList is inefficient": this is all relative and really up to the user to decide whether it meets its performance requirement of not. The documentation should specify the complexity of the operation but I believe we should still have such function even though there exist more efficient (and more complex) solution to the problem. What if I am writing a quick and simple program where performance does not matter (unit test, little script ...)? (It's perfectly fine btw to advertise better solutions in the documentation but the decision is eventually left to the user).

I think it's good design to make performance costs explicit, since it makes the implications apparent when reading the code. It's usually considered bad practice to optimize prematurely, and most people don't read the documentation for basic stuff like this anyway, so it's important to have this information at read-time, not just write-time.

There might be better solutions than inconvenience of course, but I don't think documentation is a good substitute for poor design, and I don't think it's a good idea to optimize the design for scripting either.

mransan · 2017-05-14T02:38:39Z

@glennsl "fromList" is not inefficient, it's just a convertor between 2 containers. If we're discussing 2 implementations differences of that function and one is more performant than another I would agree, but here we're talking whether converting a list to an array should be exposed at all.

A simple case could be that you use an API (3rd party) that returns a list but then you need to perform a one-time algorithm which require random access. Why not simply converting one to the other using the fromList function ? Are there better alternatives ? If so, why the fromList is not implemented in terms of them ?

glennsl · 2017-05-14T13:37:08Z

@mransan Conversion between any two containers is supported via sequence (xs |> Array.toSequence |> List.fromSequence). What you're requesting is that array and list should be special cased, which strongly implies that there is a more efficient conversion between them (in the absence of some other obvious reason for special treatment).

ghost · 2017-05-14T22:58:49Z

I lean towards not having fromList now since we have an iterator class Sequence which should play this role.

mransan · 2017-05-15T00:29:05Z

@glennsl @dorafmon, so fromSequence is:

let fromSequence i =
  of_list @@ Sequence.foldLeft (fun acc item -> acc @ [item]) [] i

So now I need to convert my list to a sequence which gets converted back to a list in order to
call the ofList function that has been removed from the interface.
a) it's slower
b) it's more complex to understand (ie new comers need to also understand sequence concept even for the 2 most popular containers in OCaml!)

I agree the unification of all containers to a general concept as a sequence is elegant and theoretically pleasing, but sometimes I just think it defeats the purpose of simplicity.

bordoley · 2017-05-15T20:59:17Z

@mransan there is no reason that implementation can't be changed not to convert to a list. It's basically impossible to implement Array.fromList/Seq, efficiently because it's always going to involve numerous o(n) copies. Another option is to require the caller pass the size as an argument to the Array.from* function, and raise an exception if you can't pull enough items from the list/seq to fill the array.

ghost · 2017-05-17T23:03:12Z

@mransan @bordoley If we are only considering the BS case, then we can implement fromSequence like this:

let fromSequence i =
    let open Bsequence in
    let arr = ref [||] in
    let rec aux i =
        match i () with
        | Nil -> !arr
        | Cons(x, r) -> ignore @@ Js.Array.push x !arr; aux r  
    in aux i

JS offers variable length array and thus it made it this possible. We only need 1 copy of the array in total.

ghost · 2017-05-17T23:25:11Z

A summary of outstanding issues and what I am thinking:

@glennsl mentioned the formatting issue, I am using ocp-indent and I have checked in the configuration file. I think we should leave the formatting up to the autoformatter. @glennsl could you check and see if you see any particular issue we should address?
@mransan mentioned we should prefix modules with b. There are a few modules that do not follow this guideline. I will change it in a seperate PR.

Apart from these two can we get this PR merged if no other objection? I think the interface is nice now and I the tests should provide at least basic coverage.

Thanks a lot!

bordoley · 2017-05-18T00:54:10Z

@dorafmon: Under the covers JSArray is effectively a C++ vector. There are optimizations for small arrays (and you can implement these yourself as well) but pushing into large arrays can also cause significant amounts of copying. You also sacrifice some array optimizations. This article has more details (old but still mostly accurate): https://gamealchemist.wordpress.com/2013/05/01/lets-get-those-javascript-arrays-to-work-fast/ See item 4.

For large datasets, it's likely that pushing into an immutable vector such as those implemented by Imm.re/Immutable.js will be faster than pushing into a native js array.

ghost · 2017-05-18T07:16:09Z

@bordoley out of curiosity I assume the immutable vector has a O(1) conversion to arrays?

bordoley · 2017-05-18T16:28:54Z

@dorafmon I'll eat my words a little. I ran some perf tests last night, and indeed Array.push is fast (in absolute time) for pushing large number of items (faster than I thought it would be), beating even push into a Transient Vector by 30 to 50%. I tested with Arrays where n = 7M on node. This surprised me. I looked into the BS stdlib, and I suspect that I could optimize some array CopyOnWriteArray functions to use native implementations that could achieve equal or better performance.

That said push is still not "efficient" in absolute terms as pre-allocating a single array and then inserting elements into it should be faster since it avoids generating garbage.

Imm.re vectors do not have an O(1) conversion to array. However, it's possible to implement an conversion via one allocation, and O(N) copies. This is due to the fact that Imm.re vectors have an O(1) count operation.

mransan · 2017-05-18T17:41:55Z

@bordoley efficiency is relative and depends on the context.

Take this example: you call a third party API (no control over it) which returns a list, you then want to transform it into an array to make an expensive computation which required random access of the elements. So you must convert a 'a list to 'a array. What are the choices:

use fromList: does the minimum amount of allocations and has linear complexity. (yes 2 iterations one to get length and one to copy).
use to/fromSequence: duplicate the list into another list (linear number of allocation)
use Immuntable List/Array/Vector as intermediate data structure to get to an array: no matter what you will need to allocate more stuff.

So now in this example (which by the way is very common), the most efficient solution is a plain old fromList. Maybe I miss something :) let me know!

glennsl

@dorafmon The formatting of the tests are still completely borked.

I really don't understand why you insist on using ocp-indent, since to me it seems to cause more problems than it solves. Whatever you use is fine by me as long as the end result is good, but right now the tests are pretty bad.

glennsl · 2017-05-18T18:07:11Z

.ocp-indent

+# Number of spaces used in all base cases, for example:
+#    let foo =
+#    ^^bar
+base = 4


Why 4? Isn't 2 spaces eessentially the standard indentation for OCaml projects? I don't think I've seen a project using 4 spaces yet.

glennsl · 2017-05-18T18:14:16Z

src/containers/barray.mli

+
+val unsafeGet : int -> 'a t -> 'a
+
+val unsafeSet : int -> 'a -> 'a t -> unit


I don't have a strong opinion on whether or not to include these, but if they're included I'd like them to be named unsafeGetUnchecked and unsafeSetUnchecked

glennsl · 2017-05-18T18:21:13Z

src/core/bsequence.ml

+
+let empty () = Nil
+
+let return x () = Cons(x, empty)


I've renamed this make in bopt and bresult. On the fence about that, but I don't think return is a very good name.

Well, return reflects that a sequence is monadic, but I guess it is okay to rename it something else.

this is a good point. While I understand wanting to avoid a lot of the category theory nomenclature, it might be good to keep the basics to it's clear what the structures being used actually are? return, pure or something like that are pretty standard in FP.

Good for who? Should we optimize for the few that have already demonstrated a great amount of perseverance by learning category theory, or the many who have not?

As an example, to call an external function that accepts a nullable value, e.g external f : int Js.null -> unit = ... you currently have to write (and read) f (Js.Null.return 0). That's not a great experience and severely undermines what's otherwise a damn good FFI system. Most also seem to get along just fine with the option monad without needing a return function.

That's not to say it wouldn't be good to have a standard monad "interface"/module signature, but I don't see great benefit in that interface adopting standard terminology, compared to the benefit of avoiding incomprehensible terminology.

ah, I think we might be agreed then. I don't really care about the terminology, but I think the interface is very valuable, both the beginners and more experienced devs alike.

As long as there is a standard to adhere to that is consistent inside this repo, I can mentally replace bind and return with make and flatMap or andThen or whatever the best name is decided to be.

glennsl · 2017-05-18T18:24:19Z

src/core/bsequence.ml

+    match i () with
+    | Nil -> Nil
+    | Cons (x, r) -> begin
+            match p x with 


Too much indentation. I also think I'd prefer begin on the same line as match:

begin match p x with | Some c -> .. | None -> ... end

Maybe that'll fix it for ocp-indent.

glennsl · 2017-05-18T18:24:45Z

src/core/bsequence.ml

+    | Nil -> acc
+    | Cons (x, next) -> foldLeft trans (trans acc x) next
+
+let rec iter f i = 


glennsl · 2017-05-18T18:28:52Z

src/core/bsequence.ml

+    | Nil -> Nil
+    | Cons (this, next) -> append (trans this) (flatMap trans next)
+
+let rec foldLeft trans acc iter = 


ghost · 2017-06-04T11:28:23Z

@bordoley @mransan Did some benchmarking, the code is here. So tired fromList and fromSequence on a large array and not surprisingly fromSequence is slower (around 5 times usually on my Mac). I used node's profiler and think the reason is that we grow the array too often.
So I modified the code a bit so it allocate a large array beforehand and then assign values to the array. But this version turns out to be even slower. I think we should include fromList, toList as the conversion happens a lot and in this case fromSeq/toSeq seems still suffers from performance.

ghost · 2017-06-04T13:08:32Z

@glennsl I reformated the code a bit, especially the test. Is it better now?

glennsl · 2017-06-07T01:11:54Z

@dorafmon The last half or so of barray_test is better. Not as good as before you started messing with it, but good enough. bsequence_test, int_test and the first half of barray_test is still pretty bad.

There's also lots of remaining issues with 4-space indentation, even if you've fixed the config the code itself hasn't been reformatted.

ghost · 2017-06-11T18:14:16Z

@glennsl I overhauled the whole formatting a bit. Should be a bit better.

glennsl

Looks much better, but there's still some issues and a few files have gotten much worse for some reason.

glennsl · 2017-06-12T06:34:14Z

perf/bchar_perf.ml

-    );
+    let _ = (try Bchar.getDigitOrRaise '1'
+      with
+      | Failure _ -> 1) in ();


This doesn't look very good.

Either

let _ = ( try Bchar.getDigitOrRaise '1' with | Failure _ -> 1) in ();

or

let _ = ( try Bchar.getDigitOrRaise '1' with Failure _ -> 1) in ();

or perhaps using begin/end instead of parens would be better

glennsl · 2017-06-12T06:35:32Z

perf/bsequence_perf.ml

+	);
+	runPerfTest "Sequence.foldLeft" (fun _ ->
+		Bsequence.reduce (fun acc x -> x::acc) [] it
+	)


These have actually gotten worse... Not sure why that would be

glennsl · 2017-06-12T06:37:30Z

src/core/comparator.ml

+  (if this < that
+   then Ordering.Less
+   else if this > that then Ordering.Greater else Ordering.Equal : 
+                            Ordering.t)


This is very inconsistent.

glennsl · 2017-06-12T06:40:14Z

src/core/comparator.ml

+  (fun x  ->
+     fun y  -> if (comparator x y) == Ordering.equal then true else false : 
+                                      'a Equality.t)


I don't understand why this is explicitly curried, it would probably format better if it was not. Also weird spacing before the arrows.

glennsl · 2017-06-12T06:41:58Z

tests/bchar_test.ml

+       |> Expect.toEqual '0');
+  test "fromIntOrRaise - 256" (fun () ->
+    (fun () ->
+    256 |> fromIntOrRaise)


This (and the next test) should have the body of the lambda indented one more level

glennsl · 2017-06-12T06:42:58Z

tests/bsequence_test.ml

+      [|1 ; 2; 3|] |> Barray.toSequence
+      |> Barray.fromSequence
+      |> Expect.toEqual [|1; 2; 3|]


All the tests from this to the end of the file are still off

glennsl · 2017-06-12T06:43:36Z

tests/int_test.ml

+		test "neg - 42" (fun () ->
+			42 |> neg 
+			   |> Expect.toEqual (~-42)
+		  );


Here again there's suddenly an explosion of indentation in this entire file...

glennsl · 2017-06-12T06:46:48Z

src/containers/barray.mli

+(** Transform each element into an t, then flatten *)
+
+module Infix : sig
+    val (--) : int -> int -> int t


This file still has too much indentation. There's not very many places where it actually shows though.

ghost requested review from bordoley and glennsl April 22, 2017 13:43

bordoley reviewed Apr 22, 2017

View reviewed changes

glennsl reviewed Apr 22, 2017

View reviewed changes

dorafmon added 4 commits April 26, 2017 23:50

[barray] continue refactoring

0b65167

[barray] add unit tests

a1304be

[barray] add unit tests

d9360c5

[barray] toSequence test

a0f6caa

[barry] fix CI

7cd7757

mransan reviewed May 12, 2017

View reviewed changes

glennsl requested changes May 13, 2017

View reviewed changes

[barray] per review

f858aa5

[barray] per review

2fb7eb7

[barray] change format a little bit

53a38c8

glennsl requested changes May 18, 2017

View reviewed changes

[barray] format change and sequence refactor

9f26f25

[barray] reformat

ba284d3

glennsl reviewed Jun 12, 2017

View reviewed changes


		(* The following is for system use only. Do not call directly. *)

		external unsafe_get : 'a array -> int -> 'a = "%array_unsafe_get"


		val map : ('a -> 'b) -> 'a t -> 'b t

		val mapi : (int -> 'a -> 'b) -> 'a t -> 'b t


		val count: ('a -> bool) -> 'a t -> int

		val iter : ('a -> unit) -> 'a t -> unit


		let return x () = Cons(x, empty)

		let rec map f i () = match i () with


		let isEmpty a = length a = 0

		external unsafe_sub : 'a array -> int -> int -> 'a array = "caml_array_sub"


		val unsafeGet : int -> 'a t -> 'a

		val unsafeSet : int -> 'a -> 'a t -> unit

[barray] adopt the barray module #38

Are you sure you want to change the base?

[barray] adopt the barray module #38

Conversation

ghost commented Apr 22, 2017

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

glennsl left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

glennsl commented Apr 22, 2017

mransan commented May 11, 2017

ghost commented May 12, 2017

mransan May 12, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mransan May 14, 2017 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

ghost May 14, 2017 • edited by ghost Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

glennsl left a comment • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

mransan May 12, 2017 •

edited

Loading

mransan May 14, 2017 •

edited

Loading

ghost May 14, 2017 •

edited by ghost

Loading

glennsl left a comment •

edited

Loading