Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing #[] in BitArray #3968

Closed
jzakiya opened this issue Feb 1, 2017 · 10 comments
Closed

Missing #[] in BitArray #3968

jzakiya opened this issue Feb 1, 2017 · 10 comments

Comments

@jzakiya
Copy link

jzakiya commented Feb 1, 2017

I originally posed this on the crystal google groups forum and it was suggested I raise this as an issue.
https://groups.google.com/forum/#!topic/crystal-lang/Iiqgrgvca8M


In trying to use bit arrays (0.20.5) I've noticed a number of bugs and inconsistencies
in expected behavior compared to regular arrays which use the same methods.

a = BitArray.new(10) --> [false, false, false, false, false, false, false, false, false, false]

a[0]= true  --> [true, false, false, false, false, false, false, false, false, false]

b = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

b.first(5) -->  [0, 1, 2, 3, 4]

a.first(5) --> [true, false, false, false, false]

b.last(5) --> [5, 6, 7, 8, 9]

a.last(5) -->  Error in sozcore2test.cr:107: wrong number of arguments for 'BitArray#last' (given 1, expected 0)
Overloads are:
 - Indexable(T)#last()
 - Indexable(T)#last(&block)

Then there are indexing issues using bit arrays.

b[5..-1] --> [5, 6, 7, 8, 9]

a[5..-1] --> Error in sozcore2test.cr:109: no overload matches 'BitArray#[]' with type Range(Int32, Int32)
Overloads are:
 - Indexable(T)#[](index : Int)

According to BitArray API, https://crystal-lang.org/api/0.20.5/BitArray.html
these are inherited from the appropriate modules that work with Arrays.

Are these problems with BitArrays just a consequence of the current age of the language, and planned
to be fixed, or are these the intended different behavior compared to other Arrays?

@mverzilli
Copy link

They definitely look like bugs :).

@mverzilli mverzilli added kind:bug A bug in the code. Does not apply to documentation, specs, etc. topic:stdlib labels Feb 2, 2017
@asterite
Copy link
Member

asterite commented Feb 8, 2017

From wikipedia:

A bit array (also known as bitmap, bitset, bit string, or bit vector) is an array data structure that compactly stores bits. It can be used to implement a simple set data structure.

That is, the use case of a bit array is to count things, to know if a given object is inside a set. For that, you consider each position in the bit array to be a different object of interest. For example the compiler uses this to determine if all types are covered by method overloads.

With this in mind, you normally don't want to get a piece of a bit array. You don't use it like an array. In Java there's no such functionality. Neither in C#. It's not that this was overlooked, it's just that the use case of a BitArray is not that of an Array.

That said, we could implement #[]. The question is whether this allocates new memory for the data or it still points to the old one.

@jzakiya It would be interesting to know what's your use case here and why you need this slicing functionality.

@asterite asterite changed the title BitsArray bugs and inconsistencies Missing #[] in BitsArra Feb 8, 2017
@asterite asterite changed the title Missing #[] in BitsArra Missing #[] in BitArray Feb 8, 2017
@asterite
Copy link
Member

asterite commented Feb 8, 2017

I also changed the title: there's no bug or inconsistency here, we are just missing an #[] method.

(with what I said above, maybe even BitArray shouldn't be Enumerable at all)

@jzakiya
Copy link
Author

jzakiya commented Feb 8, 2017

I think you need to see and appreciate this issue from the perspective of a user.

First, you can not restrict the use of any resource that is provided to just what you think it will/should be used for. If you create it people will try to use if for things you might have never considered.

A BitArray is presented as an array, which is a collection, which according to it's own documentation inherits a host of methods from other modules, including Enumerables. If so, it should work with them.
It's not logical that you can do: bitary.first(5) but not bitary.last(5).

I do a lot of numerical heavy applications. I use arrays to represents data that is boolean in nature, i.e, 1|0, true|false. I can significantly reduce memory usage by using BitArrays, over arrays of Ints, where I'm more concerned with memory reduction versus speed.

From a user's perspective, or at least MY USER PERSPECTIVE, it is a bug, and certainly inconsistent, to not be able to manipulate a BitArray like an Array, especially when the documentation states they inherit methods from the same modules. Either create accurate documentation to explain how the resource actually behaves or make the resource behavior match its documentation.

@mverzilli
Copy link

mverzilli commented Feb 8, 2017

That's interesting, maybe we could rename BitArray to Bitmap to avoid these kinds of false expectations. The name could be misleading if the user doesn't know the context discussed here and in https://en.wikipedia.org/wiki/Bit_array.

So why don't we just:

  1. Rename the class to Bitmap.
  2. Add a reference in the docs to the Wikipedia article. Maybe even include some of the insights provided by @asterite in the class documentation for the benefit of future users.
  3. Stop including Indexable, unless we provide sound implementations for all methods in that module.

@RX14
Copy link
Contributor

RX14 commented Feb 8, 2017

If we can implement the full array interface, is there any reason why not?

@ysbaddaden
Copy link
Contributor

Bitmap can be confusing too (images). I think I'd choose BitSet instead, and/or document how specific it is (i.e. not to be considered an Array).

@asterite asterite added kind:feature and removed kind:bug A bug in the code. Does not apply to documentation, specs, etc. labels Feb 14, 2017
@miketheman
Copy link
Contributor

#codetriage
This appears to have turned into a feature, as we don't have the #[] in BitArray.

At the top of src/bit_array.cr, we can see:

BitArray includes all the methods in Enumerable

So this either is a documentation issue, where we change Enumerable to Indexable, or ensure that BitArray should indeed include all Enumerable methods.

@RX14
Copy link
Contributor

RX14 commented May 10, 2017

@miketheman #[] methods aren't defined on Enumerable. Neither are ranged #[] methods defined on Indexable.

@refi64
Copy link
Contributor

refi64 commented May 10, 2017

FWIW I personally think that BitSet would be a great name...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

7 participants