Iterator for DAC vectors #1366

phillipov · 2022-02-23T20:46:22Z

We want to use range-based for-loops on array fields (in procedural code), so I implemented a const_iterator for DAC vectors (the DAC API behind "array fields").

DAC code before:

for (uint32_t i = 0; i < customer.sales_by_quarter().size(); ++i)
{
    gaia_log::app().info("{}", customer.sales_by_quarter()[i]);
}

and after:

for (int sale : customer.sales_by_quarter())
{
    gaia_log::app().info("{}", sale);
}
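For readers unfamiliar with what a range-based for-loop needs from a container, here is a minimal sketch. It is not the code from this PR: `vector_view_t`, its nested `const_iterator`, and `sum_sales` are hypothetical names, and the iterator implements only the three operations the loop itself uses (`operator*`, `operator++`, `operator!=`).

```cpp
#include <cstdint>

// Hypothetical stand-in for a DAC vector: a read-only view over a buffer.
template <typename T>
class vector_view_t
{
public:
    class const_iterator
    {
    public:
        const_iterator(const T* base, uint32_t index)
            : m_base(base), m_index(index)
        {
        }

        const T& operator*() const { return m_base[m_index]; }

        const_iterator& operator++()
        {
            ++m_index;
            return *this;
        }

        // Only the index is compared; both iterators are assumed to share a base.
        bool operator!=(const const_iterator& rhs) const { return m_index != rhs.m_index; }

    private:
        const T* m_base;
        uint32_t m_index;
    };

    vector_view_t(const T* data, uint32_t size)
        : m_data(data), m_size(size)
    {
    }

    // begin() and end() are what the range-based for-loop looks up.
    const_iterator begin() const { return const_iterator(m_data, 0); }
    const_iterator end() const { return const_iterator(m_data, m_size); }

private:
    const T* m_data;
    uint32_t m_size;
};

// Sums the elements with a range-based for-loop.
inline int64_t sum_sales(const vector_view_t<int32_t>& sales)
{
    int64_t total = 0;
    for (int32_t sale : sales)
    {
        total += sale;
    }
    return total;
}
```

A full random access iterator would also need the standard member typedefs and the arithmetic and comparison operators discussed in the review below.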

Out of scope

In this PR, I will not:

- Add more convenience functions to dac_vector_t
- Write Random Access Iterator conformance tests for dac_vector_const_iterator_t
  - I wrote iterator conformance tests during my internship. However, they A) are not generic enough for me to easily include this iterator in them and B) only reach Forward Iterators for conformance, not Random Access Iterators.

If we want those out-of-scope items, put new JIRA tasks on the backlog.

senderista · 2022-02-24T00:05:16Z

production/inc/gaia/direct_access/dac_array.inc

+
+/**
+ * Normally `operator[]` should return a reference or const reference
+ * to the array element. Given we only support arrays of basic types and the


Really? We don't support vectors of strings? Flatbuffers does...

I think we don't support them yet through DDL, but I don't remember exactly. @chuan can confirm.

We should be able to support strings. I don't remember any blocking technical reasons not to.

senderista · 2022-02-24T00:08:05Z

production/inc/gaia/direct_access/dac_array.inc

+dac_vector_t<T_type>::dac_vector_t(const flatbuffers::Vector<T_type>* vector_ptr)
+    : m_vector(vector_ptr)
+{
+    static_assert(std::is_arithmetic<T_type>::value, "dac_vector_t only supports basic types!");


I don't think "basic type" is sufficiently descriptive. At least I would have assumed that it included strings. If you're going to use std::is_arithmetic as the type bound, then the message should directly reflect its semantics: "dac_vector_t only supports integer and floating-point types!"

Perhaps say "scalar types", like flatbuffers does?
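To make the semantics of the type bound concrete, here is a small sketch. `scalar_vector_t` and `is_supported_element` are hypothetical names used only for illustration; the only library facility involved is `std::is_arithmetic`, which is true for integral and floating-point types (including `bool` and the character types) and false for `std::string`.

```cpp
#include <string>
#include <type_traits>

// Hypothetical wrapper, only to show the constraint and a message that matches it.
template <typename T>
struct scalar_vector_t
{
    static_assert(
        std::is_arithmetic<T>::value,
        "scalar_vector_t only supports integer and floating-point types!");
};

// Hypothetical helper exposing the same check as a value.
template <typename T>
constexpr bool is_supported_element()
{
    return std::is_arithmetic<T>::value;
}

// Arithmetic types pass the check; std::string does not.
static_assert(is_supported_element<int>(), "int is arithmetic");
static_assert(is_supported_element<double>(), "double is arithmetic");
static_assert(!is_supported_element<std::string>(), "std::string is not arithmetic");
```

Instantiating `scalar_vector_t<std::string>` would fail to compile with the message above, which is why the wording of the message matters.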

senderista · 2022-02-24T00:11:07Z

production/inc/gaia/direct_access/dac_array.inc

+template <typename T_type>
+typename dac_vector_const_iterator_t<T_type>::reference dac_vector_const_iterator_t<T_type>::operator*() const
+{
+    return *(m_iterator_data + m_index);


No bounds checking for m_index? Surely that would be trivial to add?

Are we OK with the runtime penalty for bounds checking?

I think all indexed structures should be bounds-checked by default, with a possible opt-out if it can be demonstrated that performance requirements require bounds checking to be omitted. Our system introduces unavoidable overheads (e.g., from transaction logging and copy-on-write updates) that should dwarf any possible overhead from bounds checking. If you can demonstrate a microbenchmark that indicates otherwise, I would certainly reconsider.

senderista · 2022-02-24T00:11:24Z

production/inc/gaia/direct_access/dac_array.inc

+template <typename T_type>
+typename dac_vector_const_iterator_t<T_type>::pointer dac_vector_const_iterator_t<T_type>::operator->() const
+{
+    return m_iterator_data + m_index;


See above, why no bounds checking?

senderista · 2022-02-24T00:11:59Z

production/inc/gaia/direct_access/dac_array.inc

+template <typename T_type>
+dac_vector_const_iterator_t<T_type>& dac_vector_const_iterator_t<T_type>::operator++()
+{
+    ++m_index;


Wouldn't it be trivial to add bounds checking on m_index?

Or does it not really matter until a dereference? In that case we should add the bounds checking there.

We need to be able to iterate until we hit the iterator returned by end(). From what I can tell, this will be an index equal to the size of the array.
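One way to reconcile the two points above is to leave increment unchecked, so the iterator can reach end(), and to validate the index only on dereference. The following is a sketch under that assumption, not the PR's implementation: `checked_const_iterator_t` is a hypothetical class, and carrying the size in `m_size` and throwing `std::out_of_range` are illustrative choices.

```cpp
#include <cstdint>
#include <stdexcept>

// Hypothetical iterator that carries the vector size so it can validate m_index.
template <typename T>
class checked_const_iterator_t
{
public:
    checked_const_iterator_t(const T* base, uint32_t index, uint32_t size)
        : m_base(base), m_index(index), m_size(size)
    {
    }

    // Dereferencing end(), or anything past it, throws instead of reading out of bounds.
    const T& operator*() const
    {
        if (m_index >= m_size)
        {
            throw std::out_of_range("iterator index is out of bounds");
        }
        return m_base[m_index];
    }

    // Incrementing is unchecked here so that the iterator can reach end(),
    // whose index equals the size of the array.
    checked_const_iterator_t& operator++()
    {
        ++m_index;
        return *this;
    }

    bool operator!=(const checked_const_iterator_t& rhs) const
    {
        return m_index != rhs.m_index;
    }

private:
    const T* m_base;
    uint32_t m_index;
    uint32_t m_size;
};
```

The cost is one extra 32-bit member per iterator and one comparison per dereference.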

senderista · 2022-02-24T00:12:58Z

production/inc/gaia/direct_access/dac_array.inc

+typename dac_vector_const_iterator_t<T_type>::difference_type dac_vector_const_iterator_t<T_type>::operator-(
+    const dac_vector_const_iterator_t<T_type>& rhs) const
+{
+    return m_index - rhs.m_index;


Overflow checking?
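As an illustration of what such a check might guard against: the header excerpt quoted elsewhere in this review declares m_index as uint32_t, and subtracting two uint32_t values wraps around when the right-hand side is larger. A minimal sketch, using a hypothetical free function `index_difference` and assuming a signed 64-bit result is acceptable, widens both operands before subtracting.

```cpp
#include <cstdint>

// Hypothetical helper: signed distance between two uint32_t indexes.
inline int64_t index_difference(uint32_t lhs, uint32_t rhs)
{
    // Widen before subtracting so that lhs < rhs yields a negative result
    // instead of wrapping around in unsigned arithmetic.
    return static_cast<int64_t>(lhs) - static_cast<int64_t>(rhs);
}
```

Every possible pair of uint32_t indexes has a difference that fits in int64_t, so no further range check is needed for this particular operation.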

senderista · 2022-02-24T00:13:35Z

production/inc/gaia/direct_access/dac_array.inc

+template <typename T_type>
+dac_vector_const_iterator_t<T_type>& dac_vector_const_iterator_t<T_type>::operator--()
+{
+    --m_index;


No underflow checking?

See comment above. We need to be able to iterate until we hit the `rend` of the array (reverse iterator end). To that end, we also need to add an `rend` method.

senderista · 2022-02-24T00:14:06Z

production/inc/gaia/direct_access/dac_array.inc

+dac_vector_const_iterator_t<T_type>& dac_vector_const_iterator_t<T_type>::operator+=(
+    dac_vector_const_iterator_t<T_type>::difference_type rhs)
+{
+    m_index += rhs;


Overflow checking?

senderista · 2022-02-24T00:14:49Z

production/inc/gaia/direct_access/dac_array.inc

+dac_vector_const_iterator_t<T_type>& dac_vector_const_iterator_t<T_type>::operator-=(
+    dac_vector_const_iterator_t<T_type>::difference_type rhs)
+{
+    m_index -= rhs;


Underflow checking?

LaurentiuCristofor · 2022-02-24T01:24:39Z

production/inc/gaia/direct_access/dac_array.hpp

+
+// Forward declaring so `dac_vector_const_iterator_t` can use it.
+template <typename T_type>
+class dac_vector_t;


Did we decide if we want to rename the file to match the class name?

I think we can do this change independently of any other considerations.

chuan · 2022-02-24T01:29:04Z

production/inc/gaia/direct_access/dac_array.hpp

+    const T_type* m_iterator_data;
+    uint32_t m_index;


Isn't it enough to just use a pointer for the iterator implementation?

I think so, but I also think that it's worth adding boundary checks on the index, so I'd keep the index and add the vector size as well here. Just because the standard iterator interface allows some undefined behavior doesn't mean that we should not provide a safer implementation.

Forgive my ignorance: we use m_index as an offset from the base pointer. How do we know uint32_t is always the right type?

LaurentiuCristofor

This is a good starting implementation, but I agree with Tobin's comments: the iterator operations should throw out-of-bounds exceptions for bad index parameters.

LaurentiuCristofor · 2022-02-24T17:39:23Z

production/inc/gaia/direct_access/dac_array.hpp

+    bool operator>=(const dac_vector_const_iterator_t<T_type>& rhs) const;
+
+private:
+    const T_type* m_iterator_data;


Given that this is part of an "iterator" class, we could drop the "iterator" part and just call this m_data.

Actually, from looking closer at the implementation, something like m_base_data might be better. I initially thought like @chuan that this references the current element, rather than the base element.

LaurentiuCristofor

Approving based on discussions we had in today's db syncup. Bounds checking can be added later.

simone-gaia

LGTM, nice and clean code. None of my comments are blocking.

Please address the outstanding comments (e.g., bounds checking).

simone-gaia · 2022-02-25T12:37:10Z

production/direct_access/tests/test_array_fields.cpp

+{
+protected:
+    array_field_test()
+        : db_catalog_test_base_t(std::string("addr_book.ddl")){};


It's not necessary to make std::string explicit (I believe...).

I probably would, just so folks know it's from std::. I don't know where someone is using namespace std in the included headers (hopefully none of ours), but I'm a fan of calling out std:: explicitly in code.

I'm not talking about std:: (which I like better too). I'm talking about the entire std::string; we do that almost nowhere...

simone-gaia · 2022-02-25T12:38:21Z

production/direct_access/tests/test_array_fields.cpp

+    const int32_t q1_sales = 200;
+    const int32_t q2_sales = 300;
+    const int32_t q3_sales = 500;


Is it necessary to keep these as separate consts? Seeing them, I would expect them to be used somewhere (besides filling the array), but they aren't. I would remove them and, if clang-tidy complains, suppress the warning.

simone-gaia · 2022-02-25T12:39:11Z

production/direct_access/tests/test_array_fields.cpp

+    gaia_id_t id = customer_t::insert_row(customer_name, sales_by_quarter);
+    txn.commit();
+
+    auto c = customer_t::get(id);


Writing customer instead of c won't hurt.

simone-gaia · 2022-02-25T12:40:29Z

production/direct_access/tests/test_array_fields.cpp

+    const int32_t q3_sales = 300;
+
+    auto_transaction_t txn;
+    auto w = customer_writer();


You can also write:

customer_writer w;

So that it's clear you are calling a class constructor.

simone-gaia · 2022-02-25T12:43:45Z

production/inc/gaia/direct_access/dac_array.hpp

+// Pick up our template implementation.  These still
+// need to be in the header so that template specializations
+// that are declared later will pick up the definitions.


No need to comment this, otherwise we should comment it everywhere.

simone-gaia · 2022-02-25T12:45:36Z

production/inc/gaia/direct_access/dac_array.hpp

+    const T_type* m_iterator_data;
+    uint32_t m_index;


Forgive my ignorance: we use m_index as an offset from the base pointer. How do we know uint32_t is always the right type?

daxhaw · 2022-03-03T20:46:27Z

production/direct_access/tests/test_array_fields.cpp

+    txn.commit();
+
+    std::vector<int32_t> sales;
+    for (auto sale : customer.sales_by_quarter())


I would also add an auto& sale : customer.sales_by_quarter() iteration test.
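A note on what such a test would exercise: when the iterator yields const references, `auto&` deduces a const reference, so the loop body cannot modify the elements. The sketch below demonstrates this with a const `std::vector<int>` as a stand-in, since the DAC types are not available here; `auto_ref_is_const` is a hypothetical helper name.

```cpp
#include <type_traits>
#include <vector>

// Returns true if `auto&` deduces `const int&` for every element
// when iterating over a const container.
inline bool auto_ref_is_const()
{
    const std::vector<int> sales = {200, 300, 500};
    bool all_const = true;
    for (auto& sale : sales)
    {
        all_const = all_const && std::is_same<decltype(sale), const int&>::value;
    }
    return all_const;
}
```

Whether the same deduction holds for the DAC iterator depends on its `reference` typedef, which this sketch does not model.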

daxhaw · 2022-03-03T20:48:52Z

production/inc/gaia/direct_access/dac_array.hpp

+    /**
+     * Constructs an iterator that does not point to any data.
+     */
+    dac_vector_const_iterator_t();


Can this just be = default?

daxhaw · 2022-03-03T20:52:09Z

production/inc/gaia/direct_access/dac_array.inc

+//
+
+template <typename T_type>
+dac_vector_const_iterator_t<T_type>::dac_vector_const_iterator_t()


Per my comment in the header, you can remove this implementation and just use the default one provided by C++. I would then add default member initializers for m_index and m_iterator_data.
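A minimal sketch of that suggestion follows. `default_iterator_t` and its accessors are hypothetical names; the member names `m_base` and `m_index` only loosely mirror the ones in the PR.

```cpp
#include <cstdint>

// Hypothetical iterator with a defaulted constructor and default member initializers.
template <typename T>
class default_iterator_t
{
public:
    // No hand-written body: the members below initialize themselves.
    default_iterator_t() = default;

    const T* base() const { return m_base; }
    uint32_t index() const { return m_index; }

private:
    const T* m_base{nullptr};
    uint32_t m_index{0};
};
```

With the initializers in place, a default-constructed iterator is in a well-defined state that points to no data.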

daxhaw · 2022-03-03T20:56:36Z

production/inc/gaia/direct_access/dac_array.inc

@@ -0,0 +1,206 @@
+/////////////////////////////////////////////


Just a thought: we know a priori every type we could support for this class, since we only support primitive types. We could have a .cpp file here that instantiates the specializations we support, which would also make it easier to change this code later.
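The technique described here is usually called explicit template instantiation. The sketch below shows it in a single listing, with comments marking what would go in the header and what would go in the .cpp file. `scalar_array_t` is a hypothetical class, and the list of element types is an assumption made for illustration, not the set the project supports.

```cpp
#include <cstdint>

// --- Would live in the header: declarations only. ---
template <typename T>
class scalar_array_t
{
public:
    explicit scalar_array_t(uint32_t size);
    uint32_t size() const;

private:
    uint32_t m_size;
};

// --- Would live in the .cpp file: member definitions. ---
template <typename T>
scalar_array_t<T>::scalar_array_t(uint32_t size)
    : m_size(size)
{
}

template <typename T>
uint32_t scalar_array_t<T>::size() const
{
    return m_size;
}

// Explicit instantiation definitions, one per supported element type.
// Code that includes only the header links against these.
template class scalar_array_t<int32_t>;
template class scalar_array_t<uint64_t>;
template class scalar_array_t<float>;
template class scalar_array_t<double>;
```

The trade-off is that using the template with a type missing from the list produces a link error rather than compiling on demand.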

daxhaw · 2022-03-03T21:00:41Z

production/inc/gaia/direct_access/dac_array.hpp

+    const dac_vector_const_iterator_t<T_type>& rhs);
+
+/**
+ * The base class of array fields.


Who inherits from dac_vector_t? Nobody, as far as I know, so this is not a base class.

daxhaw · 2022-03-03T21:07:08Z

production/direct_access/tests/test_array_fields.cpp

+    EXPECT_TRUE(std::equal(sales.begin(), sales.end(), customer.sales_by_quarter().data()));
+
+    sales.clear();
+    for (auto sales_iter = customer.sales_by_quarter().begin(); sales_iter != customer.sales_by_quarter().end(); ++sales_iter)


Since this is a "random access" iterator, we should add a test for iterating in reverse as well. This would involve adding rbegin and rend methods.

daxhaw

Overall looks good. I'm requesting the following changes:

- Since this is a random access iterator, we should be able to support reverse iteration. To that end, add rbegin and rend methods.
- [perhaps optional] If we only support scalar types, then we have a finite subset of template instantiations for the array. We could then provide a .cpp file instead of an .inc file with explicit template specializations.

daxhaw · 2022-03-10T01:15:51Z

production/inc/gaia/direct_access/dac_array.hpp

+ * @tparam T_type The type of `dac_vector_t` elements that the iterator traverses.
+ */
+template <typename T_type>
+class dac_vector_const_reverse_iterator_t


Wait, do you need an entirely new class here? I thought you only had to add rend and rbegin methods and then honor the operator- overloads.

It is modeled after std, where forward and reverse iterators are separate classes. It is possible to make a single iterator class with a flag that switches between forward and backward movement, but I believe that would just overcomplicate the code. Another option is to create a base class and push the common parts there, so these classes would be a little smaller.
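For comparison, an option not raised in this thread is the `std::reverse_iterator` adaptor, which derives a reverse iterator from any bidirectional iterator that provides the standard iterator traits. The sketch below is an illustration only: `sales_view_t` is a hypothetical class whose forward iterator is a plain pointer rather than the index-based iterator in this PR, so it gives up the bounds-checking hook discussed above.

```cpp
#include <cstdint>
#include <iterator>
#include <vector>

// Hypothetical read-only view whose forward iterator is a plain pointer.
class sales_view_t
{
public:
    using const_iterator = const int32_t*;
    using const_reverse_iterator = std::reverse_iterator<const_iterator>;

    sales_view_t(const int32_t* data, uint32_t size)
        : m_data(data), m_size(size)
    {
    }

    const_iterator begin() const { return m_data; }
    const_iterator end() const { return m_data + m_size; }

    // The adaptor walks backward from end() to begin().
    const_reverse_iterator rbegin() const { return const_reverse_iterator(end()); }
    const_reverse_iterator rend() const { return const_reverse_iterator(begin()); }

private:
    const int32_t* m_data;
    uint32_t m_size;
};

// Copies the elements in reverse order.
inline std::vector<int32_t> reversed(const sales_view_t& view)
{
    return std::vector<int32_t>(view.rbegin(), view.rend());
}
```

Applying the adaptor to a custom iterator class would require that class to expose the usual member typedefs (`iterator_category`, `value_type`, `difference_type`, `pointer`, `reference`) and `operator--`.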

phillipov added 8 commits February 11, 2022 01:36

Added DAC vector iterator and a test

dc7eed8

Separated array field tests into a new suite

f4fbaaf

Comments in dac_vector_t, variable renaming in DAC array test

4e6720c

Upgrading DAC vector iterator tag

799ba63

Finished the forward const_iterator for DAC vectors

adcbb8b

Upgraded DAC vector iterator to a random access iterator

dcdc675

Fixed operator[] behavior in the DAC vector iterator

992b35f

Catching DAC iterator branch up to master

b9458e2

phillipov added the enhancement label Feb 23, 2022

phillipov requested review from chuan, LaurentiuCristofor, daxhaw, fineg74 and simone-gaia February 23, 2022 20:46

phillipov self-assigned this Feb 23, 2022

phillipov added 2 commits February 23, 2022 14:04

Added license header to array fields test file

55a4189

Fixed failing cmakelist

7538789

senderista reviewed Feb 24, 2022

phillipov removed the request for review from daxhaw February 24, 2022 00:17

LaurentiuCristofor reviewed Feb 24, 2022

chuan reviewed Feb 24, 2022

LaurentiuCristofor reviewed Feb 24, 2022

LaurentiuCristofor approved these changes Feb 24, 2022

simone-gaia approved these changes Feb 25, 2022

daxhaw reviewed Mar 3, 2022

daxhaw suggested changes Mar 3, 2022

fineg74 added 2 commits March 9, 2022 14:57

Add reverse iterator functionality

da3aa93

Merge remote-tracking branch 'origin/master' into phillip/dac_vector_…

49ed983

…iteration # Conflicts: # production/inc/gaia/direct_access/dac_vector.hpp

daxhaw reviewed Mar 10, 2022

daxhaw reviewed Mar 10, 2022

daxhaw reviewed Mar 10, 2022

daxhaw approved these changes Mar 10, 2022

fineg74 added 2 commits March 11, 2022 15:14

Make gaiat accept iterator loops for array fields and fix several oth…

d4e0c35

…er issues

Fix clang-tidy issues

6d00bc2

fineg74 merged commit 800a854 into master Mar 12, 2022

fineg74 deleted the phillip/dac_vector_iteration branch March 12, 2022 00:32

senderista mentioned this pull request Mar 14, 2022

Fix LLVM assert in gaiat #1400

Merged
