Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP]Hopscotch concurrent map #106

Open
wants to merge 23 commits into
base: master
Choose a base branch
from
Open
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
62 changes: 45 additions & 17 deletions cds/container/hopscotch_hashmap.h
Original file line number Diff line number Diff line change
Expand Up @@ -76,8 +76,9 @@ namespace cds {

Bucket* segments_arys;

int calc_hash(KEY key) {
std::hash<KEY> hash_fn;
template <typename K>
unsigned int calc_hash(K const& key) {
std::hash<K> hash_fn;
return hash_fn(key) % MAX_SEGMENTS;
}

Expand Down Expand Up @@ -124,9 +125,8 @@ namespace cds {
*/
template <typename K>
bool contains(K const& key)
{
DATA data = get(key);
return data != NULL;
{
return find_with(key, [=](K const& one, K const& two) { return one != two; }, [](mapped_type&) {});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@khizmax, посмотрите ещё пожалуйста вот эту конструкцию. Нет вопроса почему это не компилируется, но есть вопрос как это лучше разрулить в этой структуре: на место K из тестов подставляется структура cds_test::striped_map_fixture::key_type, для которой не отпределён оператор !=, а использовать cds_test::striped_map_fixture::cmp в файле реализации, очевидно, неправильно.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Как разрулить в этой структуре - я не знаю.
Во всех контейнерах типа map в libcds есть дополнительный третий template-параметр - traits. Он задает характеристики мапы. Одна из характеристик - compare functor: это функтор, задающий сравнение элементов:

  • cmp( a, b ) < 0, если a < b
  • cmp( a, b ) == 0, если a == b
  • cmp( a, b ) > 0, если a > b
    Например, если тип KEY естьstd::string, этот функтор мог бы выглядеть как-то так:
struct compare {
    int operator()( std::string const& lhs, std::string const& rhs ) const
   {
       return lhs.compare ( rhs );
   }

   int operator()( std::string const& lhs, char const* rhs ) const
   {
       return lhs.compare( rhs );
   }

   int operator()( chr const* lhs, std::string const& rhs ) const
   {
      return -rhs.compare( lhs );
   }
};

Заметьте, что этот compare построен так, что в качестве ключа принимает один из типов std::string или char const* - компилятор сам разрулит, какой оператор следует вызвать.
Вообще, в traits выносятся все стратегии, чтобы можно было гибко настраивать контейнер под свои нужды, - hash-функтор, item_counter и т.п.
С таким подходом вам вообще не надо будет никаких лямбд.

И перестаньте везде использовать [=] в вызовах лямбды, это до добра не доведет ;-)
[=] значит, что вы в лямбду передаете копии всех локальных данных текущей функции, - это довольно редко нужно. В данном конкретном случае

[=](K const& one, K const& two) { return one != two; }

достаточно [], так как тело лямбды использует только аргументы функции и ничего более. См. http://en.cppreference.com/w/cpp/language/lambda

}

/// Checks whether the map contains \p key using \p pred predicate for searching
Expand All @@ -138,7 +138,7 @@ namespace cds {
template <typename K, typename Predicate>
bool contains(K const& key, Predicate pred)
{
return get(key, pred) != NULL;
return find_with(key, pred, [](mapped_type&) {});
}

/// Find the key \p key
Expand All @@ -160,31 +160,59 @@ namespace cds {
template <typename K, typename Func>
bool find_with(K const& key, Func f)
{
DATA data = get(key, [=](K const& one, K const& two) { return one != two; });
f(data);
return data != NULL;
return find_with(key, [=](K const& one, K const& two) { return one != two; }, f);
}

template <typename K, typename Predicate>
bool find(K const& key, Predicate pred)
{
DATA data = get(key, pred);
return data != NULL;
return find_with(key, pred, [](mapped_type&) {});
}

template <typename K>
bool find(K const& key)
{
DATA data = get(key, [=](K const& one, K const& two) { return one != two; });
return data != NULL;
return find_with(key, [=](K const& one, K const& two) { return one != two; }, [](mapped_type&) {});
}

template <typename K, typename Predicate, typename Func>
bool find_with(K const& key, Predicate pred, Func f)
{
DATA data = get(key, [=](K const& one, K const& two) { return one != two; });
f(data);
return data != NULL;
unsigned int hash = calc_hash(key);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hash function should probably return a std::size_t so that all indexes can be reached using the hash. If they are capped at the max int then a table whose size is max int + 1 will never have those slots touched.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DaKellyFella, I actually already changed this in a local copy, haven't committed it yet. 😄

Bucket* start_bucket = segments_arys + hash;
unsigned int try_counter = 0;
unsigned int timestamp;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same goes for timestamp as is for the hash. In order to prevent any misbehaviour or ABA on timestamp checking they should be as large as possible.

do {
timestamp = start_bucket->_timestamp;
unsigned int hop_info = start_bucket->_hop_info;
Bucket* check_bucket = start_bucket;
unsigned int temp;
for (int i = 0; i < HOP_RANGE; i++) {
temp = hop_info;
temp = temp >> i;

if (temp & 1) {
if (pred(key, *(check_bucket->_key)) == 0) {
f(*(check_bucket->_data));
return true;
}
}
++check_bucket;
}
++try_counter;
} while (timestamp != start_bucket->_timestamp && try_counter < MAX_TRIES);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MAX_TRIES really shouldn't exist. It's a holdover from the initial version and it isn't a good addition. It breaks linearisability under heavy insertion and deletion contention. Keep trying until you get a timestamp match, there's no reason to have a limit on the times you check.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DaKellyFella, since hop_info is limited to the sizeof(unsigned int), we can't keep tring because we can get undefined behaviour. In case of offsets' implementation that could be true, but it seems that not in this implementation.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@LeoSko

since hop_info is limited to the sizeof(unsigned int), we can't keep tring because we can get undefined behaviour.

No I mean MAX_TRIES shouldn't exist. You should keep rereading the bucket until you have completed a scan of the buckets hop_info indicates as useful until the timestamps haven't changed. That way you can guarantee nothing has changed since you scanned the table.

Does that make sense? Am I missing something? Apologies if I am.


if (timestamp != start_bucket->_timestamp) {
Bucket* check_bucket = start_bucket;
for (int i = 0; i < HOP_RANGE; i++) {
if (pred(key, *(check_bucket->_key)) == 0) {
f(*(check_bucket->_data));
return true;
}
++check_bucket;
}
}
return false;
}

/// For key \p key inserts data of type \ref value_type constructed with <tt>std::forward<Args>(args)...</tt>
Expand Down Expand Up @@ -506,7 +534,7 @@ namespace cds {
template <typename K, typename Predicate>
bool erase_with(K const& key, Predicate pred)
{
return erase(key, pred, [](mapped_type&) {});
return erase_with(key, pred, [](mapped_type&) {});
}

/// Delete \p key from the map
Expand Down