Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SITL perf improvements #11177

Merged
merged 11 commits into from
Jan 14, 2019
10 changes: 5 additions & 5 deletions Tools/sitl_run.sh
Original file line number Diff line number Diff line change
Expand Up @@ -128,19 +128,19 @@ then
echo "Not running simulation (\$DONT_RUN is set)."
elif [ "$debugger" == "lldb" ]
then
lldb -- $sitl_command
eval lldb -- $sitl_command
elif [ "$debugger" == "gdb" ]
then
gdb --args $sitl_command
eval gdb --args $sitl_command
elif [ "$debugger" == "ddd" ]
then
ddd --debugger gdb --args $sitl_command
eval ddd --debugger gdb --args $sitl_command
elif [ "$debugger" == "valgrind" ]
then
valgrind --track-origins=yes --leak-check=full -v $sitl_command
eval valgrind --track-origins=yes --leak-check=full -v $sitl_command
elif [ "$debugger" == "callgrind" ]
then
valgrind --tool=callgrind -v $sitl_command
eval valgrind --tool=callgrind -v $sitl_command
elif [ "$debugger" == "ide" ]
then
echo "######################################################################"
Expand Down
48 changes: 25 additions & 23 deletions platforms/posix/src/lockstep_scheduler/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -2,37 +2,39 @@ cmake_minimum_required(VERSION 2.8.12)

if(NOT PROJECT_NAME STREQUAL "px4")

project(lockstep_scheduler)
project(lockstep_scheduler)

set (CMAKE_CXX_STANDARD 11)
set (CMAKE_CXX_STANDARD 11)

add_library(lockstep_scheduler
src/lockstep_scheduler.cpp
)
add_definitions(-DUNIT_TESTS)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably move the whole file to tabs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I didn't realize. Commit pushed.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Quick comment because I just noticed this (not PR related). The CMakeLists.txt doesn't need to be different for inclusion in px4 or building standalone.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does because it can be used for standalone unit tests. And I would like to keep these tests until we have figured out a way to port them over into PX4.


target_include_directories(lockstep_scheduler
PUBLIC
$<INSTALL_INTERFACE:include>
$<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/include>
PRIVATE
${CMAKE_CURRENT_SOURCE_DIR}/src
)
add_library(lockstep_scheduler
src/lockstep_scheduler.cpp
)

target_link_libraries(lockstep_scheduler
pthread
)
target_include_directories(lockstep_scheduler
PUBLIC
$<INSTALL_INTERFACE:include>
$<BUILD_INTERFACE:${CMAKE_CURRENT_SOURCE_DIR}/include>
PRIVATE
${CMAKE_CURRENT_SOURCE_DIR}/src
)

target_compile_options(lockstep_scheduler PRIVATE -Wall -Wextra -Werror -O2)
target_link_libraries(lockstep_scheduler
pthread
)

add_subdirectory(test)
target_compile_options(lockstep_scheduler PRIVATE -Wall -Wextra -Werror -O2)

add_subdirectory(test)

else()

add_library(lockstep_scheduler
src/lockstep_scheduler.cpp
)
include_directories(
include
)
add_library(lockstep_scheduler
src/lockstep_scheduler.cpp
)
include_directories(
include
)

endif()
Original file line number Diff line number Diff line change
Expand Up @@ -10,22 +10,40 @@
class LockstepScheduler
{
public:
~LockstepScheduler();

void set_absolute_time(uint64_t time_us);
uint64_t get_absolute_time() const;
inline uint64_t get_absolute_time() const { return _time_us; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that you made it inline explicitly but it is already inline automatically if defined in the class declaration, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that you made it inline explicitly but it is already inline automatically if defined in the class declaration, right?

Yes, this is just slightly stronger, and sometimes I like to add it to make it clear to a reader that it's important here. Not strictly necessary though.

int cond_timedwait(pthread_cond_t *cond, pthread_mutex_t *lock, uint64_t time_us);
int usleep_until(uint64_t timed_us);

private:
std::atomic<uint64_t> time_us_{0};

struct TimedWait {
~TimedWait()
{
// If a thread quickly exits after a cond_timedwait(), the
// thread_local object can still be in the linked list. In that case
// we need to wait until it's removed.
while (!removed) {
#ifndef UNIT_TESTS // unit tests don't define system_usleep and execute faster w/o sleeping here
system_sleep(5000);
#endif
}
}

pthread_cond_t *passed_cond{nullptr};
pthread_mutex_t *passed_lock{nullptr};
uint64_t time_us{0};
bool timeout{false};
bool done{false};
std::atomic<bool> done{false};
std::atomic<bool> removed{true};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At first I found the name removed confusing but I think it actually is what it is, and I can't find a better word.


TimedWait *next{nullptr}; ///< linked list
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't love the raw linked list. I would prefer an abstraction but I suppose it's easy enough to understand.

};
std::vector<std::shared_ptr<TimedWait>> timed_waits_{};
std::mutex timed_waits_mutex_{};
bool timed_waits_iterator_invalidated_{false};

std::atomic<uint64_t> _time_us{0};

TimedWait *_timed_waits{nullptr}; ///< head of linked list
std::mutex _timed_waits_mutex;
std::atomic<bool> _setting_time{false}; ///< true if set_absolute_time() is currently being executed
};
136 changes: 77 additions & 59 deletions platforms/posix/src/lockstep_scheduler/src/lockstep_scheduler.cpp
Original file line number Diff line number Diff line change
@@ -1,102 +1,123 @@
#include "lockstep_scheduler/lockstep_scheduler.h"


uint64_t LockstepScheduler::get_absolute_time() const
LockstepScheduler::~LockstepScheduler()
{
return time_us_;
// cleanup the linked list
std::unique_lock<std::mutex> lock_timed_waits(_timed_waits_mutex);

while (_timed_waits) {
TimedWait *tmp = _timed_waits;
_timed_waits = _timed_waits->next;
tmp->removed = true;
}
}

void LockstepScheduler::set_absolute_time(uint64_t time_us)
{
time_us_ = time_us;
_time_us = time_us;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, makes sense to switch it to the PX4 convention.


{
std::unique_lock<std::mutex> lock_timed_waits(timed_waits_mutex_);
std::unique_lock<std::mutex> lock_timed_waits(_timed_waits_mutex);
_setting_time = true;

auto it = std::begin(timed_waits_);
TimedWait *timed_wait = _timed_waits;
TimedWait *timed_wait_prev = nullptr;

while (it != std::end(timed_waits_)) {
while (timed_wait) {
// Clean up the ones that are already done from last iteration.
if (timed_wait->done) {
// Erase from the linked list
if (timed_wait_prev) {
timed_wait_prev->next = timed_wait->next;

std::shared_ptr<TimedWait> temp_timed_wait = *it;
} else {
_timed_waits = timed_wait->next;
}

// Clean up the ones that are already done from last iteration.
if (temp_timed_wait->done) {
it = timed_waits_.erase(it);
TimedWait *tmp = timed_wait;
timed_wait = timed_wait->next;
tmp->removed = true;
continue;
}

if (temp_timed_wait->time_us <= time_us &&
!temp_timed_wait->timeout &&
!temp_timed_wait->done) {
temp_timed_wait->timeout = true;
if (timed_wait->time_us <= time_us &&
!timed_wait->timeout) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure that we don't need && !temp_timed_wait->done anymore?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, since we check for that above already. If it were needed here, it would mean we have a race condition.

// We are abusing the condition here to signal that the time
// has passed.
timed_waits_iterator_invalidated_ = false;
pthread_mutex_lock(temp_timed_wait->passed_lock);
pthread_cond_broadcast(temp_timed_wait->passed_cond);
pthread_mutex_unlock(temp_timed_wait->passed_lock);

if (timed_waits_iterator_invalidated_) {
// The vector might have changed, we need to start from the
// beginning.
it = std::begin(timed_waits_);
continue;
}
pthread_mutex_lock(timed_wait->passed_lock);
timed_wait->timeout = true;
pthread_cond_broadcast(timed_wait->passed_cond);
pthread_mutex_unlock(timed_wait->passed_lock);
}

++it;
timed_wait_prev = timed_wait;
timed_wait = timed_wait->next;
}

_setting_time = false;
}
}

int LockstepScheduler::cond_timedwait(pthread_cond_t *cond, pthread_mutex_t *lock, uint64_t time_us)
{
std::shared_ptr<TimedWait> new_timed_wait;
// A TimedWait object might still be in timed_waits_ after we return, so its lifetime needs to be
// longer. And using thread_local is more efficient than malloc.
static thread_local TimedWait timed_wait;
{
std::lock_guard<std::mutex> lock_timed_waits(timed_waits_mutex_);
std::lock_guard<std::mutex> lock_timed_waits(_timed_waits_mutex);

// The time has already passed.
if (time_us <= time_us_) {
if (time_us <= _time_us) {
return ETIMEDOUT;
}

new_timed_wait = std::make_shared<TimedWait>();
new_timed_wait->time_us = time_us;
new_timed_wait->passed_cond = cond;
new_timed_wait->passed_lock = lock;
timed_waits_.push_back(new_timed_wait);
timed_waits_iterator_invalidated_ = true;
timed_wait.time_us = time_us;
timed_wait.passed_cond = cond;
timed_wait.passed_lock = lock;
timed_wait.timeout = false;
timed_wait.done = false;

// Add to linked list if not removed yet (otherwise just re-use the object)
if (timed_wait.removed) {
timed_wait.removed = false;
timed_wait.next = _timed_waits;
_timed_waits = &timed_wait;
}
}

while (true) {
int result = pthread_cond_wait(cond, lock);
int result = pthread_cond_wait(cond, lock);

// We need to unlock before aqcuiring the timed_waits_mutex, otherwise
// we are at rist of priority inversion.
pthread_mutex_unlock(lock);
const bool timeout = timed_wait.timeout;

{
std::lock_guard<std::mutex> lock_timed_waits(timed_waits_mutex_);

if (result == 0 && new_timed_wait->timeout) {
result = ETIMEDOUT;
}

new_timed_wait->done = true;
}
if (result == 0 && timeout) {
result = ETIMEDOUT;
}

// The lock needs to be locked on exit of this function
timed_wait.done = true;

if (!timeout && _setting_time) {
// This is where it gets tricky: the timeout has not been triggered yet,
// and another thread is in set_absolute_time().
// If it already passed the 'done' check, it will access the mutex and
// the condition variable next. However they might be invalid as soon as we
// return here, so we wait until set_absolute_time() is done.
// In addition we have to unlock 'lock', otherwise we risk a
// deadlock due to a different locking order in set_absolute_time().
// Note that this case does not happen too frequently, and thus can be
// a bit more expensive.
pthread_mutex_unlock(lock);
_timed_waits_mutex.lock();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand, thanks for the comment. So timed_waits_mutex_.lock(); basically makes us wait for set_absolute_time() to finish, right?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.

_timed_waits_mutex.unlock();
pthread_mutex_lock(lock);
return result;
}

return result;
}

int LockstepScheduler::usleep_until(uint64_t time_us)
{
pthread_mutex_t lock;
pthread_mutex_init(&lock, nullptr);
pthread_cond_t cond;
pthread_cond_init(&cond, nullptr);
pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought these were only allowed for static locks/conds but looks like that's only true for older versions of the POSIX standard. I guess it's a nice optimization to save calls.


pthread_mutex_lock(&lock);

Expand All @@ -109,8 +130,5 @@ int LockstepScheduler::usleep_until(uint64_t time_us)

pthread_mutex_unlock(&lock);

pthread_cond_destroy(&cond);
pthread_mutex_destroy(&lock);

return result;
}
Loading