Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

xtimer: broken on 'native` when with thread_flags #6642

Closed
haukepetersen opened this issue Feb 23, 2017 · 8 comments
Closed

xtimer: broken on 'native` when with thread_flags #6642

haukepetersen opened this issue Feb 23, 2017 · 8 comments
Assignees
Labels
Area: timers Area: timer subsystems Platform: native Platform: This PR/issue effects the native platform Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)

Comments

@haukepetersen
Copy link
Contributor

refining #6624

The following code hangs after the first timeout on native:

static void time_evt(void *arg)
{
    thread_flags_set((thread_t *)arg, 0x1);
}

int main(void)
{
    xtimer_t timer;
    timer.callback = time_evt;
    timer.arg = (void *)sched_active_thread;

    while (1) {
        xtimer_set(&timer, 1000000);
        thread_flags_wait_any(0x1);
        printf("+++ timeout +++\n");
    }

    return 0;
}

After a quick debugging session I might have found a lead: it seems like native is doing an early context switch from its interrupt context back to the main thread when calling thread_flags_set(). The second call to xtimer_set() exits in _lltimer_set() without setting the new timer, because _in_handler is still set. If I see it correctly, the _in_handler being set because the _timer_callback() function never actually exists -> hence it seems like the switch from interrupt to thread context is happening early.

I am not sure however, how this should be fixed?! Are there maybe some if (in_isr()) checks missing in the thread_flags implementation?

@haukepetersen haukepetersen added Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors) Platform: native Platform: This PR/issue effects the native platform Area: timers Area: timer subsystems labels Feb 23, 2017
@haukepetersen haukepetersen added this to the Release 2017.04 milestone Feb 23, 2017
@haukepetersen
Copy link
Contributor Author

The problem seems to be broken scheduling behavior in native, isr_thread_yield() (in native_cpu.c) calls the scheduler directly, and does not allow the interrupt to run to completion. If I am wrong or is that quite broken behavior?!

@jnohlgard
Copy link
Member

@haukepetersen I agree that the ISRs must be allowed to run to finish before switching contexts.

@lebrush
Copy link
Member

lebrush commented Feb 24, 2017

I faced a similar situation when implementing the xtimer_mutex_lock_timeout and solved by adding a thread_yield_higher() (which then I figured out it should rather be a sched_context_switch_request = 1 since it's called from an interrupt context) in the callback.

Maybe that was then a native pitfall after all but IIRC @OlegHahm saw the same behaviour in a SAM platform.

EDIT: found the #6441 (comment). The board was a samr21-xpro I guess adding the sched_context_switch_request = 1 in the callback fixed that too.

@haukepetersen
Copy link
Contributor Author

haukepetersen commented Feb 27, 2017

ISRs must be allowed to run to finish before switching contexts.

Agree, this is not only a should, but a MUST!

So it seems this behavior is (to some extend) related to #6442. The question is how we can fix this once and for all... IMHO this should be a high priority thing to do, but I don't have the know-how to fix this. Anyone else volunteering?

@OlegHahm
Copy link
Member

OlegHahm commented Feb 27, 2017

Maybe I don't it right, but why do you assume that the ISR does not run to completion? I can't neither see it from the native code (just calling sched_run()) does not immediately switch context) nor from putting a debug puts() after thread_yield_higher() in _thread_flags_wait().

@haukepetersen
Copy link
Contributor Author

I'll put together a test application to demonstrate this behavior.

@haukepetersen
Copy link
Contributor Author

done: #6659

@kaspar030
Copy link
Contributor

#6660 fixes the issue for me.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area: timers Area: timer subsystems Platform: native Platform: This PR/issue effects the native platform Type: bug The issue reports a bug / The PR fixes a bug (including spelling errors)
Projects
None yet
Development

No branches or pull requests

6 participants