Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PX4 receives input_rc and arms by itself with no RC attached #11238

Closed
jzazbert opened this issue Jan 18, 2019 · 19 comments
Closed

PX4 receives input_rc and arms by itself with no RC attached #11238

jzazbert opened this issue Jan 18, 2019 · 19 comments
Assignees
Labels

Comments

@jzazbert
Copy link

jzazbert commented Jan 18, 2019

Describe the bug
Intermittently, the aircraft, when powered on and sitting on the ground with no RC transmitter turned on, will print out that it lost RC control, arm itself, spin up the motors, then switch into LAND failsafe because GPS hasn't locked yet. The problem seems somewhat intermittent, but we can go for stretches of hours where it will do it on every boot.

Logs of the event show that it receives input_rc messages every 3-20 seconds, with SOURCE=RC_INPUT_SOURCE_MAVLINK and 8 channels of 65535 values. When looking at the scaled manual control inputs in the log viewer, that gives an almost full yaw command and zero throttle, which may be the source of the arming.

We also have ensured that the RC_IN lines while not connected to an RC transmitter are not floating electrically.

To Reproduce
Steps to reproduce the behavior:

  1. Unplug RC receiver, and remove props!
  2. Power on aircraft
  3. Wait
  4. Sometimes it will arm and takeoff.

(Note that in our case we have the CBRK_IO_SAFETY set so that is NOT necessary to press the safety switch before launch), otherwise the arm would be prevented.

Expected behavior
We expect the aircraft to sit there and never arm

Log Files and Screenshots
Because the system arms, it starts logging. here are some logs from the events. It is worth noting that the input_rc message in the logs show a few received values, even though there are no RC receivers connected, and the source is listed as MAVLINK.
https://logs.px4.io/plot_app?log=4abb3756-1202-4d7f-93df-7278dcec0897
https://logs.px4.io/plot_app?log=d12666c6-faed-4645-bc04-0d5a94d05116
https://logs.px4.io/plot_app?log=508c3766-012d-4a24-a538-80235388be71
https://logs.px4.io/plot_app?log=2f046532-ae41-4327-b683-57cdfe3e1d70
https://logs.px4.io/plot_app?log=7cd79b51-c6b8-4ee0-9b33-311ef4f6c0ae

Drone (please complete the following information):
Our system is based on a custom adapter board with a Pixhawk 2.1 cube on it, a wifi module, and a long range radio. The adapter board has an MCU that can communicate with the cube. In order to rule out all external sources of mavlink traffic, we wiped the adapter MCU and disabled the wifi serial port and unplugged long range RF, and can still get it to happen. Our PX4 code fork is based on 1.7.3 release with a custom CAN driver for our ESCs.

@dagar dagar self-assigned this Jan 18, 2019
@dagar dagar added this to the Release v1.9.0 milestone Jan 18, 2019
@dagar
Copy link
Member

dagar commented Jan 18, 2019

Are you using QGroundControl? If so can you verify if thumbsticks are enabled (check settings and the fly screen).

@jzazbert
Copy link
Author

jzazbert commented Jan 18, 2019

We do use QGroundControl, but we can get this to happen with all external data links disconnected. We'll check and make sure of that however.

We can see the lost manual control message on the serial terminal

@jzazbert
Copy link
Author

jzazbert commented Jan 18, 2019

Unsure if this is related, but I have a pixhawk 4 (px4 1.8.1) connected via USB to QGroundControl 3.4.4 on a completely separate computer, and nothing else plugged in at all to the pixhawk. I'll periodically get the manual control lost error pop up, and if I use MAVLINK inspector, I see a single MANUAL_CONTROL packet get sent, then it starts sending RC_CHANNELS_OVERRIDE at 10hz.

The only cable plugged in to the pixhawk 4 is USB. This could be unrelated, but the lost control messages when there wasn't any to begin with is the same.

mavlink_control

@dagar
Copy link
Member

dagar commented Jan 18, 2019

@dagar
Copy link
Member

dagar commented Jan 18, 2019

Unsure if this is related, but I have a pixhawk 4 (px4 1.8.1) connected via USB to QGroundControl 3.4.4 on a completely separate computer, and nothing else plugged in at all to the pixhawk. I'll periodically get the manual control lost error pop up, and if I use MAVLINK inspector, I see a single MANUAL_CONTROL packet get sent, then it starts sending RC_CHANNELS_OVERRIDE at 10hz.

The only cable plugged in to the pixhawk 4 is USB. This could be unrelated, but the lost control messages when there wasn't any to begin with is the same.

mavlink_control

Can you share a corresponding log (.ulg)? You can set it to log from boot (SDLOG_MODE 1).

@jzazbert
Copy link
Author

Yeah, here you go. I'd turned on log from boot to try and capture. In this case, the input_rc message source is UNKNOWN, but there is no RC anything plugged into this box. The last part of log 1 is when the pixhawk fell off the table because of the USB cable...

https://logs.px4.io/plot_app?log=2c5bcb95-99d3-4c3d-825e-4d45028dc3f6
https://logs.px4.io/plot_app?log=7898563e-4539-4e84-b4b2-418c75f5c009

@jzazbert
Copy link
Author

jzazbert commented Jan 19, 2019

This sending of RC_CHANNELS_OVERRIDE bit; I don't understand why this is necessary, but when I looked at the code it seemed like it sends them out to the same MAV_SYS_ID as the current system.

If I have two aircraft that are powered up and their radio modems are on the same channel, and they both have the save MAV_SYS_ID, the send of RC_CHANNELS_OVERRIDE from one system could be interpreted as rc_inputs on the other?

Understanding that two aircraft with same sys_id on the same channel is bad, I'm wondering if that is the loop in our system that is causing our issue.

@Antiheavy
Copy link
Contributor

We also sometimes get spurious "Manual Control Lost" warnings through QGC. Our vehicles never have RC receivers plugged in.

here is a recent example log of this on our vehicle (running a lightly modified version of v1.8.0 firmware):
https://review.px4.io/plot_app?log=1fb297c5-c8f9-46dd-a24b-232451e7d5e1

the vehicle in the log above was very repeatably throwing the Manual Control Lost warnings shortly after establishing telemetry connection (40-50 seconds after boot in this case). Since connection had just been made, the is a bunch of telemetry and SDcard activity as mission and parameters are automatically downloaded to QGC from the vehicle. We removed the SD card and re-inserted it and then magically no more "manual control lost" warnings.

@LorenzMeier
Copy link
Member

@Antiheavy The problem is the CPU load - the manual control lost check fails because commander is not running for an extended amount of time. This might be related to hardware you have onboard or specific modifications. But it has absolutely nothing to do with the RC itself.

Please have a look at the CPU load plot.
https://review.px4.io/plot_app?log=1fb297c5-c8f9-46dd-a24b-232451e7d5e1#Nav-CPU-_-RAM

@LorenzMeier
Copy link
Member

@dagar FYI

@jzazbert
Copy link
Author

I believe I've solved our problem. Our aircraft had the same MAV_SYS_ID set, and were inadvertently able to communicate because the channel settings on our serial radios matched. Commands sent to one aircraft by either RC input or a brief noise glitch was getting copied into the RC_CHANNELS_OVERRIDE message and transmitted, and being received by the other aircraft and being interpreted as valid RC data.

The overall cause was that our aircraft had same system id and radio, however the unsafe aspect came in because of the retransmission of the rc signals on the RC_CHANNELS_OVERRIDE message.

In our case I've solved the problem by disabling the send of RC_CHANNELS_OVERRIDE, and also preventing reception of it. This removes the vulnerability of this happening in the future.

I'd like to understand why the RC_CHANNELS_OVERRIDE message is being sent in the first place, and propose that a more rigourous check be done before accepting such commands.

@Antiheavy
Copy link
Contributor

@Antiheavy The problem is the CPU load

Yeah, that was part of the point I was trying to make, but I didn't explain it very well. We always see CPU load spikes correlated with the Manual Control Lost warning messages. I wanted to post my example in case it was helpful in debugging @jzazbert 's issue. I also am hopeful if there is a systematic bug uncovered it might also alleviate our example too. Although the CPU spikes in @jzazbert 's logs are not as closely correlated to the manual control warnings as they are in our logs so the might be a different root cause.

@jzazbert
Copy link
Author

In our case, I think the bulk of our "manual control lost" messages are because the RC_CHANNELS_OVERRIDE messages being sent out from one aircraft to the other are relatively low rate due to available bandwidth on the channel, so it keeps exceeding the RC timeout threshold, getting a new packet, then exceeding the threshold again.

That said, I can still get this to happen on a idling pixhawk 4 so... who knows what causes some of those. The logs in that case showed periodic SBUS PX4IO packets every so often, which I attribute to noise since I had to RC RX plugged in in that case.

@stale
Copy link

stale bot commented Jun 24, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@julianoes
Copy link
Contributor

@jzazbert did you find a solution for the issue?

@stale stale bot removed the Admin: Wont fix label Jul 1, 2019
@jzazbert
Copy link
Author

jzazbert commented Jul 1, 2019

In our case, I disabled reception (and transmission) of the RC_CHANNELS_OVERRIDE message for safety.

This wasn't my preferred solution as it likely breaks other things in the future, but until there is a more robust mechanism for control authentication, it solved my safety problem.

@PX4 PX4 deleted a comment from stale bot Oct 1, 2019
@stale stale bot removed the Admin: Wont fix label Oct 1, 2019
@julianoes
Copy link
Contributor

@LorenzMeier can you explain why we are sending RC_CHANNELS_OVERRIDE?

Also, see my question here: #3433 (comment)

@LorenzMeier
Copy link
Member

Most likely this was a legacy use case (e.g. using a Pixhawk as RC adapter). I would disable it.

@julianoes
Copy link
Contributor

Should be fixed for good by #13109.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants