scripts: disable pulseaudio before sof bootloop #855

Contributor

@zhuyingjiang zhuyingjiang commented Jan 17, 2019

PulseAudio triggers a stream at startup, which interferes
with the timing of sof_remove. This fixes thesofproject/linux#534.

Signed-off-by: Zhu Yingjiang [email protected]

Member

@lgirdwood lgirdwood left a comment

In the commit message, can you describe in more detail the use case here, the error (please paste it), and why PulseAudio is a problem?
I also don't see any connection with thesofproject/linux#534. Please explain in more detail.

Member

@plbossart plbossart left a comment

NAK NAK NAK.
The fix wrecks the user setup. PulseAudio should only interact with the card based on udev triggers, so if there is a race condition, it needs to be fixed.

@@ -1,5 +1,8 @@
#!/bin/bash

echo "autospawn = no" > /etc/pulse/client.conf
killall pulseaudio
Member

This has got to be one of the silliest fixes I've ever seen. You just erased the entire file with this, so all other settings are gone. Seriously?

Contributor Author

@plbossart updated

@wenqingfu

'>' -> '>>'?

Anyway, maybe it triggers the simultaneous-access problem that @keyonjie was looking into recently? It is worth looking into more deeply.
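
To illustrate the `>` versus `>>` point raised in the review: a minimal sketch, run against a temporary file rather than the real /etc/pulse/client.conf. The pre-existing settings and the `append_once` helper are assumptions made for illustration and are not part of this PR.

```shell
#!/bin/bash
# Illustration only: works on a temporary file, not /etc/pulse/client.conf.
conf=$(mktemp)
# Hypothetical settings standing in for whatever the user already had.
printf '%s\n' 'default-server = localhost' 'enable-shm = yes' > "$conf"

# '>' truncates the file before writing, so the earlier settings are lost.
cp "$conf" "$conf.truncated"
echo "autospawn = no" > "$conf.truncated"

# '>>' appends. The grep guard keeps repeated runs from adding duplicates.
append_once() {
    # $1 = exact line to add, $2 = file to add it to
    grep -qxF -- "$1" "$2" || echo "$1" >> "$2"
}
cp "$conf" "$conf.appended"
append_once "autospawn = no" "$conf.appended"
append_once "autospawn = no" "$conf.appended"

# Arithmetic expansion strips any padding that wc may emit.
truncated_lines=$(( $(wc -l < "$conf.truncated") ))
appended_lines=$(( $(wc -l < "$conf.appended") ))
echo "truncated: $truncated_lines line(s), appended: $appended_lines line(s)"
```

Appending still modifies the user's configuration, so it does not answer the objection that the script should not change the user's setup at all; it only avoids discarding the other settings.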

PulseAudio triggers a stream at startup, which interferes
with the timing of sof_remove, so disable it first.

Signed-off-by: Zhu Yingjiang <[email protected]>
@zhuyingjiang zhuyingjiang force-pushed the topic/disable-pulseaudio-before-sof_bootloop branch from 7aa51d8 to 35aa045 Compare January 18, 2019 07:54
Member

@lgirdwood lgirdwood left a comment

I still don't fully understand the reason for this change, as your commit message is missing details. Are you saying there is an issue in sof_remove in the driver? Can you paste logs? If there is an sof_remove() issue, then this seems to be the wrong way to fix it.

@zhuyingjiang
Contributor Author

> I still don't fully understand the reason for this change, as your commit message is missing details. Are you saying there is an issue in sof_remove in the driver? Can you paste logs? If there is an sof_remove() issue, then this seems to be the wrong way to fix it.

@lgirdwood
The case is: if PulseAudio is not disabled when the remove/insert loop starts, PulseAudio can occasionally access the device and obtain the hardware resource, and in the next insert run the IPC sequence is then disturbed.
This can be demonstrated by disabling PulseAudio first and then running the remove/insert loop: I have tested up to 10000 iterations successfully. (This is not the same as #552. In that one the SHA1 for the driver and FW is the same, but the kconfig or tplg may differ, since the daily build was used there.)
If aplay is used to play some music during the insert/remove loop, there are still oopses, the same as in this issue, and the system cannot be rebooted with the "reboot" command; the power has to be reset. The captured failure logs differ from one another because the aplay timing differs between runs.
bootloop_aplay_ops.1.log
bootloop_aplay_ops.2.log
bootloop_aplay_ops.3.log
bootloop_aplay_ops.4.log

So this comes down to the question: "can the module/device be used while the load/unload is in progress?"
This is the same as asking whether an executable binary can be deleted while it is running.
Maybe we need a solution that either forbids use of the devices during load/unload, or simply returns an "in use, fail to unload" error.
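
The second option ("in use, fail to unload") can be sketched on the test-script side as follows. This is only an illustration under stated assumptions: the helper names `module_in_use` and `safe_to_unload` are hypothetical, and the check reads `/sys/module/<name>/refcnt`, which the kernel exposes when module unloading is enabled. Whether the SOF modules' reference count actually reflects an open stream is not established in this thread, and a script-side guard would not replace a fix in the driver.

```shell
#!/bin/bash
# Hypothetical helper: return 0 if the module's reference count is non-zero.
# $1 = module name, $2 = sysfs module root (defaults to /sys/module)
module_in_use() {
    local refcnt_file="${2:-/sys/module}/$1/refcnt"
    [ -r "$refcnt_file" ] || return 1   # not loaded, or refcnt not exposed
    [ "$(cat "$refcnt_file")" -gt 0 ]
}

# Hypothetical guard: refuse to unload a busy module instead of racing a user.
# Returns 0 when unloading may proceed, 1 when the module is in use.
safe_to_unload() {
    if module_in_use "$1" "$2"; then
        echo "$1: in use, fail to unload" >&2
        return 1
    fi
    return 0
}
```

A loop could then call, for example, `safe_to_unload sof_pci_dev && modprobe -r sof_pci_dev`; the module name is taken from the "Modules linked in" line of the log in this thread, and the actual unload command used by the bootloop script is not shown here.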

Analysis with added printk statements sometimes shows a NULL pointer, which causes the oops. I will dig deeper and provide more analysis later.

@lgirdwood
Member

@zhuyingjiang OK, this is a driver bug and needs to be fixed in the driver.

[  324.407898] sof-audio sof-audio: ipc tx succeeded: 0x50010000: GLB_COMP_MSG: SET_VALUE
[  324.407935] sof-audio sof-audio: ipc tx: 0x50010000: GLB_COMP_MSG: SET_VALUE
[  324.408078] sof-audio sof-audio: ipc tx succeeded: 0x50010000: GLB_COMP_MSG: SET_VALUE
[  324.408117] sof-audio sof-audio: ipc tx: 0x50010000: GLB_COMP_MSG: SET_VALUE
[  324.408260] sof-audio sof-audio: ipc tx succeeded: 0x50010000: GLB_COMP_MSG: SET_VALUE
[  324.884469] sof-audio sof-audio: ipc rx: 0x90020000: GLB_TRACE_MSG
[  324.884491] sof-audio sof-audio: ipc rx done: 0x90020000: GLB_TRACE_MSG
[  325.384422] sof-audio sof-audio: ipc rx: 0x90020000: GLB_TRACE_MSG
[  325.384444] sof-audio sof-audio: ipc rx done: 0x90020000: GLB_TRACE_MSG
[  329.198286] BUG: unable to handle kernel paging request at 000000046474e532
[  329.198302] PGD 0 P4D 0 
[  329.198314] Oops: 0000 [#1] SMP NOPTI
[  329.198325] CPU: 3 PID: 26 Comm: kworker/3:0 Not tainted 4.20.0+ #17
[  329.198331] Hardware name: AAEON UP-APL01/UP-APL01, BIOS UPA1AM36 04/10/2018
[  329.198352] Workqueue: events_power_efficient close_delayed_work [snd_soc_core]
[  329.198370] RIP: 0010:dapm_widget_invalidate_output_paths+0x6c/0xf0 [snd_soc_core]
[  329.198378] Code: c7 87 2c 01 00 00 ff ff ff ff 48 89 44 24 08 48 89 04 24 48 8b 87 e8 00 00 00 48 8d 8f e8 00 00 00 48 39 c1 48 8d 50 c8 74 51 <0f> b6 42 18 83 e0 0d 3c 01 75 39 48 8b 42 08 83 b8 2c 01 00 00 ff
[  329.198384] RSP: 0018:ffffaca18072fe18 EFLAGS: 00010292
[  329.198391] RAX: 000000046474e552 RBX: ffff9c74483882d8 RCX: ffff9c74483883c0
[  329.198396] RDX: 000000046474e51a RSI: ffff9c747a399738 RDI: ffff9c74483882d8
[  329.198402] RBP: 0000000000000002 R08: 0000746e65696369 R09: ffffaca18072fe18
[  329.198408] R10: 0000000000000018 R11: fefefefefefefeff R12: 0000000000000000
[  329.198413] R13: ffffffffc04b5180 R14: ffff9c7445118000 R15: 0ffff9c747bba450
[  329.198420] FS:  0000000000000000(0000) GS:ffff9c747bb80000(0000) knlGS:0000000000000000
[  329.198426] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  329.198432] CR2: 000000046474e532 CR3: 0000000030a0a000 CR4: 00000000003406e0
[  329.198436] Call Trace:
[  329.198460]  soc_dapm_dai_stream_event.isra.26+0x33/0xb0 [snd_soc_core]
[  329.198476]  snd_soc_dapm_stream_event+0x3c/0xb0 [snd_soc_core]
[  329.198492]  close_delayed_work+0x3e/0x50 [snd_soc_core]
[  329.198506]  process_one_work+0x1e3/0x3d0
[  329.198516]  worker_thread+0x28/0x3c0
[  329.198525]  ? set_worker_desc+0xb0/0xb0
[  329.198533]  kthread+0x10e/0x130
[  329.198541]  ? kthread_park+0x80/0x80
[  329.198552]  ret_from_fork+0x35/0x40
[  329.198559] Modules linked in: sof_pci_dev snd_sof_intel_hda_common sof_acpi_dev snd_sof_intel_byt snd_sof_nocodec snd_sof_intel_bdw snd_sof_xtensa_dsp snd_sof_intel_hsw snd_sof snd_soc_acpi_intel_match snd_soc_acpi snd_soc_pcm512x_i2c snd_soc_pcm512x snd_soc_da7213 snd_soc_rt5640 snd_soc_rt5651 snd_soc_rt5645 snd_soc_rt5670 snd_soc_rl6231 snd_soc_hdac_hdmi snd_soc_dmic x86_pkg_temp_thermal snd_soc_hdac_hda snd_sof_intel_hda snd_hda_ext_core snd_hda_codec snd_hwdep snd_hda_core snd_soc_wm8804_i2c snd_soc_wm8804 snd_soc_core intel_lpss_pci intel_lpss mfd_core snd_pcm efivarfs mmc_block sdhci_pci xhci_pci cqhci sdhci xhci_hcd [last unloaded: snd_soc_acpi]
[  329.198625] CR2: 000000046474e532
[  329.198633] ---[ end trace 5372dd66f32a180a ]---
[  329.198649] RIP: 0010:dapm_widget_invalidate_output_paths+0x6c/0xf0 [snd_soc_core]
[  329.198656] Code: c7 87 2c 01 00 00 ff ff ff ff 48 89 44 24 08 48 89 04 24 48 8b 87 e8 00 00 00 48 8d 8f e8 00 00 00 48 39 c1 48 8d 50 c8 74 51 <0f> b6 42 18 83 e0 0d 3c 01 75 39 48 8b 42 08 83 b8 2c 01 00 00 ff
[  329.198662] RSP: 0018:ffffaca18072fe18 EFLAGS: 00010292
[  329.198668] RAX: 000000046474e552 RBX: ffff9c74483882d8 RCX: ffff9c74483883c0
[  329.198674] RDX: 000000046474e51a RSI: ffff9c747a399738 RDI: ffff9c74483882d8
[  329.198679] RBP: 0000000000000002 R08: 0000746e65696369 R09: ffffaca18072fe18
[  329.198684] R10: 0000000000000018 R11: fefefefefefefeff R12: 0000000000000000
[  329.198690] R13: ffffffffc04b5180 R14: ffff9c7445118000 R15: 0ffff9c747bba450
[  329.198696] FS:  0000000000000000(0000) GS:ffff9c747bb80000(0000) knlGS:0000000000000000
[  329.198702] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  329.198707] CR2: 000000046474e532 CR3: 0000000030a0a000 CR4: 00000000003406e0

@lgirdwood
Member

Closing since it's a kernel issue. @zhuyingjiang please create a FW issue.

@lgirdwood lgirdwood closed this Jan 22, 2019
Development

Successfully merging this pull request may close these issues.

Up2: load/unload module always fail because of IPC issues
4 participants