Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sdmmc_read_sectors_dma: sdmmc_send_cmd returned 0x107 (IDFGH-11882) #12968

Open
3 tasks done
yizi15 opened this issue Jan 12, 2024 · 7 comments
Open
3 tasks done

sdmmc_read_sectors_dma: sdmmc_send_cmd returned 0x107 (IDFGH-11882) #12968

yizi15 opened this issue Jan 12, 2024 · 7 comments
Labels
Status: Opened Issue is new Type: Bug bugs in IDF

Comments

@yizi15
Copy link

yizi15 commented Jan 12, 2024

Answers checklist.

  • I have read the documentation ESP-IDF Programming Guide and the issue is not addressed there.
  • I have updated my IDF branch (master or release) to the latest version and checked that the issue is present there.
  • I have searched the issue tracker for a similar issue and not found a similar issue.

IDF version.

release/v5.2

Espressif SoC revision.

esp32s3n8

Operating System used.

Windows

How did you build your project?

Command line with idf.py

If you are using Windows, please specify command line type.

PowerShell

Development Kit.

ESP32-S3-USB-OTG

Power Supply used.

USB

What is the expected behavior?

通过fopen 打开文件,通过fread 读文件可以正常的读到文件的所有内容

What is the actual behavior?

在某些sd card(32g) 中 通过fopen 打开文件,通过fread 读文件经常读失败, 几秒内重新读取也会失败,写文件正常。
在另一种sdcard (16g)可以正常读写

Steps to reproduce.

将一个大于100M 的文件放入rf card,挂载sdmmc后 通过以下代码读文件
`sdmmc_host_t host = SDMMC_HOST_DEFAULT();
host.command_timeout_ms = 3000;
// host.max_freq_khz /= 2;
// host.flags &= ~SDMMC_HOST_FLAG_DDR;
// This initializes the slot without card detect (CD) and write protect (WP) signals.
// Modify slot_config.gpio_cd and slot_config.gpio_wp if your board has these signals.
sdmmc_slot_config_t slot_config = SDMMC_SLOT_CONFIG_DEFAULT();

// Set bus width to use:
slot_config.width = 4;
// On chips where the GPIOs used for SD card can be configured, set them in
// the slot_config structure:
slot_config.clk = CONFIG_EXAMPLE_PIN_CLK;
slot_config.cmd = CONFIG_EXAMPLE_PIN_CMD;
slot_config.d0 = CONFIG_EXAMPLE_PIN_D0;
slot_config.d1 = CONFIG_EXAMPLE_PIN_D1;
slot_config.d2 = CONFIG_EXAMPLE_PIN_D2;
slot_config.d3 = CONFIG_EXAMPLE_PIN_D3;

// Enable internal pullups on enabled pins. The internal pullups
// are insufficient however, please make sure 10k external pullups are
// connected on the bus. This is for debug / example purpose only.
slot_config.flags |= SDMMC_SLOT_FLAG_INTERNAL_PULLUP;

ESP_LOGI(TAG, "Mounting filesystem");
ret = esp_vfs_fat_sdmmc_mount(mount_point, &host, &slot_config, &mount_config, &card);

uint8_t buf[512];
const char* dst_path = "/sdcard/tst.txt";
size_t all_len =0;
struct stat statbuf = {};
stat(dst_path, &statbuf);
all_len = statbuf.st_size;
FILE* f = fopen(dst_path, "rb");
if(f == NULL)
{
ESP_LOGI("", "tst err");
}
else
{
void* p = buf;//malloc(512*1024);
size_t need_len = all_len;
while (need_len > 0)
{
int32_t read_len = 0;
for(int i = 0 ; i < 15; i ++)
{
read_len = fread(p, 1, 512, f);
if(read_len > 0)
{
break;
}
ESP_LOGI("", "read fail");
sleep_ms(30);
}
if(read_len <= 0)
{
break;
}
need_len -= read_len;
}
if(need_len != 0)
{
ESP_LOGE("","all_len != self->file_size %d, %d", all_len, need_len);
} else
{
ESP_LOGI("","read all data %d", all_len);
}
fclose(f);
}`

Debug Logs.

ESP-ROM:esp32s3-20210327
Build:Mar 27 2021
rst:0x1 (POWERON),boot:0x8 (SPI_FAST_FLASH_BOOT)
SPIWP:0xee
mode:DIO, clock div:1
load:0x3fce3810,len:0x178c
load:0x403c9700,len:0x4
load:0x403c9704,len:0xcbc
load:0x403cc700,len:0x2d90
entry 0x403c9914
I (27) boot: ESP-IDF v5.2-beta1-263-ge49823f10c-dirt 2nd stage bootloader
I (27) boot: compile time Jan 12 2024 16:06:39
I (28) boot: Multicore bootloader
I (32) boot: chip revision: v0.1
I (36) boot.esp32s3: Boot SPI Speed : 80MHz
I (41) boot.esp32s3: SPI Mode       : DIO
I (45) boot.esp32s3: SPI Flash Size : 8MB
I (50) boot: Enabling RNG early entropy source...
I (56) boot: Partition Table:
I (59) boot: ## Label            Usage          Type ST Offset   Length
I (66) boot:  0 nvs              WiFi data        01 02 00009000 00004000
I (74) boot:  1 coredump         Unknown data     01 03 0000f000 00010000
I (81) boot:  2 otadata          OTA data         01 00 0001f000 00002000
I (89) boot:  3 phy_init         RF data          01 01 00021000 00001000
I (96) boot:  4 factory          factory app      00 00 00030000 00200000
I (104) boot:  5 ota_0            OTA app          00 10 00230000 00200000
I (111) boot:  6 ota_1            OTA app          00 11 00430000 00200000
I (119) boot: End of partition table
I (123) boot: Defaulting to factory image
I (128) esp_image: segment 0: paddr=00030020 vaddr=3c030020 size=10218h ( 66072) map
I (148) esp_image: segment 1: paddr=00040240 vaddr=3fc96300 size=06528h ( 25896) load
I (154) esp_image: segment 2: paddr=00046770 vaddr=40378000 size=098a8h ( 39080) load
I (163) esp_image: segment 3: paddr=00050020 vaddr=42000020 size=2c5ach (181676) map
I (196) esp_image: segment 4: paddr=0007c5d4 vaddr=403818a8 size=049c0h ( 18880) load
I (207) boot: Loaded app from partition at offset 0x30000
I (207) boot: Disabling RNG early entropy source...
I (218) cpu_start: Multicore app
I (228) cpu_start: Pro cpu start user code
I (228) cpu_start: cpu freq: 240000000 Hz
I (228) cpu_start: Application information:
I (231) cpu_start: Project name:     ESP32_EXAMPLE
I (236) cpu_start: App version:      2600230616000a000000FFFFFFFF
I (243) cpu_start: ELF file SHA256:  b67272f6c4123c36...
I (249) cpu_start: ESP-IDF:          v5.2-beta1-263-ge49823f10c-dirt
I (256) cpu_start: Min chip rev:     v0.0
I (261) cpu_start: Max chip rev:     v0.99 
I (266) cpu_start: Chip rev:         v0.1
I (270) heap_init: Initializing. RAM available for dynamic allocation:
I (277) heap_init: At 3FC9D5C8 len 0004C148 (304 KiB): RAM
I (284) heap_init: At 3FCE9710 len 00005724 (21 KiB): RAM
I (290) heap_init: At 3FCF0000 len 00008000 (32 KiB): DRAM
I (296) heap_init: At 600FE010 len 00001FD8 (7 KiB): RTCRAM
I (303) spi_flash: detected chip: gd
I (306) spi_flash: flash io: dio
I (310) sleep: Configure to isolate all GPIO pins in sleep state
I (317) sleep: Enable automatic switching of GPIO sleep configuration
I (324) esp_core_dump_flash: Init core dump to flash
I (330) esp_core_dump_flash: Found partition 'coredump' @ f000 65536 bytes
I (338) main_task: Started on CPU0
I (342) main_task: Calling app_main()
I (362) example: Initializing SD card
I (362) example: Using SDMMC peripheral
I (363) example: Mounting filesystem
I (365) gpio: GPIO[36]| InputEn: 0| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0 
I (374) gpio: GPIO[35]| InputEn: 0| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0 
I (383) gpio: GPIO[37]| InputEn: 0| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0 
I (392) gpio: GPIO[38]| InputEn: 0| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0 
I (402) gpio: GPIO[33]| InputEn: 0| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0 
I (411) gpio: GPIO[34]| InputEn: 0| OutputEn: 1| OpenDrain: 0| Pullup: 0| Pulldown: 0| Intr:0 
I (467) gpio: GPIO[34]| InputEn: 0| OutputEn: 0| OpenDrain: 0| Pullup: 1| Pulldown: 0| Intr:0 
I (470) example: Filesystem mounted
E (15778) sdmmc_cmd: sdmmc_read_sectors_dma: sdmmc_send_cmd returned 0x107
E (15778) diskio_sdmmc: sdmmc_read_blocks failed (263)
I (15780) : read fail
I (15813) : read fail
I (15843) : read fail
I (15873) : read fail
I (15903) : read fail
I (15933) : read fail
I (15963) : read fail
I (15993) : read fail
I (16023) : read fail
I (16053) : read fail
I (16083) : read fail
I (16113) : read fail
I (16143) : read fail
I (16173) : read fail
I (16203) : read fail
E (16233) : all_len != self->file_size 123443470, 100476686
I (16233) main_task: Returned from app_main()

More Information.

sdcard_readerr.zip

@yizi15 yizi15 added the Type: Bug bugs in IDF label Jan 12, 2024
@espressif-bot espressif-bot added the Status: Opened Issue is new label Jan 12, 2024
@github-actions github-actions bot changed the title sdmmc_read_sectors_dma: sdmmc_send_cmd returned 0x107 sdmmc_read_sectors_dma: sdmmc_send_cmd returned 0x107 (IDFGH-11882) Jan 12, 2024
@igrr
Copy link
Member

igrr commented Jan 12, 2024

Please try changing this line:

const uint32_t data_timeout_ms = 100;

to a larger value, e.g. 500 ms.

We've seen some SD cards which are not compliant with the SD Specification, and do not reply to data transfer commands in 100ms. Perhaps that's what is happening in your case.

@yizi15
Copy link
Author

yizi15 commented Jan 12, 2024

Please try changing this line:

const uint32_t data_timeout_ms = 100;

to a larger value, e.g. 500 ms.
We've seen some SD cards which are not compliant with the SD Specification, and do not reply to data transfer commands in 100ms. Perhaps that's what is happening in your case.

Thanks, modify the timeout to 500 can solve the problem. I will close the issues after more test.

@igrr
Copy link
Member

igrr commented Jan 12, 2024

Let's keep this open... Since more people are running into this, I guess it makes sense to increase the timeout in IDF by default, or to make it somehow configurable. The issue will be closed automatically when we merge the fix.

@jliu83
Copy link

jliu83 commented Feb 14, 2024

Hello, we have just tested this.

We have an application where we are heavily dependent on data + logging on the sd card. The device would run fine most of the time, and occasionally we would get a sdmmc_read_sectors_dma: sdmmc_send_cmd returned 0x107. These errors are annoying because the tasks inside the framework would lock up, and there was no way to detect that this error has occurred. We used a watchdog timer to detect and reboot the system. This method sort of works, unless a sector of the files that are read on boot up triggers a timeout error code.

We kept these SD cards since we started seeing the problem, as they allow us to reproduce this error. While we scoured the forums for a solution and tried several of the proposed solutions, nothing worked, until I read this post.

Changing the timeout from 100 to 1000ms allows all of the old cards to boot up with no issue. I an convinced that the default 100ms is on the threshold for a lot of these cards, and the strict time keeping of the host driver is causing a lot of these cards to fail. The problem with the 0x107 error is that is crashes the entire SDMMC driver. It's okay to timeout, but allow the driver to continue functioning, don't just let it crash. This would make for a more robust driver.

I suggest two modifications to the driver. Allow the data command timeout to be set to a user value (something other than the default of 100) on initialization/mount. Also, when the driver encounters a timeout, do not lock up the driver, allow the driver to recover and retry, and continue functioning.

Thank you for the hint to resolving this. This post has given me a lot of relief and saved me a bunch of headaches.

@igrr
Copy link
Member

igrr commented Feb 14, 2024

but allow the driver to continue functioning, don't just let it crash.

@jliu83 Could you please describe the behavior you are observing in more detail? It is not expected that the driver should crash after one of the commands has timed out. If you have the details such as: IDF version, log output, panic output including the backtrace, please open a new issue and put the details there. We'll try to fix the issue and make the driver more robust.

@jliu83
Copy link

jliu83 commented Feb 14, 2024

This might be difficult to do, the SD cards that had the issues all work now, and the blocking file that was causing the error has been overwritten once a successful boot has occurred.

What I suspect is happening is the following.

sdmmc_read_sectors_dma() is called by sdmmc_read_sectors() is called by ff_sdmmc_read(). Then error 0x107 (timeout) is returned from sdmmc_read_sectors_dma() it is propagated all the way up to ff_sdmmc_read(). At this point many of the tasks in the system is halted, or are blocked. We didn't investigate too much at the time. Unfortunately I cannot replicate the error anymore and get the backtrace. If we are able to reproduce I will file a bug right away.

@josef2600
Copy link

i am having this problem too. but i am working with Arduino, so i reported to them:
(espressif/arduino-esp32#10835)
but they said, i should tell you to to it so they can do it. personally, it is so disappointing to me to hear that kind of answers. its like they just don't care. i really don't know what to do.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Status: Opened Issue is new Type: Bug bugs in IDF
Projects
None yet
Development

No branches or pull requests

5 participants