Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wifi dpp fixes (IDFGH-9445) #10812

Closed
wants to merge 2 commits into from

Conversation

gayafhannah
Copy link

Fix ESP_ERR_DPP_TX_FAILURE error caused by channel mismatch related issues(to the best of my knowledge) by setting ESP_WIFI_MBO_SUPPORT be a dependency of ESP_WIFI_DPP_SUPPORT
This has fixed all instances of the ESP_ERR_DPP_TX_FAILURE error and I have not been able to replicate the error with this fix yet.

  • components/esp_wifi/Kconfig
  • examples/wifi/wifi-easy-connect/dpp-enrollee/sdkconfig.defaults

Fix esp_dpp_handle_config_obj not copying null terminator of ssid and password.
Without this fix, I've quickly run into issues where unless I deinit and init DPP, subsequent attempts at configuring the network can lead to incorrect configs being sent to the callback.
An example of this is first using dpp to try connect to ExampleAP, but failing to connect due to signal reasons so you try connect it to SecondAP instead, without doing a deinit+init in-between, and as the null terminator of SecondAP isn't copied, on the second run you're still left with SecondAPP

  • components/wpa_supplicant/esp_supplicant/src/esp_dpp.c

Related Issue: #10615

@CLAassistant
Copy link

CLAassistant commented Feb 21, 2023

CLA assistant check
Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

@espressif-bot espressif-bot added the Status: Opened Issue is new label Feb 21, 2023
@github-actions github-actions bot changed the title Wifi dpp fixes Wifi dpp fixes (IDFGH-9445) Feb 21, 2023
@@ -180,12 +180,12 @@ static int esp_dpp_handle_config_obj(struct dpp_authentication *auth,
wifi_config_t *wifi_cfg = &s_dpp_ctx.wifi_cfg;

if (conf->ssid_len) {
os_memcpy(wifi_cfg->sta.ssid, conf->ssid, conf->ssid_len);
os_strlcpy((char*)wifi_cfg->sta.ssid, (char*)conf->ssid, sizeof(wifi_cfg->sta.ssid));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd probably separate this stuff into two separate PRs. You have on one hand trying to fix ESP_ERR_DPP_TX_FAILURE and on the other hand trying to make it work to let you bootstrap and/or start_listen multiple times. I suspect the os_strlcpy fix is going to sale through more easily where as the MBO support is going to need a lot more careful thought to really understand what's going on there and why that would be relevant.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I still don't personally know why exactly MBO support seems to fix it as well as it does. And it doesn't help that once you get deep enough into wifi related stuff, there isn't much in the way of documentation or anything.

}

if (dpp_akm_legacy(conf->akm)) {
if (conf->passphrase[0])
os_memcpy(wifi_cfg->sta.password, conf->passphrase,
os_strlcpy((char*)wifi_cfg->sta.password, (char*)conf->passphrase,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has a subtle bug in that if your first start_listen sequence yields a passphrase and your second one does not, you'll technically yield a wifi_config_t with a passphrase from the previous listen. I think the fix is overall more consistent if you wipe the wifi_cfg data structure on each call to start_listen. Same bug below with pmf_cfg.required = true.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually there are other existing bugs in this code that suggest wiping s_dpp_ctx.wifi_cfg on each invocation of esp_dpp_handle_config_obj is better. The caller even runs this function in a loop which seems clearly broken to me...

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I must've missed this thanks to the two networks I tested with having the same passphrase, I'll have to look at wiping wifi_cfg. ty

@Alvin1Zhang
Copy link
Collaborator

Thanks for your contribution.

@@ -495,6 +495,7 @@ menu "Wi-Fi"
bool "Enable DPP support"
default n
select ESP_WIFI_MBEDTLS_CRYPTO
select ESP_WIFI_MBO_SUPPORT
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We are trying to reproduce this issue and fix it. these are two separate features and should have no dependency on each other.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know they are very unrelated, however this change is the only one that's let dpp consistently work near perfectly, compared to barely working unless the listen channel was set to that of the ap and phone(dont know why its an issue since the qr has the channel + the esp sends on the correct channel anyways).
Without this fix, esp_wifi_action_tx_req would in most cases I tested, fail by sending the
WIFI_EVENT_ACTION_TX_STATUS event with a status=1, which in turn esp_dpp.c would send the ESP_ERR_DPP_TX_FAILURE event to the callback.
Without this fix, it would take making the environmental conditions near perfect for me to get a moderate success rate, but with this fix, even going out of my way I cant intentionally cause the WIFI_EVENT_ACTION_TX_STATUS related error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fwiw on 4.4.4 I could not reproduce MBO having an effect. What helped for me most was fixing the fact that retry doesn't work as it is supposed to in the example code (s_dpp_stop_listening is not reset on start_listen), then just spamming retry on my phone. It is possible this MBO feature does something, but it's gonna take some tough work to figure out why. What isn't up for debate tho is that DPP is highly unreliable for more than just me, and we definitely need to dig in and understand why.

@jasta
Copy link
Contributor

jasta commented Feb 25, 2023

@gayafhannah checking in, are you planning to divide this PR so we can get the non-MBO fixes fast tracked? If not I might be able to tackle that as I'd really like to base the Rust DPP support on top of a fixed version of esp-idf. Honestly I'd think the severity of these bugs would warrant a new release in the 4.4.x branch given that this feature was mature in the 4.4.x series and 5.x is so new, though I'm not really sure what Espressif's policy is there.

Also @kapilkedawat would be good to hear from you what your preferences are moving forward here. Even though we have some basic workarounds that make things work better, we still don't really have a fundamental understanding as to why it is so intermittent or why the tx failure occurs in the first place. I'm recommending we move forward with the obvious fixes (e.g. s_dpp_stop_listening is not cleared so even the example code never retries as it claims to), but punt to Espressif for more sleuthing on the root cause. I'm happy to help out there, but I don't have source code for the wifi driver and can't really tell why it would fail.

@jasta
Copy link
Contributor

jasta commented Mar 1, 2023

Opened #10865 to tackle just the start_listen change

@kapilkedawat
Copy link
Collaborator

Its just a coincidence that issue was not observed when MBO was enabled, MBO and DPP are independent features. therefore we cannot merge this PR.

@espressif-bot espressif-bot added Status: Done Issue is done internally Resolution: Won't Do This will not be worked on and removed Status: Opened Issue is new labels Feb 29, 2024
@Alvin1Zhang Alvin1Zhang closed this Mar 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Resolution: Won't Do This will not be worked on Status: Done Issue is done internally
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants