-
Notifications
You must be signed in to change notification settings - Fork 13.3k
httpUpdate (ssl client) sometimes hangs #3908
New issue
Have a question about this project? No Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “No Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? No Sign in to your account
Comments
Master uses lwip2 by default. It is selectable in menus (tools>lwIP variant: v2 or the 3 others (=v1.4)). |
switching to "Prebuilt" solved this so I'll go w/ that for now. If I can help debug lwip2 I'm glad to. I'm under the gun right now but by the end of the week I should have some time to help solve this. |
@whyameye please provide a MCVE sketch to reproduce the issue. |
@whyameye Is it still relevant using latest master ? |
I'm traveling and don't have a device with me to test. I can test next
week. Sorry for all the delays.
…On Mon, Jan 8, 2018 at 4:47 AM, david gauchard ***@***.***> wrote:
@whyameye <https://github.com/whyameye> Is it still relevant using latest
master ?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#3908 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ABoy5sya5R05vdVmyl8vcrLHDZDhMHWcks5tIfIogaJpZM4Q1AfM>
.
|
using commit 8b87491 this does not seem to be an issue anymore for v2 MSS 536 but is still an issue for v2 MSS 1460. Since 1460 is already marked "unstable" maybe this isn't a huge deal now. I could still make an MCVE if you need. |
Please provide an mcve sketch, it always helps. |
This test code is essentially identical to the supplied The test code works fine with lwiP v1.4, using either "compile from source" or "higher bandwidth" options. With lwiP v2.0 both "higher bandwidth" and "Lower Memory" options fail with message The test code was compiled using master branch commit bf5a0f2 dated February 8. |
@whyameye Can you re-upload your MCVE sketch, the above link is currently broken. |
sorry about that. Link is fixed. |
Without sketch modification apart from AP, url and fingerprint,
So I don't have the same error as yours, but there's something wrong at the final updater stage. Anyway, this shows that you must release a lot of ram before going to update through SSL. |
@igrr, I tried to download a ~350KB file (a firmware) using the example StreamHTTPClient.ino
and I ran into I reckon I'm far from understanding all ssl/tls internals. @earlephilhower Should I also try with BearSSL ?
|
"mac" in this case is a id/sequence portion at the end(IIRC) of the TLS fragment, not the more common Ethernet address. Once one goes bad you're pretty much hosed on that connection, and indicates corruption either in transport or in RAM buffers. "close notify" means the SSL connection is dead even if the underlying TCP/IP socket is still viable. @d-a-v I've not looked at the httpUpdate class at all, but BearSSL's at a state where is should be possible to give it a go if you really want. Because BearSSL is stricter the code needs to apply the fingerprint after object creation, before you connect. Or for testing just add a "client.setInsecure()" and it'll accept any certificate. The BearSSL fingerprint validator also takes the 20-byte fingerprint, not a string representation of hex digits, so it's slightly different there as well. |
Just the following hack to ESP8266HTTPClient.cpp and running the script @ filimin.xyz seems to work fine w/BearSSL with v2-high-BW. The code downloads from the https server and reboots as expected (it looks like the downloaded BIN simply connects to the HTTP at the same spot and re-downloads the same BIN but it prints out a "Firmware update loading from update URL http://..." line instead of the silence in the given .ino). So it's OOM w/AXTLS somewhere, as you were seeing, @d-a-v . Not sure what the remediation should be.
|
@earlephilhower thanks. And I take this opportunity to thank you for all the ongoing BearSSL work. It seems to take more RAM, but if I understand well, your couple ::supportMaxFragmentLen() / ::setBufferSizes will help with that, can they do the job automatically ? @whyameye Currently we are using the goodold axTLS. I get the same behavior as yours, sometimes it works, sometimes not. I tried to dig a bit with no luck, and luckily we have this huge work with BearSSL coming soon to us. |
I appreciate all the work on this. I'm excited for BearSSL! 👍 It seemed like we were having OOM issues with axTLS and now I read BearSSL might take more memory but does also solve the issues we are seeing? I'm probably misunderstanding something. |
Maybe I'm missing something, but if I understand your comment @d-a-v then this is an OOM condition w/axtls, no? If the RAM needs between 1456 and 576 MSS should only be 2-3KB different in this use case, and the larger MSS consistently fails, that means the smaller one is right on the edge anyway and heap fragmentation could make the difference between pass and fail, no? If that's the case, is there anything we can really do? Unfortunately, axTLS doesn't check memory allocation success, and the effort needed to put it in seems large. Plus, you'd just end up with this failing and printing "OOM in axTLS" anyway...not success. Even if BearSSL takes a little more memory, because the memory is allocated on connection start it either works, or you don't get an SSL connection. No fragmentation, predictable performance. The constant 4.5KB 2nd stack can even be hidden in "unused" SYS space to make it about equal as compared to axTLS. Setbuffersizes lets you shrink the recv buffer if you know what you're doing, and lets the xmit buffer be as small as 512bytes+fluff. The MaxFragmentLen TLS extension lets you negotiate w/remote servers to legally shrink the recv buffer down to 512b-4kb, but until OpenSSL 1.1.1 is adopted by folks it's not going to be supported. Even w/o this, if you know individual packets are limited in size, you can shrink the buffers (i.e. MQTT where you say you won't accept messages >4KB in length which is enormous for that purpose, or HTTP RANGE GET requests). |
Your code will work as-is with the BearSSL commit just pushed as long as you change the fingerprint into a uint8_t[20] array (i.e. {0x4c, 0x90, ...});. When I get the time I'll add real x509 certificate validation for this (just adding methods w/right signature to pass it down to the BearSSL object). SHA fingerprints are not secure and don't guarantee you're talking to who you think you are...
|
About the OOM, yes it is unclear. |
Are we measuring heap during operation, or just between axTLS calls? I've also not done extensive testing (gotta re-upload the original sketch after each successful reprogramming), so if it's really intermittent I may not have caught it. My guess is that even though it starts with more free space on axTLS, due to there being mallocs() in the main decryption loop and fragment operations, the real needed space is > than available in some instances. With BearSSL, the design goal of the original author was "allocate mem once, before connection" so only stack space is used during normal operation. Not saying one is better than the other, but (assuming no TCP memory needs, etc.) if you had 0 bytes of heap available w/BSSL it would still work, whereas axTLS would crash hard. How often is the failure? If I get, say, 10 successes in a row would that point to BSSL being "good," or is it only something like a 1 in 100 chance of failure? |
In my tests I did measure the heap in the StreamHTTPClient.ino's main loop, that I used instead of the referenced updater, after I found that the problem occurred during download and was not particularly linked to the flashing process itself. Sorry I was not clear enough: BearSSL was always working in my tests ! - I did not run it 100 times in a row yet, though I'm eager to get time and run my tests Failures I described above occurred with axTLS. |
@earlephilhower using your PR #4273 and changing one line in your hack above from I was able to get httpUpdate to work with BearSSL. I'm wondering though the "right" way to do this with the PR and using a certificate authority to validate a chain of certificates specifically for the update. (CA validation works well with BearSSL fetching URLs 👍 ) Thanks in advance for your help. |
@whyameye The hack above is no longer needed, there is a BearSSL enabled fingerprint validator already in the PR. Just pass in a uint8_t[20] array (i.e. parse the hex string into binary values) and it will make a BearSSL connection and use that fingerprint to validate things. For the httpUpdater, that's as good a validation as can be done with the current API. Once the BearSSL PR is merged, we can see about using certs/etc. in the httpUpdater, but for now given the fact that it's already at 18K+ lines and 70 files changed, just doing the fingerprint validation (i.e. keeping httpUpdater API) is the best we can do. |
BearSSL is merged in #4273 , with alternate BearSSL::WiFi* classes. Although axtls-based classes are still available and even the default, they are planned for deprecation and then retirement, hence won't be fixed. Any issues with BearSSL-based classes should be reported in new issues. |
Basic Infos
Hardware
Hardware: ESP-12F
Core Version: master commit 7b09ae5 Nov 2017
Description
Using
ESPHTTPUpdate
sometimes the update succeeds, sometimes it fails and reports back failure (OK) but worst is that sometimes it just hangs with a partial download, never returning from theupdate
call. No watchdog reset. The only way to get the uC back is to manually reset it.Digging into this with issue reports etc. I got lost quickly: is this master using lwip or lwip2? If it is using lwip should I try switching to lwip2 and if so how? Is there some sort of timeout or watchdog reset that could happen in this failure case and if so is there an easy way for me to trigger it? I saw a timeout set on the http connect to 8000ms (can't remember where I saw that) but it doesn't appear to be working. The server closes the connection with a partial download and the ESP8266 just sits there, no reset, no nothing, not executing my code not returning from the
ESPhttpUpdate.update
call.Problem description
Settings in IDE
Module: Node 1.0
Flash Size: 4MB/1MB SPIFFS
CPU Frequency: 160Mhz
Flash Mode: DIO
The text was updated successfully, but these errors were encountered: