OTA Update Problem #74

Closed
gschmottlach opened this issue May 17, 2015 · 21 comments

@gschmottlach

I'm trying to use the new OTA feature of Sming, and I've wired up my application to listen on an MQTT topic (e.g. /home/garage/large/cmd) for a command to initiate the OTA. It receives the command, which names two files to update: eagle.flash.bin and eagle.irom0text.bin. The JSON-formatted command looks as follows:

{
    "cmd": "sw_update",
    "args": [
        {
            "offset": 0,
            "url": "http://192.168.0.113:8000/eagle.flash.bin"
        },
        {
            "offset": 36864,
            "url": "http://192.168.0.113:8000/eagle.irom0text.bin"
        }
    ]
}

The offsets are given in decimal; in hex they are 0x0 for eagle.flash.bin and 0x9000 for eagle.irom0text.bin. The software contacts my web server and downloads both files successfully. What I noticed, however, is that it appears to flash only one of the files/segments instead of both. I'm a little confused by HttpFirmwareUpdate::applyUpdate(): it seems to flash only a single item. After writing to flash it reboots and just sits there - dead to the world. Perhaps the included trace will give you some idea of what might be going on, or of how I can troubleshoot this. Any insight into the expected theory of operation would be very helpful as well. Are there limitations I should be aware of? It seems to reflash the same flash region the code is currently executing from, which seems a little suspect, especially since (as I understand it) the ESP8266 executes in place from flash by loading chunks of code into an in-memory (SRAM) cache.
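
As a quick sanity check on the offsets in the command, here is a minimal sketch that converts them to hex and tests sector alignment. The 4 KB sector size is an assumption taken from the 0x1000 steps in the trace's write log, and the helper names are hypothetical (they are not part of Sming).

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Assumption: 4 KB flash sectors, matching the 0x1000 steps in the write log.
constexpr std::uint32_t kSectorSize = 0x1000;

// Hypothetical helper: true if a flash offset starts on a sector boundary.
constexpr bool isSectorAligned(std::uint32_t offset) {
    return offset % kSectorSize == 0;
}

// Hypothetical helper: index of the sector that contains the offset.
constexpr std::uint32_t sectorIndex(std::uint32_t offset) {
    return offset / kSectorSize;
}

// Hypothetical helper: format an offset the way the trace prints it.
std::string toHex(std::uint32_t offset) {
    char buf[16];
    std::snprintf(buf, sizeof(buf), "0x%X", static_cast<unsigned>(offset));
    return buf;
}
```

With the values from the command, 36864 formats as 0x9000 and falls exactly on the start of sector 9, so both offsets are sector aligned.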

Here's the trace:

> MQTT status: MQTT_MSG_PUBLISH (len: 299)
22: 272

/home/garage/large/cmd:
{
    "cmd": "sw_update",
    "args": [
        {
            "offset": 0,
            "url": "http://192.168.0.113:8000/eagle.flash.bin"
        },
        {
            "offset": 36864,
            "url": "http://192.168.0.113:8000/eagle.irom0text.bin"
        }
    ]
}
+TCP connection
sect_first: 40, sect_last: 7c

Software update started successfully
TCP received: 299 bytes
onReadyToSendData: 1
TCP connection send: 41 (41)
TcpClient request completed
onReadyToSendData: 3
onReadyToSendData: 3
onReadyToSendData: 3
onReadyToSendData: 3
TCP sent: 41
onReadyToSendData: 2
onReadyToSendData: 3
Download file:
    (0) http://192.168.0.113:8000/eagle.flash.bin -> 0x0
Download: http://192.168.0.113:8000/eagle.flash.bin
connect to: 192.168.0.113
TcpConnection::connect result:, 0
onReadyToSendData: 3
onReadyToSendData: 3
OnConnected
TCP connected
onReadyToSendData: 0
TCP connection send: 54 (54)
TcpClient request completed
TCP sent: 54
onReadyToSendData: 2
TCP received: 17 bytes
onReadyToSendData: 1
Header pos: 181
Date === Sun, 17 May 2015 17:47:01 GMT
Content-type === application/octet-stream
Content-Length === 32080
Last-Modified === Sun, 17 May 2015 16:41:50 GMT
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
onReadyToSendData: 3
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 145 bytes
onReadyToSendData: 1
TCP received: (null)
TCP connection closing
-TCP connection
jump: 4767
Download file:
    (1) http://192.168.0.113:8000/eagle.irom0text.bin -> 0x9000
Download: http://192.168.0.113:8000/eagle.irom0text.bin
+TCP connection
connect to: 192.168.0.113
TcpConnection::connect result:, 0
onReadyToSendData: 3
OnConnected
TCP connected
onReadyToSendData: 0
TCP connection send: 58 (58)
TcpClient request completed
TCP sent: 58
onReadyToSendData: 2
TCP received: 17 bytes
onReadyToSendData: 1
Header pos: 182
Date === Sun, 17 May 2015 17:47:02 GMT
Content-type === application/octet-stream
Content-Length === 217968
Last-Modified === Sun, 17 May 2015 16:41:50 GMT
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
onReadyToSendData: 3
onReadyToSendData: 3
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
onReadyToSendData: 3
onReadyToSendData: 3
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
onReadyToSendData: 3
onReadyToSendData: 3
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 1460 bytes
onReadyToSendData: 1
TCP received: 614 bytes
onReadyToSendData: 1
TCP received: (null)
TCP connection closing
-TCP connection

Firmware download finished!
         item: 0x0 0x40240000 32097 bytes
         item: 0x9000 0x40249000 217985 bytes
Firmware upgrade started
start write: 0x40000 -> 0x0 254849
write: 0x40000 -> 0x0 (sect: 0), 254849
write: 0x41000 -> 0x1000 (sect: 1), 250753
write: 0x42000 -> 0x2000 (sect: 2), 246657
write: 0x43000 -> 0x3000 (sect: 3), 242561
write: 0x44000 -> 0x4000 (sect: 4), 238465
write: 0x45000 -> 0x5000 (sect: 5), 234369
write: 0x46000 -> 0x6000 (sect: 6), 230273
write: 0x47000 -> 0x7000 (sect: 7), 226177
write: 0x48000 -> 0x8000 (sect: 8), 222081
write: 0x49000 -> 0x9000 (sect: 9), 217985
write: 0x4A000 -> 0xA000 (sect: 10), 213889
write: 0x4B000 -> 0xB000 (sect: 11), 209793
write: 0x4C000 -> 0xC000 (sect: 12), 205697
write: 0x4D000 -> 0xD000 (sect: 13), 201601
write: 0x4E000 -> 0xE000 (sect: 14), 197505
write: 0x4F000 -> 0xF000 (sect: 15), 193409
write: 0x50000 -> 0x10000 (sect: 16), 189313
write: 0x51000 -> 0x11000 (sect: 17), 185217
write: 0x52000 -> 0x12000 (sect: 18), 181121
write: 0x53000 -> 0x13000 (sect: 19), 177025
write: 0x54000 -> 0x14000 (sect: 20), 172929
write: 0x55000 -> 0x15000 (sect: 21), 168833
write: 0x56000 -> 0x16000 (sect: 22), 164737
write: 0x57000 -> 0x17000 (sect: 23), 160641
write: 0x58000 -> 0x18000 (sect: 24), 156545
write: 0x59000 -> 0x19000 (sect: 25), 152449
write: 0x5A000 -> 0x1A000 (sect: 26), 148353
write: 0x5B000 -> 0x1B000 (sect: 27), 144257
write: 0x5C000 -> 0x1C000 (sect: 28), 140161
write: 0x5D000 -> 0x1D000 (sect: 29), 136065
write: 0x5E000 -> 0x1E000 (sect: 30), 131969
write: 0x5F000 -> 0x1F000 (sect: 31), 127873
write: 0x60000 -> 0x20000 (sect: 32), 123777
write: 0x61000 -> 0x21000 (sect: 33), 119681
write: 0x62000 -> 0x22000 (sect: 34), 115585
write: 0x63000 -> 0x23000 (sect: 35), 111489
write: 0x64000 -> 0x24000 (sect: 36), 107393
write: 0x65000 -> 0x25000 (sect: 37), 103297
write: 0x66000 -> 0x26000 (sect: 38), 99201
write: 0x67000 -> 0x27000 (sect: 39), 95105
write: 0x68000 -> 0x28000 (sect: 40), 91009
write: 0x69000 -> 0x29000 (sect: 41), 86913
write: 0x6A000 -> 0x2A000 (sect: 42), 82817
write: 0x6B000 -> 0x2B000 (sect: 43), 78721
write: 0x6C000 -> 0x2C000 (sect: 44), 74625
write: 0x6D000 -> 0x2D000 (sect: 45), 70529
write: 0x6E000 -> 0x2E000 (sect: 46), 66433
write: 0x6F000 -> 0x2F000 (sect: 47), 62337
write: 0x70000 -> 0x30000 (sect: 48), 58241
write: 0x71000 -> 0x31000 (sect: 49), 54145
write: 0x72000 -> 0x32000 (sect: 50), 50049
write: 0x73000 -> 0x33000 (sect: 51), 45953
write: 0x74000 -> 0x34000 (sect: 52), 41857
write: 0x75000 -> 0x35000 (sect: 53), 37761
write: 0x76000 -> 0x36000 (sect: 54), 33665
write: 0x77000 -> 0x37000 (sect: 55), 29569
write: 0x78000 -> 0x38000 (sect: 56), 25473
write: 0x79000 -> 0x39000 (sect: 57), 21377
write: 0x7A000 -> 0x3A000 (sect: 58), 17281
write: 0x7B000 -> 0x3B000 (sect: 59), 13185
write: 0x7C000 -> 0x3C000 (sect: 60), 9089
write: 0x7D000 -> 0x3D000 (sect: 61), 4993
write: 0x7E000 -> 0x3E000 (sect: 62), 897
Firmware upgrade finished

 ets Jan  8 2013,rst cause:2, boot mode:(3,6)

ets_main.c 
@gschmottlach
Author

For additional information, I was able to test the basic functionality using the Basic_AirUpdate example, which downloads the "Hello Sming" program. I also had my program (garagectrl) download the same "Hello Sming" program from your server, and it flashed and restarted correctly. So it would seem there is something about my garagectrl program that prevents it from flashing itself via OTA. My garagectrl program flashes via the serial port without any problems. Are there any program size limitations I should be aware of? To serve the binary files, I just ran "python -m SimpleHTTPServer 8000" at the command line from the firmware directory.

I'm at a bit of a loss at the moment as to why this fails (it appears) only for my application.
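
On the size question, one thing that may be worth checking is whether the combined image fits in the staging area. The sketch below is only arithmetic on numbers from the trace above. It assumes that the line `sect_first: 40, sect_last: 7c` gives the inclusive hexadecimal sector range reserved for staging, which the thread does not confirm, and the helper names are hypothetical.

```cpp
#include <cstdint>

// Assumption: 4 KB flash sectors, matching the 0x1000 steps in the write log.
constexpr std::uint32_t kSectorSize = 0x1000;

// Number of whole sectors needed to hold `bytes`, rounding up.
constexpr std::uint32_t sectorsNeeded(std::uint32_t bytes) {
    return (bytes + kSectorSize - 1) / kSectorSize;
}

// Number of sectors in the inclusive range [first, last].
constexpr std::uint32_t sectorsInRange(std::uint32_t first, std::uint32_t last) {
    return last - first + 1;
}

// Hypothetical check: does an image of `bytes` fit in sectors [first, last]?
constexpr bool fitsInStaging(std::uint32_t bytes, std::uint32_t first, std::uint32_t last) {
    return sectorsNeeded(bytes) <= sectorsInRange(first, last);
}
```

For the 254849 bytes written in the trace this gives 63 sectors needed against 61 in the range 0x40 to 0x7c, which agrees with the write log reading from 0x40000 through 0x7E000. If the assumption about `sect_first`/`sect_last` holds, the smaller "Hello Sming" image would fit while garagectrl would not, but this is a hypothesis to test, not a confirmed cause.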

@robert-nash

I had a similar problem when using Python's SimpleHTTPServer. I think it might have something to do with the headers it sends, which differ from those sent by the server hosting the update in the example. That server is nginx, but I managed to get it to work using CherryPy, so you might want to try a different web server.
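
One detail in the trace from the original post may bear on the header theory: both items are recorded as larger than the Content-Length the server reported, and each download starts with a separate 17-byte TCP segment. The sketch below just reproduces that arithmetic. The idea that the extra bytes are an HTTP status line such as `HTTP/1.0 200 OK` plus CRLF is a guess that the thread does not confirm, and the helper names are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: bytes recorded for an item beyond the declared Content-Length.
constexpr long extraBytes(long recordedSize, long contentLength) {
    return recordedSize - contentLength;
}

// Length of one plausible HTTP/1.0 status line, including the trailing CRLF.
// Assumption: this is what the server sends first; the trace does not show it.
inline std::size_t statusLineLength() {
    return std::string("HTTP/1.0 200 OK\r\n").size();
}
```

Using the trace values, 32097 - 32080 and 217985 - 217968 both come to 17, the same as the length of that status line. If the status line really is being counted as payload, the stored image would be shifted by that many bytes, so comparing the recorded size with Content-Length is a cheap check to run against each web server.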

@gschmottlach
Author

I thought of the same thing and tried webfsd, with the same results. Without additional feedback from the Sming developers, there seem to be a few worrisome issues with their approach. First, most implementations I've run across (including rboot and Espressif's own solution) preserve the concept of maintaining separate bootable images. You never flash into the partition your code is running from, and you always keep one "good" image. Of course, you sacrifice precious flash to maintain one (or more) backup images. Likewise, the code needs to be built for the different partitions (e.g. the start address varies).
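
The separate-image idea can be sketched in a few lines. This is a minimal illustration of the general two-slot scheme, not rboot's or Espressif's actual API; the slot count, the type, and the function names are all hypothetical.

```cpp
// Assumption: exactly two bootable image slots, numbered 0 and 1.
constexpr int kSlotCount = 2;

// Hypothetical plan for one update: where to write, and what stays untouched.
struct UpdatePlan {
    int writeSlot;     // slot that receives the new image
    int fallbackSlot;  // slot that keeps the known-good image
};

// The slot that is not currently running.
constexpr int inactiveSlot(int runningSlot) {
    return (runningSlot + 1) % kSlotCount;
}

// Always write to the inactive slot and keep the running one as the fallback.
constexpr UpdatePlan planUpdate(int runningSlot) {
    return UpdatePlan{inactiveSlot(runningSlot), runningSlot};
}
```

The point of the design is that the running image is never the write target, so a failed or interrupted download still leaves one image that boots.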

Another issue that I can't quite follow is although it downloads two segments, I'm not sure that it's flashing them both (see HttpFirmwareUpdate::applyUpdate()). Hopefully this is just a misunderstanding on my part or I've done something terribly wrong.

I actually like Sming and am using it as the basis for one of my projects. I hope I can end up leveraging the OTA method they provide because it sure would simplify programming my device since it will be installed in a hard-to-reach location.

Thanks for the suggestion . . . I'll investigate some more . . .

@gschmottlach
Author

I tried it again with a simple CherryPy file server. The results were identical. I think there is something more happening here than just an issue with downloading the segments correctly. It works fine with a serial download, but OTA is another story altogether. Hope the Sming developers have some ideas to try ;-)

@shansted
Contributor

I have had some success with this. Sometimes I'll get 2-3 successful flashes before it goes south. Whenever it is successful, item 1 is always at 0x0 0x4023C000. It always seems to be OK the first time after flashing over serial. A failed flash always has a different address, e.g. item: 0x0 0x402434C0 29888 bytes. It also seems that on a failed flash, I always get

Download: http://10.10.254.109:80/fw/eagle.flash.bin
connect to: 10.10.254.109
TcpConnection::connect result:, 0
TCP connection error: -8
Download file:
(0) http://10.10.254.109/fw/eagle.flash.bin -> 0x0
flash: 1076084736
Download: http://10.10.254.109:80/fw/eagle.flash.bin
+TCP connection
connect to: 10.10.254.109
TcpConnection::connect result:, 0
TCP connection error: -8
Download file:

I don't get any connection errors on a successful flash. Doing an fsformat before flashing "seems" to improve the odds of a successful write, but since the flash process does this anyway, I can't see why.

BTW, Sming is awesome. Keep up the good work!!!

@tprochazka

I'm interested in what happens if it fails during the update process. Does it write the downloaded data directly to flash memory? Without any CRC check?

Is it possible to program some minimal code that checks whether new firmware exists and downloads it to a different place in flash, to keep the update mechanism safe?

@anakod
Member

anakod commented May 25, 2015

Another issue that I can't quite follow is although it downloads two segments, I'm not sure that it's flashing them both

It writes the full updated flash region from end to end (from the start of the first item to the end of the last).

I'm interested in what happens if it fails during the update process. Does it write the downloaded data directly to flash memory? Without any CRC check?

  1. It never rewrites the running code during the download (it always writes to a different place). Flashing starts only after the download has finished.
  2. A CRC check could be added, I think. TCP already has a checksum, but an additional check may be a good idea.
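
A minimal sketch of what such an additional check could look like, assuming the expected CRC-32 is delivered alongside the image (for example in the update command). That delivery mechanism and the `imageIsIntact` helper are hypothetical; nothing like this is described as existing in Sming at the time of this thread.

```cpp
#include <cstddef>
#include <cstdint>

// Bitwise CRC-32 using the reflected polynomial 0xEDB88320 (the zlib/Ethernet variant).
std::uint32_t crc32(const std::uint8_t* data, std::size_t length) {
    std::uint32_t crc = 0xFFFFFFFFu;
    for (std::size_t i = 0; i < length; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit) {
            // Shift right; apply the polynomial only when the low bit was set.
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return crc ^ 0xFFFFFFFFu;
}

// Hypothetical gate: only proceed to flashing when the staged image matches.
bool imageIsIntact(const std::uint8_t* image, std::size_t length, std::uint32_t expectedCrc) {
    return crc32(image, length) == expectedCrc;
}
```

The bitwise form is slow but needs no lookup table, which may matter on a device with little RAM. Running it over the staged image before the rewrite starts would catch truncated or shifted downloads that TCP's own checksum cannot see.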

@gschmottlach
Author

It writes the full updated flash region from end to end (from the start of the first item to the end of the last).

What's not entirely clear is where the downloaded image is stored prior to flashing it to the final destination in flash. Also, in applyUpdate() I see this code:

int size = items[items.count() - 1].targetOffset + items[items.count() - 1].size - items[0].targetOffset;
System.applyFirmwareUpdate(items[0].flash - INTERNAL_FLASH_START_ADDRESS, items[0].targetOffset, size);

Is it safe to assume your HttpFirmwareUpdateItem structure just contains the book-keeping information about where the OTA has been temporarily stored prior to actually writing it to flash? Is the OTA image being stored in RAM or in another (I assume unused) portion of flash prior to being written to the final destination?
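
The arithmetic in those two lines can be checked against the trace from the original post. The following is a host-side sketch, not Sming code: `Item` is a hypothetical stand-in for HttpFirmwareUpdateItem with only the three fields used, and 0x40200000 is assumed to be the value of INTERNAL_FLASH_START_ADDRESS (the base of the ESP8266's memory-mapped flash).

```cpp
#include <cstdint>
#include <vector>

// Assumption: value of INTERNAL_FLASH_START_ADDRESS (memory-mapped flash base).
constexpr std::uint32_t kFlashBase = 0x40200000;

// Hypothetical stand-in for HttpFirmwareUpdateItem.
struct Item {
    std::uint32_t targetOffset;  // final destination offset in flash
    std::uint32_t flash;         // mapped address where the download was staged
    std::uint32_t size;          // bytes recorded for this item
};

// Arguments that the quoted code would pass to applyFirmwareUpdate().
struct WritePlan {
    std::uint32_t source;
    std::uint32_t target;
    std::uint32_t size;
};

// Mirrors the two quoted lines: one span from the first item to the end of the last.
WritePlan planWrite(const std::vector<Item>& items) {
    const Item& first = items.front();
    const Item& last = items.back();
    return WritePlan{first.flash - kFlashBase, first.targetOffset,
                     last.targetOffset + last.size - first.targetOffset};
}
```

With the two items from the trace (0x0 at 0x40240000, 32097 bytes; 0x9000 at 0x40249000, 217985 bytes) this yields source 0x40000, target 0x0, and size 254849, which matches the `start write: 0x40000 -> 0x0 254849` line. So a single call appears to cover both files, provided they were staged the same distance apart as their target offsets.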

As I understand it, .irom0.text is run from memory-mapped SPI flash. What also concerns me is how that region is updated via your OTA mechanism without disturbing the code that is actively running there. Your algorithm doesn't appear to maintain separate, distinct update "slots" that are updated via OTA, with a boot-loader configured to run the downloaded image on a successful update. Likewise, I don't see any mention of separate linker files to build the application for one (or more) distinct .irom0.text regions.

Basically, a more detailed explanation of how the OTA mechanism works and is implemented would be extremely helpful. For my situation, I don't believe my image is being corrupted during the download. I believe there is something more fundamentally broken with the current OTA strategy barring additional details on exactly how it works and any inherent limitations.

I've been following the development of the rboot OTA framework and it appears to be a very reasonable approach to this problem. Additionally, the author has offered a good deal of information on how the mechanism works, limitations, and integration examples. Likewise, in this post he indicates that he has successfully integrated it into your Sming framework with minor changes. Perhaps it might be to your advantage to investigate his approach and perhaps offer it as a replacement (or alternative) mechanism for supporting OTA updates? It seems his solution already offers some of the features that yours currently lacks.

Again, I appreciate the tremendous effort you've put into Sming and am generally pleased with the framework and its ongoing development. It just seems the current Sming OTA strategy, as it stands today, may have some fundamental limitations that preclude its use as a general-purpose (and reliable) OTA solution.

@anakod
Member

anakod commented May 25, 2015

Basically, a more detailed explanation of how the OTA mechanism works and is implemented would be extremely helpful.

I'm very sorry, I really have limited time to write documentation for my code. I understand that this isn't right, but it's too hard to develop everything and write the help as well. If somebody can help with any part of the documentation, that would be great.

Currently it saves the downloaded image in the filesystem space (overwriting the files there). If you have a large flash, it can be stored in the free space instead. After that, the code that performs the flashing is located in IRAM, so it can keep working while IROM is being rewritten.

I've been following the development of the rboot OTA framework and it appears to be a very reasonable approach to this problem.

Why not? If somebody helps with it, we can move to the more powerful rboot solution.

@gschmottlach
Author

I understand how difficult it can be to write the code and then find spare cycles to document it. It can be the least enjoyable part of the process, but unfortunately it is one of the more critical parts of developing a community. At least you do provide sample projects, which help immensely.

Perhaps Richard might share the changes he made to your linker script to get your examples to work with his OTA framework. I'll probably begin an investigation myself because I could see it as an extremely useful feature . . . one that has to be robust in order to really be useful.

@anakod
Member

anakod commented May 25, 2015

Yes, I'm always trying to provide examples for every developed class.

Thank you very much, please keep me informed. Also I'm ready to answer any questions.

@gschmottlach
Author

You might be very interested in this:

http://www.esp8266.com/viewtopic.php?p=18403&sid=2689c2c483e75ce37b6e88cd9fa7975c#p18403

Richard describes how he integrated rboot into your linker/makefiles for the simple Blink project.

@anakod
Member

anakod commented Jun 4, 2015

The OTA core was updated, please check again.
Also, don't forget to reboot the chip once after flashing (before OTA). After that, auto-reboot will start working.

@gschmottlach
Author

I was looking through the commits and it wasn't clear what changed to address this problem. Can you give me any details on the potential "fix"? Are you saying you've updated to the latest SDK and, since Sming partially relies on the SDK OTA update mechanism, this might have "fixed" the issue? I'm a bit confused.

Also, I went ahead and implemented the rboot boot-loader (see above) in my own project, based on Richard's changes to your Blink example project. It required a few more tweaks to his Makefile, but I've successfully integrated it into my own Sming-based project and it works like a charm. I have two "rom" slots that I can switch between and optionally update. It feels a little safer to only update the slot that is not currently being used. The only issue I haven't figured out is how to deal with a spiffs image. My project doesn't use the filesystem, so I safely omitted that option. I suppose the spiffs images could be stored in an extra flash slot if the flash is large enough (often the case, since the ESP8266 can only map 1 MB of the flash for irom0text and an ESP-12 typically comes with 4 MB . . . so 512 KB for each "rom/program" slot still leaves you with 3 MB untouched and potentially useful for your spiffs filesystem). Have you looked any further into the rboot approach?
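
The flash budget described above works out as follows. This is a rough sketch using only the figures in the comment (4 MB total, two 512 KB slots); it ignores the boot-loader and any SDK configuration sectors, so the real usable space would be somewhat smaller. The function name is hypothetical.

```cpp
#include <cstdint>

constexpr std::uint32_t kKiB = 1024;
constexpr std::uint32_t kMiB = 1024 * kKiB;

// Flash left over after reserving `slotCount` program slots of `slotSize` bytes each.
// Simplification: boot-loader and configuration sectors are not accounted for.
constexpr std::uint32_t flashLeftForFilesystem(std::uint32_t totalFlash,
                                               std::uint32_t slotSize,
                                               std::uint32_t slotCount) {
    return totalFlash - slotSize * slotCount;
}
```

For a 4 MB part with two 512 KB slots, `flashLeftForFilesystem(4 * kMiB, 512 * kKiB, 2)` comes to 3 MB, matching the estimate in the comment.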

@anakod
Member

anakod commented Jun 4, 2015

@gschmottlach They are mostly small fixes; the most important one is checking for a correct download state before flashing.

I think that when you wrote:

Anyway, after writing to flash it re-boots and just sits there - dead to the world

You just missed rebooting the chip after flashing, because when the code is auto-started after flashing over serial, it can't reboot automatically. But if you reboot it once after flashing, it can auto-reboot when needed, and OTA can finish automatically.

So if you test Sming OTA and reply here, that would be very good. If you share your experience with rboot, that would be great too. Thank you very much.

@anakod
Member

anakod commented Jun 4, 2015

About the latest changes: WDT switching was also added. It should solve the unexpected restart problem.

@gschmottlach
Author

Back when I was investigating Sming's native OTA feature, I rebooted the chip (manually) after the OTA update. If I recall, the problem was that it wouldn't boot at all and ended up with an exception on power-up. I tried this several times. I was able, however, to successfully build and run your OTA example project, and that worked correctly - just not with my project. Unfortunately, I ripped the code out of my original project and moved on to the 'rboot' OTA implementation, which seems to work well for my use case. As I mentioned, it just seems "safer" to have a backup image available and not to update the image that is currently executing. It seems like it might be tricky to ensure the OTA code doesn't try to access any functions in the memory-mapped irom.text section of the flash during the update.

My experiences with 'rboot' have been positive so far. The alternative Makefile that Richard provided in the ESP8266.com posting (referenced above) was fairly complete and only needed a few tweaks to support building '*.c' files. Also, I had to add a few targets to build rboot.bin and use his updated linker files, which produce the two rom0/rom1 binary images. Finally, his rboot-ota.c code had a small bug that prevented the binaries from being downloaded when using a simple Python server. I could certainly share the pertinent Makefiles and code changes as a zip/tgz archive if you're really interested in investigating it for Sming. Eventually I'll probably make my project public, but it's still a work in progress - not quite ready for prime time.

If you have any specific questions, let me know. With work chewing up most of my free time, it's been tough to focus on this little side project. Unfortunately, I can't quickly repeat my earlier experiments with the Sming OTA solution.

@gschmottlach
Author

You might be interested in this bugfix if you consider investigating rboot's OTA:

raburton/esp8266#5

@piperpilot
Contributor

@gschmottlach I'd be interested in seeing your changes. I am getting ready to implement OTA, and rboot does seem like a better approach. For my application, I can't risk something getting corrupted as my users won't really have the knowledge to re-flash via serial. Any info you can provide is appreciated.

@gschmottlach
Author

Hi Curtis -

Sorry for not getting back sooner, but I was out of town. I haven't updated my Makefile lately to match the changes they continue to make to the general Sming Makefile, so it's highly likely I'll have to tweak a few things. I intend (at some point) to update my project to the latest Sming release and merge any changes. In the interim, you might want to check out Richard's original integration into the Sming Blink example (see http://www.esp8266.com/viewtopic.php?p=18403&sid=2689c2c483e75ce37b6e88cd9fa7975c#p18403). I started with his instructions (and modified Makefile) and then tweaked it as necessary to fit my project's needs (e.g. library dependencies etc.). Also, I added code to my application to do a DNS lookup of an MQTT broker that publishes a (JSON-formatted) topic containing information on where to fetch the OTA binaries. From there you feed the URL to Richard's library and it fetches the binary, flashes it, and reboots. Not too tricky, depending on your background. If it was in a public repo I'd share it with you, but I still want to do some cleanup before making it generally available. Let me know if you need more details . . .

@hreintke
Contributor

Closing: New OTA update using rBoot is failsafe and the recommended way for OTA.
