Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Figure out root cause of spikes in control lag time #46

Closed
alanwells opened this issue Feb 23, 2017 · 12 comments
Closed

Figure out root cause of spikes in control lag time #46

alanwells opened this issue Feb 23, 2017 · 12 comments

Comments

@alanwells
Copy link
Contributor

alanwells commented Feb 23, 2017

This was the main issue I experienced at the Feb 18th track day that prevented me from reliably driving the car around the track for training. It always manifests as an intermittent problem but I am also able to observe it at home, although less frequently than I saw it at the track day.

I'm running a local donkey server over wifi, so 4G latency is not a factor here. On wifi, I'm frequently seeing lag times spike above 1s, sometimes as long as 30 or more seconds. The clues that I've seen so far are:

  1. This seems to happen more frequently in areas of high network congestion (like the track day when everyone was running nearby).
  2. It happens on my home wifi network at least once per minute while driving the donkey car, to varying levels of severity.
  3. I tried an alternate router at home on a non-standard wifi channel, and was not able to reproduce the delays.

Here's a sample console log from the pi that shows the spikes. Lag time of ~0.06 is about normal on my home network.

{"angle": "0", "throttle": "0"}
vehicle <> server: request lag: 0.06897997856140137
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
{"angle": "0", "throttle": "0"}
vehicle <> server: request lag: 0.07510542869567871
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
{"angle": "0", "throttle": "0"}
vehicle <> server: request lag: 0.06453394889831543
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
{"angle": "0", "throttle": "0"}
vehicle <> server: request lag: 0.759141206741333
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
{"angle": "0", "throttle": "0"}
vehicle <> server: request lag: 0.05977487564086914
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
{"angle": "0", "throttle": "0"}
vehicle <> server: request lag: 0.0692141056060791
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
{"angle": "0", "throttle": "0"}
vehicle <> server: request lag: 0.06003284454345703
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
{"angle": "0", "throttle": "0"}
vehicle <> server: request lag: 0.16736602783203125
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
{"angle": "0", "throttle": "0"}
vehicle <> server: request lag: 0.06820440292358398
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
{"angle": "0", "throttle": "0"}
vehicle <> server: request lag: 0.0678567886352539
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
{"angle": "0", "throttle": "0"}
vehicle <> server: request lag: 0.12179088592529297
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
{"angle": "0", "throttle": "0"}
vehicle <> server: request lag: 0.05697226524353027
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
{"angle": "0", "throttle": "0"}
vehicle <> server: request lag: 0.0699162483215332
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
{"angle": "0", "throttle": "0"}
vehicle <> server: request lag: 0.06665158271789551
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
{"angle": "0", "throttle": "0"}
vehicle <> server: request lag: 0.17603182792663574
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
{"angle": "0", "throttle": "0"}
vehicle <> server: request lag: 1.0047976970672607
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
{"angle": "0", "throttle": "0"}
vehicle <> server: request lag: 0.0619354248046875
throttle update: 0.0
pulse: 370
@wroscoe
Copy link
Collaborator

wroscoe commented Feb 23, 2017

Thanks @alanwells. I remember that the trace shows this hangs on the connection creation. Maybe reusing the connection would prevent these large lags. The requests module provides this functionality with sessions. http://docs.python-requests.org/en/master/user/advanced/

@wroscoe
Copy link
Collaborator

wroscoe commented Feb 23, 2017

Pull request #47 adds a potential fix and lag logging.

@wroscoe
Copy link
Collaborator

wroscoe commented Feb 23, 2017

@alanwells could you try #47 and let me know if it solves the issues you're having? Once confirmed I'll merge to master.

@wroscoe wroscoe mentioned this issue Feb 23, 2017
@alanwells
Copy link
Contributor Author

@wroscoe I just tested the latest code on the dev branch. There's a new issue now - throttle and steering controls are reseting back to 0 even with consistent input from the controlling page on the iOS device. The presence of this issue makes it difficult to gauge whether the previous lag was eliminated, as I can't get consistent steering/throttle control long enough to see if the car response is lagging relative to my control input.

This issue happens when I try to control the device from a remote (iPad or iPhone), but does not occur when I control the device from a browser on the laptop that is running the donkey server.

Here's a video (ignore lack of power to wheels here, I had the motor unplugged. the same resetting to zero happens with throttle when motor is plugged in): https://www.dropbox.com/s/unb8pr1pq6v6old/donkey-issue-46.m4v?dl=0

Console log from the pi:

vehicle <> server: request lag: 0.03238105773925781
throttle update: 1.0
pulse: 520
angle: -1.0   throttle: 1.0
vehicle <> server: request lag: 0.033580780029296875
throttle update: 1.0
pulse: 520
angle: -1.0   throttle: 1.0
vehicle <> server: request lag: 0.028684616088867188
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
vehicle <> server: request lag: 0.03644919395446777
throttle update: 1.0
pulse: 520
angle: -1.0   throttle: 1.0
vehicle <> server: request lag: 0.028028011322021484
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
vehicle <> server: request lag: 0.030170202255249023
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
vehicle <> server: request lag: 0.029787778854370117
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
vehicle <> server: request lag: 0.03412437438964844
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
vehicle <> server: request lag: 0.029006242752075195
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
vehicle <> server: request lag: 0.0317385196685791
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
vehicle <> server: request lag: 0.046013593673706055
throttle update: 1.0
pulse: 520
angle: -1.0   throttle: 1.0
vehicle <> server: request lag: 0.03188753128051758
throttle update: 1.0
pulse: 520
angle: -1.0   throttle: 1.0
vehicle <> server: request lag: 0.03586864471435547
throttle update: 1.0
pulse: 520
angle: -1.0   throttle: 1.0
vehicle <> server: request lag: 0.02921605110168457
throttle update: 1.0
pulse: 520
angle: -1.0   throttle: 1.0
vehicle <> server: request lag: 0.028040170669555664
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
vehicle <> server: request lag: 0.05167269706726074
throttle update: 1.0
pulse: 520
angle: -1.0   throttle: 1.0
vehicle <> server: request lag: 0.031673431396484375
throttle update: 1.0
pulse: 520
angle: -1.0   throttle: 1.0
vehicle <> server: request lag: 0.03875923156738281
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
vehicle <> server: request lag: 0.0317845344543457
throttle update: 1.0
pulse: 520
angle: -1.0   throttle: 1.0
vehicle <> server: request lag: 0.03853964805603027
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
vehicle <> server: request lag: 0.05096435546875
throttle update: 1.0
pulse: 520
angle: -1.0   throttle: 1.0
vehicle <> server: request lag: 0.028660297393798828
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
vehicle <> server: request lag: 0.030751466751098633
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
vehicle <> server: request lag: 0.03459978103637695
throttle update: 1.0
pulse: 520
angle: -1.0   throttle: 1.0
vehicle <> server: request lag: 0.033445119857788086
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
vehicle <> server: request lag: 0.030283689498901367
throttle update: 1.0
pulse: 520
angle: -1.0   throttle: 1.0
vehicle <> server: request lag: 0.0296323299407959
throttle update: 1.0
pulse: 520
angle: -1.0   throttle: 1.0
vehicle <> server: request lag: 0.04015350341796875
throttle update: 1.0
pulse: 520
angle: -1.0   throttle: 1.0
vehicle <> server: request lag: 0.03234410285949707
throttle update: 1.0
pulse: 520
angle: -1.0   throttle: 1.0
vehicle <> server: request lag: 0.03404116630554199
throttle update: 1.0
pulse: 520
angle: -1.0   throttle: 1.0
vehicle <> server: request lag: 0.02869892120361328
throttle update: 0.0
pulse: 370
angle: 0.0   throttle: 0.0
vehicle <> server: request lag: 0.03536820411682129

server log:

Drive: user: A: -1   T:1
Drive: user: A: 0   T:0
Control: user: A: 0   T:0
Drive: user: A: -1   T:1
Drive: user: A: 0   T:0
Control: user: A: 0   T:0
Drive: user: A: -1   T:1
Drive: user: A: 0   T:0
Control: user: A: 0   T:0
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Control: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Control: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Control: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Control: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Drive: user: A: 0   T:0
Control: user: A: 0   T:0
Drive: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Control: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Control: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Drive: user: A: 0   T:0
Control: user: A: 0   T:0
Drive: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Control: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Drive: user: A: 0   T:0
Control: user: A: 0   T:0
Drive: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Control: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Drive: user: A: 0   T:0
Control: user: A: 0   T:0
Drive: user: A: -1   T:1
Drive: user: A: 0   T:0
Control: user: A: 0   T:0
Drive: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Control: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Drive: user: A: 0   T:0
Control: user: A: 0   T:0
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Control: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Control: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Control: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Control: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:1
Control: user: A: -1   T:1
Drive: user: A: 0   T:0
Drive: user: A: -1   T:0.9571428571428571
Drive: user: A: 0   T:0
Control: user: A: 0   T:0
Drive: user: A: 0   T:0
Drive: user: A: 0   T:0
Drive: user: A: 0   T:0
Control: user: A: 0   T:0
Drive: user: A: 0   T:0
Control: user: A: 0   T:0

@alanwells
Copy link
Contributor Author

Unfortunately I don't have time to debug further today but I can look into it over the weekend.

@alanwells
Copy link
Contributor Author

Also, it appears to be unrelated to wifi signal strength - the video above was taken when laptop running the server, iOS control device, and the pi were all within 5 feet of my router.

@wroscoe
Copy link
Collaborator

wroscoe commented Feb 24, 2017

Thanks for trying this @alanwells I'm unable to replicate this but it looks like the bug is on the server. The server should be sending continuous values shown as the "Drive:" line in the output. In your output the values coming from the web input are switching between 1/-1 and zero. I'll post here if I think of anything that could be causing this.

@alanwells
Copy link
Contributor Author

alanwells commented Feb 25, 2017

@wroscoe I just tried again and I'm not able to reproduce the resetting to zero issue from this morning. Thinking back on it, I had my PS4 gamepad sitting on my desk and connected to my laptop via bluetooth at the same time I was testing this morning. Is it possible the resetting I saw was a result of competing control inputs (on my laptop, the gamepad was not being held and could have been sending 0 angle, 0 throttle to the open vehicle browser window on the laptop, while I was sending non-zero throttle and steering inputs via the browser page on the iOS device)?

@wroscoe
Copy link
Collaborator

wroscoe commented Feb 25, 2017

@alanwells that is possible. I don't have a game pad controller so I can't test it. I did just do a fairly long test drive and can say that the connection fix seems pretty robust. Here are my lag times running with a Mifi and the timout set to .2 seconds. I'm not sure how the lag ever gets above .2 but --''/-- (or whatever that symbol is)
image

@alanwells
Copy link
Contributor Author

@wroscoe interesting. I'm planning to do some more in depth testing tomorrow morning. I'll let you know what I find.

@alanwells
Copy link
Contributor Author

@wroscoe Here's what I found in testing this morning (running dev branch):

  1. I was able to repro the resetting to zero issue. It happened again when I had a control page open on both my laptop browser and on my iOS device at the same time. As soon as I closed the browser page on the laptop, the problem stopped. I think it was a case of competing control signals, and this morning it happened without a gamepad connected to my laptop so I don't think that's the cause.

  2. After the above issue was addressed by closing the browser issue, I was able to drive the car successfully without any noticeable loss of control on my hotspot. Either the fixes you put in place on the dev branch are working, or the source of my control issues are coming from some other intermittent factor that was present during my testing. I've never been able to reliably drive my car as long as I did this morning without control issues, so I'm inclined to believe that the changes in the dev branch helped.

For comparison to your stats, I also logged my lag times on my home router and my hotspot. I had my timeout set to 0.2 as well. These were measured with the routers positioned within a few feet of the pi and my laptop:

  • Mean lag time on home router (TP-Link Archer C7): 0.030s
  • Mean lag time on hotspot: 0.033s

There's a minor increase in lag time on the hotspot, but probably not enough to be concerned about. When I moved the hotspot and pi about 30 ft and two walls away from my laptop, mean lag increased to 0.037s. Again, probably not enough to be concerned about at this point.

I also measured the failure rates of connections (connection errors and read timeouts). Interestingly, the hotspot had a significantly lower failure rate than my home router:

  • Connection failure rate on TP-Link home router: 7.83%
  • Connection failure rate on hotspot: 1.05%

This surprised me, but it might be because of other traffic on my home network (between computers and mobile devices we have about 10 active devices on our home network, vs just the pi and laptop on the hotspot).

Anyway, I think you can close this issue.

@wroscoe
Copy link
Collaborator

wroscoe commented Feb 25, 2017

@alanwells That makes sense because I added a javascript function to repeatedly send stop signals to the car when the joystick was not moving to keep the car from driving off if it missed the one stop command. This may not be needed now that the throttle is reduced if the connection is lost. This also highlights the fact that the web app is not safe or multiple users to use at the same time.

It's nice to see the low lag times when using a low network. My ~.1 lag times were likely due to going through the internet to AWS.

Closing this issue.

@wroscoe wroscoe closed this as completed Feb 25, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants