publish error: etcdserver: request timed out #8526

Closed
featheryus opened this issue Sep 8, 2017 · 5 comments

@featheryus

featheryus commented Sep 8, 2017

etcd keeps reporting "publish error: etcdserver: request timed out, possibly due to previous leader failure".

We have three instances: mn-0, mn-1, and sn-2.
mn-0 was the leader at first; it then rebooted and mn-1 became leader.
When mn-0 started up again, it became a follower:
Aug 30 08:35:59.883639 mn-0 etcd[885]: 4bc7141c11bf71da became follower at term 5
Aug 30 08:35:59.883662 mn-0 etcd[885]: newRaft 4bc7141c11bf71da [peers: [], term: 5, commit: 9754, applied: 0, lastindex: 9754, lastterm: 5]
Aug 30 08:36:00.078833 mn-0 etcd[885]: starting server... [version: 3.1.4, cluster version: to_be_decided]
Aug 30 08:36:00.238346 mn-0 etcd[885]: 4bc7141c11bf71da is starting a new election at term 5

Why is the peers list empty?
After that, it keeps reporting errors like the one below:
Aug 30 08:40:00.397358 mn-0 etcd[885]: got unexpected response error (etcdserver: request timed out) [merged 4 repeated lines in 1.57s]

The other node, mn-1, also reports timeout errors:

Aug 30 08:36:49.442004 mn-1 etcd[870]: the clock difference against peer 33a67dbe91f4c91e is too high [1.32294022s > 1s]

Aug 30 08:36:52.513422 mn-1 etcd[870]: publish error: etcdserver: request timed out, possibly due to previous leader failure
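
To track how far the clocks drift over time, the clock-difference warning can be pulled out of the journal with a small script. This is only a sketch: the function name `parse_clock_drift` is made up for illustration, and the pattern assumes both values are printed in plain seconds, as in the line above.

```python
import re

# Pattern modelled on the warning quoted in this issue. Assumption: both the
# measured difference and the limit are printed in seconds ("1.32294022s > 1s").
CLOCK_DRIFT = re.compile(
    r"the clock difference against peer (?P<peer>[0-9a-f]+) is too high "
    r"\[(?P<drift>[\d.]+)s > (?P<limit>[\d.]+)s\]"
)


def parse_clock_drift(line):
    """Return (peer_id, drift_seconds, limit_seconds), or None if no match."""
    match = CLOCK_DRIFT.search(line)
    if match is None:
        return None
    return (
        match.group("peer"),
        float(match.group("drift")),
        float(match.group("limit")),
    )


if __name__ == "__main__":
    sample = (
        "Aug 30 08:36:49.442004 mn-1 etcd[870]: the clock difference against "
        "peer 33a67dbe91f4c91e is too high [1.32294022s > 1s]"
    )
    print(parse_clock_drift(sample))
```

Running it over the full journal for each node would show whether the drift was constant or only appeared around the reboot.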

The errors continued until half an hour later, when mn-1 started an election and became leader.
The system then recovered:
Aug 30 09:07:27.721180 mn-1 etcd[863]: 3465edf29beeba8f became follower at term 6
.....
Aug 30 09:07:30.121719 mn-1 etcd[863]: 3465edf29beeba8f is starting a new election at term 7
Aug 30 09:07:30.121767 mn-1 etcd[863]: 3465edf29beeba8f became candidate at term 8
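
Since the interesting part is who held which role at which term, a rough sketch like the following can turn the logs from all three nodes into one timeline. The function name `role_changes` is hypothetical, and the pattern assumes the journald-style prefix seen in the lines quoted here (month, day, time, host, `etcd[pid]:`).

```python
import re

# Matches lines such as:
#   Aug 30 08:35:59.883639 mn-0 etcd[885]: 4bc7141c11bf71da became follower at term 5
# Assumption: every line carries the same timestamp/host prefix as in this issue.
ROLE_CHANGE = re.compile(
    r"^(?P<time>\w{3} +\d+ [\d:.]+) (?P<host>\S+) etcd\[\d+\]: "
    r"(?P<member>[0-9a-f]+) became (?P<role>follower|candidate|leader) "
    r"at term (?P<term>\d+)"
)


def role_changes(lines):
    """Collect (time, host, member_id, role, term) for each role change."""
    events = []
    for line in lines:
        match = ROLE_CHANGE.match(line.strip())
        if match:
            events.append((
                match.group("time"),
                match.group("host"),
                match.group("member"),
                match.group("role"),
                int(match.group("term")),
            ))
    return events


if __name__ == "__main__":
    sample = [
        "Aug 30 08:35:59.883639 mn-0 etcd[885]: 4bc7141c11bf71da became follower at term 5",
        "Aug 30 08:36:00.238346 mn-0 etcd[885]: 4bc7141c11bf71da is starting a new election at term 5",
        "Aug 30 09:07:30.121767 mn-1 etcd[863]: 3465edf29beeba8f became candidate at term 8",
    ]
    for event in role_changes(sample):
        print(event)
```

Lines that do not describe a role change, such as "is starting a new election", are skipped, so the output only lists the follower, candidate, and leader transitions.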

Can you help check why it keeps timing out?

Thanks.

@gyuho
Contributor

gyuho commented Sep 8, 2017

Can you provide other log lines? (e.g. Warnings, became leader, ...)

@jiaxuanzhou
Contributor

jiaxuanzhou commented Sep 9, 2017

@featheryus this kind of error may be caused by poor disk performance. Could you provide some IOPS data from iostat -x 1, or the value of wa in top?
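
If iostat output is hard to capture at the right moment, a rough probe of fsync latency on the disk that holds the etcd data directory can give a similar hint. The sketch below is not part of etcd and does not replace iostat; the function name `measure_fsync_latencies` and the default sizes are made up for illustration, and it assumes that slow fsync is the symptom worth checking.

```python
import os
import tempfile
import time


def measure_fsync_latencies(directory, rounds=20, payload_size=4096):
    """Append small payloads to a temp file and time each fsync (seconds)."""
    latencies = []
    payload = b"\0" * payload_size
    # The temp file is created inside `directory` so the probe hits that disk.
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(rounds):
            os.write(fd, payload)
            start = time.perf_counter()
            os.fsync(fd)  # force the write down to the device
            latencies.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.remove(path)
    return latencies


if __name__ == "__main__":
    # Assumption: replace the temp dir with the directory etcd writes to.
    samples = sorted(measure_fsync_latencies(tempfile.gettempdir()))
    print("min  %.6f s" % samples[0])
    print("median %.6f s" % samples[len(samples) // 2])
    print("max  %.6f s" % samples[-1])
```

Running it periodically (for example from cron) on each node would leave a record to compare against the time window of the next reproduction.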

@featheryus
Author

@gyuho @jiaxuanzhou
Thank you very much for your support.
I'll attach the full log later, but we can't get I/O performance data until the issue reproduces again.
I have another question: this error lasted more than half an hour, and mn-1 restarted 7-8 times. Why did it suddenly recover at 09:07:27.721180?
No conditions seem to have changed.
Is it possible that the I/O was busy for the whole half hour (08:35:59-09:07:27) and only recovered at 09:07:27?
Thanks.

@gyuho
Contributor

gyuho commented Sep 12, 2017

@featheryus Ping?

@gyuho
Contributor

gyuho commented Sep 20, 2017

Closing. Please reopen when there are full logs with reproducible steps.
