Use exponential backoff for failed peer heartbeats. #193
macb wants to merge 1 commit into goraft:master
If indeed the backoff is desirable, have you considered placing a limit on the timeout? The concern I have is that a server could be down for arbitrary amounts of time, sending its timeout through the roof. Then, when it came back, it'd be ignored for an unnecessary period of time.
I was thinking about that but wasn't sure what the arbitrary limit should be.

On Thu, Feb 27, 2014 at 7:45 PM, Diego Ongaro notifications@github.com
@macb In etcd-io/etcd#595 I more meant that the logging should back off exponentially. The backoff on this side, if we add any, should be capped at a second or two.
@philips understandable. I had originally looked into logging backoff for failed heartbeats but didn't see a neat way to approach that. @xiangli-cmu had mentioned heartbeat probing back-off as well and it seemed like it'd kill two birds with one stone. A limit definitely makes sense, but I didn't want to do much else without getting feedback from more involved devs.
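One possible way to back off the logging alone, leaving the heartbeat interval untouched, is to log only when the consecutive failure count reaches a power of two. This is just a sketch of that idea, assuming a per-peer failure counter exists; `shouldLogFailure` is a hypothetical name and is not something goraft or etcd provides.

```go
package main

import "fmt"

// shouldLogFailure is a hypothetical helper (not part of goraft or etcd).
// It reports whether the n-th consecutive failure (counting from 1)
// should be logged: only when n is a power of two (1, 2, 4, 8, ...).
func shouldLogFailure(n uint) bool {
	// A power of two has exactly one bit set, so n&(n-1) clears it to zero.
	return n > 0 && n&(n-1) == 0
}

func main() {
	// Simulate 20 consecutive failed heartbeats to one peer.
	for n := uint(1); n <= 20; n++ {
		if shouldLogFailure(n) {
			fmt.Printf("heartbeat to peer failed (%d consecutive failures)\n", n)
		}
	}
}
```

Here the 20 simulated failures produce five log lines (at failures 1, 2, 4, 8, and 16) while the peer is still probed at the normal rate.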
We are re-writing the heartbeat function. I think we can just leave this pull request here for now. |
Noticed @xiangli-cmu mention exponential back-off for peer heartbeats in etcd-io/etcd#595 and thought it might make a good first attempt to contribute.
Feedback would be greatly appreciated.