~ne02ptzero/libfloat

a17da7a3 — Louis Solofrizzo 26 days ago master
log: trigger an election when another leader sends us an AE with the same term

This should not be possible; if it happens, something is seriously wrong,
so trigger an election to be safe.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/55842

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Julien Egloff <jegloff@scaleway.com>
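
The rule from this commit can be sketched as a small decision function. This is a hypothetical illustration, not libfloat's code: the enum, the function name, and the convention that a leader id of 0 means "no known leader" are all assumptions. Rejecting a lower term without an election follows standard Raft.

```c
#include <stdint.h>

/* Hypothetical outcome of handling an AppendEntries (AE) message. */
enum ae_action {
    AE_ACCEPT,
    AE_REJECT,
    AE_TRIGGER_ELECTION
};

/*
 * Decide what to do with an AE claiming leadership.
 * known_leader_id == 0 is assumed to mean "no leader known yet".
 */
enum ae_action ae_decide(uint32_t known_leader_id, uint32_t sender_id,
                         uint64_t my_term, uint64_t ae_term)
{
    /* A sender with a lower term is stale: reject, no election. */
    if (ae_term < my_term)
        return AE_REJECT;

    /* Two different leaders in the same term should be impossible. */
    if (ae_term == my_term && known_leader_id != 0 &&
        sender_id != known_leader_id)
        return AE_TRIGGER_ELECTION;

    return AE_ACCEPT;
}
```
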
21ab27a6 — Louis Solofrizzo 3 months ago
node: do not delete my own node from a cluster

Because it will break the ctx->me pointer in the context, and will crash
horribly.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/54672

Acked-by: Patrik Cyvoct <patrik@ptrk.io>
Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
87cb99dc — Louis Solofrizzo 8 months ago
log: Apply deep-sleep state from heartbeats / leaders instead of computing locally

We're now applying the deep-sleep state from the AEs/Heartbeats from the
leader, instead of each node computing it locally. This way, only the
leader computes it and gossips this state to other nodes via AEs, and
those nodes simply apply it. Solves some issues of desynced deep-sleep
timers seen in production.

I've also reworked the deep-sleep routines a bit, in order to have a
single entry-point for it.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/50314

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Julien Egloff <jegloff@scaleway.com>
Acked-by: Florian Florensa <fflorensa@scaleway.com>

 ________________________________________
/ Just because they are called           \
| 'forbidden' transitions does not mean  |
| that they are forbidden. They are less |
| allowed than allowed transitions, if   |
| you see what I mean. -- From a Part 2  |
\ Quantum Mechanics lecture.             /
 ----------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
65f9f585 — Louis Solofrizzo 11 months ago
log: Only ERROR log on smaller commit_id when it is problematic

Otherwise, DEBUG it. Should generate less logs on production / tests.
Also removes the minimum 20 entries used to compute optimistic
replication.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/47901

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>
77f3b3c9 — Louis Solofrizzo 11 months ago
periodic: Change log level from ERROR to DEBUG on gray-failure checks

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/47503

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>
9e7bac3b — Louis Solofrizzo 1 year, 4 months ago
periodic: Trigger an election when a leader has lost its followers

Also fixes some formatting issues in the latest patch.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/42881

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>

 ________________________________________
/ Take what you can use and let the rest \
\ go by. -- Ken Kesey                    /
 ----------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
raft: add soft snapshot feature

This patch adds a soft snapshot feature: if no logs are received for
soft_compact_time seconds, a snapshot will be made based on the
soft_compact_after_n value.

Signed-off-by: Patrik Cyvoct <patrik@ptrk.io>
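
A minimal sketch of the soft snapshot condition follows. The commit message does not say exactly how soft_compact_after_n is used, so this assumes it is the minimum number of uncompacted log entries required before an idle-triggered snapshot. The struct and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical configuration mirroring the two values named above. */
struct soft_compact_conf {
    uint64_t soft_compact_time;    /* idle seconds before a soft snapshot */
    uint64_t soft_compact_after_n; /* assumed: minimum pending entries */
};

/*
 * Return true when the log has been idle long enough and enough entries
 * are pending to make a snapshot worthwhile (assumed semantics).
 */
bool should_soft_snapshot(const struct soft_compact_conf *conf,
                          uint64_t idle_seconds, uint64_t pending_entries)
{
    return idle_seconds >= conf->soft_compact_time &&
           pending_entries >= conf->soft_compact_after_n;
}
```
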
27f69f8a — Julien Egloff 1 year, 4 months ago
log, periodic: Ensure libfloat is resistant to clock drifting

By using relative timers instead of absolute ones.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/42580

Signed-off-by: Julien Egloff <jegloff@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>
Acked-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
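
To illustrate the idea of a relative timer, here is a sketch that accumulates elapsed time passed in by the caller instead of comparing against an absolute wall-clock deadline. The type and function names are hypothetical and are not taken from libfloat. The caller is assumed to derive each delta from a monotonic source, for example clock_gettime(CLOCK_MONOTONIC).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical relative timer: tracks elapsed time, never wall-clock time. */
struct rel_timer {
    uint64_t elapsed_ms; /* time accumulated since the last reset */
    uint64_t timeout_ms; /* duration after which the timer expires */
};

/* Restart the countdown, e.g. after receiving a heartbeat. */
void rel_timer_reset(struct rel_timer *t)
{
    t->elapsed_ms = 0;
}

/* Add the time elapsed since the previous tick; return true once expired. */
bool rel_timer_tick(struct rel_timer *t, uint64_t delta_ms)
{
    t->elapsed_ms += delta_ms;
    return t->elapsed_ms >= t->timeout_ms;
}
```

Because only deltas are accumulated, a jump in the system clock cannot move the expiry point forward or backward.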
a97638c1 — Louis Solofrizzo 1 year, 6 months ago
log: Missing condition logic on append_entries receive

That can lead to stuck snapshots in specific cases.
I've simplified some logic and added some debug logs here and there.
I've also fixed the snapshotting logic to account for log id errors of
one or more.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/41678

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>
1e1bf074 — Louis Solofrizzo 1 year, 6 months ago
log: Accept snapshots logs if the term is higher than our snapshot term

Replication was getting stuck on some clusters in production.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/41159

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>

 _____________________________________
/ "Nature is very un-American. Nature \
| never hurries." -- William George   |
\ Jordan                              /
 -------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
c45a1f8a — Louis Solofrizzo 1 year, 6 months ago
compilation: Add a build.zig compilation file

Working as intended, some small fixes in the code to compile with clang
without warnings.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/41085

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>

 ________________________________________
/ If I had only known, I would have been \
\ a locksmith. -- Albert Einstein        /
 ----------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
4c31834f — Louis Solofrizzo 1 year, 7 months ago
election: Fix follower condition and be less strict on term

Revert "dynamo: Do not discard leader when timeout is reached when using leader-check dynamo like elections"
Fix no_leader count for gray-failures
Add no wake-up if the leader has been recovered from a gray-failure check
Load leader before deep-sleep states in order not to force a wake-up on restarts

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/40645

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>
937fe35f — Louis Solofrizzo 1 year, 7 months ago
dynamo: Do not discard leader when timeout is reached when using leader-check dynamo like elections

Rather, discard the leader when all the nodes respond with a loss and
trigger an election.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/40455

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>
Acked-by: Julien Egloff <jegloff@scaleway.com>

 ________________________________________
/ "What do you give a man who has        \
| everything?" the pretty teenager asked |
| her mother. "Encouragement, dear," she |
\ replied.                               /
 ----------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
4ee65d14 — Louis Solofrizzo 1 year, 7 months ago
persistent: Also save deep-sleep timer to be persistent across restarts

Useful for rolling releases of the deep-sleep feature

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/40428

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>

 ____________________________________
< A well-known friend is a treasure. >
 ------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
68af1737 — Louis Solofrizzo 1 year, 7 months ago
raft: Add optional deep-sleep clusters

This patch adds a new feature: Deep-sleep clusters. The idea is simple:
When a certain time is reached (conf.deep_sleep_time) without any logs
written, the timeout thresholds are raised by two, for a maximum of 4
times. This should lower the heartbeats per second required on large
clusters, but will impact the recovery time of those clusters in case of
a sudden leader-loss.

Also added a persistent state for the leadership and the deep sleep
state; this way, it is reloaded on cluster restart, which can avoid
errors on restarts.

Also added a small fix which avoids the cluster-check routines when an
election is already on-going.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/40300

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by     : Patrik Cyvoct <pcyvoct@scaleway.com>
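
The escalation described above can be sketched as follows. This reads "raised by two" as doubling the timeout at each step, which is an assumption, as are all names here except deep_sleep_time. With that reading, four steps give at most 16 times the base timeout.

```c
#include <stdint.h>

/* The message states a maximum of 4 raises. */
#define DEEP_SLEEP_MAX_LEVEL 4u

/*
 * Move one level deeper once the cluster has been idle for
 * deep_sleep_time seconds, capped at DEEP_SLEEP_MAX_LEVEL.
 */
unsigned deep_sleep_next_level(unsigned level, uint64_t idle_seconds,
                               uint64_t deep_sleep_time)
{
    if (idle_seconds < deep_sleep_time)
        return level;
    return level < DEEP_SLEEP_MAX_LEVEL ? level + 1 : DEEP_SLEEP_MAX_LEVEL;
}

/* Effective timeout, assuming each level doubles the base value. */
uint64_t deep_sleep_timeout(uint64_t base_ms, unsigned level)
{
    if (level > DEEP_SLEEP_MAX_LEVEL)
        level = DEEP_SLEEP_MAX_LEVEL;
    return base_ms << level;
}
```
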

 ____________________________________
/ Xerox never comes up with anything \
\ original.                          /
 ------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
298eb764 — Louis Solofrizzo 1 year, 8 months ago
log: Replace some mallocs with callocs

Prevents reading garbage from uninitialized memory.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/39608

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>
Acked-by: Florian Florensa <fflorensa@scaleway.com>
9b60f790 — Louis Solofrizzo 1 year, 9 months ago
election: Change the dynamo election trigger to a relative majority

We have seen a (rare) issue in production, where an election is never
triggered if 2 out of 5 nodes are unreachable. That's because a node was
waiting for at least 3 answers (5 - 2), not counting itself, to trigger
an election. This is now fixed, as we wait for a relative majority (5 /
2).

Patch : https://lists.sr.ht/~ne02ptzero/libfloat/patches/39359

Signed-off-by : Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by      : Patrik Cyvoct <pcyvoct@scaleway.com>
Acked-by      : Julien Egloff <jegloff@scaleway.com>
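
The arithmetic in this message can be checked with a small sketch. The function names are hypothetical, and generalizing the old rule to "cluster size minus 2" is an assumption drawn from the 5-node example. The new rule uses integer division, so 5 / 2 gives 2.

```c
#include <stdbool.h>
#include <stddef.h>

/* Old rule (assumed generalization): wait for n - 2 answers from peers. */
size_t old_answer_threshold(size_t cluster_size)
{
    return cluster_size > 2 ? cluster_size - 2 : 0;
}

/* New rule: relative majority, using integer division. */
size_t new_answer_threshold(size_t cluster_size)
{
    return cluster_size / 2;
}

/* Answers a node can still collect, not counting itself. */
size_t reachable_peers(size_t cluster_size, size_t unreachable)
{
    return cluster_size - 1 - unreachable;
}

/* An election can start once enough peers have answered. */
bool can_trigger_election(size_t answers, size_t threshold)
{
    return answers >= threshold;
}
```

With 5 nodes and 2 unreachable, a node can collect only 2 answers: that never reaches the old threshold of 3, but it does reach the new threshold of 2.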

 ______________________________________
/ Girls marry for love. Boys marry     \
| because of a chronic irritation that |
| causes them to gravitate in the      |
| direction of objects with certain    |
| curvilinear properties. -- Ashley    |
\ Montagu                              /
 --------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
periodic: force election from followers in case of a total leader loss

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/37524

Acked-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>

Signed-off-by: Patrik Cyvoct <patrik@ptrk.io>
37ff9491 — Louis Solofrizzo 2 years ago
log: Don't trigger an election when an unknown leader has a lower term

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/37045

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by     : Patrik Cyvoct <pcyvoct@scaleway.com>

 _________________________________________
/ The only thing we learn from history is \
| that we learn nothing from history. --  |
| Hegel I know guys can't learn from      |
| yesterday ... Hegel must be taking the  |
| long view. -- John Brunner, "Stand on   |
\ Zanzibar"                               /
 -----------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
c03e69a2 — Louis Solofrizzo 2 years ago
election: Add a possible callback when becoming a follower

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/36566

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by     : Patrik Cyvoct <pcyvoct@scaleway.com>
Acked-by     : Florian Florensa <fflorensa@scaleway.com>

 _________________________________________
/ Soldiers who wish to be a hero Are      \
| practically zero, But those who wish to |
| be civilians, They run into the         |
\ millions.                               /
 -----------------------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||