~ne02ptzero/libfloat

build: update build.zig for zig 0.14

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/58775

Signed-off-by: Patrik Cyvoct <patrik@ptrk.io>
Signed-off-by: Louis Solofrizzo <louis@ne02ptzero.me>
83b314c4 — Louis Solofrizzo 11 days ago
log: allow the token "max_logs_size" to be disabled

If it is set to "0", the size of each packet is not limited in bytes;
it is only bounded by "max_logs_per_ae".

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/58594

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Saalik Hatia <shatia@scaleway.com>
2e5e03a6 — Louis Solofrizzo a month ago
periodic: ensure elections are retried after a timeout

Ignore the current node state, and execute the gray-failures routines
anyway. We need to ensure that, once the election timeout is reached,
another election is launched, regardless of the state of the last
election (if it has not succeeded, of course).

Ensure the resign is sent to a single node instead of all the nodes,
since sending it to all of them guarantees a non-working election.

Clarified some comments here and there.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/58069

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Florian Florensa <fflorensa@scaleway.com>
Acked-by: Saalik Hatia <shatia@scaleway.com>
084d8147 — Louis Solofrizzo a month ago
periodic: fix dynamodb-like elections

Some bugs were raised on network partitions. In short, we're changing
the election behavior to ensure it is triggered only when needed.

- On a partition, do not trigger an election and simply step down, as
  the election does not do anything (we're partitioned) and might be
  harmful when we're re-introduced to the cluster: our term will be
  bigger than the actual recovered term, thus triggering an election for
  nothing.
- Do not trigger an election if an election is already on-going on
  dynamodb gray failures
- Do not trigger an election if timeout has been reached on
  gray-failures, simply retry
- Changed the majority check to a single node check for leader count: If
  a single node reports a successful leadership, do nothing.
- Wait for at least a majority of responses to ensure a majority of
  nodes is without leader (and none of the other ones have one) before
  triggering an election.
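
The rules above can be read as a small decision function. The following
is a minimal sketch and not the code from this patch: every name in it is
hypothetical, the order in which the rules are checked is an assumption,
and the majority is assumed to be computed over the whole cluster size.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of one gray-failure check. */
enum sketch_action {
    SKETCH_DO_NOTHING,
    SKETCH_STEP_DOWN,
    SKETCH_RETRY,
    SKETCH_WAIT,
    SKETCH_TRIGGER_ELECTION
};

/* Hypothetical snapshot of what the node knows when the check runs. */
struct sketch_check {
    size_t cluster_size;   /* total number of nodes in the cluster */
    size_t responses;      /* nodes that answered the leader check */
    size_t leader_reports; /* answers that report a working leader */
    bool partitioned;      /* we cannot reach the rest of the cluster */
    bool election_ongoing; /* an election is already in progress */
    bool timed_out;        /* the gray-failure check timed out */
};

static enum sketch_action sketch_decide(const struct sketch_check *c)
{
    size_t majority = c->cluster_size / 2 + 1;

    if (c->partitioned)
        return SKETCH_STEP_DOWN;        /* an election would be useless */
    if (c->election_ongoing)
        return SKETCH_DO_NOTHING;       /* never stack elections */
    if (c->leader_reports >= 1)
        return SKETCH_DO_NOTHING;       /* one leader report is enough */
    if (c->responses >= majority)
        return SKETCH_TRIGGER_ELECTION; /* a majority has no leader */
    if (c->timed_out)
        return SKETCH_RETRY;            /* retry instead of electing */
    return SKETCH_WAIT;                 /* not enough answers yet */
}
```

With five nodes, three leaderless answers trigger an election, while a
single answer that reports a leader stops the check.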

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/58069

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Florian Florensa <fflorensa@scaleway.com>
Acked-by: Saalik Hatia <shatia@scaleway.com>
b08dc76d — Louis Solofrizzo a month ago
libfloat: add new callback is_node_online

If implemented, it should return whether a target node is online or not.
If the node is offline, libfloat will not send any log-data to it, but
simply heartbeats until it comes back online.

Updated the core-logic of libfloat_send_append_entries to reflect that
change, and added a small check to not send any data to a node we've
never heard from (yet).
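
As an illustration only, the choice between log-data and heartbeats could
look like the sketch below. The types, the callback signature and the
function names are hypothetical and do not reflect the real libfloat API;
sending only heartbeats to a node we have never heard from is also an
assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical node record. */
struct sketch_node {
    int id;
    bool heard_from; /* true once we received anything from this node */
};

/* Hypothetical shape of the optional callback. */
typedef bool (*sketch_is_node_online_cb)(const struct sketch_node *node);

enum sketch_payload { SKETCH_HEARTBEAT_ONLY, SKETCH_LOG_DATA };

static enum sketch_payload
sketch_choose_payload(const struct sketch_node *node,
                      sketch_is_node_online_cb is_node_online)
{
    /* Never push data to a node we have not heard from yet. */
    if (!node->heard_from)
        return SKETCH_HEARTBEAT_ONLY;

    /* The callback is optional: without it, the node counts as online. */
    if (is_node_online != NULL && !is_node_online(node))
        return SKETCH_HEARTBEAT_ONLY;

    return SKETCH_LOG_DATA;
}

/* Two trivial example callbacks, useful to exercise the sketch. */
static bool sketch_always_online(const struct sketch_node *node)
{
    (void)node;
    return true;
}

static bool sketch_always_offline(const struct sketch_node *node)
{
    (void)node;
    return false;
}
```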

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/58069

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Florian Florensa <fflorensa@scaleway.com>
Acked-by: Saalik Hatia <shatia@scaleway.com>
6bd6f672 — Louis Solofrizzo a month ago
libfloat: add max_logs_size configuration token

This limits the total size of the AppendEntries packet to be sent either
on log-replay or log-sync. This token defaults to 65KB, but the final
packet might be bigger than this:
- First of all, there are headers and overhead that are not counted in
  the total limit
- The limit acts as a send threshold, not an exclusive limit: We can't
  cut a log in half.
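
A minimal sketch of the threshold semantics, assuming a hypothetical
helper that receives the sizes of the pending logs; it is not the code
from this patch. The "0 disables the limit" case follows the later
"log: allow the token "max_logs_size" to be disabled" commit.

```c
#include <stddef.h>

/* Hypothetical helper: count how many pending logs go into one
 * AppendEntries packet. Headers and overhead are not counted. */
static size_t sketch_count_logs_to_send(const size_t *log_sizes,
                                        size_t n_logs,
                                        size_t max_logs_size,
                                        size_t max_logs_per_ae)
{
    size_t total = 0;
    size_t count = 0;

    while (count < n_logs && count < max_logs_per_ae) {
        /* A log is always added whole; it is never cut in half. */
        total += log_sizes[count];
        count++;

        /* Send threshold: stop once the limit is reached or crossed.
         * max_logs_size == 0 means the size limit is disabled. */
        if (max_logs_size != 0 && total >= max_logs_size)
            break;
    }
    return count;
}
```

With three logs of 40000 bytes each and a limit of 65000, two logs are
sent, so the payload (80000 bytes) ends up above the configured limit.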

Updated DEBUG to reflect this new logic.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/58069

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Florian Florensa <fflorensa@scaleway.com>
Acked-by: Saalik Hatia <shatia@scaleway.com>
d0f5b2a4 — Louis Solofrizzo a month ago
elections: force vote reset for every new term

There were some cases where the votes were not reset, potentially
leading to double-leader elections. We now reset our vote and every vote
for every new term, which should fix it.
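
The idea can be sketched as a single term setter that clears every piece
of vote state. The structure and names below are hypothetical and only
illustrate the principle; they are not libfloat's internals.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NO_VOTE   (-1)
#define SKETCH_MAX_NODES 8

/* Hypothetical per-node election state. */
struct sketch_ctx {
    uint64_t term;
    int voted_for;                        /* node id, or SKETCH_NO_VOTE */
    int votes_received[SKETCH_MAX_NODES]; /* 1 if node i voted for us */
    size_t n_nodes;
};

/* Move to a newer term and drop all vote state from the previous one. */
static void sketch_set_term(struct sketch_ctx *ctx, uint64_t new_term)
{
    if (new_term <= ctx->term)
        return; /* not a new term: keep the current votes */

    ctx->term = new_term;
    ctx->voted_for = SKETCH_NO_VOTE;
    for (size_t i = 0; i < ctx->n_nodes && i < SKETCH_MAX_NODES; i++)
        ctx->votes_received[i] = 0;
}
```

Routing every term change through one function makes it harder to forget
the reset on one of the code paths.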

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/58069

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Florian Florensa <fflorensa@scaleway.com>
Acked-by: Saalik Hatia <shatia@scaleway.com>
5102b10b — Louis Solofrizzo a month ago
raft: add warning log callback

And change the log-level of elections from DEBUG to WARNING.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/58069

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Florian Florensa <fflorensa@scaleway.com>
Acked-by: Saalik Hatia <shatia@scaleway.com>
a17da7a3 — Louis Solofrizzo 5 months ago
log: trigger an election when another leader sends us an AE with the same term

This should not be possible. If it happens, something is seriously
wrong, so trigger an election to be safe.
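
A sketch of the detection, assuming the node remembers which leader it
follows for the current term. The names and the state layout are
hypothetical, not taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NO_LEADER (-1)

/* Hypothetical view of the local Raft state. */
struct sketch_state {
    uint64_t term;
    int leader_id; /* leader known for this term, or SKETCH_NO_LEADER */
};

/* Returns true when an AE comes from a different leader than the one
 * we already know for the very same term. */
static bool sketch_ae_conflicting_leader(const struct sketch_state *s,
                                         uint64_t ae_term,
                                         int ae_sender_id)
{
    if (s->leader_id == SKETCH_NO_LEADER)
        return false; /* no leader known yet, nothing to conflict with */
    return ae_term == s->term && ae_sender_id != s->leader_id;
}
```

When the function returns true, the caller would start a new election
rather than accept the entries.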

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/55842

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Julien Egloff <jegloff@scaleway.com>
21ab27a6 — Louis Solofrizzo 7 months ago
node: do not delete my own node from a cluster

Because it will break the ctx->me pointer in the context, and will crash
horribly.
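
A minimal sketch of such a guard, assuming a hypothetical context that
stores its nodes in a small array; only the ctx->me pointer is taken from
the message above, everything else is made up for the example.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_NODES 8

struct sketch_node {
    int id;
};

/* Hypothetical context: 'me' points at one of the entries in 'nodes'. */
struct sketch_ctx {
    struct sketch_node *me;
    struct sketch_node *nodes[SKETCH_MAX_NODES];
    size_t n_nodes;
};

/* Returns true if the node was removed, false if it was not found or
 * if the removal was refused because it targets the local node. */
static bool sketch_del_node(struct sketch_ctx *ctx, int id)
{
    for (size_t i = 0; i < ctx->n_nodes; i++) {
        if (ctx->nodes[i]->id != id)
            continue;
        if (ctx->nodes[i] == ctx->me)
            return false; /* would leave ctx->me dangling */

        /* Unordered removal: move the last entry into the hole. */
        ctx->nodes[i] = ctx->nodes[ctx->n_nodes - 1];
        ctx->n_nodes--;
        return true;
    }
    return false;
}
```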

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/54672

Acked-by: Patrik Cyvoct <patrik@ptrk.io>
Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
87cb99dc — Louis Solofrizzo 1 year, 1 month ago
log: Apply deep-sleep state from heartbeats / leaders instead of computing locally

We're now applying the deep-sleep state from the AEs/Heartbeats from the
leader, instead of each node computing it locally. This way, only the
leader computes it and gossips this state to other nodes via AEs, and
those nodes simply apply it. This solves some issues of desynced
deep-sleep timers seen in production.

I've also reworked the deep-sleep routines a bit, in order to have a
single entry-point for it.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/50314

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Julien Egloff <jegloff@scaleway.com>
Acked-by: Florian Florensa <fflorensa@scaleway.com>

65f9f585 — Louis Solofrizzo 1 year, 3 months ago
log: Only ERROR log on smaller commit_id when it is problematic

Otherwise, DEBUG it. Should generate fewer logs in production / tests.
Also removes the minimum of 20 entries used to compute optimistic
replication.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/47901

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>
77f3b3c9 — Louis Solofrizzo 1 year, 4 months ago
periodic: Change log level from ERROR to DEBUG on gray-failure checks

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/47503

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>
9e7bac3b — Louis Solofrizzo 1 year, 8 months ago
periodic: Trigger an election when a leader has lost its followers

Also fixes some formatting issues in the latest patch.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/42881

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>

raft: add soft snapshot feature

This patch adds a soft snapshot feature: if no logs are received for
soft_compact_time seconds, a snapshot will be made based on the
soft_compact_after_n value.

Signed-off-by: Patrik Cyvoct <patrik@ptrk.io>
27f69f8a — Julien Egloff 1 year, 9 months ago
log, periodic: Ensure libfloat is resistant to clock drifting

By using relative timers instead of absolute ones.
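
One common way to get relative timers is to measure elapsed time on a
monotonic clock instead of comparing wall-clock timestamps. The sketch
below assumes POSIX clock_gettime() with CLOCK_MONOTONIC; the timer
structure and function names are hypothetical, and the patch itself may
do this differently.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Milliseconds from a clock that is not affected by wall-clock jumps. */
static uint64_t sketch_now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (uint64_t)ts.tv_sec * 1000u + (uint64_t)ts.tv_nsec / 1000000u;
}

/* Hypothetical relative timer: a start point and a duration. */
struct sketch_timer {
    uint64_t started_ms;
    uint64_t timeout_ms;
};

static void sketch_timer_arm(struct sketch_timer *t, uint64_t now_ms,
                             uint64_t timeout_ms)
{
    t->started_ms = now_ms;
    t->timeout_ms = timeout_ms;
}

/* Expiry depends only on elapsed time, never on an absolute date. */
static bool sketch_timer_expired(const struct sketch_timer *t,
                                 uint64_t now_ms)
{
    return now_ms - t->started_ms >= t->timeout_ms;
}
```

A caller would arm the timer with sketch_now_ms() and poll
sketch_timer_expired() with a fresh sketch_now_ms() value on each
periodic tick.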

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/42580

Signed-off-by: Julien Egloff <jegloff@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>
Acked-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
a97638c1 — Louis Solofrizzo 1 year, 10 months ago
log: Missing condition logic on append_entries receive

That can lead to stuck snapshots in specific cases.
I've simplified some logic and added some debug here and there.
I've also fixed the snapshotting logic to account for log id errors
that are off by one or more.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/41678

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>
1e1bf074 — Louis Solofrizzo 1 year, 11 months ago
log: Accept snapshots logs if the term is higher than our snapshot term

Rejecting them was blocking replication on some clusters in production.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/41159

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>

c45a1f8a — Louis Solofrizzo 1 year, 11 months ago
compilation: Add a build.zig compilation file

Working as intended, some small fixes in the code to compile with clang
without warnings.

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/41085

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>

4c31834f — Louis Solofrizzo 1 year, 11 months ago
election: Fix follower condition and be less strict on term

Revert "dynamo: Do not discard leader when timeout is reached when using leader-check dynamo like elections"
Fix no_leader count for gray-failures
Add no wake-up if the leader has been recovered from a gray-failure check
Load leader before deep-sleep states in order not to force a wake-up on restarts

Patch: https://lists.sr.ht/~ne02ptzero/libfloat/patches/40645

Signed-off-by: Louis Solofrizzo <lsolofrizzo@scaleway.com>
Acked-by: Patrik Cyvoct <pcyvoct@scaleway.com>