# Distributed Systems Simulator v1.1.0 released

> DRAFT - Not yet published

This blog post explores the Java-based Distributed Systems Simulator I created as my diploma thesis at the Aachen University of Applied Sciences (August 2008). The simulator offers both built-in implementations of common distributed systems algorithms and an extensible framework that allows researchers and practitioners to implement and test their own custom protocols within the simulation environment.

=> https://codeberg.org/snonux/ds-sim ds-sim on Codeberg (modernized, English-translated version)

=> ./distributed-systems-simulator/ds-sim-screenshot.png Screenshot: The Distributed Systems Simulator running a Broadcast protocol simulation with 6 processes. The visualization shows message lines between process bars, with blue indicating delivered messages and green indicating messages still in transit.

## Table of Contents

* ⇢ Distributed Systems Simulator v1.1.0 released
* ⇢ ⇢ Motivation
* ⇢ ⇢ Installation
* ⇢ ⇢ Fundamentals
* ⇢ ⇢ ⇢ Client/Server Model
* ⇢ ⇢ ⇢ Processes and Their Roles
* ⇢ ⇢ ⇢ Messages
* ⇢ ⇢ ⇢ Local and Global Clocks
* ⇢ ⇢ ⇢ Events
* ⇢ ⇢ ⇢ Protocols
* ⇢ ⇢ Graphical User Interface (GUI)
* ⇢ ⇢ ⇢ Simple Mode
* ⇢ ⇢ ⇢ The Menu Bar
* ⇢ ⇢ ⇢ The Toolbar
* ⇢ ⇢ ⇢ The Visualization
* ⇢ ⇢ ⇢ Color Differentiation
* ⇢ ⇢ ⇢ The Sidebar
* ⇢ ⇢ ⇢ The Log Window
* ⇢ ⇢ ⇢ Expert Mode
* ⇢ ⇢ ⇢ Configuration Settings
* ⇢ ⇢ Protocols and Examples
* ⇢ ⇢ ⇢ Dummy Protocol
* ⇢ ⇢ ⇢ Ping-Pong Protocol
* ⇢ ⇢ ⇢ Broadcast Protocol
* ⇢ ⇢ ⇢ Internal Synchronization Protocol
* ⇢ ⇢ ⇢ Cristian's Method (External Synchronization)
* ⇢ ⇢ ⇢ Berkeley Algorithm
* ⇢ ⇢ ⇢ One-Phase Commit Protocol
* ⇢ ⇢ ⇢ Two-Phase Commit Protocol
* ⇢ ⇢ ⇢ Basic Multicast Protocol
* ⇢ ⇢ ⇢ Reliable Multicast Protocol
* ⇢ ⇢ Additional Examples
* ⇢ ⇢ ⇢ Lamport and Vector Timestamps
* ⇢ ⇢ ⇢ Simulating Slow Connections
* ⇢ ⇢ ⇢ Raft Consensus Failover
* ⇢ ⇢ Protocol API
* ⇢ ⇢ ⇢ Class Hierarchy
* ⇢ ⇢ ⇢ Implementing a Custom Protocol
* ⇢ ⇢ ⇢ Available API Methods
* ⇢ ⇢ ⇢ Example: Reliable Multicast Implementation
* ⇢ ⇢ Project Statistics

## Motivation

Distributed systems are notoriously complex, with intricate interactions between multiple nodes, network partitions, and failure scenarios that can be difficult to understand and debug in production environments. A distributed systems simulator provides an invaluable learning tool that allows developers and students to experiment with different architectures, observe how systems behave under various failure conditions, and gain hands-on experience with concepts like consensus algorithms, replication strategies, and fault tolerance -- all within a controlled, repeatable environment. By abstracting away the operational overhead of managing real distributed infrastructure, simulators enable focused exploration of system design principles and help bridge the gap between theoretical knowledge and practical understanding of how distributed systems actually work in the real world.

In the literature, one can find many different definitions of a distributed system. Many of these definitions differ from each other, making it difficult to find a single definition that stands alone as the correct one. Andrew Tanenbaum and Maarten van Steen chose the following loose characterization for describing a distributed system:

> "A distributed system is a collection of independent computers that appears to its users as a single coherent system" -- Andrew Tanenbaum and Maarten van Steen

The user only needs to interact with the local computer in front of them, while the software of the local computer ensures smooth communication with the other participating computers in the distributed system.

This thesis aims to make it easier for users to view distributed systems from a different perspective. Here, the viewpoint of an end user is not adopted; instead, the functional methods of protocols and their processes in distributed systems should be made comprehensible, while simultaneously making all relevant events of a distributed system transparent.

To achieve this goal, a simulator was developed, particularly for teaching and learning purposes at the University of Applied Sciences Aachen. With the simulator, protocols from distributed systems with their most important influencing factors can be replicated through simulations. At the same time, there is ample room for personal experiments, with no restriction to a fixed number of protocols. It is therefore important that users are enabled to design their own protocols.

The original simulator (VS-Sim) was written in Java 6 in 2008 with a German-language UI. In 2025, I revamped and modernized it as ds-sim: The entire codebase and UI were translated from German to English. The build system was migrated from hand-rolled Ant scripts to Maven. The Java baseline was upgraded from Java 6 to Java 21, adopting modern language features such as sealed class hierarchies, record types, formatted strings, and pattern matching. A proper exception hierarchy and consistent error handling were introduced. Comprehensive Javadoc documentation was added to all public APIs. A headless testing framework was implemented, bringing the project to 208 unit tests covering core components, the event system, and all protocol implementations. The project structure was reorganized to follow standard Maven conventions, and architecture documentation was added. In total, the modernization touched 199 files with over 15,000 lines of new code. Back in 2008, I wrote every single line by hand using the Vim editor. For the 2025 modernization, I could rely on Claude Code for most of the heavy lifting -- the translation, the refactoring, the test generation, the documentation. It is insane how times have changed.

## Installation

The modernized ds-sim requires Java 21 or higher and Maven 3.8 or higher.

```
# Clone the repository
git clone https://codeberg.org/snonux/ds-sim.git
cd ds-sim

# Set JAVA_HOME if needed (e.g. on Fedora Linux)
export JAVA_HOME=/usr/lib/jvm/java-21-openjdk

# Build the project
mvn clean package

# Run the simulator
java -jar target/ds-sim-*.jar
```

For a faster development build without running tests:

```
mvn package -DskipTests
```

After building, the following artifacts are available in the `target/` directory:

* `ds-sim-1.1.0.jar` - Executable JAR with all dependencies bundled
* `original-ds-sim-1.1.0.jar` - JAR without dependencies

The project also includes 208 unit tests that can be run with `mvn test`. Example simulation files for all built-in protocols are included in the `saved-simulations/` directory.

=> https://codeberg.org/snonux/ds-sim ds-sim source code on Codeberg

## Fundamentals

For basic understanding, some fundamentals are explained below. A deeper exploration will follow in later chapters.

### Client/Server Model

```
+-----------------------------------------+
|                                         |
|   +--------+         +--------+         |
|   | Client |<------->| Server |         |
|   +--------+         +--------+         |
|                                         |
|       Sending of Messages               |
|                                         |
+-----------------------------------------+

Figure 1.1: Client/Server Model
```

The simulator is based on the client/server principle. Each simulation typically consists of a participating client and a server that communicate with each other via messages (see Fig. 1.1). In complex simulations, multiple clients and/or servers can also participate.

### Processes and Their Roles

A distributed system is simulated using processes. Each process takes on one or more roles. For example, one process can take on the role of a client and another process the role of a server. The possibility of assigning both client and server roles to a process simultaneously is also provided. A process could also take on the roles of multiple servers and clients simultaneously. To identify a process, each one has a unique Process Identification Number (PID).
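One way to picture a process holding several roles at once -- purely illustrative, these are not the simulator's actual API types:

```
import java.util.List;

public class ProcessRolesSketch {
    // Hypothetical role model: a process holds any mix of client/server roles.
    sealed interface Role permits ClientRole, ServerRole {}
    record ClientRole(String protocol) implements Role {}
    record ServerRole(String protocol) implements Role {}
    record SimProcess(int pid, List<Role> roles) {}

    public static void main(String[] args) {
        // A process acting as server for protocol "A" and client for protocol "B".
        SimProcess p1 = new SimProcess(1, List.of(
            new ServerRole("A"), new ClientRole("B")));
        System.out.println("PID " + p1.pid() + " has " + p1.roles().size() + " roles");
    }
}
```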

### Messages

In a distributed system, it must be possible to send messages. A message can be sent by a client or server process and can have any number of recipients. The content of a message depends on the protocol used. What is meant by a protocol will be covered later. To identify messages, each one has a unique Message Identification Number (NID).

### Local and Global Clocks

In a simulation, there is exactly one global clock. It represents the current and always correct time. A global clock never goes wrong.

Additionally, each participating process has its own local clock, which represents the current time of that process. Unlike the global clock, local clocks can display an incorrect time. If a process time differs from the global time, the clock was either reset during the simulation or is running incorrectly due to clock drift. The clock drift indicates by what factor the clock runs fast or slow. This will be discussed in more detail later.

```
+---------------------+     +---------------------+
|    Process 1        |     |    Process 2        |
|                     |     |                     |
| +-----------------+ |     | +-----------------+ |
| |Server Protocol A| |     | |Client Protocol A| |
| +-----------------+ |     | +-----------------+ |
|                     |     |                     |
| +-----------------+ |     +---------------------+
| |Client Protocol B| |
| +-----------------+ |     +---------------------+
|                     |     |    Process 3        |
+---------------------+     |                     |
                            | +-----------------+ |
                            | |Server Protocol B| |
                            | +-----------------+ |
                            |                     |
                            +---------------------+

Figure 1.2: Client/Server Protocols
```

In addition to normal clocks, vector timestamps and Lamport's logical clocks are also of interest. For vector and Lamport times, there are no global equivalents here, unlike normal time. Concrete examples of Lamport and vector times will be covered later in the "Additional Examples" section.

### Events

A simulation consists of the sequential execution of finitely many events. For example, there can be an event that causes a process to send a message. A process crash event would also be conceivable. Each event occurs at a specific point in time. Events with the same occurrence time are executed directly one after another by the simulator. This does not hinder the simulator's users, however, as from their perspective such events appear to execute in parallel.

Two main types of events are distinguished: programmable events and non-programmable events. Programmable events can be programmed and edited in the event editor, and their occurrence times depend on the local process clocks or the global clock. Non-programmable events, on the other hand, cannot be programmed in the event editor and do not occur because of a specific time, but due to other circumstances such as:

* Message receive events: Triggered when a message arrives at a recipient process
* Protocol schedule events (alarms): Triggered by a timer set by a protocol, e.g. for retransmission timeouts
* Random events: Such as random process crashes based on configured crash probability
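This sequential, time-ordered execution can be pictured as a priority queue of events keyed by occurrence time. The sketch below is only illustrative -- the Event record and all names are hypothetical, not the simulator's actual event engine:

```
import java.util.Comparator;
import java.util.PriorityQueue;

public class EventLoopSketch {
    // Hypothetical event: an occurrence time in ms plus an action to run.
    record Event(long timeMs, Runnable action) {}

    public static void main(String[] args) {
        // Events are ordered by occurrence time; equal times run back to back.
        PriorityQueue<Event> queue =
            new PriorityQueue<>(Comparator.comparingLong(Event::timeMs));

        queue.add(new Event(5000, () -> System.out.println("5000ms: crash event")));
        queue.add(new Event(0, () -> System.out.println("0ms: send message")));
        queue.add(new Event(0, () -> System.out.println("0ms: activate client")));

        while (!queue.isEmpty()) {
            queue.poll().action().run(); // executes in time order
        }
    }
}
```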

### Protocols

A simulation also consists of the application of protocols. It has already been mentioned that a process can take on the roles of servers and/or clients. For each server and client role, the associated protocol must also be specified. A protocol defines how a client and a server send messages, and how they react when a message arrives. A protocol also determines what data is contained in a message. A process only processes a received message if it understands the respective protocol.

In Figure 1.2, 3 processes are shown. Process 1 supports protocol "A" on the server side and protocol "B" on the client side. Process 2 supports protocol "A" on the client side and Process 3 supports protocol "B" on the server side. This means that Process 1 can communicate with Process 2 via protocol "A" and with Process 3 via protocol "B". Processes 2 and 3 are incompatible with each other and cannot process messages received from each other.

Clients cannot communicate with clients, and servers cannot communicate with servers. For communication, at least one client and one server are always required. However, this restriction can be circumvented by having processes support a given protocol on both the server and client sides (see Broadcast Protocol later).

## Graphical User Interface (GUI)

### Simple Mode

=> ./distributed-systems-simulator/ds-sim-screenshot2.png Screenshot: The simulator showing the settings dialog. The visualization area displays process bars with message lines between them. The settings window allows configuring simulation parameters like number of processes, simulation duration, clock drift, message loss probability, and more.

The simulator requires JDK 21 and can be started with the command `java -jar target/ds-sim-VERSION.jar`.

The simulator then presents itself with a main window. To create a new simulation, select "New Simulation" from the "File" menu, after which the settings window for the new simulation appears. The individual options will be discussed in more detail later, and for now, only the default settings will be used.

By default, the simulator starts in "simple mode". There is also an "expert mode", which will be discussed later.

### The Menu Bar

In the File menu, you can create new simulations or close the currently open simulation. New simulations open by default in a new tab. However, you can also open or close new simulation windows that have their own tabs. Each tab contains a simulation that is completely independent from the others. This allows any number of simulations to be run in parallel. The menu items "Open", "Save" and "Save As" are used for loading and saving simulations.

Through the Edit menu, users can access the simulation settings, which will be discussed in more detail later. This menu also lists all participating processes for editing. If the user selects a process there, the corresponding process editor opens. The Simulator menu offers the same options as the toolbar, which is described in the next section.

Some menu items are only accessible when a simulation has already been created or loaded in the current window.

### The Toolbar

The toolbar, located at the top left of the simulator, contains the functions most frequently needed by users. It offers four different functions:

* Reset simulation: can only be activated when the simulation has been paused or has finished
* Repeat simulation: cannot be activated if the simulation has not yet been started
* Pause simulation: can only be activated when the simulation is currently running
* Start simulation: can only be activated when the simulation is not currently running and has not yet finished

### The Visualization

The graphical simulation visualization is located in the center right. The X-axis shows the time in milliseconds, and all participating processes are listed on the Y-axis. The demo simulation ends after exactly 15 seconds. The visualization shows processes (with PIDs 1, 2, and 3), each with its own horizontal black bar. On these process bars, users can read the respective local process time. The vertical red line represents the global simulation time.

The process bars also serve as start and end points for messages. For example, if Process 1 sends a message to Process 2, a line is drawn from one process bar to the other. Messages that a process sends to itself are not visualized but are logged in the log window (more on this later).

Another way to open a process editor is to left-click on the process bar belonging to the process. A right-click, on the other hand, opens a popup window with additional options. A process can only be forced to crash or be revived via the popup menu during a running simulation.

In general, the number of processes can vary as desired. The simulation duration is at least 5 and at most 120 seconds. The simulation only ends when the global time reaches the specified simulation end time (here 15 seconds), not when a local process time reaches this end time.

### Color Differentiation

Colors help to better interpret the processes of a simulation. By default, processes (process bars) and messages are displayed with the following colors (these are only the default colors, which can be changed via the settings):

```
Process Colors:
  Black   - The simulation is not currently running
  Green   - The process is running normally
  Orange  - The mouse is over the process bar
  Red     - The process has crashed

Message Colors:
  Green   - The message is still in transit
  Blue    - The message has successfully reached its destination
  Red     - The message was lost
```

### The Sidebar

The sidebar is used to program process events. At the top, the process to be managed is selected (here with PID 1). In this process selection, there is also the option to select "All Processes", which displays all programmed events of all processes simultaneously. "Local events" are those events that occur when a certain local time of the associated process has been reached. The event table below lists all programmed events along with their occurrence times and PIDs.

To create a new event, the user can either right-click on a process bar and select "Insert local event", or select an event below the event table, enter the event occurrence time in the text field below, and click "Apply".

Right-clicking on the event editor allows you to either copy or delete all selected events. Using the Ctrl key, multiple events can be selected simultaneously. The entries in the Time and PID columns can be edited afterwards. This provides a convenient way to move already programmed events to a different time or assign them to a different process. However, users should ensure that they press the Enter key after changing the event occurrence time, otherwise the change will be ineffective.

In addition to the Events tab, the sidebar has another tab called "Variables". Behind this tab is the process editor of the currently selected process. There, all variables of the process can be edited, providing another way to access a process editor.

### The Log Window

The log window (at the bottom) logs all occurring events in chronological order. At the beginning of each log entry, the global time in milliseconds is always logged. For each process, its local times as well as the Lamport and vector timestamps are also listed. After the time information, additional details are provided, such as which message was sent with what content and which protocol it belongs to. This will be demonstrated later with examples.

```
000000ms: New Simulation
000000ms: New Process; PID: 1; Local Time: 000000ms; Lamport time: 0; Vector time: (0,0,0)
000000ms: New Process; PID: 2; Local Time: 000000ms; Lamport time: 0; Vector time: (0,0,0)
000000ms: New Process; PID: 3; Local Time: 000000ms; Lamport time: 0; Vector time: (0,0,0)
```

By deactivating the logging switch, message logging can be temporarily disabled. With logging deactivated, no new messages are written to the log window. After reactivating the switch, all omitted messages are subsequently written to the window. Deactivated logging can lead to improved simulator performance.

### Expert Mode

=> ./distributed-systems-simulator/ds-sim-screenshot.png Screenshot: The Distributed Systems Simulator in expert mode, showing a Broadcast protocol simulation with 6 processes. The visualization shows message lines between process bars, with blue indicating delivered messages and green indicating messages still in transit.

The simulator can be operated in two different modes: simple mode and expert mode. The simulator starts in simple mode by default, so users don't have to deal with the simulator's full functionality all at once. Simple mode is clearer but offers fewer functions. Expert mode is more suitable for experienced users and accordingly offers more flexibility. Expert mode can be activated or deactivated via the switch of the same name below the log window or via the simulation settings.

In expert mode, the following additional features become available:

* Global events: In addition to local events, global events can now also be edited. Global events are triggered when a specific global simulation time is reached, rather than a local process time. This only makes a difference when local process times differ from the global time (e.g. due to clock drift).
* Direct PID selection: The user can directly select the associated PID when programming a new event.
* Lamport and Vector time switches: If the user activates one of these two switches, the Lamport or vector timestamps are displayed in the visualization. Only one can be active at a time to maintain clarity.
* Anti-aliasing switch: Allows the user to activate or deactivate anti-aliasing for smoother graphics. Disabled by default for performance reasons.
* Log filter: A regular expression filter (Java syntax) that makes it possible to filter only the essential data from the logs. For example, `"PID: (1|2)"` shows only log lines containing "PID: 1" or "PID: 2". The filter can be activated retroactively and during a running simulation.
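The log filter behaves like a Java regex applied to each log line. A minimal sketch of the same filtering (the log lines here are made up for illustration):

```
import java.util.List;
import java.util.regex.Pattern;

public class LogFilterSketch {
    public static void main(String[] args) {
        Pattern filter = Pattern.compile("PID: (1|2)");
        List<String> log = List.of(
            "000000ms: New Process; PID: 1",
            "000000ms: New Process; PID: 2",
            "000000ms: New Process; PID: 3");
        // Keep only lines matching the filter, as the log window would.
        log.stream()
           .filter(line -> filter.matcher(line).find())
           .forEach(System.out::println); // only the PID 1 and PID 2 lines survive
    }
}
```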

### Configuration Settings

The simulation settings window allows configuring many aspects of the simulation. Key settings include:

* Processes receive own messages (default: false): Whether processes can receive messages they sent to themselves.
* Average message loss probabilities (default: true): Whether to average the loss probabilities of sender and receiver processes.
* Average transmission times (default: true): Whether to average the transmission times of sender and receiver processes.
* Show only relevant messages (default: true): Hides messages sent to processes that don't support the protocol.
* Expert mode (default: false): Enables expert mode features.
* Simulation speed (default: 0.5): The playback speed factor. A value of 1 means real-time, 0.5 means half speed.
* Number of processes (default: 3): Can also be changed during simulation via right-click.
* Simulation duration (default: 15s): Between 5 and 120 seconds.

Each process also has individual settings:

* Clock drift (default: 0.0): By what factor the local clock deviates. A value of 0.0 means no deviation. A value of 1.0 means double speed. Values > -1.0 are allowed.
* Random crash probability (default: 0%): Probability that the process crashes randomly during the simulation.
* Message loss probability (default: 0%): Probability that a message sent by this process is lost in transit.
* Min/Max transmission time (default: 500ms/2000ms): The range for random message delivery times.
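Assuming a simple linear drift model, a local clock advances by a factor of (1 + drift) relative to global time. A hedged sketch of that arithmetic (not the simulator's actual code, and ignoring any synchronization that would correct the clock):

```
public class ClockDriftSketch {
    // Local elapsed time for a given global elapsed time and drift factor.
    static long localTimeMs(long globalMs, double drift) {
        return Math.round(globalMs * (1.0 + drift));
    }

    public static void main(String[] args) {
        // drift 0.0: the clock stays correct
        System.out.println(localTimeMs(15000, 0.0)); // 15000
        // drift 0.1: after 15s of global time, the local clock shows 16.5s
        System.out.println(localTimeMs(15000, 0.1)); // 16500
        // drift 1.0: double speed
        System.out.println(localTimeMs(15000, 1.0)); // 30000
    }
}
```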

## Protocols and Examples

The simulator comes with 10 built-in protocols. As described earlier, protocols are distinguished between server-side and client-side. Servers can respond to client messages, and clients can respond to server messages. Each process can support any number of protocols on both the client and server side. Users can also implement their own protocols using the simulator's Protocol API (see the Protocol API section).

The program directory contains a `saved-simulations` folder with example simulations for each protocol as serialized `.dat` files.

### Dummy Protocol

The Dummy Protocol serves only as a template for creating custom protocols. When using the Dummy Protocol, only log messages are output when events occur. No further actions are performed.

### Ping-Pong Protocol

=> ./distributed-systems-simulator/ping-pong.png Visualization: The Ping-Pong Protocol showing two processes (P1 and P2) exchanging messages in a continuous back-and-forth pattern. Blue lines represent delivered messages bouncing between the process bars over a 15-second simulation.

In the Ping-Pong Protocol, two processes -- Client P1 and Server P2 -- constantly send messages back and forth. The Ping-Pong client sends the first request, the server responds, the client responds again, and so on. Each message carries a counter that is incremented at each station and logged in the log window.

```
Programmed Ping-Pong Events:

| Time (ms) | PID | Event                          |
|-----------|-----|--------------------------------|
| 0         | 1   | Ping-Pong Client activate      |
| 0         | 2   | Ping-Pong Server activate      |
| 0         | 1   | Ping-Pong Client request start |
```

It is important that Process 1 activates its Ping-Pong client before starting a Ping-Pong client request. Before a process can start a request, it must have the corresponding protocol activated. This also applies to all other protocols.

**Ping-Pong Storm Variant**

=> ./distributed-systems-simulator/ping-pong-storm.png Visualization: The Ping-Pong Storm variant with three processes. P1 is the client, P2 and P3 are both servers. The visualization shows an exponentially growing number of messages as each client message generates two server responses, creating a dense web of blue and green message lines.

By adding a third process P3 as an additional Ping-Pong server, a Ping-Pong "Storm" can be realized. Since every client message now receives two server responses, the number of messages doubles with each round, creating an exponential message flood.

```
Programmed Ping-Pong Storm Events:

| Time (ms) | PID | Event                          |
|-----------|-----|--------------------------------|
| 0         | 1   | Ping-Pong Client activate      |
| 0         | 2   | Ping-Pong Server activate      |
| 0         | 3   | Ping-Pong Server activate      |
| 0         | 1   | Ping-Pong Client request start |
```

### Broadcast Protocol

=> ./distributed-systems-simulator/broadcast.png Visualization: The Broadcast Protocol with 6 processes (P1-P6). Dense crossing message lines show how a broadcast from P1 propagates to all processes, with each process re-broadcasting to others. Blue lines indicate delivered messages, green lines indicate messages still in transit.

The Broadcast Protocol behaves similarly to the Ping-Pong Protocol. The difference is that the protocol tracks -- using a unique Broadcast ID -- which messages have already been sent. Each process re-broadcasts all received messages to others, provided it has not already sent them.

In this case, no distinction is made between client and server, so that the same action is performed when a message arrives at either side. This makes it possible, using multiple processes, to create a broadcast. P1 is the client and starts a request at 0ms and 2500ms. The simulation duration is exactly 5000ms. Since a client can only receive server messages and a server can only receive client messages, every process in this simulation is both server and client.

```
Programmed Broadcast Events:

| Time (ms) | PID | Event                            |
|-----------|-----|----------------------------------|
| 0         | 1-6 | Broadcast Client activate        |
| 0         | 1-6 | Broadcast Server activate        |
| 0         | 1   | Broadcast Client request start   |
| 2500      | 1   | Broadcast Client request start   |
```
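The core of the re-broadcast logic -- forward each message at most once, keyed by its Broadcast ID -- can be sketched as follows (hypothetical names, not the simulator's actual protocol class):

```
import java.util.HashSet;
import java.util.Set;

public class BroadcastDedupSketch {
    // IDs of broadcasts this process has already forwarded.
    private final Set<Integer> seen = new HashSet<>();

    // Returns true if the message should be re-broadcast to the other
    // processes, false if this broadcast ID was handled before.
    boolean onReceive(int broadcastId) {
        return seen.add(broadcastId); // add() returns false for duplicates
    }

    public static void main(String[] args) {
        BroadcastDedupSketch p = new BroadcastDedupSketch();
        System.out.println(p.onReceive(42)); // true  -> forward it
        System.out.println(p.onReceive(42)); // false -> drop the duplicate
        System.out.println(p.onReceive(43)); // true  -> a new broadcast
    }
}
```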

### Internal Synchronization Protocol

=> ./distributed-systems-simulator/int-sync.png Visualization: Internal Synchronization with 2 processes. P1 (client, clock drift 0.1) shows a faster-running clock reaching 15976ms by simulation end. The blue message lines show P1 periodically synchronizing with P2 (server, no drift), with the time corrections visible as slight adjustments in P1's timeline.

The Internal Synchronization Protocol is used for synchronizing the local process time, which can be applied when a process time is running incorrectly due to clock drift. When the client wants to synchronize its (incorrect) local process time t_c with a server, it sends a client request. The server responds with its own local process time t_s, allowing the client to calculate a new, more accurate time for itself.

After receiving the server response, the client P1 calculates its new local process time as:

```
t_c := t_s + 1/2 * (t'_min + t'_max)
```

This synchronizes P1's local time with an error of less than 1/2 * (t'_max - t'_min), where t'_min and t'_max are the assumed minimum and maximum transmission times configured in the protocol settings.

In the example, the client process has a clock drift of 0.1 and the server has 0.0. The client starts a request at local process times 0ms, 5000ms, and 10000ms. By simulation end, P1's time stands at 15976ms (an error of +976ms relative to the global time of 15000ms).
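The update rule and its error bound are plain arithmetic; a minimal sketch (class and method names are illustrative, not part of the ds-sim API):

```java
// Illustrative sketch of the Internal Synchronization update rule.
// Names here are hypothetical, not part of the ds-sim API.
public class InternalSyncMath {
    /** New client time: server time plus the midpoint of the assumed transmission times. */
    static long newClientTime(long tServer, long tMin, long tMax) {
        return tServer + (tMin + tMax) / 2;
    }

    /** Worst-case synchronization error: half the spread of the assumed transmission times. */
    static long maxError(long tMin, long tMax) {
        return (tMax - tMin) / 2;
    }

    public static void main(String[] args) {
        // Defaults from the protocol settings: t'_min = 500ms, t'_max = 2000ms
        System.out.println(newClientTime(10000, 500, 2000)); // 11250
        System.out.println(maxError(500, 2000));             // 750
    }
}
```

With the default settings, every synchronization is therefore off by at most 750ms, regardless of the actual random transmission time of the server response.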

```
Programmed Internal Sync Events:

| Time (ms) | PID | Event                              |
|-----------|-----|------------------------------------|
| 0         | 1   | Internal Sync Client activate      |
| 0         | 2   | Internal Sync Server activate      |
| 0         | 1   | Internal Sync Client request start |
| 5000      | 1   | Internal Sync Client request start |
| 10000     | 1   | Internal Sync Client request start |
```

Protocol variables (client-side):

* Min. transmission time (Long: 500): The assumed t'_min in milliseconds
* Max. transmission time (Long: 2000): The assumed t'_max in milliseconds

These can differ from the actual message transmission times t_min and t_max, allowing simulation of scenarios where the protocol is misconfigured and large synchronization errors occur.

### Cristian's Method (External Synchronization)

=> ./distributed-systems-simulator/christians.png Visualization: Comparison of Internal Synchronization (P1) and Cristian's Method (P3) with P2 as shared server. Both P1 and P3 have clock drift 0.1. The visualization shows P1 synchronized to 14567ms (error: -433ms) while P3 synchronized to 15539ms (error: +539ms), demonstrating the differing accuracy of the two methods.

Cristian's Method uses the RTT (Round Trip Time) to approximate the transmission time of individual messages. When the client wants to synchronize its local time t_c with a server, it sends a request and measures the RTT t_rtt until the server response arrives. The server response contains the local process time t_s from the moment the server sent the response. The client then calculates its new local time as:

```
t_c := t_s + 1/2 * t_rtt
```

The accuracy is +/- (1/2 * t_rtt - u_min) where u_min is a lower bound for message transmission time.
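This rule can be sketched in a few lines (class and method names are illustrative, not part of the ds-sim API):

```java
// Illustrative sketch of Cristian's update rule.
// Names here are hypothetical, not part of the ds-sim API.
public class CristianMath {
    /** New client time: server time plus half the measured round trip time. */
    static long newClientTime(long tServer, long tRtt) {
        return tServer + tRtt / 2;
    }

    /** Accuracy bound: +/- (t_rtt / 2 - u_min), with u_min the minimum transmission time. */
    static long accuracy(long tRtt, long uMin) {
        return tRtt / 2 - uMin;
    }

    public static void main(String[] args) {
        // A server time of 10000ms and a measured RTT of 1800ms
        System.out.println(newClientTime(10000, 1800)); // 10900
        System.out.println(accuracy(1800, 500));        // 400
    }
}
```

Note how the accuracy bound shrinks with the RTT: a fast round trip pins the server time down more tightly, which is exactly why slow connections (see the example further below) degrade this method.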

The visualization compares both synchronization methods side by side: P1 uses Internal Synchronization and P3 uses Cristian's Method, with P2 serving both. Both P1 and P3 have clock drift 0.1. In this particular run, Internal Synchronization achieved the smaller absolute error (433ms vs. 539ms), though results vary between runs due to random transmission times.

```
Programmed Comparison Events:

| Time (ms) | PID | Event                                |
|-----------|-----|--------------------------------------|
| 0         | 1   | Internal Sync Client activate        |
| 0         | 1   | Internal Sync Client request start   |
| 0         | 2   | Cristian's Server activate           |
| 0         | 2   | Internal Sync Server activate        |
| 0         | 3   | Cristian's Client activate           |
| 0         | 3   | Cristian's Client request start      |
| 5000      | 1   | Internal Sync Client request start   |
| 5000      | 3   | Cristian's Client request start      |
| 10000     | 1   | Internal Sync Client request start   |
| 10000     | 3   | Cristian's Client request start      |
```

### Berkeley Algorithm

=> ./distributed-systems-simulator/berkeley.png Visualization: The Berkeley Algorithm with 3 processes. P2 is the server (coordinator) sending time requests to clients P1 and P3. After collecting responses, P2 calculates correction values and sends them back. Final times show P1=16823ms, P2=14434ms, P3=13892ms -- all brought closer together through averaging.

The Berkeley Algorithm is another method for synchronizing local clocks. This is the first protocol where the server initiates the requests. The server acts as a coordinator. The client processes are passive and must wait until a server request arrives. The server must know which client processes participate in the protocol, which is configured in the server's protocol settings.

When the server wants to synchronize its local time t_s and the process times t_i of the clients (i = 1,...,n), it sends a server request. n is the number of participating clients. The clients then send their local process times back to the server. The server measures the RTTs r_i for all client responses.

After all responses are received, the server sets its own time to the average t_avg of all known process times (including its own). The transmission time of a client response is estimated as half the RTT:

```
t_avg := 1/(n+1) * (t_s + SUM(r_i/2 + t_i))
t_s := t_avg
```

The server then calculates a correction value k_i := t_avg - t_i for each client and sends it back. Each client adds the received correction to its current local time: t'_i := t'_i + k_i, where t'_i is the client's local process time at the moment the correction arrives.
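The averaging and correction steps can be sketched as plain arithmetic (class and method names are illustrative, not part of the ds-sim API):

```java
// Illustrative sketch of the Berkeley averaging step.
// Names here are hypothetical, not part of the ds-sim API.
public class BerkeleyMath {
    /**
     * Average of the server time and the clients' reported times,
     * each client time adjusted by half its measured RTT.
     */
    static long average(long tServer, long[] tClients, long[] rtts) {
        long sum = tServer;
        for (int i = 0; i < tClients.length; i++)
            sum += tClients[i] + rtts[i] / 2;
        return sum / (tClients.length + 1); // n clients plus the server itself
    }

    public static void main(String[] args) {
        long[] t = {9000, 12000}; // client times t_i
        long[] r = {2000, 1000};  // measured RTTs r_i
        long avg = average(10000, t, r); // (10000 + 10000 + 12500) / 3
        System.out.println(avg);         // 10833
        // Correction value for client 1: k_1 = t_avg - t_1
        System.out.println(avg - t[0]);  // 1833
    }
}
```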

```
Programmed Berkeley Events:

| Time (ms) | PID | Event                             |
|-----------|-----|-----------------------------------|
| 0         | 1   | Berkeley Client activate          |
| 0         | 2   | Berkeley Server activate          |
| 0         | 3   | Berkeley Client activate          |
| 0         | 2   | Berkeley Server request start     |
| 7500      | 2   | Berkeley Server request start     |
```

Protocol variables (server-side):

* PIDs of participating processes (Integer[]: [1,3]): The PIDs of the Berkeley client processes. The protocol will not work if a non-existent PID is specified or if the process does not support the Berkeley protocol on the client side.

### One-Phase Commit Protocol

=> ./distributed-systems-simulator/one-phase-commit.png Visualization: The One-Phase Commit Protocol with 3 processes. P1 crashes at 1000ms (shown in red) and recovers at 5000ms. P2 (server) periodically sends commit requests. The red lines show lost messages during P1's crash period, while blue lines show successful message exchanges after recovery.

The One-Phase Commit Protocol is designed to bring any number of clients to a common commit. In practice, this could mean creating or deleting a file of which each client holds a local copy. The server is the coordinator and initiates the commit request. The server periodically resends the commit request until every client has acknowledged it. For this purpose, the PIDs of all participating client processes and a timer for resending must be configured.

In the example, P1 and P3 are clients and P2 is the server. P1 crashes at 1000ms and recovers at 5000ms. The first two commit requests fail to reach P1 due to its crash. Only the third attempt succeeds. Each client acknowledges a commit request only once.

```
Programmed One-Phase Commit Events:

| Time (ms) | PID | Event                                  |
|-----------|-----|----------------------------------------|
| 0         | 1   | 1-Phase Commit Client activate         |
| 0         | 2   | 1-Phase Commit Server activate         |
| 0         | 3   | 1-Phase Commit Client activate         |
| 0         | 2   | 1-Phase Commit Server request start    |
| 1000      | 1   | Process crash                          |
| 5000      | 1   | Process revival                        |
```

Protocol variables (server-side):

* Time until resend (Long: timeout = 2500): Milliseconds to wait before resending the commit request
* PIDs of participating processes (Integer[]: pids = [1,3]): The client process PIDs that should commit

### Two-Phase Commit Protocol

=> ./distributed-systems-simulator/two-phase-commit.png Visualization: The Two-Phase Commit Protocol with 3 processes. P2 (server) orchestrates a two-phase voting process with clients P1 and P3. The complex message pattern shows the voting phase followed by the commit/abort phase, with messages crossing between all three processes over a 10-second simulation.

The Two-Phase Commit Protocol is an extension of the One-Phase Commit Protocol. The server first sends a request to all participating clients asking whether they want to commit. Each client responds with true or false. The server periodically retries until all results are collected. After receiving all votes, the server checks whether all clients voted true. If at least one client voted false, the commit process is aborted and a global result of false is sent to all clients. If all voted true, the global result true is sent. The global result is periodically resent until each client acknowledges receipt.
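The server's decision step boils down to a logical AND over all collected votes; a minimal sketch (names are illustrative, not from the simulator):

```java
// Illustrative sketch of the two-phase commit decision rule.
// Names here are hypothetical, not part of the ds-sim API.
public class TwoPhaseCommitMath {
    /** The global result is true only if every participating client voted true. */
    static boolean globalResult(boolean[] votes) {
        for (boolean vote : votes)
            if (!vote) return false; // a single false vote aborts the commit
        return true;
    }

    public static void main(String[] args) {
        System.out.println(globalResult(new boolean[]{true, true}));  // true
        System.out.println(globalResult(new boolean[]{true, false})); // false
    }
}
```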

In the example, P1 and P3 are clients and P2 is the server. The server sends its first request at 0ms. Here both P1 and P3 vote true, so the commit proceeds.

```
Programmed Two-Phase Commit Events:

| Time (ms) | PID | Event                                  |
|-----------|-----|----------------------------------------|
| 0         | 1   | 2-Phase Commit Client activate         |
| 0         | 2   | 2-Phase Commit Server activate         |
| 0         | 3   | 2-Phase Commit Client activate         |
| 0         | 2   | 2-Phase Commit Server request start    |
```

Example log extract showing the two-phase voting process:

```
000000ms: PID 2: Message sent; ID: 94; Protocol: 2-Phase Commit
                  Boolean: wantVote=true
000905ms: PID 3: Message received; ID: 94; Protocol: 2-Phase Commit
000905ms: PID 3: Message sent; ID: 95; Protocol: 2-Phase Commit
                  Integer: pid=3; Boolean: isVote=true; vote=true
000905ms: PID 3: Vote true sent
001880ms: PID 2: Message received; ID: 95; Protocol: 2-Phase Commit
001880ms: PID 2: Vote from Process 3 received! Result: true
001947ms: PID 1: Message received; ID: 94; Protocol: 2-Phase Commit
001947ms: PID 1: Vote true sent
003137ms: PID 2: Votes from all participating processes received!
                  Global result: true
003137ms: PID 2: Message sent; ID: 99; Protocol: 2-Phase Commit
                  Boolean: isVoteResult=true; voteResult=true
004124ms: PID 1: Global vote result received. Result: true
006051ms: PID 2: All participants have acknowledged the vote
010000ms: Simulation ended
```

Protocol variables (server-side):

* Time until resend (Long: timeout = 2500): Milliseconds to wait before resending
* PIDs of participating processes (Integer[]: pids = [1,3]): Client PIDs that should vote and commit

Protocol variables (client-side):

* Commit probability (Integer: ackProb = 50): The probability in percent that the client votes true (for commit)

### Basic Multicast Protocol

=> ./distributed-systems-simulator/basic-multicast.png Visualization: The Basic Multicast Protocol with 3 processes. P2 (client) sends periodic multicast messages to servers P1 and P3. P3 crashes at 3000ms (shown in red) and recovers at 6000ms. Red lines indicate lost messages, blue lines show delivered messages. Some messages to P1 are also lost due to the 30% message loss probability.

The Basic Multicast Protocol is very simple: the client initiates each request, which represents a plain multicast message, and the servers do nothing but receive it. No acknowledgments are sent. The client P2 sends a multicast message every 2500ms to servers P1 and P3.

P1 can only receive multicast messages after 2500ms because it does not support the protocol before then. P3 is crashed from 3000ms to 6000ms and also cannot receive messages during that time. Each process has a 30% message loss probability, so some messages are lost in transit (shown in red).

In this example, the 3rd multicast message to P3 and the 5th and 6th messages to P1 were lost. Only the 4th multicast message reached both destinations.

```
Programmed Basic Multicast Events:

| Time (ms) | PID | Event                                  |
|-----------|-----|----------------------------------------|
| 0         | 2   | Basic Multicast Client activate        |
| 0         | 3   | Basic Multicast Server activate        |
| 0         | 2   | Basic Multicast Client request start   |
| 2500      | 1   | Basic Multicast Server activate        |
| 2500      | 2   | Basic Multicast Client request start   |
| 3000      | 3   | Process crash                          |
| 5000      | 2   | Basic Multicast Client request start   |
| 6000      | 3   | Process revival                        |
| 7500      | 2   | Basic Multicast Client request start   |
| 10000     | 2   | Basic Multicast Client request start   |
| 12500     | 2   | Basic Multicast Client request start   |
```

### Reliable Multicast Protocol

=> ./distributed-systems-simulator/reliable-multicast.png Visualization: The Reliable Multicast Protocol with 3 processes. P2 (client) sends multicast messages to servers P1 and P3, retrying until acknowledgments are received from all servers. P3 crashes at 3000ms and recovers at 10000ms. Red lines show lost messages, blue lines show delivered ones. Despite failures, all servers eventually receive and acknowledge the multicast.

In the Reliable Multicast Protocol, the client periodically resends its multicast message until it has received an acknowledgment from all participating servers. After each retry, the client "forgets" which servers have already acknowledged, so each new attempt must be acknowledged again by all participants.

In the example, P2 is the client and P1 and P3 are the servers. At 0ms, the client initiates its multicast message. The message loss probability is set to 30% on all processes. The client needs exactly 5 attempts until successful delivery:

* Attempt 1: P1 doesn't support the protocol yet. P3 receives the message but its ACK is lost.
* Attempt 2: The message to P1 is lost. P3 receives it but is crashed and can't process it.
* Attempt 3: P1 receives the message and ACKs successfully. The message to P3 is lost.
* Attempt 4: P1 receives and ACKs again. P3 receives it but is still crashed.
* Attempt 5: Both P1 and P3 receive the message and ACK successfully.

```
Programmed Reliable Multicast Events:

| Time (ms) | PID | Event                                    |
|-----------|-----|------------------------------------------|
| 0         | 3   | Reliable Multicast Server activate       |
| 0         | 2   | Reliable Multicast Client activate       |
| 0         | 2   | Reliable Multicast Client request start  |
| 2500      | 1   | Reliable Multicast Server activate       |
| 3000      | 3   | Process crash                            |
| 10000     | 3   | Process revival                          |
```

Example log extract:

```
000000ms: PID 2: Reliable Multicast Client activated
000000ms: PID 2: Message sent; ID: 280; Protocol: Reliable Multicast
                  Boolean: isMulticast=true
000000ms: PID 3: Reliable Multicast Server activated
001590ms: PID 3: Message received; ID: 280; Protocol: Reliable Multicast
001590ms: PID 3: ACK sent
002500ms: PID 1: Reliable Multicast Server activated
002500ms: PID 2: Message sent; ID: 282; Protocol: Reliable Multicast
                  Boolean: isMulticast=true
003000ms: PID 3: Crashed
005000ms: PID 2: Message sent; ID: 283; Protocol: Reliable Multicast
005952ms: PID 1: Message received; ID: 283
005952ms: PID 1: ACK sent
007937ms: PID 2: ACK from Process 1 received!
...
011813ms: PID 2: ACK from Process 3 received!
011813ms: PID 2: ACKs from all participating processes received!
015000ms: Simulation ended
```

Protocol variables (server-side):

* Time until resend (Long: timeout = 2500): Milliseconds to wait before resending the multicast
* PIDs of participating processes (Integer[]: pids = [1,3]): Server PIDs that should receive the multicast

## Additional Examples

### Lamport and Vector Timestamps

=> ./distributed-systems-simulator/lamport-timestamps.png Visualization: Lamport Timestamps displayed on the Berkeley Algorithm simulation. Each event on a process bar shows its Lamport timestamp as a number in parentheses. The timestamps increase monotonically and are updated according to the Lamport clock rules when messages are sent and received between P1, P2, and P3.

> "For many purposes, it is sufficient that all machines agree on the same time. It is not necessary that this time also agrees with real time, like every hour announced on the radio... For a certain class of algorithms, only the internal consistency of clocks is important." - Andrew Tanenbaum

Clocks that provide such a time are also known as logical clocks. Two implementations are realized in the simulator: Lamport timestamps and vector timestamps.

After activating the Lamport time switch in expert mode, the current Lamport timestamp appears at every event of a process. Each process has its own Lamport timestamp that is incremented when a message is sent or received. Each message carries the current Lamport time t_l(i) of the sending process i. When another process j receives this message, its Lamport timestamp t_l(j) is recalculated as:

```
t_l(j) := 1 + max(t_l(j), t_l(i))
```

The larger Lamport time of the sender and receiver process is used and then incremented by 1. After the Berkeley simulation shown here, P1 has Lamport timestamp 16, P2 has 14, and P3 has 15.
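The receive rule is a one-liner; a minimal sketch (hypothetical names, not the simulator's API):

```java
// Illustrative sketch of the Lamport clock receive rule.
// Names here are hypothetical, not part of the ds-sim API.
public class LamportClock {
    /** On receive: take the larger of the two timestamps and increment by one. */
    static long onReceive(long tLocal, long tMessage) {
        return 1 + Math.max(tLocal, tMessage);
    }

    public static void main(String[] args) {
        System.out.println(onReceive(5, 9)); // 10
        System.out.println(onReceive(9, 5)); // 10
    }
}
```

The rule is symmetric in its arguments, which is what guarantees that a receive event always gets a larger timestamp than both the send event and the receiver's previous event.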

=> ./distributed-systems-simulator/vector-timestamps.png Visualization: Vector Timestamps displayed on the same Berkeley Algorithm simulation. Each event shows its vector timestamp as a tuple (v1,v2,v3) representing the known state of all three processes. The tuples grow as processes communicate and merge their knowledge of each other's progress.

With the vector time switch active, all vector timestamps are displayed. As with Lamport timestamps, each message includes the current vector timestamp of the sending process. With n participating processes, the vector timestamp v has size n. Each participating process i has its own index, accessible via v(i). When v is the vector timestamp of the receiving process j and w is the vector timestamp of the sending process, the new local vector timestamp of process j is calculated as follows:

```
for (i := 0; i < n; i++) {
    if (i = j) {
        v(i)++;
    } else if (v(i) < w(i)) {
        v(i) := w(i);
    }
}
```

By default, the vector timestamp is only incremented when a message is sent or received. In both cases, the sender and receiver each increment their own index in the vector timestamp by 1. Upon receiving a message, the local vector timestamp is then compared with the sender's, and the larger value is taken for all indices.
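The same merge, written as a self-contained Java method (names are illustrative, not the simulator's API):

```java
import java.util.Arrays;

// Illustrative Java version of the vector timestamp merge above.
// Names here are hypothetical, not part of the ds-sim API.
public class VectorClock {
    /**
     * Merge the sender's vector w into the receiver's vector v,
     * where j is the receiving process's own index. Returns a new vector.
     */
    static long[] onReceive(long[] v, long[] w, int j) {
        long[] merged = v.clone();
        for (int i = 0; i < merged.length; i++) {
            if (i == j)
                merged[i]++;                           // increment own entry
            else
                merged[i] = Math.max(merged[i], w[i]); // take the larger known value
        }
        return merged;
    }

    public static void main(String[] args) {
        long[] v = {2, 0, 1}; // receiver's current vector (process index 0)
        long[] w = {1, 3, 1}; // vector carried by the incoming message
        System.out.println(Arrays.toString(onReceive(v, w, 0))); // [3, 3, 1]
    }
}
```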

After the simulation, P1 has vector timestamp (8,10,6), P2 has (6,10,6), and P3 has (6,10,8).

The simulation settings include boolean variables "Lamport times affect all events" and "Vector times affect all events" (both default to false). When set to true, all events (not just message send/receive) will update the timestamps.

### Simulating Slow Connections

=> ./distributed-systems-simulator/slow-connection.png Visualization: Slow connection simulation comparing Internal Synchronization (P1) and Cristian's Method (P3) with P2 as server. P3 has high transmission times (2000-8000ms), simulating a slow network connection. P1 synchronizes to 21446ms (error: +1446ms) while P3 only reaches 16557ms (error: -3443ms), showing how slow connections degrade synchronization quality.

The simulator can also simulate slow connections to a specific process. This example revisits the comparison of Internal Synchronization (P1) and Cristian's Method (P3), with P2 serving both. In this scenario, P3 has a poor network connection, so messages to and from P3 always require a longer transmission time.

P3's minimum transmission time is set to 2000ms and maximum to 8000ms, while P1 and P2 keep the defaults (500ms/2000ms). The simulation duration is 20000ms. With the "Average transmission times" setting enabled, the effective transmission time for messages involving P3 is:

```
1/2 * (rand(500,2000) + rand(2000,8000))  ->  a value between 1250ms and 5000ms
```

Because P3 starts a new request before receiving the answer to its previous one, and because it always associates server responses with its most recently sent request, its RTT calculations become incorrect on each round, and its local time is poorly synchronized. P1 synchronizes to 21446ms (error: +1446ms) while P3 only reaches 16557ms (error: -3443ms).

### Raft Consensus Failover

=> ./distributed-systems-simulator/raft-consensus-failover.png Screenshot: A 60-second Raft simulation with three processes. P1 starts as the initial leader, crashes at 3500ms, later recovers, P2 wins the reelection and remains leader, and P3 crashes later. The blue and red message lines show the continuing heartbeat and acknowledgment traffic during and after failover.

While modernizing ds-sim, I also added a simplified Raft Consensus example. The simulation is intentionally small: three processes, one initial leader, one crash, a clean reelection, a recovery of the old leader, and then another crash later in the run. This makes it possible to see the most important Raft transitions without being overwhelmed by cluster size.

The event log tells a very readable story. At `0ms`, `P1` starts as the initial leader in `term 0`. It immediately sends a heartbeat and an `appendEntry` message carrying the log entry `cmd1`. `P2` joins at `100ms`, `P3` at `1700ms`, and both acknowledge the leader's traffic. At that point the cluster is healthy: one leader, two followers, successful heartbeats, and successful log replication.

At `3500ms`, `P1` crashes. The followers still process the last in-flight messages, but once the election timeout expires, `P2` becomes a candidate and sends a `voteRequest` for `term 1`. `P3` grants that vote, and at `9395ms` the log records the decisive line:

```
009395ms: PID: 2; ... Leader elected by majority vote: process 2 (term 1)
```

That transition is followed immediately by new heartbeats and a new `appendEntry`, which is exactly what you want to see in a Raft simulation: leadership is not just declared, it is exercised.

At `12002ms`, the old leader `P1` recovers. Importantly, it does not try to reclaim control. Instead, it receives heartbeats from `P2` and answers with `heartbeatAck` messages, rejoining the cluster as a follower. That is one of the most useful teaching moments in the log, because it makes the term-based leadership model concrete: the recovered node does not become leader again just because it used to be one.

At `20000ms`, `P3` crashes. The cluster continues running with `P2` as leader and `P1` as follower for the rest of the 60-second simulation. The log remains dominated by periodic heartbeats from `P2` and acknowledgments from `P1`, showing that the system stays stable even after a second failure.

This single scenario demonstrates several core Raft properties in one replay:

* Stable startup leadership
* Heartbeats and follower acknowledgments
* Log replication
* Leader failure detection
* Majority-based reelection
* Safe reintegration of a recovered former leader
* Continued service after a later follower crash

It is also a good example of why a simulator is useful for distributed systems. In a real production system, reconstructing this sort of sequence would require stitching together logs from multiple nodes. Here, the message flow, the crashes, the recoveries, and the Lamport/vector timestamps are all visible in one place.

## Protocol API

The simulator was designed from the ground up to be extensible. Users can implement their own protocols in Java by extending the `VSAbstractProtocol` base class. Each protocol has its own class in the `protocols.implementations` package.

### Class Hierarchy

```
VSAbstractEvent
  +-- VSAbstractProtocol (base class for all protocols)
        +-- VSDummyProtocol
        +-- VSPingPongProtocol
        +-- VSBroadcastProtocol
        +-- VSInternalTimeSyncProtocol
        +-- VSExternalTimeSyncProtocol
        +-- VSBerkeleyTimeProtocol
        +-- VSOnePhaseCommitProtocol
        +-- VSTwoPhaseCommitProtocol
        +-- VSBasicMulticastProtocol
        +-- VSReliableMulticastProtocol
```

### Implementing a Custom Protocol

Each protocol class must implement the following methods:

* A public constructor: Must specify whether the client or the server initiates requests, using `VSAbstractProtocol.HAS_ON_CLIENT_START` or `VSAbstractProtocol.HAS_ON_SERVER_START`.
* `onClientInit()` / `onServerInit()`: Called once before the protocol is first used. Used to initialize protocol variables and attributes via the VSPrefs methods (e.g. `initVector`, `initLong`). Variables initialized this way appear in the process editor and can be configured by the user.
* `onClientReset()` / `onServerReset()`: Called each time the simulation is reset.
* `onClientStart()` / `onServerStart()`: Called when the client/server initiates a request. Typically creates and sends a `VSMessage` object.
* `onClientRecv(VSMessage)` / `onServerRecv(VSMessage)`: Called when a message arrives.
* `onClientSchedule()` / `onServerSchedule()`: Called when a scheduled alarm fires.
* `toString()`: Optional. Customizes log output for this protocol.

### Available API Methods

Methods inherited from `VSAbstractProtocol`:

* `sendMessage(VSMessage message)`: Sends a protocol message (automatically updates Lamport and Vector timestamps)
* `hasOnServerStart()`: Whether it is the server (rather than the client) that initiates requests
* `isServer()` / `isClient()`: Whether the current process has the protocol activated as server/client
* `scheduleAt(long time)`: Creates an alarm that fires at the given local process time, triggering `onClientSchedule()` or `onServerSchedule()`
* `removeSchedules()`: Cancels all pending alarms in the current context
* `getNumProcesses()`: Returns the total number of processes in the simulation

Process methods available via the inherited `process` attribute:

* `getTime()` / `setTime(long)`: Get/set the local process time
* `getGlobalTime()`: Get the current global simulation time
* `getClockVariance()` / `setClockVariance(float)`: Get/set the clock drift
* `getLamportTime()` / `setLamportTime(long)`: Get/set the Lamport timestamp
* `getVectorTime()` / `updateVectorTime(VSVectorTime)`: Get/update the vector timestamp
* `getProcessID()`: Get the process PID
* `isCrashed()` / `isCrashed(boolean)`: Check or set crash state
* `getRandomPercentage()`: Get a random value between 0 and 100

Message methods (`VSMessage`):

* `new VSMessage()`: Create a new message
* `getMessageID()`: Get the message ID
* `setBoolean(key, value)` / `getBoolean(key)`: Set/get boolean data
* `setInteger(key, value)` / `getInteger(key)`: Set/get integer data
* `setLong(key, value)` / `getLong(key)`: Set/get long data
* `setString(key, value)` / `getString(key)`: Set/get string data
* `getSendingProcess()`: Get a reference to the sending process
* `isServerMessage()`: Whether it's a server or client message

### Example: Reliable Multicast Implementation

Here is a condensed example showing key parts of the Reliable Multicast Protocol implementation:

```java
public class VSReliableMulticastProtocol extends VSAbstractProtocol {
    public VSReliableMulticastProtocol() {
        // The client initiates requests
        super(VSAbstractProtocol.HAS_ON_CLIENT_START);
        super.setClassname(super.getClass().toString());
    }

    // Populated from the "pids" setting on simulation reset
    // (the onClientReset() method is omitted in this condensed extract)
    private ArrayList<Integer> pids;

    // Initialize protocol variables (editable in the process editor)
    public void onClientInit() {
        Vector<Integer> vec = new Vector<Integer>();
        vec.add(1); vec.add(3);
        super.initVector("pids", vec, "PIDs of participating processes");
        super.initLong("timeout", 2500, "Time until resend", "ms");
    }

    // Send multicast to all servers that haven't ACKed yet
    public void onClientStart() {
        if (pids.size() != 0) {
            long timeout = super.getLong("timeout") + process.getTime();
            super.scheduleAt(timeout);
            VSMessage message = new VSMessage();
            message.setBoolean("isMulticast", true);
            super.sendMessage(message);
        }
    }

    // Handle ACK from a server
    public void onClientRecv(VSMessage recvMessage) {
        if (pids.size() != 0 && recvMessage.getBoolean("isAck")) {
            Integer pid = recvMessage.getIntegerObj("pid");
            if (pids.contains(pid))
                pids.remove(pid);
            super.log("ACK from Process " + pid + " received!");
            if (pids.size() == 0) {
                super.log("ACKs from all processes received!");
                super.removeSchedules();
            }
        }
    }

    // Retry on timeout
    public void onClientSchedule() { onClientStart(); }
}
```

## Project Statistics

The original VS-Sim project (August 2008) was written in Java 6 and consisted of:

* 61 source files across 12 Java packages
* Approximately 15,710 lines of code
* 2.2 MB of generated Javadoc documentation
* 142 KB compiled JAR file
* 10 built-in protocols
* 163 configurable settings

The modernized successor ds-sim (version 1.1.0) has been updated to Java 21 and translated to English:

* 146 source files (117 main + 29 test) across 19 Java packages
* Approximately 27,900 lines of code (22,400 main + 5,500 test)
* 12 built-in protocols
* 208 unit tests
* 269 configurable settings

=> https://codeberg.org/snonux/ds-sim ds-sim source code on Codeberg
=> https://codeberg.org/snonux/vs-sim vs-sim source code on Codeberg (original German version, 2008)

Other related posts are:

=> ./2026-03-01-loadbars-0.13.0-released.gmi 2026-03-01 Loadbars 0.13.0 released
=> ./2022-12-24-ultrarelearning-java-my-takeaways.gmi 2022-12-24 (Re)learning Java - My takeaways
=> ./2022-03-06-the-release-of-dtail-4.0.0.gmi 2022-03-06 The release of DTail 4.0.0
=> ./2016-11-20-object-oriented-programming-with-ansi-c.gmi 2016-11-20 Object oriented programming with ANSI C

E-Mail your comments to `paul@nospam.buetow.org`

=> ../ Back to the main site