100-Prisoners-1-Switch-Problem

Implementation of the multithreading thought experiment in which 100 prisoners must communicate with nothing more than a single switch to escape their prison. This is a variant of the popular 100 prisoners problem, but instead of being about optimizing the probability of success, it has a solution that guarantees the prisoners' success. Instead of a combinatorics problem, it is more of a threading analogy, along with a logic puzzle.

Usage

This is a Unix-specific console-based program. Clone the repo and run the makefile to create an executable called simulation. You will need a compiler with support for C++17. Once compiled successfully, run the executable with the --help option to see an overview of what you can do with it, or check out the bottom half of the readme for a detailed breakdown.

Problem Description

100 prisoners are told by their warden that they will receive a challenge. If they succeed, they all go free, but if they fail, they will never be allowed to leave. What is this, Saw? I guess we'll assume the prisoners are good people and we want to help them escape. Here is the challenge:

There will be a switch room which contains nothing but a single switch which can be flipped on or off, though it isn't actually connected to anything. Trying to break or otherwise tamper with the switch, or trying to do anything to modify the physical state of the room itself will result in instant failure, I guess.
The warden will decide at arbitrary times to let 1 of the 100 prisoners into the switch room. Though the order and frequency of this are not guaranteed, what is guaranteed is that given an unbounded number of overall visits, every individual prisoner will eventually visit the switch room an unbounded number of times. It could be that a given prisoner visits the room more than once before another prisoner gets to visit it even once, but eventually everyone will visit it.
Only 1 prisoner may be in the switch room at a time. They can choose to flip the switch or leave it alone, and then leave. The switch will remain in that state, not being touched by any prison staff in-between visits by prisoners.
The prisoners will be given the opportunity to talk to each other in the very beginning, before anyone sees the switch room, to form a strategy. Once the challenge is underway, prisoners will have no way to communicate with each other.
The initial state of the switch is unknown.
The challenge ends when any one prisoner declares, "We have all visited the switch room at least once." If the declaration is correct, then the prisoners win. If it is incorrect, they lose.

High-Level Solution

The idea is to assign 99 prisoners to the following task (for simplicity, it is purposely missing some crucial details which will be discussed later):

When you enter the switch room, first ask yourself if you have ever flipped the switch before. If so, leave immediately without doing anything. If not, move on to checking the state of the switch. If the switch is currently off, flip it on and then leave. Otherwise, leave immediately without doing anything. Never ever declare that all prisoners have visited the switch room.

The 1 prisoner who is not part of the above 99 will have this special task instead:

Keep a mental note of a counter that starts at 0. When you enter the switch room, check the state of the switch. If it is on, flip it off and increment your mental counter by 1. Otherwise, leave immediately without doing anything. You may declare that all prisoners have visited at least once when your counter reaches 99. Do NOT overcount, i.e., do not try to count beyond 99.

Again, this solution is currently lacking crucial details. Namely, it is not accounting for the unknown initial state. However, it is easiest to understand the two base cases (the switch started off, or the switch started on) separately before accounting for the uncertainty. The true solution has every prisoner in the regular group try to flip the switch on exactly twice, while the designated counter counts to 198. We'll work our way to understanding that thinking gradually.

More Detailed Breakdown

The fundamentals revolve around the problem of communicating the count with nothing but the switch, when the switch is only binary in state. Basically, one of those two states will need to communicate that a prisoner has just visited, and the other state will need to communicate that that fact has been acknowledged and taken into account. Then, we need a sort of resident monitor (RM) to be in charge of this "acking"; they become the one keeping track of the total count of prisoners who have visited, and also must communicate back to the others when they are ready for someone to communicate a visit to them again. They will be the only one who knows for sure when every prisoner has visited at least once, given that everyone else will not flip the switch on more than absolutely necessary. However, this comes with a dangerous downside: if the RM expects too high of a number, it is possible that they deadlock, waiting forever for an increment that will never happen because all other prisoners have hit their maximum number of flips. This is why the RM has a rule to not overcount. They should declare the challenge complete as soon as the minimum count required is reached.

Next, let us consider the two base cases (switch starts on versus switch starts off) separately. This means each prisoner will only be flipping the switch once, with the RM trying to count to 99. But more generally, we want to consider what it looks like right before and after the final prisoner has flipped the switch on (this will make it easier to compare to the true solution later):

Switch starts on: In this case, if one of the 99 from the regular group visits before the first time the RM visits, they will have to leave without doing anything. This means that the RM's first visit, in which they turn the switch off for the first time, will NOT correspond to an actual prisoner having visited. Because they have no way of knowing that, they will increment their counter anyway. When the counter reaches 99, it only guarantees that 98 of the regular group have reported a visit. 98 prisoners * 1 on state + 1 prisoner * 0 on states + 1 initial on state = 99 on states counted. So it turns out that counting to 99 isn't sufficient! The RM would have needed to count to 100. In fact, assuming a 50/50 chance of the switch starting in the on/off position, if the RM declares that everyone has visited upon counting 99, then there is a 50% chance that the count was inflated by the initial on state, in which case the declaration is a gamble that can be wrong! How terrible.
Switch starts off: Okay, so then this is the case in which counting to 99 would be correct. This is because the first time the RM sees the switch on, it definitely represents another prisoner trying to communicate their visit. Right before the last prisoner has flipped the switch, 98 prisoners * 1 on state + 1 prisoner * 0 on states = 98 on states counted, so counting to 99 will be sufficient.

However, the problem says that we don't know which it could be. So then, to be safe, shouldn't we just count to 100? But then, let's examine the deadlock scenario for each case, pretending that the RM wants to count to 100 instead, to be safe. We generalize the description of when the deadlock situation occurs to be when all 99 of the regular group have finished reporting their visits but the RM is still expecting to count higher. We've determined that counting to 99 isn't safe, and want to use 100 instead, but this is what happens with the deadlock scenario:

Switch starts on: When the 99th of the regular group flips the switch on, the RM's next visit will increment the counter to 100 as they turn the switch back off again. This seems to work just fine. 99 prisoners * 1 on state + 1 initial on state = 100 on states counted. Since 100 was reached, the RM declares that every prisoner has visited the room at least once, and the answer is correct.
Switch starts off: When the 99th of the regular group flips the switch on, the RM's next visit will increment the counter to 99 as they turn the switch back off again. 99 prisoners * 1 on state = 99 on states counted. The RM is expecting to count to 100. However, according to the rules for the regular group, no one will ever flip the switch back on again, because they have all satisfied their tasks. Any time the RM revisits the room, the switch will still be off. They can never declare for sure that everyone is finished, because it is possible that the switch started in the on position, and the 99th person just still hasn't flipped it on yet. As time passes and nothing changes, the RM can reason that the probability that a prisoner still hasn't visited is getting lower and lower, meaning that if they risked making the claim that all 100 had visited, the odds are growing in their favor. However, it is still possible to be wrong, and the exact odds are uncertain. So technically, the RM should wait till they count to 100 to be sure, but that will never happen.

So we see this paradox that counting to 99 may not work in the case that the switch started in the on position (there is a chance for a false positive - claiming that everyone visited when they did not), and counting to 100 may not work in the case that the switch started in the off position (there is a chance for a false negative - thinking that not everyone visited yet even though they did).
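The two failure modes above are easy to reproduce on a tiny prison. What follows is a minimal sketch, not the repository's code: run_schedule, Result, and the convention that prisoner 0 is the RM are all made up for illustration. It replays a fixed order of visits and reports whether the RM declared and whether everyone had actually visited.

```cpp
#include <cassert>
#include <vector>

// Hypothetical result type for this sketch (not a type from the repository).
struct Result {
    bool declared;     // did the RM reach its target and declare?
    bool all_visited;  // had every prisoner entered the room by then?
    int count;         // the RM's mental counter at the end
};

// Replays a fixed visiting order. Prisoner 0 is the RM (an assumption of this
// sketch); everyone else belongs to the regular group.
Result run_schedule(int prisoners, int flips_per_setter, int target,
                    bool switch_on, const std::vector<int>& schedule) {
    std::vector<int> flips(prisoners, 0);
    std::vector<bool> visited(prisoners, false);
    int count = 0;
    bool declared = false;
    for (int p : schedule) {
        assert(p >= 0 && p < prisoners);
        visited[p] = true;
        if (p == 0) {
            // RM: turn the switch off and count it, then check the target.
            if (switch_on) { switch_on = false; ++count; }
            if (count >= target) { declared = true; break; }
        } else if (!switch_on && flips[p] < flips_per_setter) {
            // Regular prisoner: report a visit only if the switch is off.
            switch_on = true;
            ++flips[p];
        }
    }
    bool all_visited = true;
    for (bool v : visited) all_visited = all_visited && v;
    return {declared, all_visited, count};
}
```

With 3 prisoners and one flip each, run_schedule(3, 1, 2, true, {0, 1, 0}) ends in a declaration even though prisoner 2 never entered (the false positive), while run_schedule(3, 1, 3, false, {1, 0, 2, 0, 1, 2, 0}) leaves the counter stuck at 2 no matter how much longer the schedule goes on (the deadlock).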

Thus we come to the true solution, in which everyone in the regular group attempts to communicate exactly two visits to the RM. To explore the implications on the RM's desired count, let us consider the largest count in which the condition of everyone visiting at least once is still unsatisfied. As we somewhat generalized earlier, it will happen when exactly 98 of the regular group have communicated their visits - this time two - but one person still hasn't communicated any.

Switch starts on: 98 prisoners * 2 on states + 1 prisoner * 0 on states + 1 initial on state = 197 on states counted. The 198th on state counted will guarantee that everyone visited at least once.
Switch starts off: 98 prisoners * 2 on states + 1 prisoner * 0 on states = 196 on states counted. The 197th on state counted will guarantee that everyone visited at least once.

Okay, so like before, we would have to pick the larger number to avoid a false positive. That means the RM is trying to count to 198. But will this repeat the issue of introducing the possibility of the deadlock scenario? Recall that it will happen when all 99 of the regular group have completed their tasks - in this case, reporting exactly two visits - but the RM is still expecting to count more.

Switch starts on: 99 prisoners * 2 on states + 1 initial on state = 199 on states counted.
Switch starts off: 99 prisoners * 2 on states = 198 on states counted.

So 198 works!

It appears that for two potential initial states, which can introduce an offset of 0 or 1 in the count, the prisoners must communicate, redundantly, at least two visits to get the minimum and maximum bounds to align. Further redundancy would widen the range of tolerance between the minimum and maximum, but it's completely unnecessary, so they should just stop at two per prisoner, with the expected count being precisely 198.
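The arithmetic above generalizes to any number of prisoners and any number of reported visits per prisoner. Here is a small sketch of that calculation; count_bounds and Bounds are names invented for this example, and the formulas are just the worst cases worked out above written with variables.

```cpp
#include <cassert>

// Hypothetical helper type for this sketch.
struct Bounds {
    int min_safe_target;       // smallest count that rules out a false positive
    int max_guaranteed_count;  // largest count the RM is certain to reach
};

Bounds count_bounds(int prisoners, int flips_per_setter) {
    assert(prisoners >= 2 && flips_per_setter >= 1);
    const int setters = prisoners - 1;
    // Worst case for a false positive: the switch started on, and all but one
    // of the regular group have reported every visit they are allowed to.
    const int largest_unsafe_count = (setters - 1) * flips_per_setter + 1;
    // Worst case for a deadlock: the switch started off, so only real reports
    // ever reach the RM.
    const int max_guaranteed = setters * flips_per_setter;
    return {largest_unsafe_count + 1, max_guaranteed};
}

// A usable target exists only when the two bounds meet or overlap.
bool target_exists(int prisoners, int flips_per_setter) {
    const Bounds b = count_bounds(prisoners, flips_per_setter);
    return b.min_safe_target <= b.max_guaranteed_count;
}
```

For 100 prisoners, one flip each gives the bounds 100 and 99 (no valid target), two flips give 198 and 198 (exactly one valid target), and three flips give 296 and 297 (a wider window, which is the unnecessary extra tolerance mentioned above).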

Addendum: It turns out there is a special case in which the RM walks in for the first time and sees the switch turned off; this can only have been the case if the switch started off and no one else has entered yet. Otherwise, it would have been flipped on. The RM is at least aware of the first time they entered, so this is a valid way to identify the starting state of the switch, though the likelihood of it happening is probably pretty low. When they know for sure that the switch started off, they can actually be guaranteed that a count of 197 is sufficient.
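To see the full strategy (including the addendum's shortcut) in action, here is a single-threaded sketch with a randomly choosing warden. It is an illustration under assumptions, not the program's actual implementation: simulate, Outcome, and the choice of prisoner 0 as the RM are all hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <random>
#include <vector>

// Hypothetical summary type for this sketch.
struct Outcome {
    bool everyone_visited;  // was the RM's declaration actually true?
    long visits;            // total number of room visits before declaring
    int final_count;        // the RM's counter when declaring
};

Outcome simulate(int prisoners, bool switch_on, std::uint32_t seed) {
    assert(prisoners >= 2);
    std::mt19937 rng(seed);
    std::uniform_int_distribution<int> pick(0, prisoners - 1);
    std::vector<int> flips(prisoners, 0);
    std::vector<bool> visited(prisoners, false);
    int count = 0;
    int target = 2 * (prisoners - 1);  // 198 for 100 prisoners
    bool rm_has_visited = false;
    long visits = 0;
    while (count < target) {
        const int p = pick(rng);  // the warden picks anyone, repeats allowed
        visited[p] = true;
        ++visits;
        if (p == 0) {  // prisoner 0 plays the RM (assumption of this sketch)
            if (!rm_has_visited) {
                rm_has_visited = true;
                // Addendum: off on the RM's very first visit means the switch
                // started off, so one fewer on state is needed.
                if (!switch_on) --target;
            }
            if (switch_on) { switch_on = false; ++count; }
        } else if (!switch_on && flips[p] < 2) {
            switch_on = true;  // report one of this prisoner's two visits
            ++flips[p];
        }
    }
    bool all = true;
    for (bool v : visited) all = all && v;
    return {all, visits, count};
}
```

Whatever seed and initial state you pass in, everyone_visited should come back true. The final count is 198 when the switch starts on, and either 197 or 198 when it starts off, depending on whether the RM happened to get in before anyone else.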

Program Details

The main purpose of the program is to allow a user to simulate different variations of the problem and see how performance changes. By performance, I mean both the prisoners' ability to win the challenge, as well as how fast they can finish. To this end, there are several parameters you can alter when calling the program. You set them with additional arguments on the command line. By default, the program simulates the problem as described so far (100 prisoners, the OS deciding who goes in, their strategy being to have the resident monitor be in charge, the switch state being random, etc.). The following section will list in more detail the various arguments, flags, and other options you can provide to tweak this. After that will be a section that talks about the implementation itself, not the usage of the program. That section is for those curious about the actual code behind the program.

Parameters

After the token used to call the program itself (simulation), you may give additional arguments, separated by whitespace, in any order, to tell the program to behave differently. I distinguish the terms argument, flag, and option here, grouped by the number of hyphen (-) characters used to denote each one:

Command line arguments with no hyphens are what I refer to as plain old arguments.
Those you specify with a single hyphen are what I refer to as flags.
Those you specify with two hyphens in a row are what I call options, and they generally require an equal (=) sign to set, though not always.
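As a rough illustration of that grouping, the sketch below sorts a token into one of the three categories by its leading hyphens and splits an option at its equal sign. It is not the Parser class from parser.h; TokenKind, classify, OptionParts, and split_option are names made up for this example.

```cpp
#include <cassert>
#include <string>

enum class TokenKind { Argument, Flag, Option };

// Classify a command line token by how many hyphens it starts with.
TokenKind classify(const std::string& token) {
    if (token.rfind("--", 0) == 0) return TokenKind::Option;  // starts with "--"
    if (token.rfind("-", 0) == 0) return TokenKind::Flag;     // starts with "-"
    return TokenKind::Argument;
}

// Hypothetical holder for the two halves of "--name=value".
struct OptionParts {
    std::string name;
    std::string value;
    bool has_value;
};

// Split an option at the first '='; options like --help have no value.
OptionParts split_option(const std::string& token) {
    assert(classify(token) == TokenKind::Option);
    const std::string::size_type eq = token.find('=');
    if (eq == std::string::npos) return {token.substr(2), "", false};
    return {token.substr(2, eq - 2), token.substr(eq + 1), true};
}
```

For example, classify("100") is an argument, classify("-d") is a flag, and split_option("--warden=fast") yields the name warden with the value fast.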

Below are the different parameters you can set, grouped by the above categories:

Arguments

Number of Prisoners: any integer > 0. This is the only argument in this program. It represents how many prisoners there will be in the prison, therefore how many will take part in the challenge. It isn't recommended to set it too high, but you could set it lower to finish simulations faster.

Flags

-d: Debug Mode. When specified, enables debug mode (not the same as running an actual debugger lol). This just makes the program print out a lot of extra information about program state and control flow. It is especially useful when trying to understand which threads are getting CPU bursts, and when. You will find that in most operating systems, the "randomness" of the order in which prisoners get to go in is extremely biased. The task of going in and possibly flipping a switch is so trivial that a thread is likely to complete it with lots of time left in the timeslice assigned to it. The OS, seeing the same thread trying to enter the critical section again with time remaining in its slice, will prefer letting it in again rather than wasting resources on a context switch. In essence, the OS doesn't understand the concept of fairness in our context, or what will complete the program faster in the long run, so it just takes the approach that appears to be efficient from a threading perspective. You could try to let the number of prisoners approach infinity with the hope that the time quanta for each thread will approach 0, until the timeslice for a thread is reliably less than the time needed to enter and exit the switch room. However, if your Unix OS uses completely fair scheduling (CFS), then there is some minimum granularity for timeslices which is probably still larger than the time needed to execute critical section code. Check it out for yourself!
-h: Halfway Mode. This is one of the two output mode specifiers, the other being silent mode. If you specify multiple -s and/or -h flags, the last one seen will be used. When specified, halfway mode reduces output by not reporting what prisoners do in the switch room, only who enters. This helps mitigate console clutter without being completely silent. It's probably counterintuitive to set this flag along with enabling debug mode and/or verbose mode, but technically you can.
-s: Silent Mode. This is the other of the two output mode specifiers, the other being halfway mode. If you specify multiple -h and/or -s flags, the last one seen will be used. When specified, silent mode minimizes output by not reporting anything but the final conclusion. Trying to combine this with verbose mode will always result in the -s flag overriding the -v flag (unless you do something funny like -shv, which technically just means you are doing halfway mode with verbose mode enabled). However, you can combine it with debug mode (-d) for an interesting output result focused on thread behavior (unless you set the option to change the warden, in which case the output is pretty much nonsense lol).
-v: Verbose Mode. When specified, enables verbose mode. Gets overridden by silent mode (-s). Verbose mode has the prisoners report a lot more about what they are doing, including how many times they have actually entered the room versus how many times they have flipped the switch.

Options

--initial_state=<state>: Sets the initial state of the switch.
- Some shorthand alternatives for initial_state are initial, init, and i.
- Valid values of <state> are:
  - unknown: The switch state will be randomly determined. This is the default behavior when the option is not specified.
  - on: The switch will start in the on position. You can alternatively write 1, up, or set here.
  - off: The switch will start in the off position. You can alternatively write 0, down, or reset here.
--warden=<type>: Sets what will play the role of the warden, i.e., what decides the order that prisoners go into the switch room.
- Some shorthand alternatives for warden are ward and w.
- Valid values of <type> are:
  - os: Your own OS will be in charge; this is when threads are spawned for each prisoner and all told to try to go in at once. A mutex protects the switch room from multiple prisoners getting in at once, and your OS will have control over scheduling the threads. This is the default behavior when the option is not specified.
  - pseudo: The randomness of the OS is simulated in a single thread; to decide who will be going in, a random number corresponding to some prisoner is chosen, over and over again until the challenge concludes. Because the random number generator doesn't have CFS bias, it's likely to be more fair than the os option, meaning that the prisoners will probably finish much faster despite the general assumption that threading makes things faster. The pseudorandom number generator is based on a Mersenne Twister, and, like with the os option, it is possible for the same prisoner to go in multiple times before another goes in once - it is even possible for the same prisoner to go in multiple times in a row! You can alternatively write pseudorand, pseudorandom, rand, or random here.
  - fixed: In this case, a permutation of the list of all prisoners is decided on in the beginning, and then, in a single thread, that permutation is traversed on loop until the challenge ends. This means that while the order that prisoners will go in is unknown, every prisoner will definitely go in once before the list starts to repeat and prisoners begin going in for their second time, and so on. Note that the prisoners have no way of knowing this (without editing the code), so they will still follow the strategy requiring them to count beyond the first cycle.
  - seq: The same as fixed, but the permutation is just the regular order of the list. The only real purpose of this setting is for when you want the numbers to go in sequence, otherwise it has the same runtime complexity as fixed. You can alternatively write sequential here.
  - fast: Not only do prisoners go into the room in sequential order, but in addition, the resetter (the RM from the problem breakdown above) is sent in every other time (including as the first one in). This results in the best-case scenario of the proper strategy: the minimum number of prisoners enter the room (395, given the default of 100 prisoners). On the other hand, combining it with the improper strategy, while still guaranteed to have the prisoners succeed, actually ends up being slower than fixed and seq, which also guarantee success.
--strategy=<mode>: Sets whether the prisoners will use the bulletproof strategy as described earlier, or a shaky one that can fail.
- Some shorthand alternatives for strategy are strat and st.
- Valid values of <mode> are:
  - proper: This is the strategy that always guarantees that the prisoners will be correct in their claim. They cannot ever be wrong. In this strategy, only the resetter is allowed to make the claim that all prisoners have been in the room at least once. This is the default behavior when the option is not specified. You can alternatively write p here.
  - improper: This strategy allows any prisoner to claim that the challenge is complete. Even setters can make the claim. A setter will do so as soon as they have satisfied both of these requirements: they must have flipped the switch exactly twice, and they must have entered the room more than twice. Note that when the warden is set to fixed, seq, or fast, this strategy is still guaranteed to succeed. Also note that even when the warden is set to os or pseudo, the prisoners still may succeed. Under the CFS assumption, os is most likely to result in a failure. Run it a few times to see! You can alternatively write i here.
--seed=<value>: Sets a seed to be used for anything determined with user-space randomness.
- A shorthand alternative for seed is se.
- <value> must be parsable as an unsigned 32-bit integer.
  - The default behavior when the option is not specified is to randomly generate a seed at runtime. The idea is that, if you specify the -d flag, you can see what seed is being used, and save it to replicate simulations. Note that you cannot replicate a simulation in which the warden was os with a seed alone; you would need to replicate environmental factors beyond the scope of this program; therefore, it is essentially not possible to replicate a simulation in this case at all.
--help: Prints out a summarized version of these usage details.

Again, you can rearrange the order that you specify arguments, flags, and options however you want.
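To make the single-threaded warden types a bit more concrete, here is a sketch of how each one could produce its visiting order. These helpers are hypothetical and are not taken from the codebase; in particular, treating prisoner 0 as the resetter in fast_cycle is an assumption made only for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <numeric>
#include <random>
#include <vector>

// seq: prisoners 0, 1, ..., n-1, traversed on loop by the caller.
std::vector<int> sequential_order(int n) {
    assert(n > 0);
    std::vector<int> order(n);
    std::iota(order.begin(), order.end(), 0);
    return order;
}

// fixed: one permutation chosen up front (here from a seeded Mersenne
// Twister), then traversed on loop by the caller.
std::vector<int> fixed_order(int n, std::uint32_t seed) {
    std::vector<int> order = sequential_order(n);
    std::mt19937 rng(seed);
    std::shuffle(order.begin(), order.end(), rng);
    return order;
}

// fast: the resetter (assumed to be prisoner 0) goes in every other time,
// starting with the very first visit; the rest go in sequence.
std::vector<int> fast_cycle(int n) {
    assert(n >= 2);
    std::vector<int> cycle;
    for (int p = 1; p < n; ++p) {
        cycle.push_back(0);
        cycle.push_back(p);
    }
    return cycle;
}

// pseudo: an independent random draw per visit, so repeats (even
// back-to-back ones) are possible.
class PseudoWarden {
public:
    PseudoWarden(int n, std::uint32_t seed) : rng_(seed), pick_(0, n - 1) {}
    int next() { return pick_(rng_); }
private:
    std::mt19937 rng_;
    std::uniform_int_distribution<int> pick_;
};
```

Two PseudoWarden objects built with the same seed produce the same sequence of prisoners, which is the property that makes a saved seed useful for replaying a run. As noted above, no seed can do the same for the os warden.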

Implementation

The codebase is separated into several header files and source files. The entry point is found in simulation.cpp, which first calls on the Parser class (found in parser.h) to determine user-given parameters for the program, and then initializes the prison before issuing the challenge to the prisoners. The Prison is its own static class (found in prison.h), which keeps track of a vector of Prisoner objects and a SwitchRoom object that contains a Switch object. The Prisoner class itself is just an abstract base class for two child classes, Setter and Resetter. The Prison makes use of polymorphism to work with both Setters and Resetters in terms of their parent class. Prisoner, Setter, and Resetter can all be found in prisoner.h. SwitchRoom and Switch are found in switch.h. The other modules are for global variables/constants and enum definitions.

Within the Prison::challenge() method, the warden is determined and the perform_task() method of each Prisoner is called polymorphically. Whether threads are started on the perform_task() methods depends on whether the warden is set to the OS or not. Other warden types don't require threads, and instead the challenge() method uses other means to decide the order in which Prisoners execute their perform_task() methods. Early on in challenge(), a boolean for tracking whether the challenge is over is initialized to false. Its address in that stack frame is passed to the perform_task() methods of each Prisoner, which may need to use it to break their own infinite loops when threaded; this works because the boolean can only be set to true within a Prisoner's perform_task() method. In a threaded context, all of the threads will then see the update and realize that some thread declared the challenge over. Back in Prison::challenge(), once the boolean is true, the program will check some statistics, then decide whether the prisoners were correct in their claim. It does this by looping over all the prisoners and ensuring that they did all in fact enter the room at least once.
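The threaded flow can be sketched roughly as follows. This is a simplified stand-in rather than the real Prison::challenge(): Room, SetterState, setter_task, resetter_task, and run_threaded_challenge are hypothetical names, and the sketch swaps in std::atomic<bool> for the shared flag so that reads from other threads are well-defined, which may differ from how the actual code shares its boolean.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for the switch room: a door (mutex) and a switch.
struct Room {
    std::mutex door;
    bool switch_on = false;
};

struct SetterState {
    int flips = 0;         // visits reported so far (at most 2)
    bool visited = false;  // checked by the main thread after joining
};

void setter_task(Room& room, std::atomic<bool>& over, SetterState& me) {
    while (!over.load()) {
        {
            std::lock_guard<std::mutex> inside(room.door);  // one at a time
            me.visited = true;
            if (!room.switch_on && me.flips < 2) {
                room.switch_on = true;
                ++me.flips;
            }
        }
        std::this_thread::yield();  // give other prisoners a chance
    }
}

void resetter_task(Room& room, std::atomic<bool>& over, int target) {
    int count = 0;
    while (!over.load()) {
        {
            std::lock_guard<std::mutex> inside(room.door);
            if (room.switch_on) {
                room.switch_on = false;
                if (++count >= target) over.store(true);  // declare
            }
        }
        std::this_thread::yield();
    }
}

// Returns true if the declaration was correct, i.e. every setter had entered.
bool run_threaded_challenge(int prisoners, bool initial_on) {
    assert(prisoners >= 2);
    Room room;
    room.switch_on = initial_on;
    std::atomic<bool> over{false};
    std::vector<SetterState> setters(prisoners - 1);
    std::vector<std::thread> threads;
    threads.emplace_back(resetter_task, std::ref(room), std::ref(over),
                         2 * (prisoners - 1));
    for (SetterState& s : setters)
        threads.emplace_back(setter_task, std::ref(room), std::ref(over),
                             std::ref(s));
    for (std::thread& t : threads) t.join();
    for (const SetterState& s : setters)
        if (!s.visited) return false;
    return true;
}
```

Depending on your toolchain, you may need to build this with thread support enabled (for example, -pthread with GCC or Clang). The final loop over the setters plays the same role as the check described above: it verifies the claim only after every thread has stopped.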

The Prisoner perform_task() methods are very similar between Setters and Resetters. In essence, they all begin by attempting to unlock the door, whether threaded or not. So long as the room is not already occupied, they will succeed, and let themselves in. Once inside, they consider whether they need to inspect the state of the switch or not. The answer to that question is no when the prisoner has already finished their task. However, if they have not, then they consider whether they need to flip the switch. This depends on whether they are a setter or resetter and whether the switch is on or off. If they do decide they need to flip it, they will increment an internal count they are working towards. For setters, this is 2. For resetters, this is determined with the formula (Number of Prisoners - 1) * 2. Once finished, prisoners leave the room and lock the door behind them, signalling to the next in line that the room is available again.
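Leaving the door and threading out of it, the decision logic described above might look something like this. The class names mirror the ones mentioned in this readme, but every signature and member here is an assumption made for illustration and is not copied from prisoner.h; placing the Resetter at index 0 in make_prison is likewise just a convention of this sketch.

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Switch { bool on = false; };  // simplified stand-in for the real Switch

class Prisoner {
public:
    explicit Prisoner(int goal) : goal_(goal) {}
    virtual ~Prisoner() = default;

    // One visit to the room. Returns true if this prisoner declares that
    // everyone has visited.
    bool perform_task(Switch& sw) {
        ++visits_;
        // Only bother inspecting the switch while the task is unfinished.
        if (progress_ < goal_ && should_flip(sw)) {
            sw.on = !sw.on;
            ++progress_;
        }
        return may_declare() && progress_ >= goal_;
    }
    int visits() const { return visits_; }
    int progress() const { return progress_; }

protected:
    virtual bool should_flip(const Switch& sw) const = 0;
    virtual bool may_declare() const = 0;

private:
    int goal_;
    int progress_ = 0;
    int visits_ = 0;
};

class Setter : public Prisoner {
public:
    Setter() : Prisoner(2) {}  // report exactly two visits
protected:
    bool should_flip(const Switch& sw) const override { return !sw.on; }
    bool may_declare() const override { return false; }
};

class Resetter : public Prisoner {
public:
    explicit Resetter(int prisoners) : Prisoner((prisoners - 1) * 2) {}
protected:
    bool should_flip(const Switch& sw) const override { return sw.on; }
    bool may_declare() const override { return true; }
};

// Builds a prison with the Resetter at index 0 and Setters everywhere else.
std::vector<std::unique_ptr<Prisoner>> make_prison(int prisoners) {
    assert(prisoners >= 2);
    std::vector<std::unique_ptr<Prisoner>> prison;
    prison.push_back(std::make_unique<Resetter>(prisoners));
    for (int i = 1; i < prisoners; ++i)
        prison.push_back(std::make_unique<Setter>());
    return prison;
}
```

The caller only ever touches Prisoner pointers, so the loop that sends prisoners into the room does not need to know who is a Setter and who is the Resetter.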

The SwitchRoom's only real purpose is to maintain some state information, including the state of the Switch object within. It is worth noting, if a Prisoner wants to do anything inside the room, including asking about the state of the switch or toggling it, they must first enter the room. And to enter the room, they must first unlock the room. They cannot unlock it if it is not currently locked. This means that if a prisoner leaves the room and does not lock the door behind them, no other prisoner will be able to enter.

gatoflaco / 100-Prisoners-1-Switch-Problem