Among Us

Group 19: Logical Aspects of Multi-Agent Systems.


What is Among Us?

Among Us is a hidden-information game set in various sci-fi locations. The players are divided into two groups, crewmates and impostors, which are visually indistinguishable. The goal of the impostors is to kill every crewmate; the goal of the crewmates is either to complete a series of tasks in their environment or to figure out who the impostors are. The latter is done in a discussion and voting round in which all living players can share information. If enough players vote for one specific player during the voting round, that player is removed from the game.

Information can be obtained in various ways. Impostors know who the other impostors are. Crewmates can obtain this information by witnessing certain acts that only impostors can perform, such as killing or using a special navigation system only accessible to them. Furthermore, crewmates can witness fellow crewmates performing certain actions which confirm their innocence: these are special tasks with an animation that is visible to all players near the acting crewmate. Lastly, impostors have access to a special series of actions called sabotages. These actions have no animations or other indicators of who performed them, so they can be performed without revealing who the impostors are, and they help the impostors isolate crewmates.

The results of this project consist of two parts. First, a simulation was built of a simplified version of the game. In this version, impostors do not have access to a special navigation system or sabotage, and do not interfere with the discussion phase. The corresponding Kripke model encodes which other agents each agent still suspects of potentially being an impostor. Second, a formal model was set up, including a proof-of-concept program, exploring the possibilities of modelling information regarding location and the use of special navigation systems.

This report is structured as follows: First, we discuss the formalization of the Kripke model for the simulation. Second, we discuss how agents can interact with this Kripke model. Third, we discuss the findings from running the simulation with varying parameter settings. Finally, we discuss the location model and its results.

Formalizing Among Us

Fig.1 showcases in broad terms how our implementation of Among Us is structured. Below, the phases are explained in more detail.

Fig.1 - A flowchart of how Among Us is implemented.

Action Phase

The action phase consists of two sub-phases: acting and observing. In the former, all agents can either move to an adjacent room or perform an action: a task if they are a crewmate, or a kill if they are an impostor. They can also choose to neither move nor perform an action. After all actions are completed, the agents observe what has taken place in the room that they are in, and witness everything marked as an observable event, such as visual tasks, a murder, or the presence of a corpse. The agents follow a hard-coded strategy to complete their own tasks by selecting one that they have not completed, moving to the correct room, and performing said task. Impostors, on the other hand, move about randomly when they do not kill. The action phase is repeated until a crewmate observes a dead body and reports it, which immediately triggers the discussion phase.

To model the real game, impostors have a cooldown of a set number of rounds on their kill action. Even when an impostor is able to kill, the action is stochastic, so simply being in the same room as another agent without getting killed does not prove that the other agent is a crewmate. The probability of the impostor killing a crewmate depends on how many other crewmates are in the room: if there are many other crewmates in the same room as the impostor, the chance of the impostor killing anyone is lower than when the impostor is in a room with only one crewmate.
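A minimal sketch of such a stochastic kill decision is shown below. The attribute names (`room.agents`, `is_impostor`, `alive`) and the exact probability formula are hypothetical illustrations, not the actual implementation, which may weigh witnesses differently.

```python
import random

def try_kill(room, kill_cooldown_left, base_kill_prob=0.5):
    """Illustrative stochastic kill decision for an impostor.

    Returns the chosen victim, or None if no kill happens this round.
    """
    if kill_cooldown_left > 0:
        return None  # still on cooldown, no kill this round

    crewmates = [a for a in room.agents if not a.is_impostor and a.alive]
    if not crewmates:
        return None

    # The more potential witnesses in the room, the less likely a kill is.
    kill_prob = base_kill_prob / len(crewmates)
    if random.random() < kill_prob:
        return random.choice(crewmates)
    return None
```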

Discussion Phase

The discussion phase starts with an announcement of which agent has died. Afterwards, each crewmate announces their entire knowledge set: which agents they know to be an impostor, which they know to be a crewmate, and which could still be either. At the start of the game, when crewmates have not yet gained any knowledge, the knowledge of crewmate 1 is modelled as K_1 ((Imp_2 \lor \neg Imp_2) \land (Imp_3 \lor \neg Imp_3) \land \dots \land (Imp_n \lor \neg Imp_n)). Here, both possibilities are still open for each agent, and the role of the crewmate making the announcement is not included.

When an agent then learns, for example, that a_2 is an impostor and a_3 is a crewmate, the knowledge is updated to K_1 (Imp_2 \land \neg Imp_3 \land \dots \land (Imp_n \lor \neg Imp_n)).
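As an illustration of the idea behind these formulas, a crewmate's knowledge set can be represented as a mapping from every other agent to the set of roles still considered possible. The names below are hypothetical and only sketch the bookkeeping, not the actual implementation.

```python
# Agent 1's knowledge set: every other agent maps to the roles still possible.
knowledge_of_1 = {agent: {"impostor", "crewmate"} for agent in range(2, 11)}

def learn(knowledge, agent, role):
    """Collapse the possibilities for `agent` to the single role that became known."""
    knowledge[agent] = {role}

learn(knowledge_of_1, 2, "impostor")   # corresponds to K_1 Imp_2
learn(knowledge_of_1, 3, "crewmate")   # corresponds to K_1 ¬Imp_3
# Agents still mapped to both roles correspond to the (Imp_n ∨ ¬Imp_n) conjuncts.
```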

Impostors do not announce anything at this stage. Since crewmates only incorporate the knowledge of agents that they are certain are also crewmates, the announcements of the impostors would always be ignored and would therefore add no value.

Voting Phase

The crewmates and the impostors have different voting behaviour. Before voting, the crewmates check in the Kripke model which worlds they still consider possible and therefore which agents they still suspect. They also check whether they already know that a specific agent is an impostor. If so, they vote for that agent. Otherwise, they either vote for a random agent from their suspect list or pass. As the list of suspects shrinks, so does the chance of a crewmate passing, which makes crewmates more confident in their guess when fewer impostor candidates remain. The threshold under which an agent passes is the number of suspects divided by half the initial number of agents, as sketched below.
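A sketch of this crewmate voting rule, with hypothetical function and variable names, could look as follows:

```python
import random

def crewmate_vote(known_impostors, suspects, n_initial_agents):
    """Illustrative crewmate vote.

    `known_impostors`: agents this crewmate knows to be impostors.
    `suspects`: agents still considered possible impostors.
    Returns the agent voted for, or None for a pass.
    """
    if known_impostors:
        return random.choice(sorted(known_impostors))  # vote for a confirmed impostor

    # The pass threshold shrinks with the suspect list, so votes become
    # more confident when fewer possibilities remain.
    pass_threshold = len(suspects) / (n_initial_agents / 2)
    if random.random() < pass_threshold:
        return None
    return random.choice(sorted(suspects))
```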

The impostors use higher-order knowledge in their voting strategy. Before voting, they determine which crewmate to vote off by counting how many suspects each crewmate still has. This number is indirectly announced during the discussion phase, so from this information the impostors know what the crewmates know. To eliminate the biggest threats, the impostors vote for the agent that suspects the fewest agents and is therefore closest to identifying the impostors. Impostors never vote for fellow impostors.
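The impostor's choice can be sketched in the same illustrative style; `suspect_counts` stands for the per-crewmate suspect counts gathered from the announcements, and the names are again hypothetical.

```python
def impostor_vote(suspect_counts, fellow_impostors):
    """Illustrative impostor vote: target the crewmate with the fewest remaining
    suspects, i.e. the one closest to identifying the impostors.

    `suspect_counts` maps each living crewmate to the number of agents
    they still suspect, as announced in the discussion phase.
    """
    candidates = {a: n for a, n in suspect_counts.items() if a not in fellow_impostors}
    return min(candidates, key=candidates.get)  # assumes at least one crewmate is alive
```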

After all votes have been cast, the action with the most votes is executed: either an agent is voted off the ship, or the voting round passes. If multiple actions tie for the most votes, the voting round also passes. A majority (more than half of the votes) is not required for an action to be executed.
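The tallying rule can be sketched as follows (again with hypothetical names):

```python
from collections import Counter

def resolve_votes(votes):
    """Illustrative vote resolution.

    `votes` maps each living agent to the agent they voted for, or None for a pass.
    A plurality suffices; any tie for the most votes means nobody is voted off.
    """
    tally = Counter(votes.values())
    if not tally:
        return None
    ranked = tally.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None                 # tie between actions: the round passes
    return ranked[0][0]             # an agent to vote off, or None if passing won
```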

Ending conditions

The game can end in three possible ways:

- The crewmates vote off all impostors, and the crewmates win.
- The crewmates complete all of their tasks, and the crewmates win.
- The impostors kill enough crewmates, and the impostors win.

Formal Model

Our simulation allows for one or two impostors, each with a different Kripke model. The formal definitions for both are given here. The following components are shared by both definitions:


Kripke Model: One impostor

The Kripke model for the case of one impostor is defined as follows:

M^1 ::= \left< S^1, \pi^1, R_1^1, \dots, R_m^1 \right>, with:

Kripke Model: Two impostors

The Kripke model for the case of two impostors is defined as follows:

M^2 ::= \left< S^2, \pi^2, R_1^2, \dots, R_m^2 \right>, with:

Kripke Model: Number of worlds and relations

As mentioned before, our implementation of Among Us supports simulations with either one or two impostors. The size of the Kripke model depends on this, as each world represents which single agent is, or which pair of agents are, the impostor(s). For one impostor, the model has as many worlds as there are agents, where each world represents the state in which that agent is the impostor. With ten agents, every world is connected to every world, meaning there are 10*10 = 100 relations.

For two impostors, the number of worlds is the number of agents times the number of agents minus one, divided by two. For ten agents, this means there are (10*9)/2 = 45 possible worlds, so that every unordered pair of agents has a single world: w_{1,2} exists but w_{2,1} does not. As for the relations: for each agent, every world in which that agent is not involved (45-9 = 36) is connected to each of the other worlds not involving that agent (35). Additionally, each world has a reflexive relation to itself. There are therefore (36*35)+45 = 1305 relations per agent, so 10*1305 = 13050 relations in total. The Kripke model for ten agents and two impostors at the start of the game can be seen in Fig.2.
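The counts above can be reproduced with a short computation. The function below is only an illustration of the arithmetic described in this section, not part of the simulation code.

```python
from math import comb

def kripke_model_size(n_agents, n_impostors):
    """Worlds/relations counts, mirroring the counting described above."""
    if n_impostors == 1:
        worlds = n_agents                      # one world per candidate impostor
        relations = worlds * worlds            # every world connected to every world
    else:
        worlds = comb(n_agents, 2)             # unordered impostor pairs: 45 for 10 agents
        not_involving = comb(n_agents - 1, 2)  # worlds not containing a given agent: 36
        per_agent = not_involving * (not_involving - 1) + worlds  # 36*35 + 45 = 1305
        relations = n_agents * per_agent       # 13050 in total
    return worlds, relations

print(kripke_model_size(10, 1))  # (10, 100)
print(kripke_model_size(10, 2))  # (45, 13050)
```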

Fig.2 - The full Kripke model at the start of a ten-agent game with two impostors.

In the visualization of the model, the reflexive arrows have been left out and the straight lines represent bidirectional arrows. Additionally, the impostors' relations have been left out of the in-game simulation, because the impostors do not update these relations, nor are they used by any other agent. This allows for a neater visualization, as otherwise all relations would still be visible at the end of the game. At the end of an example run, the Kripke model looks like Fig.3.

Fig.3 - The full Kripke model after a random ten-agent game with two impostors.

Knowledge Types

There are four basic types of knowledge that can be learned. Here, a_1, a_2 \in A, x_1, x_2 \in X and y \in Y (see the 'Formal Model' section):

For the first two of these knowledge types to be gained, a crewmate a_1 must be in the same room as agent a_2 (where a_1 \neq a_2) at the same time. The last knowledge type, whether an agent is an impostor or not, can be derived from the first three knowledge types as described in the next sections.

Reasoning Rules

There are several rules a crewmate (here a_1, a_2 and a_3) can use to deduce this last type of knowledge about who the impostors (here a_4) are. They are listed as follows:

Findings

To test the functionality and interactions of our implementation, we ran a number of simulations. The standard settings with which we ran the simulations are:

Each simulation was run for 10,000 iterations with the standard settings, except for the variable being experimented with.

The first changeable variable is the number of impostors (see Fig.4). Our implementation supports having either one or two impostors. Here, as expected, the crewmates are very likely to win in a game with only one impostor. In most cases, the crewmates figure out who the impostor is and vote them out. In some cases, the crewmates win by completing their tasks. This means that the impostor was not fully discovered, but could also not kill enough crewmates in time. In about 3% of all cases, the single impostor still manages to kill enough crewmates in order to win.

The game with two impostors, run with all the standard settings, is more balanced: the impostor/crewmate win rate is close to 50/50. This balance, together with the fact that these are the standard settings of the real game Among Us, is why the base settings were chosen this way. When crewmates win, they usually do so because they discovered who the impostors were and voted them out. In about 13% of all runs, the impostors could not kill fast enough but were also not discovered in time; this results in a win for the crewmates because they managed to finish all their tasks and save the ship.

Fig.4 - The winrate of impostors and crewmates depending on the number of impostors.

When varying the number of crewmates (see Fig.5), the win rate shifts considerably. With three crewmates and two impostors, it is nearly impossible for the crewmates to win, because if only one of them is killed, they already lose. When increasing the number of crewmates, on the other hand, it becomes apparent that the impostors have a hard time winning. Another visible change is that the crewmates are more likely to win by voting out the impostors than by completing all their tasks. This is because (1) the impostors have to kill more people and therefore have a higher chance of being caught, (2) it takes longer for the impostors to kill all crewmates, giving the crewmates more time to communicate, and (3) crewmates have more agents that they can trust and communicate with, so their combined knowledge is greater and the impostors can be identified more easily.

Fig.5 - The winrate of impostors and crewmates depending on the number of crewmates.

Varying the number of tasks (see Fig.6) also behaves as expected. With a low number of tasks (e.g. four), the crewmates mainly win by completing their tasks, as the impostors do not have enough time to kill and the crewmates do not have enough time to gain and share knowledge. With an increasing number of tasks, however, the chance of crewmates winning by finishing their tasks becomes smaller and smaller; with 16 tasks, this becomes nearly impossible. The increasing number of tasks also increases the chance of impostors winning, as they have more time to kill, but it likewise increases the chance of crewmates winning by voting off the impostors. With more tasks, the number of visual tasks also scales, which allows the crewmates to clear each other more often and therefore share their knowledge more often. Additionally, they have extra time to gather and share this information.

Fig.6 - The winrate of impostors and crewmates depending on the number of tasks.

The number of visual tasks greatly impacts the trust relations between agents, as can be seen on the left of Fig.7. With no visual tasks, the crewmates can never establish full trust in another agent. They therefore cannot know that what other agents communicate is true, making them vulnerable to losing to the impostors, as can be seen on the right of Fig.7. With an increasing number of visual tasks (five visual tasks means that half of the tasks are visual, and ten that all tasks are visual), the win ratio of the crewmates quickly increases, as their chances of trusting and communicating with other crewmates also increase. This is visible in the decline in impostor wins and the decreasing share of task-based wins in the crewmates' total number of wins.

Fig.7 - The winrate of impostors and crewmates as well as the total trust relations depending on the number of visual tasks.

The last variable that we changed in our simulations was the kill cooldown of the impostors (see Fig.8), denoted as the number of time steps that pass before an impostor can kill again. Interestingly, when the impostors have no kill cooldown at all, they are less likely to win than with a short kill cooldown. This can mostly be explained by how the impostors decide to kill: the more agents are in the room, the less likely an impostor is to kill one of them, as those agents would otherwise know who the impostor is. This probability is not zero, however, so with such a short cooldown the impostors also have a higher probability of killing agents when too many other agents are around. This can, for example, occur right at the start of the game, when all agents gather in the Cafeteria. Additionally, more frequent kills lead to more frequent discussion phases, which allows crewmates to communicate and vote more often. In general, this leads to shorter games, which is also illustrated by the fact that next to no games are won by completing all the tasks. With a longer cooldown of ten time steps, the behaviour is as expected: the impostors win less often than with a five-step cooldown, and the crewmates win more often by completing all their tasks, as the impostors often cannot kill the crewmates in time.

Fig.8 - The winrate of impostors and crewmates depending on the kill cooldown.

Extended Knowledge Model: Movement between Rooms

In this section, we discuss a model for formalizing the movement that takes place in Among Us. Due to computational and time constraints, this part was not added to the main simulation but implemented for a much simpler game in a separate branch.

Background

One kind of reasoning about the game Among Us involves the way players move from room to room. For example, consider a scenario where player c learns that b was killed in a specific room and tries to figure out who could have been the killer. He might for example think as follows: “I was close to player a while b was killed at the other side of the map. This means a cannot be the killer.” A good player does not just consider the information that he observed directly, but also what can be extrapolated from it by considering the rules of the game. For example, a might have been out of sight of c for a short time, but c can use his knowledge of the map to deduce that there was insufficient time for a to move all the way to the place where b was killed. In this case c should still remove a from his list of suspects.

We can capture this kind of reasoning by constructing a Kripke model that describes what agents know about where everyone is at any time. To keep things simple, we study a highly simplified version of the game in which we only consider the way agents move from room to room. Movement and observations are described using Dynamic Epistemic Logic action models (Ditmarsch, Hoek, and Kooi 2007) with postconditions (Benevides and Lima 2017). This approach could in principle be extended to a more complex model that also considers other types of actions and agent roles, although in that case the number of states would probably become very large even for modest numbers of agents and rooms. In what follows, we first describe the Kripke model and the action models used to update it. We then describe how the model was implemented in Python, describe two optimizations used in the implementation, and show its application to a simple example. Our source code is available on GitHub and can also be executed interactively in a web-based notebook on Binder without the need to install any software.

Kripke model

Let \mathcal{A} denote the set of agents and \mathcal{R} the set of rooms. At any time, each agent is in one of a fixed set of rooms. We describe the location of agents with the set of atomic propositions P_{\mathcal{M},\mathcal{R}} = \{a_r | a \in \mathcal{A}, r \in \mathcal{R}\}, where a_r denotes that a is in room r. The way agents can move from room to room is described by the relation \rightsquigarrow on the rooms, such that r_1 \rightsquigarrow r_2 if room r_2 can be reached from room r_1 in one movement step. Since we will always allow agents to move in both directions between rooms and to remain in the same room, the relation \rightsquigarrow will be reflexive and symmetric.

When |\mathcal{A}| = m, at any given time the knowledge in our model is described by a \mathcal{S}5_{\mathsf{(m)}}-world (\mathbb{M}, s), where \mathbb{M} = \left<S, \pi, R_1, \dots, R_m \right> is a Kripke structure and the distinguished state s \in S describes the real world. The valuation function \pi: S \rightarrow (P_{\mathcal{M},\mathcal{R}} \rightarrow \{\mathbf{t},\mathbf{f}\}) assigns a truth value to all atomic propositions in each state. Because agents are always in exactly one room, the valuation function must satisfy the following constraint for every a \in \mathcal{A} in order to describe a consistent state of the game:

[\pi(s)(a_{r_i}) = \mathbf{t}] \quad \mathrm{iff} \quad [\pi(s)(a_{r_j}) = \mathbf{f} \text{ for all } j \neq i]
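As an illustration, the consistent states of such a model can be enumerated by assigning each agent to exactly one room; the agent and room names below are placeholders and the atom encoding as (agent, room) pairs is our own choice for these sketches, not the actual implementation.

```python
from itertools import product

agents = ["a", "b"]
rooms = ["r1", "r2", "r3", "r4"]

def consistent_states(agents, rooms):
    """All valuations satisfying the 'exactly one room per agent' constraint:
    a state is simply an assignment of each agent to a room."""
    for placement in product(rooms, repeat=len(agents)):
        location = dict(zip(agents, placement))
        # the atom (agent, room) is true iff that agent is in that room
        yield {(a, r): (location[a] == r) for a in agents for r in rooms}

print(len(list(consistent_states(agents, rooms))))  # |R| ** |A| = 16
```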

Action models

The Kripke model is updated by executing an action model with preconditions and postconditions. An action model describes a set of possible actions, of which one will be executed. The definitions we use are similar to those given in (Benevides and Lima 2017), but the preconditions are restricted to conjunctions of possibly negated atomic propositions, which is sufficient for our purpose. Briefly, an action model with postconditions is very similar to a Kripke structure and has the following form: \mathsf{M} = \left<\mathsf{S}, \mathsf{pre}, \mathsf{post}, R_1, \dots, R_m\right>. Here \mathsf{S} is the set of actions that are possible in the action model, \mathsf{pre} : \mathsf{S} \rightarrow \mathcal{L} assigns a precondition sentence to each action (something like p_1 \land p_2 \land \neg p_3), and \mathsf{post} : \mathsf{S} \rightarrow (\{p_1, \dots, p_j \}, \{\neg q_1, \dots, \neg q_k\}) assigns to each action a set of propositions that become true after that action is executed and a set of propositions that become false. The relations R_1, \dots, R_m relate actions that are indistinguishable for the corresponding agent.

The result of executing an action (\mathsf{M}, \mathsf{j}) in a Kripke world (\mathbb{M}, s) is the Kripke world (\mathbb{M} \otimes \mathsf{M}, (s, \mathsf{j})). We will not repeat the full definition of the operation, but the states (s, \mathsf{j}) of \mathbb{M} \otimes \mathsf{M} are those states in the Cartesian product S \times \mathsf{S} where (\mathbb{M}, s) \vDash \mathsf{pre}(\mathsf{j}), with the substitutions in the postcondition carried out. Two states in the product model are related for agent a if the corresponding states and actions are related for that agent in both the Kripke model and the action model (thus agents can distinguish resulting states when they can distinguish either the previous state or the action).
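A compact sketch of this restricted product update, using plain dictionaries rather than our NetworkX-based classes, could look like this; the data layout is illustrative, not the actual implementation.

```python
def product_update(kripke, action_model):
    """Sketch of the restricted product M ⊗ 𝖬 described above.

    kripke = {"states": {s: set_of_true_atoms}, "rel": {agent: set_of_pairs}}
    action_model = {"actions": {j: {"pre_true": set, "pre_false": set,
                                    "post_true": set, "post_false": set}},
                    "rel": {agent: set_of_pairs}}
    """
    new_states = {}
    for s, val in kripke["states"].items():
        for j, act in action_model["actions"].items():
            # keep (s, j) only if the precondition holds in s
            if act["pre_true"] <= val and not (act["pre_false"] & val):
                new_states[(s, j)] = (val | act["post_true"]) - act["post_false"]

    new_rel = {}
    for agent in kripke["rel"]:
        new_rel[agent] = {
            ((s1, j1), (s2, j2))
            for (s1, j1) in new_states
            for (s2, j2) in new_states
            if (s1, s2) in kripke["rel"][agent]
            and (j1, j2) in action_model["rel"][agent]
        }
    return {"states": new_states, "rel": new_rel}
```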

In our simplified version of Among Us, every turn consists of all agents simultaneously taking a movement step, followed by all of them simultaneously observing the state of the new room they are in. We can define this set of operations neatly as the execution of a sequence of action models. We define a movement action model \mathsf{MOV}_a describing agent a taking one step, and an action model \mathsf{OBS} for the effect of all observations. The relation between the Kripke models of two consecutive time steps is then as follows:

(\mathbb{M}_{t+1}, s_{t+1}) = (\mathbb{M}_{t}, s_{t}) \otimes (\mathsf{MOV}_{a}, \mathsf{mov}_{a,t}) \otimes \dots \otimes (\mathsf{MOV}_{m}, \mathsf{mov}_{m,t}) \otimes (\mathsf{OBS}, \mathsf{obs}_{t})

Single agent movement

The single-agent movement action model \mathsf{MOV}_a contains one action for every connected pair of rooms (r, s) \in \rightsquigarrow. It has precondition \mathsf{pre}(r,s) = a_r and postcondition \mathsf{post}(r,s) = (\{a_s\}, \{\neg a_r\}) (so the only change is the position of agent a). All actions are distinguishable for a and indistinguishable for all other agents, so R_x contains only the reflexive relations if x = a and is the full relation on the actions of \mathsf{MOV}_a for all other agents.
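The construction of \mathsf{MOV}_a can be sketched in the same dictionary-based style as the product update above (illustrative names, not the actual implementation):

```python
def movement_action_model(agent, moves, all_agents):
    """Sketch of MOV_a. `moves` is the room relation ⇝ as a set of (r, s) pairs,
    assumed reflexive and symmetric."""
    actions = {
        (r, s): {"pre_true": {(agent, r)}, "pre_false": set(),
                 "post_true": {(agent, s)},
                 # when r == s the agent stays put, so nothing becomes false
                 "post_false": {(agent, r)} - {(agent, s)}}
        for (r, s) in moves
    }
    rel = {}
    for a in all_agents:
        if a == agent:
            rel[a] = {(j, j) for j in actions}                       # a sees their own move
        else:
            rel[a] = {(j1, j2) for j1 in actions for j2 in actions}  # others cannot distinguish
    return {"actions": actions, "rel": rel}
```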

Observation

The observation action model contains one action for every possible distribution of agents over rooms, so the total number of actions is |\mathcal{R}|^{|\mathcal{A}|}. For the action corresponding to a distribution (a_i, b_j, \dots), the precondition is the conjunction a_i \land b_j \land \dots, so each action picks out the states in the Kripke model that match that distribution of agents. Two actions in the observation model are equivalent for agent a if the set of agents in the room where a is located is the same. The postconditions of the observation action model are empty. Thus, applying this action model leaves the size of the Kripke model unchanged but removes relations between states that agents can distinguish based on what they can see in the room they are in.
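A corresponding sketch of \mathsf{OBS}, with the same caveats as the sketches above:

```python
from itertools import product

def observation_action_model(agents, rooms):
    """Sketch of OBS: one action per distribution of agents over rooms (|R|^|A| actions),
    with empty postconditions."""
    actions = {}
    for placement in product(rooms, repeat=len(agents)):
        dist = tuple(zip(agents, placement))
        actions[dist] = {"pre_true": set(dist), "pre_false": set(),
                         "post_true": set(), "post_false": set()}

    def seen_by(agent, dist):
        # what `agent` observes: the set of agents sharing their room
        location = dict(dist)
        return frozenset(a for a in agents if location[a] == location[agent])

    rel = {a: {(j1, j2) for j1 in actions for j2 in actions
               if seen_by(a, j1) == seen_by(a, j2)}
           for a in agents}
    return {"actions": actions, "rel": rel}
```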

Implementation

The Kripke models and action models were implemented from scratch in Python using the NetworkX graph manipulation library (Hagberg, Schult, and Swart 2008). As we only consider fairly simple sentences here, all logical sentences are evaluated using Python's built-in set operations. To handle more complex sentences (like higher-order knowledge), the Kripke models could be translated into mlsolver models like in the other part of our project, but for the sake of simplicity we did not pursue this.

Optimization 1: Removing irrelevant states

Because we only evaluate sentences that contain a single K_a-operator in the distinguished state s, and this operator only depends on the states that are reachable from s in one step, the outcomes we report do not depend on the states of the Kripke model that are not related to s for any agent. We exploit this fact by removing all states that are not neighbours of s after every application of an action model. It is easy to see that this simplification is safe as long as the preconditions and postconditions in the action model do not use the K-operator (which is currently not possible in our simulation). This optimization can be enabled or disabled in the simulation, and was used during the generation of the results below.
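Sketched on the same illustrative data layout as before (not the actual NetworkX-based code), the pruning step could look as follows:

```python
def prune_to_neighbours(model, real_state):
    """Optimization 1 sketch: keep only the real state and the states related to it
    for at least one agent. Assumes symmetric (S5) relations, so successors suffice."""
    keep = {real_state}
    for pairs in model["rel"].values():
        keep |= {t for (s, t) in pairs if s == real_state}

    states = {s: v for s, v in model["states"].items() if s in keep}
    rel = {a: {(s, t) for (s, t) in pairs if s in keep and t in keep}
           for a, pairs in model["rel"].items()}
    return {"states": states, "rel": rel}
```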

Optimization 2: Replacing nodes with their bisimulation class

After the execution of a sequence of action models, one state will have been created for each sequence of actions allowed by the preconditions, so the number of states tends to grow as the simulation proceeds. Many of these states can be removed by the first optimization, but an orthogonal problem is the creation of states that are equivalent for the evaluation of all sentences. This type of equivalence is called a bisimulation, and we can deal with it by replacing all states in the Kripke model with their bisimulation class. A full explanation is given in (Eijck 2006), together with an algorithm that can be used to perform the optimization. We implemented this algorithm and saw a simplification of the Kripke model in many cases. For example, when the simulation is started with two agents in the same room that are then moved apart and back together, both agents know the full state of the game, but the Kripke model consists of a number of indistinguishable states referring to each other; the bisimulation algorithm correctly reduces these to a single state. In our example below, this step was performed after the removal of irrelevant states every time an action model was executed. In the implementation of the bisimulation algorithm, an existing recipe for partitioning sets into equivalence classes based on a binary predicate was used (Reid 2007), together with the graph algorithms built into NetworkX.
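For illustration only, a naive partition-refinement version of this quotient could look like the sketch below; our implementation follows van Eijck's algorithm and the equivalence-partition recipe instead, and the data layout matches the earlier sketches rather than the real code.

```python
def bisimulation_quotient(model, real_state):
    """Naive partition refinement into bisimulation classes, then collapse each
    class to a single representative state. Returns the quotient model and the
    representative of the real state's class."""
    states = list(model["states"])
    # Initial partition: group states with identical valuations.
    block = {s: frozenset(model["states"][s]) for s in states}

    while True:
        # A state's signature is its block plus, per agent, the set of blocks it can reach.
        sig = {s: (block[s],
                   tuple(frozenset(block[t] for (u, t) in model["rel"][a] if u == s)
                         for a in sorted(model["rel"])))
               for s in states}
        if len(set(sig.values())) == len(set(block.values())):
            break          # no block was split: the partition is stable
        block = sig

    rep = {}
    for s in states:
        rep.setdefault(block[s], s)
    new_states = {rep[block[s]]: model["states"][s] for s in states}
    new_rel = {a: {(rep[block[s]], rep[block[t]]) for (s, t) in pairs}
               for a, pairs in model["rel"].items()}
    return {"states": new_states, "rel": new_rel}, rep[block[real_state]]
```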

Example

For the example, we consider a game where two agents a and b move around in a map of four rooms that are connected in a linear fashion (r_1 \leftrightsquigarrow r_2 \leftrightsquigarrow r_3 \leftrightsquigarrow r_4). In the first time step, both agents are in room 1. In every consecutive step, agent b moves one room to the right while agent a remains in room 1. Because the Kripke models are not very informative to look at directly, we have at each time step evaluated propositions of the form a_i, \neg a_i, K_{a^\prime} a_i, K_{a^\prime} \neg a_i in the distinguished state s (the real world) for every combination of a \in \{a, b\}, a^\prime \in \{a, b\}, i \in \{1,2,3,4\}. The code includes a function for automatically evaluating such a set of propositions and constructing a LaTeX table that can be used to easily check the state of the Kripke model.

The results are shown in the tables below, where all propositions in the truth row should be read as (\mathbb{M}_t,s_t) \vDash p, those in the agent a row as (\mathbb{M}_t,s_t) \vDash K_a p, and those in the agent b row as (\mathbb{M}_t,s_t) \vDash K_b p (so each table cell shows the propositions describing that room that are true, or that are known to be true by that agent).

\begin{array}{|c|c|c|c|c|} \hline t = 1 & \text{Room 1}&\text{Room 2}&\text{Room 3}&\text{Room 4}\\ \hline \text{Truth} & a_1,b_1 & \neg b_2,\neg a_2 & \neg b_3,\neg a_3 & \neg a_4,\neg b_4\\ \hline \text{Agent a} & a_1,b_1 & \neg b_2,\neg a_2 & \neg b_3,\neg a_3 & \neg a_4,\neg b_4\\ \hline \text{Agent b} & a_1,b_1 & \neg b_2,\neg a_2 & \neg b_3,\neg a_3 & \neg a_4,\neg b_4\\ \hline \end{array}

Initially, agents a and b can see each other, so both know where the other is, and consequently the entire state of the game.

\begin{array}{|c|c|c|c|c|} \hline t=2 & \text{Room 1}&\text{Room 2}&\text{Room 3}&\text{Room 4}\\ \hline \text{Truth} & a_1,\neg b_1 & b_2,\neg a_2 & \neg b_3,\neg a_3 & \neg a_4,\neg b_4\\ \hline \text{Agent a} & a_1,\neg b_1 & b_2,\neg a_2 & \neg b_3,\neg a_3 & \neg a_4,\neg b_4\\ \hline \text{Agent b} & a_1,\neg b_1 & b_2,\neg a_2 & \neg b_3,\neg a_3 & \neg a_4,\neg b_4\\ \hline \end{array}

At time 2, b has moved one step to the right (as can be read from the truth row). After one step of b, agent a knows that b must be in room 2 because he is no longer in room 1 and this is the only room that was reachable in one step. Similarly, agent b sees that a is not in room 2. Because the only possible actions for a were to move to room 2 or to stay in room 1, he knows a must still be in room 1. Consequently, both agents still know the valuation of all atomic propositions.

\begin{array}{|c|c|c|c|c|} \hline t = 3 & \text{Room 1}&\text{Room 2}&\text{Room 3}&\text{Room 4}\\ \hline \text{Truth} & a_1,\neg b_1 & \neg b_2,\neg a_2 & b_3,\neg a_3 & \neg a_4,\neg b_4\\ \hline \text{Agent a} & a_1,\neg b_1 & \neg a_2 & \neg a_3 & \neg a_4,\neg b_4\\ \hline \text{Agent b} & \neg b_1 & \neg b_2 & b_3,\neg a_3 & \neg a_4,\neg b_4\\ \hline \end{array}

After another time step, a has no way to know if b stayed in room 2 or moved to room 3. Consequently, all he knows about b is that he is not in room 1 (or he would be seen by a) or in room 4 (because that room cannot be reached from room 1 in two steps). Agent b also doesn’t know exactly where a is anymore, but he does know a cannot be in room 3 or 4.

\begin{array}{|c|c|c|c|c|} \hline t = 4 & \text{Room 1}&\text{Room 2}&\text{Room 3}&\text{Room 4}\\ \hline \text{Truth} & a_1,\neg b_1 & \neg b_2,\neg a_2 & \neg b_3,\neg a_3 & b_4,\neg a_4\\ \hline \text{Agent a} & a_1,\neg b_1 & \neg a_2 & \neg a_3 & \neg a_4\\ \hline \text{Agent b} & \neg b_1 & \neg b_2 & \neg b_3 & b_4,\neg a_4\\ \hline \end{array}

Finally, agent b moves to the last room. Both agents now only know that the other is not in the same room as they are.

References

Benevides, Mario, and Isaque Lima. 2017. "Action Models with Postconditions." Computación y Sistemas 21 (3): 401–6. https://doi.org/10.13053/CyS-21-3-2808.

Ditmarsch, Hans van, Wiebe van der Hoek, and Barteld P. Kooi. 2007. Dynamic Epistemic Logic. Dordrecht: Springer.

Eijck, Jan van. 2006. “Lecture Notes Logica Voor Ai: Bisimulations.” https://staff.fnwi.uva.nl/d.j.n.vaneijck2/courses/lai0506/LAI11.pdf.

Hagberg, Aric A., Daniel A. Schult, and Pieter J. Swart. 2008. “Exploring Network Structure, Dynamics, and Function Using Networkx.” In Proceedings of the 7th Python in Science Conference, edited by Gaël Varoquaux, Travis Vaught, and Jarrod Millman, 11–15. Pasadena, CA USA.

Reid, John. 2007. “Equivalence Partition (Python Recipe).” https://code.activestate.com/recipes/499354-equivalence-partition/.