MAS Final Project Sup 1

CSCE 475/875

Final Project Assignment:

Local Decisions vs. Global Coherence

Assigned: September 23, 2013

Due: December 16, 2013

(A project submitted 5 minutes late will not be accepted)

A Supplement: An Example

This supplement gives you an example of the proposal to be turned in on October 16. Your proposal has five components, and each must provide sufficient detail for me to judge the complexity of your design and project, and to hold you accountable for your design when I evaluate your final project report. The five components are: (1) the problem statement, (2) the agent design strategy, (3) the desired emergent behavior, (4) the hypotheses, and (5) the experiments that you will likely conduct.

Here is an example:

Problem Statement

To design a multiagent system in which individual agents work effectively and efficiently as a team of robotic mine sweepers in a specific area. The area’s size is L × L square feet, and K mines are located randomly in it. N mine sweepers are randomly dropped into the area. Each mine is set to go off on its own within a specific time, and will render a mine sweeper non-functional (“kill a mine sweeper”) if the mine sweeper is within an R ft. radius of the exploding mine. Every mine sweeper has only enough battery power to traverse a total of B feet. Three mine sweepers are required to de-activate a mine. To be optimally effective, the team must be able to detect all mines in the area before a mine goes off. To be optimally efficient, the team must keep all its robotic mine sweepers alive and retain the highest total amount of battery power across all the robots.
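As an illustration only, the setup above could be sketched in Python as follows. The class names, the integer grid positions, the `max_timer` argument, and the choice of Euclidean distance for the blast radius are all assumptions made for this sketch; the proposal itself does not fix them.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class Mine:
    x: int
    y: int
    timer: int           # time steps left before the mine goes off on its own
    active: bool = True


@dataclass
class Sweeper:
    x: int
    y: int
    battery: int         # remaining travel distance in feet (B at the start)
    alive: bool = True


def make_world(side, k, n, b, max_timer, rng):
    """Drop k mines and n sweepers at random positions in a side x side area."""
    mines = [
        Mine(rng.randrange(side), rng.randrange(side), rng.randint(1, max_timer))
        for _ in range(k)
    ]
    sweepers = [
        Sweeper(rng.randrange(side), rng.randrange(side), b) for _ in range(n)
    ]
    return mines, sweepers


def explode(mine, sweepers, r):
    """Set off the mine and kill every live sweeper within r feet of it."""
    mine.active = False
    for s in sweepers:
        if s.alive and math.hypot(s.x - mine.x, s.y - mine.y) <= r:
            s.alive = False
```

A world would then be created with, for example, `make_world(100, 10, 20, 250, 50, random.Random(0))`; passing the random generator in makes a run repeatable.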

Agent Design Strategy

Each agent can move in the grid in four directions: north, south, east, and west, one foot at a time. If it is atop a mine, it can obtain the remaining time before the mine goes off. It can then decide to move away from the mine immediately, one foot at a time, or it can decide to de-activate it. To de-activate a mine, it can try to form a team of three members, communicating with its team members so that all three move into the mine's grid cell. When all three are atop the mine, one will signal an action to de-activate the mine. Of course, if the mine explodes before it is de-activated, then all three sweepers will be killed.
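The movement and de-activation rules in this paragraph might be sketched as below. This is a minimal illustration under assumptions: positions are (x, y) tuples, north is taken as increasing y, an agent that would leave the area stays where it is, and the function names are hypothetical.

```python
# Unit moves, one foot at a time. North as +y is an assumption of this sketch.
MOVES = {"north": (0, 1), "south": (0, -1), "east": (1, 0), "west": (-1, 0)}


def step(position, direction, side):
    """Return the cell one foot away, or the same cell at the area boundary."""
    dx, dy = MOVES[direction]
    x, y = position[0] + dx, position[1] + dy
    if 0 <= x < side and 0 <= y < side:
        return (x, y)
    return position


def can_deactivate(mine_position, sweeper_positions, team_size=3):
    """True when at least team_size sweepers are atop the mine."""
    atop = sum(1 for p in sweeper_positions if p == mine_position)
    return atop >= team_size
```

Keeping `team_size` as a parameter makes it easy to test how sensitive the system is to the three-sweeper requirement.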

Our agent will be designed such that its local decisions are made according to the following principles, in order of importance: (1) preserve its life at all cost, (2) de-activate as many mines as possible, (3) preserve its battery power. That means, for example, that if the agent realizes it is running out of battery power, it will try to move towards a region where mines have been de-activated before it becomes immobile. Our agent will decide whether to (1) move, (2) stay, (3) form a team of three, or (4) de-activate a mine. At the same time, our agent will be able to communicate with other agents. If our agent decides to move, then it will need to decide whether to move north, south, east, or west. When deciding, it will consider the costs and rewards, which constitute its utility. For example,

Utility(action1) = Reward(action1) - Cost(action1)

Cost is based on the loss of battery power and a possible “kill”. Reward is based on the virtual monetary value of de-activating a mine and on possibly moving closer to a yet-to-be-de-activated mine. The exact equations of these costs and rewards will be detailed in our final project report.
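To make the decision rule concrete, here is a small sketch in Python. It assumes that utility is reward minus cost, so that a higher utility is better, and it uses a hypothetical `KILL_PENALTY` constant so that survival dominates the other terms. The field names and numbers are illustrative only; the actual cost and reward equations are left to the final report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    battery_cost: float        # feet of battery the action would use
    kill_risk: float           # estimated probability of being killed, 0..1
    deactivation_value: float  # virtual monetary value gained, if any
    approach_value: float      # value of moving closer to an active mine


# Hypothetical weight: large enough that preserving life outranks
# de-activating mines, which in turn outranks saving battery.
KILL_PENALTY = 1_000_000.0


def utility(e):
    """Reward minus cost for one candidate action (higher is better)."""
    cost = e.battery_cost + e.kill_risk * KILL_PENALTY
    reward = e.deactivation_value + e.approach_value
    return reward - cost


def best_action(estimates):
    """Pick the action name whose estimate has the highest utility."""
    return max(estimates, key=lambda name: utility(estimates[name]))
```

For instance, an agent comparing staying put, stepping towards a mine, and a risky de-activation would pass a dictionary of `Estimate` objects keyed by action name to `best_action`.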

To help our agent make the correct decisions, it will remember the area that it has visited, the mines it has encountered, and whether each mine has been de-activated. Thus, it is possible for our agent to share this information with other agents to gradually build a complete view of the entire area.
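One possible way to represent and share this memory is sketched below. The `LocalMap` class and its fields are assumptions for illustration: visited cells are kept in a set, and each known mine maps to a flag that is true once the mine has been de-activated. Because de-activation cannot be undone, a merge treats a mine as de-activated if either agent has seen it de-activated.

```python
from dataclasses import dataclass, field


@dataclass
class LocalMap:
    visited: set = field(default_factory=set)   # cells (x, y) seen so far
    mines: dict = field(default_factory=dict)   # (x, y) -> True if de-activated

    def merge(self, other):
        """Fold another agent's knowledge into this map."""
        self.visited |= other.visited
        for cell, deactivated in other.mines.items():
            self.mines[cell] = self.mines.get(cell, False) or deactivated

    def coverage(self, side):
        """Fraction of a side x side area that this map has seen."""
        return len(self.visited) / (side * side)
```

The `coverage` value gives a simple measure of how complete an agent's view has become after a number of exchanges.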

Our agents can also communicate with other agents to form a team of three if Utility(form a team of three) is the best at the time.

Of course, the more often the agents communicate and the more information they exchange each time, the less local their decision-making process becomes.

Desired Emergent Behavior

The desired emergent behavior is that teams of three agents are able to exchange information with each other to cover the entire area, de-activating all mines before they go off, preserving the lives of all agents, and maintaining the highest amount of collective battery power.

Hypotheses

Hypothesis 1: The effectiveness of the team of agents (robot mine sweepers) is proportional to . However, if N is below T such that , then the effectiveness becomes non-existent with the percentage of mines successfully de-activated before going off being below 5% of the total.

Hypothesis 2: The efficiency of the team of agents (robot mine sweepers) is inversely proportional to R, but proportional to B. That is, if the blast radius of an exploding mine is smaller, then it is easier for an agent to be efficient. If an agent has more battery power, then it is also easier for an agent to be efficient.

Hypothesis 3: If >> and >>, then the system will not achieve coherence (no emergent behavior). On the other hand, if >> and >>, then the system will achieve coherence quickly. If ( >> and >>) or ( >> and >>), then there is a phase-transition point such that the performance of the system switches from emergent coherence to non-convergence.

Experiments

Our experiments are designed to test the above three hypotheses. We will perform numerous simulation runs, varying the four key parameters in the following manner:

Parameter	Range of Values
L	100, 250, 500, 750, 1000
B	100, 250, 500, 750, 1000
N	500, 1000
R	5, 10, 50, 100

Thus, there are 5 × 5 × 2 × 4 = 200 different configurations. For each configuration, we will run five simulations and then average the results. For each run, we will collect the number of mines de-activated, the number of mines that blew up, the number of mine sweepers killed, the number of mine sweepers that ran out of battery power, and the total amount of battery power left (in feet). The first two results constitute the effectiveness of the system, while the latter three constitute its efficiency. We will then plot the results to respond to each hypothesis individually. For our discussion of results, we will follow the POJI[1] protocol by Professor Leen-Kiat Soh.
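The experiment grid can be enumerated mechanically from the table of parameter values. The sketch below is one way to do so in Python; the function names and the metric names in the usage note are hypothetical, and the simulator itself is not shown.

```python
from itertools import product
from statistics import mean

# Parameter values taken from the table above.
PARAMETERS = {
    "L": [100, 250, 500, 750, 1000],
    "B": [100, 250, 500, 750, 1000],
    "N": [500, 1000],
    "R": [5, 10, 50, 100],
}
RUNS_PER_CONFIG = 5


def configurations(parameters):
    """Return one dict per combination of parameter values."""
    names = list(parameters)
    return [dict(zip(names, values)) for values in product(*parameters.values())]


def average_results(runs):
    """Average each metric over a list of per-run result dicts."""
    return {metric: mean(run[metric] for run in runs) for metric in runs[0]}


configs = configurations(PARAMETERS)
print(len(configs))  # 5 * 5 * 2 * 4 = 200
```

With five runs per configuration, this grid amounts to 1000 simulation runs in total. Each configuration's runs would be collected as dicts such as `{"mines_deactivated": 4, "sweepers_killed": 0}` and passed to `average_results`.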

[1] What is POJI? This will be discussed in class.