
Non-estimating agents

As one might expect, the 0-level is as low as we can go. Still, there are varying degrees of sophistication within levels, including the zeroth, that are worth distinguishing in a taxonomy of learning approaches. In particular, an agent might learn a policy without any explicit consideration of $a_t^{-i}$, in which case its policy, $f^i(s_t^i)$, is a function of its local state alone.

An example of such an agent, introduced in our trading scenario below, is the competitive agent. To be competitive in a market context means to assume that one's own effect on the environment is negligible, in which case there is no advantage to speculating about the actions of others. This type of agent tends to behave ``reactively'', although this characterization is not as precisely defined as is ``competitive'' in the market context.

Our 1-level agent models the other agents as non-estimating 0-level types. To estimate the policy of agent j, it applies a locally weighted linear regression to the available history data, \( \{(s^j_{\tau},a^j_{\tau}) \vert \tau < t\} \). Given the current state $s^j_t$, it takes the k nearest neighbors of $s^j_t$ (by Euclidean distance), $\{s^j_{\tau_1}, \ldots, s^j_{\tau_k}\}$, and runs a linear regression on the data points $\{(s^{j}_{\tau_1},a^{j}_{\tau_1}), \ldots,(s^j_{\tau_k},a^{j}_{\tau_k})\}$. This yields estimates of the parameters $\alpha$ and $\beta$ in the model

\begin{displaymath}
a^j=\alpha + \beta s^j. \end{displaymath}

The estimate of agent j's action is then $\hat{a}_t^j=\alpha + \beta s^j_t$.
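
The estimation step can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in our experiments: the function name `predict_action`, its arguments, and the use of NumPy are assumptions made for the example, and the k neighbors are given equal weight in an ordinary least-squares fit.

```python
import numpy as np

def predict_action(states, actions, s_now, k):
    """Estimate agent j's action at state s_now from its observed history.

    states  : past states s^j_tau, shape (T,) or (T, d)
    actions : past actions a^j_tau, shape (T,)
    s_now   : current state s^j_t, scalar or shape (d,)
    k       : number of nearest neighbors used in the local fit
    (All names are hypothetical, chosen for this illustration.)
    """
    actions = np.asarray(actions, dtype=float)
    states = np.asarray(states, dtype=float).reshape(len(actions), -1)
    s_now = np.atleast_1d(np.asarray(s_now, dtype=float))

    # Pick the k past states closest to s_now in Euclidean distance.
    dist = np.linalg.norm(states - s_now, axis=1)
    nearest = np.argsort(dist)[:k]

    # Least-squares fit of a = alpha + beta . s on the neighbors only.
    X = np.column_stack([np.ones(len(nearest)), states[nearest]])
    coef, *_ = np.linalg.lstsq(X, actions[nearest], rcond=None)
    alpha, beta = coef[0], coef[1:]

    # Predicted action: alpha + beta . s_now
    return float(alpha + beta @ s_now)
```

For a history generated exactly by $a^j = 2 + 3 s^j$, calling `predict_action` at state 2.5 with k = 3 recovers a value of about 9.5, since any three points on that line determine $\alpha$ and $\beta$.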

We can define a 2-level agent in a similar manner. A 2-level agent i tries to model $f^j(s_t^j, \{ f^{k}(s_{t}^{k},\hat{a}_t^{-k})\vert k\neq j\})$, where \( f^{k}(s_{t}^{k},\hat{a}_t^{-k}) \) is i's model of agent j's model of agent k's policy. Since all agents have the same observations (except about their own policies), i's model of j's model of k (for \( k\neq i \)) is exactly what i's model of k would be if i were acting as a 1-level agent.
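
One possible reading of ``in a similar manner'', offered here only as an assumption, keeps the linear form of the 1-level model and adds the modeled actions of the other agents as regressors:

\begin{displaymath}
\hat{a}^j_t = \alpha + \beta s^j_t + \sum_{k\neq j} \gamma_k \hat{a}^k_t, \end{displaymath}

where each $\hat{a}^k_t$ stands for the output of \( f^{k}(s_{t}^{k},\hat{a}_t^{-k}) \), and the coefficients $\gamma_k$ are hypothetical symbols introduced for this illustration. By the observation above, for \( k\neq i \) each $\hat{a}^k_t$ can be taken directly from the 1-level regression for agent k.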

In our previous study [12], we investigated two types of agents: a competitive agent and a 1-level agent that modeled the others in the aggregate as non-estimating. Our learning model was also online, implemented in a market system called WALRAS [28]. Our experiments showed that when states are not observable, such incomplete information can lead the learning agent to a self-fulfilling suboptimal equilibrium. In this paper, we design and test four types of agents in a different, auction-based market system.


Junling Hu
4/27/1999