| |
(7) |
A 1-level agent tries to estimate a fixed policy functions of other agents. It assumes that other agents are 0-level competitive agents, who choose actions based on their individual optimization problem. That means other agent's actions are only function of their own states. Assuming the policy functions are linear, the estimation of Ptj is
![]()
A 2-level agent, like a 1-level agent, models the policy functions of
other agents. But a 2-level agent assumes others are 1-level learning
agents. We adopt a simplified 2-level model:
Ptj=f(etj, et-j). We found that a linear regression method
does not work well. One reason is the correlation between the
independent variables etj, e-jt. In addition, the
high dimensionality of input data requires large amount of data in
order to get unbiased estimation. Since we assume agents take only a
small sample of history data (fixed window length), we need to adopt
a nonparametric regression method. We use the K-Nearest Neighbor
method in this paper. For current joint state
, take
its K nearest neighbors
as the inputs and the corresponding actions
of agent j,
as the outputs. The estimation of Ptj is:

where dl is the distance between the data point