We model the order flow, the sequence of buys and sells for a particular stock, as an Input-Output Hidden Markov Model fit to historical data. Combined with the dynamics of the order book, this yields a highly non-linear dynamic system that is difficult to control. We run our reinforcement learning algorithm, based on likelihood ratios, on this partially observable environment. We demonstrate learning results for two separate real stocks.
Adlar J. Kim, Christian R. Shelton, and Tomaso Poggio (2002). "Modeling Stock Order Flows and Learning Market-Making from Data." Technical report. MIT AI Lab, AI Memo 2002-009.
@techreport{KimShePog02,
   author = "Adlar J. Kim and Christian R. Shelton and Tomaso Poggio",
   title = "Modeling Stock Order Flows and Learning Market-Making from Data",
   institution = "{MIT} {AI} Lab",
   type = "AI Memo",
   year = 2002,
   number = "2002-009",
   month = jun,
}