Yingfei Wang | University of Washington

Online pricing and dynamic bidding for freight brokerage services.

A significant share of truck transport is handled by freight brokers. We consider a broker who would like to automate the bidding process, which can adapt dynamically to past observations, when matching business partners to a stream of loads, observing either success or failure from both parties in each matchmaking attempt. To cope with uncertainty and reason about the whole truckload network, we assume an unknown underlying parametric discrete choice model, with the market response to any given price confounded by statistical noise. We design a contextual optimal learning policy with bootstrap aggregating to guide price experimentation by maximizing the expected value of information. An empirical investigation of the dynamic bidding policy demonstrates the value of an optimal learning policy for increasing total revenue. Finally, we interact with a driver management simulator that mimics real-world operations and use online evaluations to illustrate the effectiveness of our proposed method. We study the experimental results closely to provide guidance towards more stable and sustainable brokerage business models.

Bayesian optimization with feature hierarchies.

Our problem is motivated by healthcare applications where high sparsity and the relatively small number of patients make learning more difficult. For example, a patient can have a number of attributes, ranging from age and weight to diagnoses and medical history. We are developing probabilistic models with feature hierarchies to represent the context and the decisions. By adapting an online boosting framework, we are further developing optimal learning policies to make sequential decisions in high-dimensional settings while balancing the contribution of each level of feature aggregation.

An optimal learning method for developing personalized treatment regimes.

Motivated by personalized health care, we consider the problem of sequentially making decisions that are rewarded by “successes” and “failures”, which can be predicted through an unknown relationship that depends on a partially controllable vector of attributes for each instance. The learner takes an active role in selecting samples from the instance pool. The goal is to maximize the probability of success.

By adapting an online Bayesian linear classifier, we develop a knowledge-gradient type policy that guides the experiment by maximizing the expected value of information of labeling each alternative, in order to reduce the number of expensive physical experiments. Using a real-world knee replacement dataset, we provide a detailed study of how to make sequential medical decisions under uncertainty to reduce health care costs.

Interactive query auto-completion (QAC)

We formulated query auto-completion as an online decision-making problem and designed novel algorithms for it. Extensive Hadoop-based experiments on large-scale datasets showed their superiority in ranking quality.

Combinatorial semi-bandits for whole-page recommendation.

We work on a stochastic combinatorial online optimization problem that is motivated by recommending multiple ads with layout information. Thompson sampling and network flow are applied to provide exact solutions while balancing exploration and exploitation.
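To make the exploration and exploitation trade-off concrete, here is a minimal sketch of Thompson sampling in a semi-bandit setting. It is a deliberately simplified stand-in and not the method described above: it assumes independent Bernoulli clicks with Beta(1, 1) priors, ignores layout information, and picks the top k sampled ads by sorting rather than by solving a network flow problem. The names thompson_top_k, run_semi_bandit and true_ctr are hypothetical.

```python
import random


def thompson_top_k(alpha, beta, k, rng):
    """Draw one click-rate sample per ad from its Beta posterior and
    return the indices of the k ads with the largest samples."""
    samples = [rng.betavariate(a, b) for a, b in zip(alpha, beta)]
    ranked = sorted(range(len(samples)), key=lambda i: samples[i], reverse=True)
    return ranked[:k]


def run_semi_bandit(true_ctr, k, rounds, seed=0):
    """Simulate `rounds` page views. Each round shows k ads and observes
    click/no-click feedback for every shown ad (semi-bandit feedback)."""
    rng = random.Random(seed)
    n = len(true_ctr)
    alpha = [1.0] * n  # 1 + observed clicks (assumed Beta(1, 1) prior)
    beta = [1.0] * n   # 1 + observed non-clicks
    for _ in range(rounds):
        for i in thompson_top_k(alpha, beta, k, rng):
            if rng.random() < true_ctr[i]:  # simulated click
                alpha[i] += 1.0
            else:
                beta[i] += 1.0
    return alpha, beta


if __name__ == "__main__":
    # Hypothetical click-through rates, used only to drive the simulation.
    alpha, beta = run_semi_bandit([0.30, 0.10, 0.05, 0.20], k=2, rounds=500)
    for i, (a, b) in enumerate(zip(alpha, beta)):
        print(f"ad {i}: shown {int(a + b - 2)} times, posterior mean {a / (a + b):.3f}")
```

Because every shown ad returns its own feedback, each posterior is updated separately. With layout constraints, the sorting step would be replaced by a combinatorial solver applied to the sampled values.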

Relation extraction and natural language processing.

In U.S. Food and Drug Administration (FDA) approval news, we are interested in whether the drug mentioned in the news article is approved, the generic name of the approved drug and which company manufactures the approved drug. We use a named-entity recognizer over the data to identify all the named entities in the text documents and then use multi-class classification to extract relations between each ordered pair of named entities. For many language-processing tasks including relation extraction, there is an abundance of unlabeled data, but labeled data is lacking and too expensive to create in large quantities. In order to make the relation extraction task more scalable and automatic, we further design a semi-supervised learning method by bootstrapping more labeled relation patterns from a small set of labeled data.

Parallel knowledge gradient method for nested-batch Bayesian optimization.

Most previous work in optimal learning assumes a fully sequential setting where only one decision is made at each time step. However, the sequential design fails to account for the ability to run several parallel experiments in batches. Driven by needs in the materials science community, we developed a Nested-Batch-KG policy for sequential experiments when experiments can be conducted in parallel and/or there are multiple tunable parameters that are decided at different stages in the process. We demonstrate the effectiveness of our approach on the material design problem of maximizing the output current of a photoactive device.

Finite-time analysis for the knowledge-gradient policy.

Although many value-of-information policies (e.g., EI, KG) exist with nice asymptotic guarantees and empirical performance, there is no finite-time bound for such policies, mainly because of their adaptive nature: the current decision depends on the stochastic outcomes of past decisions. We fill this gap by offering a new perspective that interprets ranking and selection problems as adaptive stochastic multi-set maximization problems, and we derive the first finite-time bound for the knowledge-gradient policy. In addition, we introduce the concept of prior-optimality and provide another insight into the performance of the knowledge-gradient policy based on a submodularity assumption on the value of information.
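For readers unfamiliar with the policy being analyzed, the following is a minimal sketch of the standard knowledge-gradient computation for ranking and selection. It assumes independent normal beliefs and a known, common measurement-noise variance, which is the textbook setting and not necessarily the one used in the analysis above. The function and argument names are hypothetical.

```python
import math


def std_normal_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def knowledge_gradient(means, variances, noise_variance):
    """Expected one-step improvement in the best posterior mean from
    measuring each alternative once (independent normal beliefs)."""
    values = []
    for x, (mu, var) in enumerate(zip(means, variances)):
        if var <= 0.0:
            values.append(0.0)  # nothing left to learn about x
            continue
        # Standard deviation of the change in the posterior mean of x
        # caused by one noisy measurement.
        sigma_tilde = var / math.sqrt(var + noise_variance)
        best_other = max(m for i, m in enumerate(means) if i != x)
        zeta = -abs(mu - best_other) / sigma_tilde
        values.append(sigma_tilde * (zeta * std_normal_cdf(zeta) + std_normal_pdf(zeta)))
    return values


if __name__ == "__main__":
    # Hypothetical beliefs about three alternatives.
    kg = knowledge_gradient([1.0, 1.2, 0.8], [0.5, 0.1, 2.0], noise_variance=1.0)
    print("measure alternative", max(range(len(kg)), key=kg.__getitem__))
```

The policy measures the alternative with the largest value. Because that choice depends on beliefs updated from earlier random outcomes, the decision sequence is adaptive.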

Conference Presentations and Chaired Sessions

Session Chair, INFORMS Annual Meeting, Houston, 2017

INFORMS Annual Meeting, Nashville, 2016

International Conference on Machine Learning (ICML), New York City, 2016

INFORMS Optimization Society Conference, Princeton University, 2016

INFORMS Annual Meeting, Philadelphia, 2015

Session Chair, Modeling and Optimization: Theory and Applications (MOPTA), Lehigh University, 2015

INFORMS Annual Meeting, San Francisco, 2014