Ongoing and recent research projects

Feature extraction and selection using topological data analysis

Topological data analysis (TDA) has emerged as a powerful and versatile class of methods for recovering the unknown topology, or more generally the shape, of the high-dimensional space from which samples (observations) are drawn. While originally stemming from computational topology, TDA now encompasses machine learning and statistics in addition to algebraic topology and computational geometry. In particular, it is useful for feature extraction in a variety of big data domains, from large point clouds and meshes to textured images and gene sequences. The extracted features are then successfully used for several types of learning tasks, including multi-way classification, clustering, and prediction.

Chemical process yield topology

Inspired by this success, we applied TDA, for the first time to the best of our knowledge, to manufacturing process data to extract the key process variables (features) that have the most significant impact on overall product yield. The results are summarized in the form of a topological network, where each node represents a cluster of yield measurements and an edge is drawn between two adjacent nodes if they share common measurements. Sub-groups along the periphery of the network are then identified that have high or low yield values and are distinguishable from each other, as well as from the rest of the network, using a small number of key features. For a benchmark chemical processing data set, we obtained just 11 key variables (out of 57) that provided yield prediction error comparable to the complete set of features while reducing computation time by an order of magnitude.
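
As a rough illustration of the network construction step, the minimal sketch below builds a Mapper-style topological network in Python: yield is used as the filter function, measurements falling in overlapping yield intervals are clustered on the process variables, each cluster becomes a node, and an edge is drawn whenever two clusters share measurements. The function name, the DBSCAN clustering choice, and the interval parameters are illustrative assumptions rather than the exact pipeline used in the study.

    import numpy as np
    import networkx as nx
    from sklearn.cluster import DBSCAN

    def yield_mapper(X, y, n_intervals=10, overlap=0.3, eps=0.5):
        """Build a Mapper-style network from process variables X and yields y (a sketch)."""
        G = nx.Graph()
        members = {}                                  # node -> set of measurement indices
        lo, step = y.min(), (y.max() - y.min()) / n_intervals
        for i in range(n_intervals):
            a = lo + i * step
            b = a + step * (1 + overlap)              # overlapping intervals along the yield filter
            idx = np.where((y >= a) & (y <= b))[0]
            if idx.size == 0:
                continue
            labels = DBSCAN(eps=eps, min_samples=3).fit_predict(X[idx])
            for lab in set(labels) - {-1}:            # -1 marks DBSCAN noise points
                node = (i, int(lab))
                members[node] = set(idx[labels == lab].tolist())
                G.add_node(node, mean_yield=float(y[list(members[node])].mean()))
        nodes = sorted(members)
        for p, u in enumerate(nodes):                 # edge whenever two clusters share measurements
            for v in nodes[p + 1:]:
                if members[u] & members[v]:
                    G.add_edge(u, v)
        return G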

Sparse representation of topological persistence for multi-way classification

We also developed a new method, termed the Sparse-TDA algorithm, that combines favorable aspects of persistent homology-based TDA with sparse sampling, which provides an efficient way of reconstructing spatio-temporal signals from a limited number of well-chosen samples. This combination was realized by selecting an optimal set of sparse pixel samples from the topologically persistent features generated by a vector-based TDA algorithm. The results showed that the Sparse-TDA method achieved classification accuracy better than or comparable to other state-of-the-art methods, with reduced training times, on three benchmark computer vision problems.
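
The following sketch conveys the pixel-selection idea on already vectorized topological features: assuming P is an (n_samples, n_pixels) matrix of persistence-image features, a column-pivoted QR factorization is used here as a stand-in for the sparse-sampling step to rank informative pixel locations, and an SVM is trained on the selected pixels. The factorization, classifier, and parameter values are assumptions for illustration, not the exact Sparse-TDA implementation.

    import numpy as np
    from scipy.linalg import qr
    from sklearn.model_selection import train_test_split
    from sklearn.svm import SVC

    def select_sparse_pixels(P, n_pixels=50):
        """Rank pixel locations of the vectorized persistence images by a
        column-pivoted QR factorization and keep the most informative ones."""
        _, _, pivots = qr(P, mode='economic', pivoting=True)
        return pivots[:n_pixels]

    def train_sparse_tda(P, labels, n_pixels=50):
        cols = select_sparse_pixels(P, n_pixels)           # sparse pixel sample locations
        X_train, X_test, y_train, y_test = train_test_split(
            P[:, cols], labels, test_size=0.3, random_state=0)
        clf = SVC(kernel='rbf').fit(X_train, y_train)      # multi-way classifier on selected pixels
        return clf, cols, clf.score(X_test, y_test)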

Predictive analytics for real-time supply chain visibility

In today's inter-connected economy, original equipment manufacturers (OEMs) procure parts, many of which have lead times of more than a year, from hundreds of globally distributed suppliers. The OEM suppliers, which are often small and medium-scale enterprises (SMEs), obtain parts themselves from many other dispersed suppliers, leading to the creation of complex supply chain networks with several layers of hierarchical dependencies. The networks frequently include multiple bottlenecks in the form of specialized suppliers who act as sole sources of certain critical parts. These characteristics make it imperative to have a high degree of real-time visibility into the flow of parts through the networks to facilitate operation planning for OEMs and SMEs alike.

Supplier parts delivery time prediction

This project brings together a collaborative academia-industry team to develop, validate, and scale up powerful predictive analytics methods that enable such visibility. Using a combination of supervised learning, dimension reduction, and text mining methods, large volumes of dynamic transactional and supplier capabilities data are processed on a per-demand basis to predict part delivery times (and, hence, part availabilities) with a high level of accuracy and confidence at the network nodes. Data warehousing and model computation are performed on a high-performance computing cluster. The results are presented to the various stakeholders as customized interfaces of an open-source visibility tool, which is critically evaluated for usefulness, interactivity, and trustworthiness.
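
A minimal sketch of such a prediction pipeline is given below, combining text mining on supplier capability descriptions, dimension reduction, and a supervised regressor for delivery times. The column names, feature choices, and model are illustrative assumptions, not the deployed system.

    from sklearn.compose import ColumnTransformer
    from sklearn.decomposition import TruncatedSVD
    from sklearn.ensemble import GradientBoostingRegressor
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.pipeline import Pipeline
    from sklearn.preprocessing import StandardScaler

    # Text mining of supplier capability descriptions followed by dimension reduction.
    text_features = Pipeline([
        ('tfidf', TfidfVectorizer(max_features=5000)),
        ('svd', TruncatedSVD(n_components=50)),
    ])
    preprocess = ColumnTransformer([
        ('capabilities', text_features, 'capability_text'),             # hypothetical column names
        ('numeric', StandardScaler(), ['order_quantity', 'past_lead_time_days']),
    ])
    model = Pipeline([
        ('features', preprocess),
        ('regressor', GradientBoostingRegressor()),                     # supervised delivery-time model
    ])
    # Assuming a DataFrame df with one row per purchase order and an observed
    # 'delivery_days' column:
    # model.fit(df[['capability_text', 'order_quantity', 'past_lead_time_days']], df['delivery_days'])
    # predicted_days = model.predict(new_orders)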

Multi-agent consensus optimization for supply networks

Networked multi-agent consensus optimization

Multi-agent systems are characterized by decentralized decision-making and localized communication among the agents. Supply networks form the backbones of both the service and manufacturing industries, and need to operate as efficiently as possible to yield optimized returns. We bring the notion of multi-agent systems to clustered supply networks by treating each supplier as an agent. Consequently, we adapt consensus-based auction bidding methods to optimize the assignment of demands to the suppliers with known communication pathways and resource constraints. Results on moderately large networks show promising performance in terms of both assignment quality, as given by the overall demand delivery cost and the proportion of assigned demands, and computation time. The long-term goal is to develop efficient methods capable of real-time optimization on dynamic networks with scheduling constraints, various routing options, and uncertain demands.
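
The sketch below conveys the flavor of consensus-based auction assignment: each supplier agent bids on the demand it can serve most cheaply, and neighboring agents iteratively keep the best bid they have seen for each demand until the assignment stabilizes. The cost matrix, the communication graph, the single-demand-per-agent restriction, and the synchronous update schedule are simplifying assumptions rather than the full method.

    import numpy as np

    def consensus_auction(cost, adjacency, n_rounds=50):
        """Assign demands to supplier agents by local bidding plus max-consensus
        with neighbors; returns each demand's winning agent (or -1 if unassigned)."""
        n_agents, n_demands = cost.shape
        bid = 1.0 / (1.0 + cost)                          # cheaper delivery -> higher bid
        price = np.zeros((n_agents, n_demands))           # each agent's view of the winning bids
        winner = np.full((n_agents, n_demands), -1, dtype=int)
        for _ in range(n_rounds):
            # Bidding phase: an agent holding no demand bids on the best one it can win.
            for i in range(n_agents):
                if i not in winner[i]:
                    can_win = np.nonzero(bid[i] > price[i])[0]
                    if can_win.size:
                        j = can_win[np.argmax(bid[i, can_win])]
                        price[i, j], winner[i, j] = bid[i, j], i
            # Consensus phase: keep the highest bid each neighbor has seen per demand.
            new_price, new_winner = price.copy(), winner.copy()
            for i in range(n_agents):
                for k in np.nonzero(adjacency[i])[0]:
                    better = price[k] > new_price[i]
                    new_price[i, better] = price[k, better]
                    new_winner[i, better] = winner[k, better]
            price, winner = new_price, new_winner
        return winner[0]

    # Example: 4 supplier agents, 3 demands, ring communication topology.
    rng = np.random.default_rng(0)
    cost = rng.random((4, 3))
    adjacency = np.roll(np.eye(4, dtype=int), 1, axis=1) | np.roll(np.eye(4, dtype=int), -1, axis=1)
    assignment = consensus_auction(cost, adjacency)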

Effective human-robot collaboration through vision-based ergonomic risk prediction

One of the key considerations for viable human-robot collaboration in industrial settings is safety. This consideration is particularly important when a robot operates in close proximity to humans and assists them with certain tasks in increasingly automated factories and warehouses. However, ergonomic safety of the collaborating humans is an extremely important related topic that has not received much attention until recently. Unlike other commonly considered safety measures, a lack of ergonomic safety does not lead to immediate injury concerns or fatality risks. It does, however, increase the likelihood of long-term injuries in the form of musculoskeletal disorders. Therefore, we need to develop workspaces that minimize, as much as possible, the ergonomic risks due to repetitive and/or physically strenuous tasks involving awkward human postures. Collaborative robots, or co-bots for short, can substantially reduce these risks by taking over a majority of such tasks.

Action recognition

We are addressing this problem by accurately classifying human object manipulation actions into different ergonomic risk groups using RGB-D video data. The overall classification task can be decomposed into action description/representation and action detection/recognition. For a given semantically meaningful action such as "picking up a box", a person can bend her/his back, reach the box, and pick it up, or (s)he can kneel first, hold the box, and get up while holding it. Therefore, one action category may include various human movement styles. Considering the dearth of such data in the literature, we generate a new labeled dataset of indoor object manipulation tasks, wherein a three-tier hierarchy of action labels is assigned to the individual video frames. The labels are further augmented with three ergonomic risk categories (safe, requires attention, and high risk) based on the widely used rapid entire body assessment (REBA) methodology. A VGG16 architecture is first used to extract spatial features, which are then fed sequentially to a temporal convolutional network (TCN) that segments the videos into detected actions.
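
A simplified PyTorch sketch of this two-stage architecture is shown below: per-frame spatial features from a pretrained VGG16 are passed to a small temporal convolutional network that produces per-frame action scores. The layer sizes, dilation pattern, and number of classes are placeholders, not the trained model.

    import torch
    import torch.nn as nn
    from torchvision.models import vgg16

    class FrameTCN(nn.Module):
        """Per-frame VGG16 features followed by a temporal convolutional network (sketch)."""
        def __init__(self, n_classes, feat_dim=4096, hidden=128):
            super().__init__()
            backbone = vgg16(weights='IMAGENET1K_V1')             # ImageNet-pretrained spatial extractor
            self.spatial = nn.Sequential(
                backbone.features, backbone.avgpool, nn.Flatten(),
                *list(backbone.classifier.children())[:-1])       # 4096-d feature per frame
            self.temporal = nn.Sequential(                        # dilated 1-D convolutions over time
                nn.Conv1d(feat_dim, hidden, kernel_size=5, padding=2, dilation=1), nn.ReLU(),
                nn.Conv1d(hidden, hidden, kernel_size=5, padding=4, dilation=2), nn.ReLU(),
                nn.Conv1d(hidden, n_classes, kernel_size=1))      # per-frame action scores

        def forward(self, frames):                                # frames: (batch, time, 3, 224, 224)
            b, t = frames.shape[:2]
            feats = self.spatial(frames.reshape(b * t, *frames.shape[2:]))
            feats = feats.reshape(b, t, -1).permute(0, 2, 1)      # (batch, feat_dim, time)
            return self.temporal(feats)                           # (batch, n_classes, time)

    # Example: scores for a batch of 2 clips, 16 frames each, over 10 action classes.
    scores = FrameTCN(n_classes=10)(torch.randn(2, 16, 3, 224, 224))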

The future plan is to predict human intent, i.e., forecast future actions given a set of partially executed object manipulation tasks, in both fully and partially structured environments. This prediction will pave the way for the co-bot to partition the remaining tasks optimally and help the human complete the operation in as ergonomically safe a manner as possible.

Optical tweezers-assisted investigation of interactions among heterogeneous cell types

OT cell interactions

We are using holographic optical tweezers to construct multi-cellular arrangements with accurate positioning of all the cells so as to generate exactly the desired geometries. The goal is to precisely control the spatial arrangements of parenchymal and non-parenchymal cells to address fundamental questions on the inter-cellular signaling that plays a critical role in governing the functions of living tissues. We can, thereby, test hypotheses regarding the spatial extent of paracrine signaling and the relative contributions of multi-factorial signals among heterogeneous cell types.

Multiplexed optical tweezers automation

Optical tweezers offer certain advantages as a non-contact micro-robotic manipulation tool such as convenient multiplexing capability, flexibility in the choice of the manipulated object and manipulation medium, precise control, easy object release, and minimal object damage. However, automated manipulation of multiple objects in parallel, which is essential for any efficient and reliable operation, poses several challenges.

OT image processing

The first challenge is to extract real-time information about the positions and orientations (states) of the objects. We have developed a robust image processing method, which uses a novel combination of well-known processing techniques that have been successfully applied in other contexts. Results on time-lapse microscopy images show that the method works well on objects of both regular and irregular shapes, is able to distinguish between object types, and accurately detects object configurations even when the objects are located in close proximity to each other.
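
One plausible combination of standard steps is sketched below using OpenCV: smoothing, adaptive thresholding, morphological cleanup, contour extraction, and ellipse fitting to recover each object's position, orientation, and approximate size (which helps distinguish object types). The particular operations and parameter values are assumptions for illustration; the actual method combines a different mix of techniques.

    import cv2
    import numpy as np

    def detect_objects(gray_frame, min_area=50):
        """Return position, orientation, and area estimates for objects in a grayscale frame."""
        blur = cv2.GaussianBlur(gray_frame, (5, 5), 0)
        binary = cv2.adaptiveThreshold(blur, 255, cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
                                       cv2.THRESH_BINARY_INV, 21, 5)
        kernel = np.ones((3, 3), np.uint8)
        cleaned = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel, iterations=2)
        contours, _ = cv2.findContours(cleaned, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
        states = []
        for c in contours:
            if cv2.contourArea(c) < min_area or len(c) < 5:   # fitEllipse needs >= 5 points
                continue
            (cx, cy), _, angle = cv2.fitEllipse(c)            # centroid and orientation
            states.append({'x': cx, 'y': cy, 'angle_deg': angle,
                           'area': cv2.contourArea(c)})       # area helps separate object types
        return states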

OT motion control

The next challenge is to use the estimated states of the objects for controlling their motions in a stable and robust manner. To address this challenge, we have modeled the interactions of multiple optical traps on microspheres to develop simplified state-space representations of the object (particle) dynamics. These representations are then used to design a model predictive controller (MPC) that coordinates the motions of several particles in real time.
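
A minimal sketch of the controller structure is given below using cvxpy: a first-order linear response of the bead position to the commanded trap position stands in for the identified state-space model, and a quadratic tracking cost with a bound on how far the trap can move from the bead is optimized over a short horizon. The dynamics coefficients, constraint, and horizon are placeholders, not the actual trap-bead interaction model.

    import cvxpy as cp
    import numpy as np

    def mpc_step(x0, target, horizon=10, a=0.8, b=0.2, u_max=0.5):
        """One MPC update: x[k+1] = a*x[k] + b*u[k] models the bead position x
        responding to the commanded trap position u (a, b are placeholder values)."""
        n = x0.shape[0]                                        # number of particles (1-D positions here)
        x = cp.Variable((horizon + 1, n))
        u = cp.Variable((horizon, n))
        cost, constraints = 0, [x[0] == x0]
        for k in range(horizon):
            cost += cp.sum_squares(x[k + 1] - target) + 0.01 * cp.sum_squares(u[k])
            constraints += [x[k + 1] == a * x[k] + b * u[k],
                            cp.abs(u[k] - x[k]) <= u_max]      # the trap must stay near the bead
        cp.Problem(cp.Minimize(cost), constraints).solve()
        return u.value[0]                                      # apply only the first control move

    # Example: steer three particles from their current positions toward targets.
    trap_commands = mpc_step(np.array([0.0, 1.0, -0.5]), np.array([1.0, 1.0, 1.0]))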

Learning microrobot stochastic dynamics using hierarchical Bayesian regression

Considering that it is still challenging to use a standard MPC for coordinated control of a large number of particles, we have started developing a reinforcement learning (RL)-based MPC method. Such an RL-based controller typically requires a dynamics model that can be queried rapidly in real time. To this end, we have designed a hierarchical Bayesian linear regression model with local features to learn the true stochastic and non-linear system dynamics. The model is hierarchical since the model parameters have non-stationary priors for increased flexibility. The variational expectation maximization (EM) algorithm is used to solve the maximum likelihood estimation problem for the model parameters, and the procedure is enhanced by introducing hidden target variables in the model. The EM algorithm yields parsimonious model structures, and consistently provides fast and accurate predictions.
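
The simplified sketch below captures the spirit of the approach with a single-level Bayesian linear regression on local radial basis features, fitted by EM-style evidence maximization; the full model additionally uses non-stationary priors on the parameters and hidden target variables within a variational EM procedure. All parameter values and the feature construction here are illustrative assumptions.

    import numpy as np

    def rbf_features(X, centers, width=1.0):
        """Local radial basis features centered at the given points."""
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        return np.exp(-d2 / (2 * width ** 2))

    def fit_bayesian_lr(Phi, y, n_iter=50):
        """Bayesian linear regression with evidence-maximization (EM-style) updates."""
        n, d = Phi.shape
        alpha, beta = 1.0, 1.0                         # prior and noise precisions
        for _ in range(n_iter):
            # Posterior over weights given the current precisions.
            S = np.linalg.inv(alpha * np.eye(d) + beta * Phi.T @ Phi)
            m = beta * S @ Phi.T @ y
            # Re-estimate the precisions from the posterior (evidence maximization).
            gamma = d - alpha * np.trace(S)
            alpha = gamma / (m @ m)
            beta = (n - gamma) / np.sum((y - Phi @ m) ** 2)
        return m, S, beta

    def predict(Phi_new, m, S, beta):
        """Fast-to-query predictive mean and variance for new inputs."""
        mean = Phi_new @ m
        var = 1.0 / beta + np.einsum('ij,jk,ik->i', Phi_new, S, Phi_new)
        return mean, var

    # Example: learn x[k+1] as a function of (x[k], u[k]) from logged 1-D trajectories.
    rng = np.random.default_rng(0)
    Z = rng.uniform(-1, 1, size=(500, 2))              # columns: current state, control input
    x_next = 0.8 * Z[:, 0] + 0.2 * Z[:, 1] + 0.05 * rng.standard_normal(500)
    centers = rng.uniform(-1, 1, size=(30, 2))
    m, S, beta = fit_bayesian_lr(rbf_features(Z, centers), x_next)
    mean, var = predict(rbf_features(Z[:5], centers), m, S, beta)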

Completed research projects

Big data aggregation and analysis

Relational or hierarchical databases and tuple stores encounter challenges in managing heterogeneous and dynamic data sources originating at multiple spatio-temporal scales. However, such data sources occur ubiquitously in almost any real world cyber-physical domain of interest such as healthcare, manufacturing, defense, and building facilities. We applied a novel technique, called the associative memory (AM) model, to address some of these challenges by mimicking the encoding, processing, storage, and retrieval capabilities of human long-term memory.

The preliminary results have been promising. In particular, we have successfully discovered and validated significant correlations between patient characteristics and diagnostic and therapeutic interventions in invasive breast cancer. Through seamless integration of a tumor registry and an electronic medical record system, clinically expected as well as anomalous correlations have been found that can shape future treatment recommendations and high fidelity drug administration recording.

High-dimensional, multi-robot control

We used boosting trees, a type of regression model, to encode the solution vectors of high-dimensional control problems that are cast in integer linear programming (ILP) form and are similar with respect to the robots and workspace constraints. The model was generated offline by optimally solving the relaxed LP problems that occur at the branch-and-bound tree nodes of similar ILP problems. This enabled fast querying operations, resulting in more than an order of magnitude run-time speed-up over state-of-the-art solvers. We also provided performance guarantees on the estimated LP solution values.
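
A toy sketch of the offline/online split is shown below: a family of similar LP relaxations is solved exactly offline, a gradient-boosted tree regressor is fit on cheap problem features (here, the constraint right-hand sides) to predict the optimal LP values, and the model is queried at run time in place of re-solving. The problem generation, features, and library choices are assumptions for illustration.

    import numpy as np
    from scipy.optimize import linprog
    from sklearn.ensemble import GradientBoostingRegressor

    def solve_lp_value(c, A_ub, b_ub):
        """Optimal value of a small LP relaxation with box-bounded variables."""
        res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, 1), method='highs')
        return res.fun if res.success else np.nan

    # Offline: perturb the right-hand side to mimic a family of similar node LPs.
    rng = np.random.default_rng(0)
    c = -rng.random(20)                         # maximize a random objective (as a min problem)
    A_ub = rng.random((10, 20))
    features, values = [], []
    for _ in range(500):
        b_ub = 1.0 + rng.random(10)             # node-specific constraint budgets as features
        v = solve_lp_value(c, A_ub, b_ub)
        if not np.isnan(v):
            features.append(b_ub)
            values.append(v)

    model = GradientBoostingRegressor().fit(np.array(features), np.array(values))

    # Online: a fast surrogate for the LP value at a new branch-and-bound node.
    b_new = 1.0 + rng.random(10)
    estimated_value = model.predict(b_new.reshape(1, -1))[0]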

Ground robots demo

Our approach was general-purpose in the sense that it could be used for mission planning of aircraft, routing of ground vehicles, and path planning of autonomous vehicles. For example, we applied it to finite-horizon, chance-constrained, optimal control of robots and achieved a 10-35 times reduction in computation time without degrading the solution quality. A simpler version was integrated within a decision support system to aid human supervisors in scheduling aircraft carrier deck operations.

Micro-scale object transport using optical tweezers

Optical tweezers have emerged as one of the most promising non-contact manipulation techniques at the small scales; they can successfully trap and transport objects of different sizes and shapes in fluid media. In other words, they can be viewed as miniature robots made of focused laser light. Autonomous object transport operations require path planning, which is challenging due to the stochastic Brownian motion of the objects, noise in the imaging measurements, and the need for fast control update rates.

Particle transport using optical tweezers

We built a Langevin dynamics simulator and formulated the single particle transport problem as a partially observable Markov decision process (POMDP), which was then solved efficiently using systematic search space pruning and adaptation of the QMDP algorithm. We also developed a coordinated approach for transporting multiple particles simultaneously. Effective performance was demonstrated using 2 micron diameter silica beads in a holographic optical tweezers set-up. This work has many exciting applications in the bio-medical field, particularly in automating multicellular studies.
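
As a flavor of the simulator, the sketch below advances a trapped bead with an overdamped Langevin update, modeling the optical trap as a harmonic potential and drawing thermal noise according to the fluctuation-dissipation relation. The parameter values are textbook-level illustrations, not the calibrated values used in our simulator, and the planning (POMDP/QMDP) layer is omitted.

    import numpy as np

    kB = 1.38e-23          # Boltzmann constant (J/K)
    T = 300.0              # temperature (K)
    eta = 1.0e-3           # water viscosity (Pa*s)
    radius = 1.0e-6        # bead radius for a 2 micron diameter silica bead (m)
    gamma = 6 * np.pi * eta * radius        # Stokes drag coefficient
    k_trap = 1.0e-6        # trap stiffness (N/m), illustrative value
    dt = 1.0e-3            # control/update time step (s)

    def langevin_step(pos, trap_center, rng):
        """Advance the bead position (2-D, meters) by one overdamped Langevin step."""
        drift = -(k_trap / gamma) * (pos - trap_center) * dt
        diffusion = np.sqrt(2 * kB * T * dt / gamma) * rng.standard_normal(2)
        return pos + drift + diffusion

    # Example: move the trap steadily along +x and let the bead follow it.
    rng = np.random.default_rng(1)
    pos = np.zeros(2)
    for _ in range(1000):
        trap = pos + np.array([0.1e-6, 0.0])
        pos = langevin_step(pos, trap, rng)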