EZLearn + Automatic Claim Validation
We are developing an end-to-end system for validating scientific claims against open data repositories using NLP, machine learning, and data integration techniques.
Privacy-Preserving Synthetic Data
We are developing usable, general tools to generate shareable synthetic datasets with strong privacy guarantees from any input dataset.
Responsible Data Science
We are studying the technical foundations for responsible data science, including fair machine learning, semi-synthetic private data, data governance, automatic metadata attachment and curation, and...
Data Science for Social Good
Building on our data science incubator program and the University of Chicago's Data Science for Social Good program, we ran an interdisciplinary summer program for...
We use machine vision, machine learning, and bibliometric analysis to understand how visualization is used to convey ideas in the scientific literature.
Myria Middleware for Polystores
Part of the Myria project, RACO (the Relational Algebra COmpiler) is a polystore middleware system that provides query translation, optimization, and orchestration across complex multi-system...
Clustering Billion-Edge Graphs
Working at the intersection of network science, databases, and high-performance computing, we developed a series of novel parallel algorithms based on Infomap serial graph clustering...
Scalable Flow Cytometry
We have developed algorithms, methods, systems, and applications in support of the Seaflow project in the Armbrust Lab in the UW Department of Oceanography.
SQLShare: DB-as-a-Service
SQLShare aims to increase uptake of databases in data science and to shed light on how data scientists work with data.
VizDeck + Visualization Recommenders
VizDeck recommends visualizations based on the statistical properties of the data tempered by perception heuristics. Dashboards are assembled through a card-game UI.
Data Pricing
We developed theory and systems related to buying and selling data online.
Horizon: Big Data Analytics for Science
The Horizon project was one of the early efforts to study the limits of Hadoop for complex analytics. We developed Hadoop algorithms for visualization, machine...

This webpage was built with Bootstrap and Jekyll. Last updated: Aug 02, 2021