Ka Yee Yeung: Research and Projects
Publications
- 2025
- Molecular phenotypes of null alleles in human cells (MorPhiC) consortium: Towards functional characterization of all human genes. Accepted by Nature 2024.
- 2023
- Container Profiler: Profiling Resource Utilization of Containerized Big Data Pipelines.
Varik Hoang, Ling-Hong Hung, David Perez, Huazeng Deng, Raymond Schooley, Niharika Arumilli, Ka Yee Yeung, Wes Lloyd.
Gigascience, Volume 12, 2023, giad069.
GitHub.
Pre-print: arXiv:2005.11491v2 2023.
- Rapid detection of myeloid neoplasm fusions using Single Molecule Long-Read Sequencing.
Olga Sala-Torra, Shishir Reddy, Ling-Hong Hung, Lan Beppu, David Wu, Jerald Radich, Ka Yee Yeung, Cecilia CS Yeung.
PLOS Global Public Health 3(9): e0002267.
Pre-print medRxiv 10.1101/2022.06.16.22276469.
- A randomized controlled trial of precision nutrition counseling for service members at risk for metabolic syndrome.
McCarthy, M.S., Colburn, Z.T., Yeung K.Y., Gillette, L.H., Hong, L.H., Elshaw, E.
Military Medicine 2023, Volume 188, Warfighter Special Issue, pages 606-613.
- 2022
- Cloud-enabled Biodepot workflow builder integrates image processing using Fiji with reproducible data analysis using Jupyter notebooks.
Ling-Hong Hung, Evan Straw, Shishir Reddy, Robert Schmitz, Zachary Colburn, and Ka Yee Yeung.
Scientific Reports 12: 14920 (2022).
GitHub.
Earlier version bioRxiv 10.1101/2021.10.22.465513.
- Accessible, interactive and cloud-enabled genomic workflows integrated with the NCI Genomic Data Commons.
Ling-Hong Hung, Bryce Fukuda, Robert Schmitz, Varik Hoang, Wes Lloyd, Ka Yee Yeung.
Pre-print bioRxiv 10.1101/2022.08.11.503660.
- Accelerated and Reproducible Fiji for image processing using GPUs on the cloud. Ling-Hong Hung, Evan Straw, Zachary Colburn, Ka Yee Yeung.
Pre-print bioRxiv 10.1101/2022.07.15.500283.
- Ultrarapid Targeted Nanopore Sequencing for Fusion Detection of Leukemias.
Cecilia CS Yeung, Olga Sala-Torra, Shishir Reddy, Ling-Hong Hung, Jerry Radich, Ka Yee Yeung.
Pre-print medRxiv 10.1101/2022.06.20.22276664.
- 2021
- Application of Natural Language Processing and Machine Learning to Radiology Reports.
Seoungdeok Jeon, Zachary Colburn, Joshua Sakai, Ling-Hong Hung, and Ka Yee Yeung.
Poster presentation at the 12th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics (ACM BCB 2021), Article No 67, pp 1, August 1 to 4, 2021, Gainesville, FL, USA. ACM, New York, NY, USA.
https://doi.org/10.1145/3459930.3469496
- A graphical, interactive and GPU-enabled workflow to process long-read sequencing data.
Shishir Reddy, Ling-Hong Hung, Olga Sala-Torra, Jerald Radich, Cecilia CS Yeung, Ka Yee Yeung.
BMC Genomics 22, Article number: 626 (2021).
Pre-print bioRxiv 10.1101/2021.05.11.443665.
- 2020
- An Investigation on Public Cloud Performance Variation for an RNA Sequencing Workflow.
David Perez, Ling-Hong Hung, Sonia Xu, Ka Yee Yeung, Wes Lloyd.
Proceedings of the 11th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, Article number 96, Pages 1-7. https://doi.org/10.1145/3388440.3414859.
Workshop paper presented at the 9th International Workshop on Parallel and Cloud-based Bioinformatics and Biomedicine (ParBio) 2020.
- Characterizing Performance Variation of Genomic Data Analysis Workflows on the Public Cloud.
David Perez, Ling-Hong Hung, Sonia Xu, Ka Yee Yeung, Wes Lloyd.
Poster abstract at the 2020 IEEE Intl Conf on Dependable, Autonomic and Secure Computing, Intl Conf on Pervasive Intelligence and Computing, Intl Conf on Cloud and Big Data Computing, Intl Conf on Cyber Science and Technology Congress (DASC/PiCom/CBDCom/CyberSciTech) in August 2020. DOI: 10.1109/DASC-PICom-CBDCom-CyberSciTech49142.2020.00116.
- Viruses, Visualization, and Validation: Interactive Mining of COVID-19 Literature.
Varun Mittal, Naveen Garg, Yos Wagenmans, Mayuree Binjolkar, Rashad Hatchett, Varik Hoang, Emma Biggs Lanier, Ling-Hong Hung, Ka Yee Yeung.
Abstract accepted for an oral presentation at the 28th Conference on Intelligent Systems for Molecular Biology (ISMB) in July 2020.
- Profiling Resource Utilization of Bioinformatics Workflows.
Huazeng Deng, Ling-Hong Hung, Raymond Schooley, David Perez, Niharika Arumilli, Ka Yee Yeung, Wes Lloyd.
(2023 version available)
arXiv 2020.
- Accessible and interactive RNA sequencing analysis using serverless computing.
Ling-Hong Hung, Xingzhi Niu, Wes Lloyd, Ka Yee Yeung.
Pre-print: bioRxiv 576199v2.
Early version: bioRxiv 576199.
- 2019
- Using BioDepot-workflow-builder to access public databases in a containerized environment. Christin Scott, Ling-Hong Hung, Wes Lloyd, Ka Yee Yeung.
Poster abstract at the IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2019, page 1243, San Diego, November 2019.
- Multi-Omic Precision Medicine Clinical Trial in Acute Leukemia.
Pamela S. Becker, Vivian G. Oehler, Carl Anthony Blau, Timothy S Martins, Niall Curley, Sylvia Chien, Jin Dai, PhD, Nicole Kauer, Ka Yee Yeung, Ling-Hong Hung, Cody Hammer, Paul C. Hendrie, Mary-Elizabeth M. Percival, Ryan D. Cassaday, Bart L. Scott, Roland B. Walter, Kelda Gardner, Mary Gwin, Heather Smith, Andrew Carson, Bradley Patay, and Elihu H. Estey.
Poster Abstract accepted by the American Society of Hematology 2019.
Blood 2019, volume 134 (issue supplement_1): 1269.
- Building containerized workflows using the BioDepot-workflow-Builder (BwB).
Ling-Hong Hung, Jiaming Hu, Trevor Meiss, Alyssa Ingersoll, Wes Lloyd, Daniel Kristiyanto, Yuguang Xiong, Eric Sobie, Ka Yee Yeung.
Cell Systems 2019, volume 9, issue 5, pages 508-514.E3.
Preprint: bioRxiv 099010.
GitHub.
- Leveraging Serverless Computing to Improve Performance for Sequence Comparison.
Xingzhi Niu, Dimitar Kumanov, Ling-Hong Hung, Wes Lloyd, Ka Yee Yeung.
Proceedings of the 10th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, Pages 683-687.
Presented at the 8th International Workshop on Parallel and Cloud-based Bioinformatics and Biomedicine (ParBio) 2019.
- Holistic optimization of RNA-seq workflow for multi-threaded environments.
Ling-Hong Hung, Wes Lloyd, Radhika Agumbe Sridhar, Saranya Devi Athmalingam Ravishankar, Yuguang Xiong, Eric Sobie, Ka Yee Yeung.
Bioinformatics 2019, volume 35, issue 20, pages 4173-4175.
Pre-print: bioRxiv 345819.
- Integration of multiple data sources for gene network inference using genetic perturbation data.
Xiao Liang, William Chad Young, Ling-Hong Hung, Adrian E Raftery, Ka Yee Yeung.
Extended abstract on page 601 of the Proceedings of the 9th ACM Conference on Bioinformatics, Computational Biology and Health Informatics, Aug 29-Sept 1, 2018, Washington DC.
Full paper: Journal of Computational Biology 2019, volume 26, number 10.
Preprint: bioRxiv 158394.
GitHub.
- 2018
- Serverless computing provides on-demand high performance computing for biomedical research.
Dimitar Kumanov, Ling-Hong Hung, Wes Lloyd, Ka Yee Yeung.
Preliminary version: arXiv:1807.11659.
- Hot-starting software containers for bioinformatics analyses.
Pai Zhang, Ling-Hong Hung, Wes Lloyd, Ka Yee Yeung.
Gigascience 2018, 7(8), giy092.
Early version bioRxiv 204495.
- Embedding containerized workflows inside data science notebooks enhances reproducibility.
Jiaming Hu, Ling-Hong Hung, Ka Yee Yeung.
bioRxiv 309567. Resources:
nbdocker YouTube video
GitHub.
Featured research at the eScience Institute nbdocker = Jupyter + Docker: simplifying reproducible research.
- A crowdsourced analysis to identify ab initio molecular signatures predictive of susceptibility to viral infection.
Slim Fourati, Aarthi Talla, Mehrad Mahmoudian, Joshua G Burkhart, Riku Klen, Ricardo Henao, Zafer Aydin, Ka Yee Yeung, Mehmet Eren Ahsen, Reem Almugbel, Samad Jahandideh, Xiao Liang, Torbjorn E.M. Nordling, Motoki Shiga, Ana Stanescu, Robert Vogel, The Respiratory Viral DREAM Challenge Consortium, Gaurav Pandey, Christopher Chiu, Micah T McClain, Chris W Woods, Geoffrey S Ginsburg, Laura L Elo, Ephraim L Tsalik, Lara M Mangravite, Solveig K Sieberts.
Nature Communications 2018, 9:4418.
Early version: bioRxiv 311696.
- Temporal Genetic Association and Temporal Genetic Causality Methods for Dissecting Complex Networks.
Luan Lin, Quan Chen, Jeanne Hirsch, Seungyeul Yoo, Ka Yee Yeung, Roger Bumgarner, Zhidong Tu, Eric Schadt, and Jun Zhu.
Nature Communications 2018, 9:3980.
- Identifying Dynamical Time Series Model Parameters from Equilibrium Samples, with Application to Gene Regulatory Networks.
William Chad Young, Ka Yee Yeung, Adrian E. Raftery.
Statistical Modelling 2018.
- Reproducible Bioconductor Workflows Using Browser-Based Interactive Notebooks And Containers.
Reem Almugbel, Ling-Hong Hung, Jiaming Hu, Abeer M. Almutairy, Nicole E. Ortogero, Yashaswi Tamta, Ka Yee Yeung.
Journal of the American Medical Informatics Association (JAMIA) 2018, 25(1): 4-12 (Editor's Choice).
Early version: bioRxiv 144816. Source code available at
Bioconductor notebooks
GitHub.
Featured in the RNA-seq Blog dated Nov 2, 2017.
- 2017
- Model-based clustering with data correction for removing artifacts in gene expression data.
William Chad Young, Ka Yee Yeung, Adrian E. Raftery.
Annals of Applied Statistics 2017, 11(4):1998-2026.
Early version: arXiv:1602.06316
Full text: PMC6364860.
- GUIdock-VNC: Using a graphical desktop sharing system to provide a browser-based interface for containerized software.
Varun Mittal, Ling-Hong Hung, Jayant Keswani, Daniel Kristiyanto, Sung Bong Lee and Ka Yee Yeung.
Gigascience 2017, 6(4): 1-6.
GUIdock-VNC GitHub page
- fastBMA: Scalable Network Inference and Transitive Reduction.
Ling-Hong Hung, Kaiyuan Shi, Migao Wu, William Chad Young, Adrian Raftery, Ka Yee Yeung.
Gigascience 2017, 6(10): 1-10.
Early version bioRxiv 099036.
- Software solutions for reproducible RNA-seq workflows.
Trevor Meiss, Ling-Hong Hung, Yuguang Xiong, Evren U. Azeloglu, Marc R. Birtwistle, Eric A. Sobie, Ka Yee Yeung.
bioRxiv 099028.
NIH BD2K LINCS Webinar 11/22/16 by Dr. Ling-Hong Hung: Docker pipelines for RNA-seq alignment and analyses.
- 2016
- Predicting discontinuation of docetaxel treatment for metastatic castration-resistant prostate cancer (mCRPC) with random forest.
Daniel Kristiyanto, Kevin E. Anderson, Ling-Hong Hung, Ka Yee Yeung.
F1000Research 2016, 5:2673.
- GUIdock: Using Docker containers with a common graphics user interface to address the reproducibility of research.
Ling-Hong Hung, Daniel Kristiyanto, Sung Bong Lee, Ka Yee Yeung.
PLOS One 2016, 11(4):e0152686.
GUIdock-X11 GitHub page
- A Posterior Probability Approach for Gene Regulatory Network Inference in Genetic Perturbation Data.
William Chad Young, Adrian E. Raftery, Ka Yee Yeung.
Mathematical Biosciences and Engineering (MBE) 2016, 13(6): 1241-1251.
Earlier version: arXiv:1603.04835.
- A Crowdsourcing Approach to Developing and Assessing Prediction Algorithms for AML Prognosis.
Noren et al.
PLoS Computational Biology 2016, 12(6): e1004890.
I served as a member of the DREAM 9 AML-OPC Consortium (as a collaborator).
- 2015
- Contribution to DREAM 9.5 Prostate Cancer Challenge.
Kristiyanto D, Anderson K, Khankhajeh SS, Shi K, West S, Hung LH, Lee A, Wei Q, Wu M, Yin Y and Yeung KY. Predicting discontinuation of docetaxel treatment for metastatic castration-resistant prostate cancer (mCRPC) with hill-climbing and random forest.
F1000 Research 2015, 4:1383 (poster). Presented at the 8th annual RECOMB/ISCB Conference on Regulatory and Systems Genomics.
- CyNetworkBMA: a Cytoscape app for inferring gene regulatory networks.
Maciej Fronczuk, Adrian E. Raftery, Ka Yee Yeung.
Source Code for Biology and Medicine 2015, 10:11
- Toward Individualized Therapy: Correlation of Mutation Analysis with in vitro High Throughput Drug Sensitivity Testing in New Diagnosis and Relapsed Acute Myeloid Leukemia.
Becker P.S., Schmitt M.W., Loeb L.A., Xie Z., Carson A.R., Khankhajeh S.S., Wei Q., Hung L.H., Martins T., Estey E.H., Blau C.A., Oehler V. and Yeung K.Y.
Abstract accepted for poster presentation at the ASH (American Society of Hematology) Annual Meeting 2015. The abstract will appear in Blood.
- Development of a Wireless Sensor Network for Indoor Environment Using Wireless InSite.
Braga M. V., Lampa P. H. D. M., Silva F. A. N., Silva S. M. G., Baiocchi O. R., Yeung K.Y., Barret C. M., Landowski R., de Carvalho F. B. S.
Accepted for publication in ENCOM - IECOM Annual Meeting in Communications, Networks and Cryptography, Campina Grande, Brazil 2015.
- 2014
- Bayesian Model Averaging methods and R package for gene network construction.
Ka Yee Yeung, Chris Fraley, William Chad Young, Roger Bumgarner and Adrian E. Raftery.
Big Data Analytic Technology For Bioinformatics and Health Informatics (KDDBHI), workshop at the 20th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD), August 24-27, 2014, New York City.
- Fast Bayesian Inference for Gene Regulatory Networks Using ScanBMA.
Wm. C Young, Adrian E Raftery and Ka Yee Yeung.
BMC Systems Biology 2014, 8:47.
- 2013
- Personalized Approach to Acute Myeloid Leukemia Using a High-throughput Chemosensitivity Assay.
Yeung K.Y., Blau C.A., Oehler V.G., Lee S.I., Miller C., Chien S., Martins T.J., Estey E. and Becker P.S.
Blood November 15, 2013, vol. 122, no. 21: 483.
- Signature Discovery for Personalized Medicine.
Ka Yee Yeung.
Proceedings of the 2013 IEEE International Conference on Intelligence and Security Informatics, Part III, workshop papers, pages 333-338.
ISI 2013
- Discovery of expression signatures in chronic myeloid leukemia by Bayesian Model Averaging.
Ka Yee Yeung.
Statistical Diagnostics for Cancer: Analyzing High-Dimensional Data, Chapter 3. Wiley-Blackwell Publisher. Edited by Frank Emmert-Streib and Matthias Dehmer.
- 2012
- Integrating external biological knowledge in the construction of regulatory networks from time-series expression data.
Kenneth Lo, Adrian Raftery, Kenneth Dombek, Jun Zhu, Eric Schadt, Roger Bumgarner, and Ka Yee Yeung.
BMC Systems Biology 2012, 6:101.
- Predicting relapse prior to transplantation in chronic myeloid leukemia by integrating expert knowledge and expression data.
Ka Yee Yeung, Ted Gooley, A. Zhang, Adrian Raftery, Jerry Radich, and Vivian Oehler.
Bioinformatics 2012, 28(6): 823-830.
Supplementary web site.
- Fast Inference for the Latent Space Network Model Using a Case-Control Approximate Likelihood.
Adrian Raftery, Xiaoyue Niu, Peter Hoff and Ka Yee Yeung.
Journal of Computational and Graphical Statistics 2012, 21(4): 901-919.
An older version (July 2010) appeared in Technical Report 572, Department of Statistics, University of Washington.
- 2011
- Construction of regulatory networks using expression time-series data of a genotyped population.
Ka Yee Yeung, Kenneth Dombek, Kenneth Lo, John Mittler, Jun Zhu, Eric Schadt, Roger Bumgarner, and Adrian Raftery.
PNAS 2011, 108(48): 19436-41.
Supplementary web site.
- 2010
- 2009
- The derivation of diagnostic markers of chronic myeloid leukemia progression from microarray data.
Vivian G. Oehler*, Ka Yee Yeung*, Yongjae E. Choi, Roger E. Bumgarner, Adrian E. Raftery, and Jerald P. Radich.
Blood 2009, Vol. 114, No. 15, pp. 3292-3298.
*Co-first authors.
- Iterative Bayesian Model Averaging: a method for the application of survival analysis to high-dimensional microarray data.
Amalia Annest, Roger E Bumgarner, Adrian E Raftery, and Ka Yee Yeung.
BMC Bioinformatics 2009, 10: 72.
Supplementary web site. Software: Bioconductor package iterativeBMAsurv.
- 2008
- MeV+R: using MeV as a graphical user interface for Bioconductor applications in microarray analysis.
Vu T Chu, Raphael Gottardo, Adrian E Raftery, Roger E Bumgarner, Ka Yee Yeung.
Genome Biology 2008, 9: R118.
Supplementary web site.
- 2006
- Bayesian Context-specific infinite mixture model for clustering of gene expression profiles across diverse microarray datasets.
Xiangdong Liu, Siva Sivaganesan, Ka Yee Yeung, Junhai Guo, Roger Bumgarner, Mario Medvedovic.
Bioinformatics 2006, 22: 1737-1744.
- Bayesian Robust Inference for Differential Gene Expression in cDNA Microarrays with Multiple Samples.
Raphael Gottardo, Adrian Raftery, Ka Yee Yeung and Roger Bumgarner.
Biometrics 2006, 62: 10-18.
Earlier version: Technical Report 455 (July 2004), Department of Statistics, University of Washington.
- Robust estimation of cDNA microarray intensities with replicates.
Raphael Gottardo, Adrian Raftery, Ka Yee Yeung and Roger Bumgarner.
Journal of the American Statistical Association 2006, 101: 30-40.
Earlier version: Technical Report 438 (Dec 2003), Department of Statistics, University of Washington.
Supplementary web site.
- 2005
- Donuts, scratches and blanks: Robust model-Based segmentation of microarray images.
Qunhua Li, Chris Fraley, Roger Bumgarner, Ka Yee Yeung and Adrian Raftery.
Technical Report 473 (Jan 2005), Department of Statistics, University of Washington.
Bioinformatics 2005, 21: 2875-2882.
- Bayesian Model Averaging: Development of an improved multi-class, gene selection and classification tool for microarray data.
Ka Yee Yeung, Roger Bumgarner and Adrian Raftery.
Technical Report 468 (Oct 2004), Department of Statistics, University of Washington.
Bioinformatics 2005, 21: 2394-2402.
- 2004
- Bcl-2 overexpression leads to increases in suppressor of cytokine signaling-3 expression in B cells and de novo follicular lymphoma.
Gary J. Vanasse, Robert K. Winn, Sofya Rodov, Arthur W. Zieske, John T. Li, Joan C. Tupper, Mette A. Peters, Ka Y. Yeung, and John M. Harlan.
Molecular Cancer Research 2004, 2: 620-631.
- Review article: Pattern recognition in expression data.
Ka Yee Yeung and Roger Bumgarner.
Recent Developments in Nucleic Acids Research 2004, 1: 333-354.
- Bayesian Robust Inference for Differential Gene Expression in cDNA Microarrays with Multiple Samples.
Raphael Gottardo, Adrian Raftery, Ka Yee Yeung and Roger Bumgarner.
Technical Report 455 (July 2004), Department of Statistics, University of Washington.
To appear in Biometrics.
- From co-expression to co-regulation: how many microarray experiments do we need?
Ka Yee Yeung, Mario Medvedovic and Roger Bumgarner.
Genome Biology 2004, 5: R48.
- Bayesian mixture model based clustering of replicated microarray data.
Mario Medvedovic, Ka Yee Yeung and Roger Bumgarner.
Bioinformatics 2004, 20: 1222-1232.
- 2003
- 2002
- 2001
- 1999
Presentation and talks
Dissertations