Welcome to my humble abode!!

At Ahu Tongariki, Easter Island



Who I am


Education


Professional Experience

Research and development at Knowledge Mining Research Team, Electronics and Telecommunications Research Institute (ETRI), South Korea, February 2010 - April 2014

My first task was to build a Hadoop and HBase cluster to process Big Data (news, blogs, Tweets). I then taught myself to write MapReduce code for text analysis. This changed the way my team processed text data: from handling tens of thousands of documents on a few separate machines to systematically analyzing millions of documents per day on a cluster and saving the results to a distributed database. I then worked on the Named Entity Recognition and Event Extraction modules while studying machine learning techniques, and developed solvers for binary SVM, structural SVM, one-class SVM, and ranking SVM in C++ and Java.
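Of those solvers, the binary SVM is the easiest to illustrate. Below is a minimal Python/NumPy sketch of a Pegasos-style subgradient solver for the binary hinge loss; the function name, toy data, and hyperparameters are illustrative assumptions, not the actual C++/Java code written at ETRI.

import numpy as np

def pegasos_binary_svm(X, y, lam=0.01, n_iters=1000, seed=0):
    """Pegasos-style stochastic subgradient solver for a linear binary SVM.

    X: (n_samples, n_features) feature matrix
    y: (n_samples,) labels in {-1, +1}
    lam: L2 regularization strength
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    for t in range(1, n_iters + 1):
        i = rng.integers(n)              # sample one training example
        eta = 1.0 / (lam * t)            # decaying step size
        margin = y[i] * X[i].dot(w)
        # Subgradient step on lam/2 * ||w||^2 + max(0, 1 - y_i * w.x_i)
        w = (1 - eta * lam) * w
        if margin < 1:
            w += eta * y[i] * X[i]
    return w

# Toy usage on linearly separable 2-D data
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -2.0], [-2.0, -1.5]])
y = np.array([1, 1, -1, -1])
w = pegasos_binary_svm(X, y)
print(np.sign(X.dot(w)))  # expected: [ 1.  1. -1. -1.]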

Project Participation Internship at Research, Development and Dissemination (RD&D), Sutter Health, California, May 2015 - August 2015

During the internship, I explored the potential of applying deep learning methods to health care problems, specifically predicting future heart failure diagnoses. Applying stacked denoising autoencoders to heart failure prediction enabled sophisticated analysis of the relation between patient features and heart failure diagnosis. Furthermore, by combining a word embedding technique with recurrent neural networks, I was able to improve heart failure prediction performance from 0.81 AUC to 0.86 AUC. This work was published in JAMIA.

Internship at Research, Development and Dissemination (RD&D), Sutter Health, California, May 2016 - August 2016

In my second internship at Sutter Health, I focused on developing interpretable deep learning models for predictive healthcare. Specifically, using a neural attention mechanism combined with an RNN and an MLP, I designed a sequence prediction model, RETAIN, that achieved an AUC similar to an RNN while remaining completely interpretable: the model allows precise calculation of how much each diagnosis, medication, or procedure in the past visits contributed to the final prediction.
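The key to that interpretability is that the prediction is a linear function of attention-weighted code embeddings, so the logit decomposes exactly into one contribution per code per visit. The toy NumPy sketch below shows the idea using only a visit-level (scalar) attention; the actual RETAIN model also uses a variable-level attention vector and computes both attentions with RNNs run in reverse time order, so this is an illustration of the principle rather than the published model.

import numpy as np

# Toy setup: 5 medical codes, 4-dimensional embeddings, 3 past visits.
rng = np.random.default_rng(0)
n_codes, emb_dim, n_visits = 5, 4, 3

W_emb = rng.normal(size=(emb_dim, n_codes))      # code embedding matrix
w_out = rng.normal(size=emb_dim)                 # output (logistic) weights
alpha = np.array([0.1, 0.3, 0.6])                # visit-level attention (RETAIN produces this with an RNN)
x = rng.integers(0, 2, size=(n_visits, n_codes)).astype(float)  # multi-hot visit records

# Prediction: attention-weighted sum of visit embeddings, then a linear layer.
visit_emb = x @ W_emb.T                          # (n_visits, emb_dim)
context = (alpha[:, None] * visit_emb).sum(axis=0)
logit = w_out @ context

# Because every step above is linear, the logit decomposes exactly into
# one contribution per (visit, code) pair.
contrib = alpha[:, None] * x * (w_out @ W_emb)   # (n_visits, n_codes)
assert np.isclose(contrib.sum(), logit)
print(contrib)  # how much each code in each visit pushed the prediction up or down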

Research Internship at DeepMind, London, U.K., February 2017 - May 2017

My first project was to train an embodied agent to find the heaviest object in a virtual environment. This extended the "Which is heavier?" experiment from Learning to Perform Physics Experiments via Deep Reinforcement Learning (Denil et al., ICLR 2017). The agent was equipped with a hammer to probe the objects, and a positive reward was given when the hammer was in contact with the heaviest object (hence the project name Pinata). The agent successfully learned to interact with the objects and stick to the heaviest one (example video 1, example video 2). My second project was related to language and communication.

Research Internship at Google Research, Mountain View, California, May 2017 - Aug 2017

I was a member of the FluidNets project team. The objective was to automatically learn the structure of neural networks under a resource constraint (e.g., number of parameters or FLOPs), using various regularization methods.
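Publication #18 below has the details; the rough flavor of one such regularizer is an L1 penalty on per-channel scale factors (e.g., batch-norm gammas), weighted by the resource cost each channel induces, so that channels whose scale is driven to zero can be pruned where they are most expensive. The NumPy sketch below of a FLOP-weighted penalty uses made-up layer sizes and a simplified cost model, and is my own illustration rather than the actual FluidNets code.

import numpy as np

# Toy conv "network": (kernel_h, kernel_w, in_channels, out_channels, out_h, out_w) per layer.
layers = [
    (3, 3, 3, 16, 32, 32),
    (3, 3, 16, 32, 16, 16),
]
rng = np.random.default_rng(0)
# Per-output-channel scale factors (batch-norm gammas), one vector per layer.
gammas = [rng.normal(size=l[3]) for l in layers]

def flop_weighted_l1(layers, gammas):
    """Resource-aware sparsity penalty: each output channel's |gamma| is
    weighted by the FLOPs that keeping that channel costs."""
    penalty = 0.0
    for (kh, kw, cin, cout, oh, ow), g in zip(layers, gammas):
        flops_per_out_channel = kh * kw * cin * oh * ow
        penalty += flops_per_out_channel * np.abs(g).sum()
    return penalty

reg_strength = 1e-9
task_loss = 0.42  # placeholder for the usual cross-entropy term
total_loss = task_loss + reg_strength * flop_weighted_l1(layers, gammas)
print(total_loss)
# Channels whose gamma is pushed toward zero by this penalty can be removed,
# shrinking the network where FLOPs are most expensive.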


Publication

  1. Yoonjae Choi, Hodong Lee, Ho-Joon Lee, Jong C. Park, 2009, Extracting melodies from polyphonic piano solo music based on patterns of music structure, In Proc. of Human Computer Interaction (HCI) Korea 2009, pp.725-732.
  2. Yoonjae Choi, Jong C. Park, 2009, Extracting melodies from piano solo music based on characteristics of music, In Proc. of Korea Computer Congress (KCC) 2009, pp.124-125. (Best paper)
  3. Yoonjae Choi, Jong C. Park, 2009, Extracting melodies from piano solo music based on its characteristics, Journal of Korean Institute of Information Scientists and Engineers (KIISE): Computing Practices and Letters, vol.15, no.12, pp.923-927.
  4. Jeong Heo, Pum-Mo Ryu, Yoonjae Choi, Hyunki Kim, 2012, Event template extraction for the decision support based on social media, In Proc. of The 24th Annual Conference on Human & Cognitive Language Technology (HCLT) 2012, pp.53-57. (Best paper)
  5. Yoonjae Choi, Pum-Mo Ryu, Hyunki Kim, Changki Lee, 2013, Extracting events from web documents for social media monitoring using structured SVM, The Institute of Electronics, Information and Communication Engineers (IEICE) Transactions on Information and Systems, vol.E96-D, no.6, pp.1410-1414.
  6. Jeong Heo, Pum-Mo Ryu, Yoonjae Choi, Hyunki Kim, Cheol Young Ock, 2013, An issue event search system based on big data for decision supporting: Social Wisdom, Journal of Korean Institute of Information Scientists and Engineers (KIISE): Software and Applications, vol.40, no.7, pp.381-394.
  7. Edward Choi, Hyunki Kim, Changki Lee, 2014, Balanced Korean word spacing with structural SVM, In Proc. of Empirical Methods in Natural Language Processing (EMNLP) 2014, pp.875-879.
  8. Edward Choi, Jina Dcruz, Sizhe Lin, Aashu Singh, Hang Su, Kelly Ryder, Sridhar R. Papagari Sangareddy, Herman Tolentino, Jimeng Sun, 2015, System architecture of CDC I-SMILE recommendation engine, American Medical Informatics Association (AMIA) 2015, poster presentation
  9. Edward Choi, Jina Dcruz, Sizhe Lin, Kelly Ryder, Aashu Singh, Hang Su, 2015, I-SMILE: similarity based just-in-time recommendation system for public health, American Medical Informatics Association (AMIA) 2015, poster presentation as a top-7 finalist in Student Design Challenge
  10. Edward Choi, Nan Du, Robert Chen, Le Song, Jimeng Sun, 2015, Constructing disease network and temporal progression model via context-sensitive Hawkes process, In Proc. of International Conference on Data Mining (ICDM) 2015, pp.721-726. Full version (a kinder random process intro, scalability experiments)
  11. Edward Choi, Mohammad Taha Bahadori, Elizabeth Searles, Catherine Coffey, Michael Thompson, James Bost, Javier Tejedor-Sojo, Jimeng Sun, 2016, Multi-layer representation learning for medical concepts, In Proc. of Knowledge Discovery and Data Mining (KDD) 2016, pp.1495-1504. GitHub Repo
  12. Edward Choi, Andy Schuetz, Walter F. Stewart, Jimeng Sun, 2016, Using recurrent neural network models for early detection of heart failure onset, Journal of the American Medical Informatics Association (JAMIA), doi:10.1093/jamia/ocw112. GitHub Repo
  13. Edward Choi, Mohammad Taha Bahadori, Andy Schuetz, Walter F. Stewart, Jimeng Sun, 2016, Doctor AI: Predicting clinical events via recurrent neural networks, In Proc. of Machine Learning for Healthcare (MLHC) 2016, pp.301-318. GitHub Repo
  14. Edward Choi, Mohammad Taha Bahadori, Joshua A. Kulas, Andy Schuetz, Walter F. Stewart, Jimeng Sun, 2016, RETAIN: An interpretable predictive model for healthcare using reverse time attention mechanism, In Proc. of Neural Information Processing Systems (NIPS) 2016, pp.3504-3512. GitHub Repo
  15. Edward Choi, Mohammad Taha Bahadori, Le Song, Walter F. Stewart, Jimeng Sun, 2017, GRAM: Graph-based Attention Model for Healthcare Representation Learning, In Proc. of Knowledge Discovery and Data Mining (KDD) 2017. GitHub Repo
  16. Mohammad Taha Bahadori, Krzysztof Chalupka, Edward Choi, Robert Chen, Walter F. Stewart, Jimeng Sun, 2017, Causal Regularization, arXiv:1702.02604.
  17. Edward Choi, Siddharth Biswal, Bradley Malin, Jon Duke, Walter F. Stewart, Jimeng Sun, 2017, Generating Multi-label Discrete Patient Records using Generative Adversarial Networks, In Proc. of Machine Learning for Healthcare (MLHC) 2017. GitHub Repo
  18. Ariel Gordon, Elad Eban, Ofir Nachum, Bo Chen, Tien-Ju Yang, Edward Choi, 2017, FluidNets: Fast & Simple Resource-Constrained Structure Learning of Deep Networks, arXiv:1711.06798.

Extras


Curriculum Vitae

Here

Contact

mp(two)(eight)(nine)(three) at gatech.edu (write numbers in digits, no spaces)

Experiments

DeepMind internship project: Pinata - agent gets a reward if it touches the heaviest object

This extends the "Which is heavier?" experiment from Learning to Perform Physics Experiments via Deep Reinforcement Learning (Denil et al., ICLR 2017).
Object densities are shown at the bottom right.

Agent trained with 15-second episodes.
Agent trained with 50-second episodes (time ticks 3 times faster, hence the 16-second video length).

GRAM Visualization

2D plot of disease representations learned from domain knowledge, initialized with GloVe
2D plot of disease representations learned from domain knowledge
2D plot of disease representations learned from fake domain knowledge
2D plot of disease representations learned by GRU, initialized with GloVe vectors
2D plot of disease representations learned by GRU, randomly initialized
2D plot of disease representations learned by GloVe
2D plot of disease representations learned by Skip-gram

Healthcare Concept Representation Learned by Med2Vec

codeEmb.npy: Embedding matrix of medical concepts (Python Numpy).
int2str.p: Mapping from integer code to string code (Python dictionary).
str2desc.p: Mapping from string code to description (Python dictionary).

The embedding matrix codeEmb.npy has shape 27523 by 200. Each row is a specific medical concept (diagnosis code, medication code, or procedure code) represented by a 200-dimensional vector. int2str.p is a Python dictionary that maps the row index of the embedding matrix to the string code of the medical concept. For example, the first row of codeEmb.npy can be mapped to the string code "D_401.9". The first letter of the string code is D, R, or P, which stand for diagnosis, medication, and procedure, respectively. str2desc.p is a Python dictionary that maps the string code to the actual description of the medical concept. For example, the string code "D_401.9" maps to the description "Unspecified essential hypertension".
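As a quick illustration, here is a minimal Python sketch of how these three files could be loaded and queried. The file names, shapes, and the example code "D_401.9" come from the description above; everything else (variable names, the cosine-similarity lookup) is an assumption.

import pickle
import numpy as np

code_emb = np.load('codeEmb.npy')     # (27523, 200) embedding matrix
with open('int2str.p', 'rb') as f:    # if the pickles were written with Python 2,
    int2str = pickle.load(f)          # Python 3 may need pickle.load(f, encoding='latin1')
with open('str2desc.p', 'rb') as f:
    str2desc = pickle.load(f)

row = 0
code = int2str[row]
print(code, '->', str2desc[code])     # e.g. D_401.9 -> Unspecified essential hypertension

# Nearest concepts to this one by cosine similarity.
vec = code_emb[row]
sims = code_emb @ vec / (np.linalg.norm(code_emb, axis=1) * np.linalg.norm(vec) + 1e-8)
for i in sims.argsort()[::-1][1:6]:   # skip the concept itself
    print(int2str[i], str2desc[int2str[i]], round(float(sims[i]), 3))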

Healthcare Concept Representation Learning Visualization

2D plot of disease representations learned from non-negative Skip-gram

Disease network analysis from ICDM 2015

Disease network constructed from MIMIC II dataset

Last Modified: Nov. 24, 2017