Machine Learning

Aspect-based Product Review Summarizer

A web application that classifies aspects of Amazon product reviews and summarizes the sentiment for each aspect. Two classifiers are implemented to achieve high accuracy: MaxEntropy (supervised learning) and word2vec (unsupervised learning). A four-step pipeline covers scraping reviews into a database, aspect classification, sentiment scoring, and front-end visualization.

[Github Code | Website]
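The middle two steps of the pipeline (aspect classification and sentiment scoring) can be illustrated with a minimal sketch. It is not the project's code: simple keyword lookups stand in for the MaxEntropy and word2vec classifiers, and the keyword lists, function names, and example sentences are all hypothetical.

```python
from collections import defaultdict

# Hypothetical keyword lists; a stand-in for the trained aspect classifiers.
ASPECT_KEYWORDS = {
    "battery": {"battery", "charge", "charging"},
    "screen": {"screen", "display", "resolution"},
}
# Hypothetical sentiment lexicon; a stand-in for a real sentiment scorer.
POSITIVE = {"great", "good", "excellent", "love"}
NEGATIVE = {"bad", "poor", "terrible", "hate"}


def tokenize(sentence):
    """Lowercase the sentence and strip basic punctuation from each word."""
    return [word.strip(".,!?").lower() for word in sentence.split()]


def classify_aspect(tokens):
    """Return the aspect with the most keyword hits, or None if there are none."""
    best_aspect, best_hits = None, 0
    for aspect, keywords in ASPECT_KEYWORDS.items():
        hits = sum(1 for token in tokens if token in keywords)
        if hits > best_hits:
            best_aspect, best_hits = aspect, hits
    return best_aspect


def sentiment_score(tokens):
    """Count positive words minus negative words."""
    positive = sum(1 for token in tokens if token in POSITIVE)
    negative = sum(1 for token in tokens if token in NEGATIVE)
    return positive - negative


def summarize(sentences):
    """Average the sentiment score of the sentences assigned to each aspect."""
    scores = defaultdict(list)
    for sentence in sentences:
        tokens = tokenize(sentence)
        aspect = classify_aspect(tokens)
        if aspect is not None:
            scores[aspect].append(sentiment_score(tokens))
    return {aspect: sum(vals) / len(vals) for aspect, vals in scores.items()}


if __name__ == "__main__":
    reviews = [
        "The battery is great.",
        "Battery charging is terrible and bad.",
        "The screen is excellent!",
        "Shipping was fast.",
    ]
    print(summarize(reviews))
```

Sentences that match no aspect are dropped from the summary, which keeps off-topic remarks (shipping, for example) from skewing the per-aspect averages.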

Molecule2Vec: Molecular ConvNets

Automatic kinetic model generation, reagent screening in drug discovery, and automatic transition state search are among the applications in which a large space of molecules must be explored based on thermodynamic properties. Existing estimation approaches are either slow or inaccurate. This work provides a fast and accurate estimation method: it maps molecular graphs into learnable fingerprints and improves prediction accuracy by a factor of 25.

[Github Code | Website | Blog]
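To give a rough idea of what mapping a molecular graph into a fingerprint can look like, here is a minimal sketch of a neighbor-pooling scheme in plain Python. It is not the Molecule2Vec implementation: the function names, the one-hot atom features, and the fixed random weights are assumptions for illustration. In a learnable fingerprint the weight matrices would be trained against property data rather than drawn at random.

```python
import math
import random


def make_weights(n_in, n_out, seed):
    """Build an n_out x n_in matrix of fixed random weights (untrained placeholder)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def fingerprint(atom_features, bonds, weights_per_round):
    """Pool atom features over the molecular graph into one fixed-length vector.

    atom_features: one feature vector per atom (here, a one-hot element type).
    bonds: pairs of atom indices that are bonded.
    weights_per_round: one square weight matrix per pooling round.
    """
    n_atoms = len(atom_features)
    neighbors = [[] for _ in range(n_atoms)]
    for i, j in bonds:
        neighbors[i].append(j)
        neighbors[j].append(i)

    feats = [list(f) for f in atom_features]
    dim = len(feats[0])
    # Round 0 contribution: sum of the raw atom features.
    fp = [sum(f[k] for f in feats) for k in range(dim)]

    for weights in weights_per_round:
        updated = []
        for i in range(n_atoms):
            # Sum each atom's features with those of its bonded neighbors.
            pooled = list(feats[i])
            for j in neighbors[i]:
                pooled = [p + q for p, q in zip(pooled, feats[j])]
            updated.append([math.tanh(v) for v in matvec(weights, pooled)])
        feats = updated
        # Summing over atoms makes the result independent of atom ordering.
        fp = [acc + sum(f[k] for f in feats) for k, acc in enumerate(fp)]
    return fp


if __name__ == "__main__":
    # A C-C-O heavy-atom skeleton with one-hot features [is_carbon, is_oxygen].
    atoms = [[1, 0], [1, 0], [0, 1]]
    bonds = [(0, 1), (1, 2)]
    rounds = [make_weights(2, 2, seed=0), make_weights(2, 2, seed=1)]
    print(fingerprint(atoms, bonds, rounds))
```

Because every step sums over atoms or neighbors, renumbering the atoms of the same molecule yields the same vector up to floating-point rounding, which is the property that makes such a vector usable as a fingerprint.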

MolNet: A Central Database for Molecular Machines

MolNet is a central database of molecules and their thermodynamic properties. It is a live, growing database fed by RMG users and by autoThermo, my private package for large-scale quantum mechanics calculations. MolNet continuously feeds the molecular machine learning models used in RMG and provides benchmarking data for the community.

[Github Code | Website]

Data Science

Aspect-based Product Review Summarizer

A web application that classifies aspects of Amazon product reviews and summarizes the sentiment for each aspect. Two classifiers are implemented to achieve high accuracy: MaxEntropy (supervised learning) and word2vec (unsupervised learning). A four-step pipeline covers scraping reviews into a database, aspect classification, sentiment scoring, and front-end visualization.

[Github Code | Website]

Software Engineering

Reaction Mechanism Generator

Reaction Mechanism Generator (RMG) is one of the most popular software suites for automatically generating chemical reaction models for energy-related systems, including pyrolysis, combustion, atmospheric science, and more.

I am currently the lead developer of this project. Selected work:

  • Maintain and release new versions (now v2.1.0)
  • Optimize RMG memory usage and parallelism
  • Integrate Molecule2Vec into property prediction
  • Develop the continuous-integration test platform RMG-tests

[Github Code | Website]

RMG-tests

RMG-tests is the continuous-integration test platform for RMG. It provides a second layer of quality control beyond the basic Travis unit tests. It currently supports two modes: automatically deployed Travis tests and local tests.

[Github Code | Website]

Web Development

Sidney-Pacific: MIT Graduate Dorm Website

The Sidney-Pacific website is among the most feature-rich MIT dorm websites. Its features include automatic campus shuttle tracking, smart laundry reminders, an inventory management and analysis system, real-time package notifications, house repairs, and dorm event publicity.

I contributed to both the inventory and package systems.

[Website]

Sidney-Pacific TV System

Sidney-Pacific TV System (SPTV) is a dorm project to enhance the residential experience. We place TVs where people usually wait, e.g., at elevators, to show shuttle times, the weather forecast, news, and ongoing events inside Sidney-Pacific. The building's 16 TVs are controlled by Raspberry Pi computers. The project combines hardware and web development. The status of the TVs is monitored through a dashboard.

[Website]

RMG website

The RMG website is a sister project of RMG-Py. It provides analysis and visualization tools for RMG users. Selected features:

  • Search molecules and estimate properties
  • Browse the native RMG database
  • Create input files for RMG models
  • Visualize/compare/merge RMG models
  • Find the latest RMG workshops and materials

[Github Code | Website]