Projects

Always building. Always learning.

Here are some of the projects I have been working on:

| Image | Project Name | Description | Year | Status |
|---|---|---|---|---|
| arXiv logo | arxiv code search | Searching through arXiv papers (with ML) to see whether they include the code and data needed to reproduce the work. Active research. (GitHub repo) | 2022 | 🛠️ (active) |
| PyPHM - Machinery data, made easy | PyPHM | Machinery data, made easy. Open-source Python package to easily download and prepare common PHM (prognostics and health management) datasets. Use PyPHM before feature engineering or model training. (GitHub repo; install with pip) | 2022 | 🛠️ (active) |
| Can we use a VAE combined with a GAN? | Surrogate Modeling of Time Series with GANs | Using generative adversarial networks (GANs) to model time series. Applied to manufacturing systems. Active research (not yet public). | 2022 | 🛠️ (active) |
| Generic feature engineering/ML pipeline | Time Series ML Pipeline | Developing a scalable ETL/ML pipeline for rapid testing of feature engineering and machine learning techniques. For use on industrial time series data. Leverages HPC or cloud infrastructure. Active research. (GitHub repo) | 2022 | 🛠️ (active) |
| EarthGAN preliminary results. | EarthGAN | Can we visualize a large scientific data set with a surrogate model? Demonstrating a proof of concept using the Earth mantle convection data set and GANs. Recipient of the Innovation Award at IEEE VIS 2021. (GitHub repo; YouTube presentation; preprint article) | 2021 | ⏸️ (paused) |
| Sources of knowledge used in Weibull-based loss function paper; Results for the RUL prediction on the IMS data set | Knowledge Informed Machine Learning | External knowledge can enhance machine learning. We take knowledge from reliability engineering and integrate it into a machine learner through a Weibull-based loss function. Demonstrated on bearing remaining-useful-life prediction. Published (accepted) in the Journal of Prognostics and Health Management. (GitHub repo; preprint article) | 2021 | ✔️ (complete) |
| A violin plot showing the births, by month. | CDC Birth Data | Personal project exploring the CDC birth data files from 1968 to 2020. Developed for HPC and local compute. (GitHub repo) | 2021 | 🛠️ (active) |
| Jupyter + Compute Canada = Love! | Compute Canada Tutorials | New to Compute Canada and high performance computing? Here are some tutorials to get you started. (GitHub repo; YouTube link) | 2021 | ✔️ (complete) |
| Trend line; precision recall curves | Anomaly Detection for Tool Wear Monitoring Using a $\beta$-VAE | Anomaly detection on the UC Berkeley milling data set using a disentangled variational autoencoder ($\beta$-VAE). Published in the International Journal of Hydromechatronics. (GitHub repo; preprint article) | 2020 | ✔️ (complete) |
| KNN on the Iris data set, with the decision boundaries | Beautiful Plots | A collection of beautiful plots, and other data visualization explorations. (GitHub repo) | 2020 | 🛠️ (active) |
| Select trends of features on industrial CNC data. | Feature Engineering in Tool Wear Monitoring | Demonstrating feature engineering and classical machine learning for tool wear monitoring. Applied to an industrial partner’s manufacturing environment. Discussed in the thesis "Feature Engineering and End-to-End Deep Learning in Tool Wear Monitoring". | 2020 | ✔️ (complete) |
| Anomaly Detection in the Wild front page. | Anomaly Detection in the Wild | Talk on real-world anomaly detection methods in health care, astronomy, finance, and manufacturing. Presented at PyCon Canada 2019 (the video of the presentation was lost, but here’s the repo and pdf). | 2019 | ✔️ (complete) |
| AI in Public Health | AI in Public Health | Presentation on AI within the public health domain. Talk given to medical residents and MOHs at KFL&A Public Health. (pdf) | 2019 | ✔️ (complete) |
| Medium voltage signals | Deep Learning for Partial Discharge Detection | Final project for Deep Learning course (CISC-867). Used a convolutional autoencoder to detect faults in medium voltage power lines. Received an A+ in the course too! (GitHub repo; final paper) | 2019 | ✔️ (complete) |