A matching engine built in C++20 replicating core exchange
infrastructure. Implements price-time priority matching, O(1)
order cancellation via hash-indexed lookup, and fixed-point
integer price representation to eliminate floating-point map key
errors. Includes VWAP tracking, nanosecond-resolution latency
measurement, and test suite covering limit orders, market
orders, partial fills, and cancellation.
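The core ideas — a FIFO queue per price level for time priority, integer fixed-point prices as map keys, and a hash index for O(1) cancellation — can be sketched as follows. This is an illustrative Python sketch, not the actual C++20 implementation; class and method names are hypothetical.

```python
from collections import deque

PRICE_SCALE = 10_000  # assumed tick scaling: 4 decimal places as integers

class Order:
    def __init__(self, order_id, side, price, qty):
        self.id, self.side = order_id, side
        self.price = int(round(price * PRICE_SCALE))  # fixed-point map key
        self.qty = qty
        self.active = True

class OrderBook:
    def __init__(self):
        self.bids = {}   # price -> FIFO deque of orders (time priority)
        self.asks = {}
        self.index = {}  # order_id -> Order, enabling O(1) cancellation

    def add(self, order):
        book = self.bids if order.side == "buy" else self.asks
        book.setdefault(order.price, deque()).append(order)
        self.index[order.id] = order

    def cancel(self, order_id):
        order = self.index.pop(order_id, None)  # O(1) hash lookup
        if order:
            order.active = False  # lazily skipped when matching

    def best_ask(self):
        live = [p for p, q in self.asks.items() if any(o.active for o in q)]
        return min(live) if live else None
```

Storing prices as scaled integers avoids the floating-point rounding that makes raw `double` keys unreliable in ordered maps.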
Advancing Algorithmic Trading: Tabular Deep Learning & Alternative Data
Featured
MSc dissertation investigating TabNet and FT-Transformer for daily equity trading
signal generation across ~300 S&P 500 constituents (2015–2025). Evaluated via a daily-rebalanced long-short strategy
benchmarked against SPY, with an interactive backtesting dashboard.
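The evaluation strategy above can be sketched in a few lines: each day, rank the cross-section of model scores, go long the top decile and short the bottom decile, equally weighted. This is a minimal illustrative sketch, not the dissertation's actual pipeline.

```python
import numpy as np

def long_short_weights(scores):
    """scores: 1-D array of model signals for one day's universe."""
    n = len(scores)
    order = np.argsort(scores)      # ascending by signal
    k = max(1, n // 10)             # decile size
    w = np.zeros(n)
    w[order[-k:]] = 1.0 / k         # equal-weight longs (top decile)
    w[order[:k]] = -1.0 / k         # equal-weight shorts (bottom decile)
    return w                        # dollar-neutral by construction

def daily_return(w, next_day_returns):
    return float(w @ next_day_returns)
```

Rebuilding `w` every day from fresh scores gives the daily-rebalanced long-short portfolio whose returns are then benchmarked against SPY.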
Interactive RAG application that indexes my CV and projects for
semantic retrieval. Deployed live on Streamlit Cloud - try
asking it about any of my experience.
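The retrieval step behind such an app reduces to cosine similarity between a query embedding and pre-indexed document-chunk embeddings. A minimal sketch, assuming embeddings are already computed (the deployed app uses an embedding model and a vector store):

```python
import numpy as np

def top_k_chunks(query_vec, embeddings, chunks, k=3):
    """embeddings: (n_chunks, d) matrix; chunks: parallel list of texts."""
    # Normalise, then cosine similarity is a single matrix-vector product
    q = query_vec / np.linalg.norm(query_vec)
    m = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sims = m @ q
    best = np.argsort(sims)[::-1][:k]
    return [chunks[i] for i in best]   # context handed to the LLM prompt
```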
Neural Architecture Benchmarking for Financial Sentiment Analysis
Featured
Benchmarked MLP, LSTM, and Transformer architectures on
financial news sentiment classification. Transformer models
delivered the highest performance with 90.3% accuracy, 0.876 F1
score, and 0.969 ROC-AUC, outperforming the MLP and LSTM
baselines across all metrics.
Cloud-Hosted Agriculture Database Management System
Featured
Engineered a normalised (3NF) relational database on Azure SQL
to analyse UK agricultural sustainability. Built automated
Python ingestion pipelines and a Node.js REST API layer.
Comparing LLM Methods: Parameter-Efficient Fine-tuning and Retrieval-Augmented Generation
Advanced LLM applications coursework comparing full fine-tuning vs. LoRA on DistilBERT, and
implementing RAG for question answering. Achieved 89.8% accuracy with full fine-tuning
and 85.7% with LoRA (49% faster, 99% fewer parameters). RAG system improved answer
quality by 8.5× using retrieved context.
End-to-end credit risk pipeline on real-world loan data.
Compared Logistic Regression, Random Forest, and XGBoost;
addressed multicollinearity and class imbalance. XGBoost
achieved the best results with 92.8% accuracy, 0.944 AUC, and
0.816 F1-score.
Implemented a full MLP in NumPy - forward pass, backpropagation,
weight updates - achieving ~89% accuracy on Fashion MNIST, on
par with a reference PyTorch implementation.
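The training step described above — forward pass, backpropagation, weight updates — can be sketched for a single hidden layer (the project's network is larger; this is a minimal illustration):

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def train_step(x, y_onehot, W1, b1, W2, b2, lr=0.1):
    # Forward pass
    h = relu(x @ W1 + b1)
    logits = h @ W2 + b2
    p = np.exp(logits - logits.max(axis=1, keepdims=True))
    p /= p.sum(axis=1, keepdims=True)          # softmax probabilities
    # Backprop: softmax + cross-entropy gradient is simply (p - y)
    n = x.shape[0]
    d_logits = (p - y_onehot) / n
    dW2, db2 = h.T @ d_logits, d_logits.sum(axis=0)
    d_h = (d_logits @ W2.T) * (h > 0)          # ReLU gradient mask
    dW1, db1 = x.T @ d_h, d_h.sum(axis=0)
    # SGD weight updates
    return (W1 - lr * dW1, b1 - lr * db1, W2 - lr * dW2, b2 - lr * db2)
```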
Predicting Diabetes: Key Factors & Model Benchmarking
Identified top clinical predictors of diabetes diagnosis and
benchmarked classification models. Logistic Regression yielded
the highest accuracy at 77.3%, while SVM achieved the highest
AUC at 0.850. Includes feature importance analysis and model
interpretability.
Semi-automated reporting pipeline built during internship at
Ekimetrics: raw data ingestion and cleaning through to
interactive Power BI dashboards for commercial KPI tracking.
Power BI · Python · Data Pipeline
Proprietary - not publicly accessible
ML & Type 1 Diabetes: A Critical Analysis
Third-year dissertation evaluating machine learning applications
in T1D: glucose forecasting models, diagnostic tools, and
closed-loop insulin delivery systems. Highlights include XGBoost
achieving 0.99 AUC for hypoglycaemia prediction and dual-hormone
systems reaching 93.1% Time in Range (TIR).
Streamlit web app for processing student data exports using
a Bronze → Silver → Gold medallion architecture. Cleans and
validates records, then splits into four exports for Power BI,
membership, and survey platforms. Hosted on Azure App Service.
Python · Streamlit · Azure · Pandas · ETL · Power BI
Proprietary - not publicly accessible
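The Bronze → Silver → Gold flow can be sketched as three pandas stages: land raw data untouched, clean and validate it, then split into per-destination exports. Column names below are hypothetical, not the real student-data schema.

```python
import pandas as pd

def bronze(raw_csv):
    return pd.read_csv(raw_csv)                 # raw data, as landed

def silver(bronze_df):
    # Clean and validate: dedupe, require key fields, normalise text
    df = bronze_df.drop_duplicates()
    df = df.dropna(subset=["student_id"])
    df = df.assign(email=df["email"].str.strip().str.lower())
    return df

def gold(silver_df):
    # Split the cleaned data into exports for downstream platforms
    return {
        "power_bi": silver_df,
        "membership": silver_df[["student_id", "email"]],
    }
```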
Certifications
Verified credentials
IBM Data Science Professional Certificate
12-module certification covering Python, SQL, data visualisation,
machine learning, and applied data science. Verified by IBM via
Coursera.