Josh Le Grice

High-Performance Limit Order Book in C++

A matching engine built in C++20 replicating core exchange infrastructure. Implements price-time priority matching, O(1) order cancellation via hash-indexed lookup, and fixed-point integer price representation to eliminate floating-point map key errors. Includes VWAP tracking, nanosecond-resolution latency measurement, and test suite covering limit orders, market orders, partial fills, and cancellation.

C++20 Matching Engine

GitHub →

MSc Dissertation: Advancing Algorithmic Trading

Advancing Algorithmic Trading: Tabular Deep Learning & Alternative Data Featured

MSc dissertation investigating TabNet and FT-Transformer for daily equity trading signal generation across ~300 S&P 500 constituents (2015–2025). Evaluated via a daily-rebalanced long-short strategy benchmarked against SPY, with an interactive backtesting dashboard.

PythonTabNetFT-TransformerXGBoostOptunaBacktesting

Open app - Beta → Report coming soon Github Repo coming soon

Portfolio LLM Featured

Interactive RAG application that indexes my CV and projects for semantic retrieval. Deployed live on Streamlit Cloud - try asking it about any of my experience.

PythonLLMRAGStreamlit

Open app → GitHub →

Neural Architecture Benchmarking for Financial Sentiment Analysis Featured

Benchmarked MLP, LSTM, and Transformer architectures on financial news sentiment classification. Transformer models delivered the highest performance with 90.3% Accuracy, 0.876 F1 Score, and 0.969 ROC-AUC, outperforming RNN baselines across all metrics.

PythonPyTorchNLPDeep Learning

Read report → GitHub →

Cloud-Hosted Agriculture Database Management System Featured

Engineered a normalised (3NF) relational database on Azure SQL to analyse UK agricultural sustainability. Built automated Python ingestion pipelines and a Node.js REST API layer.

Azure SQLPythonNode.jsREST API

Read report → View implementation → GitHub →

Comparing LLM Methods: Parameter-Efficient Fine-tuning and Retrieval-Augmented Generation

Advanced LLM applications coursework comparing full fine-tuning vs. LoRA on DistilBERT, and implementing RAG for question answering. Achieved 89.8% accuracy with full fine-tuning and 85.7% with LoRA (49% faster, 99% fewer parameters). RAG system improved answer quality by 8.5× using retrieved context.

PyTorchTransformers PEFTLoRA RAGDistilBERT

GitHub →

Predicting Loan Defaults: Credit Risk Analysis

End-to-end credit risk pipeline on real-world loan data. Compared Logistic Regression, Random Forest, and XGBoost; addressed multicollinearity and class imbalance. XGBoost achieved the best results with 92.8% Accuracy, 0.944 AUC, and 0.816 F1-score.

PythonXGBoostScikit-learn

Read report → GitHub →

Multi-Layer Perceptron for Image Classification

Implemented a full MLP in NumPy - forward pass, backpropagation, weight updates - achieving ~89% accuracy on Fashion MNIST, on par with a reference PyTorch implementation.

PythonNumPyDeep Learning

GitHub →

Predicting Diabetes: Key Factors & Model Benchmarking

Identified top clinical predictors of diabetes diagnosis and benchmarked classification models. Logistic Regression yielded the highest accuracy at 77.3%, while SVM achieved the highest AUC at 0.850. Includes feature importance analysis and model interpretability.

PythonMLResearch

Read report → GitHub →

Commercial KPI Dashboard - Ekimetrics Internship

Semi-automated reporting pipeline built during internship at Ekimetrics: raw data ingestion and cleaning through to interactive Power BI dashboards for commercial KPI tracking.

Power BIPythonData Pipeline

Proprietary - not publicly accessible

ML & Type 1 Diabetes: A Critical Analysis

Third-year dissertation evaluating machine learning applications in T1D: glucose forecasting models, diagnostic tools, and closed-loop insulin delivery systems. Highlights include XGBoost achieving 0.99 AUC for hypoglycaemia prediction and dual-hormone systems reaching 93.1% Time in Range (TIR).

ResearchML

Read report →

Ethical Considerations in Automated Insulin Delivery

Evaluated the Diabeloop DBLG1 AI-driven diabetes management system. Analyzed algorithmic bias, transparency, and fairness through utilitarian and principlism frameworks, proposing collaborative governance strategies.

ResearchAI Ethics

Read report →

Exeter Students' Guild – ETL Data Pipeline

Streamlit web app for processing student data exports using a Bronze → Silver → Gold medallion architecture. Cleans and validates records, then splits into four exports for Power BI, membership, and survey platforms. Hosted on Azure App Service.

Python Streamlit Azure Pandas ETL Power BI

Proprietary - not publicly accessible

Technical Skills

Core Languages

Machine Learning & Deep Learning

Quantitative Finance

Software Engineering

Data & Cloud Infrastructure

Featured Projects

Certifications

Contact