Judge an LLM Judge: A Dual-Layer Evaluation Framework for Continuous Improvement of LLM Evaluation
Can "the evaluation of an LLM application by an LLM judge" be audited by another LLM judge for the continuous improvement of the evaluation? - 25896 Murphy 2025-03-22
Towards Generalization on Graphs: From Invariance to Causality
This blog post shares recent papers on out-of-distribution generalization on graph-structured data - 22021 Murphy 2025-03-22
Modern Enterprise Data Modeling
How to address the shortcomings of shallow, outdated models and future-proof your modeling strategy - 28499 Murphy 2025-03-22
A Python Engineer's Introduction to 3D Gaussian Splatting (Part 3)
Part 3 of our Gaussian Splatting tutorial, showing how to render splats onto a 2D image. - 21890 Murphy 2025-03-22
YOLO inference with Docker via API
Learn how to orchestrate object detection inference via a REST API with Docker - 24579 Murphy 2025-03-22
Forecasting in the Age of Foundation Models
Benchmarking Lag-Llama against XGBoost - 28932 Murphy 2025-03-22
Product Quasi-Experimentation: Statistical Techniques When Standard A/B Testing Is Not Possible
A guide to the most popular techniques when randomized A/B testing is not possible - 25582 Murphy 2025-03-22
Counterfactuals in Language AI
with open source language models and LLMs - 29916 Murphy 2025-03-22
Let's reproduce NanoGPT with JAX! (Part 1)
Part 1: Build 124M GPT2 with JAX. Part 2: Optimize the training speed on a single GPU. Part 3: Multi-GPU training in JAX. - 24612 Murphy 2025-03-22
How To Start Technical Writing & Blogging
Why writing data science blogs changed my career. - 27631 Murphy 2025-03-22
Line By Line, Let's Reproduce GPT-2: Section 1
This blog post will go line-by-line through the code in Section 1 of Andrej Karpathy's "Let's reproduce GPT-2 (124M)" - 24390 Murphy 2025-03-22
Constrained Sentence Generation Using Gibbs Sampling and BERT
A fast and effective approach to generating fluent sentences from given keywords using public pre-trained models. - 25331 Murphy 2025-03-22
I Used to Hate Overfitting, But Now I'm Grokking It
The surprising generalisation beyond overfitting - 23411 Murphy 2025-03-22
Advanced Data Modelling
Data model layers, environments, tests and data quality explained - 27677 Murphy 2025-03-22
Python's Parallel Paradigm Shift
Exploring the performance potential of a GIL-free Python - 28354 Murphy 2025-03-22
Evaluating ChatGPT's Data Analysis Improvements: Interactive Tables and Charts
Is ChatGPT becoming a BI tool? - 23130 Murphy 2025-03-22
Streamlining Object Detection with Metaflow, AWS, and Weights & Biases
How to create a production-grade pipeline for object detection - 23262 Murphy 2025-03-22
Summer Olympic Games Through the Lens of Data
Using Python and Wikipedia to draw geographical and network maps of the medal-winning countries. - 22011 Murphy 2025-03-22
Battling Open Book Exams with Open Source LLMs
In the age where everyone uses ChatGPT for work and school, I am taking advantage of that to help me study in my university course - 21324 Murphy 2025-03-22
You Don't Need Matplotlib When Pandas Is Enough for Data Visualisation
One line of code to plot data makes routine EDA jobs easier - 22749 Murphy 2025-03-22
The current state of continual learning in AI
Why is ChatGPT only trained up until 2021?
Optimizing Pandas Code: The Impact of Operation Sequence
Learn how to rearrange your code to achieve significant speed improvements.