Control AI Costs Through Agile Data Science Project Management
The world of data science is complex, with hidden costs that go beyond budgetary limits. Data scientists are a significant investment for any organization. Unfortunately, inefficiencies like idle infrastructure can waste significant amounts o… - 25922, Murphy, 2025-03-22
Exploring emotions with Artificial Intelligence, OpenAI, and Exploratory Data Analysis
Here's how to visualize emotion in text with Python using OpenAI and Exploratory Data Analysis - 23691, Murphy, 2025-03-22
Large Language Models, MirrorBERT - Transforming Models into Universal Lexical and Sentence…
Discover how mirror augmentation generates data and boosts BERT's performance on semantic similarity tasks - 26574, Murphy, 2025-03-22
Nonlinear Dimension Reduction, Kernel PCA (kPCA), and Multidimensional Scaling – An Easy Tutor
How to Flatten your Swiss-Roll without Destroying It!! - 24621, Murphy, 2025-03-22
Exploding & Vanishing Gradient Problem in Deep Learning
How to ensure your neural network doesn't "die" or "blow up" - 29692, Murphy, 2025-03-22
TiDE: the 'embarrassingly' simple MLP that beats Transformers
A deep exploration of TiDE, its implementation using Darts, and a real-life use case comparison with DeepAR and TFT (a Transformer architecture). As industries continue to evolve, accurate forecasting becomes a non-negotiable asset whet… - 23163, Murphy, 2025-03-22
PyTorch Introduction – Building your First Linear Model
Learn how to build your first PyTorch model using the "magical" Linear layer! - 20471, Murphy, 2025-03-22
Geometrical Interpretation of Linear Regression in Machine Learning versus Classical Statistics
Clearing up the confusion about Linear Regression, Visually and Analytically - 22409, Murphy, 2025-03-22
AI Consciousness Unfolded
Challenging the Integrated Information Theory - 20498, Murphy, 2025-03-22
Add One Line of SQL to Optimise Your BigQuery Tables
Clustering: A simple way to group similar rows and prevent unnecessary data processing - 26625, Murphy, 2025-03-22
End-of-Year Report on a 12-Year Data Journey
Three stories about the data career journey - 26039, Murphy, 2025-03-22
Courage to Learn ML: Demystifying L1 & L2 Regularization (part 4)
Explore L1 & L2 Regularization as Bayesian Priors - 24556, Murphy, 2025-03-22
Can LLMs Replace Data Analysts? Building An LLM-Powered Analyst
Part 1: empowering ChatGPT with tools - 20584, Murphy, 2025-03-22
Create Many-To-One relationships Between Columns in a Synthetic Table with PySpark UDFs
I’ve recently been playing around with Databricks Labs Data Generator to create completely synthetic datasets from scratch. As part of this, I’ve looked at building sales data around different stores, employees, a… - 21384, Murphy, 2025-03-22
Earth Isn't Flat, and Neither Should Your Voronoi Diagrams Be
A story about precision: unveiling the power of spherical geospatial Voronoi diagrams with Python - 26280, Murphy, 2025-03-22
3 Python Operations for Solving Specific Data Processing Tasks Efficiently
Leverage the flexibility of Pandas and Python - 27040, Murphy, 2025-03-22
How to efficiently fine-tune your own open-source LLM using novel techniques – code provided
In this article, I fine-tune a base Llama 2 LLM to output SQL code, using parameter-efficient fine-tuning (PEFT) techniques to optimise the process. - 28719, Murphy, 2025-03-22
The Surprising Behavior of Data in Higher Dimensions
A Journey into the Surprising World of High-Dimensional Data: The Blessings and the Challenges - 28412, Murphy, 2025-03-22
MLX vs MPS vs CUDA: a Benchmark
A first benchmark of Apple's new ML framework MLX - 28407, Murphy, 2025-03-22
Revolutionizing Language Barriers: Mastering Multilingual Audio Transcription and Semantic Search
Unlock the potential of cross-language information accessibility with advanced transcription and semantic search technologies - 22760, Murphy, 2025-03-22
The current state of continual learning in AI
Why is ChatGPT only trained up until 2021?
Optimizing Pandas Code: The Impact of Operation Sequence
Learn how to rearrange your code to achieve significant speed improvements.