Spoiler Alert: The Magic of RAG Does Not Come from AI
Why retrieval, not generation, makes RAG systems magical - 23219 Murphy 2025-03-22
Techniques for Chat Data Analytics with Python
Part II: Topic Extraction with BERTopic - 28624 Murphy 2025-03-22
NLP Illustrated, Part 1: Text Encoding
An illustrated guide to text-to-number translation, with code - 28772 Murphy 2025-03-22
How to Reduce Python Runtime for Demanding Tasks
Practical techniques to accelerate heavy workloads with GPU optimization in Python - 20610 Murphy 2025-03-22
Your Data Quality Checks Are Worth Less (Than You Think)
How to deliver outsized value on your data quality program - 20210 Murphy 2025-03-22
From Local to Cloud: Estimating GPU Resources for Open-Source LLMs
Estimating GPU memory for deploying the latest open-source LLMs - 29729 Murphy 2025-03-22
Linear programming: Integer Linear Programming with Branch and Bound
Part 4: Extending linear programming optimization to discrete decision variables - 20921 Murphy 2025-03-22
Data Validation with Pandera in Python
Validating your Dataframes for Production ML Pipelines - 21426 Murphy 2025-03-22
Boost Your Python Code with CUDA
Target your GPU easily with Numba's CUDA JIT - 26654 Murphy 2025-03-22
Third-Year Work Anniversary as a Data Scientist: Growth, Reflections and Acceptance
A letter to myself and fellow data scientists - 26925 Murphy 2025-03-22
Creating a frontend for your ML application with Vercel V0
Develop an appealing frontend application using v0 by Vercel - 21273 Murphy 2025-03-22
How to Build a Data-Driven Customer Management System
A High-Level Framework for Building a CBM System with Strategic Impact - 21019 Murphy 2025-03-22
Is ReFT All We Needed?
Representation Finetuning - Beyond the PEFT Techniques for fine-tuning LLMs - 25782 Murphy 2025-03-22
Navigating Networks with NetworkX: A Short Guide to Graphs in Python
Explore NetworkX for building, analyzing, and visualizing graphs in Python. Discovering Insights in Connected Data. - 22566 Murphy 2025-03-22
How to Easily Deploy a Local Generative Search Engine Using VerifAI
An open-source initiative to help you deploy generative search based on your local files and self-hosted (Mistral, Llama 3.x) or commercial... - 22446 Murphy 2025-03-22
How to Answer Business Questions with Data
Data analysis is the key to drive business decisions through answering abstract business questions but it's hard to get right - 26343 Murphy 2025-03-22
Getting Started with Multimodal AI, One-Hot Encoding, and Other Beginner-Friendly Guides
Our weekly selection of must-read Editors' Picks and original features - 26411 Murphy 2025-03-22
Increasing Transformer Model Efficiency Through Attention Layer Optimization
How paying "better" attention can drive ML cost savings - 20964 Murphy 2025-03-22
The Metrics of Continual Learning
These three metrics are commonly used - 25872 Murphy 2025-03-22
Collision Risk in Hash-Based Surrogate Keys
Various aspects and real-life analogies of the odds of having a hash collision when computing Surrogate Keys using MD5, SHA-1, and SHA-256. - 28703 Murphy 2025-03-22
The current state of continual learning in AI
Why is ChatGPT only trained up until 2021?
Optimizing Pandas Code: The Impact of Operation Sequence
Learn how to rearrange your code to achieve significant speed improvements.