Reshaping the Model's Memory without the Need for Retraining
Erasing any echo of problematic content a large language model has learned- 23155Murphy2025-03-23
What Happens When Most Content Online Becomes AI-Generated?
Learn how generative models deteriorate when trained on the data they generate, and what to do about it- 23774Murphy2025-03-23
Make a Nested Bar Chart with Seaborn
Quick Success Data Science. A nested bar chart is a visualization method that compares multiple measurements within categories. One of these measurements represents a secondary or background measure, such as a- 28770Murphy2025-03-23
Cleaning a Messy Car Dataset with Python Pandas
Whether you are performing exploratory data analysis or building a complex ML system, you need to make sure the data is cleaned- 22945Murphy2025-03-23
Synergy of LLM and GUI, Beyond the Chatbot
Use OpenAI GPT function calling to drive your mobile app- 28781Murphy2025-03-23
How to Train BERT for Masked Language Modeling Tasks
Hands-on guide to building a language model for MLM tasks from scratch using Python and the Transformers library- 24269Murphy2025-03-23
5 Lessons Learned from Testing Databricks SQL Serverless + DBT
By: Jeff Chou, Stewart Bryson. Databricks’ SQL warehouse products are a compelling offering for companies looking to streamline their production SQL queries and warehouses. However, as usage scales up, the cost and performance of these systems become- 25065Murphy2025-03-23
SaaS AI Features Meet Applications Without Moats
Back in July, we dug into generative AI startups from Y Combinator’s W23 batch – specifically, the startups leveraging large language models (LLMs) like GPT, which powers ChatGPT. We identified some big trends with these startups – like fo- 27533Murphy2025-03-23
Understanding Retention with Gradio
How to leverage web applications for analytics- 28412Murphy2025-03-23
Sneaky Science: Data Dredging Exposed
Delve into the motivations and consequences of P-hacking- 22703Murphy2025-03-23
5 Ideas to Foster Data Scientists/Analysts Engagement Without Suffocating in Meetings
The author shares strategies they have implemented to strike this balance successfully- 22930Murphy2025-03-23
Exploring a Global Wildlife GIS database
Using Python to characterize the geospatial database of the International Union for Conservation of Nature (IUCN).- 28708Murphy2025-03-23
The current state of continual learning in AI
Why is ChatGPT only trained up until 2021?- 25740Murphy2025-03-23
Nine Rules to Formally Validate Rust Algorithms with Dafny (Part 2)
By Carl M. Kadie and Divyanshu Ranjan. This is Part 2 of an article formally verifying a Rust algorithm using Dafny. We look at rules 7 to 9: Port your Real Algorithm to Dafny. Validate the Dafny Version of Your Algorithm. Rework Your Validation for Reliab- 25399Murphy2025-03-23
Python Decorators: A Comprehensive Guide
The article introduces the amazingly powerful syntactic sugar of Python: decorators.- 21207Murphy2025-03-23
Building a Batch Data Pipeline with Athena and MySQL
An End-To-End Tutorial for Beginners- 23711Murphy2025-03-23
In-Depth Guide to Creating and Publishing an R Data Package Using Devtools
A step-by-step account of developing my "Richmondway" R Data package, featuring the Expletives Count by Roy Kent.- 27622Murphy2025-03-23
Advanced Python: Dot Operator
The operator that enables the object-oriented paradigm in Python- 21497Murphy2025-03-23
Large Language Models: TinyBERT – Distilling BERT for NLP
Unlocking the power of Transformer distillation in LLMs- 20555Murphy2025-03-23
Python for Data Engineers
Advanced ETL techniques for beginners- 29225Murphy2025-03-23
Optimizing Pandas Code: The Impact of Operation Sequence
Learn how to rearrange your code to achieve significant speed improvements.