DeepGuide for DeepSeek

This Is How LLMs Break Down the Language
The science and art behind tokenization
21494 Murphy 2025-03-22
One-Tailed Vs. Two-Tailed Tests
Choosing between one- and two-tailed hypotheses affects every stage of A/B testing. Learn why the hypothesis direction matters and explore the pros and cons of each approach.
25031 Murphy 2025-03-22
LettuceDetect: A Hallucination Detection Framework for RAG Applications
How to capitalize on ModernBERT’s extended context window to build a token-level classifier for hallucination detection
21268 Murphy 2025-03-22
How to Spot and Prevent Model Drift Before it Impacts Your Business
3 essential methods to track model drift you should know
23601 Murphy 2025-03-22
Experiments Illustrated: How We Optimized Premium Listings on Our Nursing Job Board
Also, how georandomization can help clean up spillovers
29879 Murphy 2025-03-22
Experiments Illustrated: How Random Assignment Saved Us $1M in Marketing Spend
Also, a casual intro to the multiple comparisons problem
21035 Murphy 2025-03-22
Are You Still Using LoRA to Fine-Tune Your LLM?
A look at this year’s crop of LoRA alternatives
25553 Murphy 2025-03-22
Linear Regression in Time Series: Sources of Spurious Regression
Why does the autocorrelation of the error term matter?
28976 Murphy 2025-03-22
From Fuzzy to Precise: How a Morphological Feature Extractor Enhances AI's Recognition Capabil
Mimicking human visual perception to truly understand objects
26368 Murphy 2025-03-22
The Impact of GenAI and Its Implications for Data Scientists
What we can learn from Anthropic’s analysis of millions of Claude.ai chats
28400 Murphy 2025-03-22
Mastering Hadoop, Part 3: Hadoop Ecosystem: Get the most out of your cluster
Exploring the Hadoop ecosystem — key tools to maximize your cluster’s potential
21645 Murphy 2025-03-22
Mastering Hadoop, Part 2: Getting Hands-On — Setting Up and Scaling Hadoop
Understanding Hadoop’s core components before installation and scaling
29119 Murphy 2025-03-22
Six Organizational Models for Data Science
Setting a team up for success or failure
26135 Murphy 2025-03-22
Platform-Mesh, Hub and Spoke, and Centralised | 3 Types of data team
Why understanding team structure is critical for data and AI
29695 Murphy 2025-03-22
Fourier Transform Applications in Literary Analysis
How mathematics and data analysis can offer a head start to analysing poetry, before even reading the words.
24170 Murphy 2025-03-22
How to Make Your LLM More Accurate with RAG & Fine-Tuning
And when to use which one
24388 Murphy 2025-03-22
Mastering the Poisson Distribution: Intuition and Foundations
Take a dive into the foundations and exemplifying use cases of the Poisson distribution
23030 Murphy 2025-03-22
Anatomy of a Parquet File
Parquet from scratch: A Python deep dive into a raw parquet file
22495 Murphy 2025-03-22
Heatmaps for Time Series
Visualizing trends and outliers with non-linear color scales
24175 Murphy 2025-03-22
Algorithm Protection in the Context of Federated Learning
A pragmatic look into protecting algorithms and models deployed into real-world federated analysis and learning settings in healthcare.
22115 Murphy 2025-03-22

The current state of continual learning in AI
Why is ChatGPT only trained up until 2021?
Optimizing Pandas Code: The Impact of Operation Sequence
Learn how to rearrange your code to achieve significant speed improvements.

Recommend

◦ The Docker Compose of ETL: Meerschaum Compose

◦ Visualize a business process through data serialization

◦ An Illusion of Life

◦ Efficient Metric Collection in PyTorch: Avoiding the Performance Pitfalls of TorchMetrics

◦ Extracting Information from Natural Language Using Generative AI

◦ A Requiem for the Transformer?

◦ ChatGPT vs. Claude vs. Gemini for Data Analysis (Part 1)

◦ Log Breadcrumbs: only show Logs leading up to an Error

◦ OCR-Free Document Data Extraction with Transformers (2/2)

◦ An Intuitive Guide to Docker for Data Science

◦ Fine-Tuning BERT for Text Classification

◦ Unlocking Growth: 3 Years at Meta - Transformative Lessons for Work and Life