Unbox the Cox: Intuitive Guide to Cox Regressions

Author: Murphy | View: 21942 | Time: 2025-03-23 18:26:08

Introduction

The goal of Cox Regression is to model the relationship between predictor variables and the time it takes for an event to happen, typically an event that can happen only once. Let's dive into a made-up dataset with 5 subjects, labeled A to E. During the study, each subject either experienced the event (event = 1) or did not (event = 0). On top of that, each subject was assigned a single predictor, let's call it x, before the study began. As a practical example, if we are tracking deaths, then x could be the dosage of a drug we are testing to see whether it helps people live longer by delaying the time until death.

import pandas as pd
import numpy as np

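# Made-up survival data: one row per subject
sample_df = pd.DataFrame({
    'subject': ['A', 'B', 'C', 'D', 'E'],
    'time': [1, 3, 5, 4, 6],           # time of the event, or of last follow-up if event = 0
    'event': [1, 1, 1, 1, 0],          # 1 = event observed, 0 = event not observed
    'x': [-1.7, -0.4, 0.0, 0.9, 1.2],  # single predictor, assigned before the study
})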

sample_df
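
To make the time and event columns concrete, here is a minimal sketch that lists the same five subjects in order of their recorded time and labels each time as either an observed event time (event = 1) or a last-known follow-up time (event = 0). It uses plain Python tuples rather than the DataFrame so that it runs on its own; the names `subjects`, `by_time`, `with_event`, and `without_event` are illustrative and not part of the original code.

```python
# Same five made-up subjects as in sample_df, written as plain tuples
# (subject, time, event, x) so the sketch has no dependencies.
subjects = [
    ("A", 1, 1, -1.7),
    ("B", 3, 1, -0.4),
    ("C", 5, 1, 0.0),
    ("D", 4, 1, 0.9),
    ("E", 6, 0, 1.2),
]

# Order subjects by their recorded time.
by_time = sorted(subjects, key=lambda row: row[1])

# event == 1: the recorded time is an observed event time.
with_event = [name for name, time, event, x in by_time if event == 1]
# event == 0: the recorded time is only the last moment the subject was followed.
without_event = [name for name, time, event, x in by_time if event == 0]

for name, time, event, x in by_time:
    status = "event observed" if event == 1 else "no event observed"
    print(f"{name}: time={time}, x={x:+.1f}, {status}")

print("with event:", with_event)
print("without event:", without_event)
```

Note that sorting by time places D (time 4) before C (time 5), even though C comes first in the table.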

In this dataset, subject E did not experience the event during the study, so we set event = 0, and the time assigned is the last moment at which we had information about them. This kind of data is called "censored" because we do not know whether the event happened after the study ended. To make it easier to understand, a cool "lollipop"

Tags: Cox Proportional Hazards, Cox Regression, Data Science, Maximum Likelihood, Statistics