Using Python to compute relative risk (risk ratio) from a DataFrame with the zEpid package (replicating riskratio from the R epitools package)

Solution 1:

The error is a result of how RiskRatio is parsing your input data set behind the scenes.

When using RiskRatio, the default reference category is 0. So, when your independent variable is processed internally, zEpid looks for age_group=0. However, there are no instances of 0 in your data set.

To fix this, you can specify the optional argument reference. By default, reference=0, but you can set it to 1, which makes age_group=1 the reference group for the risk ratio.

The following is a simple example using simulated data with an exposure 'A' and an outcome 'Y':

import numpy as np
import pandas as pd
from scipy.stats import norm
from zepid import RiskRatio

# Generating some generic data
np.random.seed(20220120)
df = pd.DataFrame()
df['A'] = np.random.randint(1, 4, size=80)           # Note: A \in {1,2,3}
df['Y'] = np.random.binomial(n=1, p=0.25, size=80)   # Note: Y \in {0,1}

# Estimating Risk Ratios with zEpid
rr = RiskRatio(reference=1)
rr.fit(df, exposure='A', outcome='Y')

# Calculating two-sided Wald p-values ([1:] skips the reference row)
est = rr.results['RiskRatio'][1:]
std = rr.results['SD(RR)'][1:]
z_score = np.log(est)/std
p_value = norm.sf(abs(z_score))*2

# Displaying results
print("RR:     ", list(est))
print("P-value:", p_value)

This should output the following:

RR:      [1.0266666666666666, 0.7636363636363636]
P-value: [0.93990517 0.5312407 ]
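
For intuition, the same kind of calculation can be done by hand for a single exposed group versus the reference. This is a minimal sketch using the textbook large-sample formula for the standard error of log(RR); it is not taken from zEpid's source. The function name risk_ratio_wald and the counts are hypothetical, chosen only for illustration.

```python
import math

def risk_ratio_wald(events_exp, n_exp, events_ref, n_ref):
    """Risk ratio of an exposed group vs. a reference group, with a Wald p-value."""
    risk_exp = events_exp / n_exp
    risk_ref = events_ref / n_ref
    rr = risk_exp / risk_ref
    # Large-sample standard error of log(RR)
    se_log_rr = math.sqrt(1 / events_exp - 1 / n_exp + 1 / events_ref - 1 / n_ref)
    z = math.log(rr) / se_log_rr
    # Two-sided p-value from the standard normal; same as norm.sf(abs(z)) * 2
    p = math.erfc(abs(z) / math.sqrt(2))
    return rr, se_log_rr, p

# Hypothetical counts: 10 of 40 with the outcome among the exposed, 5 of 40 in the reference
rr, se, p = risk_ratio_wald(10, 40, 5, 40)
print(f"RR = {rr:.3f}, SE(log RR) = {se:.3f}, p = {p:.3f}")
```

Note that the formula divides by the number of events in each group, which is why a group with zero events cannot serve as the reference.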

I generated some generic data rather than use the example data set you provided because that data has another issue that will also result in an error. Below is a 3-by-2 table of the data set (age_group by adhd_parent):

adhd_parent   0   1
age_group          
1            62   0
2             0  32
3             0   6

These structural zeroes in the data will throw a PositivityError in zEpid. Basically, you can't calculate the risk ratio because of a division by zero (the risk in the referent is 0).
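
A quick way to spot this problem before fitting is to compute the risk in each group and flag any that are exactly 0 or 1. The sketch below rebuilds the rows from the table above using only the standard library, and assumes adhd_parent is the outcome and age_group the exposure. With a DataFrame, pd.crosstab gives the same table.

```python
from collections import Counter

# Rebuild the (age_group, adhd_parent) pairs from the table above
rows = [(1, 0)] * 62 + [(2, 1)] * 32 + [(3, 1)] * 6

counts = Counter(rows)
groups = sorted({group for group, _ in rows})

def risk(group):
    """Proportion of rows in this age_group with adhd_parent == 1."""
    events = counts[(group, 1)]
    total = counts[(group, 0)] + counts[(group, 1)]
    return events / total

risks = {g: risk(g) for g in groups}
# A risk of exactly 0 or 1 means one cell of that row is empty
degenerate = [g for g, r in risks.items() if r in (0.0, 1.0)]

print(risks)       # {1: 0.0, 2: 1.0, 3: 1.0}
print(degenerate)  # [1, 2, 3]
```

Here every group is degenerate, so changing the reference category alone would not give a meaningful risk ratio for this data.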
