Introduction
In the social sciences, health sciences, and fields like banking and insurance, logistic regression remains a popular method for prediction and inference, especially when the outcome variable is binary. Logistic regression, a special case of generalized linear models (GLMs), owes its popularity partly to the straightforward interpretation of its coefficients as log-odds.
Logistic regression employs the logit link function, which models the log-odds of the outcome as a linear combination of the predictors. However, the logit link is not always the most suitable choice.
Alternative link functions such as probit and cloglog (complementary log-log), which are also valid links for a binomial GLM, offer potential advantages. Although this article does not delve deeply into the mechanics of probit and cloglog, we recommend considering them in applied research, since they are readily available in most statistical software packages. Let us compare the three link functions using a real-world dataset.
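Before turning to the data, it may help to see the three links written out. The following is a minimal sketch using only the Python standard library. The function names (logit, probit, cloglog, their inverses, and compare_links) are illustrative choices made here, not taken from the article or from any particular statistics package.

```python
import math
from statistics import NormalDist

# Standard normal distribution (mean 0, sd 1), used by the probit link.
_STD_NORMAL = NormalDist()


def logit(p):
    """Logit link: log-odds of p, for 0 < p < 1."""
    return math.log(p / (1.0 - p))


def probit(p):
    """Probit link: standard normal quantile of p, for 0 < p < 1."""
    return _STD_NORMAL.inv_cdf(p)


def cloglog(p):
    """Complementary log-log link: log(-log(1 - p)), for 0 < p < 1."""
    return math.log(-math.log(1.0 - p))


def inv_logit(eta):
    """Inverse logit: maps a linear predictor back to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def inv_probit(eta):
    """Inverse probit: standard normal CDF of the linear predictor."""
    return _STD_NORMAL.cdf(eta)


def inv_cloglog(eta):
    """Inverse cloglog: 1 - exp(-exp(eta))."""
    return 1.0 - math.exp(-math.exp(eta))


def compare_links(probs):
    """Return (p, logit(p), probit(p), cloglog(p)) for each p in probs."""
    return [(p, logit(p), probit(p), cloglog(p)) for p in probs]


if __name__ == "__main__":
    print(f"{'p':>6} {'logit':>9} {'probit':>9} {'cloglog':>9}")
    for p, lg, pb, cl in compare_links([0.05, 0.25, 0.5, 0.75, 0.95]):
        print(f"{p:>6.2f} {lg:>9.4f} {pb:>9.4f} {cl:>9.4f}")
```

Running the script prints each probability next to its value on the three link scales. The logit and probit links are symmetric around p = 0.5 (the values at p and 1 - p differ only in sign), whereas cloglog is not, which is one reason it can behave differently when the outcome is rare or very common.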
Dataset Overview
The analysis is based on data collected from 500 newborns at a London hospital. Each infant was evaluated for low birth weight (defined as less than 2500 grams). The binary outcome variable, lowbw, is coded as:
- lowbw = 1: Low birth weight
- lowbw = 0: Normal birth weight
Potential Influencing Factors
The dataset includes the following predictors:
- Sex of the infant (sex): 1 = Male, 2 = Female
- Gestational age (weeks): Measured in weeks
- Mother’s age (age): Recorded in years
- Maternal hypertension (hyp): 1 = Mother has hypertension, 0 = Mother does not have hypertension
Objective
The goal is to examine the impact of maternal age and hypertension on the likelihood of low birth weight. Specifically, we aim to determine how these factors contribute to the probability of a newborn weighing less than 2500 grams.
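One way to write this objective as a model is sketched below. It assumes, which this excerpt does not confirm, that only age and hyp enter the linear predictor and that sex and weeks are left out. The symbols \pi_i, \beta_0, \beta_1, \beta_2, and g are notation introduced here for illustration.

```latex
\begin{align*}
\pi_i &= \Pr(\mathrm{lowbw}_i = 1), \\
g(\pi_i) &= \beta_0 + \beta_1\,\mathrm{age}_i + \beta_2\,\mathrm{hyp}_i, \\
g(\pi) &=
\begin{cases}
\log\dfrac{\pi}{1-\pi} & \text{(logit)} \\[6pt]
\Phi^{-1}(\pi) & \text{(probit)} \\[6pt]
\log\bigl(-\log(1-\pi)\bigr) & \text{(cloglog)}
\end{cases}
\end{align*}
```

Here \Phi^{-1} is the quantile function of the standard normal distribution. The three candidate models share the same linear predictor and differ only in the choice of g, so the coefficients are on different scales and should not be compared directly across links.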
…
Complete Article on LinkedIn
The full article is available at the following link: