3 D Statistical Learning

Client X, a health insurance company, maintains contact with physicians who have opted in to email communication with the company. Customer-centricity is a priority for X, so online communication is tailored to each customer's preferences and requirements. To that end, the marketing team produces materials that comply with the AMG (Arzneimittelgesetz) and HWG (Heilmittelwerbegesetz) regulations and uses them for online communication.

The marketing team is currently exploring the following questions:

What content resonates exceptionally well?
What is the optimal timing for sending emails?
When do our customers typically open our emails?

Dataset Description

For the project, you will be provided with the following data:

  1. Email Performance Data: The file ‘Email_Performance.csv’ contains simulated data on X’s email communication with customers about various products. It records which email was sent to which physician, for which product, and at what time, and whether the email was opened or a link within it was clicked.
  2. Account Cluster: The file ‘Account_Cluster.csv’ categorizes each physician’s email address as private, practice, or clinic, and indicates whether the physician wishes to communicate with Boehringer Ingelheim using their practice, clinic, or private email address. If the information cannot be clearly assigned, it is labeled ‘not clearly defined’.
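
As a first orientation, the two files could be joined and summarized along the lines of the sketch below, which computes the open rate per account cluster and hour of sending. The brief does not list the actual column names, so `physician_id`, `sent_at`, `opened`, and `cluster` are assumptions that would need to be replaced with the real headers, and the label `unmatched` for physicians missing from the cluster file is likewise a placeholder. Note that this looks at the send time only; answering when customers open their emails would require an open timestamp, which the data may or may not contain.

```python
import csv
from collections import defaultdict
from datetime import datetime

# All column names are assumptions; the brief does not specify the schema.
PHYSICIAN_COL = "physician_id"  # hypothetical join key present in both files
SENT_COL = "sent_at"            # hypothetical ISO timestamp of the send
OPENED_COL = "opened"           # hypothetical flag: "1" if opened, else "0"
CLUSTER_COL = "cluster"         # hypothetical category of the email address


def read_rows(path):
    """Read a CSV file into a list of dicts keyed by header name."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def open_rate_by_cluster_and_hour(emails, clusters):
    """Return {(cluster, send_hour): open_rate} for the given rows."""
    cluster_of = {row[PHYSICIAN_COL]: row[CLUSTER_COL] for row in clusters}
    sent = defaultdict(int)
    opened = defaultdict(int)
    for row in emails:
        hour = datetime.fromisoformat(row[SENT_COL]).hour
        # Physicians absent from the cluster file get a placeholder label.
        key = (cluster_of.get(row[PHYSICIAN_COL], "unmatched"), hour)
        sent[key] += 1
        opened[key] += int(row[OPENED_COL])
    return {key: opened[key] / sent[key] for key in sent}
```

With the real files in place, a call such as `open_rate_by_cluster_and_hour(read_rows("Email_Performance.csv"), read_rows("Account_Cluster.csv"))` would return one rate per cluster and hour, which can then be sorted or plotted to compare sending times.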

Approach