CatBoost, a high-performance gradient boosting library, provides the flexibility to define custom metrics that can be tailored to specific business requirements or domain-specific goals. This article demonstrates how to create and use custom metrics in CatBoost for both classification and regression tasks.
We’ll start with a classification example. We’ll use a custom metric based on a profit calculation for a binary classification problem using the Titanic dataset.
Here’s the complete code for the custom classification metric:
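Before looking at the CatBoost-specific class, it may help to see the profit arithmetic on its own. The sketch below is only an illustration: the helper name `profit_from_labels` and the six toy labels are made up for this example, while the payoffs (400 per true positive, 200 lost per false negative, 100 lost per false positive) are the ones used in this article's metric.

```python
import numpy as np


def profit_from_labels(y_true, y_pred, tp_gain=400, fn_cost=200, fp_cost=100):
    """Hypothetical helper: profit from hard 0/1 labels (not part of CatBoost)."""
    y_true = np.asarray(y_true).astype(int)
    y_pred = np.asarray(y_pred).astype(int)
    # Count the confusion-matrix cells that carry a payoff
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    return tp_gain * tp - fn_cost * fn - fp_cost * fp


# Toy labels (assumed for illustration): 2 true positives, 1 false negative, 1 false positive
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0]
example_profit = profit_from_labels(y_true, y_pred)
print(example_profit)  # 400*2 - 200*1 - 100*1 = 500
```

True negatives contribute nothing here, so a model that predicts only the negative class can never earn a positive profit under this payoff scheme.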
from catboost import CatBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix
from scipy.special import expit
import numpy as np
import seaborn as sns

# Load dataset
df = sns.load_dataset('titanic')
X = df[['survived', 'pclass', 'age', 'sibsp', 'fare']]
y = X.pop('survived')

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=100)

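class ProfitMetric:
    @staticmethod
    def get_profit(y_true, y_pred):
        # Apply the logistic function to turn raw scores into probabilities,
        # then threshold at 0.5 to get class labels (casting the probabilities
        # straight to int would truncate almost every value to 0)
        y_pred = (expit(y_pred) > 0.5).astype(int)
        y_true = y_true.astype(int)
        # labels=[0, 1] keeps the matrix 2x2 even if one class is absent
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred, labels=[0, 1]).ravel()
        # Calculate profit
        profit = 400 * tp - 200 * fn - 100 * fp
        return profit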
def…