
Single Feature Contribution

Imports

[1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from deepchecks.checks.methodology import SingleFeatureContribution
from deepchecks.base import Dataset

Generating data:

[2]:
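# Three independent standard-normal features
df = pd.DataFrame(np.random.randn(100, 3), columns=['x1', 'x2', 'x3'])
# x4 is a linear mix of x1 and x2, so it is correlated with the label
df['x4'] = df['x1'] * 0.5 + df['x2']
df['label'] = df['x2'] + 0.1 * df['x1']
# x5 is computed directly from the label, so it leaks label information
df['x5'] = df['label'].apply(lambda x: 'v1' if x < 0 else 'v2')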

[3]:
ds = Dataset(df, label='label')
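
Before running the check, it can help to confirm how the generated columns relate to the label. The following is a minimal sketch using only NumPy. It rebuilds the same columns as plain arrays; the fixed seed, the larger row count (1000 instead of 100), and the names `correlations` and `x5_matches_sign` are assumptions made for illustration and are not part of the original notebook.

```python
import numpy as np

# Assumption: fixed seed and 1000 rows, chosen only to make the sketch stable
rng = np.random.default_rng(42)
x1, x2, x3 = rng.standard_normal((3, 1000))

# Same formulas as in the notebook cell above
x4 = x1 * 0.5 + x2
label = x2 + 0.1 * x1
x5 = np.where(label < 0, 'v1', 'v2')

# Pearson correlation of each numeric feature with the label
correlations = {
    name: float(np.corrcoef(values, label)[0, 1])
    for name, values in {'x1': x1, 'x2': x2, 'x3': x3, 'x4': x4}.items()
}
for name, value in correlations.items():
    print(f"{name}: {value:+.3f}")

# x5 is a pure function of the label's sign, so it reproduces that sign exactly
x5_matches_sign = bool(np.all((x5 == 'v1') == (label < 0)))
print("x5 determined by label sign:", x5_matches_sign)
```

By construction, x2 and x4 should be strongly correlated with the label, x3 should be close to uncorrelated, and x5 is derived entirely from the label, so these are the features one would expect to stand out in the check's output.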

Running the SingleFeatureContribution check:

[4]:
SingleFeatureContribution().run(ds)

Single Feature Contribution

Return the PPS (Predictive Power Score) of all features in relation to the label.

Additional Outputs
../../../_images/examples_checks_methodology_single_feature_contribution_7_1.png
The Predictive Power Score (PPS) estimates how well a single feature can predict the label on its own. A high PPS (close to 1) can mean that the feature's success in predicting the label is actually due to data leakage, meaning that the feature holds information that is derived from the label to begin with.
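
To make the idea of a per-feature score concrete, here is a rough sketch of a PPS-style calculation for a numeric feature and a numeric label. It is not the ppscore implementation: ppscore fits a decision tree with cross-validation, whereas this sketch swaps in a quantile-binned median predictor and a single 50/50 hold-out split so that it needs only NumPy. The function name `pps_like_score` and its parameters are hypothetical. What it shares with the regression form of PPS is the normalization: one minus the ratio of the model's mean absolute error to that of a naive median baseline, floored at zero.

```python
import numpy as np

def pps_like_score(x, y, n_bins=10, seed=0):
    """Rough PPS-style score in [0, 1] for numeric x and numeric y.

    Assumption: a quantile-binned median predictor stands in for the
    decision tree used by ppscore, and half the rows are held out.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Random 50/50 split into training and held-out rows
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    half = len(x) // 2
    train, test = idx[:half], idx[half:]

    # Interior bin edges taken from quantiles of the training feature
    edges = np.quantile(x[train], np.linspace(0, 1, n_bins + 1)[1:-1])
    train_bins = np.digitize(x[train], edges)
    test_bins = np.digitize(x[test], edges)

    # Naive baseline: always predict the training median of the label
    baseline = np.median(y[train])

    # Model: predict the training median of the label within each bin
    bin_pred = np.full(n_bins, baseline)
    for b in range(n_bins):
        mask = train_bins == b
        if mask.any():
            bin_pred[b] = np.median(y[train][mask])

    mae_model = np.mean(np.abs(y[test] - bin_pred[test_bins]))
    mae_naive = np.mean(np.abs(y[test] - baseline))
    if mae_naive == 0:
        return 0.0
    # 1 means perfect prediction, 0 means no better than the baseline
    return max(0.0, 1.0 - mae_model / mae_naive)

# Usage on data shaped like the notebook's (seed and size are assumptions)
rng = np.random.default_rng(0)
x1, x2, x3 = rng.standard_normal((3, 2000))
label = x2 + 0.1 * x1

score_x2 = pps_like_score(x2, label)  # feature that drives the label
score_x3 = pps_like_score(x3, label)  # feature unrelated to the label
print(f"x2: {score_x2:.2f}, x3: {score_x3:.2f}")
```

A feature that drives the label (x2) should score well above a feature that is unrelated to it (x3), which mirrors the ranking the check reports. Exact values will differ from ppscore's because the underlying model and validation scheme are different.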

Using the SingleFeatureContribution check class with custom parameters (here, ppscore_params limits the PPS computation to a sample of 10 rows):

[5]:
my_check = SingleFeatureContribution(ppscore_params={'sample': 10})
my_check.run(dataset=ds)

Single Feature Contribution

Return the PPS (Predictive Power Score) of all features in relation to the label.

Additional Outputs
../../../_images/examples_checks_methodology_single_feature_contribution_9_1.png
The Predictive Power Score (PPS) estimates how well a single feature can predict the label on its own. A high PPS (close to 1) can mean that the feature's success in predicting the label is actually due to data leakage, meaning that the feature holds information that is derived from the label to begin with.