
Single Feature Contribution

Imports

[1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

from deepchecks.checks.methodology import SingleFeatureContribution
from deepchecks.base import Dataset

Generating data:

[2]:
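# Three independent standard-normal features
df = pd.DataFrame(np.random.randn(100, 3), columns=['x1', 'x2', 'x3'])
# x4 is a linear mix of x1 and x2
df['x4'] = df['x1'] * 0.5 + df['x2']
# The label depends mostly on x2, with a small contribution from x1
df['label'] = df['x2'] + 0.1 * df['x1']
# x5 is derived directly from the sign of the label, so it leaks the label
df['x5'] = df['label'].apply(lambda x: 'v1' if x < 0 else 'v2')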
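The generated features have a known relationship to the label: x2 dominates it, x4 mixes x1 and x2, x3 is independent noise, and x5 is computed from the label itself. The sketch below is a quick sanity check of that structure using plain pandas. It is not part of the original notebook; the name df_demo and the fixed seed are illustrative assumptions, added only so the check is reproducible.

```python
import numpy as np
import pandas as pd

# Rebuild the same synthetic data with a fixed seed (the seed is an assumption
# made here for reproducibility; the notebook itself does not set one).
rng = np.random.default_rng(0)
df_demo = pd.DataFrame(rng.standard_normal((100, 3)), columns=['x1', 'x2', 'x3'])
df_demo['x4'] = df_demo['x1'] * 0.5 + df_demo['x2']
df_demo['label'] = df_demo['x2'] + 0.1 * df_demo['x1']
df_demo['x5'] = df_demo['label'].apply(lambda x: 'v1' if x < 0 else 'v2')

# Pearson correlation of each numeric feature with the label
corr = df_demo[['x1', 'x2', 'x3', 'x4']].corrwith(df_demo['label'])
print(corr.round(2))

# x5 is a pure function of the label: 'v1' exactly when the label is negative
x5_is_sign = bool(((df_demo['label'] < 0) == (df_demo['x5'] == 'v1')).all())
print(x5_is_sign)
```

Features that are strongly tied to the label here (x2, x4, and the label-derived x5) are the ones a single-feature predictive score would be expected to flag.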

[3]:
ds = Dataset(df, label='label')

Running single_feature_contribution check:

[4]:
SingleFeatureContribution().run(ds)

Single Feature Contribution

Return the PPS (Predictive Power Score) of all features in relation to the label.

Additional Outputs
[Figure: PPS of each feature in relation to the label]
The PPS represents the ability of a feature to single-handedly predict another feature or label.
A high PPS (close to 1) can mean that this feature's success in predicting the label is actually due to data
leakage - meaning that the feature holds information that is based on the label to begin with.
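To make the idea of a single feature predicting the label on its own more concrete, here is a minimal sketch of a PPS-style score for a numeric feature and a numeric label. It is an illustration under stated assumptions, not the ppscore or deepchecks implementation: as far as the ppscore package documents it, the real score fits a decision tree with cross-validation, whereas this sketch swaps in a simple binned-median predictor and a single train/test split. The function name binned_median_pps and all of its parameters are hypothetical.

```python
import numpy as np

def binned_median_pps(x, y, n_bins=10, test_fraction=0.25, seed=0):
    """PPS-style score in [0, 1] for predicting numeric y from numeric x.

    Score = 1 - MAE(model) / MAE(naive median baseline), clipped at 0.
    The model here is a binned-median predictor (an assumption for this sketch).
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Random train/test split
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_test = int(len(x) * test_fraction)
    test_idx, train_idx = order[:n_test], order[n_test:]

    # Quantile bin edges computed on the training split only
    edges = np.quantile(x[train_idx], np.linspace(0, 1, n_bins + 1)[1:-1])
    train_bins = np.digitize(x[train_idx], edges)
    test_bins = np.digitize(x[test_idx], edges)

    # Naive baseline: always predict the training median of the label
    naive_pred = np.median(y[train_idx])

    # Per-bin median of the label; empty bins fall back to the naive prediction
    bin_medians = np.full(n_bins, naive_pred)
    for b in range(n_bins):
        in_bin = train_bins == b
        if in_bin.any():
            bin_medians[b] = np.median(y[train_idx][in_bin])

    mae_model = np.mean(np.abs(y[test_idx] - bin_medians[test_bins]))
    mae_naive = np.mean(np.abs(y[test_idx] - naive_pred))
    if mae_naive == 0:
        return 0.0
    return float(max(0.0, 1.0 - mae_model / mae_naive))

# Usage on data shaped like the example above (larger sample for stability)
rng = np.random.default_rng(42)
x1, x2, x3 = rng.standard_normal((3, 1000))
label = x2 + 0.1 * x1
score_x2 = binned_median_pps(x2, label)  # x2 drives the label
score_x3 = binned_median_pps(x3, label)  # x3 is unrelated noise
print(score_x2, score_x3)
```

A feature that drives the label, such as x2, should score well above an unrelated one such as x3. A score near 1 would mean the feature alone almost fully determines the label, which is the situation the check highlights as possible leakage.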

Passing parameters to ppscore through the SingleFeatureContribution check class (here, sampling 10 rows):

[5]:
my_check = SingleFeatureContribution(ppscore_params={'sample': 10})
my_check.run(dataset=ds)

Single Feature Contribution

Return the PPS (Predictive Power Score) of all features in relation to the label.

Additional Outputs
[Figure: PPS of each feature in relation to the label, computed with sample=10]
The PPS represents the ability of a feature to single-handedly predict another feature or label.
A high PPS (close to 1) can mean that this feature's success in predicting the label is actually due to data
leakage - meaning that the feature holds information that is based on the label to begin with.