Single Feature Contribution¶
Imports¶
[1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from deepchecks.checks.methodology import SingleFeatureContribution
from deepchecks.base import Dataset
Generating data:¶
[2]:
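# Three independent standard-normal features
df = pd.DataFrame(np.random.randn(100, 3), columns=['x1', 'x2', 'x3'])
# x4 is a linear combination of x1 and x2
df['x4'] = df['x1'] * 0.5 + df['x2']
# The label is driven mostly by x2, with a small contribution from x1
df['label'] = df['x2'] + 0.1 * df['x1']
# x5 is computed from the label itself: a deliberate example of leakage
df['x5'] = df['label'].apply(lambda x: 'v1' if x < 0 else 'v2')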
[3]:
ds = Dataset(df, label='label')
Running single_feature_contribution check:¶
[4]:
SingleFeatureContribution().run(ds)
Single Feature Contribution
Return the PPS (Predictive Power Score) of all features in relation to the label.
Additional Outputs
The Predictive Power Score (PPS) is used to estimate the ability of a feature to predict the label by itself. (Read more about Predictive Power Score) A high PPS (close to 1) can mean that this feature's success in predicting the label is actually due to data leakage - meaning that the feature holds information that is based on the label to begin with.
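To make the idea behind the score concrete, here is a minimal, dependency-free sketch of a PPS-style calculation for one numeric feature and a numeric label. It is not the code that the check or the ppscore package runs. It assumes the usual regression form of the score, 1 - MAE_model / MAE_naive clipped at 0, and swaps in a binned-median predictor with a single train/test split in place of a cross-validated decision tree. The names `simple_pps`, `n_bins`, and `seed` are hypothetical and exist only for this illustration.

```python
import random
import statistics


def simple_pps(xs, ys, n_bins=10, seed=0):
    """Rough PPS-style score for one numeric feature and a numeric label.

    Simplification (an assumption of this sketch): a binned-median predictor
    and a single 70/30 split stand in for a cross-validated decision tree.
    """
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    cut = int(0.7 * len(idx))
    train, test = idx[:cut], idx[cut:]

    # Naive baseline: always predict the training median of the label.
    baseline = statistics.median(ys[i] for i in train)

    # Single-feature model: equal-width bins over the feature,
    # predicting the median training label inside each bin.
    lo = min(xs[i] for i in train)
    hi = max(xs[i] for i in train)
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 for a constant feature

    def bin_of(x):
        return min(n_bins - 1, max(0, int((x - lo) / width)))

    buckets = {}
    for i in train:
        buckets.setdefault(bin_of(xs[i]), []).append(ys[i])
    bin_pred = {b: statistics.median(v) for b, v in buckets.items()}

    def mae(predict):
        return statistics.fmean(abs(ys[i] - predict(xs[i])) for i in test)

    mae_naive = mae(lambda x: baseline)
    # Bins never seen in training fall back to the baseline prediction.
    mae_model = mae(lambda x: bin_pred.get(bin_of(x), baseline))
    if mae_naive == 0:
        return 0.0
    return max(0.0, 1.0 - mae_model / mae_naive)


# Synthetic data shaped like the notebook's: the label is mostly x2.
data_rng = random.Random(1)
x1 = [data_rng.gauss(0, 1) for _ in range(500)]
x2 = [data_rng.gauss(0, 1) for _ in range(500)]
label = [b + 0.1 * a for a, b in zip(x1, x2)]

score_x1 = simple_pps(x1, label)  # weak predictor of the label
score_x2 = simple_pps(x2, label)  # strong predictor of the label
print(f"x1: {score_x1:.2f}, x2: {score_x2:.2f}")
```

In this sketch, a feature that nearly determines the label scores much higher than one that barely influences it. A feature derived from the label itself, like `x5` above, would be expected to score high for the same reason, which is the leakage pattern the check is meant to surface.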
Using the SingleFeatureContribution check class:¶
[5]:
# ppscore_params is forwarded to ppscore; 'sample' limits how many rows are used for scoring
my_check = SingleFeatureContribution(ppscore_params={'sample': 10})
my_check.run(dataset=ds)
Single Feature Contribution
Return the PPS (Predictive Power Score) of all features in relation to the label.
Additional Outputs
The Predictive Power Score (PPS) is used to estimate the ability of a feature to predict the label by itself. (Read more about Predictive Power Score) A high PPS (close to 1) can mean that this feature's success in predicting the label is actually due to data leakage - meaning that the feature holds information that is based on the label to begin with.