{
"cells": [
{
"cell_type": "markdown",
"id": "093664c2",
"metadata": {},
"source": [
"# Quickstart in 5 minutes"
]
},
{
"cell_type": "markdown",
"id": "20be3899",
"metadata": {
"ExecuteTime": {
"end_time": "2021-12-08T16:58:27.272515Z",
"start_time": "2021-12-08T16:58:27.229029Z"
}
},
"source": [
"In order to run your first Deepchecks Suite all you need to have is the data and model that you wish to validate. More specifically, you need:\n",
"\n",
"- Your train and test data (in Pandas DataFrames or Numpy Arrays)\n",
"- (optional) A [supported model](../../user-guide/supported_models.rst) (including XGBoost, scikit-learn models, and many more). Required for running checks that need the model's predictions for running.\n",
"\n",
"To run your first suite on your data and model, you need only a few lines of code, that start here: [Define a Dataset Object](#Define-a-Dataset-Object)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "2410ae65",
"metadata": {
"ExecuteTime": {
"end_time": "2022-02-03T22:32:08.809372Z",
"start_time": "2022-02-03T22:32:05.953272Z"
},
"execution": {
"iopub.execute_input": "2022-03-03T15:29:17.914166Z",
"iopub.status.busy": "2022-03-03T15:29:17.913360Z",
"iopub.status.idle": "2022-03-03T15:29:20.611330Z",
"shell.execute_reply": "2022-03-03T15:29:20.610130Z"
}
},
"outputs": [],
"source": [
"# If you don't have deepchecks installed yet:\n",
"import sys\n",
"!{sys.executable} -m pip install deepchecks -U --quiet #--user"
]
},
{
"cell_type": "markdown",
"id": "08eb6950",
"metadata": {},
"source": [
"## Load Data, Split Train-Val, and Train a Simple Model"
]
},
{
"cell_type": "markdown",
"id": "6c876505",
"metadata": {},
"source": [
"For the purpose of this guide we'll use the simple iris dataset and train a simple random forest model for multiclass classification:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "58f8f2df",
"metadata": {
"ExecuteTime": {
"end_time": "2022-02-03T22:32:10.256689Z",
"start_time": "2022-02-03T22:32:08.862225Z"
},
"execution": {
"iopub.execute_input": "2022-03-03T15:29:20.618388Z",
"iopub.status.busy": "2022-03-03T15:29:20.617957Z",
"iopub.status.idle": "2022-03-03T15:29:23.834414Z",
"shell.execute_reply": "2022-03-03T15:29:23.833772Z"
}
},
"outputs": [],
"source": [
"# General imports\n",
"import pandas as pd\n",
"import numpy as np\n",
"\n",
"from deepchecks.datasets.classification import iris\n",
"from sklearn.ensemble import RandomForestClassifier\n",
"from sklearn.model_selection import train_test_split\n",
"\n",
"# Load Data\n",
"iris_df = iris.load_data(data_format='Dataframe', as_train_test=False)\n",
"label_col = 'target'\n",
"df_train, df_test = train_test_split(iris_df, stratify=iris_df[label_col], random_state=0)\n",
"\n",
"# Train Model\n",
"rf_clf = RandomForestClassifier()\n",
"rf_clf.fit(df_train.drop(label_col, axis=1), df_train[label_col]);"
]
},
{
"cell_type": "markdown",
"id": "59b7b88b",
"metadata": {
"ExecuteTime": {
"end_time": "2021-11-03T23:05:52.781081Z",
"start_time": "2021-11-03T23:05:52.770339Z"
}
},
"source": [
"## Define a Dataset Object"
]
},
{
"cell_type": "markdown",
"id": "10ec6717",
"metadata": {},
"source": [
"Initialize the [Dataset object](../../user-guide/dataset_object.rst), stating the relevant metadata about the dataset (e.g. the name for the label column)
\n",
"Check out the Dataset's attributes to see which additional special columns can be declared and used (e.g. date column, index column)."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "a9ef516c",
"metadata": {
"ExecuteTime": {
"end_time": "2022-02-03T22:32:10.294607Z",
"start_time": "2022-02-03T22:32:10.288820Z"
},
"execution": {
"iopub.execute_input": "2022-03-03T15:29:23.838764Z",
"iopub.status.busy": "2022-03-03T15:29:23.838397Z",
"iopub.status.idle": "2022-03-03T15:29:23.843091Z",
"shell.execute_reply": "2022-03-03T15:29:23.842604Z"
}
},
"outputs": [],
"source": [
"from deepchecks import Dataset\n",
"\n",
"# We explicitly state that this dataset has no categorical features, otherwise they will be automatically inferred\n",
"# If the dataset has categorical features, the best practice is to pass a list with their names\n",
"\n",
"ds_train = Dataset(df_train, label=label_col, cat_features=[])\n",
"ds_test = Dataset(df_test, label=label_col, cat_features=[])"
]
},
{
"cell_type": "markdown",
"id": "37425f9d",
"metadata": {},
"source": [
"## Run a Deepchecks Suite"
]
},
{
"cell_type": "markdown",
"id": "86dfb918",
"metadata": {},
"source": [
"### Run the full suite"
]
},
{
"cell_type": "markdown",
"id": "50b93b24",
"metadata": {},
"source": [
"Use the `full_suite` that is a collection of (most of) the prebuilt checks.
\n",
"Check out the [when should you use deepchecks guide](../../user-guide/when_should_you_use.html) for some more info about the existing suites and when to use them."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "df37f340",
"metadata": {
"ExecuteTime": {
"end_time": "2022-02-03T22:32:10.368506Z",
"start_time": "2022-02-03T22:32:10.364161Z"
},
"execution": {
"iopub.execute_input": "2022-03-03T15:29:23.846447Z",
"iopub.status.busy": "2022-03-03T15:29:23.846144Z",
"iopub.status.idle": "2022-03-03T15:29:24.163275Z",
"shell.execute_reply": "2022-03-03T15:29:24.162415Z"
}
},
"outputs": [],
"source": [
"from deepchecks.suites import full_suite\n",
"\n",
"suite = full_suite()"
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "b74de249",
"metadata": {
"ExecuteTime": {
"end_time": "2022-02-03T22:32:18.210920Z",
"start_time": "2022-02-03T22:32:10.642863Z"
},
"execution": {
"iopub.execute_input": "2022-03-03T15:29:24.167568Z",
"iopub.status.busy": "2022-03-03T15:29:24.167280Z",
"iopub.status.idle": "2022-03-03T15:29:28.207285Z",
"shell.execute_reply": "2022-03-03T15:29:28.206586Z"
},
"scrolled": false
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "7f98f43e501f4b03be2c0063c5081f56",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Full Suite: 0%| | 0/36 [00:00, ? Check/s]"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "cd8d89453f274dea948dfd6ada2bd33d",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"VBox(children=(HTML(value='\\n
\\n The suit…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "suite.run(train_dataset=ds_train, test_dataset=ds_test, model=rf_clf)" ] }, { "cell_type": "markdown", "id": "d9272c64", "metadata": {}, "source": [ "### Run the integrity suite" ] }, { "cell_type": "markdown", "id": "0fcc64e8", "metadata": {}, "source": [ "If you still haven't started modeling and just have a single dataset, you can use the ``single_dataset_integrity``:" ] }, { "cell_type": "code", "execution_count": 6, "id": "4a85997c", "metadata": { "ExecuteTime": { "end_time": "2022-02-03T22:32:18.995500Z", "start_time": "2022-02-03T22:32:18.215799Z" }, "execution": { "iopub.execute_input": "2022-03-03T15:29:28.400302Z", "iopub.status.busy": "2022-03-03T15:29:28.400012Z", "iopub.status.idle": "2022-03-03T15:29:28.518923Z", "shell.execute_reply": "2022-03-03T15:29:28.518390Z" } }, "outputs": [ { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "6dc2285caf40487a8f76b8856a33bf79", "version_major": 2, "version_minor": 0 }, "text/plain": [ "Single Dataset Integrity Suite: 0%| | 0/8 [00:00, ? Check/s]" ] }, "metadata": {}, "output_type": "display_data" }, { "data": { "application/vnd.jupyter.widget-view+json": { "model_id": "7f9b07c3a401426fa6f9693cbe33543e", "version_major": 2, "version_minor": 0 }, "text/plain": [ "VBox(children=(HTML(value='\\n
\\n…" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "from deepchecks.suites import single_dataset_integrity\n", "\n", "integ_suite = single_dataset_integrity()\n", "integ_suite.run(ds_train)" ] }, { "cell_type": "markdown", "id": "5e776973", "metadata": {}, "source": [ "## Run a Deepchecks Check" ] }, { "cell_type": "markdown", "id": "6627de72", "metadata": {}, "source": [ "If you want to run a specific check, you can just import it and run it directly.\n", "\n", "Check out the [Check Demonstrations](../examples/index.rst) in the examples or the [API Reference](../../api/index.rst) for more info about the existing checks and their parameters." ] }, { "cell_type": "code", "execution_count": 7, "id": "71547084", "metadata": { "ExecuteTime": { "end_time": "2022-02-03T22:32:19.004423Z", "start_time": "2022-02-03T22:32:18.998802Z" }, "execution": { "iopub.execute_input": "2022-03-03T15:29:28.559620Z", "iopub.status.busy": "2022-03-03T15:29:28.559358Z", "iopub.status.idle": "2022-03-03T15:29:28.562844Z", "shell.execute_reply": "2022-03-03T15:29:28.562168Z" } }, "outputs": [], "source": [ "from deepchecks.checks import TrainTestLabelDrift" ] }, { "cell_type": "code", "execution_count": 8, "id": "0747f780", "metadata": { "ExecuteTime": { "end_time": "2022-02-03T22:32:19.181579Z", "start_time": "2022-02-03T22:32:19.009249Z" }, "execution": { "iopub.execute_input": "2022-03-03T15:29:28.566479Z", "iopub.status.busy": "2022-03-03T15:29:28.566182Z", "iopub.status.idle": "2022-03-03T15:29:28.641538Z", "shell.execute_reply": "2022-03-03T15:29:28.640870Z" } }, "outputs": [ { "data": { "text/html": [ "
Calculate label drift between train dataset and test dataset, using statistical measures. Read More...
Calculate the confusion matrix of the model on the given dataset. Read More...
Calculate drift between the entire train and test datasets using a model trained to distinguish between them. Read More...
Status | \nCondition | \nMore Info | \n
---|---|---|
✓ | \n Drift value is not greater than 0.25 | \n\n |
Check | \nSummary | \n
---|---|
Model Info | \nSummarize given model parameters. Read More... | \n
Columns Info - Train Dataset | \nReturn the role and logical type of each column. Read More... | \n
Columns Info - Test Dataset | \nReturn the role and logical type of each column. Read More... | \n
Confusion Matrix Report - Train Dataset | \nCalculate the confusion matrix of the model on the given dataset. Read More... | \n
Confusion Matrix Report - Test Dataset | \nCalculate the confusion matrix of the model on the given dataset. Read More... | \n
Calibration Metric - Train Dataset | \nCalculate the calibration curve with brier score for each class. Read More... | \n
Calibration Metric - Test Dataset | \nCalculate the calibration curve with brier score for each class. Read More... | \n
Calculate label drift between train dataset and test dataset, using statistical measures. Read More...
Status | \nCondition | \nMore Info | \n
---|---|---|
✓ | \n PSI <= 0.2 and Earth Mover's Distance <= 0.1 for label drift | \n\n |
Check | \nReason | \n
---|---|
Single Value in Column | \nNothing found | \n
Mixed Nulls | \nNothing found | \n
Mixed Data Types | \nNothing found | \n
String Mismatch | \nNothing found | \n
Data Duplicates | \nNothing found | \n
String Length Out Of Bounds | \nNothing found | \n
Special Characters | \nNothing found | \n
Label Ambiguity | \nNothing found | \n
\n The suite is composed of various checks such as: Model Error Analysis, Calibration Score, Trust Score Comparison, etc...
\n Each check may contain conditions (which will result in pass / fail / warning, represented by \n ✓ /\n ✖ /\n !\n )\n as well as other outputs such as plots or tables.
\n Suites, checks and conditions can all be modified. Read more about\n custom suites.\n
Calculate the calibration curve with brier score for each class. Read More...
Status | \nCheck | \nCondition | \nMore Info | \n
---|---|---|---|
✖ | \n Single Feature Contribution Train-Test | \nTrain features' Predictive Power Score is not greater than 0.7 | \nFeatures in train dataset with PPS above threshold: {'petal width (cm)': '0.91', 'petal length (cm)': '0.84'} | \n
✓ | \n Train Test Samples Mix | \nPercentage of test data samples that appear in train data not greater than 10% | \n\n |
✓ | \n Single Feature Contribution Train-Test | \nTrain-Test features' Predictive Power Score difference is not greater than 0.2 | \n\n |
✓ | \n Datasets Size Comparison | \nTest-Train size ratio is not smaller than 0.01 | \n\n |
✓ | \n Whole Dataset Drift | \nDrift value is not greater than 0.25 | \n\n |
✓ | \n Train Test Label Drift | \nPSI <= 0.2 and Earth Mover's Distance <= 0.1 for label drift | \n\n |
✓ | \n Train Test Drift | \nPSI <= 0.2 and Earth Mover's Distance <= 0.1 | \n\n |
✓ | \n Performance Report | \nTrain-Test scores relative degradation is not greater than 0.1 | \n\n |
✓ | \n Model Inference Time - Train Dataset | \nAverage model inference time for one sample is not greater than 0.001 | \n\n |
✓ | \n Unused Features | \nNumber of high variance unused features is not greater than 5 | \n\n |
✓ | \n Model Error Analysis | \nThe performance difference of the detected segments must not be greater than 5% | \n\n |
✓ | \n Simple Model Comparison | \nModel performance gain over simple model is not less than 10% | \n\n |
✓ | \n ROC Report - Test Dataset | \nAUC score for all the classes is not less than 0.7 | \n\n |
✓ | \n ROC Report - Train Dataset | \nAUC score for all the classes is not less than 0.7 | \n\n |
✓ | \n Model Inference Time - Test Dataset | \nAverage model inference time for one sample is not greater than 0.001 | \n\n |
✓ | \n Special Characters - Test Dataset | \nRatio of entirely special character samples not greater than 0.1% | \n\n |
✓ | \n Special Characters - Train Dataset | \nRatio of entirely special character samples not greater than 0.1% | \n\n |
✓ | \n String Length Out Of Bounds - Test Dataset | \nRatio of outliers not greater than 0% string length outliers | \n\n |
✓ | \n String Length Out Of Bounds - Train Dataset | \nRatio of outliers not greater than 0% string length outliers | \n\n |
✓ | \n Data Duplicates - Test Dataset | \nDuplicate data ratio is not greater than 0% | \n\n |
✓ | \n Data Duplicates - Train Dataset | \nDuplicate data ratio is not greater than 0% | \n\n |
✓ | \n String Mismatch - Test Dataset | \nNo string variants | \n\n |
✓ | \n String Mismatch - Train Dataset | \nNo string variants | \n\n |
✓ | \n Mixed Data Types - Test Dataset | \nRare data types in column are either more than 10% or less than 1% of the data | \n\n |
✓ | \n Mixed Nulls - Train Dataset | \nNot more than 1 different null types | \n\n |
✓ | \n Mixed Nulls - Test Dataset | \nNot more than 1 different null types | \n\n |
✓ | \n Single Value in Column - Test Dataset | \nDoes not contain only a single value | \n\n |
✓ | \n Single Value in Column - Train Dataset | \nDoes not contain only a single value | \n\n |
✓ | \n Label Ambiguity - Train Dataset | \nAmbiguous sample ratio is not greater than 0% | \n\n |
✓ | \n String Mismatch Comparison | \nNo new variants allowed in test data | \n\n |
✓ | \n New Label Train Test | \nNumber of new label values is not greater than 0 | \n\n |
✓ | \n Category Mismatch Train Test | \nRatio of samples with a new category is not greater than 0% | \n\n |
✓ | \n Dominant Frequency Change | \nChange in ratio of dominant value in data is not greater than 25% | \n\n |
✓ | \n Mixed Data Types - Train Dataset | \nRare data types in column are either more than 10% or less than 1% of the data | \n\n |
✓ | \n Label Ambiguity - Test Dataset | \nAmbiguous sample ratio is not greater than 0% | \n\n |
Status | \nCheck | \nCondition | \nMore Info | \n
---|---|---|---|
✓ | \n Single Value in Column | \nDoes not contain only a single value | \n\n |
✓ | \n Mixed Nulls | \nNot more than 1 different null types | \n\n |
✓ | \n Mixed Data Types | \nRare data types in column are either more than 10% or less than 1% of the data | \n\n |
✓ | \n String Mismatch | \nNo string variants | \n\n |
✓ | \n Data Duplicates | \nDuplicate data ratio is not greater than 0% | \n\n |
✓ | \n String Length Out Of Bounds | \nRatio of outliers not greater than 0% string length outliers | \n\n |
✓ | \n Special Characters | \nRatio of entirely special character samples not greater than 0.1% | \n\n |
✓ | \n Label Ambiguity | \nAmbiguous sample ratio is not greater than 0% | \n\n |
Summarize given model parameters. Read More...
Parameter | \nValue | \nDefault | \n
---|---|---|
bootstrap | \nTrue | \nTrue | \n
ccp_alpha | \n0.00 | \n0.00 | \n
class_weight | \nNone | \nNone | \n
criterion | \ngini | \ngini | \n
max_depth | \nNone | \nNone | \n
max_features | \nauto | \nauto | \n
max_leaf_nodes | \nNone | \nNone | \n
max_samples | \nNone | \nNone | \n
min_impurity_decrease | \n0.00 | \n0.00 | \n
min_samples_leaf | \n1 | \n1 | \n
min_samples_split | \n2 | \n2 | \n
min_weight_fraction_leaf | \n0.00 | \n0.00 | \n
n_estimators | \n100 | \n100 | \n
n_jobs | \nNone | \nNone | \n
oob_score | \nFalse | \nFalse | \n
random_state | \nNone | \nNone | \n
verbose | \n0 | \n0 | \n
warm_start | \nFalse | \nFalse | \n
Colored rows are parameters with non-default values
Calculate drift between train dataset and test dataset per feature, using statistical measures. Read More...
Status | \nCondition | \nMore Info | \n
---|---|---|
✓ | \n PSI <= 0.2 and Earth Mover's Distance <= 0.1 | \n\n |
Verify test dataset size comparing it to the train dataset size. Read More...
Status | \nCondition | \nMore Info | \n
---|---|---|
✓ | \n Test-Train size ratio is not smaller than 0.01 | \n\n |
\n | Train | \nTest | \n
---|---|---|
Size | \n112 | \n38 | \n
\n The suite is composed of various checks such as: Data Duplicates, String Length Out Of Bounds, Label Ambiguity, etc...
\n Each check may contain conditions (which will result in pass / fail / warning, represented by \n ✓ /\n ✖ /\n !\n )\n as well as other outputs such as plots or tables.
\n Suites, checks and conditions can all be modified. Read more about\n custom suites.\n
Measure model average inference time (in seconds) per sample. Read More...
Status | \nCondition | \nMore Info | \n
---|---|---|
✓ | \n Average model inference time for one sample is not greater than 0.001 | \n\n |
Return the role and logical type of each column. Read More...
\n | target | \npetal length (cm) | \npetal width (cm) | \nsepal length (cm) | \nsepal width (cm) | \n
---|---|---|---|---|---|
role | \nlabel | \nnumerical feature | \nnumerical feature | \nnumerical feature | \nnumerical feature | \n
Calculate the confusion matrix of the model on the given dataset. Read More...
No outputs to show.
" } }, "7d05fcb238c64ac6a57278eaf0fec5ef": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "7d0a5a51e4b448ae8c2a64cd50f6d133": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "7d0df1741e874df685ed01132d9ddf4d": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "7dd6b53c5ec440f98f08e8080d1c427d": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "7e519146387a46a49e6b670caf1c81d9": { "model_module": "@jupyter-widgets/base", "model_module_version": "1.2.0", "model_name": "LayoutModel", "state": { "_model_module": "@jupyter-widgets/base", "_model_module_version": "1.2.0", "_model_name": "LayoutModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "LayoutView", "align_content": null, "align_items": null, "align_self": null, "border": null, "bottom": null, "display": null, "flex": null, "flex_flow": null, "grid_area": null, "grid_auto_columns": null, "grid_auto_flow": null, "grid_auto_rows": null, "grid_column": null, "grid_gap": null, "grid_row": null, "grid_template_areas": null, "grid_template_columns": null, "grid_template_rows": null, "height": null, "justify_content": null, "justify_items": null, "left": null, "margin": null, "max_height": null, "max_width": null, "min_height": null, "min_width": null, "object_fit": null, "object_position": null, "order": null, "overflow": null, "overflow_x": null, "overflow_y": null, "padding": null, "right": null, "top": null, "visibility": null, "width": null } }, "7f924735c8794a6ea78c57b0b6a8f72e": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "DescriptionStyleModel", "state": { "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "DescriptionStyleModel", "_view_count": null, "_view_module": "@jupyter-widgets/base", "_view_module_version": "1.2.0", "_view_name": "StyleView", "description_width": "" } }, "7f98f43e501f4b03be2c0063c5081f56": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HBoxView", "box_style": "", "children": [ "IPY_MODEL_fa7e633f60e2475eb7a7f778806e38c2", "IPY_MODEL_245cebc00df1472e8cfb5518fa205172", "IPY_MODEL_71012ecda8ce407d84871f1a04b37e4e" ], "layout": "IPY_MODEL_afe1f775c0bb42abb3f966e51ce07641" } }, "7f9b07c3a401426fa6f9693cbe33543e": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "VBoxModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "VBoxModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "VBoxView", "box_style": "", "children": [ "IPY_MODEL_5960f8a1784849bba2f1623a58b14c72", "IPY_MODEL_31c3ff7417cc4ccf93ce0607510d6767", "IPY_MODEL_977f7768bfe048e182a5db6894e0688b" ], "layout": "IPY_MODEL_a6c7b31860a944368645c02a7b6da936" } }, "802f4190305d4d6481a0ebb5785c5a62": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_b400a3b9d03e4ce6afe75975434a52ae", "placeholder": "", "style": "IPY_MODEL_c07a4ef9a58b44d58616dc1dd0f264ff", "value": "Calculate the ROC curve for each class. Read More...
Status | \nCondition | \nMore Info | \n
---|---|---|
✓ | \n AUC score for all the classes is not less than 0.7 | \n\n |
Calculate the ROC curve for each class. Read More...
Status | \nCondition | \nMore Info | \n
---|---|---|
✓ | \n AUC score for all the classes is not less than 0.7 | \n\n |
Return the role and logical type of each column. Read More...
\n | target | \npetal length (cm) | \npetal width (cm) | \nsepal length (cm) | \nsepal width (cm) | \n
---|---|---|---|---|---|
role | \nlabel | \nnumerical feature | \nnumerical feature | \nnumerical feature | \nnumerical feature | \n
Compare given model score to simple model score (according to given model type). Read More...
Status | \nCondition | \nMore Info | \n
---|---|---|
✓ | \n Model performance gain over simple model is not less than 10% | \n\n |
Return the Predictive Power Score of all features, in order to estimate each feature's ability to predict the label. Read More...
Status | \nCondition | \nMore Info | \n
---|---|---|
✖ | \n Train features' Predictive Power Score is not greater than 0.7 | \nFeatures in train dataset with PPS above threshold: {'petal width (cm)': '0.91', 'petal length (cm)': '0.84'} | \n
✓ | \n Train-Test features' Predictive Power Score difference is not greater than 0.2 | \n\n |
Find features that best split the data into segments of high and low model error. Read More...
Status | \nCondition | \nMore Info | \n
---|---|---|
✓ | \n The performance difference of the detected segments must not be greater than 5% | \n\n |
Check | \nReason | \n
---|---|
Trust Score Comparison | \nNumber of samples in test dataset has not passed the minimum. You can change the minimum number of samples required for the check to run with the parameter \"min_test_samples\" | \n
Regression Systematic Error - Train Dataset | \nCheck is relevant for models of type ['regression'], but received model of type 'multiclass' | \n
Regression Systematic Error - Test Dataset | \nCheck is relevant for models of type ['regression'], but received model of type 'multiclass' | \n
Regression Error Distribution - Train Dataset | \nCheck is relevant for models of type ['regression'], but received model of type 'multiclass' | \n
Regression Error Distribution - Test Dataset | \nCheck is relevant for models of type ['regression'], but received model of type 'multiclass' | \n
Boosting Overfit | \nCheck is relevant for Boosting models of type ('AdaBoostClassifier', 'GradientBoostingClassifier', 'LGBMClassifier', 'XGBClassifier', 'CatBoostClassifier', 'AdaBoostRegressor', 'GradientBoostingRegressor', 'LGBMRegressor', 'XGBRegressor', 'CatBoostRegressor'), but received model of type RandomForestClassifier | \n
Index Train Test Leakage | \nCheck is irrelevant for Datasets without index defined | \n
Date Train Test Leakage Duplicates | \nCheck is irrelevant for Datasets without datetime defined | \n
Date Train Test Leakage Overlap | \nCheck is irrelevant for Datasets without datetime defined | \n
Identifier Leakage - Train Dataset | \nCheck is irrelevant for Datasets without index or date column | \n
Identifier Leakage - Test Dataset | \nCheck is irrelevant for Datasets without index or date column | \n
String Mismatch - Test Dataset | \nNothing found | \n
Data Duplicates - Train Dataset | \nNothing found | \n
Data Duplicates - Test Dataset | \nNothing found | \n
String Length Out Of Bounds - Train Dataset | \nNothing found | \n
String Length Out Of Bounds - Test Dataset | \nNothing found | \n
Special Characters - Train Dataset | \nNothing found | \n
Special Characters - Test Dataset | \nNothing found | \n
String Mismatch - Train Dataset | \nNothing found | \n
Mixed Data Types - Test Dataset | \nNothing found | \n
Single Value in Column - Train Dataset | \nNothing found | \n
Mixed Nulls - Test Dataset | \nNothing found | \n
Mixed Nulls - Train Dataset | \nNothing found | \n
Single Value in Column - Test Dataset | \nNothing found | \n
Label Ambiguity - Train Dataset | \nNothing found | \n
String Mismatch Comparison | \nNothing found | \n
New Label Train Test | \nNothing found | \n
Category Mismatch Train Test | \nNothing found | \n
Dominant Frequency Change | \nNothing found | \n
Mixed Data Types - Train Dataset | \nNothing found | \n
Label Ambiguity - Test Dataset | \nNothing found | \n
Summarize given scores on a dataset and model. Read More...
Status | \nCondition | \nMore Info | \n
---|---|---|
✓ | \n Train-Test scores relative degradation is not greater than 0.1 | \n\n |
No outputs to show.
" } }, "cc89454d67df4323871ee58bf67aef9d": { "model_module": "@jupyter-widgets/controls", "model_module_version": "1.5.0", "model_name": "HTMLModel", "state": { "_dom_classes": [], "_model_module": "@jupyter-widgets/controls", "_model_module_version": "1.5.0", "_model_name": "HTMLModel", "_view_count": null, "_view_module": "@jupyter-widgets/controls", "_view_module_version": "1.5.0", "_view_name": "HTMLView", "description": "", "description_tooltip": null, "layout": "IPY_MODEL_bb06199fc2f7442eb93900c9d50da759", "placeholder": "", "style": "IPY_MODEL_f222c00eddf643f3975154f44e59cbf7", "value": "Measure model average inference time (in seconds) per sample. Read More...
Status | \nCondition | \nMore Info | \n
---|---|---|
✓ | \n Average model inference time for one sample is not greater than 0.001 | \n\n |
Detect samples in the test data that appear also in training data. Read More...
Status | \nCondition | \nMore Info | \n
---|---|---|
✓ | \n Percentage of test data samples that appear in train data not greater than 10% | \n\n |
\n | sepal length (cm) | \nsepal width (cm) | \npetal length (cm) | \npetal width (cm) | \ntarget | \n
---|---|---|---|---|---|
Train indices: 101\nTest indices: 142 | \n5.80 | \n2.70 | \n5.10 | \n1.90 | \n2 | \n
Calculate the calibration curve with brier score for each class. Read More...
Detect features that are nearly unused by the model. Read More...
Status | \nCondition | \nMore Info | \n
---|---|---|
✓ | \n Number of high variance unused features is not greater than 5 | \n\n |