TensorFlow Decision Forests
szilard opened this issue
docker run --rm -ti continuumio/anaconda3 /bin/bash
pip install tensorflow_decision_forests
ipython
import tensorflow_decision_forests as tfdf
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn import metrics
d_train = pd.read_csv("https://s3.amazonaws.com/benchm-ml--main/train-1m.csv")
d_test = pd.read_csv("https://s3.amazonaws.com/benchm-ml--main/test.csv")
d_train["dep_delayed_15min"] = np.where(d_train["dep_delayed_15min"]=="Y",1,0)
d_test["dep_delayed_15min"] = np.where(d_test["dep_delayed_15min"]=="Y",1,0)
dtf_train = tfdf.keras.pd_dataframe_to_tf_dataset(d_train, label="dep_delayed_15min")
dtf_test = tfdf.keras.pd_dataframe_to_tf_dataset(d_test, label="dep_delayed_15min")
md = tfdf.keras.GradientBoostedTreesModel(max_depth=10, num_trees=100, shrinkage=0.1)
%time md.fit(x=dtf_train)
y_pred = md.predict(dtf_test)
print(metrics.roc_auc_score(d_test["dep_delayed_15min"], y_pred))
m5.2xlarge (8 cores)
In [1]: import tensorflow_decision_forests as tfdf
2021-06-04 16:39:24.583254: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-06-04 16:39:24.583295: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
In [2]:
In [2]: import numpy as np
In [3]: import pandas as pd
In [4]: import tensorflow as tf
In [5]:
In [5]: from sklearn import metrics
In [6]:
In [6]:
In [6]: d_train = pd.read_csv("https://s3.amazonaws.com/benchm-ml--main/train-1m.csv")
In [7]: d_test = pd.read_csv("https://s3.amazonaws.com/benchm-ml--main/test.csv")
In [8]:
In [8]: d_train["dep_delayed_15min"] = np.where(d_train["dep_delayed_15min"]=="Y",1,0)
In [9]: d_test["dep_delayed_15min"] = np.where(d_test["dep_delayed_15min"]=="Y",1,0)
In [10]:
In [10]:
In [10]: dtf_train = tfdf.keras.pd_dataframe_to_tf_dataset(d_train, label="dep_delayed_15min")
2021-06-04 16:39:32.461417: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2021-06-04 16:39:32.461464: W tensorflow/stream_executor/cuda/cuda_driver.cc:326] failed call to cuInit: UNKNOWN ERROR (303)
2021-06-04 16:39:32.461493: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (78cd809fe258): /proc/driver/nvidia/version does not exist
2021-06-04 16:39:32.461787: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 AVX512F FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
In [11]: dtf_test = tfdf.keras.pd_dataframe_to_tf_dataset(d_test, label="dep_delayed_15min")
In [12]:
In [12]:
In [12]: md = tfdf.keras.GradientBoostedTreesModel(max_depth=10, num_trees=100, shrinkage=0.1)
In [13]: %time md.fit(x=dtf_train)
2021-06-04 16:39:36.183058: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2021-06-04 16:39:36.204576: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 2499980000 Hz
15625/15625 [==============================] - 15s 780us/step
[INFO kernel.cc:746] Start Yggdrasil model training
[INFO kernel.cc:747] Collect training examples
[INFO kernel.cc:392] Number of batches: 15625
[INFO kernel.cc:393] Number of examples: 1000000
[INFO data_spec_inference.cc:289] 3 item(s) have been pruned (i.e. they are considered out of dictionary) for the column Dest (289 item(s) left) because min_value_count=5 and max_number_of_unique_values=2000
[INFO data_spec_inference.cc:289] 2 item(s) have been pruned (i.e. they are considered out of dictionary) for the column Origin (289 item(s) left) because min_value_count=5 and max_number_of_unique_values=2000
[INFO kernel.cc:769] Dataset:
Number of records: 1000000
Number of columns: 9
Number of columns by type:
CATEGORICAL: 7 (77.7778%)
NUMERICAL: 2 (22.2222%)
Columns:
CATEGORICAL: 7 (77.7778%)
0: "DayOfWeek" CATEGORICAL has-dict vocab-size:8 zero-ood-items most-frequent:"c-5" 147674 (14.7674%)
1: "DayofMonth" CATEGORICAL has-dict vocab-size:32 zero-ood-items most-frequent:"c-17" 33733 (3.3733%)
3: "Dest" CATEGORICAL has-dict vocab-size:290 num-oods:3 (0.0003%) most-frequent:"ATL" 58247 (5.8247%)
5: "Month" CATEGORICAL has-dict vocab-size:13 zero-ood-items most-frequent:"c-8" 88344 (8.8344%)
6: "Origin" CATEGORICAL has-dict vocab-size:290 num-oods:2 (0.0002%) most-frequent:"ATL" 58796 (5.8796%)
7: "UniqueCarrier" CATEGORICAL has-dict vocab-size:23 zero-ood-items most-frequent:"WN" 150937 (15.0937%)
8: "__LABEL" CATEGORICAL integerized vocab-size:3 no-ood-item
NUMERICAL: 2 (22.2222%)
2: "DepTime" NUMERICAL mean:1343.12 min:1 max:2615 sd:476.663
4: "Distance" NUMERICAL mean:728.805 min:21 max:4962 sd:574.475
Terminology:
nas: Number of non-available (i.e. missing) values.
ood: Out of dictionary.
manually-defined: Attribute which type is manually defined by the user i.e. the type was not automatically inferred.
tokenized: The attribute value is obtained through tokenization.
has-dict: The attribute is attached to a string dictionary e.g. a categorical attribute stored as a string.
vocab-size: Number of unique values.
[INFO kernel.cc:772] Configure learner
[WARNING gradient_boosted_trees.cc:1532] Subsample hyperparameter given but sampling method does not match.
[WARNING gradient_boosted_trees.cc:1545] GOSS alpha hyperparameter given but GOSS is disabled.
[WARNING gradient_boosted_trees.cc:1554] GOSS beta hyperparameter given but GOSS is disabled.
[WARNING gradient_boosted_trees.cc:1566] SelGB ratio hyperparameter given but SelGB is disabled.
[INFO kernel.cc:797] Training config:
learner: "GRADIENT_BOOSTED_TREES"
features: "DayOfWeek"
features: "DayofMonth"
features: "DepTime"
features: "Dest"
features: "Distance"
features: "Month"
features: "Origin"
features: "UniqueCarrier"
label: "__LABEL"
task: CLASSIFICATION
[yggdrasil_decision_forests.model.gradient_boosted_trees.proto.gradient_boosted_trees_config] {
num_trees: 100
decision_tree {
max_depth: 10
min_examples: 5
in_split_min_examples_check: true
missing_value_policy: GLOBAL_IMPUTATION
allow_na_conditions: false
categorical_set_greedy_forward {
sampling: 0.1
max_num_items: -1
min_item_frequency: 1
}
growing_strategy_local {
}
categorical {
cart {
}
}
num_candidate_attributes_ratio: -1
axis_aligned_split {
}
}
shrinkage: 0.1
validation_set_ratio: 0.1
early_stopping: VALIDATION_LOSS_INCREASE
early_stopping_num_trees_look_ahead: 30
l2_regularization: 0
lambda_loss: 1
mart {
}
adapt_subsample_for_maximum_training_duration: false
l1_regularization: 0
use_hessian_gain: false
l2_regularization_categorical: 1
}
[INFO kernel.cc:800] Deployment config:
[INFO kernel.cc:837] Train model
[INFO gradient_boosted_trees.cc:480] Default loss set to BINOMIAL_LOG_LIKELIHOOD
[INFO gradient_boosted_trees.cc:1358] num-trees:1 train-loss:0.952696 train-accuracy:0.806957 valid-loss:0.954296 valid-accuracy:0.807567
[INFO gradient_boosted_trees.cc:1360] num-trees:2 train-loss:0.930766 train-accuracy:0.806957 valid-loss:0.935331 valid-accuracy:0.807567
[INFO gradient_boosted_trees.cc:1360] num-trees:28 train-loss:0.759611 train-accuracy:0.837750 valid-loss:0.817667 valid-accuracy:0.827559
[INFO gradient_boosted_trees.cc:1360] num-trees:56 train-loss:0.695891 train-accuracy:0.852399 valid-loss:0.791661 valid-accuracy:0.834203
[INFO gradient_boosted_trees.cc:1360] num-trees:86 train-loss:0.650648 train-accuracy:0.864072 valid-loss:0.772629 valid-accuracy:0.838609
[INFO gradient_boosted_trees.cc:1358] num-trees:100 train-loss:0.633199 train-accuracy:0.868094 valid-loss:0.766061 valid-accuracy:0.839678
[INFO gradient_boosted_trees.cc:319] Truncates the model to 100 tree(s) i.e. 100 iteration(s).
[INFO gradient_boosted_trees.cc:348] Final model valid-loss:0.766061 valid-accuracy:0.839678
[INFO kernel.cc:856] Export model in log directory: /tmp/tmpkhka91x3
[INFO kernel.cc:864] Save model in resources
[INFO kernel.cc:929] Loading model from path
[INFO decision_forest.cc:590] Model loaded with 100 root(s), 93196 node(s), and 8 input feature(s).
[INFO abstract_model.cc:876] Engine "GradientBoostedTreesGeneric" built
[INFO kernel.cc:797] Use fast generic engine
CPU times: user 3min 55s, sys: 6.88 s, total: 4min 2s
Wall time: 2min 6s
Out[13]: <tensorflow.python.keras.callbacks.History at 0x7f88aaaca0d0>
In [14]:
In [14]: y_pred = md.predict(dtf_test)
In [15]: print(metrics.roc_auc_score(d_test["dep_delayed_15min"], y_pred))
0.7612733258837148
Summary (m5.2xlarge, 8 cores, 1M rows):
Wall time: 2min 6s
AUC: 0.7612733258837148
In comparison, XGBoost on the same instance type (m5.2xlarge):
5.696 (time)
0.7478858 (AUC)
(about 20x faster, at a slightly lower AUC)
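The rough 20x figure can be checked from the two timings above. This is a minimal sketch, assuming the XGBoost time of 5.696 is in seconds (the unit is not stated here, although the later tables report times in seconds):

```python
# Wall-clock training times quoted above for the 1M-row run on m5.2xlarge.
tfdf_seconds = 2 * 60 + 6   # TF-DF: "Wall time: 2min 6s"
xgboost_seconds = 5.696     # XGBoost: assumed to be seconds

# Ratio of the two training times.
speedup = tfdf_seconds / xgboost_seconds
print(f"XGBoost is about {speedup:.0f}x faster")
```

The exact ratio comes out slightly above 20, so "20x" is a fair round number.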
GPU:
p3.2xlarge
nvidia-docker run -it --rm tensorflow/tensorflow:latest-gpu-jupyter bash
pip install tensorflow_decision_forests scikit-learn
ipython
In [1]: import tensorflow_decision_forests as tfdf
2021-06-04 19:08:30.923089: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
In [2]:
In [2]: import numpy as np
In [3]: import pandas as pd
In [4]: import tensorflow as tf
In [5]:
In [5]: from sklearn import metrics
In [6]:
In [6]:
In [6]: d_train = pd.read_csv("https://s3.amazonaws.com/benchm-ml--main/train-1m.csv")
In [7]: d_test = pd.read_csv("https://s3.amazonaws.com/benchm-ml--main/test.csv")
In [8]:
In [8]: d_train["dep_delayed_15min"] = np.where(d_train["dep_delayed_15min"]=="Y",1,0)
In [9]: d_test["dep_delayed_15min"] = np.where(d_test["dep_delayed_15min"]=="Y",1,0)
In [10]:
In [10]:
In [10]: dtf_train = tfdf.keras.pd_dataframe_to_tf_dataset(d_train, label="dep_delayed_15min")
2021-06-04 19:08:40.281591: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcuda.so.1
2021-06-04 19:08:41.264152: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-04 19:08:41.265175: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:00:1e.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-06-04 19:08:41.265220: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-06-04 19:08:41.268516: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublas.so.11
2021-06-04 19:08:41.268583: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcublasLt.so.11
2021-06-04 19:08:41.269670: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcufft.so.10
2021-06-04 19:08:41.269984: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcurand.so.10
2021-06-04 19:08:41.270925: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusolver.so.11
2021-06-04 19:08:41.271691: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcusparse.so.11
2021-06-04 19:08:41.271938: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudnn.so.8
2021-06-04 19:08:41.272066: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-04 19:08:41.273113: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-04 19:08:41.274059: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-06-04 19:08:41.274467: I tensorflow/core/platform/cpu_feature_guard.cc:142] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
2021-06-04 19:08:41.275021: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-04 19:08:41.275996: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1733] Found device 0 with properties:
pciBusID: 0000:00:1e.0 name: Tesla V100-SXM2-16GB computeCapability: 7.0
coreClock: 1.53GHz coreCount: 80 deviceMemorySize: 15.78GiB deviceMemoryBandwidth: 836.37GiB/s
2021-06-04 19:08:41.276119: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-04 19:08:41.277162: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-04 19:08:41.278101: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1871] Adding visible gpu devices: 0
2021-06-04 19:08:41.278156: I tensorflow/stream_executor/platform/default/dso_loader.cc:53] Successfully opened dynamic library libcudart.so.11.0
2021-06-04 19:08:42.672460: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1258] Device interconnect StreamExecutor with strength 1 edge matrix:
2021-06-04 19:08:42.672513: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1264] 0
2021-06-04 19:08:42.672524: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1277] 0: N
2021-06-04 19:08:42.672786: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-04 19:08:42.673838: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-04 19:08:42.674860: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:937] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2021-06-04 19:08:42.675833: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1418] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 14644 MB memory) -> physical GPU (device: 0, name: Tesla V100-SXM2-16GB, pci bus id: 0000:00:1e.0, compute capability: 7.0)
In [11]: dtf_test = tfdf.keras.pd_dataframe_to_tf_dataset(d_test, label="dep_delayed_15min")
In [12]:
In [12]: md = tfdf.keras.GradientBoostedTreesModel(max_depth=10, num_trees=100, shrinkage=0.1)
In [13]: %time md.fit(x=dtf_train)
2021-06-04 19:09:28.430384: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:176] None of the MLIR Optimization Passes are enabled (registered 2)
2021-06-04 19:09:28.452532: I tensorflow/core/platform/profile_utils/cpu_utils.cc:114] CPU Frequency: 2300020000 Hz
15625/15625 [==============================] - 27s 1ms/step
[INFO kernel.cc:746] Start Yggdrasil model training
[INFO kernel.cc:747] Collect training examples
[INFO kernel.cc:392] Number of batches: 15625
[INFO kernel.cc:393] Number of examples: 1000000
[INFO data_spec_inference.cc:289] 3 item(s) have been pruned (i.e. they are considered out of dictionary) for the column Dest (289 item(s) left) because min_value_count=5 and max_number_of_unique_values=2000
[INFO data_spec_inference.cc:289] 2 item(s) have been pruned (i.e. they are considered out of dictionary) for the column Origin (289 item(s) left) because min_value_count=5 and max_number_of_unique_values=2000
[INFO kernel.cc:769] Dataset:
Number of records: 1000000
Number of columns: 9
Number of columns by type:
CATEGORICAL: 7 (77.7778%)
NUMERICAL: 2 (22.2222%)
Columns:
CATEGORICAL: 7 (77.7778%)
0: "DayOfWeek" CATEGORICAL has-dict vocab-size:8 zero-ood-items most-frequent:"c-5" 147674 (14.7674%)
1: "DayofMonth" CATEGORICAL has-dict vocab-size:32 zero-ood-items most-frequent:"c-17" 33733 (3.3733%)
3: "Dest" CATEGORICAL has-dict vocab-size:290 num-oods:3 (0.0003%) most-frequent:"ATL" 58247 (5.8247%)
5: "Month" CATEGORICAL has-dict vocab-size:13 zero-ood-items most-frequent:"c-8" 88344 (8.8344%)
6: "Origin" CATEGORICAL has-dict vocab-size:290 num-oods:2 (0.0002%) most-frequent:"ATL" 58796 (5.8796%)
7: "UniqueCarrier" CATEGORICAL has-dict vocab-size:23 zero-ood-items most-frequent:"WN" 150937 (15.0937%)
8: "__LABEL" CATEGORICAL integerized vocab-size:3 no-ood-item
NUMERICAL: 2 (22.2222%)
2: "DepTime" NUMERICAL mean:1343.12 min:1 max:2615 sd:476.663
4: "Distance" NUMERICAL mean:728.805 min:21 max:4962 sd:574.475
Terminology:
nas: Number of non-available (i.e. missing) values.
ood: Out of dictionary.
manually-defined: Attribute which type is manually defined by the user i.e. the type was not automatically inferred.
tokenized: The attribute value is obtained through tokenization.
has-dict: The attribute is attached to a string dictionary e.g. a categorical attribute stored as a string.
vocab-size: Number of unique values.
[INFO kernel.cc:772] Configure learner
[WARNING gradient_boosted_trees.cc:1532] Subsample hyperparameter given but sampling method does not match.
[WARNING gradient_boosted_trees.cc:1545] GOSS alpha hyperparameter given but GOSS is disabled.
[WARNING gradient_boosted_trees.cc:1554] GOSS beta hyperparameter given but GOSS is disabled.
[WARNING gradient_boosted_trees.cc:1566] SelGB ratio hyperparameter given but SelGB is disabled.
[INFO kernel.cc:797] Training config:
learner: "GRADIENT_BOOSTED_TREES"
features: "DayOfWeek"
features: "DayofMonth"
features: "DepTime"
features: "Dest"
features: "Distance"
features: "Month"
features: "Origin"
features: "UniqueCarrier"
label: "__LABEL"
task: CLASSIFICATION
[yggdrasil_decision_forests.model.gradient_boosted_trees.proto.gradient_boosted_trees_config] {
num_trees: 100
decision_tree {
max_depth: 10
min_examples: 5
in_split_min_examples_check: true
missing_value_policy: GLOBAL_IMPUTATION
allow_na_conditions: false
categorical_set_greedy_forward {
sampling: 0.1
max_num_items: -1
min_item_frequency: 1
}
growing_strategy_local {
}
categorical {
cart {
}
}
num_candidate_attributes_ratio: -1
axis_aligned_split {
}
}
shrinkage: 0.1
validation_set_ratio: 0.1
early_stopping: VALIDATION_LOSS_INCREASE
early_stopping_num_trees_look_ahead: 30
l2_regularization: 0
lambda_loss: 1
mart {
}
adapt_subsample_for_maximum_training_duration: false
l1_regularization: 0
use_hessian_gain: false
l2_regularization_categorical: 1
}
[INFO kernel.cc:800] Deployment config:
[INFO kernel.cc:837] Train model
[INFO gradient_boosted_trees.cc:480] Default loss set to BINOMIAL_LOG_LIKELIHOOD
[INFO gradient_boosted_trees.cc:1358] num-trees:1 train-loss:0.952696 train-accuracy:0.806957 valid-loss:0.954296 valid-accuracy:0.807567
[INFO gradient_boosted_trees.cc:1360] num-trees:2 train-loss:0.930766 train-accuracy:0.806957 valid-loss:0.935331 valid-accuracy:0.807567
[INFO gradient_boosted_trees.cc:1360] num-trees:28 train-loss:0.759611 train-accuracy:0.837750 valid-loss:0.817667 valid-accuracy:0.827559
[INFO gradient_boosted_trees.cc:1360] num-trees:55 train-loss:0.697795 train-accuracy:0.851977 valid-loss:0.792125 valid-accuracy:0.833853
[INFO gradient_boosted_trees.cc:1360] num-trees:83 train-loss:0.655071 train-accuracy:0.862715 valid-loss:0.774906 valid-accuracy:0.837899
[INFO gradient_boosted_trees.cc:1358] num-trees:100 train-loss:0.633199 train-accuracy:0.868094 valid-loss:0.766061 valid-accuracy:0.839678
[INFO gradient_boosted_trees.cc:319] Truncates the model to 100 tree(s) i.e. 100 iteration(s).
[INFO gradient_boosted_trees.cc:348] Final model valid-loss:0.766061 valid-accuracy:0.839678
[INFO kernel.cc:856] Export model in log directory: /tmp/tmp4a7ekm_n
[INFO kernel.cc:864] Save model in resources
[INFO kernel.cc:929] Loading model from path
[INFO decision_forest.cc:590] Model loaded with 100 root(s), 93196 node(s), and 8 input feature(s).
[INFO abstract_model.cc:876] Engine "GradientBoostedTreesGeneric" built
[INFO kernel.cc:797] Use fast generic engine
CPU times: user 4min 41s, sys: 8.69 s, total: 4min 50s
Wall time: 2min 22s
Out[13]: <tensorflow.python.keras.callbacks.History at 0x7f6777917048>
Is the GPU being used at all?
dtf_train = tfdf.keras.pd_dataframe_to_tf_dataset(d_train, label="dep_delayed_15min")
allocates some GPU memory (465 MB):
[0] Tesla V100-SXM2-16GB | 36'C, 0 % | 0 / 16160 MB |
[0] Tesla V100-SXM2-16GB | 36'C, 0 % | 0 / 16160 MB |
[0] Tesla V100-SXM2-16GB | 36'C, 0 % | 0 / 16160 MB |
[0] Tesla V100-SXM2-16GB | 35'C, 0 % | 0 / 16160 MB |
[0] Tesla V100-SXM2-16GB | 36'C, 0 % | 465 / 16160 MB | root(463M)
[0] Tesla V100-SXM2-16GB | 37'C, 0 % | 465 / 16160 MB | root(463M)
[0] Tesla V100-SXM2-16GB | 37'C, 0 % | 465 / 16160 MB | root(463M)
[0] Tesla V100-SXM2-16GB | 37'C, 0 % | 465 / 16160 MB | root(463M)
Then md.fit(x=dtf_train) takes almost all of the GPU memory (15111 MB), while utilization stays at 0-2%:
[0] Tesla V100-SXM2-16GB | 37'C, 0 % | 465 / 16160 MB | root(463M)
[0] Tesla V100-SXM2-16GB | 37'C, 0 % | 465 / 16160 MB | root(463M)
[0] Tesla V100-SXM2-16GB | 37'C, 0 % | 465 / 16160 MB | root(463M)
[0] Tesla V100-SXM2-16GB | 37'C, 0 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 0 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 0 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 0 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 2 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 2 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 2 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 2 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 2 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 2 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 2 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 2 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 2 % | 15111 / 16160 MB | root(15109M)
So something starts on the GPU (perhaps the calculation of dataset statistics), but once the trees start being built, GPU utilization drops back to 0%:
[INFO gradient_boosted_trees.cc:1358] num-trees:1 train-loss:0.952696 train-accuracy:0.806957 valid-loss:0.954296 valid-accuracy:0.807567
[INFO gradient_boosted_trees.cc:1360] num-trees:2 train-loss:0.930766 train-accuracy:0.806957 valid-loss:0.935331 valid-accuracy:0.807567
[0] Tesla V100-SXM2-16GB | 37'C, 2 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 2 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 2 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 2 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 0 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 0 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 0 % | 15111 / 16160 MB | root(15109M)
[0] Tesla V100-SXM2-16GB | 37'C, 0 % | 15111 / 16160 MB | root(15109M)
It seems the GPU is not supported:
- "Don't use hardware accelerators e.g. GPU, TPU": https://github.com/tensorflow/decision-forests/blob/main/documentation/migration.md#dont-use-hardware-accelerators-eg-gpu-tpu
- "No support for GPU / TPU.": https://github.com/tensorflow/decision-forests/blob/main/documentation/known_issues.md#no-support-for-gpu--tpu
Yeah, I was about to post that, quite hilarious.
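Since training does not use the GPU, the roughly 15 GB shown above is most likely TensorFlow's default behaviour of reserving GPU memory when it initialises a device, not TF-DF work. A possible way to avoid that on a GPU machine, not tried in this thread, is to hide the GPU before TensorFlow is imported. This sketch relies on the standard CUDA_VISIBLE_DEVICES environment variable; the try/except is only there so the snippet also runs where TensorFlow is not installed:

```python
import os

# Hide all CUDA devices from this process. This must run before TensorFlow
# is imported, otherwise the GPU may already have been initialised.
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"

try:
    import tensorflow as tf
    # With the devices hidden, TensorFlow should report no GPUs.
    visible_gpus = tf.config.list_physical_devices("GPU")
except ImportError:
    # TensorFlow is not installed in this environment.
    visible_gpus = []

print("GPUs visible to TensorFlow:", visible_gpus)
```

Whether this changes the training time is untested; it only keeps the GPU memory free for other processes.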
Added early_stopping="NONE" to prevent early stopping on the small data size:
import tensorflow_decision_forests as tfdf
import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn import metrics
d_train = pd.read_csv("https://s3.amazonaws.com/benchm-ml--main/train-0.1m.csv")
d_test = pd.read_csv("https://s3.amazonaws.com/benchm-ml--main/test.csv")
d_train["dep_delayed_15min"] = np.where(d_train["dep_delayed_15min"]=="Y",1,0)
d_test["dep_delayed_15min"] = np.where(d_test["dep_delayed_15min"]=="Y",1,0)
dtf_train = tfdf.keras.pd_dataframe_to_tf_dataset(d_train, label="dep_delayed_15min")
dtf_test = tfdf.keras.pd_dataframe_to_tf_dataset(d_test, label="dep_delayed_15min")
md = tfdf.keras.GradientBoostedTreesModel(max_depth=10, num_trees=100, shrinkage=0.1, early_stopping="NONE")
%time md.fit(x=dtf_train)
y_pred = md.predict(dtf_test)
print(metrics.roc_auc_score(d_test["dep_delayed_15min"], y_pred))
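For context, the training config logged for the 1M run shows early_stopping: VALIDATION_LOSS_INCREASE with early_stopping_num_trees_look_ahead: 30 and validation_set_ratio: 0.1. The sketch below illustrates a generic patience-style rule of that kind. It is a simplified assumption for illustration, not the Yggdrasil implementation, and the function name and both loss curves are made up:

```python
def trees_to_keep(valid_losses, look_ahead=30):
    """Return how many trees a patience-style early stopping rule would keep.

    valid_losses[i] is the validation loss after i + 1 trees. Training stops
    once the best loss seen so far is at least `look_ahead` trees old.
    """
    best_loss = float("inf")
    best_num_trees = 0
    for num_trees, loss in enumerate(valid_losses, start=1):
        if loss < best_loss:
            best_loss, best_num_trees = loss, num_trees
        elif num_trees - best_num_trees >= look_ahead:
            break  # no improvement for `look_ahead` trees: stop training
    return best_num_trees


# Made-up validation loss curves, for illustration only (not measurements).
improving = [1.0 - 0.002 * i for i in range(100)]           # keeps decreasing
overfitting = [abs(i - 20) * 0.01 + 0.5 for i in range(100)]  # minimum at tree 21

print(trees_to_keep(improving))    # 100: never stops early
print(trees_to_keep(overfitting))  # 21: truncated at the best validation loss
```

With early_stopping="NONE" the model always trains the requested num_trees, which keeps the timings comparable across data sizes.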
m5.4xlarge (16 cores)
TF-DF:
size | time [s] | AUC |
---|---|---|
100K | 16 | 0.704 |
1M | 110 | 0.761 |
10M | 1400 | 0.774 |
XGBoost:
size | time [s] | AUC |
---|---|---|
100K | 0.6 | 0.734 |
1M | 3.5 | 0.748 |
10M | 35 | 0.754 |
LightGBM:
size | time [s] | AUC |
---|---|---|
100K | 2 | 0.717 |
1M | 4 | 0.765 |
10M | 20 | 0.792 |
How much slower TF-DF is (ratio of training times):
size | TF-DF/XGBoost | TF-DF/LightGBM |
---|---|---|
100K | 25x | 8x |
1M | 30x | 27x |
10M | 40x | 70x |
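The ratios can be recomputed from the timing tables above. This is a small sketch that only divides the reported times; the dictionary layout is my own:

```python
# Training times in seconds, copied from the tables above (m5.4xlarge).
times = {
    "100K": {"TF-DF": 16, "XGBoost": 0.6, "LightGBM": 2},
    "1M": {"TF-DF": 110, "XGBoost": 3.5, "LightGBM": 4},
    "10M": {"TF-DF": 1400, "XGBoost": 35, "LightGBM": 20},
}

# TF-DF training time divided by the other library's training time.
slowdown = {
    size: {other: t["TF-DF"] / t[other] for other in ("XGBoost", "LightGBM")}
    for size, t in times.items()
}

for size, ratios in slowdown.items():
    print(
        f"{size}: {ratios['XGBoost']:.1f}x vs XGBoost, "
        f"{ratios['LightGBM']:.1f}x vs LightGBM"
    )
```

The unrounded values are close to the rounded figures in the table, and the gap widens as the data size grows.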