Gradient Boosting
Predict using gradient boosting on decision trees.
Inputs
- Data: input dataset
- Preprocessor: preprocessing method(s)
Outputs
- Learner: gradient boosting learning algorithm
- Model: trained model
Gradient Boosting is a machine learning technique for regression and classification that builds a prediction model as an ensemble of weak prediction models, typically decision trees. Trees are added sequentially, each one fitted to the residual errors of the ensemble built so far.
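To illustrate the idea (this is a minimal sketch, not the widget's implementation), the following code boosts shallow regression trees on squared error: each tree is fitted to the residuals of the current ensemble and added with a shrinkage factor, the learning rate described in the options below. The function names are hypothetical.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def gradient_boost_fit(X, y, n_trees=100, learning_rate=0.1, max_depth=3):
    """Fit an ensemble of shallow trees, each to the current residuals."""
    prediction = np.full(len(y), y.mean())    # start from the mean prediction
    trees = []
    for _ in range(n_trees):
        residuals = y - prediction            # negative gradient of squared error
        tree = DecisionTreeRegressor(max_depth=max_depth).fit(X, residuals)
        prediction += learning_rate * tree.predict(X)  # shrunken contribution
        trees.append(tree)
    return y.mean(), trees

def gradient_boost_predict(base, trees, X, learning_rate=0.1):
    """Sum the base value and the shrunken contributions of all trees
    (must use the same learning rate as during fitting)."""
    prediction = np.full(len(X), base)
    for tree in trees:
        prediction += learning_rate * tree.predict(X)
    return prediction
```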
- Specify the name of the model. The default name is “Gradient Boosting”.
- Select a gradient boosting method. The availability notes below refer to the three underlying implementations: scikit-learn, xgboost, and catboost (see the parameter-mapping sketch after this list).
- Basic properties:
  - Number of trees: how many gradient boosted trees to include. A larger number usually results in better performance.
  - Learning rate: the boosting learning rate, which shrinks the contribution of each tree.
  - Replicable training: fix the random seed so that results are reproducible.
  - Regularization: the L2 regularization term. Available only for the xgboost and catboost methods.
- Growth control:
  - Limit depth of individual trees: the maximum depth of an individual tree.
  - Do not split subsets smaller than: the smallest subset that can be split. Available only for the scikit-learn methods.
- Subsampling:
  - Fraction of training instances: the percentage of training instances used for fitting each individual tree. Available for the scikit-learn and xgboost methods.
  - Fraction of features for each tree: the percentage of features to use when constructing each tree. Available for the xgboost and catboost methods.
  - Fraction of features for each level: the percentage of features to use for each level. Available only for the xgboost methods.
  - Fraction of features for each split: the percentage of features to use for each split. Available only for the xgboost methods.
- Click Apply to communicate the changes to other widgets. Alternatively, tick the box to the left of the Apply button and changes will be communicated automatically.
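For the scikit-learn method, the widget's options correspond roughly to scikit-learn estimator parameters as sketched below. The mapping is an assumption based on the option names, not a statement of the widget's internals; the xgboost and catboost methods use their own parameter names (e.g. `reg_lambda` and `colsample_bytree` in xgboost).

```python
from sklearn.ensemble import GradientBoostingClassifier

# Approximate mapping of the widget's options to scikit-learn parameters:
model = GradientBoostingClassifier(
    n_estimators=100,      # Number of trees
    learning_rate=0.1,     # Learning rate
    random_state=0,        # Replicable training (fixed random seed)
    max_depth=3,           # Limit depth of individual trees
    min_samples_split=2,   # Do not split subsets smaller than
    subsample=1.0,         # Fraction of training instances
)
```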
Example
For a classification task, we use the heart disease data. Here, we compare all available gradient boosting methods in the Test & Score widget.
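Roughly the same comparison can be run from a script. This sketch assumes the learner classes recent Orange versions expose for the widget's methods (`GBClassifier`, `XGBClassifier`, `CatGBClassifier`); the xgboost and catboost variants additionally require the corresponding Python packages to be installed.

```python
import Orange

# The heart disease dataset ships with Orange
data = Orange.data.Table("heart_disease")

# Assumed learner classes for the widget's methods (used with default settings)
learners = [
    Orange.classification.GBClassifier(),
    Orange.classification.XGBClassifier(),
    Orange.classification.CatGBClassifier(),
]

# 10-fold cross-validation, the procedure Test & Score uses by default
results = Orange.evaluation.CrossValidation(k=10)(data, learners)
print("CA: ", Orange.evaluation.CA(results))
print("AUC:", Orange.evaluation.AUC(results))
```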