Provided by: mlpack-bin_4.3.0-2build1_amd64

NAME

       mlpack_decision_tree - decision tree

SYNOPSIS

        mlpack_decision_tree [-m unknown] [-l unknown] [-D int] [-g double] [-n int] [-a bool] [-e bool] [-T string] [-L unknown] [-t string] [-V bool] [-w unknown] [-M unknown] [-p unknown] [-P unknown] [-h -v]

DESCRIPTION

       Train and evaluate using a decision tree. Given a dataset containing numeric or categorical features, and
       associated labels for each point in the dataset, this program can train a decision tree on that data.

       The  training  set and associated labels are specified with the '--training_file (-t)' and '--labels_file
       (-l)' parameters, respectively. The labels should be in the range [0, num_classes - 1]. If
       '--labels_file (-l)' is not specified, the labels are assumed to be the last dimension of the training
       dataset.
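
       The label range requirement can be checked before training. The following is a minimal sketch, assuming
       the labels file holds one non-negative integer label per line; the sample data, the file name, and the
       num_classes value are illustrative and not part of mlpack.

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)

# Hypothetical sample labels file: one integer class label per line (assumed layout).
printf '0\n2\n1\n0\n' > "$workdir/labels.csv"

num_classes=3  # assumed number of classes for this sample

# Count lines that are not integers in the range [0, num_classes - 1].
bad=$(awk -v k="$num_classes" \
  '$1 !~ /^[0-9]+$/ || $1 + 0 >= k { n++ } END { print n + 0 }' \
  "$workdir/labels.csv")

if [ "$bad" -eq 0 ]; then
  echo "labels OK"
else
  echo "$bad label(s) outside [0, $((num_classes - 1))]" >&2
fi

rm -rf "$workdir"
```

       If labels are stored in another layout (for example as floating-point values), the pattern in the awk
       script would need to be adjusted.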

       When a model is trained, the '--output_model_file (-M)' output parameter may be used to save the trained
       model. A model may be loaded for predictions with the '--input_model_file (-m)' parameter. The
       '--input_model_file (-m)' parameter may not be specified when the '--training_file (-t)' parameter is
       specified. The '--minimum_leaf_size (-n)' parameter specifies the minimum number of training points that
       each leaf must contain. The '--minimum_gain_split (-g)' parameter specifies the minimum gain that is
       needed for a node to split. The '--maximum_depth (-D)' parameter specifies the maximum depth of the
       tree. If '--print_training_accuracy (-a)' is specified, the training accuracy will be printed; the
       '--print_training_error (-e)' parameter is deprecated.
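
       The tuning parameters above can be combined in one training call. The following is a minimal sketch,
       assuming illustrative file names (data.csv, labels.csv, tree.bin) and arbitrary parameter values. The
       command is only run when mlpack_decision_tree is installed and both input files exist.

```shell
#!/bin/sh
# Illustrative values only; tune them for the dataset at hand.
train_opts="--training_file data.csv --labels_file labels.csv \
--minimum_leaf_size 10 --minimum_gain_split 0.001 --maximum_depth 5 \
--print_training_accuracy --output_model_file tree.bin"

# Show the command that would be executed.
echo "mlpack_decision_tree $train_opts"

if command -v mlpack_decision_tree >/dev/null 2>&1 &&
   [ -f data.csv ] && [ -f labels.csv ]; then
  # Unquoted on purpose so the option string splits into separate arguments.
  mlpack_decision_tree $train_opts
fi
```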

       Test  data may be specified with the '--test_file (-T)' parameter, and if performance numbers are desired
       for that test set, labels may be specified with the '--test_labels_file (-L)' parameter. Predictions  for
       each  test point may be saved via the '--predictions_file (-p)' output parameter. Class probabilities for
       each prediction may be saved with the '--probabilities_file (-P)' output parameter.
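
       Predictions and class probabilities can be requested in the same run. The following is a minimal
       sketch, assuming a previously saved model 'tree.bin' and illustrative file names for the test data and
       outputs. The command is only run when mlpack_decision_tree is installed and the inputs exist.

```shell
#!/bin/sh
# File names are placeholders chosen for this example.
predict_opts="--input_model_file tree.bin --test_file test_set.csv \
--test_labels_file test_labels.csv --predictions_file predictions.csv \
--probabilities_file probabilities.csv"

# Show the command that would be executed.
echo "mlpack_decision_tree $predict_opts"

if command -v mlpack_decision_tree >/dev/null 2>&1 &&
   [ -f tree.bin ] && [ -f test_set.csv ] && [ -f test_labels.csv ]; then
  # Unquoted on purpose so the option string splits into separate arguments.
  mlpack_decision_tree $predict_opts
fi
```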

       For example, to train a decision tree with a minimum leaf size of 20 and a minimum gain of 0.001 on the
       dataset contained in 'data.arff' with labels 'labels.csv', saving the output model to 'tree.bin' and
       printing the training accuracy, one could call

       $ mlpack_decision_tree --training_file data.arff --labels_file  labels.csv  --output_model_file  tree.bin
       --minimum_leaf_size 20 --minimum_gain_split 0.001 --print_training_accuracy

       Then, to use that model to classify the points in 'test_set.arff' and print the test error given the
       labels 'test_labels.csv', while saving the predictions for each point to 'predictions.csv', one
       could call

       $  mlpack_decision_tree  --input_model_file   tree.bin   --test_file   test_set.arff   --test_labels_file
       test_labels.csv --predictions_file predictions.csv
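
       The saved predictions can also be compared against the true labels outside the program. The following
       is a minimal sketch, assuming both files hold one label per line in the same order; the sample contents
       are made up for illustration and the actual layout of the output should be checked before relying on
       it.

```shell
#!/bin/sh
# Work in a scratch directory so no existing files are touched.
workdir=$(mktemp -d)

# Hypothetical sample data standing in for real label and prediction files.
printf '0\n1\n1\n2\n' > "$workdir/test_labels.csv"
printf '0\n1\n2\n2\n' > "$workdir/predictions.csv"

# Pair each true label with its prediction, then count the matching rows.
accuracy=$(paste -d, "$workdir/test_labels.csv" "$workdir/predictions.csv" |
  awk -F, '{ total++; if ($1 + 0 == $2 + 0) correct++ }
           END { if (total) printf "%.2f", 100 * correct / total }')

echo "accuracy: ${accuracy}%"

rm -rf "$workdir"
```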

OPTIONAL INPUT OPTIONS

       --help (-h) [bool]
              Default help info.

       --info [string]
              Print help on a specific option. Default value ''.

       --input_model_file (-m) [unknown]
              Pre-trained decision tree, to be used with test points.

       --labels_file (-l) [unknown]
              Training labels.

       --maximum_depth (-D) [int]
              Maximum depth of the tree (0 means no limit).  Default value 0.

       --minimum_gain_split (-g) [double]
              Minimum gain for node splitting. Default value 1e-07.

       --minimum_leaf_size (-n) [int]
              Minimum number of points in a leaf. Default value 20.

       --print_training_accuracy (-a) [bool]
              Print the training accuracy.

       --print_training_error (-e) [bool]
              Print the training error (deprecated; will be removed in mlpack 4.0.0).

       --test_file (-T) [string]
              Testing dataset (may be categorical).

       --test_labels_file (-L) [unknown]
              Test point labels, if accuracy calculation is desired.

       --training_file (-t) [string]
              Training dataset (may be categorical).

       --verbose (-v) [bool]
              Display informational messages and the full list of parameters and timers at the end of execution.

       --version (-V) [bool]
              Display the version of mlpack.

       --weights_file (-w) [unknown]
              The weight of each training label.

OPTIONAL OUTPUT OPTIONS

       --output_model_file (-M) [unknown]
              Output for trained decision tree.

       --predictions_file (-p) [unknown]
              Class predictions for each test point.

       --probabilities_file (-P) [unknown]
              Class probabilities for each test point.

ADDITIONAL INFORMATION

       For further information, including relevant papers, citations,  and  theory,  consult  the  documentation
       found at http://www.mlpack.org or included with your distribution of mlpack.

mlpack-4.3.0                                     19 January 2024                         mlpack_decision_tree(1)