Provided by: mlpack-bin_4.5.1-1build2_amd64

NAME

       mlpack_kernel_pca - kernel principal components analysis

SYNOPSIS

        mlpack_kernel_pca -i unknown -k string [-b double] [-c bool] [-D double] [-S double] [-d int] [-n bool] [-O double] [-s string] [-V bool] [-o unknown] [-h -v]

DESCRIPTION

       This  program  performs  Kernel  Principal  Components  Analysis (KPCA) on the specified dataset with the
       specified kernel. This will transform the data onto  the  kernel  principal  components,  and  optionally
       reduce the dimensionality by ignoring the kernel principal components with the smallest eigenvalues.

       For the case where a linear kernel is used, this reduces to regular PCA.
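
       As a rough illustration of the idea, the sketch below computes only the first kernel principal
       component in plain Python. It follows the textbook formulation (center the kernel matrix, take
       its leading eigenvector, scale by the square root of the eigenvalue) and uses power iteration
       instead of a full eigendecomposition. It is not mlpack's implementation: all function names are
       hypothetical, and the sign and scaling of mlpack's output may differ.

```python
import math


def linear_kernel(x, y):
    """K(x, y) = x^T y."""
    return sum(a * b for a, b in zip(x, y))


def kernel_matrix(points, kernel):
    """Evaluate the kernel on every pair of points (n x n matrix)."""
    return [[kernel(p, q) for q in points] for p in points]


def center(K):
    """Double-center a symmetric kernel matrix (centering in feature space)."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    grand_mean = sum(row_mean) / n
    # K is symmetric, so the column means equal the row means.
    return [[K[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]


def top_component(K, iterations=200):
    """Leading eigenpair of a symmetric PSD matrix by power iteration."""
    n = len(K)
    # Arbitrary non-constant start vector (assumed not orthogonal to the answer).
    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):
        w = [sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    Kv = [sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(v, Kv))  # Rayleigh quotient
    return eigenvalue, v


def first_kpca_scores(points, kernel):
    """Coordinates of each point along the first kernel principal component."""
    K = center(kernel_matrix(points, kernel))
    eigenvalue, v = top_component(K)
    scale = math.sqrt(max(eigenvalue, 0.0))  # guard against tiny negative round-off
    return [scale * x for x in v]


# With the linear kernel, the scores match ordinary PCA scores up to sign.
points = [[0.0], [1.0], [2.0], [3.0]]
scores = first_kpca_scores(points, linear_kernel)
print(scores)
```

       With one-dimensional input and the linear kernel, the scores are simply the mean-centered
       values (up to sign), which is the sense in which KPCA reduces to regular PCA.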

       The kernels that are supported are listed below:

              •  'linear': the standard linear dot product (same as normal PCA): `K(x, y) = x^T y`

              •  'gaussian': a Gaussian kernel; requires bandwidth: `K(x, y) = exp(-(|| x - y || ^ 2) / (2 *
                 (bandwidth ^ 2)))`

              •  'polynomial': polynomial kernel; requires offset and degree: `K(x, y) = (x^T y + offset) ^
                 degree`

              •  'hyptan': hyperbolic tangent kernel; requires scale and offset: `K(x, y) = tanh(scale * (x^T y)
                 + offset)`

              •  'laplacian': Laplacian kernel; requires bandwidth: `K(x, y) = exp(-(|| x - y ||) / bandwidth)`

              •  'epanechnikov': Epanechnikov kernel; requires bandwidth: `K(x, y) = max(0, 1 - || x - y ||^2 /
                 bandwidth^2)`

              •  'cosine': cosine distance: `K(x, y) = 1 - (x^T y) / (|| x || * || y ||)`
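
       The formulas above can be transcribed directly into plain Python, which is handy for
       spot-checking a kernel value by hand. This is a minimal sketch: the function names are
       hypothetical, the defaults mirror the option defaults documented on this page, and the
       formulas are copied as written here rather than checked against mlpack's source.

```python
import math


def dot(x, y):
    """x^T y for two equal-length sequences."""
    return sum(a * b for a, b in zip(x, y))


def sq_dist(x, y):
    """|| x - y ||^2."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def linear(x, y):
    return dot(x, y)


def gaussian(x, y, bandwidth=1.0):
    return math.exp(-sq_dist(x, y) / (2.0 * bandwidth ** 2))


def polynomial(x, y, offset=0.0, degree=1.0):
    return (dot(x, y) + offset) ** degree


def hyptan(x, y, scale=1.0, offset=0.0):
    return math.tanh(scale * dot(x, y) + offset)


def laplacian(x, y, bandwidth=1.0):
    return math.exp(-math.sqrt(sq_dist(x, y)) / bandwidth)


def epanechnikov(x, y, bandwidth=1.0):
    return max(0.0, 1.0 - sq_dist(x, y) / bandwidth ** 2)


def cosine(x, y):
    # Transcribed as listed above; undefined when either vector is all zeros.
    return 1.0 - dot(x, y) / (math.sqrt(dot(x, x)) * math.sqrt(dot(y, y)))


x, y = [1.0, 2.0], [3.0, 4.0]
for name, value in [("linear", linear(x, y)),
                    ("gaussian", gaussian(x, y, bandwidth=2.0)),
                    ("polynomial", polynomial(x, y, offset=1.0, degree=2.0)),
                    ("laplacian", laplacian(x, y, bandwidth=2.0))]:
    print(name, value)
```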

       The parameters for each of the kernels should be specified with the options '--bandwidth (-b)',
       '--kernel_scale (-S)', '--offset (-O)', or '--degree (-D)' (or a combination of those parameters).

       Optionally, the Nystroem method ("Using the Nystroem method to speed up kernel machines", 2001) can be
       used to calculate the kernel matrix by specifying the '--nystroem_method (-n)' parameter. This approach
       works by using a subset of the data as a basis to reconstruct the kernel matrix; to specify the sampling
       scheme, the '--sampling (-s)' parameter is used. The sampling scheme for the Nystroem method can be
       chosen from the following list: 'kmeans', 'random', 'ordered'.
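
       To make the reconstruction idea concrete, the sketch below approximates a Gaussian kernel
       matrix as `C * W^-1 * C^T`, where `C` holds kernel values between all points and a few basis
       points, and `W` holds kernel values among the basis points. It is an illustration only: the
       function names are hypothetical, the basis is simply the first two points (an assumption
       about what an 'ordered' selection might look like), and a plain inverse is used, which
       assumes `W` is non-singular. mlpack's own implementation may differ in all of these details.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    sq = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq / (2.0 * bandwidth ** 2))


def invert(M):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    # Augment M with the identity matrix: [M | I].
    A = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
         for i, row in enumerate(M)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular to working precision")
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        for r in range(n):
            if r != col:
                f = A[r][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [row[n:] for row in A]


def nystroem(points, basis, kernel):
    """Approximate the full n x n kernel matrix from m basis points."""
    n, m = len(points), len(basis)
    C = [[kernel(p, b) for b in basis] for p in points]   # n x m
    W = [[kernel(a, b) for b in basis] for a in basis]    # m x m
    W_inv = invert(W)
    CW = [[sum(C[i][k] * W_inv[k][j] for k in range(m)) for j in range(m)]
          for i in range(n)]
    # (C W^-1) C^T
    return [[sum(CW[i][k] * C[j][k] for k in range(m)) for j in range(n)]
            for i in range(n)]


points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [2.0, 2.0], [3.0, 1.0]]
basis = points[:2]
approx = nystroem(points, basis, gaussian_kernel)
exact = [[gaussian_kernel(p, q) for q in points] for p in points]

# Largest entry-wise error of the approximation.
print(max(abs(approx[i][j] - exact[i][j])
          for i in range(len(points)) for j in range(len(points))))
```

       Rows and columns that belong to basis points are reproduced exactly (up to round-off); the
       remaining entries are approximations whose quality depends on how well the basis covers the
       data, which is why the choice of sampling scheme matters.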

       For example, the following command will perform KPCA on the dataset 'input.csv' using the Gaussian
       kernel, and save the transformed data to 'transformed.csv':

       $ mlpack_kernel_pca --input_file input.csv --kernel gaussian --output_file transformed.csv
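
       When the program is driven from a script, the argument list can be assembled from the options
       documented on this page. The helper below is a hypothetical convenience wrapper, not part of
       mlpack; it assumes that boolean options such as '--nystroem_method' are passed as bare
       switches and that 'mlpack_kernel_pca' is on the PATH when `run_kpca` is called.

```python
import shutil
import subprocess

# Kernel parameters documented on this page.
ALLOWED_KERNEL_PARAMS = ("bandwidth", "degree", "kernel_scale", "offset")


def build_kpca_command(input_file, kernel, output_file, new_dimensionality=0,
                       nystroem=False, sampling=None, **kernel_params):
    """Build the argument list for an mlpack_kernel_pca invocation."""
    cmd = ["mlpack_kernel_pca",
           "--input_file", input_file,
           "--kernel", kernel,
           "--output_file", output_file]
    for name, value in kernel_params.items():
        if name not in ALLOWED_KERNEL_PARAMS:
            raise ValueError(f"unknown kernel parameter: {name}")
        cmd += [f"--{name}", str(value)]
    if new_dimensionality:
        cmd += ["--new_dimensionality", str(new_dimensionality)]
    if nystroem:
        cmd.append("--nystroem_method")
        if sampling is not None:
            cmd += ["--sampling", sampling]
    return cmd


def run_kpca(cmd):
    """Run the command, failing early if the binary is not installed."""
    if shutil.which(cmd[0]) is None:
        raise FileNotFoundError(f"{cmd[0]} not found on PATH")
    subprocess.run(cmd, check=True)


# Polynomial kernel of degree 2 with offset 1, keeping two output dimensions.
cmd = build_kpca_command("input.csv", "polynomial", "transformed.csv",
                         new_dimensionality=2, degree=2, offset=1)
print(" ".join(cmd))
```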

REQUIRED INPUT OPTIONS

       --input_file (-i) [unknown]
              Input dataset to perform KPCA on.

       --kernel (-k) [string]
              The kernel to use; see the above documentation for the list of usable kernels.

OPTIONAL INPUT OPTIONS

       --bandwidth (-b) [double]
              Bandwidth, for 'gaussian', 'laplacian', and 'epanechnikov' kernels. Default value 1.

       --center (-c) [bool]
              If set, the transformed data will be centered about the origin.

       --degree (-D) [double]
              Degree of polynomial, for 'polynomial' kernel.  Default value 1.

       --help (-h) [bool]
              Default help info.

       --info [string]
              Print help on a specific option. Default value ''.

       --kernel_scale (-S) [double]
              Scale, for 'hyptan' kernel. Default value 1.

       --new_dimensionality (-d) [int]
              If  not  0,  reduce  the  dimensionality of the output dataset by ignoring the dimensions with the
              smallest eigenvalues. Default value 0.

       --nystroem_method (-n) [bool]
              If set, the Nystroem method will be used.

       --offset (-O) [double]
              Offset, for 'hyptan' and 'polynomial' kernels.  Default value 0.

       --sampling (-s) [string]
              Sampling scheme to use for the Nystroem method: 'kmeans', 'random', or 'ordered'. Default value
              'kmeans'.

       --verbose (-v) [bool]
              Display informational messages and the full list of parameters and timers at the end of execution.

       --version (-V) [bool]
              Display the version of mlpack.

OPTIONAL OUTPUT OPTIONS

       --output_file (-o) [unknown]
              Matrix to save modified dataset to.

ADDITIONAL INFORMATION

       For  further  information,  including  relevant  papers, citations, and theory, consult the documentation
       found at http://www.mlpack.org or included with your distribution of mlpack.

mlpack-4.5.1                                     29 January 2025                            mlpack_kernel_pca(1)