OTB.TrainImagesRegression: Train a regression model from multiple triplets of predictor images, label images and training vector data.

Train a regression model from multiple triplets of predictor images, label images and training vector data. The training vector data must contain polygons corresponding to the input sampling positions. This data is used to extract samples using pixel values in each band of the predictor image and the corresponding ground truth extracted from the label image. If no training vector data is provided, the samples will be extracted on the full image extent. At the end of the application, the mean square error between ground truth and predicted values is computed using the output model and the validation vector data. Note that if no validation data is given, the training data will be used for validation. The number of training and validation samples can be specified with parameters. If no size is given, all samples will be used. This application is based on LibSVM, OpenCV Machine Learning, and Shark ML. The output of this application is a text model file, whose format corresponds to the ML model type chosen. There is no image or vector data output.

Inputs

A list of input predictor images.

format
href
Please set a value for io.

A list of input label images.

format
href
Please set a value for io.ip.

A list of vector data to select the training samples.

format
href
Please set a value for io.vd.

A list of vector data to select the validation samples.

format
href
Please set a value for io.valid.

XML file containing mean and variance of each feature.

format
href
Please set a value for io.imstat.

Number of training samples.

integer

Number of validation samples.

integer

Ratio between training and validation samples.

number
Please set a value for sample.ratio.

Type of sampling (periodic, pattern based, random)

string
Please set a value for sample.type.

Jitter amplitude added during sample selection (0 = no jitter)

integer

Set a specific random seed with integer value.

integer

Available memory for processing (in MB).

integer

This parameter allows selecting a directory containing Digital Elevation Model files. Note that this directory should contain only DEM files. Unexpected behaviour might occur if other images are found in this directory.

string

Use a geoid grid to get the height above the ellipsoid in case there is no DEM available, no coverage for some points or pixels with no_data in the DEM tiles. A version of the geoid can be found on the OTB website (https://gitlab.orfeo-toolbox.org/orfeotoolbox/otb-data/blob/master/Input/DEM/egm96.grd).

format
href
Please set a value for elev.geoid.

This parameter allows setting the default height above ellipsoid when there is no DEM available, no coverage for some points or pixels with no_data in the DEM tiles, and no geoid file has been set. This is also used by some applications as an average elevation value.

number
Please set a value for elev.default.

Choice of the classifier to use for the training.

string
Please set a value for classifier.

SVM Kernel Type.

string
Please set a value for classifier.libsvm.k.

Type of SVM formulation.

string
Please set a value for classifier.libsvm.m.

SVM models have a cost parameter C (1 by default) to control the trade-off between training errors and forcing rigid margins.

number
Please set a value for classifier.libsvm.c.

Cost parameter Nu, in the range 0..1. The larger the value, the smoother the decision.

number
Please set a value for classifier.libsvm.nu.

The distance between feature vectors from the training set and the fitting hyper-plane must be less than Epsilon. For outliers, the penalty multiplier is set by C.

number
Please set a value for classifier.libsvm.opt.
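The roles of Epsilon and C described above can be illustrated with the textbook epsilon-insensitive loss used in support vector regression. This is a minimal sketch, not LibSVM's or OTB's internal code. The function names are hypothetical, and the default values 0.001 and 1 are the defaults listed in the process description for classifier.libsvm.opt and classifier.libsvm.c.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.001):
    """Loss for one sample: zero inside the epsilon tube, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_penalty(y_true, y_pred, epsilon=0.001, c=1.0):
    """Sum of the per-sample losses, scaled by the cost parameter C."""
    total = sum(
        epsilon_insensitive_loss(t, p, epsilon) for t, p in zip(y_true, y_pred)
    )
    return c * total
```

A sample whose prediction lies within Epsilon of the ground truth contributes nothing. A sample outside the tube is penalised in proportion to C.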

The training algorithm attempts to split each node while its depth is smaller than the maximum possible depth of the tree. The actual depth may be smaller if the other termination criteria are met, and/or if the tree is pruned.

integer
Please set a value for classifier.dt.max.

If the number of samples in a node is smaller than this parameter, then this node will not be split.

integer
Please set a value for classifier.dt.min.

If all absolute differences between an estimated value in a node and the values of the train samples in this node are smaller than this regression accuracy parameter, then the node will not be split further.

number
Please set a value for classifier.dt.ra.
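The three decision tree termination criteria described above (maximum depth, minimum sample count, regression accuracy) can be combined as in the following sketch. It assumes that the estimated value of a node is the mean of its training values, which is an assumption of this example and not stated in the description. The function name is hypothetical, and the defaults (10, 10, 0.01) are taken from the process description.

```python
def should_stop_splitting(node_values, depth, max_depth=10,
                          min_samples=10, regression_accuracy=0.01):
    """Return True if a node should become a leaf under the three criteria."""
    # Criterion 1: the node has reached the maximum allowed depth.
    if depth >= max_depth:
        return True
    # Criterion 2: too few samples in the node to justify a split.
    if len(node_values) < min_samples:
        return True
    # Criterion 3: every sample is already within the requested accuracy
    # of the node estimate (assumed here to be the mean).
    estimate = sum(node_values) / len(node_values)
    return all(abs(v - estimate) < regression_accuracy for v in node_values)
```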

Cluster possible values of a categorical variable into K <= cat clusters to find a suboptimal split.

integer
Please set a value for classifier.dt.cat.

Type of training method for the multilayer perceptron (MLP) neural network.

string
Please set a value for classifier.dt.r.

The number of neurons in each intermediate layer (excluding input and output layers).

string
Please set a value for classifier.ann.sizes.

This function determines whether the output of the node is positive or not depending on the output of the transfer function.

string
Please set a value for classifier.ann.f.

Alpha parameter of the activation function (used only with sigmoid and gaussian functions).

number
Please set a value for classifier.ann.a.

Beta parameter of the activation function (used only with sigmoid and gaussian functions).

number
Please set a value for classifier.ann.b.
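To show where Alpha and Beta enter, here is a sketch of the three activation choices (ident, sig, gau). The formulas are assumed to be the symmetric sigmoid and Gaussian forms commonly documented for OpenCV's multilayer perceptron. The process description itself does not give them, so treat them as an assumption. The function names are hypothetical.

```python
import math


def identity(x, alpha=1.0, beta=1.0):
    """Identity activation: alpha and beta are unused."""
    return x


def symmetric_sigmoid(x, alpha=1.0, beta=1.0):
    """beta * (1 - exp(-alpha * x)) / (1 + exp(-alpha * x))."""
    # math.exp can overflow when alpha * x is a very large negative number;
    # this sketch does not guard against that.
    e = math.exp(-alpha * x)
    return beta * (1.0 - e) / (1.0 + e)


def gaussian(x, alpha=1.0, beta=1.0):
    """beta * exp(-alpha * x^2)."""
    return beta * math.exp(-alpha * x * x)


# Keys match the enum values of classifier.ann.f in the process description.
ACTIVATIONS = {"ident": identity, "sig": symmetric_sigmoid, "gau": gaussian}
```

With Alpha = Beta = 1 the symmetric sigmoid equals tanh(x / 2), so its output ranges over (-1, 1).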

Strength of the weight gradient term in the BACKPROP method. The recommended value is about 0.1.

number
Please set a value for classifier.ann.bpdw.

Strength of the momentum term (the difference between weights on the 2 previous iterations). This parameter provides some inertia to smooth the random fluctuations of the weights. It can vary from 0 (the feature is disabled) to 1 and beyond. The value 0.1 or so is good enough.

number
Please set a value for classifier.ann.bpms.
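The interaction between the weight gradient term and the momentum term can be sketched with the textbook momentum update. This is an illustration under that assumption and not the exact update performed by the underlying library. The function name is hypothetical, and both defaults (0.1) come from the process description.

```python
def backprop_step(weights, gradients, previous_updates,
                  dw_scale=0.1, moment_scale=0.1):
    """Apply one gradient step with momentum to a flat list of weights.

    dw_scale plays the role of classifier.ann.bpdw and moment_scale the
    role of classifier.ann.bpms.
    """
    updates = [
        -dw_scale * g + moment_scale * p
        for g, p in zip(gradients, previous_updates)
    ]
    new_weights = [w + u for w, u in zip(weights, updates)]
    # The updates are returned so they can be fed into the next iteration.
    return new_weights, updates
```

Setting moment_scale to 0 disables the inertia: each update then depends only on the current gradient.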

Initial value Delta_0 of update-values Delta_

number
Please set a value for classifier.ann.rdw.

Update-values lower limit Delta_

number
Please set a value for classifier.ann.rdwm.

Termination criteria.

string
Please set a value for classifier.ann.term.

Epsilon value used in the Termination criteria.

number
Please set a value for classifier.ann.eps.

Maximum number of iterations used in the Termination criteria.

integer
Please set a value for classifier.ann.iter.

The depth of the tree. A low value will likely underfit and conversely a high value will likely overfit. The optimal value can be obtained using cross validation or other suitable methods.

integer
Please set a value for classifier.rf.max.

If the number of samples in a node is smaller than this parameter, then the node will not be split. A reasonable value is a small percentage of the total data e.g. 1 percent.

integer
Please set a value for classifier.rf.min.

If all absolute differences between an estimated value in a node and the values of the train samples in this node are smaller than this regression accuracy parameter, then the node will not be split.

number
Please set a value for classifier.rf.ra.

Cluster possible values of a categorical variable into K <= cat clusters to find a suboptimal split.

integer
Please set a value for classifier.rf.cat.

The size of the subset of features, randomly selected at each tree node, that are used to find the best split(s). If you set it to 0, then the size will be set to the square root of the total number of features.

integer
Please set a value for classifier.rf.var.
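The rule for the feature subset size can be written as a small helper. This is a sketch: the rounding of the square root and the clamping to the number of available features are assumptions of this example, and the function name is hypothetical.

```python
import math


def feature_subset_size(total_features, var=0):
    """Number of features tried at each node for a given classifier.rf.var."""
    if var == 0:
        # 0 means "use the square root of the total number of features".
        return max(1, round(math.sqrt(total_features)))
    # Assumed: never ask for more features than exist.
    return min(var, total_features)
```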

The maximum number of trees in the forest. Typically, the more trees you have, the better the accuracy. However, the improvement in accuracy generally diminishes and reaches an asymptote for a certain number of trees. Also keep in mind that increasing the number of trees increases the prediction time linearly.

integer
Please set a value for classifier.rf.nbtrees.

Sufficient accuracy (OOB error).

number
Please set a value for classifier.rf.acc.

The number of neighbors to use.

integer
Please set a value for classifier.knn.k.

Decision rule for regression output

string
Please set a value for classifier.knn.rule.
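The effect of the number of neighbors and of the decision rule (mean or median) can be seen in a brute-force k-nearest-neighbors regressor. This is a minimal sketch that assumes Euclidean distance, not the implementation used by the application. The function name is hypothetical, and the defaults (32, mean) come from the process description.

```python
import math
import statistics


def knn_regress(train_features, train_targets, query, k=32, rule="mean"):
    """Predict a value for query from its k nearest training samples."""
    # Pair each training target with its distance to the query, nearest first.
    by_distance = sorted(
        (math.dist(f, query), t) for f, t in zip(train_features, train_targets)
    )
    nearest = [t for _, t in by_distance[:k]]
    if rule == "mean":
        return statistics.fmean(nearest)
    if rule == "median":
        return statistics.median(nearest)
    raise ValueError(f"unknown rule: {rule!r}")
```

The median rule is less sensitive to a single outlying neighbor than the mean rule.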

If activated, the application will try to clean all temporary files it created

format
href
Please set a value for cleanup.

Outputs

Output file containing the model estimated (.txt format).

format
transmission

Mean square error computed using the validation dataset

transmission
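For reference, the mean square error reported by this output follows the usual definition: the average of the squared differences between ground truth and predicted values. The helper below is an illustrative sketch with a hypothetical name, not the application's code.

```python
def mean_square_error(ground_truth, predicted):
    """Average of squared differences between two equal-length sequences."""
    if len(ground_truth) != len(predicted):
        raise ValueError("ground_truth and predicted must have the same length")
    if not ground_truth:
        raise ValueError("at least one validation sample is required")
    squared = [(g - p) ** 2 for g, p in zip(ground_truth, predicted)]
    return sum(squared) / len(squared)
```

When no validation data is supplied, the value is computed on the training samples and is therefore likely to be optimistic.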

Execution options

successUri
inProgressUri
failedUri

format

mode
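As an illustration of how the inputs and outputs listed above fit together, the sketch below builds a JSON body for an execute request in the style of OGC API Processes. It only constructs and prints the body and sends nothing. The image URLs are hypothetical placeholders, the helper name is an assumption, and the exact body accepted by this server (for example whether asynchronous execution is selected with a mode field or with a Prefer: respond-async header) should be checked against the deployed version.

```python
import json

# Execution endpoint as published in the process description links.
EXECUTE_URL = (
    "http://tb17.geolabs.fr:8090/ogc-api/processes/"
    "OTB.TrainImagesRegression/execution"
)
# Allowed values of the "classifier" input, from its schema enum.
ALLOWED_CLASSIFIERS = ("libsvm", "dt", "ann", "rf", "knn")


def build_execute_request(predictor_urls, label_urls, classifier="libsvm",
                          extra_inputs=None):
    """Build an execute request body (hypothetical helper)."""
    if classifier not in ALLOWED_CLASSIFIERS:
        raise ValueError(f"unsupported classifier: {classifier!r}")
    if not predictor_urls or not label_urls:
        raise ValueError("at least one predictor and one label image are required")
    inputs = {
        # Image inputs are passed by reference with their media type.
        "io": [{"href": u, "type": "image/tiff"} for u in predictor_urls],
        "io.ip": [{"href": u, "type": "image/tiff"} for u in label_urls],
        "classifier": classifier,
    }
    inputs.update(extra_inputs or {})
    return {
        "inputs": inputs,
        "outputs": {
            "io.out": {"transmissionMode": "reference"},
            "io.mse": {"transmissionMode": "value"},
        },
    }


request_body = build_execute_request(
    ["https://example.com/predictors.tif"],  # hypothetical URL
    ["https://example.com/labels.tif"],      # hypothetical URL
    classifier="rf",
    extra_inputs={"classifier.rf.nbtrees": 50, "sample.ratio": 0.5},
)
print(json.dumps(request_body, indent=2))
```

The resulting body would be sent with an HTTP POST to EXECUTE_URL using the application/json content type.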

Execute End Point

View the execution endpoint of a process.

View the alternative version in HTML.

{"id": "OTB.TrainImagesRegression", "title": "Train a regression model from multiple triplets of feature images, predictor images and training vector data.", "description": "Train a classifier from multiple triplets of predictor images, label images and training vector data. The training vector data must contain polygons corresponding to the input sampling positions. This data is used to extract samples using pixel values in each band of the predictor image and the corresponding ground truth extracted from the lagel image. If no training vector data is provided, the samples will be extracted on the full image extent.At the end of the application, the mean square error between groundtruth and predicted values is computed using the output model and the validation vector data. Note that if no validation data is given, the training data will be used for validation.The number of training and validation samples can be specified with parameters. If no size is given, all samples will be used. This application is based on LibSVM, OpenCV Machine Learning, and Shark ML. The output of this application is a text model file, whose format corresponds to the ML model type chosen. 
There is no image nor vector data output.", "version": "1.0.0", "jobControlOptions": ["sync-execute", "async-execute", "dismiss"], "outputTransmission": ["value", "reference"], "links": [{"rel": "execute", "type": "application/json", "title": "Execute End Point", "href": "http://tb17.geolabs.fr:8090/ogc-api/processes/OTB.TrainImagesRegression/execution"}, {"rel": "alternate", "type": "text/html", "title": "Execute End Point", "href": "http://tb17.geolabs.fr:8090/ogc-api/processes/OTB.TrainImagesRegression/execution.html"}], "inputs": {"io": {"title": "A list of input predictor images.", "description": "A list of input predictor images.", "maxOccurs": 1024, "extentded-schema": {"type": "array", "minItems": 1, "maxItems": 1024, "items": {"oneOf": [{"allOf": [{"$ref": "http://zoo-project.org/dl/link.json"}, {"type": "object", "properties": {"type": {"enum": ["image/tiff", "image/jpeg", "image/png"]}}}]}, {"type": "object", "required": ["value"], "properties": {"value": {"oneOf": [{"type": "string", "contentEncoding": "base64", "contentMediaType": "image/tiff"}, {"type": "string", "contentEncoding": "base64", "contentMediaType": "image/jpeg"}, {"type": "string", "contentEncoding": "base64", "contentMediaType": "image/png"}]}}}]}}, "schema": {"oneOf": [{"type": "string", "contentEncoding": "base64", "contentMediaType": "image/tiff"}, {"type": "string", "contentEncoding": "base64", "contentMediaType": "image/jpeg"}, {"type": "string", "contentEncoding": "base64", "contentMediaType": "image/png"}]}, "id": "io"}, "io.ip": {"title": "A list of input label images.", "description": "A list of input label images.", "maxOccurs": 1024, "extentded-schema": {"type": "array", "minItems": 1, "maxItems": 1024, "items": {"oneOf": [{"allOf": [{"$ref": "http://zoo-project.org/dl/link.json"}, {"type": "object", "properties": {"type": {"enum": ["image/tiff", "image/jpeg", "image/png"]}}}]}, {"type": "object", "required": ["value"], "properties": {"value": {"oneOf": [{"type": "string", 
"contentEncoding": "base64", "contentMediaType": "image/tiff"}, {"type": "string", "contentEncoding": "base64", "contentMediaType": "image/jpeg"}, {"type": "string", "contentEncoding": "base64", "contentMediaType": "image/png"}]}}}]}}, "schema": {"oneOf": [{"type": "string", "contentEncoding": "base64", "contentMediaType": "image/tiff"}, {"type": "string", "contentEncoding": "base64", "contentMediaType": "image/jpeg"}, {"type": "string", "contentEncoding": "base64", "contentMediaType": "image/png"}]}, "id": "io.ip"}, "io.vd": {"title": "A list of vector data to select the training samples.", "description": "A list of vector data to select the training samples.", "maxOccurs": 1024, "extentded-schema": {"type": "array", "minItems": 0, "maxItems": 1024, "items": {"oneOf": [{"allOf": [{"$ref": "http://zoo-project.org/dl/link.json"}, {"type": "object", "properties": {"type": {"enum": ["text/xml", "application/vnd.google-earth.kml+xml", "application/json", "application/zip"]}}}]}, {"type": "object", "required": ["value"], "properties": {"value": {"oneOf": [{"type": "string", "contentEncoding": "utf-8", "contentMediaType": "text/xml"}, {"type": "string", "contentEncoding": "utf-8", "contentMediaType": "application/vnd.google-earth.kml+xml"}, {"type": "object"}, {"type": "string", "contentEncoding": "base64", "contentMediaType": "application/zip"}]}}}]}, "nullable": true}, "schema": {"oneOf": [{"type": "string", "contentEncoding": "utf-8", "contentMediaType": "text/xml"}, {"type": "string", "contentEncoding": "utf-8", "contentMediaType": "application/vnd.google-earth.kml+xml"}, {"type": "object"}, {"type": "string", "contentEncoding": "base64", "contentMediaType": "application/zip"}]}, "id": "io.vd"}, "io.valid": {"title": "A list of vector data to select the validation samples.", "description": "A list of vector data to select the validation samples.", "maxOccurs": 1024, "extentded-schema": {"type": "array", "minItems": 0, "maxItems": 1024, "items": {"oneOf": [{"allOf": 
[{"$ref": "http://zoo-project.org/dl/link.json"}, {"type": "object", "properties": {"type": {"enum": ["text/xml", "application/vnd.google-earth.kml+xml", "application/json", "application/zip"]}}}]}, {"type": "object", "required": ["value"], "properties": {"value": {"oneOf": [{"type": "string", "contentEncoding": "utf-8", "contentMediaType": "text/xml"}, {"type": "string", "contentEncoding": "utf-8", "contentMediaType": "application/vnd.google-earth.kml+xml"}, {"type": "object"}, {"type": "string", "contentEncoding": "base64", "contentMediaType": "application/zip"}]}}}]}, "nullable": true}, "schema": {"oneOf": [{"type": "string", "contentEncoding": "utf-8", "contentMediaType": "text/xml"}, {"type": "string", "contentEncoding": "utf-8", "contentMediaType": "application/vnd.google-earth.kml+xml"}, {"type": "object"}, {"type": "string", "contentEncoding": "base64", "contentMediaType": "application/zip"}]}, "id": "io.valid"}, "io.imstat": {"title": "XML file containing mean and variance of each feature.", "description": "XML file containing mean and variance of each feature.", "maxOccurs": 1, "extentded-schema": {"oneOf": [{"allOf": [{"$ref": "http://zoo-project.org/dl/link.json"}, {"type": "object", "properties": {"type": {"enum": ["text/xml"]}}}]}, {"type": "object", "required": ["value"], "properties": {"value": {"oneOf": [{"type": "string", "contentEncoding": "utf-8", "contentMediaType": "text/xml"}]}}}], "nullable": true}, "schema": {"oneOf": [{"type": "string", "contentEncoding": "utf-8", "contentMediaType": "text/xml"}]}, "id": "io.imstat"}, "sample": {"title": "Number of training samples.", "description": "Number of training samples.", "maxOccurs": 1, "schema": {"type": "integer", "nullable": true}, "id": "sample"}, "sample.nv": {"title": "Number of validation samples.", "description": "Number of validation samples.", "maxOccurs": 1, "schema": {"type": "integer", "nullable": true}, "id": "sample.nv"}, "sample.ratio": {"title": "Ratio between training and 
validation samples.", "description": "Ratio between training and validation samples.", "maxOccurs": 1, "schema": {"type": "number", "default": 0.5}, "id": "sample.ratio"}, "sample.type": {"title": "Type of sampling (periodic, pattern based, random)", "description": "Type of sampling (periodic, pattern based, random)", "maxOccurs": 1, "schema": {"type": "string", "default": "periodic", "enum": ["periodic", "random"]}, "id": "sample.type"}, "sample.type.periodic.jitter": {"title": "Jitter amplitude added during sample selection (0 = no jitter)", "description": "Jitter amplitude added during sample selection (0 = no jitter)", "maxOccurs": 1, "schema": {"type": "integer", "default": 0, "nullable": true}, "id": "sample.type.periodic.jitter"}, "rand": {"title": "Set a specific random seed with integer value.", "description": "Set a specific random seed with integer value.", "maxOccurs": 1, "schema": {"type": "integer", "nullable": true}, "id": "rand"}, "ram": {"title": "Available memory for processing (in MB).", "description": "Available memory for processing (in MB).", "maxOccurs": 1, "schema": {"type": "integer", "default": 256, "nullable": true}, "id": "ram"}, "elev": {"title": "This parameter allows selecting a directory containing Digital Elevation Model files. Note that this directory should contain only DEM files. Unexpected behaviour might occurs if other images are found in this directory.", "description": "This parameter allows selecting a directory containing Digital Elevation Model files. Note that this directory should contain only DEM files. Unexpected behaviour might occurs if other images are found in this directory.", "maxOccurs": 1, "schema": {"type": "string", "default": "Any value", "nullable": true}, "id": "elev"}, "elev.geoid": {"title": "Use a geoid grid to get the height above the ellipsoid in case there is no DEM available, no coverage for some points or pixels with no_data in the DEM tiles. 
A version of the geoid can be found on the OTB website(https://gitlab.orfeo-toolbox.org/orfeotoolbox/otb-data/blob/master/Input/DEM/egm96.grd).", "description": "Use a geoid grid to get the height above the ellipsoid in case there is no DEM available, no coverage for some points or pixels with no_data in the DEM tiles. A version of the geoid can be found on the OTB website(https://gitlab.orfeo-toolbox.org/orfeotoolbox/otb-data/blob/master/Input/DEM/egm96.grd).", "maxOccurs": 1, "extentded-schema": {"oneOf": [{"allOf": [{"$ref": "http://zoo-project.org/dl/link.json"}, {"type": "object", "properties": {"type": {"enum": ["application/octet-stream"]}}}]}, {"type": "object", "required": ["value"], "properties": {"value": {"oneOf": [{"type": "string", "contentEncoding": "base64", "contentMediaType": "application/octet-stream"}]}}}], "nullable": true}, "schema": {"oneOf": [{"type": "string", "contentEncoding": "base64", "contentMediaType": "application/octet-stream"}]}, "id": "elev.geoid"}, "elev.default": {"title": "This parameter allows setting the default height above ellipsoid when there is no DEM available, no coverage for some points or pixels with no_data in the DEM tiles, and no geoid file has been set. This is also used by some application as an average elevation value.", "description": "This parameter allows setting the default height above ellipsoid when there is no DEM available, no coverage for some points or pixels with no_data in the DEM tiles, and no geoid file has been set. 
This is also used by some application as an average elevation value.", "maxOccurs": 1, "schema": {"type": "number", "default": 0}, "id": "elev.default"}, "classifier": {"title": "Choice of the classifier to use for the training.", "description": "Choice of the classifier to use for the training.", "maxOccurs": 1, "schema": {"type": "string", "default": "libsvm", "enum": ["libsvm", "dt", "ann", "rf", "knn"]}, "id": "classifier"}, "classifier.libsvm.k": {"title": "SVM Kernel Type.", "description": "SVM Kernel Type.", "maxOccurs": 1, "schema": {"type": "string", "default": "linear", "enum": ["linear", "rbf", "poly", "sigmoid"]}, "id": "classifier.libsvm.k"}, "classifier.libsvm.m": {"title": "Type of SVM formulation.", "description": "Type of SVM formulation.", "maxOccurs": 1, "schema": {"type": "string", "default": "epssvr", "enum": ["epssvr", "nusvr"]}, "id": "classifier.libsvm.m"}, "classifier.libsvm.c": {"title": "SVM models have a cost parameter C (1 by default) to control the trade-off between training errors and forcing rigid margins.", "description": "SVM models have a cost parameter C (1 by default) to control the trade-off between training errors and forcing rigid margins.", "maxOccurs": 1, "schema": {"type": "number", "default": 1}, "id": "classifier.libsvm.c"}, "classifier.libsvm.nu": {"title": "Cost parameter Nu, in the range 0..1, the larger the value, the smoother the decision.", "description": "Cost parameter Nu, in the range 0..1, the larger the value, the smoother the decision.", "maxOccurs": 1, "schema": {"type": "number", "default": 0.5}, "id": "classifier.libsvm.nu"}, "classifier.libsvm.opt": {"title": "The distance between feature vectors from the training set and the fitting hyper-plane must be less than Epsilon. For outliersthe penalty mutliplier is set by C.", "description": "The distance between feature vectors from the training set and the fitting hyper-plane must be less than Epsilon. 
For outliersthe penalty mutliplier is set by C.", "maxOccurs": 1, "schema": {"type": "number", "default": 0.001}, "id": "classifier.libsvm.opt"}, "classifier.dt.max": {"title": "The training algorithm attempts to split each node while its depth is smaller than the maximum possible depth of the tree. The actual depth may be smaller if the other termination criteria are met, and/or if the tree is pruned.", "description": "The training algorithm attempts to split each node while its depth is smaller than the maximum possible depth of the tree. The actual depth may be smaller if the other termination criteria are met, and/or if the tree is pruned.", "maxOccurs": 1, "schema": {"type": "integer", "default": 10}, "id": "classifier.dt.max"}, "classifier.dt.min": {"title": "If the number of samples in a node is smaller than this parameter, then this node will not be split.", "description": "If the number of samples in a node is smaller than this parameter, then this node will not be split.", "maxOccurs": 1, "schema": {"type": "integer", "default": 10}, "id": "classifier.dt.min"}, "classifier.dt.ra": {"title": "If all absolute differences between an estimated value in a node and the values of the train samples in this node are smaller than this regression accuracy parameter, then the node will not be split further.", "description": "If all absolute differences between an estimated value in a node and the values of the train samples in this node are smaller than this regression accuracy parameter, then the node will not be split further.", "maxOccurs": 1, "schema": {"type": "number", "default": 0.01}, "id": "classifier.dt.ra"}, "classifier.dt.cat": {"title": "Cluster possible values of a categorical variable into K <= cat clusters to find a suboptimal split.", "description": "Cluster possible values of a categorical variable into K <= cat clusters to find a suboptimal split.", "maxOccurs": 1, "schema": {"type": "integer", "default": 10}, "id": "classifier.dt.cat"}, 
"classifier.dt.r": {"title": "Type of training method for the multilayer perceptron (MLP) neural network.", "description": "Type of training method for the multilayer perceptron (MLP) neural network.", "maxOccurs": 1, "schema": {"type": "string", "default": "reg", "enum": ["back", "reg"]}, "id": "classifier.dt.r"}, "classifier.ann.sizes": {"title": "The number of neurons in each intermediate layer (excluding input and output layers).", "description": "The number of neurons in each intermediate layer (excluding input and output layers).", "maxOccurs": 1024, "schema": {"type": "string", "default": "Any value"}, "id": "classifier.ann.sizes"}, "classifier.ann.f": {"title": "This function determine whether the output of the node is positive or not depending on the output of the transfert function.", "description": "This function determine whether the output of the node is positive or not depending on the output of the transfert function.", "maxOccurs": 1, "schema": {"type": "string", "default": "sig", "enum": ["ident", "sig", "gau"]}, "id": "classifier.ann.f"}, "classifier.ann.a": {"title": "Alpha parameter of the activation function (used only with sigmoid and gaussian functions).", "description": "Alpha parameter of the activation function (used only with sigmoid and gaussian functions).", "maxOccurs": 1, "schema": {"type": "number", "default": 1}, "id": "classifier.ann.a"}, "classifier.ann.b": {"title": "Beta parameter of the activation function (used only with sigmoid and gaussian functions).", "description": "Beta parameter of the activation function (used only with sigmoid and gaussian functions).", "maxOccurs": 1, "schema": {"type": "number", "default": 1}, "id": "classifier.ann.b"}, "classifier.ann.bpdw": {"title": "Strength of the weight gradient term in the BACKPROP method. The recommended value is about 0.1.", "description": "Strength of the weight gradient term in the BACKPROP method. 
The recommended value is about 0.1.", "maxOccurs": 1, "schema": {"type": "number", "default": 0.1}, "id": "classifier.ann.bpdw"}, "classifier.ann.bpms": {"title": "Strength of the momentum term (the difference between weights on the 2 previous iterations). This parameter provides some inertia to smooth the random fluctuations of the weights. It can vary from 0 (the feature is disabled) to 1 and beyond. The value 0.1 or so is good enough.", "description": "Strength of the momentum term (the difference between weights on the 2 previous iterations). This parameter provides some inertia to smooth the random fluctuations of the weights. It can vary from 0 (the feature is disabled) to 1 and beyond. The value 0.1 or so is good enough.", "maxOccurs": 1, "schema": {"type": "number", "default": 0.1}, "id": "classifier.ann.bpms"}, "classifier.ann.rdw": {"title": "Initial value Delta_0 of update-values Delta_", "description": "Initial value Delta_0 of update-values Delta_", "maxOccurs": 1, "schema": {"type": "number", "default": 0.1}, "id": "classifier.ann.rdw"}, "classifier.ann.rdwm": {"title": "Update-values lower limit Delta_", "description": "Update-values lower limit Delta_", "maxOccurs": 1, "schema": {"type": "number", "default": 1e-07}, "id": "classifier.ann.rdwm"}, "classifier.ann.term": {"title": "Termination criteria.", "description": "Termination criteria.", "maxOccurs": 1, "schema": {"type": "string", "default": "all", "enum": ["iter", "eps", "all"]}, "id": "classifier.ann.term"}, "classifier.ann.eps": {"title": "Epsilon value used in the Termination criteria.", "description": "Epsilon value used in the Termination criteria.", "maxOccurs": 1, "schema": {"type": "number", "default": 0.01}, "id": "classifier.ann.eps"}, "classifier.ann.iter": {"title": "Maximum number of iterations used in the Termination criteria.", "description": "Maximum number of iterations used in the Termination criteria.", "maxOccurs": 1, "schema": {"type": "integer", "default": 1000}, "id": 
"classifier.ann.iter"}, "classifier.rf.max": {"title": "The depth of the tree. A low value will likely underfit and conversely a high value will likely overfit. The optimal value can be obtained using cross validation or other suitable methods.", "description": "The depth of the tree. A low value will likely underfit and conversely a high value will likely overfit. The optimal value can be obtained using cross validation or other suitable methods.", "maxOccurs": 1, "schema": {"type": "integer", "default": 5}, "id": "classifier.rf.max"}, "classifier.rf.min": {"title": "If the number of samples in a node is smaller than this parameter, then the node will not be split. A reasonable value is a small percentage of the total data e.g. 1 percent.", "description": "If the number of samples in a node is smaller than this parameter, then the node will not be split. A reasonable value is a small percentage of the total data e.g. 1 percent.", "maxOccurs": 1, "schema": {"type": "integer", "default": 10}, "id": "classifier.rf.min"}, "classifier.rf.ra": {"title": "If all absolute differences between an estimated value in a node and the values of the train samples in this node are smaller than this regression accuracy parameter, then the node will not be split.", "description": "If all absolute differences between an estimated value in a node and the values of the train samples in this node are smaller than this regression accuracy parameter, then the node will not be split.", "maxOccurs": 1, "schema": {"type": "number", "default": 0}, "id": "classifier.rf.ra"}, "classifier.rf.cat": {"title": "Cluster possible values of a categorical variable into K <= cat clusters to find a suboptimal split.", "description": "Cluster possible values of a categorical variable into K <= cat clusters to find a suboptimal split.", "maxOccurs": 1, "schema": {"type": "integer", "default": 10}, "id": "classifier.rf.cat"}, "classifier.rf.var": {"title": "The size of the subset of features, randomly 
selected at each tree node, that are used to find the best split(s). If you set it to 0, then the size will be set to the square root of the total number of features.", "description": "The size of the subset of features, randomly selected at each tree node, that are used to find the best split(s). If you set it to 0, then the size will be set to the square root of the total number of features.", "maxOccurs": 1, "schema": {"type": "integer", "default": 0}, "id": "classifier.rf.var"}, "classifier.rf.nbtrees": {"title": "The maximum number of trees in the forest. Typically, the more trees you have, the better the accuracy. However, the improvement in accuracy generally diminishes and reaches an asymptote for a certain number of trees. Also to keep in mind, increasing the number of trees increases the prediction time linearly.", "description": "The maximum number of trees in the forest. Typically, the more trees you have, the better the accuracy. However, the improvement in accuracy generally diminishes and reaches an asymptote for a certain number of trees. 
Also to keep in mind, increasing the number of trees increases the prediction time linearly.", "maxOccurs": 1, "schema": {"type": "integer", "default": 100}, "id": "classifier.rf.nbtrees"}, "classifier.rf.acc": {"title": "Sufficient accuracy (OOB error).", "description": "Sufficient accuracy (OOB error).", "maxOccurs": 1, "schema": {"type": "number", "default": 0.01}, "id": "classifier.rf.acc"}, "classifier.knn.k": {"title": "The number of neighbors to use.", "description": "The number of neighbors to use.", "maxOccurs": 1, "schema": {"type": "integer", "default": 32}, "id": "classifier.knn.k"}, "classifier.knn.rule": {"title": "Decision rule for regression output", "description": "Decision rule for regression output", "maxOccurs": 1, "schema": {"type": "string", "default": "mean", "enum": ["mean", "median"]}, "id": "classifier.knn.rule"}, "cleanup": {"title": "If activated, the application will try to clean all temporary files it created", "description": "If activated, the application will try to clean all temporary files it created", "maxOccurs": 1, "extentded-schema": {"oneOf": [{"type": "object", "required": ["value"], "properties": {"value": {"oneOf": []}}}]}, "schema": {"oneOf": []}, "id": "cleanup"}}, "outputs": {"io.out": {"title": "Output file containing the model estimated (.txt format).", "description": "Output file containing the model estimated (.txt format).", "extentded-schema": {"oneOf": [{"allOf": [{"$ref": "http://zoo-project.org/dl/link.json"}, {"type": "object", "properties": {"type": {"enum": ["text/xml", "text/plain"]}}}]}, {"type": "object", "required": ["value"], "properties": {"value": {"oneOf": [{"type": "string", "contentEncoding": "utf-8", "contentMediaType": "text/xml"}, {"type": "string", "contentEncoding": "utf-8", "contentMediaType": "text/plain"}]}}}]}, "schema": {"oneOf": [{"type": "string", "contentEncoding": "utf-8", "contentMediaType": "text/xml"}, {"type": "string", "contentEncoding": "utf-8", "contentMediaType": 
"text/plain"}]}, "id": "io.out"}, "io.mse": {"title": "Mean square error computed using the validation dataset", "description": "Mean square error computed using the validation dataset", "schema": {"type": "number"}, "id": "io.mse"}}}

http://tb17.geolabs.fr:8090/ogc-api/processes/OTB.TrainImagesRegression.html
Last modified: Wed Jun 9 17:39:32 CEST 2021