Modeling Data: This tool creates models and predictors for Hadoop datasets. Using machine learning algorithms, it builds a model from the set of executions selected in the right panel. It then returns the model (which is kept in the system) together with the result of testing it against a sample of executions to check its accuracy. Our methodology uses 50% of the selected executions for training (tr), 25% for validating the model (tv), and 25% for testing it (tt).
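As a rough illustration of the 50/25/25 split described above, here is a minimal Python sketch. The function name `split_executions`, the fixed random seed, and the shuffling step are assumptions made for the example; the tool itself may select its samples differently.

```python
import random

def split_executions(executions, seed=0):
    """Shuffle the executions and split them 50% / 25% / 25%.

    Hypothetical helper: returns (tr, tv, tt) for training,
    validation and test, in that order.
    """
    shuffled = list(executions)
    random.Random(seed).shuffle(shuffled)  # reproducible shuffle
    n = len(shuffled)
    n_tr = n // 2   # 50% for training
    n_tv = n // 4   # 25% for validation
    tr = shuffled[:n_tr]
    tv = shuffled[n_tr:n_tr + n_tv]
    tt = shuffled[n_tr + n_tv:]  # remainder (about 25%) for test
    return tr, tv, tt

# Example with 100 placeholder execution ids
tr, tv, tt = split_executions(range(100))
print(len(tr), len(tv), len(tt))
```

When the number of executions is not divisible by four, the leftover executions go to the test set in this sketch.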
To use the tool, select in the right menu the set of executions that you want to model and the method used to train the model, and optionally choose whether the model should admit values not seen during training. Then check the quality of the model in the chart below. [MAE: Mean Absolute Error, RAE: Relative Absolute Error]
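For reference, the two error measures can be sketched in Python as follows. This page does not give the exact formulas the tool uses, so the definitions here are assumptions: MAE is the average absolute difference between real and predicted values, and RAE follows the common textbook form that divides the total absolute error by the total absolute error of always predicting the mean. The tool may compute RAE differently (for example, as an average of per-execution relative errors). The sample values are made up.

```python
def mae(real, predicted):
    """Mean Absolute Error: average of |real - predicted|."""
    return sum(abs(r - p) for r, p in zip(real, predicted)) / len(real)

def rae(real, predicted):
    """Relative Absolute Error (textbook form, assumed here):
    total absolute error divided by the total absolute error
    of a baseline that always predicts the mean of the real values.
    """
    mean_real = sum(real) / len(real)
    num = sum(abs(r - p) for r, p in zip(real, predicted))
    den = sum(abs(r - mean_real) for r in real)
    return num / den

# Made-up execution times in seconds
real = [100.0, 200.0, 300.0]
predicted = [110.0, 190.0, 330.0]
print(mae(real, predicted), rae(real, predicted))
```

With this definition, an RAE below 1 means the model does better than always predicting the mean execution time.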
|This tool will create Machine Learning models from the selected executions|
|1 -||Select from the Filters Box (right box):|
1) The executions to be modeled, by choosing the value constraints for each attribute (if no value is selected for an attribute, all of its values will be added to the combination)
2) The method that will be used to generate the model
3) Optionally, whether the generated model should accept new attribute values in the future (and attempt to predict for them) or fix the values, so that new attribute values are rejected when using this model
|2 -||Click on Learn Model, and wait until the data is processed and the model is created. Keep in mind that the bigger the selected data-set, the longer it can take to process.|
|3 -||Wait until the browser refreshes and processes the received data.|
|4 -||Results will appear as:|
a) A chart showing the real execution times vs. the predicted execution times. The closer a point lies (horizontally) to the x=y line, the better the prediction. Mispredictions and outliers will be far from the line.
b) The expected errors given by the model on training, validation and test (this is the important part).
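The choice in option 3 of step 1 (accepting or rejecting new attribute values) can be pictured with the following Python sketch. Everything in it is hypothetical: the function `check_execution`, the `known_values` dictionary and the attribute names are invented for illustration and do not describe the tool's internals.

```python
def check_execution(known_values, execution, accept_new_values):
    """Return the attribute values not seen during training.

    known_values: dict mapping attribute name -> set of values seen
    execution: dict mapping attribute name -> value to predict for
    accept_new_values: if False, unseen values are rejected
    """
    unseen = {
        attr: value
        for attr, value in execution.items()
        if value not in known_values.get(attr, set())
    }
    if unseen and not accept_new_values:
        # Fixed-values mode: refuse to predict for unknown values
        raise ValueError(f"unseen attribute values: {unseen}")
    return unseen

# Hypothetical attribute values seen during training
known = {"disk": {"HDD", "SSD"}, "net": {"ETH", "IB"}}
new_execution = {"disk": "NVMe", "net": "ETH"}

# Accepting mode: the unseen value is reported, prediction may proceed
print(check_execution(known, new_execution, accept_new_values=True))
```

In fixed-values mode the same call raises `ValueError`, which mirrors a model that rejects executions containing attribute values it was not trained on.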