Testing with quantitative tracking of results?

There is an algorithm that is constantly being improved (or not), and after each change I want to test it on a few reference samples. As I understand it, this is what a Continuous Integration system is for: it builds the project, runs the tests, and reports the results.

And I understand how this works for classical testing, where the algorithm either passes or fails (1 or 0).
In my case it is a computer vision algorithm, and it can score 100%, 0%, 66%, or 45.6%.
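For illustration, the crudest way to squeeze such a fractional score into the binary model is to threshold it inside an ordinary test. A minimal pytest-style sketch, where `run_algorithm_on_sample` and the 60% cut-off are made-up placeholders, not parts of any real tool:

```python
# Minimal sketch: a fractional score forced into the binary pass/fail
# model that classical CI testing expects.
def run_algorithm_on_sample() -> float:
    # Placeholder: a real run would evaluate the CV algorithm on a
    # reference sample and return a score in [0, 1].
    return 0.456

def test_accuracy_above_threshold():
    score = run_algorithm_on_sample()
    assert score >= 0.60, f"accuracy {score:.1%} is below the 60% threshold"
```

This keeps CI green/red semantics but throws away the actual number, which is exactly the limitation described above.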

I haven't found any tools that account for quantitative results like these. Maybe I am misunderstanding something, or I have missed something.

Please share advice or any best practices on this topic.

Thank you
October 3rd 19 at 02:16
1 answer
October 3rd 19 at 02:18

Only hardcore: develop your own metrics and classify the test data into groups. Use expert evaluation of the data (people comparing the algorithm's output against expert opinion), and compare each run against the previous metrics and against the best result for every dataset and class. The framework should be able to recompute all the metrics for previous versions of the algorithm as well as the new one, because the quality-evaluation system itself will be reworked regularly. Trying to reduce the evaluation to a binary pass/fail gives you "the average temperature across the hospital". I'm fairly sure you won't find ready-made solutions; the task is too atypical for a mass-market product. At least I haven't found any.
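To make that concrete, here is a minimal sketch in Python of such a framework's core loop: evaluate the current version on each reference dataset, compare the score with the stored per-dataset best, and fail the CI job on a regression. Everything here (`evaluate`, `baseline.json`, the dataset names, the tolerance) is a hypothetical placeholder, not an existing tool:

```python
import json
import sys
from pathlib import Path

BASELINE_FILE = Path("baseline.json")  # hypothetical store of per-dataset best scores
TOLERANCE = 0.005                      # allowed drop before the build is failed

def evaluate(dataset_name: str) -> float:
    # Placeholder: replace with a real run of the algorithm on the
    # reference dataset; it should return a score in [0, 1], e.g. mean IoU.
    placeholder_scores = {"street_day": 0.87, "street_night": 0.66, "indoor": 0.456}
    return placeholder_scores[dataset_name]

def main() -> int:
    baseline = json.loads(BASELINE_FILE.read_text()) if BASELINE_FILE.exists() else {}
    regressed = False
    for name in ["street_day", "street_night", "indoor"]:
        score = evaluate(name)
        best = baseline.get(name, 0.0)
        ok = score >= best - TOLERANCE
        regressed |= not ok
        print(f"{name}: {score:.3f} (best so far {best:.3f}) {'OK' if ok else 'REGRESSION'}")
        baseline[name] = max(best, score)  # remember the best result per dataset
    BASELINE_FILE.write_text(json.dumps(baseline, indent=2))
    return 1 if regressed else 0  # non-zero exit code makes the CI step fail

if __name__ == "__main__":
    sys.exit(main())
```

In a real setup you would archive `baseline.json` per branch and extend the report with per-class breakdowns, as described above.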

Also, think about how to present the evaluation graphically: even 20 test clips with 3-4 numeric metrics each make the question "how is it doing?" trigger a migraine, because the numbers move in all directions at once, and an unambiguous improvement is an extremely rare event. - boris_Bernhard commented on October 3rd 19 at 02:21
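A possible starting point for the graphical part, sketched with matplotlib; the version history below is made-up illustrative data that would in practice come from the stored per-clip baselines:

```python
import matplotlib.pyplot as plt

# Illustrative made-up history: one metric per test clip across
# algorithm versions; in practice this comes from the metric store.
history = {
    "clip_01": [0.61, 0.64, 0.63, 0.70],
    "clip_02": [0.45, 0.44, 0.52, 0.50],
    "clip_03": [0.80, 0.81, 0.79, 0.83],
}
versions = ["v1", "v2", "v3", "v4"]

for clip, scores in history.items():
    plt.plot(versions, scores, marker="o", label=clip)
plt.xlabel("algorithm version")
plt.ylabel("metric (e.g. mean IoU)")
plt.title("Per-clip metric across versions")
plt.legend()
plt.show()
```

One line per clip makes the "improvement in one place, regression in another" pattern visible at a glance, which is exactly the situation the comment describes.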

Tags: Continuous integration, Software testing