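I need to deploy an SVM on a target system that is built in C++. My plan is to train the SVM with dlib from python/numpy, serialize it, and evaluate it on the target system.
I find dlib's Python documentation rather opaque, so could someone provide a minimal example?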
    import dlib

    # My data in numpy
    feature_column_1 = np.array([-1, -2, -3, 1, 2, 3])
    feature_column_2 = np.array([1, 2, 3, -1, -2, -3])
    labels = np.array([True, True, True, False, False, False])

    # Features
    feature_vectors = dlib.vectors()
    for feature_column in [feature_column_1, feature_column_2]:
        feature_vectors.append(dlib.vector(feature_column.tolist()))

    # Labels
    labels_array = dlib.array(labels.tolist())

    # Train
    svm = dlib.svm_c_trainer_linear()
    svm.train(feature_vectors, labels_array)

    # Test
    y_probibilities = svm.predict(labels_array_new)
Training fails with the following error:
    ---> 18 svm.train(vectors, array)
    ValueError: Invalid inputs
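Answer:
I just added an official example for this to dlib; when I went looking I was surprised to find that one wasn't already included. You can find it here: https://github.com/davisking/dlib/blob/master/python_examples/svm_binary_classifier.py. The relevant details are: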
    import dlib
    import pickle

    x = dlib.vectors()
    y = dlib.array()

    # Make a training dataset. Here we have just two training examples. Normally
    # you would use a much larger training dataset, but for the purpose of example
    # this is plenty. For binary classification, the y labels should all be either +1 or -1.
    x.append(dlib.vector([1, 2, 3, -1, -2, -3]))
    y.append(+1)

    x.append(dlib.vector([-1, -2, -3, 1, 2, 3]))
    y.append(-1)

    # Now make a training object. This object is responsible for turning a
    # training dataset into a prediction model. This one here is a SVM trainer
    # that uses a linear kernel. If you wanted to use a RBF kernel or histogram
    # intersection kernel you could change it to one of these lines:
    #   svm = dlib.svm_c_trainer_histogram_intersection()
    #   svm = dlib.svm_c_trainer_radial_basis()
    svm = dlib.svm_c_trainer_linear()
    svm.be_verbose()
    svm.set_c(10)

    # Now train the model. The return value is the trained model capable of making predictions.
    classifier = svm.train(x, y)

    # Now run the model on our data and look at the results.
    print("prediction for first sample:  {}".format(classifier(x[0])))
    print("prediction for second sample: {}".format(classifier(x[1])))

    # Classifier models can also be pickled in the same way as any other Python object.
    with open('saved_model.pickle', 'wb') as handle:
        pickle.dump(classifier, handle)
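To connect this back to the data in the question: a plausible reading of the `Invalid inputs` error is that the question appends one `dlib.vector` per feature column (two vectors) while supplying six labels, and that the labels are booleans rather than +1/-1. The sketch below shows one way to reshape such data, under those assumptions. The helper names `columns_to_samples` and `bools_to_signed` are made up for illustration and are not part of dlib, and plain lists stand in for the numpy arrays (calling `.tolist()` on the arrays would give the same input). The dlib calls are the ones used in the example above and only run if dlib is installed.

```python
# Column-wise features, as in the question (plain lists stand in for numpy arrays).
feature_column_1 = [-1, -2, -3, 1, 2, 3]
feature_column_2 = [1, 2, 3, -1, -2, -3]
labels = [True, True, True, False, False, False]


def columns_to_samples(*columns):
    """Hypothetical helper: turn per-feature columns into one row per sample."""
    return [[float(value) for value in row] for row in zip(*columns)]


def bools_to_signed(bool_labels):
    """Hypothetical helper: map True/False to +1/-1 labels."""
    return [1.0 if flag else -1.0 for flag in bool_labels]


samples = columns_to_samples(feature_column_1, feature_column_2)
signed_labels = bools_to_signed(labels)

try:
    import dlib
except ImportError:
    dlib = None  # the data preparation above still works without dlib

if dlib is not None:
    x = dlib.vectors()
    y = dlib.array()
    # One dlib.vector per sample, paired with exactly one label.
    for sample, label in zip(samples, signed_labels):
        x.append(dlib.vector(sample))
        y.append(label)

    classifier = dlib.svm_c_trainer_linear().train(x, y)
    print("prediction for first sample: {}".format(classifier(x[0])))
```

With this layout there are six two-dimensional samples and six labels, so the sample count and the label count agree before `train` is called.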
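However, if you want to use C++, you should just use C++. dlib is primarily a C++ library, not a Python library. The whole point of dlib is to give people who want to do machine learning a friendly C++ API, so you would be better off doing the training directly in C++. dlib ships with 99 complete C++ examples and full C++ API documentation. For instance, here is a relevant example: http://dlib.net/svm_c_ex.cpp.html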
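I should really emphasize that dlib's C++ API is much more flexible than the Python API. dlib exists to make machine learning in C++ easy, and its Python API is an afterthought. Many of dlib's features are expressed with C++ templates and similar constructs that have no counterpart in Python, so those features are not exposed through the Python bindings. So if you want to use C++, use C++. If you know how to write C++, there is no reason to use the Python API.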