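I am trying to iterate over some values while the length of the dataset S_train is less than or equal to a given number (11 in this case). This is what I have so far: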
S_new = train
T_new = test
mu_new = mu
mu_test_new = mu_test

while len(S_new) <= 11:
    ground_test = T_new[target].values.tolist()
    acquisition_function = abs(mu_test - ground_test)
    # step 3: value in test set that maximizes the abs difference of the energy
    max_item = np.argmax(acquisition_function)
    alpha_al = test.iloc[[max_item]]  # identify the minimum step in test set
    S_new = S_new.append(alpha_al)
    len(S_new)
    T_new = T_new.drop(test.index[max_item])
    len(T_new)

    gpr = GaussianProcessRegressor(
        # kernel is the covariance function of the Gaussian process (GP)
        kernel=Normalization(
            # kernel equals to normalization -> normalizes a kernel using the cosine of angle formula,
            # k_normalized(x,y) = k(x,y)/sqrt(k(x,x)*k(y,y))
            # graphdot.kernel.fix.Normalization(kernel), set kernel as marginalized graph kernel,
            # which is used to calculate the similarity between 2 graphs
            # implement the random walk-based graph similarity kernel as Kashima, H., Tsuda, K., &
            # Inokuchi, A. (2003). Marginalized kernels between labeled graphs. ICML
            Tang2019MolecularKernel()
        ),
        alpha=1e-4,  # value added to the diagonal of the kernel matrix during fitting
        optimizer=True,  # default optimizer of L-BFGS-B based on scipy.optimize.minimize
        # normalize the y values so that the mean and variance are 0 and 1, respectively.
        # Will be reversed when predictions are returned
        normalize_y=True,
        regularization='+',  # alpha (1e-4 in this case) is added to the diagonal of the kernel matrix
    )

    start_time = time.time()
    # Fitting train set as graphs (independent variable) with train[target] as dependent variable
    gpr.fit(S_new.graphs, S_new[target], repeat=1, verbose=True)
    end_time = time.time()
    print("the total time consumption is " + str(end_time - start_time) + ".")
    gpr.kernel.hyperparameters

    rmse_training = []
    rmse_test = []
    mu_new = gpr.predict(S_new.graphs)
    print('Training set')
    print('MAE:', np.mean(np.abs(S_new[target] - mu_new)))
    print('RMSE:', np.std(S_new[target] - mu_new))
    rmse_training.append(np.std(S_new[target] - mu_new)
    mu_test_new = gpr.predict(T_new.graphs)
    print('Training set')
    print('MAE:', np.mean(np.abs(T_new[target] - mu_test_new)))
    print('RMSE:', np.std(T_new[target] - mu_test_new))
    rmse_test.append(np.std(T_new[target] - mu_test_new)
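Basically, I compute the value in T_new that maximizes the absolute error between the i-th element of T_new and mu_test, add it to the S_train set, and then remove it from T_new. With the new S_train I train my model again and then repeat the steps above. I have never used a while loop before. I looked up the correct syntax and it seems right, but I get this error message: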
  File "<ipython-input-55-d284ca5f9d1f>", line 42
    mu_test_new = gpr.predict(T_new.graphs)
    ^
SyntaxError: invalid syntax
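Do you know what might be causing this? Any suggestions would be greatly appreciated. Thank you for your continued help.
Answer:
The problem is not the while loop; it is just a typo. Specifically, this line: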
rmse_training.append(np.std(S_new[target] - mu_new)
is missing a closing parenthesis.
If you try
rmse_training.append(np.std(S_new[target] - mu_new))
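the error you are seeing will go away. Note that the final rmse_test.append(...) line in the question is missing its closing parenthesis in the same way, so it needs the same fix.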
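For reference, the select-move-retrain loop described in the question can be sketched as follows. This is only a minimal illustration under stated assumptions: the toy sine data and the `fit` helper (a 1-nearest-neighbour stand-in for `gpr.fit` / `gpr.predict`) are hypothetical and are not part of the question's code or of GraphDot.

```python
import math


def fit(train):
    """Hypothetical stand-in for gpr.fit: returns a 1-nearest-neighbour predictor."""
    def predict(x):
        nearest = min(train, key=lambda pair: abs(pair[0] - x))
        return nearest[1]
    return predict


# Toy data (assumption): (x, y) pairs with y = sin(x), not the question's dataset.
points = [(i * 0.25, math.sin(i * 0.25)) for i in range(40)]
train = points[:3]   # plays the role of S_new
pool = points[3:]    # plays the role of T_new

predict = fit(train)
while len(train) <= 11 and pool:
    # Recompute the errors on the current pool with the current model.
    errors = [abs(predict(x) - y) for x, y in pool]
    worst = max(range(len(pool)), key=errors.__getitem__)
    # Move the worst-predicted point from the pool to the training set.
    train.append(pool.pop(worst))
    # Retrain on the enlarged training set.
    predict = fit(train)

print(len(train), len(pool))
```

Two details of the sketch may be worth carrying over. The condition `len(train) <= 11` still runs when the set has 11 elements, so the loop ends with 12; use `< 11` if the set should stop at 11. The errors are also recomputed from the current model and the current pool on every pass, so the selected index always refers to the shrinking pool rather than to the original test set.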
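Note that the line reported in a syntax error is sometimes not the line that contains the mistake: an earlier syntax error, such as an unclosed parenthesis, can make the parser fail on a later line. Here the traceback points at the line right after the one with the missing parenthesis. Keep this in mind when debugging.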
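The effect can be reproduced in isolation with the built-in `compile()`. The snippet below is a small sketch: the three-line `source` string is made up for illustration, with the unclosed parenthesis on its second line. Older interpreters report the error on the following line as "invalid syntax", as in the traceback above, while Python 3.10 and later point at the opening parenthesis instead, so the reported line depends on the Python version.

```python
# Made-up three-line program; line 2 is missing its closing parenthesis.
source = (
    "values = []\n"
    "values.append(max(1, 2)\n"
    "total = sum(values)\n"
)

try:
    compile(source, "<example>", "exec")
except SyntaxError as err:
    reported_line = err.lineno   # line the parser blames
    message = err.msg            # wording differs between Python versions

print("reported line:", reported_line)
print("message:", message)
```

If the reported line looks correct, checking the line just above it for an unbalanced bracket is usually the quickest next step.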