I am running LassoCV with 1000 coefficients. Statsmodels did not seem to be able to handle this many coefficients, so I am using scikit-learn. Statsmodels would allow constraining the sum of the coefficients to 1 via .fit_constrained("coef1 + coef2 ... = 1"). I need the same in scikit-learn. I am also keeping the intercept at zero.
from sklearn.linear_model import LassoCV

LassoCVmodel = LassoCV(fit_intercept=False)
LassoCVmodel.fit(x, y)
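For context, the statsmodels pattern mentioned above would presumably look something like this (a sketch with made-up data; with a plain numpy design matrix, statsmodels names the coefficients x1, x2, ..., so the constraint string uses those names):

import numpy as np
import statsmodels.api as sm

# made-up example: 3 features, no constant column added,
# so the fit is forced through zero as in the question
X = np.random.rand(100, 3)
t = X.dot([0.2, 0.3, 0.5]) + 0.01 * np.random.randn(100)

# the constraint string is parsed against the coefficient names
res = sm.OLS(t, X).fit_constrained("x1 + x2 + x3 = 1")
print(res.params, res.params.sum())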
Any help would be appreciated.
Answer:
As mentioned in the comments: neither the docs nor the source indicate that scikit-learn supports this!
I just tried an alternative using an off-the-shelf convex-optimization solver. This is just a prototypical approach, and it might not be a good fit for your (incompletely defined) task (sample size?).
Some comments:
- implementation / model formulation is easy
- the problem is harder to solve than I expected, though
- the ECOS solver has trouble in general
- the SCS solver reaches decent accuracy (worse compared to scikit-learn's)
- but: tuning the iterations to improve accuracy breaks the solver
  - the problem becomes infeasible for SCS!
- an SCS-based big-M formulation (the constraint as a penalty term within the objective) looks usable, but might need tuning
- only open-source solvers were tested; commercial ones might do much better
Further things to try:
- for tackling huge problems (where performance matters more than robustness and accuracy), (accelerated) projected stochastic gradient methods look promising (a rough sketch follows right below)
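A minimal sketch of what such a projected stochastic gradient method could look like, assuming we only handle the least-squares part with the sum constraint (illustration only: proj_sum_one, projected_sgd, and all defaults here are made up and untuned):

import numpy as np

def proj_sum_one(x):
    # closed-form Euclidean projection onto the hyperplane {x : sum(x) == 1}
    return x - (x.sum() - 1.0) / x.size

def projected_sgd(A, b, lr=1e-3, epochs=50, batch=32, seed=0):
    rng = np.random.default_rng(seed)
    M, N = A.shape
    x = np.full(N, 1.0 / N)                  # feasible starting point
    for _ in range(epochs):
        perm = rng.permutation(M)            # new sample order per epoch
        for s in range(0, M, batch):
            rows = perm[s:s + batch]
            # stochastic gradient of 0.5/|batch| * ||A_rows x - b_rows||^2
            grad = A[rows].T.dot(A[rows].dot(x) - b[rows]) / len(rows)
            x = proj_sum_one(x - lr * grad)  # gradient step, then project
    return x

The step size and batch size would need tuning per problem; an accelerated variant would additionally carry a momentum term.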
Code
""" data """from time import perf_counter as pcimport numpy as npfrom sklearn import datasetsdiabetes = datasets.load_diabetes()A = diabetes.datay = diabetes.targetalpha=0.1print('Problem-size: ', A.shape)def obj(x): # following sklearn's definition from user-guide! return (1. / (2*A.shape[0])) * np.square(np.linalg.norm(A.dot(x) - y, 2)) + alpha * np.linalg.norm(x, 1)""" sklearn """print('\nsklearn classic l1')from sklearn import linear_modelclf = linear_model.Lasso(alpha=alpha, fit_intercept=False)t0 = pc()clf.fit(A, y)print('used (secs): ', pc() - t0)print(obj(clf.coef_))print('sum x: ', np.sum(clf.coef_))""" cvxpy """print('\ncvxpy + scs classic l1')from cvxpy import *x = Variable(A.shape[1])objective = Minimize((1. / (2*A.shape[0])) * sum_squares(A*x - y) + alpha * norm(x, 1))problem = Problem(objective, [])t0 = pc()problem.solve(solver=SCS, use_indirect=False, max_iters=10000, verbose=False)print('used (secs): ', pc() - t0)print(obj(x.value.flat))print('sum x: ', np.sum(x.value.flat))""" cvxpy -> sum x == 1 """print('\ncvxpy + scs sum == 1 / 1st approach')objective = Minimize((1. / (2*A.shape[0])) * sum_squares(A*x - y))constraints = [sum(x) == 1]problem = Problem(objective, constraints)t0 = pc()problem.solve(solver=SCS, use_indirect=False, max_iters=10000, verbose=False)print('used (secs): ', pc() - t0)print(obj(x.value.flat))print('sum x: ', np.sum(x.value.flat))""" cvxpy approach 2 -> sum x == 1 """print('\ncvxpy + scs sum == 1 / 2nd approach')M = 1e6objective = Minimize((1. / (2*A.shape[0])) * sum_squares(A*x - y) + M*(sum(x) - 1))constraints = [sum(x) == 1]problem = Problem(objective, constraints)t0 = pc()problem.solve(solver=SCS, use_indirect=False, max_iters=10000, verbose=False)print('used (secs): ', pc() - t0)print(obj(x.value.flat))print('sum x: ', np.sum(x.value.flat))
Output
Problem-size:  (442, 10)

sklearn classic l1
used (secs):  0.001451024380348898
13201.3508496
sum x:  891.78869298

cvxpy + scs classic l1
used (secs):  0.011165673357417458
13203.6549995
sum x:  872.520510561

cvxpy + scs sum == 1 / 1st approach
used (secs):  0.15350853891775978
13400.1272148
sum x:  -8.43795102327

cvxpy + scs sum == 1 / 2nd approach
used (secs):  0.012579569383536493
13397.2932976
sum x:  1.01207061047
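As an aside: the 2nd approach above keeps the hard constraint and merely adds the linear term M*(sum(x) - 1) on top. A pure penalty formulation, where the constraint lives only in the objective, would presumably look like the following continuation of the script above (my own untested variant; the same big-M tuning caveats apply):

""" cvxpy variant -> sum x == 1 as pure quadratic penalty """
print('\ncvxpy + scs sum == 1 / penalty-only variant')
M = 1e6
objective = Minimize((1. / (2*A.shape[0])) * sum_squares(A*x - y) + M * square(sum(x) - 1))
problem = Problem(objective, [])
t0 = pc()
problem.solve(solver=SCS, use_indirect=False, max_iters=10000, verbose=False)
print('used (secs): ', pc() - t0)
print(obj(x.value.flat))
print('sum x: ', np.sum(x.value.flat))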
Edit
Just for fun, I implemented a slow, non-optimized prototype solver using the accelerated projected gradient approach (see the remarks in the code!).
While it is slow here (because it is not optimized), this approach should scale much better for huge problems (being a first-order method). There should be a lot of potential!
Warning: some might regard this as advanced numerical optimization 🙂
Edit 2: I forgot to add the nonnegativity constraint to the projection (sum(x) == 1 does not make much sense if x can be negative!). This makes the solving much harder (numerical trouble), and it is obvious that one of those fast specialized projections should be used (I'm too lazy right now; I think O(n log n) algorithms are available). Again: this APG solver is a prototype, not ready for real tasks; a sketch of such a projection follows below.
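For reference, a minimal sketch of such a specialized projection: the classic O(n log n) sorting-based Euclidean projection onto the probability simplex (in the spirit of Duchi et al., 2008; my own illustration, while the prototype below projects via cvxpy + ECOS instead):

import numpy as np

def project_simplex(v):
    # Euclidean projection of v onto {x : x >= 0, sum(x) == 1} via sorting
    n = v.size
    u = np.sort(v)[::-1]                    # sort descending
    css = np.cumsum(u)
    # largest index rho with u[rho] > (css[rho] - 1) / (rho + 1)
    rho = np.nonzero(u * np.arange(1, n + 1) > (css - 1.0))[0][-1]
    theta = (css[rho] - 1.0) / (rho + 1.0)  # optimal shift
    return np.maximum(v - theta, 0.0)

Inside solve_pg below, this could in principle replace the slow cvxpy-based projection step.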
Code
""" accelerated pg -> sum x == 1 """def solve_pg(A, b, momentum=0.9, maxiter=1000): """ remarks: algorithm: accelerated projected gradient projection: proj on probability-simplex -> naive and slow using cvxpy + ecos line-search: armijo-rule along projection-arc (Bertsekas book) -> suffers from slow projection stopping-criterion: naive gradient-calculation: precomputes AtA -> not needed and not recommended for huge sparse data! """ M, N = A.shape x = np.zeros(N) AtA = A.T.dot(A) Atb = A.T.dot(b) stop_count = 0 # projection helper x_ = Variable(N) v_ = Parameter(N) objective_ = Minimize(0.5 * square(norm(x_ - v_, 2))) constraints_ = [sum(x_) == 1] problem_ = Problem(objective_, constraints_) def gradient(x): return AtA.dot(x) - Atb def obj(x): return 0.5 * np.linalg.norm(A.dot(x) - b)**2 it = 0 while True: grad = gradient(x) # line search alpha = 1 beta = 0.5 sigma=1e-2 old_obj = obj(x) while True: new_x = x - alpha * grad new_obj = obj(new_x) if old_obj - new_obj >= sigma * grad.dot(x - new_x): break else: alpha *= beta x_old = x[:] x = x - alpha*grad # projection v_.value = x problem_.solve() x = np.array(x_.value.flat) y = x + momentum * (x - x_old) if np.abs(old_obj - obj(x)) < 1e-2: stop_count += 1 else: stop_count = 0 if stop_count == 3: print('early-stopping @ it: ', it) return x it += 1 if it == maxiter: return xprint('\n acc pg')t0 = pc()x = solve_pg(A, y)print('used (secs): ', pc() - t0)print(obj(x))print('sum x: ', np.sum(x))
Output
 acc pg
early-stopping @ it:  367
used (secs):  0.7714511330487027
13396.8642379
sum x:  1.00000000002