Edit: see the end of this question for the solution

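TL;DR: I need a way to compute the label distribution of each batch and update the learning rate accordingly. Is there a way to access the current model's optimizer so the learning rate can be updated on every batch?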

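Below is how the label distribution can be computed. It can be done inside the loss function, since by default the loss is computed per batch. Where can this code be executed so that it also has access to the model's optimizer?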

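    import tensorflow as tf
    from tensorflow.python.ops import math_ops

    # `lf` (per-class label frequencies) is assumed to be defined elsewhere
    def loss(y_true, y_pred):
        y = math_ops.argmax(y_true, axis=1)
        freqs = tf.gather(lf, y)  # equivalent to lf[y] if `lf` and `y` were numpy arrays
        inv_freqs = math_ops.pow(freqs, -1)
        E = 1 / math_ops.reduce_sum(inv_freqs)  # value used to update the learning rate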
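To make the arithmetic in the snippet above concrete, here is a small framework-free sketch of the same computation in plain Python. The class frequencies in `lf` and the batch labels are made-up values for illustration, and `batch_scale` is a hypothetical helper name, not part of Keras or TensorFlow.

```python
def batch_scale(one_hot_labels, lf):
    """Return E = 1 / sum(1 / lf[y]) for one batch of one-hot labels."""
    # argmax over each one-hot row gives the class index of each sample
    y = [row.index(max(row)) for row in one_hot_labels]
    # look up the frequency of each sample's class (the tf.gather(lf, y) step)
    freqs = [lf[i] for i in y]
    inv_freqs = [1.0 / f for f in freqs]
    return 1.0 / sum(inv_freqs)


# Assumed class frequencies for a 3-class problem (illustrative only)
lf = [100.0, 10.0, 1.0]
# Two samples of class 0 and two samples of class 1
batch = [[1, 0, 0], [0, 1, 0], [0, 1, 0], [1, 0, 0]]
E = batch_scale(batch, lf)
print(E)
```

A batch dominated by rare classes yields a smaller `E`, because each rare sample contributes a large inverse frequency to the sum.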

Solution

Many thanks to @mrk for pushing me in the right direction on this!

To compute the label distribution of each batch and then use that value to update the optimizer's learning rate, one must:

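1. Create a custom metric that computes the label distribution of each batch and returns the array of frequencies (by default Keras optimizes batch by batch, so metrics are computed per batch).
2. Create a typical learning rate scheduler by subclassing the keras.callbacks.History class.
3. Override the scheduler's on_batch_end function; the logs dict will contain every metric computed for the batch, including our custom label distribution metric!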
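The steps above can be sketched without Keras to show how the pieces talk to each other. Everything here is a stand-in: `count_labels`, `ToyScheduler` and `run_epoch` are hypothetical names, and the loop only mimics the way per-batch metric values reach a callback through the `logs` dictionary. It is not the Keras training loop.

```python
def count_labels(y_true, n_class):
    """Per-batch label counts from one-hot rows (stand-in for the metric)."""
    counts = [0.0] * n_class
    for row in y_true:
        counts[row.index(max(row))] += 1.0
    return counts


class ToyScheduler:
    """Stand-in for a callback: reads the metric from logs and adjusts lr."""

    def __init__(self, base_lr, ld_metric="batch_label_distribution"):
        self.lr = base_lr
        self.ld_metric = ld_metric
        self.seen = []  # label distributions observed so far

    def on_batch_end(self, batch, logs=None):
        ld = (logs or {}).get(self.ld_metric)
        if ld is None:
            return
        self.seen.append(ld)
        # Example update only: scale the learning rate by 1 / sum(ld)
        self.lr = self.lr * (1.0 / sum(ld))


def run_epoch(batches, n_class, callback):
    """Mimics a training loop: compute the metric, then notify the callback."""
    for i, y_true in enumerate(batches):
        logs = {"batch_label_distribution": count_labels(y_true, n_class)}
        callback.on_batch_end(i, logs)


batches = [
    [[1, 0], [1, 0], [0, 1], [1, 0]],  # 3 samples of class 0, 1 of class 1
    [[0, 1], [0, 1]],                  # 2 samples of class 1
]
scheduler = ToyScheduler(base_lr=0.8)
run_epoch(batches, n_class=2, callback=scheduler)
print(scheduler.seen, scheduler.lr)
```

Because the stand-in metric holds raw counts, `sum(ld)` equals the batch size in this sketch; a real schedule would likely use some other function of the distribution.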

Creating the custom metric

    import tensorflow as tf

    class LabelDistribution(tf.keras.metrics.Metric):
        """
        Computes the per-batch label distribution (y_true) and stores the array as
        a metric which can be accessed via a keras CallBack

        :param n_class: int - the number of distinct output classes
        """

        def __init__(self, n_class, name='batch_label_distribution', **kwargs):
            super(LabelDistribution, self).__init__(name=name, **kwargs)
            self.n_class = n_class
            self.label_distribution = self.add_weight(name='ld', initializer='zeros',
                                                      aggregation=tf.VariableAggregation.NONE,
                                                      shape=(self.n_class, ))

        def update_state(self, y_true, y_pred, sample_weight=None):
            y_true = tf.cast(y_true, 'int32')
            y = tf.argmax(y_true, axis=1)
            # Fix the output length so it always matches the (n_class,) variable,
            # even when a class is missing from the batch
            label_distrib = tf.math.bincount(tf.cast(y, 'int32'),
                                             minlength=self.n_class,
                                             maxlength=self.n_class)
            self.label_distribution.assign(tf.cast(label_distrib, 'float32'))

        def result(self):
            return self.label_distribution

        def reset_states(self):
            self.label_distribution.assign([0]*self.n_class)

Creating the DRW learning rate scheduler

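    import numpy as np
    from tensorflow import keras
    from tensorflow.keras import backend as K

    class DRWLearningRateSchedule(keras.callbacks.History):
        """
        Used to implement the Deferred Re-Weighting (DRW) strategy from
        [Kaidi Cao, et al. "Learning Imbalanced Datasets with Label-Distribution-Aware Margin Loss." (2019)]
        (https://arxiv.org/abs/1906.07413)

        Pass the LabelDistribution metric to model.compile and this callback to model.fit
        `model.compile(..., metrics=[LabelDistribution(n_class)])`
        `model.fit(..., callbacks=[DRWLearningRateSchedule(.01)])`
        """

        def __init__(self, base_lr, ld_metric='batch_label_distribution'):
            super(DRWLearningRateSchedule, self).__init__()
            self.base_lr = base_lr
            self.ld_metric = ld_metric  # name of the LabelDistribution metric

        def on_batch_end(self, batch, logs=None):
            ld = (logs or {}).get(self.ld_metric)  # the per-batch label distribution
            if ld is None:
                return
            current_lr = K.get_value(self.model.optimizer.lr)
            # example of updating the optimizer's learning rate
            K.set_value(self.model.optimizer.lr, current_lr * (1 / np.sum(ld)))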

Answer:

Loss-based learning rate adaptation in Keras

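After some research I found the scheduler below. Instead of triggering a decay, you can define another function or value to adjust the learning rate.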

    from __future__ import absolute_import
    from __future__ import print_function

    import keras
    from keras import backend as K
    import numpy as np


    class LossLearningRateScheduler(keras.callbacks.History):
        """
        A learning rate scheduler that relies on changes in the loss function
        value to decide whether the learning rate should be decayed.

        LossLearningRateScheduler has the following properties:
        base_lr: the starting learning rate
        lookback_epochs: the number of epochs in the past to compare with the loss function value at the current epoch to determine whether progress is being made.
        decay_threshold / decay_multiple: if the loss function value has not improved by a factor of decay_threshold * lookback_epochs, then decay_multiple is applied to the learning rate.
        spike_epochs: list of the epoch numbers where you want to spike the learning rate.
        spike_multiple: the multiple applied to the current learning rate for a spike.
        """

        def __init__(self, base_lr, lookback_epochs, spike_epochs = None, spike_multiple = 10, decay_threshold = 0.002, decay_multiple = 0.5, loss_type = 'val_loss'):
            super(LossLearningRateScheduler, self).__init__()
            self.base_lr = base_lr
            self.lookback_epochs = lookback_epochs
            self.spike_epochs = spike_epochs
            self.spike_multiple = spike_multiple
            self.decay_threshold = decay_threshold
            self.decay_multiple = decay_multiple
            self.loss_type = loss_type

        def on_epoch_begin(self, epoch, logs=None):
            if len(self.epoch) > self.lookback_epochs:
                current_lr = K.get_value(self.model.optimizer.lr)
                target_loss = self.history[self.loss_type]
                loss_diff = target_loss[-int(self.lookback_epochs)] - target_loss[-1]

                if loss_diff <= np.abs(target_loss[-1]) * (self.decay_threshold * self.lookback_epochs):
                    print(' '.join(('Changing learning rate from', str(current_lr), 'to', str(current_lr * self.decay_multiple))))
                    K.set_value(self.model.optimizer.lr, current_lr * self.decay_multiple)
                    current_lr = current_lr * self.decay_multiple
                else:
                    print(' '.join(('Learning rate:', str(current_lr))))

                if self.spike_epochs is not None and len(self.epoch) in self.spike_epochs:
                    print(' '.join(('Spiking learning rate from', str(current_lr), 'to', str(current_lr * self.spike_multiple))))
                    K.set_value(self.model.optimizer.lr, current_lr * self.spike_multiple)
            else:
                print(' '.join(('Setting learning rate to', str(self.base_lr))))
                K.set_value(self.model.optimizer.lr, self.base_lr)

            return K.get_value(self.model.optimizer.lr)


    def main():
        return


    if __name__ == '__main__':
        main()
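The decay rule in `on_epoch_begin` can be isolated as a pure function, which makes it easier to see when the learning rate is halved. This is only a sketch of that one branch: `next_learning_rate` is a hypothetical helper, the loss histories are made-up numbers, and the spike logic is left out.

```python
def next_learning_rate(current_lr, base_lr, losses, lookback_epochs,
                       decay_threshold=0.002, decay_multiple=0.5):
    """Apply the scheduler's decay rule to a list of per-epoch losses."""
    if len(losses) <= lookback_epochs:
        # Not enough history yet: fall back to the starting learning rate
        return base_lr
    loss_diff = losses[-int(lookback_epochs)] - losses[-1]
    required = abs(losses[-1]) * (decay_threshold * lookback_epochs)
    if loss_diff <= required:
        # Improvement too small: decay the learning rate
        return current_lr * decay_multiple
    return current_lr


# Loss has plateaued: the improvement is below the threshold, so lr is halved
flat = next_learning_rate(0.01, 0.01, [1.0, 0.999, 0.9985, 0.998], lookback_epochs=2)
# Loss is still dropping quickly: lr is kept
improving = next_learning_rate(0.01, 0.01, [1.0, 0.8, 0.6, 0.4], lookback_epochs=2)
# Too little history: lr is reset to base_lr
short = next_learning_rate(0.5, 0.01, [1.0], lookback_epochs=2)
print(flat, improving, short)
```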
