<?xml version="1.0" encoding="utf-8"?>
<search>
<entry>
<title>End-to-End Multi-Task Learning with Attention</title>
<link href="/2022/11/29/End-to-End-Multi-Task-Learning-with-Attention/"/>
<url>/2022/11/29/End-to-End-Multi-Task-Learning-with-Attention/</url>
<content type="html"><![CDATA[<h2 id="代码实现">Code Implementation</h2><figure class="highlight python"><table><tr><td class="code"><pre><code class="hljs python">import tensorflow as tf


class LossesDWAModel(tf.keras.Model):

    def __init__(self, temperature=1, *args, **kwargs):
        super().__init__(*args, **kwargs)

        self.t = tf.constant(temperature, dtype=tf.float32)  # temperature coefficient T
        self.loss_ws = [tf.Variable(0.0, trainable=False, name=str(i) + '_loss_ws')
                        for i in range(len(self.output_names))]
        self.loss_1 = None  # t-1 step loss
        self.loss_2 = None  # t-2 step loss

    def train_step(self, data):
        # Unpack the data. Its structure depends on your model and
        # on what you pass to `fit()`.
        x, y = data

        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)  # Forward pass
            # Compute the loss value
            # (the loss function is configured in `compile()`)

            task_loss = []  # per-task losses on the current batch
            last_w = []  # t-1 step w
            for i in range(len(self.output_names)):
                target_name = self.output_names[i]
                loss_i = self.loss[target_name](y_true=y[target_name], y_pred=y_pred[i])
                task_loss.append(loss_i)
                if self.loss_1 is not None and self.loss_2 is not None:
                    last_w.append(self.loss_1[i] / self.loss_2[i])
                else:
                    last_w.append(tf.constant(1.0, dtype=tf.float32))

            loss_weights_mid = tf.math.exp(tf.divide(last_w, self.t))
            loss_weights = tf.divide(loss_weights_mid, tf.reduce_sum(loss_weights_mid))

            # normalize the loss weights so they sum to the number of tasks
            total_loss = 0.0
            factor = tf.divide(tf.constant(len(self.output_names), dtype=tf.float32), tf.reduce_sum(loss_weights))
            for i in range(len(self.output_names)):
                lw = tf.multiply(factor, loss_weights[i])
                self.loss_ws[i].assign(lw)
                total_loss = tf.add(total_loss, tf.multiply(lw, task_loss[i]))

            # update loss_1 and loss_2
            self.loss_2 = self.loss_1
            self.loss_1 = task_loss

        trainable_vars = self.trainable_variables
        gradients = tape.gradient(total_loss, trainable_vars)
        # Update weights
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))
        # Update metrics (includes the metric that tracks the loss)
        self.compiled_metrics.update_state(y, y_pred)

        return {m.name: m.result() for m in self.metrics}
</code></pre></td></tr></table></figure><h2 id="论文">Paper</h2><p><img src="/img/dwa/DWA-01.png" /> <img src="/img/dwa/DWA-04.png" /> <img src="/img/dwa/DWA-05.png" /></p>
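<p>For reference, the weighting rule implemented in <code>train_step</code> above is the Dynamic Weight Averaging update: with K tasks and temperature T, each task's weight is a softmax over its recent loss ratio, rescaled so the weights sum to K:</p><p><span class="math display">\[ w_k(t-1) = \frac{L_k(t-1)}{L_k(t-2)}, \qquad \lambda_k(t) = \frac{K \exp\left(w_k(t-1)/T\right)}{\sum_{i} \exp\left(w_i(t-1)/T\right)} \]</span></p>]]></content>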
<tags>
<tag>Deep Learning</tag>
<tag>Multi-objective Ranking</tag>
<tag>Loss Weight Optimization</tag>
</tags>
</entry>
<entry>
<title>GradNorm: Gradient Normalization for Adaptive Loss Balancing in Deep Multitask Networks</title>
<link href="/2022/11/29/GradNorm-Gradient-Normalization-for-Adaptive-Loss-Balancing-in-Deep-Multitask-Networks/"/>
<url>/2022/11/29/GradNorm-Gradient-Normalization-for-Adaptive-Loss-Balancing-in-Deep-Multitask-Networks/</url>
<content type="html"><![CDATA[<h2 id="代码实现">Code Implementation</h2><figure class="highlight python"><table><tr><td class="code"><pre><code class="hljs python">import numpy as np
import tensorflow as tf


class GradNormLossesModel(tf.keras.Model):

    def __init__(self, alpha=0.5, layer_name='gradnorm', *args, **kwargs):
        super().__init__(*args, **kwargs)

        self.layer_name = layer_name
        self.alpha = alpha
        # initial losses L(0); log(2) is the chance-level binary cross-entropy
        self.L0 = [tf.Variable(np.log(2), trainable=False, dtype=tf.float32) for _ in range(len(self.output_names))]
        self.loss_ws = [tf.Variable(1.0, trainable=True, constraint=tf.keras.constraints.NonNeg(), name=str(i) + '_loss_ws')
                        for i in range(len(self.output_names))]
        lr_schedule_ws = tf.keras.optimizers.schedules.ExponentialDecay(
            initial_learning_rate=1e-2,
            decay_steps=100,
            decay_rate=0.75)
        self.optimizer_ws = tf.keras.optimizers.Adam(learning_rate=lr_schedule_ws)

    def train_step(self, data):
        # Unpack the data. Its structure depends on your model and
        # on what you pass to `fit()`.
        x, y = data

        with tf.GradientTape(persistent=True) as tape:
            y_pred = self(x, training=True)  # Forward pass
            # Compute the loss value
            # (the loss function is configured in `compile()`)

            losses_value = []  # raw (unweighted) per-task losses
            weighted_losses = []  # each weighted task loss
            total_loss = 0.0  # running weighted total loss
            for i in range(len(self.output_names)):
                target_name = self.output_names[i]
                Li = self.loss[target_name](y_true=y[target_name], y_pred=y_pred[i])
                losses_value.append(Li)
                w_Li = tf.multiply(self.loss_ws[i], Li)
                weighted_losses.append(w_Li)
                total_loss = tf.add(total_loss, w_Li)

            # compute the inverse training rate r(t)
            loss_rate = tf.stack(losses_value, axis=0) / tf.stack(self.L0, axis=0)
            loss_rate_mean = tf.reduce_mean(loss_rate)
            loss_r = loss_rate / loss_rate_mean

            # take the last shared layer's variables for the gradient-norm computation
            last_shared_layer_var = [l for l in self.trainable_variables if 'level_1_expert_shared' in l.name]
            grads = [tape.gradient(wLi, last_shared_layer_var) for wLi in weighted_losses]
            grads_mid = [tf.concat([tf.reshape(g, (-1, 1)) for g in gs], axis=0) for gs in grads]
            # L2 norm of each task gradient; in the paper, alpha enters only the target below
            gnorms_mid = [(tf.reduce_sum(tf.multiply(g, g))) ** 0.5 for g in grads_mid]
            gnorms = tf.stack(gnorms_mid, axis=0)
            avgnorm = tf.reduce_mean(gnorms)

            # compute the GradNorm loss
            grad_diff = tf.abs(gnorms - tf.stop_gradient(avgnorm * (loss_r ** self.alpha)))
            gnorm_loss = tf.reduce_sum(grad_diff)

        trainable_vars = [var for var in self.trainable_variables if '_loss_ws' not in var.name]
        gradients = tape.gradient(total_loss, trainable_vars)
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))

        # update the loss weights w
        gradws = tape.gradient(gnorm_loss, self.loss_ws)
        self.optimizer_ws.apply_gradients(zip(gradws, self.loss_ws))

        # renormalize the loss weights to reduce their effect on the learning rate
        factor = tf.divide(tf.constant(len(self.output_names), dtype=tf.float32), tf.reduce_sum(self.loss_ws))
        for i in range(len(self.output_names)):
            self.loss_ws[i].assign(tf.multiply(factor, self.loss_ws[i]))  # loss_ws: [tf.Variable, tf.Variable]

        # Update metrics (includes the metric that tracks the loss)
        self.compiled_metrics.update_state(y, y_pred)

        # persistent=True, so delete the tape explicitly
        del tape

        return {m.name: m.result() for m in self.metrics}
</code></pre></td></tr></table></figure><h2 id="论文">Paper</h2><p><img src="/img/gradnorm/GradNorm-01.png" /> <img src="/img/gradnorm/GradNorm-02.png" /> <img src="/img/gradnorm/GradNorm-03.png" /> <img src="/img/gradnorm/GradNorm-04.png" /></p>
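<p>For reference, <code>gnorm_loss</code> above is GradNorm's gradient loss over K tasks; the balancing target inside <code>stop_gradient</code> is treated as a constant:</p><p><span class="math display">\[ L_{grad}(t) = \sum_{i} \left| G_i(t) - \bar{G}(t) \cdot \left[ r_i(t) \right]^{\alpha} \right|, \qquad r_i(t) = \frac{L_i(t)/L_i(0)}{\frac{1}{K}\sum_{j} L_j(t)/L_j(0)} \]</span></p><p>Here G_i(t) is the L2 norm of the gradient of the weighted task loss at the shared layer, and the mean norm corresponds to <code>avgnorm</code> in the code.</p>]]></content>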
<tags>
<tag>Deep Learning</tag>
<tag>Multi-objective Ranking</tag>
<tag>Loss Weight Optimization</tag>
</tags>
</entry>
<entry>
<title>Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics</title>
<link href="/2022/11/24/Multi-Task-Learning-Using-Uncertainty-to-Weigh-Losses-for-Scene-Geometry-and-Semantics/"/>
<url>/2022/11/24/Multi-Task-Learning-Using-Uncertainty-to-Weigh-Losses-for-Scene-Geometry-and-Semantics/</url>
<content type="html"><![CDATA[<h2 id="论文解读">Paper Notes</h2><p>Multi-task learning concerns the problem of optimising a model with respect to multiple objectives. The naive approach to combining multi-objective losses would be to simply perform a weighted linear sum of the losses for each individual task:<br> <img src="/img/uwl/naive_loss.png" /><br> However, there are a number of issues with this method. Namely, model performance is extremely sensitive to weight selection, w_i, as illustrated in Figure 2. These weight hyper-parameters are expensive to tune, often taking many days for each trial. Therefore, it is desirable to find a more convenient approach which is able to learn the optimal weights.</p><h3 id="mathematical-formulation">Mathematical Formulation</h3><p>First the paper defines multi-task likelihoods:<br> - For regression tasks, the likelihood is defined as a Gaussian with mean given by the model output, with an observation noise scalar σ:<br> <img src="/img/uwl/reg_likelihood.png" /><br> - For classification, the likelihood is defined as:<br> <img src="/img/uwl/class_likelihood_1.png" /><br> where:<br> <img src="/img/uwl/class_likelihood_0.png" /><br></p><p>In maximum likelihood inference, we maximise the log likelihood of the model. In regression, for example:<br> <img src="/img/uwl/reg_loglikelihood.png" /><br> σ is the model's observation noise parameter, capturing how much noise we have in the outputs. We then maximise the log likelihood with respect to the model parameters W and the observation noise parameter σ.<br></p><p>Assuming two tasks that follow Gaussian distributions:<br> <img src="/img/uwl/two_task.png" /><br> the loss will be:<br> <img src="/img/uwl/total_loss_h.png" /><br> <img src="/img/uwl/loss7.png" /><br> This means that W and σ are the learned parameters of the network. W are the weights of the network, while σ is used to calculate the weight of each task loss and also to regularize this task-loss weight.</p><p>However, the extension to classification likelihoods is more interesting. We adapt the classification likelihood to squash a scaled version of the model output through a softmax function:<br> <img src="/img/uwl/sf.png" /><br> with a positive scalar σ. This can be interpreted as a Boltzmann distribution (also called a Gibbs distribution) where the input is scaled by σ² (often referred to as temperature). This scalar is either fixed or can be learnt, where the parameter's magnitude determines how 'uniform' (flat) the discrete distribution is. This relates to its uncertainty, as measured in entropy. The log likelihood for this output can then be written as:<br> <img src="/img/uwl/sf1.png" /><br> Assume that a model's multiple outputs are composed of a continuous output y1 and a discrete output y2, modelled with a Gaussian likelihood and a softmax likelihood, respectively. Like before, the joint loss, L(W, σ1, σ2), is given as:<br> <img src="/img/uwl/rgsf.png" /><br> <u><strong><font color="#FF0000">In practice, we train the network to predict the log variance, s := log σ². This is because it is more numerically stable than regressing the variance, σ², as the loss avoids any division by zero. The exponential mapping also allows us to regress unconstrained scalar values, where exp(−s) is resolved to the positive domain, giving valid values for variance.</font></strong></u></p><h2 id="代码实现">Code Implementation</h2><figure class="highlight python"><table><tr><td class="code"><pre><code class="hljs python">import tensorflow as tf


def build_model(model_config):
    inputs = []
    outputs = []
    # (graph construction elided in the post)

    return LossesUWLModel(inputs=inputs, outputs=outputs)


class LossesUWLModel(tf.keras.Model):

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.sigma = [tf.Variable(tf.random.uniform(shape=[], minval=0.2, maxval=1, seed=10), dtype=tf.float32,
                                  trainable=True,
                                  constraint=tf.keras.constraints.NonNeg(),
                                  name=o + 'sigma') for o in self.output_names]

    def train_step(self, data):
        # Unpack the data. Its structure depends on your model and
        # on what you pass to `fit()`.
        x, y = data

        with tf.GradientTape() as tape:
            y_pred = self(x, training=True)  # Forward pass
            # Compute the loss value
            # (the loss function is configured in `compile()`)

            task_loss = []
            total_loss = 0.0
            for i in range(len(self.output_names)):
                target_name = self.output_names[i]
                loss_i = self.loss[target_name](y_true=y[target_name], y_pred=y_pred[i])
                task_loss.append(loss_i)
                total_loss = tf.add(total_loss, tf.divide(loss_i, self.sigma[i] ** 2))
                total_loss = tf.add(total_loss, tf.math.log(self.sigma[i] ** 2))

        trainable_vars = self.trainable_variables
        gradients = tape.gradient(total_loss, trainable_vars)
        # Update weights
        self.optimizer.apply_gradients(zip(gradients, trainable_vars))
        # Update metrics (includes the metric that tracks the loss)
        self.compiled_metrics.update_state(y, y_pred)

        return {m.name: m.result() for m in self.metrics}


model = build_model(FLAGS.model_config)
model.compile(
    optimizer=tf.keras.optimizers.RMSprop(FLAGS.lr),
    loss={
        "a": tf.keras.losses.BinaryCrossentropy(name='loss', label_smoothing=0.1),
        "b": tf.keras.losses.BinaryCrossentropy(name='loss'),
    },
    metrics={
        "a": [tf.keras.metrics.BinaryCrossentropy(), tf.keras.metrics.AUC(name='auc')],
        "b": [tf.keras.metrics.BinaryCrossentropy(), tf.keras.metrics.AUC(name='auc')],
    },
    # run_eagerly=True
)
</code></pre></td></tr></table></figure><h2 id="论文">Paper</h2><p><img src="/img/uwl/c-01.png" /> <img src="/img/uwl/c-03.png" /> <img src="/img/uwl/c-04.png" /> <img src="/img/uwl/c-05.png" /></p>
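<p>As the highlighted note says, the paper's practical recipe is to learn s := log σ² rather than constraining σ directly, as the snippet above does. A minimal sketch of that variant, assuming the same per-task loss list as in <code>train_step</code> above (the names <code>uwl_total_loss</code> and <code>log_vars</code> are illustrative, not from the original post):</p><figure class="highlight python"><table><tr><td class="code"><pre><code class="hljs python">import tensorflow as tf

def uwl_total_loss(task_losses, log_vars):
    """Uncertainty-weighted total loss with the log-variance trick, s_i := log(sigma_i^2).

    exp(-s_i) plays the role of 1 / sigma_i^2, so there is no division by zero
    and no NonNeg constraint is needed; adding s_i supplies the log(sigma_i^2)
    regularizer from the joint loss.
    """
    total = tf.constant(0.0)
    for loss_i, s_i in zip(task_losses, log_vars):
        total += tf.math.exp(-s_i) * loss_i + s_i
    return total

# usage sketch: log_vars would be unconstrained trainable variables, e.g.
# log_vars = [tf.Variable(0.0, name=o + '_log_var') for o in model.output_names]
</code></pre></td></tr></table></figure>]]></content>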
<tags>
<tag>Deep Learning</tag>
<tag>Multi-objective Ranking</tag>
<tag>Loss Weight Optimization</tag>
</tags>
</entry>
<entry>
<title>Hello World</title>
<link href="/2022/09/13/hello-world/"/>
<url>/2022/09/13/hello-world/</url>
<content type="html"><![CDATA[<p>Welcome to <a href="https://hexo.io/">Hexo</a>! This is your very first post. Check <a href="https://hexo.io/docs/">documentation</a> for more info. If you get any problems when using Hexo, you can find the answer in <a href="https://hexo.io/docs/troubleshooting.html">troubleshooting</a> or you can ask me on <a href="https://github.com/hexojs/hexo/issues">GitHub</a>.</p><h2 id="quick-start">Quick Start</h2><h3 id="create-a-new-post">Create a new post</h3><p>$ f(x) = a+b $</p><p><span class="math display">\[ f(x) = a+b \]</span></p><p><span class="math display">\[ \sqrt{x} + \sqrt{x^{2}+\sqrt{y}} = \sqrt[3]{k_{i}} - \frac{x}{m} \]</span></p><p><span class="math display">\[ \lim_{x \to \infty} x^2_{22} - \int_{1}^{5}x\mathrm{d}x + \sum_{n=1}^{20} n^{2} = \prod_{j=1}^{3} y_{j} + \lim_{x \to -2} \frac{x-2}{x} \]</span></p><p><span class="math display">\[ \begin{bmatrix} 1 & 2 & \cdots \\ 67 & 95 & \cdots \\ \vdots & \vdots & \ddots \\ \end{bmatrix} \]</span></p><figure class="highlight bash"><table><tr><td class="code"><pre><code class="hljs bash">$ hexo new "My New Post"
</code></pre></td></tr></table></figure><p>More info: <a href="https://hexo.io/docs/writing.html">Writing</a></p><h3 id="run-server">Run server</h3><figure class="highlight bash"><table><tr><td class="code"><pre><code class="hljs bash">$ hexo server
</code></pre></td></tr></table></figure><p>More info: <a href="https://hexo.io/docs/server.html">Server</a></p><h3 id="generate-static-files">Generate static files</h3><figure class="highlight bash"><table><tr><td class="code"><pre><code class="hljs bash">$ hexo generate
</code></pre></td></tr></table></figure><p>More info: <a href="https://hexo.io/docs/generating.html">Generating</a></p><h3 id="deploy-to-remote-sites">Deploy to remote sites</h3><figure class="highlight bash"><table><tr><td class="code"><pre><code class="hljs bash">$ hexo deploy
</code></pre></td></tr></table></figure><p>More info: <a href="https://hexo.io/docs/one-command-deployment.html">Deployment</a></p><p>This is a sentence<sup id="fnref:1" class="footnote-ref"><a href="#fn:1" rel="footnote"><span class="hint--top hint--rounded" aria-label="Reference 1">[1]</span></a></sup></p><h2 id="参考">References</h2><p>Useful tools: <a href="https://pdf2png.com/">pdf2png</a></p><section class="footnotes"><div class="footnote-list"><ol><li><span id="fn:1" class="footnote-text"><span>This is the corresponding footnote <a href="#fnref:1" rev="footnote" class="footnote-backref">↩︎</a></span></span></li><li><span class="footnote-text"><span>Reference 1 <a href="#fnref:1" rev="footnote" class="footnote-backref">↩︎</a></span></span></li><li><span id="fn:2" class="footnote-text"><span>Reference 2 <a href="#fnref:2" rev="footnote" class="footnote-backref">↩︎</a></span></span></li></ol></div></section>]]></content>
<tags>
<tag>Hexo</tag>
<tag>Fluid</tag>
</tags>
</entry>
</search>