
I'm looking for a way to externally manipulate (hyper)parameters of a network during NetTrain training, as a function of the training progress. Specifically, I want to increase the weight of a KL divergence loss in the spirit of annealing, but the minimal example below illustrates the idea just as well:

(* dummy net and dummy data *)
net = NetGraph[{1, MeanSquaredLossLayer[]}, {NetPort["Input"] -> 1 -> 2}, "Input" -> 1];
data = With[{x = RandomReal[{-1, 1}, {100, 1}]},
   <|"Input" -> x, "Target" -> 2 x|>];

(* regular training, works as expected *)
NetTrain[net, data, LossFunction -> "Loss"]

(* training with updated LossFunction scale *)
scale = 1;
NetTrain[net, data, LossFunction -> {"Loss" -> Scaled[scale]},
 TrainingProgressFunction -> (If[#Round == 200, scale = 100] &)]

In this case, I'd want to increase the scale when I reach a certain round. This doesn't work, presumably because the LossFunction option is evaluated only once, when NetTrain is called, so it is effectively "hard coded"; I can't see how to get NetTrain to pick up the updated value of scale.

Bonus question: Is there a way to update the network architecture during training? E.g., swap the MeanSquaredLossLayer[] with a different loss layer at some point?

Related question: Custom SGD optimizer in Mathematica neural network framework?, which exploits that some options (unlike LossFunction) are functions.


1 Answer

You can add a layer which multiplies the loss by an input value, and use a generator function to change that value during training.

In your example:

net = NetGraph[{1, MeanSquaredLossLayer[], 
   DotLayer[]}, {NetPort["Input"] -> 1 -> 2 -> 3 -> NetPort["Loss"], 
   NetPort["Scale"] -> 3}, "Input" -> 1, "Scale" -> 1]

[image: neural network with a "Scale" input port]

Generate dummy data and generator function:

data = Table[
   <|"Input" -> RandomReal[{-1, 1}, 1], "Target" -> RandomReal[{-1, 1}, 1]|>,
   100];

gentrain =
  Function[
   (* "Scale" is read from the global variable scale on every call,
      so changing scale during training changes the loss weight *)
   Join[#, <|"Scale" -> scale|>] & /@ RandomSample[data, #BatchSize]];
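You can sanity-check the generator by calling it the way NetTrain does, with an association containing the "BatchSize" key (a quick check, not part of the training code):

```mathematica
(* call the generator as NetTrain would; each example should carry "Scale" *)
scale = 1;
batch = gentrain[<|"BatchSize" -> 4|>];
Keys[First[batch]]  (* expect {"Input", "Target", "Scale"} *)
```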

Train the net with a scale parameter which changes after a certain number of rounds:

scale = 1;
NetTrain[net, {gentrain, "RoundLength" -> Round[100/64, 1]}, 
 TrainingProgressFunction -> (If[#Round == 10000, scale = 100] &), 
 MaxTrainingRounds -> 100000]
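Since the generator reads the global scale on every batch, you are not limited to a single jump: a smooth annealing schedule works with the same mechanism. A sketch, assuming the same net, data, and gentrain as above (the ramp endpoints 1 and 100 and the 10000-round horizon are arbitrary choices):

```mathematica
(* anneal the loss weight linearly from 1 to 100 over the first 10000 rounds *)
scale = 1;
NetTrain[net, {gentrain, "RoundLength" -> Round[100/64, 1]},
 TrainingProgressFunction -> (scale = 1 + 99*Min[1, #Round/10000] &),
 MaxTrainingRounds -> 100000]
```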

[image: training progress plot showing the change of loss scale]

