Introducing GoLU: a new activation function for neural networks


7mo Edited

Gompertz Linear Unit (GoLU): a new era in activation functions

Activation functions are the heartbeat of neural networks: they decide how neurons “fire” and what patterns models learn. From ReLU and GELU to Swish and Mish, each innovation has refined how information flows through deep architectures. This year, Indrashis Das, Mahmoud Safari, Steven Adriaensen, and Frank Hutter, from the University of Freiburg and Prior Labs, introduced GoLU.

Definition
GoLU(x) = x · e^(-e^(-x))
A smooth, asymmetric activation that stabilizes learning while maintaining gradient flow.

Why it matters
• Right-leaning asymmetry reduces activation variance and smooths training
• Flatter loss landscapes and more stable optimization
• Broader weight distributions that capture richer features
• Strong results across vision, language, and diffusion benchmarks, often outperforming ReLU, GELU, and Swish

Code: https://lnkd.in/dzkDgyNx
Paper: https://lnkd.in/dsmX_8pE

#DeepLearning #NeuralNetworks #ActivationFunctions #GoLU #MachineLearning #AIResearch
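For anyone who wants to play with the definition, here is a minimal sketch in plain Python. It is not the authors' implementation (that lives at the code link in the post); the function names `gompertz_gate`, `golu`, and `golu_grad` and the cutoff of -30 are assumptions made for illustration. The gradient follows from the product rule: d/dx [x · e^(-e^(-x))] = e^(-e^(-x)) · (1 + x · e^(-x)).

```python
import math

# Hypothetical cutoff: below it the gate is 0 to float precision, and
# returning early avoids an OverflowError from math.exp(-x) for very
# negative x.
_CUTOFF = -30.0


def gompertz_gate(x: float) -> float:
    """Gompertz gate e^(-e^(-x)), the factor that multiplies x in GoLU."""
    if x < _CUTOFF:
        return 0.0
    return math.exp(-math.exp(-x))


def golu(x: float) -> float:
    """GoLU(x) = x * e^(-e^(-x)), as defined in the post."""
    return x * gompertz_gate(x)


def golu_grad(x: float) -> float:
    """Analytic derivative: gate(x) * (1 + x * e^(-x))."""
    if x < _CUTOFF:
        return 0.0
    return gompertz_gate(x) * (1.0 + x * math.exp(-x))


if __name__ == "__main__":
    # Print a few sample points to see the shape of the curve.
    for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(f"x={x:+.1f}  golu={golu(x):+.5f}  grad={golu_grad(x):+.5f}")
```

Under this sketch the output approaches x for large positive inputs, while negative inputs give small negative values that fade to zero. In a real training setup you would apply the same formula elementwise to tensors in your framework of choice.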

5 Comments

Mohammad Huzefa Shaikh 7mo

I was just reading about it the other day. It’s fascinating how the Gompertz growth model translates to neural optimization: the Gompertz term creates a soft saturating gate that’s gentler than sigmoid but more expressive than ReLU. Curious whether the stability gains hold in very deep architectures and whether the broader weight distributions benefit transfer learning scenarios.


