Gompertz Linear Unit (GoLU): a new era in activation functions

Activation functions are the heartbeat of neural networks: they decide how neurons “fire” and what patterns models learn. From ReLU and GELU to Swish and Mish, each innovation has refined how information flows through deep architectures.

This year, Indrashis Das, Mahmoud Safari, Steven Adriaensen, and Frank Hutter, from the University of Freiburg and Prior Labs, introduced GoLU.

Definition
GoLU(x) = x · e^(-e^(-x))
A smooth, asymmetric activation that stabilizes learning while maintaining gradient flow.

Why it matters
• Right-leaning asymmetry reduces activation variance and smooths training
• Flatter loss landscapes and more stable optimization
• Broader weight distributions that capture richer features
• Strong results across vision, language, and diffusion benchmarks, often outperforming ReLU, GELU, and Swish

Code: https://lnkd.in/dzkDgyNx
Paper: https://lnkd.in/dsmX_8pE

#DeepLearning #NeuralNetworks #ActivationFunctions #GoLU #MachineLearning #AIResearch
Introducing GoLU: a new activation function for neural networks
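For readers who want to try the definition from the post, here is a minimal pure-Python sketch of GoLU(x) = x · e^(-e^(-x)) and its derivative. It is not the authors' implementation (the Code link in the post points to that). The function names `gompertz`, `golu`, and `golu_grad`, and the overflow cutoff of -700, are assumptions made for illustration.

```python
import math


def gompertz(x: float) -> float:
    """Gompertz gate e^(-e^(-x)), a value between 0 and 1."""
    # Assumption: for x below about -700, exp(-x) overflows a Python float.
    # The gate is numerically 0 there, so return 0 directly.
    if x < -700.0:
        return 0.0
    return math.exp(-math.exp(-x))


def golu(x: float) -> float:
    """GoLU(x) = x * e^(-e^(-x)), as defined in the post."""
    return x * gompertz(x)


def golu_grad(x: float) -> float:
    """Derivative of GoLU by the product rule.

    d/dx [x * g(x)] = g(x) + x * g'(x), with g'(x) = g(x) * e^(-x),
    which simplifies to g(x) * (1 + x * e^(-x)).
    """
    if x < -700.0:
        return 0.0  # same overflow guard as in gompertz()
    return gompertz(x) * (1.0 + x * math.exp(-x))


if __name__ == "__main__":
    for value in (-4.0, -1.0, 0.0, 1.0, 4.0):
        print(f"x={value:+.1f}  GoLU={golu(value):+.6f}  grad={golu_grad(value):+.6f}")
```

For large positive inputs the gate approaches 1, so GoLU(x) approaches x; for large negative inputs the gate approaches 0, so GoLU(x) approaches 0.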
More Relevant Posts
-
📢 Check out the #highly_cited paper from the #Water journal 📄Enhanced Physics-Informed Neural Networks for Deep Tunnel Seepage Field Prediction: A Bayesian Optimization Approach ✍️ Yiheng Pan, Yongqi Zhang, Qiyuan Lu, Peng Xia, Jiarui Qi and Qiqi Luo Find out more 👉 https://brnw.ch/21wXiim
-
CERN scientists have leveraged advanced machine learning, including graph neural networks and transformers, to identify the rarest Higgs boson decays into charm quarks, overcoming longstanding challenges in distinguishing complex particle signatures. This breakthrough sets new constraints on Higgs interactions and marks significant progress toward a complete understanding of mass generation for fundamental particles. The evolving synergy between particle physics and AI is opening new frontiers for discovery at the LHC. Read the full article: https://lnkd.in/dVmhRqSN #HiggsBoson #MachineLearning
-
[…] 2026, one of the flagship conferences in computer vision. In this work, we propose a joint optimization framework that combines a spatio-temporal efficient gate (STEG) with an adaptive inference window (AIW) to learn both where to fire spikes and how long to run inference for each sample. This approach reduces spatial and temporal redundancy in directly trained SNNs, leading to significant SynOps and latency savings while maintaining or improving accuracy on static and neuromorphic vision benchmarks. #WACV2026 #ComputerVision #SpikingNeuralNetworks #DeepLearning #NeuromorphicComputing #AIResearch #EnergyEfficientAI #MachineLearning
-
🚀 Explore the Latest Issue of IEEE TCASAI! The IEEE Transactions on Circuits and Systems for Artificial Intelligence (TCASAI) continues to shape the future of AI hardware and systems. The Q3 2025 issue features eight cutting-edge papers spanning: 🔹 Compute-in-Memory acceleration for deep neural networks 🔹 Triangular systolic arrays for CNNs 🔹 Neuromorphic RF localization using spiking neurons 🔹 Mixed-precision floating-point designs for AI accelerators 🔹 Hardware-efficient spiking neural networks and adaptive LIF models Each paper showcases innovation where circuits and systems meet AI — from low-power architectures to edge intelligence breakthroughs. 🔗 Access the full issue: https://loom.ly/b7aPYFU 💡 Cite TCASAI papers to strengthen your own AI-hardware research — every citation supports the growth of this essential new IEEE Transactions and the global circuits-and-AI community. #IEEE #CASS #TCASAI #AIHardware #CircuitsAndSystems #ComputeInMemory #Neuromorphic #EdgeAI
-
Thrilled to share another milestone in our research journey—our latest paper, “Graph neural networks for the offline nanosatellite task scheduling problem,” has just been published in Applied Soft Computing! In this work, we explore how Graph Neural Networks (GNNs) can tackle the Offline Nanosatellite Task Scheduling Problem (ONTS)—a challenging combinatorial optimization problem in the aerospace domain. Our proposed GNN architecture achieves remarkable results in feasibility classification and optimality prediction, outperforming traditional MILP solvers in finding high-quality solutions under strict time limits. Beyond speed, the model demonstrates robust generalization for larger and more complex problem instances, showing the potential of deep learning to complement classical optimization in space mission planning. This research highlights how machine learning can drive scalable and intelligent decision-making for satellite operations and beyond. Huge thanks to my co-authors Bruno Machado Pacheco, Laio Oriel Seman, Eduardo Camponogara, Eduardo Bezerra, and Leandro dos Santos Coelho 🇧🇷 for the excellent collaboration. Read the full paper here: https://lnkd.in/dqSTR584 #GraphNeuralNetworks #DeepLearning #Optimization #SatelliteScheduling #OperationsResearch #AerospaceEngineering #ArtificialIntelligence #Research
-
Our latest paper, “On the application of PINN-style physics-regularised neural networks to high-temperature creep rupture life prediction,” explores how incorporating physical regularisation can improve the extrapolation capability of neural networks when predicting long-term creep life using only short-term data. This work demonstrates that integrating mechanistic creep-damage models directly into neural-network training helps balance empirical accuracy with physical consistency. Publication: https://lnkd.in/gsFdTe8f Authors: Ondrej Muránsky, Minh Tran, Warwick Payten #PhysicsInformedAI #PINN #MachineLearning #MaterialsScience #CreepLifePrediction #HighTemperatureMaterials #NeuralNetworks #PhysicsRegularised #CreepModelling #ANSTO #Research
-
"Bayesian Influence Functions for Hessian-Free Data Attribution" by Philipp Alexander Kreer, Wilson Wu, Maxwell Adam, Zach Furman, Jesse H. "Classical influence functions face significant challenges when applied to deep neural networks, primarily due to non-invertible Hessians and high-dimensional parameter spaces. We propose the local Bayesian influence function (BIF), an extension of classical influence functions that replaces Hessian inversion with loss landscape statistics that can be estimated via stochastic-gradient MCMC sampling. This Hessian-free approach captures higher-order interactions among parameters and scales efficiently to neural networks with billions of parameters. We demonstrate state-of-the-art results on predicting retraining experiments." Paper: https://lnkd.in/dbgYkKKr #machinelearning
-
Holonomic Unified Field Dynamics (HUFD): A Constant-Attention Manifold for Recurrent Field Computation *** I thank Dr. Paul Burchard for locating an omission on my part - it was late and I was tired - I omitted the definitions of ɸ and ψ; I have a corrected document with a new Section 2a containing these definitions, for anyone who is seriously reading this document. Thank you, again, Paul. *** Conventional recurrent neural networks (RNNs) and long short-term memory (LSTM) models maintain temporal dependencies through gate-based recurrences that scale linearly with sequence length. We introduce Holonomic Unified Field Dynamics (HUFD), a geometric generalization of recurrence derived from lattice-gauge field dynamics and Charlton Geometry. HUFD evolves a fixed-dimensional latent field connection on a constant-attention manifold, replacing temporal gating with a holonomic update rule governed by curvature and a lucidity-weighted Lyapunov functional. This system exhibits complexity with respect to context length, while preserving semantic coherence through self-organizing attractors (SOM) and recurrent attention-gated updates (RAGU). #AI #ArtificialIntelligence #Mathematics #HUFD #ConstantAttention #Tier4AI #CognitiveArchitecture #Innovation #MachineLearning #AutonomousSystems #ReflectiveAI #FutureOfAI #SelfModeling #Research #vonNeumannProbe #SentientAI
-
The Perceptron: Where Biology Meets Computing 🧠💻

Ever wondered what truly inspired the first Artificial Neural Network? The answer is biology! The foundational Perceptron, developed by Frank Rosenblatt, is a classic example of biomimicry in computer science. As the diagram beautifully illustrates, the artificial neuron is a direct, simplified model of its biological counterpart:

1. The Biological Neuron (The Structure)
Dendrites receive input signals from other neurons.
The Cell Body sums up the weighted signals and decides whether to "fire."
The Axon transmits the output signal to the next neuron.

2. The Artificial Perceptron (The Math)
Inputs (X_1 to X_n) are the data, weighted like the signals received by dendrites.
The Summation (sum) and Activation Function (f) perform the calculation (the sum of input * weight, plus a bias), mimicking the cell body's decision process.
The Output (Z) is the result (e.g., a classification), analogous to the signal transmitted by the axon.

This simple yet profound model, a linear classifier capable of learning by adjusting its weights, is the core concept that paved the way for modern AI. Even today's advanced models like GPT and Stable Diffusion are built upon this fundamental, biologically-inspired mechanism. Respect the roots!

Question for the Community: What's the most surprising real-world application you've seen for a simple, single-layer Perceptron model? Share your thoughts below!

#AI #MachineLearning #DeepLearning #Neuroscience #Perceptron #TechHistory #DataScience #FoundationsOfAI
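The summation, activation, and weight-adjustment steps described above can be sketched in a few lines of pure Python. This is an illustrative toy, not a reference implementation: the function names `predict` and `train`, the step activation, the learning rate of 1, and the AND-gate dataset are all assumptions chosen to keep the example small.

```python
def predict(weights, bias, inputs):
    """Weighted sum of the inputs plus a bias, passed through a step activation."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0  # "fire" only when the sum is positive


def train(samples, epochs=20, learning_rate=1.0):
    """Classic perceptron rule: nudge weights by learning_rate * error * input."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical toy dataset: the logical AND of two binary inputs.
AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train(AND_GATE)

if __name__ == "__main__":
    for inputs, target in AND_GATE:
        print(inputs, "->", predict(weights, bias, inputs), "(target", target, ")")
```

Because a single perceptron is a linear classifier, this works for linearly separable data such as AND; it cannot learn XOR, which needs at least one hidden layer.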
-
#highlycitedpaper Predicting the Influence of Soil–Structure Interaction on Seismic Responses of Reinforced Concrete Frame Buildings Using Convolutional Neural Network, by Jishuai Wang, Yazhou (Tim) Xie, Tong Guo and Zhenyu Du from Southeast University and McGill University ⭐Keywords: #soil–structure interaction; regional #seismic damage assessment; #RC frame; machine learning; convolutional neural network 🔗 Read for free at: https://lnkd.in/dukzCH29
-
I was just reading about it the other day. It’s fascinating how the Gompertz growth model translates to neural optimization: the e^(-e^(-x)) term creates a soft saturating gate that’s gentler than sigmoid but more expressive than ReLU. I’m curious whether the stability gains hold in very deep architectures and whether the broader weight distributions benefit transfer learning scenarios.
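To make the gate comparison in this comment concrete, here is a small pure-Python sketch that prints the Gompertz gate e^(-e^(-x)) next to the logistic sigmoid at a few points. The function names and the sample points are assumptions for illustration, and the snippet only compares gate values; it says nothing about training behaviour.

```python
import math


def gompertz_gate(x: float) -> float:
    """Gompertz gate e^(-e^(-x)); safe for the moderate inputs used here."""
    return math.exp(-math.exp(-x))


def sigmoid(x: float) -> float:
    """Logistic sigmoid 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


if __name__ == "__main__":
    for x in (-4, -2, 0, 2, 4):
        print(f"x={x:+d}  gompertz={gompertz_gate(x):.4f}  sigmoid={sigmoid(x):.4f}")
```

With t = e^(-x), the comparison reduces to e^(-t) versus 1 / (1 + t); since e^t > 1 + t for every t > 0, the Gompertz gate sits strictly below the sigmoid at every x. At x = 0 the gates are 1/e (about 0.368) and 0.5, which reflects the asymmetry the post mentions.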