-- Quantum ML Series 2 | Post 07 of 11 #QFD2 --

Post 6 ended with a working forward pass. Post 7 closes the loop. Today we train the quantum classifier for the first time.

Here is what we built: a complete hybrid quantum-classical model. The quantum circuit handles feature processing. A classical linear layer maps the 16 quantum outputs to 10 digit class probabilities. Both components train together end to end.

Cross entropy loss. Adam optimizer at 0.01 learning rate. 10 epochs. 500 training images.

The training loop looks like standard PyTorch. Zero gradients. Forward pass. Compute loss. Call loss.backward(). Step the optimizer.

The quantum specific part is invisible at this level. PennyLane handles the parameter shift rule gradient computation automatically. Gradients flow through the quantum circuit just like they flow through any classical layer.

Honest results from my run: after 10 epochs on 500 training images the loss went from 2.29 down to 1.61. Test accuracy: approximately 35 to 45 percent. Random chance on a 10 class problem is 10 percent. So the model learned something real.

But a classical logistic regression on the same 16 features would likely hit 60 to 70 percent. A classical CNN on the full image would hit 99 percent plus.

I am sharing the exact numbers, not the best case. That is the point of learning in public.

One honest observation. Training a quantum circuit is slow. Each backward pass runs the circuit twice per parameter for the parameter shift rule. With 32 quantum parameters that is 2 × 32 = 64 shifted circuit evaluations for each gradient, on top of the forward pass. We trained on 500 images for practical reasons. The full 60,000 image MNIST training set would take a very long time on a laptop simulator. That constraint is real. It goes in the benchmark comparison in Post 8.

Full article with all explanations below. https://lnkd.in/gBgf7avB

Article link in first comment 👇

I am also currently open to full stack development and quantum computing opportunities. Six years of coding experience. Building in QML. Looking for a team working on something technically interesting. If that sounds like your team, feel free to reach out or connect.

#QuantumComputing #QuantumML #MachineLearning #LearnInPublic #Developer #PennyLane #QML #Coding #FutureTech #QuantumMLSeries #OpenToWork #FullStackDeveloper
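For readers who want to see what "twice per parameter" means, here is a minimal sketch of the parameter shift rule on a toy one-qubit circuit, written in plain Python rather than PennyLane. The circuit (RY(theta) followed by a Z measurement), the function names and the cost model are illustrative assumptions, not the code from the article. The cost estimate assumes one gradient per image per epoch and that a single circuit run returns all measured outputs.

```python
import math


def expval_z_after_ry(theta: float) -> float:
    """Toy 'circuit': <Z> for one qubit prepared as RY(theta)|0>.

    RY(theta)|0> = cos(theta/2)|0> + sin(theta/2)|1>, so <Z> = P(0) - P(1).
    """
    p0 = math.cos(theta / 2) ** 2
    p1 = math.sin(theta / 2) ** 2
    return p0 - p1


def parameter_shift_grad(circuit, theta: float, shift: float = math.pi / 2) -> float:
    """Gradient from two shifted circuit evaluations.

    Valid for rotation gates such as RX, RY, RZ, whose expectation values
    are sinusoidal in the gate angle.
    """
    plus = circuit(theta + shift)
    minus = circuit(theta - shift)
    return (plus - minus) / (2 * math.sin(shift))


def shifted_evaluations(n_params: int, n_samples: int, n_epochs: int) -> int:
    """Hypothetical cost model: 2 circuit runs per parameter, per sample, per epoch."""
    return 2 * n_params * n_samples * n_epochs


theta = 0.7
grad = parameter_shift_grad(expval_z_after_ry, theta)
exact = -math.sin(theta)  # d/dtheta cos(theta)

# Figures from the post: 32 parameters, 500 images, 10 epochs.
total = shifted_evaluations(n_params=32, n_samples=500, n_epochs=10)

print(f"parameter-shift gradient: {grad:.6f}, analytic: {exact:.6f}")
print(f"shifted circuit evaluations for the whole run: {total}")
```

Unlike a finite-difference estimate, the two-evaluation formula is exact for this kind of gate, which is why it also works on real hardware. Under the stated assumptions the 500-image run already needs 320,000 shifted evaluations, which gives a feel for why the full 60,000-image set is impractical on a laptop simulator.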
Quantum ML Series 2: Training a Quantum Classifier
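The loss values reported in the post (2.29 at the start, 1.61 after 10 epochs) can be sanity-checked with a few lines of plain Python. This is a sketch of the standard softmax cross entropy formula, not the article's code, and the helper name is an assumption. A model that spreads its prediction evenly over 10 classes has a loss of ln(10), so a starting loss near 2.29 is what an untrained 10-class classifier should show.

```python
import math


def cross_entropy(logits: list[float], target: int) -> float:
    """Softmax cross entropy for one example, using a stable log-sum-exp."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_z - logits[target]


# Equal scores for all 10 digits: the model knows nothing yet.
uniform_loss = cross_entropy([0.0] * 10, target=3)

# exp(-loss) is the geometric-mean probability given to the correct class.
start_prob = math.exp(-2.29)
end_prob = math.exp(-1.61)

print(f"loss of a uniform 10-class guess: {uniform_loss:.4f}")
print(f"mean true-class probability at loss 2.29: {start_prob:.2f}")
print(f"mean true-class probability at loss 1.61: {end_prob:.2f}")
```

Read this way, the drop from 2.29 to 1.61 means the typical probability assigned to the correct digit roughly doubled, from about 0.10 to about 0.20.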
More Relevant Posts
-
Qiskit QuantumKatas: Benchmarking Large Language Models for Quantum Computing Code Generation IBM researchers adapted Microsoft's QuantumKatas from Q# to Qiskit and evaluated 16 LLMs across 350 quantum programming tasks. Frontier models achieve up to 83% accuracy implementing known algorithms but struggle at 34% on problem encoding, revealing quantum reasoning gaps bey... #QuantumComputing #LLMs #Benchmarking #Research #Informaq
-
Is scaling attention sufficient for long-context modeling, or do LLMs need a new form of memory?

It feels like we’re moving toward models that don’t just read context, but actively learn from it in real time. This shift becomes especially important as long-context modeling may not scale purely through attention. As context grows (100k+ tokens), attention faces challenges like signal dilution, quadratic cost, and memory overhead.

I’ve been diving into recent work on Neural Memory approaches in LLMs (e.g., In-Place TTT, Titans), and they offer a different path:

👉 Instead of attending over everything, the model learns what to remember and compresses it into its weights.

This effectively turns parts of the network (especially MLP blocks) into associative memory systems.

This brings key advantages for long-context scenarios:
• Reduces reliance on massive KV-cache storage
• Improves retrieval of relevant information from distant context
• Scales more gracefully beyond fixed attention windows

Curious to see how this evolves, especially for long-context reasoning and memory-efficient architectures.

#LLM #NeuralMemory
-
Most AI Engineers work at the very top of the stack. Today, I went straight to the silicon! 🛠️⚡🏎️

For the past year, I’ve been building and deploying AI solutions at the software layer. 🧠💻 But here is the hard truth: high-level abstractions like PyTorch are amazing, but they completely hide the beautiful, raw complexity of the hardware running them. 🤯📉 If you want to truly optimize AI models, you have to understand the actual physics of the chip! 🧬🔌

To bridge this gap, I bypassed all the wrappers and wrote my first custom Vector Addition Kernel from scratch using C++/CUDA! 👨‍💻🔥🛰️

It’s a foundational step, but orchestrating the underlying hardware architecture was an absolute mind-bender: 🧱💥
🔹 Explicit Memory Management: Moving away from automatic garbage collection to handling raw VRAM allocation (cudaMalloc) and orchestrating host-to-device data transfers (cudaMemcpy). 🔄💾
🔹 The Indexing Grid: Mapping complex thread geometries (Grids and Blocks) using built-in indexing variables to hit specific memory addresses in parallel! 📐🌐
🔹 The SIMT Paradigm: Shifting my software mindset from sequential CPU loops to thinking in terms of 32-thread Warps executing instructions in lockstep! 🌪️⚙️

Optimization isn't just about writing cleaner code; it's about respecting physical hardware constraints: latency vs. throughput, memory alignment, and arithmetic intensity. 📊⚖️

This is just the beginning of a long, deep-dive journey into hardware-aware optimization and modern GPU architectures (Blackwell)! 🐉🗻 Next major milestone: Tiled Matrix Multiplication and profiling memory bottlenecks with Nsight Compute! 🎯🏁

💡 To the Systems, CUDA, and Compiler Engineers in my network: What is the number one trap or anti-pattern beginners fall into when shifting from CPU processing to parallel GPU programming? 🤔💭 Would love to hear your secret hacks in the comments below! 👇🗣️

#CUDA #NVIDIA #HighPerformanceComputing #SystemsEngineering #GPUArchitecture #DeepLearning #CppProgramming #HardwareAcceleration 🚀🔥
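The grid and block indexing mentioned in this post can be mimicked in plain Python to show how each thread finds its element. This is only a sequential simulation under assumptions: a 1D grid, a hypothetical block size of 4, and a function name invented for illustration. It is not the author's kernel and it does not run on a GPU.

```python
def vector_add_simulated(a: list, b: list, block_dim: int = 4) -> list:
    """Sequentially replay what each CUDA thread would do in a 1D vector add."""
    n = len(a)
    # Enough blocks to cover every element (ceiling division).
    grid_dim = (n + block_dim - 1) // block_dim
    out = [0] * n
    for block_idx in range(grid_dim):        # plays the role of blockIdx.x
        for thread_idx in range(block_dim):  # plays the role of threadIdx.x
            i = block_idx * block_dim + thread_idx  # global element index
            # Bounds guard: the last block may have threads past the end.
            if i < n:
                out[i] = a[i] + b[i]
    return out


result = vector_add_simulated([1, 2, 3, 4, 5], [10, 20, 30, 40, 50])
print(result)
```

With 5 elements and a block size of 4 the grid has 2 blocks, so 3 of the 8 simulated threads fall outside the array. Forgetting the `i < n` guard is a common early mistake in real kernels, because the out-of-range threads would then read and write past the buffer.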
-
Excited to share a glimpse of something I have been building — X2DHF (X to Distributed Hartree-Fock), a production-oriented workspace designed for Python-based Hartree-Fock and Density Functional Theory (DFT) computations. I started building this platform to streamline computational chemistry workflows by bringing together a structured, interactive, and scalable environment for molecular simulations, electronic structure calculations, energy computations, and research-driven experimentation. The vision is to bridge scientific computing with a modern, intuitive interface that makes complex quantum chemistry workflows more accessible, manageable, and reproducible. What excites me the most is not just the technology, but the purpose behind it: X2DHF is being built to be absolutely free for everyone. I strongly believe advanced scientific tools and computational research infrastructure should not be restricted by accessibility barriers. Whether someone is a student, researcher, independent learner, or enthusiast, they should have the opportunity to explore and work with computational chemistry without financial limitations. From workflow organisation to computational insights, I wanted to create something that feels closer to a research operating environment rather than just another simulation interface. Watching it evolve from an idea into a working beta has been both challenging and rewarding. Still building, still improving, and still learning — but genuinely excited about where this is heading. #ComputationalChemistry #QuantumChemistry #DensityFunctionalTheory #HartreeFock #ScientificComputing #Python #ResearchAndDevelopment #ComputationalScience #OpenScience #FreeSoftware #STEM #Innovation #SoftwareDevelopment #Chemoinformatics #ScienceTech
-
Tensor parallelism + sequence parallelism — the two sharding tricks behind every modern LLM training and inference stack. My new post on Medium. A beginner-friendly walk-through that doesn't skip the math — with the matmul algebra worked out in equations and the algorithms drawn in pictures. https://lnkd.in/eE9gWQBb #MachineLearning #DistributedTraining #LLM #Transformers
-
Cirq is an open-source Python framework developed by Google's Quantum AI team specifically for programming NISQ (Noisy Intermediate-Scale Quantum) computers. Unlike general-purpose libraries, Cirq is "hardware-aware," allowing researchers to write code that accounts for the specific physical constraints and noise of a processor.

Core Features and Components
1. Circuit Construction: Programs are represented as Circuits divided into Moments. Each moment contains gates that can be executed simultaneously on different qubits.
2. Hardware Interfacing: It provides direct control over qubit arrangements (e.g. GridQubit) and coupling, which is critical for optimizing performance on real devices.
3. High Performance Simulators: Includes cirq.Simulator for pure state simulations and #qsim for large-scale, high-performance simulations on classical hardware.
4. Noise Modeling: Researchers can add realistic noise models (like cirq.depolarize) to simulate how hardware errors will impact their algorithms.

The Software Ecosystem
Cirq serves as the foundational layer for several advanced research libraries:
#TensorFlowQuantum (TFQ): For hybrid quantum-classical machine learning.
#OpenFermionCirq: For simulating quantum chemistry and material science.
#Qualtran: For designing algorithms intended for future fault-tolerant quantum computers.

Entanglement of two qubits using cirq 👇🏻

import cirq

# 1. Create two qubits on a grid (mimicking physical hardware layout)
q0 = cirq.GridQubit(0, 0)
q1 = cirq.GridQubit(0, 1)

# 2. Build the entanglement circuit
# Step A: Put q0 into superposition (Hadamard gate)
# Step B: Link q0 and q1 using a Controlled-NOT (CNOT) gate
circuit = cirq.Circuit(
    cirq.H(q0),
    cirq.CNOT(q0, q1),
    cirq.measure(q0, q1, key='result')
)
print("--- Entanglement Circuit ---")
print(circuit)

# 3. Simulate the results
simulator = cirq.Simulator()
samples = simulator.run(circuit, repetitions=10)
print("\n--- Correlation Results ---")
print(samples)

#googlecirq #quantumentanglement #NISQ #mitiq #zne
-
Second patent filing of the year 🎉 Different layer of the stack this time. This one goes inside the shell (2026/004809). TÜRKPATENT 2026/007632, filed May 14, owned by İstinye Üniversitesi Title: "𝘏𝘢𝘻𝘢𝘳𝘥 𝘍𝘶𝘯𝘤𝘵𝘪𝘰𝘯-𝘉𝘢𝘴𝘦𝘥 𝘚𝘱𝘪𝘬𝘪𝘯𝘨 𝘕𝘦𝘶𝘳𝘰𝘯 𝘍𝘢𝘮𝘪𝘭𝘺, 𝘓𝘪𝘲𝘶𝘪𝘥 𝘌𝘹𝘵𝘦𝘯𝘴𝘪𝘰𝘯, 𝘢𝘯𝘥 𝘕-𝘐𝘴𝘭𝘢𝘯𝘥 𝘏𝘦𝘵𝘦𝘳𝘰𝘨𝘦𝘯𝘦𝘰𝘶𝘴 𝘕𝘦𝘶𝘳𝘰𝘮𝘰𝘳𝘱𝘩𝘪𝘤 𝘗𝘳𝘰𝘤𝘦𝘴𝘴𝘪𝘯𝘨 𝘚𝘺𝘴𝘵𝘦𝘮." 𝗖𝗼𝗿𝗲 𝗶𝗱𝗲𝗮: a spiking neuron with two parallel hazard pathways. Threshold-crossing (the classical one) plus barrier-penetration (the new one, sub-threshold firing via a quantum tunneling analogy). The result is TH-LIF. Add input-dependent hazard parameters (the 'liquid' part of the neuron) and you get L-TH-LIF, no external hypernetwork needed. So I went the other direction this time. Instead of fixing the system architecture first and adding neurons later, the neurons themselves became the patent core. 𝗩𝗮𝗹𝗶𝗱𝗮𝘁𝗶𝗼𝗻: • TH-LIF gets 99.3% F1 on RDRD (8-class human activity radar), 27% lower energy than the CNN equivalent • SHD temporal ablation: standard LIF loses 2.7%, TH-LIF loses 17.4%. The 6.4× gap confirms the TNS metric for temporal-information reliance • Arty A7-35T FPGA: 171 LUT, ≤240 pJ per neuron step, bit-exact across 5 references (Python float, Q4.12 fixed-point, Icarus, xsim, hardware ILA). High confidence SAIF-annotated. And 𝘁𝗵𝗲 𝗰𝗼𝗱𝗲 𝗶𝘀 𝗼𝘂𝘁 today. Reference implementation on 𝗚𝗶𝘁𝗛𝘂𝗯: all 17 patent-covered models, the custom TH-LIF CUDA kernel (≈312× faster than PyTorch eager), the FPGA validation database with RTL and LUT memory init files. Apache 2.0, Zenodo DOI archived. 🔗 https://lnkd.in/d8yQxReF DOI: 10.5281/zenodo.20201251 Thanks to İstinye TTO for the support through both filings. #NeuromorphicComputing #SpikingNeuralNetworks #EdgeAI #Patent #DeepTech #GitHub #SpikingNeuronModels
-
Quantum Computing. Right now they are built to purpose. What does that mean? Well, an easy way to describe it is to talk about modern software and processors. I’m going to be loose and dirty in my description, so if you’re a computer science professional I apologize.

The core of today’s processors runs on what are essentially wired logic gates. Essentially a bunch of on/off switches composed and in series. Hundreds of billions of them in some cases. This translates to binary code. 1s & 0s. Programming languages sit on top of this binary, using compilers. We use them every day.

Quantum assembly language is based on Temperley-Lieb-Jones algebra. Essentially the language of 3D knots projected into 2D space. Estimated time to general quantum compiler completion is 2029.

You think the Agentic recursion cycle is crazy, wait till AI is coupled with quantum processing. There are no words for the outcome, in my opinion.

Until then, quantum processors are built specifically for one problem at a time. It’s like building a computer to add two numbers together, and only “those” specific two numbers.
-
Earlier this year, I spoke with a Compiler Engineer who felt stuck. Technically strong, solving difficult problems, but increasingly feeling like a small cog in a very large machine. They wanted more ownership, more influence, and work that genuinely moved the needle.

After talking it through, it became clear the issue wasn’t capability; it was environment.

Fast forward a few months and they’re now building compiler infrastructure for next-generation AI hardware, working directly alongside architects and research teams. The move brought greater ownership, faster decisions, stronger upside, and most importantly, the feeling that their work matters again.

I’m hearing this more and more across Compiler Engineering. The best engineers rarely move purely for money; they move for impact. Whether it’s MLIR, LLVM, AI accelerators, or quantum compilers, the message is the same: people want to build, not just maintain. The companies winning compiler talent in 2026 understand this.

What are you seeing in the compiler and AI infrastructure market?

#CompilerEngineering #LLVM #MLIR #AIInfrastructure #AIHardware
-
The biggest bottleneck in training long-context models has always been the sheer computational cost of attention. Most researchers try to fix this by changing the architecture permanently, but Nous Research is taking a much smarter approach with Lighthouse Attention. By using a hierarchical mechanism during pretraining and then stripping it away afterward, they’re getting a massive 1.4–1.7x speedup without changing the final model's structure. It’s a clever way to optimize the "heavy lifting" phase of development without adding complexity to the inference stage. This kind of training-only optimization is exactly how we'll scale context windows efficiently. More details here: https://lnkd.in/dzBiJPM7
Article 7 -> https://www.linkedin.com/pulse/training-quantum-neural-network-amit-kumar-eyffc New here? This is Series 2: Quantum ML for Developers. Start with Series 1 (Quantum Computing for Developers) first: linkedin.com/feed/update/urn:li:activity:7436025619882307584 Series 1 covers all the quantum fundamentals: qubits, superposition, entanglement, gates, circuits and algorithms. All 10 posts live. Come back here when you are ready to build.