The Validation Gap
I started my career on the avionics test rigs, testing and writing reports.
Integration rigs in aerospace hook up all the hardware, software, and emulated conditions, essentially testing the whole system before flight test. I'd run the test, document the result, close the action. The system meets the test case, the test case meets the requirement. Pass. Move on. If it failed, write it up and ship it back to design.
Years later I moved into cockpit design. Same company, different role. And those same test reports, some of them authored by me, started landing on my desk.
That's when my perspective started to shift. A test pilot I worked with helped too, if you've read my previous post.
The reports were sound. The tests had been run properly. The system had met the requirement. But sitting on the other side of that process, looking at what we were actually trying to build and how a pilot would use it, I could see the gap. As a test engineer, I had been working to meet the requirement as I understood it, without the context of the wider system, the operator, the HMI. I had spent years doing verification without fully understanding validation.
It's worth pausing on that distinction, because it's the thread that runs through everything that follows.
Verification asks: does the system do what we specified? Validation asks: did we specify the right thing?
With requirements and verification at the forefront, validation can take a back seat. That's where the gap opens.
𝗢𝗽𝗲𝗿𝗮𝘁𝗼𝗿𝘀 𝗱𝗼𝗻'𝘁 𝗿𝗲𝗮𝗱 𝘁𝗵𝗲 𝘀𝗽𝗲𝗰𝗶𝗳𝗶𝗰𝗮𝘁𝗶𝗼𝗻
This became especially clear when training members of the armed forces on a new piece of kit. Within a week they had surfaced issues that six months of in-house testing hadn't considered.
Were they being rough with it? Yes. But they saw it as a tool. We had been treating it as scientific equipment. We were incentivised to protect the kit, to keep testing, to avoid cost. They only cared whether it did the job under pressure. Both perspectives are valid. Only one was in the test programme.
This is the human dimension of the validation gap that rarely gets written about. Engineers are protective of the systems they build. Their incentives are different during development. But that produces test assumptions that reflect design intent, not operational use. The operator's view, that the kit is a means to an end, used hard in conditions nobody anticipated, has to be deliberately built into the programme. It doesn't have to be a test requirement. Analysis can capture it too. But it has to be captured somewhere.
The Pentagon's DOT&E report makes this finding year after year. Systems pass developmental test and fall short under operational conditions. The labels change: procurement problem, acquisition problem, requirements problem. But the gap underneath is consistent: the people who designed the test thought about the system differently to the people who had to use it. That is a systems issue, not an organisational one.
𝗜𝘁'𝘀 𝗮𝗹𝘀𝗼 𝘄𝗼𝗿𝘁𝗵 𝗰𝗼𝗻𝘀𝗶𝗱𝗲𝗿𝗶𝗻𝗴 𝗵𝗼𝘄 𝘃𝗲𝗿𝗶𝗳𝗶𝗰𝗮𝘁𝗶𝗼𝗻 𝗽𝗿𝗼𝗯𝗹𝗲𝗺𝘀 𝗰𝗼𝗺𝗽𝗼𝘂𝗻𝗱 𝘁𝗵𝗲 𝗴𝗮𝗽
Some standards are genuinely loose on what success looks like. Salt fog testing under DO-160 is a good example: there is real ambiguity about whether the test environment represents what an aircraft actually sees in service. The pass criteria exist. Whether the test maps to operational reality is a different question, and it doesn't always get asked.
The same looseness shows up in how standards get interpreted. The FAA, stakeholders, and suppliers each interpret the same requirement through a different lens, producing different conclusions about what passing means. We always reached agreement. But agreeing on an interpretation is not the same as validating it against operational reality. The argument gets resolved. The underlying question often doesn't.
𝗧𝗵𝗲 𝗰𝗼𝘀𝘁 𝗼𝗳 𝗳𝗶𝗻𝗱𝗶𝗻𝗴 𝗼𝘂𝘁 𝗹𝗮𝘁𝗲
My test team turned to me, on a miserable grey day on the water: "Do we have to run it again? We all know it's fixed."
Yes. We did. Because there was no other opportunity where all the kit was connected together, the fix had to be verified on trials. On the water. Burning day rate, vessel time, and weather window.
The F-35 programme is another well-documented example of what happens when integration environments are late or insufficiently representative. The capability existed. But not early enough, or at sufficient fidelity, to de-risk the system before the costs of late discovery compounded.
Early integration environments, such as System Integration Rigs and representative test benches, exist to prevent exactly this. Not just cheaper testing, but earlier testing. Built up incrementally as development progresses, so issues get found when they're cheap to fix rather than when the only option is the field.
The second loss is less obvious but just as real. Without early integration environments you don't just find things late, you burn the field time that was supposed to give you something else. Trials are expensive, weather-dependent, and limited. Every cycle spent on a known issue is a cycle you can't spend on the uncontrolled variable exposure that open water testing is actually for.
What you bring into expensive or unrepeatable test environments should be limited to the things that can only be tested there.
𝗧𝗵𝗲 𝗽𝗿𝗮𝗰𝘁𝗶𝗰𝗮𝗹 𝘀𝗵𝗶𝗳𝘁
Over my career in aerospace, maritime and defence, I've seen this pattern repeat.
The validation gap persists because the conditions that create it are normal. Schedules reward delivery to specification. Budgets cut integration environments early. Operators arrive at acceptance, not at requirements.
That isn't negligence. It's the default.
Closing the gap requires different decisions upstream — how systems are defined, who is in the room when requirements are written, and which environments get funded before the field becomes the only option.
If those decisions aren't made deliberately, operational reality makes them for you.
Usually at the worst possible time.
#systemsengineering #verification #validation #aerospace #defenceacquisition #avionics
Thank you for saying what needed to be said. (-From a system safety engineer)
There’s no excuse for this and it isn’t AI. AI should increase our ability to validate, focus the machine on the low hanging fruit and the invisible data anomalies (to most humans) and leave the humans to focus on validation that needs human thinking. AI should free humans to better focus validation
I've been associated with projects where there has been no test, verification or validation. What happens when AI gets it wrong? Scary.
You have good points here.
Agreed. There's been evidence since the 1990s that software V&V doesn't pick up errors that real-world use does.