Would You Put Your Kids on an AI-Driven Bus?
Imagine it’s 2030. You’re standing at a pickup zone in your neighborhood. A small autonomous shuttle pulls up, doors open. Your children climb in with their backpacks. No driver. No attendant. Just your kids, an AI, and the morning commute to school.
Would you let them go?
This isn't a thought experiment for some distant future. In Riyadh, autonomous vehicle and robotaxi trials have already begun at key locations including the airport and educational districts. WeRide received the first autonomous driving permit in the Kingdom and has already completed pilot operations with Uber and Ai Driver, serving over 1,000 users. Saudi Arabia's new technical regulations for autonomous vehicles entered into force this month, April 2026, setting out requirements for sensing, perception, decision-making, fault response, and unexpected conditions.
The technology is arriving. The regulations are arriving. The question is whether we’ve earned the right to trust it.
The Promise Is Real
AI is transforming Saudi Arabia faster than most people realize. The National Strategy for Data and Artificial Intelligence aims to elevate Saudi Arabia into the top 15 nations globally for AI readiness by 2030. And the ambition goes well beyond transportation.
In healthcare, AI models can detect cancer from pathology slides with over 90% accuracy. In energy, AI optimizes everything from geothermal wells to solar farm placement. In logistics, AI-driven supply chain management is helping the Kingdom meet its goal of growing the logistics sector’s GDP contribution from 6% to 10% by 2030. The government targets 15% of public transport vehicles and 25% of all goods transport vehicles operating autonomously by 2030. With over 60% of the Saudi population under the age of 35, the country is working to build a new generation of AI-capable professionals through initiatives like KFUPM’s AI+X program, which embeds AI education across every undergraduate major.
The Stanford AI Index 2025 found that Saudi Arabia ranks among the top four most AI-optimistic countries in the world. People here believe AI will make life better. And in many ways, it already is.
But optimism without verification is just hope. And hope is not a safety strategy.
The Gap Between “Works” and “Trustworthy”
Here is the uncomfortable truth about AI systems today. A cancer detection model trained at one hospital scores 93% accuracy, then drops to 70% when deployed at another hospital, simply because the lighting and camera equipment changed. An image classifier looks at a cat covered in elephant-skin texture and confidently says “elephant.” A self-driving car’s perception system misses a traffic light after encountering noise patterns it never saw during training.
These aren’t rare glitches. They’re structural properties of how neural networks learn. The models find shortcuts in the training data (background textures, hospital-specific imaging artifacts, reviewer-specific writing patterns) and mistake those shortcuts for the real thing. When the real world doesn’t match the training data, the model fails quietly and confidently.
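To see how little it takes, here is a minimal sketch of shortcut learning. Everything in it is synthetic and invented for illustration (the "hospital" framing included): a classifier is given a weak genuine feature and a spurious one that tracks the label 95% of the time in training but only at chance in deployment.

```python
# Minimal illustration of shortcut learning: a classifier latches onto a
# spurious feature that tracks the label in training but not in deployment.
# All data is synthetic; the "hospital" framing is purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_data(n, shortcut_agreement):
    """'signal' is weakly predictive everywhere; 'shortcut' (think: an
    imaging artifact) agrees with the label with the given probability."""
    y = rng.integers(0, 2, n)
    signal = y + rng.normal(0, 2.0, n)                   # weak, genuine feature
    agree = rng.random(n) < shortcut_agreement
    shortcut = np.where(agree, y, 1 - y) + rng.normal(0, 0.1, n)
    return np.column_stack([signal, shortcut]), y

X_train, y_train = make_data(5000, shortcut_agreement=0.95)  # training site
X_test, y_test = make_data(5000, shortcut_agreement=0.50)    # new site

model = LogisticRegression().fit(X_train, y_train)
print(f"training-site accuracy: {model.score(X_train, y_train):.2f}")  # ~0.95
print(f"new-site accuracy:      {model.score(X_test, y_test):.2f}")    # ~0.60
```

The model never malfunctions in any detectable way. It simply learned the wrong lesson, and the benchmark score says nothing until the shortcut breaks.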
A recent readiness assessment for autonomous vehicles in Saudi Arabia published in Research in Transportation Business & Management (2025) identified the same concern: while government investment and pilot projects are strong, critical gaps remain in the regulatory frameworks, cybersecurity protections, and safety infrastructure needed for large-scale deployment. Public acceptance is moderate, with security concerns and limited public awareness still weighing on trust.
For a recommendation algorithm that suggests the wrong movie, this doesn’t matter much. For an autonomous vehicle carrying your children, it matters completely.
What “Trustworthy” Actually Requires
At the AI V&V Lab at KFUPM, we work on the science of making AI systems provably safe. Not just “tested on a benchmark and scored well” safe, but mathematically guaranteed safe within defined operating conditions.
This involves three layers of assurance that work together.
Before deployment, verify the math. Neural network verification techniques can formally prove that a network's outputs stay within safe bounds for all possible inputs in a defined region. Not for a sample of inputs. For all of them. This is the difference between testing a bridge by driving a few trucks across it and proving the bridge can hold any truck within its weight rating. Methods like Reluplex extend the classical simplex algorithm to handle the ReLU nonlinearities in neural networks, making formal verification of AI systems practical for the first time.
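Reluplex itself is too involved for a blog post, but the simplest idea behind many verifiers, interval bound propagation, fits in a few lines. This sketch uses toy weights of our own invention; the point is that it produces sound bounds on the output for every input in the box, not just the ones we happened to sample.

```python
# Interval bound propagation (IBP): a sound (if loose) way to bound a ReLU
# network's outputs over an entire box of inputs. Toy weights, illustration only.
import numpy as np

def interval_affine(lo, hi, W, b):
    """Propagate the box [lo, hi] through x -> W @ x + b.
    Positive weights map lower bounds to lower bounds; negative weights flip."""
    W_pos, W_neg = np.maximum(W, 0), np.minimum(W, 0)
    return W_pos @ lo + W_neg @ hi + b, W_pos @ hi + W_neg @ lo + b

def interval_relu(lo, hi):
    # ReLU is monotone, so it maps interval endpoints to interval endpoints.
    return np.maximum(lo, 0), np.maximum(hi, 0)

# A tiny two-layer ReLU network with illustrative weights.
W1 = np.array([[1.0, -0.5], [0.3, 0.8]]); b1 = np.array([0.1, -0.2])
W2 = np.array([[0.7, -1.2]]);             b2 = np.array([0.0])

# Question: for ALL inputs with each coordinate in [-0.1, 0.1],
# does the output stay below a safety threshold of 0.5?
lo, hi = np.array([-0.1, -0.1]), np.array([0.1, 0.1])
lo, hi = interval_relu(*interval_affine(lo, hi, W1, b1))
lo, hi = interval_affine(lo, hi, W2, b2)
print(f"output guaranteed in [{lo[0]:.3f}, {hi[0]:.3f}]")
print("property PROVED" if hi[0] < 0.5 else "inconclusive (bounds too loose)")
```

When the bounds are too loose to decide, tools like Reluplex refine the analysis rather than give up, which is what separates a verifier from a heuristic.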
During development, find the failures before the world does. Adaptive stress testing and importance sampling intelligently search for the conditions most likely to cause problems, rather than driving billions of miles and hoping to stumble across rare failures at random. In our research, we've shown that these techniques can find potential failures roughly 1,000x faster than conventional random testing.
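A toy sketch of the importance-sampling half of that idea follows. The one-line "simulator" and its failure threshold are invented for illustration; a real study wraps a full driving simulator. The trick is to sample from a proposal distribution aimed at the failure region, then reweight so the probability estimate stays honest.

```python
# Toy importance sampling for rare-failure estimation. The "simulator" and
# failure threshold are invented; real stress testing wraps a full simulator.
import numpy as np

rng = np.random.default_rng(0)

def failure(noise):
    """Hypothetical stand-in for a simulator rollout: the perception stack
    'fails' when sensor noise exceeds a rare threshold."""
    return noise > 4.0   # ~3e-5 probability under the nominal N(0, 1) model

N = 20_000

# Naive Monte Carlo: sample nominal conditions and hope to hit a failure.
naive = rng.normal(0.0, 1.0, N)
print("naive MC failures found:", failure(naive).sum())        # often zero

# Importance sampling: draw from a proposal shifted toward the failure
# region, then reweight by the likelihood ratio p(x)/q(x) to stay unbiased.
mu = 4.0                                   # proposal mean, aimed at failures
proposal = rng.normal(mu, 1.0, N)
log_w = -0.5 * proposal**2 + 0.5 * (proposal - mu)**2   # log p(x)/q(x)
hits = failure(proposal)
print("IS failures found:", hits.sum())                        # thousands
print(f"estimated failure probability: {np.mean(np.exp(log_w) * hits):.2e}")
```

With the same 20,000 simulations, the naive search typically finds nothing while the guided search finds thousands of failure cases and still recovers the true (tiny) failure probability.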
During operation, know when to say “I don’t know.” Even a verified model will eventually encounter something outside its operational design domain. Runtime monitoring systems use AI embeddings to detect when the world has shifted beyond what the model was trained for, and trigger safe fallback behaviors (slow down, alert a human, switch to a backup system) before a failure occurs.
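Here is a minimal sketch of the embedding-distance version of that monitor. The random vectors below stand in for real model embeddings, and the threshold, dimensions, and fallback message are all our own placeholders: fit the monitor on training embeddings, calibrate on held-out data, and fall back whenever a new input sits too far from everything seen before.

```python
# Minimal runtime monitor: flag inputs whose embedding drifts far from the
# training distribution. Random vectors stand in for real model embeddings.
import numpy as np

rng = np.random.default_rng(0)

# "Training" embeddings define what normal looks like.
train_emb = rng.normal(0.0, 1.0, size=(10_000, 32))
mu = train_emb.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(train_emb, rowvar=False))

def mahalanobis(e):
    d = e - mu
    return float(np.sqrt(d @ cov_inv @ d))

# Calibrate a threshold on held-out in-distribution data (99.9th percentile).
calib = rng.normal(0.0, 1.0, size=(2_000, 32))
threshold = np.percentile([mahalanobis(e) for e in calib], 99.9)

def monitor(embedding):
    """Return a fallback action instead of trusting the model when the
    input looks unlike anything seen during training."""
    if mahalanobis(embedding) > threshold:
        return "FALLBACK: slow down, alert operator"
    return "NOMINAL: proceed"

print(monitor(rng.normal(0.0, 1.0, 32)))   # in-distribution -> NOMINAL
print(monitor(rng.normal(3.0, 1.0, 32)))   # shifted world   -> FALLBACK
```

The design choice that matters is calibrating the threshold on data the monitor has never scored before, so "I don't know" fires on genuine novelty rather than ordinary variation.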
None of these layers is optional. A trustworthy AI system needs all three.
Why This Matters for Saudi Arabia Specifically
Saudi Arabia is not just adopting AI. It is building entire cities and national infrastructure around it. The Kingdom’s new autonomous vehicle regulations set technical requirements across sensing, perception, decision-making, vehicle control, and fault response. The Saudi Road Code Volume 801 establishes infrastructure readiness and safety oversight standards. These are serious, comprehensive frameworks.
But regulations define what must be safe. The V&V methods define how to prove it. Saudi Arabia’s road fatality rate has fallen from 28.8 per 100,000 people in 2016 to 18.5 in 2021, but that figure still sits above the global average of 15. Autonomous vehicles have the potential to reduce these numbers further by minimizing human error, but only if the AI systems themselves are verified to handle the conditions they’ll actually encounter: desert sandstorms, extreme heat, unfamiliar road geometries, and driving behaviors that differ from Silicon Valley training data.
The Kingdom has a genuine strategic advantage here. Saudi Arabia is approaching AI adoption with an emphasis on ethics, fairness, and security, with national guidelines governing the design, use, and regulation of AI. SDAIA chairs global forums on AI governance, including the annual Global AI Summit in Riyadh. The research infrastructure, from KFUPM's AI V&V Lab to the SDAIA-KFUPM Joint Research Center for AI and the IRC for Smart Mobility and Logistics, is actively producing the verification and validation methods these deployments will need.
But the window between “we have pilot programs” and “this is daily infrastructure” is closing fast. The V&V frameworks need to scale with the deployments, not chase them.
Building AI Worthy of Your Trust
The illustration at the top of this article captures something that no benchmark score ever will. A parent letting go of their child’s hand. A moment of trust extended to a machine.
That trust isn’t earned by impressive demos or high accuracy on test sets. It’s earned by proving, mathematically and operationally, that the system will behave safely even in the scenarios you haven’t imagined yet. It’s earned by building systems that know when they’re uncertain and act accordingly. It’s earned by validating AI against the actual conditions of the communities it serves, not just the conditions of the lab where it was built.
This is what we work on at the AI V&V Lab. Not because safety is a checkbox on the way to deployment, but because the people who will ride in these vehicles, receive these diagnoses, and live in these cities deserve AI systems that are genuinely worthy of their confidence.
The technology to build trustworthy AI exists. The question is whether we’ll invest in it as seriously as we invest in the AI itself.