Waymo Robotaxis Stranded in San Francisco

A case study for uncovering and resolving AV edge cases using hyperscale virtual simulation

Reacting to emergency and priority vehicles such as fire trucks is one of the challenges autonomous vehicles (AVs) face in urban environments. Although there are good examples of Waymo AVs reacting appropriately to emergency vehicles, Waymo and Cruise vehicles have reportedly impeded emergency vehicle response 66 times this year, according to San Francisco Fire Department reporting.

Figure 1 – Waymo incident in San Francisco (Photo courtesy of Carrie Haverty)

For example, in April 2023, a Waymo vehicle blocked a fire truck in San Francisco. Waymo described the incident as follows: “An autonomously driven vehicle from Waymo was traveling on a narrow street with parked cars to the left and right. Due to the parked cars, narrow street, and people in the road and near the car, our vehicle was unable to immediately move for a firetruck attempting to enter the street.”

Previous case studies demonstrated how hyperscale virtual testing can help uncover rare edge cases before vehicles are deployed. In this case study, Jan Fiala, a verification & validation expert at Foretellix, looks at the abstract scenario of an AV approaching a priority vehicle on a narrow road. Using the Safety-Driven V&V workflow, he applies constraints to recreate in simulation an issue discovered during physical driving and to identify its root cause more efficiently. Once the issue is fixed, he removes the constraints to automatically generate and test many variations, to ensure that similar issues do not recur when conditions differ from the original incident.

Recreating the issue in simulation  

In the April incident, the AV drove uphill on a narrow road with cars parked on both sides. As the AV approached an intersection, a fire truck attempted to turn into the same street. The AV stopped in the middle of the road, and the fire truck could not go around it.

Jan defined the abstract scenario using ASAM OpenSCENARIO® 2.0 code (Figure 2). The abstract scenario includes the actors involved: the System Under Test (SUT), an oncoming priority vehicle of a generic type, and the parked cars. It also defines multiple parameters, including the time of day, weather conditions, the number and density of parked vehicles in each direction, and the geometry of the narrow road. The Foretify™ platform later randomizes all of these parameters during test generation, unless the user constrains them.

Figure 2 – OpenSCENARIO 2.0 Abstract Scenario
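To make the idea of an abstract scenario more concrete, here is a minimal Python sketch of a parameter space from which concrete scenarios are drawn. It is not OpenSCENARIO 2.0 code and does not use the Foretify API; the class name, parameter names, value lists, and ranges are all assumptions chosen for illustration.

```python
import random
from dataclasses import dataclass

# Hypothetical parameter domains; names and ranges are illustrative only.
TIMES_OF_DAY = ["day", "dusk", "night"]
WEATHER = ["clear", "rain", "fog"]
PRIORITY_VEHICLES = ["fire_truck", "ambulance", "police_car"]


@dataclass(frozen=True)
class NarrowRoadScenario:
    time_of_day: str
    weather: str
    priority_vehicle: str
    parked_left: int      # number of parked cars on the left side
    parked_right: int     # number of parked cars on the right side
    road_width_m: float   # curb-to-curb width in meters


def sample_scenario(rng: random.Random) -> NarrowRoadScenario:
    """Draw one concrete scenario from the abstract parameter space."""
    return NarrowRoadScenario(
        time_of_day=rng.choice(TIMES_OF_DAY),
        weather=rng.choice(WEATHER),
        priority_vehicle=rng.choice(PRIORITY_VEHICLES),
        parked_left=rng.randint(0, 10),
        parked_right=rng.randint(0, 10),
        road_width_m=round(rng.uniform(6.0, 9.0), 2),
    )


# A seeded generator makes every drawn scenario reproducible.
rng = random.Random(42)
for _ in range(3):
    print(sample_scenario(rng))
```

Seeding the generator matters in practice: a failing variation can be regenerated exactly from its seed when it needs to be debugged.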

If we had access to the actual driving log from the SUT, we could have started by using the Foretify™ LogIQ tool to automatically generate the OpenSCENARIO 2.0 code for the concrete scenario that occurred, so that it could be easily recreated in simulation. Alternatively, we can constrain the relevant parameters of the abstract scenario, such as the location on the map, priority vehicle type, and weather, to force the Foretify platform to recreate the conditions seen in the real-world scenario (Figure 3).

Figure 3 – Recreating the incident by constraining the abstract scenario
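The effect of constraining an abstract scenario can be sketched in a few lines of Python. This is an illustration only, not the OpenSCENARIO 2.0 constraint syntax or the Foretify API. The parameter names, map locations, and the pinned values (including the weather) are placeholders, not the actual conditions of the incident.

```python
import random

# Hypothetical parameter domains (illustrative, not taken from the case study).
DOMAINS = {
    "location": ["hill_street_a", "hill_street_b", "flat_street_c"],
    "priority_vehicle": ["fire_truck", "ambulance", "police_car"],
    "weather": ["clear", "rain", "fog"],
    "time_of_day": ["day", "dusk", "night"],
    "parked_cars_per_side": list(range(0, 11)),
}


def generate(rng, constraints=None):
    """Pick a value for every parameter.

    Parameters named in `constraints` are restricted to the listed values;
    all other parameters stay fully random.
    """
    constraints = constraints or {}
    scenario = {}
    for name, domain in DOMAINS.items():
        allowed = [v for v in domain if v in constraints.get(name, domain)]
        if not allowed:
            raise ValueError(f"constraint on {name!r} leaves no legal value")
        scenario[name] = rng.choice(allowed)
    return scenario


# Placeholder constraints that pin the run to incident-like conditions.
incident = {
    "location": ["hill_street_a"],
    "priority_vehicle": ["fire_truck"],
    "weather": ["clear"],
}
print(generate(random.Random(1), incident))
```

Calling `generate` without constraints returns to the fully random abstract scenario, which is the step taken later once the fix is in place.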

With the OpenSCENARIO 2.0 code ready, Jan used Foretify and the CARLA simulator to reproduce the scenario. Foretify enables analysis of the recreated scenario in simulation, with the ability to visualize the scenario and SUT signal traces to aid in debugging the scenario until the root cause is understood.  

Video 1 – Incident recreation in Foretify Developer 

Validating the issue resolution  

After the root cause was identified, a software update was implemented that enables the AV to maneuver safely between two parked cars and yield to the oncoming fire truck. The scenario was then retested in simulation to confirm that the fix works under the same conditions as the original problem (Video 2).

Video 2 – Simulation of the software fix

Validating the software fix under many scenario variations  

One of the biggest challenges for autonomous vehicles is the huge variability and unpredictability of road conditions and actor behavior. Although the software fix was shown to resolve the issue under the specific circumstances of the original scenario, how can we know whether the AV will properly handle similar situations under different conditions? Some variations are rare, yet they may be safety-critical. Furthermore, Waymo and others still operate relatively small test fleets. Statistically, such rare and possibly dangerous edge cases will surface more often as larger fleets are deployed. Waiting for these rare cases to occur on the road is not cost-efficient, further delays large-scale AV deployment, and potentially puts lives at risk.

Smart hyperscale virtual testing can verify the software update, help ensure new bugs were not introduced, and increase confidence in large-scale safe deployment. Since the scenario is written in OpenSCENARIO 2.0, generating an unlimited number of scenario variations, including automatically looking for edge cases, is straightforward. The Foretify platform includes a constrained-random test generator to create as many scenario variations as needed. It automatically solves all physical and system constraints to ensure that relevant, valid, and interesting scenarios are generated, and it can automatically select different locations on any map to execute the scenario on many different road types and geometries. 
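A rough way to picture constrained-random generation is the Python sketch below. It uses plain rejection sampling as a stand-in for a real constraint solver, so it illustrates the idea rather than how Foretify works internally. All dimensions, parameter names, and ranges are assumptions.

```python
import random

# Assumed dimensions in meters; purely illustrative.
CAR_LENGTH_M = 4.5
MIN_GAP_M = 1.0
CAR_WIDTH_M = 2.0
TRUCK_WIDTH_M = 2.6
SUT_WIDTH_M = 2.0


def is_valid(s):
    """Physical constraint: the parked cars must fit along the street."""
    needed = s["parked_per_side"] * (CAR_LENGTH_M + MIN_GAP_M)
    return needed <= s["street_length_m"]


def is_relevant(s):
    """Scenario intent: the truck fits between the parked cars,
    but the truck and the SUT cannot pass each other."""
    free_width = s["road_width_m"] - 2 * CAR_WIDTH_M
    return TRUCK_WIDTH_M <= free_width < TRUCK_WIDTH_M + SUT_WIDTH_M


def generate_suite(n, seed, max_attempts=100_000):
    """Draw random candidates and keep only valid, relevant ones."""
    rng = random.Random(seed)
    suite = []
    for _ in range(max_attempts):
        if len(suite) == n:
            break
        candidate = {
            "parked_per_side": rng.randint(1, 12),
            "street_length_m": rng.choice([40, 60, 80, 100]),
            "road_width_m": round(rng.uniform(6.0, 10.0), 1),
        }
        if is_valid(candidate) and is_relevant(candidate):
            suite.append(candidate)
    return suite


suite = generate_suite(n=50, seed=7)
print(len(suite), "scenarios generated; first:", suite[0])
```

Rejection sampling wastes candidates when constraints are tight, which is one reason a dedicated solver is preferable at scale. The sketch is only meant to show that every generated variation satisfies the stated constraints by construction.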

The Foretify™ Manager tool can dispatch and manage the execution of large-scale test suites and provide failure triage and coverage analytics on the results. It helps ensure that proper coverage is achieved across the defined Operational Design Domain (ODD). In this case study, Jan ran thousands of scenario variations across multiple locations on the map under various conditions. The Foretify platform created the test variations automatically, without any additional work. Video 3 shows nine examples of the many test variations, including varying parked cars, several light and weather conditions, multiple locations on the map, and various types of priority vehicles.

Video 3 – Examples from a large-scale test suite run
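Coverage analytics over such a suite can be pictured as counting which combinations of conditions were actually exercised. The Python sketch below computes a simple cross-coverage figure and lists the combinations that were never hit. The bucket names and the function are hypothetical and do not represent the Foretify Manager data model.

```python
from collections import Counter
from itertools import product

# Hypothetical coverage buckets; a real ODD definition would be far richer.
WEATHER = ["clear", "rain", "fog"]
TIME_OF_DAY = ["day", "night"]
VEHICLE = ["fire_truck", "ambulance"]


def cross_coverage(runs):
    """Return (fraction of buckets hit, list of buckets never hit)."""
    hits = Counter((r["weather"], r["time_of_day"], r["vehicle"]) for r in runs)
    buckets = list(product(WEATHER, TIME_OF_DAY, VEHICLE))
    holes = [b for b in buckets if hits[b] == 0]
    return 1 - len(holes) / len(buckets), holes


runs = [
    {"weather": "clear", "time_of_day": "day", "vehicle": "fire_truck"},
    {"weather": "rain", "time_of_day": "night", "vehicle": "fire_truck"},
    {"weather": "rain", "time_of_day": "night", "vehicle": "fire_truck"},
]
fraction, holes = cross_coverage(runs)
print(f"{fraction:.0%} of buckets covered, {len(holes)} holes remaining")
```

The list of holes is the useful part: it shows where further generation should be directed, for example by constraining the next batch toward the missing combinations.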

Additional edge cases uncovered, and subsequent fix-test cycles 

Some test variations will likely uncover additional edge cases where the AV software malfunctions. In this example, the Foretify platform uncovered another failing edge case, in which the AV drove at night in the rain with a particular arrangement of parked cars (Video 4). In this scenario, the AV did not brake in time and hit the fire truck. This edge case needs to be recreated and debugged, and when a software fix is available, the entire test suite will need to be re-executed to make sure that the fix did not introduce new issues.

Video 4 – A newly identified edge case
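Each fix-test cycle ends with a comparison of the suite results before and after the software change. A minimal Python sketch of that comparison follows. The test identifiers and the result format are assumptions made for illustration and do not reflect any Foretify report format.

```python
def triage(before, after):
    """Classify each test id by its status before and after a fix.

    `before` and `after` map a test id to True (passed) or False (failed).
    Only test ids present in both runs are compared.
    """
    report = {"fixed": [], "still_failing": [], "regressed": []}
    for test_id in sorted(before.keys() & after.keys()):
        if not before[test_id] and after[test_id]:
            report["fixed"].append(test_id)
        elif not before[test_id] and not after[test_id]:
            report["still_failing"].append(test_id)
        elif before[test_id] and not after[test_id]:
            report["regressed"].append(test_id)
    return report


# Hypothetical results keyed by the seed used to generate each variation.
before = {"seed_001": True, "seed_002": False, "seed_003": True}
after = {"seed_001": True, "seed_002": True, "seed_003": False}
print(triage(before, after))
# {'fixed': ['seed_002'], 'still_failing': [], 'regressed': ['seed_003']}
```

The "regressed" list is the reason for re-running the whole suite rather than only the failing case: it captures variations that passed before the fix and fail after it.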


Autonomous vehicles face challenges in urban settings, particularly when reacting to emergency and priority vehicles. A recent incident, in which a Waymo AV blocked a fire truck in San Francisco, was used in this case study to demonstrate how the Foretellix Safety-Driven V&V flow recreates the scenario in detail, identifies the root cause of the issue, and uses smart hyperscale automated testing to verify that a software update addresses the problem, including thorough testing of cases similar to the original.

Disclaimer – Foretellix did not have access to the Waymo data or the Waymo AV software. We used different AV software to simulate an issue similar to the one seen in the Waymo incident. Nevertheless, the methodology and tools used in this case study are applicable to any AV solution, using any AV simulator, and to any real-life scenario.
