5 ways QA will evaluate the impact of new generative AI testing tools

In a recent article about upgrading continuous testing for generative AI, I asked how code generation tools, copilots, and other generative AI capabilities would impact quality assurance (QA) and continuous testing. As generative AI accelerated coding and software development, how would code testing and quality assurance keep up with the higher velocity?

At that time, I suggested that QA engineers on devops teams should increase test coverage, automate more testing, and scale test data generation for the increased velocity of code development. I also said that readers should look for testing platforms to add generative AI capabilities.

Top software test automation platforms are now releasing those generative AI-augmented products. Examples include Katalon’s AI-powered testing, Tricentis’ AI-powered quality engineering solutions, LambdaTest’s Test Intelligence, OpenText’s UFT One’s AI-powered test automation, SmartBear’s TestComplete and VisualTest, and other AI-augmented software testing tools.

The task for devops organizations and QA engineers now is to validate how generative AI impacts testing productivity, coverage, risk mitigation, and test quality. Here’s what to expect, along with industry recommendations for evaluating generative AI’s impact on your organization.

More code requires more test automation

A McKinsey study shows developers can complete coding tasks twice as fast with generative AI, which may mean that there will be a corresponding increase in the amount of code generated. The implication is that QA engineers will have to speed up their ability to test and validate code for security vulnerabilities.

“The most significant impact generative AI will make on testing is that there is much more to test because genAI will help both create code faster and release it more frequently,” says Esko Hannula, senior vice president of product management at Copado. “Fortunately, the same applies to testing, and generative AI can create test definitions from plaintext user stories or test scenarios and translate them to executable test automation scripts.”

Product owners, business analysts, and developers must improve the quality of their agile user stories for generative AI to create effective test automation scripts. Agile teams that write user stories with sufficient acceptance criteria and links to the updated code should consider AI-generated test automation, while others may first have to improve their requirements gathering and user story writing.

Hannula shared other generative AI opportunities for agile teams to consider, including test ordering, defect reporting, and automatic healing of broken tests.

GenAI does not replace QA best practices

Devops teams use large language models (LLMs) to generate service-level objectives (SLOs), propose incident root causes, grind out documentation, and deliver other productivity boosts. But while automation may help QA engineers improve productivity and increase test coverage, it’s an open question whether generative AI can create business-meaningful test scenarios and reduce risks.

Several experts weighed in, and the consensus is that generative AI can augment QA best practices, but not replace them.

“When it comes to QA, the art is in the precision and predictability of tests, which AI, with its varying responses to identical prompts, has yet to master,” says Alex Martins, VP of strategy at Katalon. “AI offers an alluring promise of increased testing productivity, but the reality is that testers face a trade-off between spending valuable time refining LLM outputs rather than executing tests. This dichotomy between the potential and practical use of AI tools underscores the need for a balanced approach that harnesses AI assistance without forgoing human expertise.”

Copado’s Hannula adds, “Human creativity may still be better than AI figuring out what might break the system. Therefore, fully autonomous testing—although possible—may not yet be the most desired way.”

Marko Anastasov, co-founder of Semaphore CI/CD, says, “While AI can boost developer productivity, it’s not a substitute for evaluating quality. Combining automation with strong testing practices gives us confidence that AI outputs high-quality, production-ready code.”

While generative AI and test automation can aid in creating test scripts, possessing the talent and subject matter expertise to know what to test will be of even greater importance and a growing responsibility for QA engineers. As generative AI’s test generation capabilities improve, it will force QA engineers to shift left and focus on risk mitigation and testing strategies—less on coding the test scripts.

Faster feedback on code changes

As QA becomes a more strategic risk-mitigation function, where else can agile development teams seek and validate generative AI capabilities beyond productivity and test coverage? An important metric is whether generative AI can find defects and other coding issues faster, so developers can address them before they impede CI/CD pipelines or cause production issues.

“Integrated into CI/CD pipelines, generative AI ensures consistent and rapid testing, providing quick feedback on code changes,” says Dattaraj Rao, chief data scientist of Persistent Systems. “With capabilities to identify defects, analyze UI, and automate test scripts, generative AI emerges as a transformative catalyst, shaping the future of software quality assurance.”

Using generative AI for quicker feedback is an opportunity for devops teams that may not have implemented a full-stack testing strategy. For example, a team may have automated unit and API tests but limited UI-level testing and insufficient test data to find anomalies. Devops teams should validate the generative AI capabilities baked into their test automation platforms to see where they can close these gaps, providing increased test coverage and faster feedback.

“Generative AI transforms continuous testing by automating and optimizing various testing aspects, including test data, scenario and script generation, and anomaly detection,” says Kevin Miller, CTO Americas of IFS. “It enhances the speed, coverage, and accuracy of continuous testing by automating key testing processes, which allows for more thorough and efficient validation of software changes throughout the development pipeline.”

More robust test scenarios

AI can do more than increase the number of test cases and find issues faster. Teams should use generative AI to improve the effectiveness of test scenarios. AI can continuously maintain and improve testing by expanding the scope of what each test scenario is testing for and improving its accuracy.

“Generative AI revolutionizes continuous testing through adaptive learning, autonomously evolving test scenarios based on real-time application changes,” says Ritwik Batabyal, CTO and innovation officer of Mastek. “Its intelligent pattern recognition, dynamic parameter adjustments, and vulnerability discovery streamline testing, reducing manual intervention, accelerating cycles, and improving software robustness. Integration with LLMs enhances contextual understanding for nuanced test scenario creation, elevating automation accuracy and efficiency in continuous testing, marking a paradigm shift in testing capabilities.”

Developing test scenarios to support applications with natural language query interfaces, prompting capabilities, and embedded LLMs represents a QA opportunity and challenge. As these capabilities are introduced, test automations will need updating to transition from parameterized and keyword inputs to prompts, and test platforms will need to help validate the quality and accuracy of an LLM’s response.
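
As a rough illustration of that shift from keyword inputs to prompts, here is a minimal Python sketch of a prompt-driven check. Everything in it is hypothetical: `fake_assistant` stands in for whatever LLM-backed endpoint an application exposes, and the phrase-matching rules are a deliberately simple proxy for the richer quality and accuracy validation a test platform would need to provide.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PromptCase:
    """One test case: a prompt plus simple expectations about the answer."""
    prompt: str
    must_include: list[str]
    must_not_include: list[str]


def fake_assistant(prompt: str) -> str:
    """Hypothetical stand-in for the application's LLM-backed endpoint."""
    if "refund" in prompt.lower():
        return "Refunds are available within 30 days of purchase."
    return "Sorry, I can't help with that request."


def check_response(case: PromptCase, respond: Callable[[str], str]) -> list[str]:
    """Return a list of readable failures; an empty list means the case passed."""
    answer = respond(case.prompt).lower()
    failures = []
    for phrase in case.must_include:
        if phrase.lower() not in answer:
            failures.append(f"missing expected phrase: {phrase!r}")
    for phrase in case.must_not_include:
        if phrase.lower() in answer:
            failures.append(f"found forbidden phrase: {phrase!r}")
    return failures


# The same intent phrased several ways, plus an off-topic prompt.
CASES = [
    PromptCase("What is your refund policy?", ["30 days"], ["guarantee"]),
    PromptCase("How do I get a REFUND for my order?", ["30 days"], []),
    PromptCase("Tell me a joke about databases", ["can't help"], ["30 days"]),
]

if __name__ == "__main__":
    for case in CASES:
        failures = check_response(case, fake_assistant)
        print("PASS" if not failures else "FAIL", case.prompt, failures)
```

Because a real model can answer identical prompts differently, a team would likely run each case several times and track pass rates rather than rely on a single exact match.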

While testing LLMs is an emerging capability, having accurate data to increase the scope and accuracy of test scenarios is today’s challenge and a prerequisite to validating natural language user interfaces.

“While generative AI offers advancements such as autonomous test case generation, dynamic script adaptation, and enhanced bug detection, successful implementation depends on companies ensuring their data is clean and optimized,” says Heather Sundheim, managing director of solutions engineering at SADA. “The adoption of generative AI in testing necessitates addressing data quality considerations to fully leverage the benefits of this emerging trend.”

Devops teams should consider expanding their test data with synthetic data, especially when expanding the scope of testing forms and workflows toward testing natural language interfaces and prompts.
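
One way to picture this is a small, seeded generator that produces both form-style records and prompt-style inputs from the same synthetic data. The sketch below is an assumption-laden example, not a recommended tool: the field names, plans, and prompt templates are invented for illustration, and it uses only Python’s standard library.

```python
import random
import string


def synthetic_customers(count: int, seed: int = 42) -> list[dict]:
    """Generate reproducible fake customer records (all fields are invented)."""
    rng = random.Random(seed)  # a fixed seed keeps test runs repeatable
    first_names = ["Ana", "Bo", "Chen", "Dee", "Eli"]
    plans = ["free", "pro", "enterprise"]
    records = []
    for i in range(count):
        name = rng.choice(first_names)
        suffix = "".join(rng.choices(string.ascii_lowercase, k=5))
        records.append({
            "id": i + 1,
            "name": name,
            "email": f"{name.lower()}.{suffix}@example.test",
            "plan": rng.choice(plans),
            "monthly_spend": round(rng.uniform(0, 500), 2),
        })
    return records


def synthetic_prompts(records: list[dict]) -> list[str]:
    """Turn the same records into natural language test inputs."""
    template = "Why was customer {id} on the {plan} plan charged {monthly_spend}?"
    return [template.format(**record) for record in records]


if __name__ == "__main__":
    customers = synthetic_customers(3)
    for customer, prompt in zip(customers, synthetic_prompts(customers)):
        print(customer["email"], "|", prompt)
```

Deriving the prompts from the same records as the form data means a test can compare an assistant’s answer against known values.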

GenAI will continue to evolve rapidly

Devops teams experimenting with generative AI tools by embedding natural language interfaces in applications, generating code, or automating test generation should recognize that AI capabilities will evolve significantly. Where possible, devops teams should consider creating abstraction layers in their interfaces between applications and platforms with generative AI tools.

“The pace of change in the industry is dizzying, and the one thing we can guarantee is that the best tools today won’t still be the best tools next year,” says Jonathan Nolen, SVP of engineering at LaunchDarkly. “Teams can future-proof their strategy by making sure that it’s easy to swap out models, prompts, and measures without having to rewrite your software completely.”
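
To make the abstraction-layer idea concrete, here is a minimal Python sketch in which the calling code depends only on a small interface, so the model and the prompt can be swapped without touching the rest of the software. The names `TextModel`, `EchoModel`, `UppercaseModel`, and `ScenarioSuggester` are hypothetical, and the two model classes are trivial stand-ins rather than real vendor clients.

```python
from typing import Protocol


class TextModel(Protocol):
    """The only contract the application relies on (hypothetical interface)."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in for one provider; a real adapter would call a vendor SDK here."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


class UppercaseModel:
    """Stand-in for a second provider with different behavior."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


class ScenarioSuggester:
    """Application code: knows the interface, not the vendor."""

    def __init__(self, model: TextModel, prompt_template: str) -> None:
        self.model = model
        self.prompt_template = prompt_template

    def suggest(self, user_story: str) -> str:
        prompt = self.prompt_template.format(story=user_story)
        return self.model.complete(prompt)


if __name__ == "__main__":
    template = "List edge cases for: {story}"
    for model in (EchoModel(), UppercaseModel()):
        # Swapping the model is a one-argument change; the prompt is also injected.
        print(ScenarioSuggester(model, template).suggest("password reset"))
```

The same injection pattern could extend to the measures Nolen mentions, for example by passing in a scoring function alongside the model and prompt.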

We can also expect that test automation platforms and static code analysis tools will improve their capabilities to test AI-generated code.

Sami Ghoche, CTO and co-founder of Forethought, says, “The impact of generative AI on continuous and automated testing is profound and multifaceted, particularly in testing and evaluating code created by copilots and code generators, and testing embeddings and other work developing LLMs.”

Generative AI is creating hype, excitement, and impactful business results. The need now is for QA to validate capabilities, reduce risks, and ensure technology changes operate within defined quality standards.
