Exploring the Strategic Business Impact of Common Issues Complicating Value Delivered by Software Test Engineering
Photo by Kai Pilger on Unsplash
Within any field of Engineering, the value of activities undertaken within that field is generally assessed in terms of the value of the solutions those activities produce. Software Engineering is no different: no matter which scorecard ultimately gets used to assess the value of activities Software Engineers undertake, the scorecard almost always reflects the impact those activities have (or are expected to have) on producing software solutions that are highly functional, highly performant, and highly usable.
Software Test Engineering is no different, either: it is also responsible for producing solutions (here related to testing) that deliver value to the software development organizations that depend on them. If the solutions work as expected, the organization can expect to benefit from a well-functioning feedback loop that delivers clear and reliable visibility into the current functional state of work product, and does so in a way that is both serviceable and efficient. Coverage should be robust and expansive, and it should account for breakdowns both in the ability of the solution to function as composed/designed and in the ability of the solution to deliver the expected value.
The value testing solutions provide when they work as expected is generally of strategic benefit to the organizations that depend on them.
In terms of its ability to provide a well-functioning feedback loop, the effectiveness of testing is generally assessed in terms of its serviceability and efficiency. Among other things, automated testing solutions should be reliable, efficient, easy to troubleshoot, extensible, and portable. Manual testing should be efficient in the way it takes advantage of the opportunities it is granted to evaluate any unknowns left after automated testing has completed. Testing strategy should consider both what gets tested and how it gets tested in a manner that maximizes appropriately-targeted coverage at low cost. Test automation frameworks should provide sufficient reach to maximize coverage reliably, should make failures easy to troubleshoot, and should themselves be open to automated testing (i.e. it should be possible to automate testing of the framework itself).
For various reasons, though, solutions produced by Software Test Engineering become problematic. Rather than serving as a reliable source of timely and actionable feedback, they sometimes serve to obstruct strategic initiatives, create operational problems with impact across the organization, lower their own ROI, and present unwelcome trade-offs that the organization needs to either resolve, mitigate, or circumnavigate somehow. In this way, a source of strategic benefit becomes a source of strategic problems for the organization that might otherwise depend on solid testing.
This post will outline what those issues generally look like and the common ways they can be expected to have a negative strategic or operational impact on the software development organizations that depend on testing.
Obstructions to Strategic Initiatives
Nearly all software meant for use is developed to realize some strategic benefit. That is to say: regardless of whether the envisioned benefits involve supporting a specific set of use cases, making a specific set of operations more efficient, or targeting or expanding within a specific market (or market segment), there is a strategic aim that informs why software meant for use is developed. What brings development (and the application of the resources involved) into clear perspective is the ability to take advantage of those benefits, as envisioned.
Generally, the ambition that drives investing in software development (whether for-profit or not) is the same ambition that drives any investment: to create and capitalize (to whatever degree) on an opportunity where the projected benefits of success outweigh the projected risks.
When evaluating the return on investment for software development, strategic advantage forms one side of the coin; strategic risk forms the other. And just as with any activity (scuba diving, riding motorcycles, rock climbing, or even crossing a busy street) where engagement supposes consent to a certain level of risk, releasing software also supposes consent to the risks involved. One cannot hope to effectively manage risk without first being aware of it. Ultimately, then, making oneself aware of possible risks supposes not just consent in general but informed consent (which is itself a strategic advantage), as well as identifying tangible opportunities for risk to affect organizational resources.
In the same way it pays to inspect one's rig setup before every dive, to run T-CLOCS before every ride, and to check knots, harness, and equipment wear before every climb (or even to look both ways before crossing the street), it pays to test software.
If, for example, a software publisher were to release new functionality that had not been tested at all, the publisher would then consent (whether knowingly or not) to any risk resulting from the release. This includes the risk of support requests, negative reviews, loss of confidence in the publisher's brand (and thereby increased difficulty selling product), and the possible subsequent loss of revenue (whether through lost sales or refund requests) that might follow. If the same publisher were to test the new functionality before shipping, the publisher would have an opportunity to inform its consent to whatever subset of those same risks had not somehow been mitigated prior to release.
So if an organization means to safeguard organizational resources from risk that the organization will ultimately consent to (informed or not) upon release, the best option to address that risk involves becoming aware of it early. There generally isn't a better way to do this than clear visibility into current functional state.
If an organization is not able to test in a manner that is both serviceable and efficient, it will be limited in its ability to assess risk for completed work product. This includes not having sufficient tooling to cover the system. It includes not having existing coverage around the system under test, or not having confidence that existing Test Engineering resources will be able to implement it. It includes the lack of a sufficient, efficient, or reliable testing strategy (i.e. a strategy for determining how to exercise the system under test and how to gather test results or other outputs to be evaluated).
Leadership that is risk averse (or otherwise intolerant of a potential mess) will frequently deprioritize high-risk development (however tantalizing the strategic benefits) when insight into the organizational risk present in the current functional state of work product is incomplete, rather than risk any surprises that may follow a release backed by insufficient evidence. Sometimes an organization will even forego opportunities for development that would otherwise reduce operational costs (including hosting costs) if testing solutions are insufficient to provide insight into organizational risk both serviceably and efficiently.
Challenges to Return on Investment for Testing Efforts
Return on investment for testing solutions involves comparing the value the serviceability and efficiency of those solutions present against the resources required to design/compose, execute, and maintain them. For example, if it takes ten minutes to automate a test and 30 seconds to run it in CI, then net positive return on investment starts after twenty successful runs (pass or fail) of that test specification in CI. If the same specification takes hours to troubleshoot in case of failure, that cost also factors into the ROI every time the test fails. ROI should also be adjusted if, for example, the same specification errors out mysteriously at least once every 5-8 runs.
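To make the arithmetic explicit, here is a minimal back-of-the-envelope sketch of that break-even calculation. The authoring and run costs mirror the example above; the assumption that each automated run replaces about one minute of manual verification, along with the troubleshooting cost and flake rate, are hypothetical figures added purely for illustration.

```python
# Back-of-the-envelope model of automated-test ROI, measured in minutes.
# All figures are illustrative assumptions, not measurements from a real pipeline.

AUTHORING_COST = 10.0        # minutes to automate the test once (from the example above)
RUN_COST = 0.5               # minutes of CI time per run (30 seconds, from the example above)
MANUAL_CHECK_COST = 1.0      # minutes of manual verification each run replaces (assumed)
TROUBLESHOOTING_COST = 60.0  # minutes to troubleshoot one mysterious failure (assumed)


def break_even_runs(flake_rate: float, max_runs: int = 1000):
    """Return the first run count at which cumulative savings cover cumulative cost."""
    for runs in range(1, max_runs + 1):
        savings = runs * (MANUAL_CHECK_COST - RUN_COST)
        cost = AUTHORING_COST + runs * flake_rate * TROUBLESHOOTING_COST
        if savings >= cost:
            return runs
    return None  # never breaks even within max_runs


for label, rate in [("stable test", 0.0), ("flaky test", 1 / 6)]:
    runs = break_even_runs(flake_rate=rate)
    print(f"{label}: breaks even after {runs} runs" if runs else f"{label}: never breaks even")
```

Under these assumptions the stable specification pays for itself after twenty runs, while the one that errors out every handful of runs never does; the point is simply that troubleshooting time and flake rate belong in the same ledger as authoring and execution time.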
When testing is not both serviceable and efficient, it can generally be expected to have an impact on other parts of the organization's operations. While these issues affect the ability to test (and the ability of testing to deliver the expected value), their influence can generally be expected to extend beyond testing itself. For example, a high prevalence of escaped product issues affects operations in Customer Experience, Marketing, and Sales, in addition to any impact it may have on Development efforts. Automated tests with a history of failing intermittently will likely delay the release of new functionality (whether from development or to GA, either of which might affect Marketing schedules) and prompt organizations to wager whether it might be more valuable to consent to release with limited awareness of potential issues, or with none at all (which, again, might ultimately affect any of the above).
As if that were not enough, the technical debt associated with certain issues affecting the serviceability and efficiency of testing solutions has a tendency to accumulate mass until it starts affecting the trajectories of objects and processes generally beyond its reach. Imagine, for example, a test automation framework where a subset of specifications fail (a different subset each run) due to issues within the framework itself, and where the framework is open to neither testing nor troubleshooting. In another example, imagine unwieldy suites of indistinguishable manual test cases that seek to exercise the system under test by way of duplicated workflows with minor variations on each other. With issues like these, the greater the volume of tests involved, the more effort it will likely take to move in any one direction.
Eventually the gravity this sort of technical debt produces begins to warp the integrity of efforts and processes around it. Here, what once shone a light on at least part of its corner of the universe collapses into what effectively serves as a black hole for operations resources. Until the root cause is resolved meaningfully, the problem will continue to devour time and talent invested either in attempting to mitigate the problem or in attempting to circumnavigate it. Eventually nothing escapes its draw (except for the thermal radiation released in data centers where the applicable tests are run in CI), possibly not even usable feedback related to current functional state.
Situations like these generally beget questions about why, if testing can be seen as imposing risk on the organization's ability to go to market, an organization might invest in Test Engineering at all. Why invest in creating the sorts of problems that then also require significant effort to solve? For that matter, why throw good money after bad at efforts that might ultimately be consumed by what otherwise seems like an infinite source of gravity? Why innovate usable custom testing solutions if off-the-shelf solutions exist that (despite their limitations) could potentially get the organization at least part of the way there now?
And an apparent favorite: if an organization cannot draw a straight line from Test Engineering's benefits to profitability (or even revenue), why invest in it sufficiently that it can deliver the expected value as an engineering practice?
When solutions produced by Test Engineering do not work as expected, the same concerns related to informed consent apply to the impact Test Engineering can be expected to have on the organization that depends on it. Despite any strategic advantage, issues affecting ROI also become a strategic disadvantage. If testing solutions are themselves seen as posing a risk (either to the ROI of testing itself or to the greater organization), those responsible (or with an interest in being granted responsibility) for managing that risk might also start asking how much consent any organization that depends on the testing should be willing to show for tests producing low ROI.
Consequences of Unwelcome Trade-Offs
If a software development organization commits to investing in Test Engineering, the investment itself needs to come from somewhere and needs to be applied somewhere. The organization needs to source bandwidth and expertise from somewhere. It also needs (somehow) to understand how a decision made today can be expected to affect serviceability and efficiency not just today but as far into the future as is foreseeable.
One thing that is definitively constant about computing is that it is prone to change. And like nearly any investment in Engineering, when these investments pay off, they pay dividends for the investor now and for as long as the solutions remain viable, adaptable, readable, open to troubleshooting, or even portable. And like any investment, when misapplied, investment in Software Test Engineering may be limited in some respects in its ability to pay off. Testing, for example, that provides visibility into the functional state of work product today may or may not be viable in five years' time, or even a year's time.
And as results show time and again, investing with the long view in mind generally (and perennially) pays greater dividends than investing focused solely on the short view.
Meanwhile, there may or may not be available bandwidth within the organization. There may or may not be suitable talent (or even adaptable talent) available within the candidate pool. There may or may not be suitable tooling available off the shelf that can help make automated testing serviceable and efficient in the long run. There may or may not be budget (or appetite within the organization) immediately available to hire and retain suitable talent. Either way, the work still needs to get done; no reason to make perfect the enemy of the good. The bandwidth to get the work done will need to be sourced from somewhere.
Evidence of an issue resulting from trade-offs made in the past generally surfaces at some point later on.
- Perhaps a framework developed by software engineering generalists a few years ago isn't extensible or updatable (or it makes failures a challenge to troubleshoot) today because of design decisions that emphasized ease of development over any of those goals. Maybe the only thing consistent about tests run with these frameworks is that the tests flake; that is, they fail intermittently (a minimal sketch of how flake rate might be measured follows this list).
- Perhaps automated tests (or libraries responsible for supporting tests) written by production software engineers don't exercise the system under test either very efficiently or very expansively. The same goes for manual tests developed and executed by resources like Support Engineers and Education Specialists: in each case maybe they cover a majority of use cases (mainly within the realm of immediate concern of the resource that wrote the test), but coverage is not systematized much beyond that. Maybe the implemented testing strategy assumes e2e is the only level worth testing at (because, for example, it's closest to what consumers can be expected to experience).
- Perhaps an off-the-shelf test automation tool (like a tool facilitating UI or API test automation) that some time ago seemed like a reasonable way to implement coverage without reinventing the wheel now shows its limitations when it comes to workflows and testing scenarios that have surfaced since but which don't fit into the narrow scope of testing that the tool's developers (and marketing apparatus) envisioned would excite prospects enough to encourage adoption. Maybe at the time of adoption the solution was a very popular disruptor among influencers, but now the developers are charging for (or otherwise limiting access to) the solution.
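To give the idea of flakiness from the first item above a slightly more operational shape, a rough sketch like the following can distinguish tests that fail consistently from tests that fail intermittently. The run history and helper here are entirely hypothetical; real CI platforms expose run outcomes through their own APIs or report formats, so treat this as a sketch of the bookkeeping rather than a working integration.

```python
from collections import defaultdict

# Hypothetical run history: (test name, outcome) pairs gathered from CI.
# In practice this might come from a CI system's API or stored JUnit-style reports.
run_history = [
    ("checkout_flow", "pass"), ("checkout_flow", "fail"), ("checkout_flow", "pass"),
    ("login_flow", "fail"), ("login_flow", "fail"), ("login_flow", "fail"),
    ("search_flow", "pass"), ("search_flow", "pass"), ("search_flow", "pass"),
]


def flake_report(history):
    """Classify each test as stable, consistently failing, or flaky based on its failure rate."""
    outcomes = defaultdict(list)
    for name, outcome in history:
        outcomes[name].append(outcome)

    report = {}
    for name, results in outcomes.items():
        failure_rate = results.count("fail") / len(results)
        if failure_rate == 0:
            report[name] = "stable"
        elif failure_rate == 1:
            report[name] = "consistently failing (points at the product or the test itself)"
        else:
            report[name] = f"flaky ({failure_rate:.0%} of runs fail)"
    return report


for name, verdict in flake_report(run_history).items():
    print(f"{name}: {verdict}")
```

A consistently failing test at least points somewhere; an intermittently failing one mainly consumes troubleshooting time, which is exactly the ROI adjustment described earlier.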
The same way informed consent works for software development organizations (both within and outside of Development), it also works for Test Engineering. If Test Engineering is allowed (better: encouraged) to project and plan as far into the future as it is possible to see, the solutions it produces can be expected to accommodate that projection and planning. Meanwhile, if the only context it is given to work with is immediate, that's as far as it can be expected to plan. Worse: if the organization plans on behalf of Test Engineering, it might find itself constrained by the implications of that planning, whether or not those who made the plans were initially aware of the resulting trade-offs.
Conclusion
When solutions produced within Test Engineering deliver the expected value, organizations that depend on them can develop (and release) with confidence. Generally, testing functions (and delivers value) serviceably and efficiently enough that it is not left prone to questions about the value of investing in it. And when testing proactively identifies and acts on opportunities for failure over the long term, it feeds back into both of the above in a manner that can generally be expected to make Test Engineering a greater benefit at lower cost over time.
Although confidence is an important word to bear in mind here, confidence as a feeling is not the only benefit. In essence, the key benefits of well-functioning testing are strategic: they inform (at however arguably low a volume, and whatever the type of organization) business intelligence related to the impact the current functional state may be expected to have, whether on functionality itself or on the business interests related to it. This isn't just a feeling; it presents a clear opportunity to be a better steward of organizational resources, with impact on competitiveness both within the market and operationally. In short: much the same way military intelligence makes for better military decision-making, one of the benefits of Software Test Engineering is that it stands a chance of making organizations more effective (and therefore likely also more efficient) by enabling them to make better strategic decisions based on clear feedback.
Despite this, software development organizations continue to experience issues like the ones listed here. When they do, this is most often what they look like: obstructions to strategic initiatives, challenges to return on investment, and the consequences of unwelcome trade-offs.
As with any field of Engineering, and just as with Software Engineering, the value Software Test Engineering delivers is assessed in terms of the ability of the solutions it produces to deliver the expected value. With Test Engineering, the value expected from the testing solutions it produces lies in their serviceability and efficiency. And as the descriptions above hopefully make clear, this isn't an either/or: the same way intelligence should be both timely and actionable, testing solutions should be both serviceable and efficient.
And with the right talent (hopefully the right expertise, experience, and skills), an organization that invests in (and commits to following through on) realizing the benefits of high-value testing should be able to enjoy those benefits now and into the foreseeable future. With this, hopefully it should be possible to transform what might otherwise have become strategic problems into a strategic advantage.