Product Security: Measurement, Complexity, and the Near Future
Prevented > Auto-found > Human-found > Externally-found > Unfound > Exploited
In product security (not just at my company but also generally as an industry) we’re in the business of finding, fixing, and preventing vulnerabilities that have security and/or privacy implications. We aim to “Shift Left™”, which refers to shifting away from reactive and towards preventative on the time scale above. I also like to think in terms of “shifting up” — while shifting left towards prevention we should also optimize for scalability and automation. Then rare and valuable manual security resources can be focused where scale and automation aren’t possible.
Building secure-by-default components that address certain vulnerability classes, mitigating risks identified during your threat modeling exercises, or eliminating attack surface outright are all examples from the prevention end of the spectrum above. Auto-found examples might include scalable, efficient (high signal-to-noise) program analysis capabilities. Human-found would be vulnerabilities found via manual code review. Externally-found includes vulnerabilities that arrive via bug bounties to be fixed. Exploited of course refers to vulnerabilities that escape those efforts, are shipped into products, found, and utilized — whether they are N-days or 0-days. Unfound is where the majority of all vulnerabilities lie (“Schrödinger’s Vulns”, perhaps).
Measurement and Impact
At my employer, we “focus on impact”. We also try to be as data driven as humanly possible, concretely measuring via various methods. In fact, this is generally the goal at all the security shops I’ve worked at, with, or around. The security world, though, occupies an interesting spot in the Venn diagram overlap of those two goals.
It’s extremely easy to measure how many bugs are found by a given activity, review, or tool. But the value isn’t captured unless those bugs are fixed. Of course they must be found before they can be fixed (at least intentionally), but finding is not the end goal.
Bugs_intentionally_fixed / bugs_found is also measurable, of course. But how about bugs fixed correctly? That can be harder to validate and thus harder to measure. Why call out fixed correctly? Because incomplete or incorrect patches of vulnerabilities lead to new or modified vulnerabilities that are actively exploited in the wild. In fact, Project Zero’s year-in-review found that: "25% of the 0-days detected in 2020 are closely related to previously publicly disclosed vulnerabilities. In other words, 1 out of every 4 detected 0-day exploits could potentially have been avoided if a more thorough investigation and patching effort were explored." That 25% figure struck me as surprisingly high when I read it, even though the underlying phenomenon is nothing new to the field or to me.
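To make the contrast concrete: the easy ratio is trivial to compute once bugs are tracked with found/fixed states, while “fixed correctly” requires a validation step that most trackers don’t capture. A minimal sketch, using hypothetical bug records and field names:

```python
# Hypothetical bug-tracker records; the fields here are illustrative only.
# "validated" stands in for the hard-to-measure "fixed correctly" signal.
bugs = [
    {"id": 1, "fixed": True,  "validated": True},
    {"id": 2, "fixed": False, "validated": False},
    {"id": 3, "fixed": True,  "validated": False},  # patched, but fix never verified
    {"id": 4, "fixed": True,  "validated": True},
]

found = len(bugs)
fixed = sum(1 for b in bugs if b["fixed"])
validated = sum(1 for b in bugs if b["validated"])

fix_rate = fixed / found            # the easy metric: 3/4 = 0.75
validated_rate = validated / found  # the metric that matters: 2/4 = 0.50

print(f"fixed/found = {fix_rate:.2f}, validated/found = {validated_rate:.2f}")
```

The gap between the two ratios is exactly the “last mile” risk discussed next: bugs that count as fixed on a dashboard but were never confirmed to be fixed correctly.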
Imagine spending all the time and effort in your security program to find a vulnerability, triage and analyze it, determine it to be valid, work with dev teams to create a patch, and roll out a hotfix to all fielded products via over-the-air update… only to have it exploited later because said patch was incomplete or incorrect (and potentially showed people where to look for exploitable vulnerabilities). There are likely patch/fix validation investments that could be made in many security orgs to solve this “last mile” problem for critical vulnerabilities or critical product risk areas, in order to avoid becoming one of these 0-days in the future.
At the far left and most impactful end of the curve is prevention. It’s also one of the more interesting paradoxes in terms of quantifying. How or what do you measure when successful prevention work means there are no data points to measure?
It’s always important to remember: not everything that is measurable is impactful and not everything that is impactful is easily measured. We should absolutely measure what we can. But if some work is impactful and not easily measurable, we should still recognize the value of that work. One of our core values is “Focus on Impact” after all — sometimes externally summarized as “don’t mistake activity for achievement”. We should be careful not to mix up the two and not over-index on less meaningful measurements. Worshipping false metrics is worse than worshipping false idols**.
If you architect a design that prevents a class of vulnerabilities, or build a secure-by-default component that does the same, how many found or fixed vulnerabilities is that equivalent to? Is it even meaningful to try to draw equivalences? Preventing incidents or breaches is more impactful and desirable than detecting and responding to them, even though doing the latter in a timely manner is strictly necessary (table stakes, if you will). But how many incidents did you realistically prevent? Min/max of 0/∞? I can easily know how many I worked incident response for (the right side of the shift-left spectrum), but I can’t quantify my prevention work. Can it be quantified in a meaningful way? Some parts of industry have experimented with forecasting, but I haven’t seen anything concrete enough come of it to rely on in this context.
In any case, perhaps we can all agree that qualitative assessment of the impactful prevention work is still worthwhile. Shift left.
** this claim is disputed
The Whole Iceberg
A maxim sometimes delivered is “anyone who thinks a large company is just one company has never worked for a large company.” I call that bet and raise to: in the largest of companies (a FAANG, if you will), “anyone who thinks $large_FAANG_org can be approximated as just one large company has never worked in $large_FAANG_org.” It’s a bit of a stretch in terms of poetic aesthetic, but I think it gets the point across. Many of our colleagues understand that certain companies, and certain business units within them, work on new and interesting things, but don’t quite grasp the gravity of it as a whole.
Lots of talented people are out there doing great work in their particular area from their particular worldview. Product security focusing on silicon and processing hardware. Security assurance of low level software up to and including the multiple stages of bootloaders. Popular and complex mobile apps. Drivers. Cloud services. The list goes on. Experts who focus on a particular security activity or a particular part of the tech stack.
People are generally most familiar with Apple, so I tend to use them as the vertical example. The product work performed in house covers the entire tech stack — custom silicon, integrating vendor silicon, hardware, electrical, and wireless engineering at the board level, device design, and so on; from managing and modifying branches of existing OSes, to custom OSes built from scratch, to a large number of on-device applications per platform, to web endpoints and services (the list goes on and on).
The product work performed in house also covers product lifecycles from cradle to grave — including manufacturing and factory processes, iterating on hardware through various engineering gates and phases, over-the-air updates per device model, and a slew of other areas.
What many people I speak to don’t realize is that while Apple is widely recognized for this complexity and diversity of work up the entire verticality of the stack, they are not the only ones doing so.
Apple isn’t the only Apple with the complexity and verticality of Apple out there. Furthermore, the number of companies with orgs or business units that are checking these same boxes is growing. Now imagine again your particular piece of that tip of the iceberg (whether it was a mobile app, a cloud service, or a particular layer of the tech stack) and how much work there is just within that scope you know so well. Now try again to imagine the entire iceberg with the aforementioned vertical complexity and full life cycle considerations. Whether your niche is static or dynamic analysis, security architecture, SDLC, offensive security, or whatever else it may be — mentally extend that work up and down the vertical tech stack and across the product lifecycle. The need for security and privacy assurance capabilities and programs is almost certainly going to continue to grow exponentially.
Preventing Today vs. Preventing Tomorrow
So the more you think about this burgeoning field and the security work it will require, you might start asking — prevent… what, exactly?
Will we need to be prepared to prevent the usual classes of vulnerabilities like memory corruption, XSS, SQLi, weak crypto, etc? Definitely, yes.
Will we need to be prepared to prevent entirely new, yet-to-be-fully-discovered classes of vulnerabilities that are only possible because of these new products and capabilities? Definitely, yes.
If we the industry want people to widely adopt entirely new classes of products then they need to be trustworthy. And entirely new classes of devices and capabilities breed entirely new threat models. To build trustworthy products in the future, we have to start thinking about these new classes of vulnerabilities and risks today. Note that I’m advocating for early research and exploration, not insinuating that no one should even begin exploring these concepts or building first versions of them until any and all potential issues are prevented. This is impossible if you want to ever actually ship anything. Nothing is perfectly secure. But I digress. Let’s consider just a few examples.
The mere existence of augmented reality glasses introduces new threats with different consequences. Autonomous vehicles have a lot of overlap here (e.g. if an exploited vulnerability can result in the driver or vehicle not seeing or recognizing a person in its path, or a stop sign). Mixed reality also means new assets for your threat model that have security or privacy implications, such as 3D point clouds from scanning your AR or VR environment. Some of these new devices have new physical sensors like camera and microphone arrays, depth sensors, etc., which create new attack vectors — in the human-inaudible or human-invisible spectrums, for example. Avatars, deep fakes, and AI use cases also present new potential threats. There are also opportunities to build, for instance, more usable and secure authentication mechanisms when given an entirely new form factor and use case.
One way to start preparing is via open and shared research in academia via RFPs or Sponsored Research Agreements. Some of us might not have the resources to devote entire teams to studying these problems ourselves, but there exists an entire research sector that can help us. Funding is not the only way to help, either. We can bridge the gap between academia and industry by advising (helping teams ensure their research is impactful and relevant to industry is usually of great value to them) or by offering hardware or platforms for them to conduct research on. This research could not only develop defensive capabilities by feeding back into product designs and threat models, but also grow our offensive capabilities by informing future Red Teaming capabilities and operations.
I also tend to think this is the best way to do so, as it benefits the entire industry, is performed by neutral but capable third parties, and will ultimately help keep all users safe — whether they are using our AR/VR/smart-device products or someone else’s. For one of us to succeed, this new industry as a whole must succeed. I believe this area of research will ultimately begin to grow organically, but we would all benefit greatly if other companies joined us in helping accelerate it.
Note: This article was stitched together from internal workplace note(s) with parts redacted or removed, other “napkin” notes, and an aging memory. Hopefully it is of interest to folks, or at least sparks some discussion. This aims to be the first of many freeform notes on practicing or researching Security, Privacy, and the AR/VR/Smart Devices space.