AI & Cyber from a Detection Engineer’s Perspective: Explainability Matters and Context is King
Beyond the Code: Rethinking Cybersecurity and Detection in the Modern Era
Hello Cyber Builders!
This week is a new series on AI, Cyber, and Explainability. The series results from a passionate conversation and collaboration with Max Heinemeyer, Chief Product Officer at Darktrace. I thank him warmly for his contribution to the Cyber Builders community.
Just so you know, I am not affiliated with, nor do I have any interest in, the company. This blog post is not sponsored. As I described earlier, Cyber Builders is not just a weekly publication; it's a platform for collaboration in the cybersecurity industry (see Cyber Builders: A Collaborative Approach to Cyber Security).
In this series, we'll embark on a journey to demystify AI and cyber detection, from the fundamentals of detection engineering to the complex interplay of machine learning algorithms in threat identification.
The new series will focus on a critical topic related to cybersecurity: how to understand and, consequently, trust a cybersecurity detection system. I will explore detection engineers’ needs, the definition of alert and anomaly, and various machine learning systems.
Buckle Up!
Why start a new detection, cybersecurity, AI, and explainability series?
I feel these are misunderstood concepts, and many people who come to the security field are still looking for silver bullets: a solution that detects only malicious activity, never raises a false alarm, and requires little time to operate!
There is also tension between detection engineers who operate products from many vendors and vendors’ teams, such as product and engineering teams. Both parties aim for easy-to-use, understandable, and low-maintenance solutions, offering maximum value with minimal configuration effort. Meanwhile, vendors also focus on creating products with specific features to stand out in the market, emphasizing new detections of new threats.
Understanding both perspectives is crucial, especially considering concepts like false positives, signature-based and AI-driven detections, and the distinction between alarms and anomalies. This makes for an excellent topic for a new series, starting with the viewpoint of detection engineers.
Introduction to the Life of Detection Engineers
In recent years, the Detection Engineer's role has become increasingly critical within security operations teams and SOCs. These professionals are essential in developing and refining detection systems to combat sophisticated cyber threats effectively. However, their role is fraught with unique challenges.
Grasping the Underlying Mechanics: The Car Analogy
Imagine driving a car. As a driver, you have a basic understanding of the car's subsystems: fuel combustion generates energy, which then moves the motor and, subsequently, the wheels. You know how the steering wheel, brakes, and transmission contribute to your journey from point A to point B. This understanding, albeit not expert-level, gives you the confidence to drive the car effectively.
Similarly, Detection Engineers need a comparable understanding of their detection systems. They must grasp how various components - data input and algorithmic processing - work together to detect cyber threats. This comprehension is vital, especially as detection systems evolve from rule-based to more complex machine-learning models.
Deciphering Vendor-Specific Detection Logic
A significant challenge for Detection Engineers is understanding the proprietary detection logic provided by vendors. Often, cybersecurity solutions are akin to a 'black box', where the inner workings are not fully disclosed. This lack of transparency can hinder engineers from fully comprehending how threats are identified and addressed.
Navigating Multiple Data and Tool-Specific Knowledge
A Detection Engineer needs to think “cross-data” to have a holistic view of how various tools will see a threat. As organizations deploy several tools in a defense-in-depth strategy on endpoints, networks, and cloud infrastructure, coherency between all these tools is essential.
Detection engineers must become proficient in the specific languages and mechanisms of various tools and understand the underlying logic. Mastery of diverse tools - ranging from Endpoint Detection and Response (EDR) to Network Detection and Response (NDR) systems and cloud-based platforms - is crucial. Each tool has its distinct detection method, requiring engineers to adapt and think “cross-data” for comprehensive threat analysis.
It is hard to be proficient across such a variety of technologies, and teams will need time to climb the learning curve. Management must prioritize this and secure a budget for training. Eventually, it will benefit the organization by improving understanding of all the detections it relies on.
Rule vs. Machine Learning-Based Detections: A New Complexity
In addition, the transition from traditional rule- and signature-based solutions to machine learning (ML) based detections has brought new layers of complexity. Rule-based systems operate straightforwardly: a rule is triggered if a specific data pattern is detected. ML-based detection systems, however, necessitate a deeper understanding.
The challenge now lies in unraveling the 'why' and 'how' behind the outputs of ML classifiers, much as a driver once needed to understand the nuanced workings of a car's engine.
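To make the contrast concrete, here is a minimal sketch in Python. The signature, function names, and thresholds are all hypothetical illustrations, not any vendor's actual logic: the rule fires deterministically on a known pattern, while the ML-style score produces a number that still needs interpretation.

```python
# Illustrative sketch (not any real product's logic): a rule-based
# detection vs. a toy ML-style anomaly score.

RULE_SIGNATURE = "mimikatz"  # hypothetical static indicator


def rule_based_detect(process_cmdline: str) -> bool:
    """Deterministic: fires if and only if the known pattern is present."""
    return RULE_SIGNATURE in process_cmdline.lower()


def ml_style_score(bytes_out_last_hour: float, baseline_mean: float,
                   baseline_std: float) -> float:
    """Toy anomaly score: how many standard deviations the observed
    outbound volume sits above a learned baseline. Real ML classifiers
    are far more complex, which is exactly why explainability matters."""
    if baseline_std == 0:
        return 0.0
    return (bytes_out_last_hour - baseline_mean) / baseline_std


# The rule answers yes/no for a known pattern...
print(rule_based_detect("powershell -enc ... mimikatz ..."))  # True

# ...while the score requires judgment: is 4.2 sigma anomalous enough
# to alert, and why did the model think so?
print(round(ml_style_score(5_000_000, 1_000_000, 950_000), 1))  # 4.2
```

The rule's verdict is self-explanatory; the score is not, and that gap is what the rest of this series explores.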
Refined Definitions in Detection Engineering
This panorama of the difficulties Detection Engineers face calls for a rethink of critical detection concepts. In threat detection, understanding key terms is crucial for comprehending the daily concerns of Detection Engineers. These definitions take on even more significance as AI and ML anomaly detection becomes a mainstream approach.
Detection / Alert: The Breadth of Detection Logic
A detection, or alert, in cybersecurity refers to any event triggered by a detection logic system. These systems can employ static indicators, like rules or signatures, or more dynamic indicators derived from machine learning classifiers. The essence of detection lies in its ability to identify potential threats through both deterministic and evolving patterns.
Anomalies are all about changes.
An anomaly is a deviation or unusual occurrence within a system or network that strays from average or typical behavior. This definition, however, must go beyond mere deviation from the norm; it must delve into the relevance and context of such deviations.
Anomalies are particularly noteworthy in the context of organizational cybersecurity policies and practices. For example, an unusual external data transfer to a service like Dropbox may be flagged as abnormal. This could represent a significant security concern in one organization where Dropbox is not a vetted SaaS service, making it a true positive in their anomaly detection framework. In another organization, where Dropbox is used routinely for non-sensitive data transfers, the same event might be considered benign, illustrating a context-dependent interpretation of what constitutes an anomaly.
This expanded understanding of anomalies reflects the increasing sophistication and contextual awareness of modern cybersecurity systems. Anomalies are not mere signatures in data patterns; they are events that must be interpreted within the broader context of an organization's specific security policies, data sensitivities, and operational practices.
After discussing this topic with Max, and remembering my endless conversations with SOC customers at Sentryo - where Cyber Vision provided behavioral detection for control systems - I feel our industry must adopt a more nuanced definition of a “True Positive” or “False Positive,” recognizing that context is material.
True Positive: Take care of Context
A true positive occurs when a detection system triggers a relevant alert within the scope of its detection logic. I emphasize relevance, as this definition acknowledges the context-dependent nature of threat detection. For instance, an unusual data transfer to an external service like Dropbox could be a true positive if such activity is anomalous within a particular organizational context. However, the relevance of this detection may vary depending on the organization's policies, data sensitivity, and usage norms.
This nuanced position is being adopted more and more by our industry. For example, another prolific and great Substack author, Tyler Shields, wrote last week:
The difference between traditional cybersecurity products and modern cybersecurity platforms lies in the use of context (vs. data) and AI to facilitate more intelligent and accurate decisions.
Tyler Shields
False Positive: Beyond Binary Interpretation
Conversely, a false positive is identified when a detection system erroneously triggers an alert on an event that is non-relevant or falls outside the intended scope of the detection logic. Beyond the binary view of false positives, the detected activity's broader context and implications must be examined. An alert for data transfer to an unapproved service, like our Dropbox example, might be considered a false positive in an organization where such transfers are ordinary and non-critical, even though the activity is technically within the detection scope.
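The Dropbox example can be sketched in a few lines of Python. This is a deliberately minimal illustration: the policy structure and function name are hypothetical, not drawn from any real product. The point is that the detection logic is identical in both cases; only the organizational context changes the label.

```python
# Minimal sketch of context-dependent alert triage. The policy format
# and names are hypothetical, used only to illustrate the idea.

def classify_alert(alert_service: str, org_policy: dict) -> str:
    """Label the same detection differently depending on organizational
    context: the detection itself does not change, the policy does."""
    if alert_service in org_policy.get("vetted_services", []):
        return "false_positive"  # routine, approved usage in this org
    return "true_positive"       # anomalous in this org's context


alert = "dropbox.com"

org_a = {"vetted_services": []}               # Dropbox is not vetted
org_b = {"vetted_services": ["dropbox.com"]}  # Dropbox used routinely

print(classify_alert(alert, org_a))  # true_positive
print(classify_alert(alert, org_b))  # false_positive
```

A real triage pipeline would weigh far more context (data sensitivity, user role, time of day), but even this toy version shows why a binary, context-free definition of true and false positives falls short.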
Conclusion
As Max explains, this expanded interpretation of what constitutes a true or false positive is somewhat new for many professionals in our field. Historically, the industry has been inclined to perceive these concepts in starkly binary terms. A true positive was straightforwardly seen as a detection accurately identifying a designated threat, like an exploit moving through the network. A false positive, on the other hand, was typically understood as the detection system mistakenly flagging an event that doesn't align with the specific characteristics of the threat as outlined in the signature. The shift towards a more nuanced understanding marks a significant evolution in our approach to cyber threat detection.
Next week, we will continue to explore this topic, laying down the fundamental algorithms of ML-based detection and trying to define and illustrate the challenges of explainability!
In the meantime, please engage in the conversation in the comments section. I’d love to get your perspective on this topic.
Laurent 💚