Explainable & Interpretable AI - An Interview with Serg Masís
Detection Engineers aren't the only ones on a quest to fully grasp the intricacies and nuances of ML, AI, and LLM algorithms.
Hello Cyber Builders 🖖
Welcome to the latest edition of Cyber Builders, where we delve into the ever-evolving world of cybersecurity and AI.
In our ongoing series about detection engineering and AI algorithms, we've had insightful discussions with industry leaders like Maximilian from Darktrace. Today, we focus on the fascinating intersection of machine learning and data analysis. We're thrilled to present an exclusive interview with Serg Masís, a renowned data scientist and author known for his profound insights into machine learning.
In this conversation, Serg shares his expertise on explainability and transparency in AI models, the significance of interpretability across various sectors, and the intricate challenges of adversarial robustness in AI systems. Join us as we uncover the layers of AI algorithms with Serg and discuss his upcoming book "DIY AI," which aims to make AI accessible to a wider audience.
Introducing Serg Masís
Today, we're excited to have Serg Masís with us, a multifaceted professional who intertwines the world of data science with the art of authorship.
By day, Serg navigates the complex data realms of a large agribusiness multinational, and by night, he delves into the intricacies of machine learning through his writing. His journey began two decades ago in computer science, evolving through various roles in software, web, and mobile development.
However, the allure of data and its endless puzzles captivated him, eventually steering him toward a graduate degree in data science.
In the Q&A below, Serg digs into explainability and transparency in models, the importance of interpretability across sectors, and the challenges of adversarial robustness in AI. He also shares details about his upcoming book, "DIY AI," which aims to democratize AI through engaging and accessible projects.
Hi Serg, can you introduce yourself and tell us about your background?
I have a day job as a data scientist in a large agribusiness multinational and a side gig as an author of books about machine learning. But mostly, I’m just someone who loves all things data and models. I got started in computer science two decades ago. I had many roles in software, web, and mobile development, but data was always there. And I was always trying to make sense of it. I saw myself as a developer, but it took me a while to realize that my motivations weren’t the development but the insights and automation driving it. That’s when I decided to get a graduate degree in data science and transition into strictly data roles.
Serg, could you explain what explainability means in the context of machine learning models?
Certainly! Explainability encompasses what interpretability is but delves deeper into transparency requirements. It's about understanding model inference and providing human-friendly explanations for a model's inner workings and training process.
This could involve varying degrees of transparency in model design and algorithms.
Interesting. Can you elaborate on the different types of transparency involved?
There are three main types. The first is Model Transparency, where you can explain step by step how a model is trained. For example, explaining how the ordinary least squares optimization method finds the best coefficients in a simple weight prediction model (a short code sketch of this appears after this answer).
Second, Design Transparency involves explaining choices like model architecture and hyperparameters based on data size or nature.
The third is Algorithmic Transparency, which explains automated optimizations, though some methods like random search make the algorithm non-transparent.
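Editor's note: to make model transparency concrete, here is a minimal, hypothetical sketch (ours, not from Serg's books) of the weight prediction example he mentions. The data points below are made up; the point is that ordinary least squares lets you write down and verify every step that produces the coefficients.

```python
# A minimal, hypothetical illustration of model transparency:
# predicting weight (kg) from height (cm) with ordinary least squares.
import numpy as np

# Toy data, made up purely for illustration
height_cm = np.array([150, 160, 165, 170, 175, 180, 185, 190], dtype=float)
weight_kg = np.array([52, 58, 62, 66, 70, 75, 80, 86], dtype=float)

# Design matrix with an intercept column
X = np.column_stack([np.ones_like(height_cm), height_cm])
y = weight_kg

# OLS solves min ||X @ beta - y||^2; lstsq is used instead of the
# explicit normal-equation inverse for numerical stability
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
intercept, slope = beta

print(f"weight ≈ {intercept:.2f} + {slope:.2f} * height")
# Because the estimation procedure is a short, deterministic formula,
# we can explain exactly how the model arrived at these coefficients.
```

Contrast that with a deep neural network, whose millions of fitted weights admit no such step-by-step account - which is precisely the gap the next question about opaque models gets at.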
You mentioned opaque models. Why are they termed 'opaque'?
They're called opaque because they lack model transparency. Their selection may be justified, but they often don't inspire trust. Reasons include not being statistically grounded, uncertainty and non-reproducibility, and issues like overfitting and the curse of dimensionality. They can be hard to trust because of their complexity and our human limitations in reasoning across many dimensions.
So, why and when does explainability really matter?
Explainability is crucial for trustworthy and ethical decision-making. It's especially important in scientific research, clinical trials, consumer product safety, public policy, law, and areas like criminal investigation and regulatory compliance.
In these fields, the ability to reproduce results, prove causality, and ensure safety and simplicity are paramount.
In previous editions, we discussed why understanding algorithms matters for Detection Engineers. With your broader experience in other fields, can you provide examples of similar issues? Please give us some insight into the business case for interpretability.
Of course. Interpretable machine learning models can lead to better decision-making, trustworthiness, ethical considerations, and profitability. They help in understanding the model's decisions and improving them. For instance, identifying why self-driving cars confuse snow with pavement can significantly improve the model. Interpretability also helps identify biases and prevent errors that could lead to public relations disasters.
Lastly, how does model interpretation help in addressing decision biases?
Model interpretation helps in several ways. It counters biases like conservatism bias, salience bias, and fundamental attribution error. It encourages questioning assumptions and expanding understanding, leading to better decision-making. Crucially, it aids in identifying outliers, which could be opportunities or potential liabilities.
Your book also covers “Adversarial Robustness”. What is it? Is it a significant threat?
When we think about decisions or predictions, we think about the “what if”.
What would need to change for something to switch from one decision or prediction to another? This is called a counterfactual and, very crucially, is fundamental to causal thinking and causal relationships.
Machine learning models are correlation-based, so they lack this kind of causal consistency, and one of the most popular forms of adversarial attack exploits that weakness. For instance, an attacker may discover that changing the color of a single patch on a stop sign tricks a computer vision model into deciding it is not a stop sign. That seems illogical to any human, who focuses only on the relevant parts of a stop sign, but to a deep learning model, every part of an image is potentially relevant to a classification - even a single pixel. So, adversarial robustness is about defending models against such attacks and others intended to enable sabotage, fraud, or espionage.
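Editor's note: the stop-sign scenario Serg describes belongs to a broad family of perturbation attacks. The sketch below is our own illustration (not code from Serg's books); it uses the well-known fast gradient sign method on an untrained toy classifier purely to show the mechanics - the attacker nudges every pixel a tiny step in the direction that most increases the model's loss.

```python
# A minimal FGSM-style adversarial perturbation sketch (illustrative only).
import torch
import torch.nn as nn

torch.manual_seed(0)

# Hypothetical tiny classifier: 3x32x32 image -> 10 classes (e.g., traffic signs)
model = nn.Sequential(
    nn.Flatten(),
    nn.Linear(3 * 32 * 32, 64),
    nn.ReLU(),
    nn.Linear(64, 10),
)

image = torch.rand(1, 3, 32, 32)   # stand-in for a stop-sign photo
true_label = torch.tensor([0])     # class 0 = "stop sign" (hypothetical)
loss_fn = nn.CrossEntropyLoss()

# Gradient of the loss with respect to the *input*, not the weights
image.requires_grad_(True)
loss = loss_fn(model(image), true_label)
loss.backward()

# FGSM: move each pixel a small step in the direction that increases the loss
epsilon = 0.03
adversarial = (image + epsilon * image.grad.sign()).clamp(0, 1).detach()

print("original prediction:   ", model(image).argmax(dim=1).item())
print("adversarial prediction:", model(adversarial).argmax(dim=1).item())
# Against a trained model, a perturbation this small is usually invisible to a
# human yet can flip the predicted class - the correlation-based weakness Serg
# describes. Adversarial training and input sanitization are common defenses.
```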
Anything you want to add? Projects you want to talk about?
I’m working on a book called DIY AI for Addison-Wesley to help practitioners and enthusiasts alike have fun and learn with generative and discriminative AI projects. It has code, but its target audience is anyone who is willing to take on the projects and make them their own. We speak a lot about making AI responsible, but I believe we also have to make it reach more people.
I was around for the early days of the internet. It took a lot of experimentation with the technology by technically savvy and technically challenged folks alike to figure out what was needed from it. There was a grassroots aspect to it before it became increasingly corporate, and I think that is what is needed with AI.
Conclusion
Our dialogue with Serg Masís has offered a deeper understanding of the complexities and necessities of explainability in machine learning. His perspectives on transparency, interpretability, and adversarial robustness shed light on both the technical aspects and the ethical implications of AI across various fields.
Serg's upcoming book, "DIY AI," aligns perfectly with our mission at Cyber Builders - to democratize AI and encourage a community-led approach to understanding and innovating in this space.
As we continue to explore the vast expanse of detection engineering and AI algorithms, let's remember the importance of making AI responsible and accessible, as Serg rightly emphasizes.
We're eager to hear your thoughts on this discussion. Your feedback, questions, and insights are invaluable to us. Feel free to reach out through our blog comments, social media channels, or email.
Your engagement helps us shape future content and fosters a vibrant, collaborative community in cybersecurity and AI. Stay tuned for more insights and collaborative explorations!
Laurent 💚