Jayesh Bapu Ahire for AI Guardrails


Don't Keep Up with AI!

Listen to the episode: Don't try to keep up with AI! by AI Guardrails, on podcasters.spotify.com.

In this episode hosted by Jayesh Ahire, Noah Gift, founder and CEO of Pragmatic AI Labs, delves into the complex interplay between generative AI, cybersecurity, and ethics. Gift challenges the current hype surrounding AI, emphasizing its role as an enhancer of existing best practices rather than a disruptive force. With a keen focus on the importance of ethical considerations and the potential risks of commercial AI models, he offers insightful perspectives on the future of technology.

Join us as we explore the ethical minefield of AI in cybersecurity and the critical importance of adhering to solid principles over chasing the latest trends, all through the insightful dialogue between Ahire and Gift.

Guest Bio

Noah Gift is the founder of Pragmatic A.I. Labs. He lectures in the MSDS program at Northwestern, the Duke MIDS Graduate Data Science Program, the Graduate Data Science program at UC Berkeley, the UC Davis Graduate School of Management MSBA program, the UNC Charlotte Data Science Initiative, and the University of Tennessee (as part of the Tennessee Digital Jobs Factory).
He teaches and designs graduate machine learning, MLOps, AI, and Data Science courses, and consults on Machine Learning and Cloud Architecture for students and faculty. These responsibilities include leading a multi-cloud certification initiative for students.

Host Bio

Jayesh Ahire is the Founding Product Manager at Traceable AI, where he runs the company's API Security initiative. He is a practitioner at heart and has worked with numerous organizations to design and implement secure API architectures and integrate security practices into their development processes. He maintains open-source projects such as OWASP crAPI and Hypertrace, has presented on API security and secure development practices at industry conferences including DEF CON, BSides, and Black Hat, and runs the API Security Global community.

Transcript

Jayesh Ahire: Hello, everyone! Welcome to AI and Guardrails. Today we have with us Mr. Noah Gift.

Jayesh Ahire: As you know, in this podcast, we talk about generative AI and security and how to incorporate generative AI into your security strategies. And how to put specific guardrails in your workflows.

Jayesh Ahire: So Noah is the founder and CEO of Pragmatic AI Labs, and as we were chatting before this, he mentioned that he has done a bunch of interesting gigs in the past, one of those being a bouncer.

Jayesh Ahire: So I will let him introduce himself and go through his journey through tech and AI, specifically.

Noah Gift: Yeah, hi, happy to be here. So my background is that I've had a lot of different jobs. Early in my career, I worked in live TV when I was a teenager, which was a pretty useful thing to know how to do because I learned how to edit.

Noah Gift: And then, when I was in college, one of the jobs that I had just for maybe like 6 months was a bouncer at a really large bar, and it wasn't necessarily that I was looking to be a bouncer. It just happened to be the job that I could get so that I could pay the rent while I was in school. And it was pretty fun because I got to work with actually one of the UFC champions, Chuck Liddell. He was one of the bouncers there as well. So it was an interesting kind of accident to work with somebody like that. And then, later in my career, I've worked in TV and film quite a bit and then later startups in the Bay Area. So currently, though, I teach part-time at a couple of different universities, including Duke. And I'm focused on creating content around, I'd say, cloud computing, data engineering, and AI.

Jayesh Ahire: Interesting. Yeah, that must have been very interesting to throw people out of the bar.

Jayesh Ahire: Yeah. So, as you mentioned, you have been in this industry for a while now, focusing on cloud computing, AI, and a bunch of different things. Specific to the topic we are discussing: what are some of the interesting use cases or impactful ways you have seen gen AI or LLMs being used, in the industry in general as well as in cybersecurity? If you can focus on both of those, it would be great.

Noah Gift: Yeah, I think what I see right now with generative AI is that there's a lot of hype around it. But some of the hype doesn't really play out, I think, in terms of usefulness where the organizations that are going to have the best return on investment are going to be organizations that already are well organized. So they are using agile, they use DevOps. They have, you know, security best practices like the principle of least privilege, auditing, you know, multiple layers of security. And so where I see generative AI coming into play is just enhancing what you're already doing. So if you're already doing, you know, exploit analysis, then you can use generative AI to help you with exploit analysis. If you're already doing analysis for outliers, looking at strange behaviors, right? You can also use generative AI to help you with that. So I see it less as like some revolutionary change and more as an accelerant to best practices. So if you are not doing best practices, you're gonna have a very poor time getting results from generative AI.

Jayesh Ahire: And that's an excellent point you made there, enhancing the current workflows you already have in place. That's where we have been working with a bunch of customers: people are trying to incorporate this where there is some existing workflow, some existing automation, some existing tech in place, and the question is how to increase the efficacy, how to make that workflow more efficient.

Jayesh Ahire: So when it comes to the overall security strategy of an organization, how do you think this fits, and what are some of the interesting use cases or workflows you have seen where gen AI/LLM can have a significant impact?

Noah Gift: Yeah, I think Google recently did a survey where they asked, I think, 100 C-level people what they think the main use cases are for generative AI, and to summarize, it was customer service, developer productivity, content development, and then potentially automation. So a big takeaway would be in terms of developer productivity: you could use generative AI tools to help you look for security holes in your code. You could have an assistant, maybe with a specific prompt, and the prompt would look for specific patterns that you're trying to identify, like, are you declaring variables that you're not using, or are you not freeing up memory, etc. So I think an easy one would be to enhance what you're already doing with a chatbot. And then, in terms of automation, as I mentioned earlier, you could take your already good practices, like looking for outlier behavior or auditing logs, and have generative AI help you look for patterns: do you see something that looks like a specific behavior, etc. So that's where I really see it fitting in, in those four areas of customer service, content development, developer productivity, and automation.
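
To make the developer-productivity idea concrete, here is a minimal sketch of a review assistant that sends a source file through a security-focused prompt like the one Noah describes. It assumes the openai Python package and an OpenAI-compatible endpoint; the model name, prompt wording, and file path are illustrative placeholders, not anything mentioned in the episode.

```python
# review_assistant.py - minimal sketch of an LLM-backed code review pass.
# Assumes the `openai` package and an OPENAI_API_KEY in the environment;
# the model name, prompt, and file path are illustrative placeholders.
from openai import OpenAI

REVIEW_PROMPT = (
    "You are a security-focused code reviewer. For the code below, list:\n"
    "1. Variables that are declared but never used.\n"
    "2. Memory or resources that are allocated but never freed/closed.\n"
    "3. Any obvious injection or unsafe-input patterns.\n"
    "Respond as a short bulleted list with line references."
)

def review_file(path: str, model: str = "gpt-4o-mini") -> str:
    """Send one source file through the review prompt and return the findings."""
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    with open(path, "r", encoding="utf-8") as f:
        source = f.read()
    response = client.chat.completions.create(
        model=model,
        messages=[
            {"role": "system", "content": REVIEW_PROMPT},
            {"role": "user", "content": source},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(review_file("src/payment_handler.py"))  # hypothetical file
```

The same pattern can run as a pre-commit hook or a CI step, with the findings treated as suggestions for a human reviewer rather than as a hard gate.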

Jayesh Ahire: Yeah, absolutely. And in the same context, we are also seeing things like people doing SCA with gen AI, even analyzing the results that come out of it and making predictions so they prioritize the right things, or building all the context into these tools so that personalization and prioritization can be efficient. And that happens as we start adopting these strategies, the four use cases you mentioned, in our day-to-day workflows.

Jayesh Ahire: What do you think are the challenges from a CxO's or CISO's point of view when trying to incorporate this into security operations? There are still challenges around data protection, privacy, and a bunch of different things, right? What is your view on those specific things?

Noah Gift: Yeah, I think there's a real problem with relying on commercial large language model technology, and it's probably much better to think about using open-source large language models as a trial. So what you could do is use technologies like llamafile, for example, which is essentially Mozilla's runner for large language models, and start to take a look at using open-source models like Mixtral as the start of what you're building in terms of automation. The real issue with these commercial large language models is that we really don't know yet what will happen when you send all this data to these commercial companies. In theory, the data will never get leaked; in practice, there are a lot of data leaks. So if you're really thinking about security and you immediately start using commercial third-party systems and sending them your data, it doesn't really pass the smell test for security best practices to send a bunch of data to a company that, in many cases, is already being sued for pirating data or really not caring about data. So I think that's a very big risk for organizations.
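
As a rough illustration of keeping data in-house, here is a minimal sketch that talks to a locally running open-source model instead of a commercial API. It assumes a llamafile or llama.cpp-style server is already running on the machine and exposing an OpenAI-compatible endpoint; the port, model label, and example alert are assumptions for the sketch, not details from the episode.

```python
# local_llm_check.py - sketch of keeping sensitive data on your own hardware.
# Assumes a llamafile / llama.cpp-style server is already running locally and
# exposes an OpenAI-compatible API (the port and model label are illustrative).
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:8080/v1",  # local server, data never leaves the host
    api_key="sk-no-key-needed",           # local servers typically ignore the key
)

def summarize_alert(alert_text: str) -> str:
    """Ask the local model to triage an alert without sending data to a third party."""
    response = client.chat.completions.create(
        model="local-mixtral",  # placeholder label; the server decides the actual model
        messages=[
            {"role": "system", "content": "You are a security triage assistant."},
            {"role": "user", "content": f"Summarize and rate the severity of:\n{alert_text}"},
        ],
    )
    return response.choices[0].message.content

if __name__ == "__main__":
    print(summarize_alert("Repeated failed logins from 10.0.0.7 followed by a sudo escalation."))
```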

Jayesh Ahire: Yeah. And as you talk about this, with the solution being to use some of these open-source technologies to make the process efficient, and even reliable to some extent: have you seen any specific scenarios or examples around you where people have tried this and gotten good results? What did that workflow look like, and what considerations are still in place when using these open-source technologies and building an end-to-end pipeline on them?

Noah Gift: I think an easy one potentially is transcribing video to text. I think that could be an interesting one. Let's say you're working for a government, and the organization wanted you to prevent classified information from being transmitted to the public. Well, an easy technique could be to transcribe all of the video content into text, and then take that text and analyze it for leaks. So I think that's a good example where an open-source tool like Whisper could be used.
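
A minimal sketch of that workflow might look like the following, assuming the open-source openai-whisper package (and ffmpeg) is installed locally; the video file name and the list of sensitive terms are hypothetical.

```python
# transcript_leak_scan.py - sketch of the Whisper-based leak check Noah describes.
# Assumes the open-source `openai-whisper` package and ffmpeg are installed;
# the video path and the "sensitive terms" list are illustrative placeholders.
import re
import whisper

SENSITIVE_TERMS = ["classified", "secret clearance", "project aurora"]  # hypothetical

def scan_video(path: str) -> list[str]:
    """Transcribe a video locally and return sentences that mention sensitive terms."""
    model = whisper.load_model("base")           # small model; runs on CPU or GPU
    transcript = model.transcribe(path)["text"]
    hits = []
    for sentence in re.split(r"(?<=[.!?])\s+", transcript):
        if any(term in sentence.lower() for term in SENSITIVE_TERMS):
            hits.append(sentence.strip())
    return hits

if __name__ == "__main__":
    for finding in scan_video("all_hands_recording.mp4"):  # hypothetical file
        print("Possible leak:", finding)
```

Because both the transcription and the scan run locally, nothing sensitive ever has to leave the organization's own machines.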

Jayesh Ahire: Got it. And since you started talking about government: one of the aspects of AI which is continuously debated and discussed on different fronts is AI and ethics. When it comes to ethical considerations, there are definitely challenges in using AI at a larger scale, especially when you're dealing with sensitive information and with things that can impact a lot of lives. There are regulations coming up; the EU already has some things in place, and laws are coming in the US, Asia, and other places. But at the same time, when we start adopting gen AI and LLMs into our workflows, especially in security, most security tools will deal with a bunch of sensitive data, in theory as well as in practice. When it comes to dealing with all that data, how do you navigate these ethical considerations, especially around privacy and protection?

Noah Gift: Well, I think a good starting point, if you're using commercial models, would be to look at what the company is already doing. If we look at the different organizations that are doing large language models, some of them have already had problems with ethical issues. So I would say avoiding those companies and choosing the companies that have the least amount of litigation, or the least amount of concerns about them, would probably be a good choice. You could rank the different commercial models, pick 3 or 4 different models, just like you would do with a bank or with any kind of vendor you deal with, rank them, and try to find the companies that seem like they're doing the best with data protection. So I think that's maybe the starting point. The second point would be to really think heavily about ways that you can keep your data isolated when you use large language model technology. RAG is a good example: you could have your data protected somewhere, talking to a vector database, and there's a large language model that maybe is commercial, if you've implemented it correctly (Amazon Bedrock is maybe a good example of this), or maybe it's a local large language model. So I think it's about isolating your data from being exposed to the third-party system. And then there are also things like bias, where if you're using models that accentuate what has historically been bad, like discrimination, then you want to be very careful about putting that into production, because you're going to accelerate a problem that society has been trying to solve.
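
Here is a minimal sketch of that isolation pattern: the documents and the vector index stay local, and only the retrieved snippets are placed into the prompt that would eventually go to a model (a local one, or one running inside your own cloud boundary). It assumes the sentence-transformers and numpy packages; the documents and the embedding model name are illustrative.

```python
# isolated_rag.py - sketch of the "keep your data isolated" RAG pattern.
# Documents stay local; only the retrieved snippets are placed into the prompt.
# Assumes `sentence-transformers` and `numpy`; names and documents are illustrative.
import numpy as np
from sentence_transformers import SentenceTransformer

documents = [
    "Incident 1042: exposed S3 bucket remediated on 2024-01-12.",
    "Policy: all API keys rotate every 90 days.",
    "Runbook: isolate the host, capture memory, then rotate credentials.",
]  # hypothetical internal documents that never leave your environment

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # small local embedding model
doc_vectors = embedder.encode(documents, normalize_embeddings=True)

def retrieve(question: str, k: int = 2) -> list[str]:
    """Return the k most similar local documents for a question."""
    q = embedder.encode([question], normalize_embeddings=True)[0]
    scores = doc_vectors @ q  # cosine similarity because vectors are normalized
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

def build_prompt(question: str) -> str:
    """Only this prompt (question + retrieved snippets) is sent to the LLM."""
    context = "\n".join(retrieve(question))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

if __name__ == "__main__":
    print(build_prompt("What is the key rotation policy?"))
```

The design choice that matters is where that prompt goes: if the model runs locally or inside your own account, the raw document store is never exposed to a third party.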

Jayesh Ahire: Very good points. And irrespective of the industry, what is your personal take on the whole ethical aspect, anything controversial that can give me a quote?

Noah Gift: In terms of the ethical components of large language models, I think one of the biggest issues right now is probably piracy. When a large for-profit company is intentionally training on pirated data, I think that raises a lot of questions about what their true intentions are, in terms of helping the world versus making a profit. So I think one of the bigger issues is: are companies respecting intellectual property, and are they asking for consent? Really, the keyword is consent. We see many examples of a lack of consent when training on data, and even in the case of open-source code there are different licenses. For example, there are the MIT and Apache licenses, under which you can do almost anything you want, but there are also Creative Commons licenses, and some people have licensed their code as non-commercial Creative Commons. So if you've intentionally trained your model on code that's been licensed as Creative Commons non-commercial, then you're basically intentionally breaking the law. There's no other way to put it; the license specifically says that. So I think those kinds of questions really should be thought about when you're dealing with a company.

Jayesh Ahire: Absolutely. And just one question slightly deviating from this one: when you mentioned self-hosted models and building some of these things in-house, one of the regular things we keep hearing about is cost, the cost of running these models internally, versus the cost of using third-party services, which can be less in some cases. So any thoughts on this cost of ownership and how we can manage it to some extent?

Noah Gift: Well, I think one of the ways to think about it, again, is to think about your software engineering best practices. If you have very poor software engineering best practices, you're not doing continuous integration and continuous delivery, you have poor DevOps or project management skills, well, you're going to waste a lot of money on anything you do. So that might be the first place to start. Then, in terms of hosting a model, I think it really depends on what it is you're doing. If you're taking an open-source model and you've already got an extremely efficient software system, it may actually be very efficient to host your own model versus calling an API. I think one of the problems with calling an API is that you have unbounded cost. If we look at Big O notation, you've got O(N), O(N²), and so on; calling an API is basically O(N) in the number of calls. So as your company gets larger and larger and you keep making calls over and over to an API, every time you make a call, you're being charged for it. On the flip side, if the model lives locally, it's O(1): if you already have a server and you're calling some kind of generative AI workflow, it's essentially a fixed cost that you don't have to pay again per call. So I think it's really a combination of what you're already doing in terms of software engineering best practices, as well as how you reason about the number of API calls you're going to make.
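
A back-of-the-envelope way to reason about that trade-off is to compare a per-call price against a fixed monthly hosting cost. The numbers below are hypothetical placeholders, not vendor quotes; the point is only that API spend grows linearly with call volume while self-hosting stays roughly flat.

```python
# cost_model.py - back-of-the-envelope comparison of per-call API cost (grows
# with usage, roughly O(N)) versus a self-hosted model (roughly fixed per month).
# All prices are hypothetical placeholders, not quotes from any vendor.

API_COST_PER_CALL = 0.002      # hypothetical $ per call to a commercial endpoint
SELF_HOSTED_MONTHLY = 1200.0   # hypothetical GPU server + ops cost per month

def monthly_cost(calls_per_month: int) -> tuple[float, float]:
    """Return (api_cost, self_hosted_cost) for a given call volume."""
    return calls_per_month * API_COST_PER_CALL, SELF_HOSTED_MONTHLY

if __name__ == "__main__":
    for calls in (10_000, 100_000, 1_000_000, 10_000_000):
        api, hosted = monthly_cost(calls)
        cheaper = "API" if api < hosted else "self-hosted"
        print(f"{calls:>10,} calls/month: API ${api:>10,.2f} vs hosted ${hosted:>8,.2f} -> {cheaper}")
```

With these placeholder numbers the break-even point is 600,000 calls per month; below that the API is cheaper, above it self-hosting wins, which is why the answer depends so heavily on expected call volume and on how efficiently the system is run.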

Jayesh Ahire: That definitely helps, and I guess that will be helpful for many of the others as well.

Jayesh Ahire: So as we go through all of this, we talked about some problems and some solutions. But to close, if we want to talk about the future specifically: what do you think will be some of the interesting use cases which can become prominent in the next 2 to 3 years?

Noah Gift: I think probably software engineering will be more of a collaborative workflow with lots of different agents. So it's possible that you have 2 or 3 different chatbots that are watching what you're writing, and maybe one of them is looking at security, another one is looking at architecture, and another one is looking at code quality. It's almost like you could have lots of different people pair programming with you, but not slowing you down. So I think we'll probably see more tooling around code. I don't think coding will be automated anytime soon, but I think there will be different workflows.
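
As a sketch of what that multi-agent review might look like, the snippet below runs the same diff through three reviewer personas, one per concern. The personas, prompts, model name, and diff file are all hypothetical; the idea is simply that each pass watches for a different class of problem.

```python
# multi_reviewer.py - sketch of the "several agents watching your code" idea:
# the same diff is sent through security, architecture, and code-quality prompts.
# Assumes the `openai` package; model name, prompts, and diff path are placeholders.
from openai import OpenAI

PERSONAS = {
    "security": "Review this diff strictly for security issues (injection, secrets, authz).",
    "architecture": "Review this diff for architectural concerns (coupling, boundaries, scaling).",
    "quality": "Review this diff for readability, naming, and missing tests.",
}

def review_diff(diff_text: str, model: str = "gpt-4o-mini") -> dict[str, str]:
    """Return one set of findings per reviewer persona."""
    client = OpenAI()
    findings = {}
    for name, instruction in PERSONAS.items():
        response = client.chat.completions.create(
            model=model,
            messages=[
                {"role": "system", "content": instruction},
                {"role": "user", "content": diff_text},
            ],
        )
        findings[name] = response.choices[0].message.content
    return findings

if __name__ == "__main__":
    with open("change.diff", encoding="utf-8") as f:  # hypothetical diff file
        for persona, notes in review_diff(f.read()).items():
            print(f"--- {persona} ---\n{notes}\n")
```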

Jayesh Ahire: Interesting and anything specific to security?

Noah Gift: I think it's the same with security. Anything you're doing already, you could think of it as adding additional personnel. So when you're looking at security incidents, you could have a chatbot helping you, looking at different outliers, giving you ideas, but it's up to you, as the domain expert, to filter all those different ideas and make sure that they make sense.
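
One way to read that is: let simple statistics produce the shortlist, and let the expert (with or without a chatbot assistant) make the call. A minimal sketch, with made-up login counts and an illustrative threshold:

```python
# outlier_shortlist.py - sketch of pre-filtering security events before a human
# (or a chatbot assistant) reviews them. Counts, field names, and the threshold
# are illustrative placeholders and would need tuning against real data.
import statistics

# hypothetical per-user counts of failed logins in the last hour
failed_logins = {"alice": 2, "bob": 1, "carol": 3, "svc-backup": 48, "dave": 0}

def outliers(counts: dict[str, int], z_threshold: float = 1.5) -> list[str]:
    """Flag users whose count is more than z_threshold standard deviations above the mean."""
    values = list(counts.values())
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values) or 1.0  # avoid division by zero
    return [user for user, v in counts.items() if (v - mean) / stdev > z_threshold]

if __name__ == "__main__":
    for user in outliers(failed_logins):
        print(f"Review needed: {user} has an unusual number of failed logins.")
```

The flagged entries are exactly the kind of shortlist you could hand to an assistant for a first-pass explanation, while the final judgment stays with the analyst.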

Jayesh Ahire: And you mentioned one thing: you don't think programmers will be replaced by AI at all; it will be more of a collaborative effort going forward. Do you think this applies to most of the jobs that exist right now, that most jobs will turn collaborative in that fashion? Or do you feel that some of them can get replaced at some point? I know we are diverging from the topic at this point, but yeah, I'd like your view.

Noah Gift: Yeah, I think for anything that requires human judgment, we're a long way off, right? If it requires an expert to make a decision, I don't think that's going to happen anytime soon. So in the case of security, sure, maybe some things could be automated, but at some point somebody is still going to have to make a decision based on that data, and that's where I think there are real gaps in what we've currently got. Same with coding, same with self-driving cars: we're nowhere close to self-driving cars. So I think the same applies to this kind of automation. Now, on the flip side, if what you've been doing is cutting and pasting text, yeah, I think you're going to get automated.

Jayesh Ahire: Cool, absolutely. And on this specific point, there are a lot of folks who are interested in learning more about LLMs and gen AI, since this is going to be more collaborative in the future and we'll need to know these things. There's a lot of interest in learning about this. So what would be your advice for folks who want to learn more about gen AI and LLMs: any workflows you have in mind, any courses? I know you have been working on one for some time.

Noah Gift: Yeah, so I do a lot of work with Duke on Coursera. I also have some stuff coming up on edX that's going to get announced in Q2 2024. But basically, if you just search Coursera for Noah Gift, I have roughly 40 courses on Coursera, which is one of the largest numbers of courses by an individual. A lot of the topics are around real-world large language model usage, real-world security, real-world cloud computing, so stuff that would really apply to you immediately at work, or maybe would help you get a job. So that's probably where I would point people: look at the content I've created on Coursera. There are, I think, 7 courses that are live right now around large language models, in the realm of LLMOps on Coursera, and that would probably be a good spot. I also have some stuff coming up in the next few months on Agile with AI, and also security with AI.

Jayesh Ahire: Interesting. Yeah, that will definitely be helpful. But one of the things which keeps coming up, and which I have been discussing with a bunch of people: there are a lot of things happening every single day. There are new models coming in, new papers being written. So how do we keep up with the space, and how do we keep ourselves updated every single day? Like, what do you follow, or what would you recommend for people to follow?

Noah Gift: Yeah, I think that's an interesting question, because it's true that there's so much stuff happening that it feels like, how could you possibly keep up? And what I would say is: don't keep up. What I mean by that is that there's nothing that's urgent. In the last 20 or 30 years that I've been in the tech industry, there's never been a case where, if you didn't know something that one week, you'd lose your job. That's not how the tech industry works. What's most important is the principles. So, as I mentioned before, agile, DevOps, all these core principles of software engineering best practices, that's what's important. If you don't have that, it really doesn't matter what new advance is happening. If you're thinking that you can basically cut the line and get in front of other people who have deep expert experience by cutting and pasting things from ChatGPT, you're going to be in for a deep surprise, because that's not how the world works, cutting and pasting code. It works with automation, continuous improvement, these best practices. So I would say it's not really that important to be up to date on a week-to-week basis. I think it's much more important to have a deep portfolio of work that shows software engineering best practices, and then wait a little bit. Let people filter out what's important, because there are so many people trying to be on top of everything. Just wait a week or two, and then you'll figure out, based on what the community has decided, what's important. And then just do that.

Jayesh Ahire: That's definitely a very interesting take on the whole thing.

Jayesh Ahire: Yeah, absolutely. I agree with part of this, because I remember having a discussion a couple of months back, and one thing that came up was that this feels so important because it is one of the most interesting things to happen in the last decade. Also because everybody was doing exactly the same thing: everything was running on AWS, Azure, Google Cloud, with DevOps. All of those things were becoming so standard that there was nothing new to look forward to. And then suddenly the hype started, and now everybody has something to discuss every single day. That's why I guess people keep falling for keeping up with things, because new things keep happening. But, as you rightly pointed out, don't keep up.

Noah Gift: Yeah, I'm not saying that you shouldn't try to be up to date on technology. It's just the time interval. What I mean is that it's okay to be a month behind; there's absolutely nothing of real value that's going to happen within one month. There really just isn't; the core principles are what's important. And I would say there's actually a huge strategy in letting other people figure out what's important first and, again, waiting a few weeks. And then if somebody figures out a new technology, great, then use it. Be lazy. Stop working so hard. Be much lazier.

Jayesh Ahire: Yeah, absolutely.

Jayesh Ahire: Cool. So that was mostly all I had around this specific topic. As we get towards the end: I like reading myself, so do you read often, and do you have any fiction or nonfiction recommendations you have gone to recently?

Noah Gift: Yeah, I read a lot of books. In terms of reading, a lot of books from around the turn of the century, the 1910s, 1920s, 1930s, have been kind of interesting to me lately. I think Hemingway is an interesting author because he talks about a lot of the things that happened around World War I and World War II, the different changes. So if someone hasn't read Hemingway, I think that could be a good choice. For example, For Whom the Bell Tolls is actually a pretty good book, and maybe even a timely one: it talks about the rise of fascism in the 1930s, how a fascist dictator, Franco, came to power in Spain, and the civil war between socialism and fascism. It could be a very interesting book, especially in the world that we're in right now, with a lot of political unrest, and he could be a good author to read because he covers a lot of topics that I think we're re-addressing a hundred years later.

Jayesh Ahire: Interesting. Yeah, I'll personally go through that, and anybody who is interested in that specific era and topic should go through it, as Noah recommends. Cool, so mostly that's it. Thanks for joining us and for sharing the insights. I think a lot of these things will be very helpful for listeners, and I personally learned a few important things from the conversation as well. So yeah, thank you, and have a great evening.

Noah Gift: Alright! Talk to you later. Bye.

Jayesh Ahire: Bye.

Conclusion

Thanks for listening to (or reading) the first episode of the AI Guardrails podcast. You can find the latest episodes here: https://podcasters.spotify.com/pod/show/ai-guardrails. We are available on Spotify, Apple Podcasts, and all of your favorite podcast apps. What topics do you want to hear about next? Let us know in the comments!
