Greg Benson | SnapLogic Innovation Day 2018

Greg Benson, Chief Scientist at SnapLogic and Professor at the University of San Francisco, sat down with Jeff Frick from theCUBE to discuss the impact of AI on app and data integration projects.

>> Narrator: From San Mateo, California, it’s theCUBE, covering SnapLogic Innovation Day 2018. Brought to you by SnapLogic.

>> Welcome back, Jeff Frick here with theCUBE. We’re at the Crossroads, that’s 92 and 101 in the Bay Area. If you’ve been through it, you’ve had time to take a minute and look at all the buildings, ’cause traffic’s usually not so great around here. But there’s a lot of great software companies that come through here. It’s interesting, I always think back to the Siebel Building that went up and now that’s Rakuten, who we all know from the Warriors jerseys, the very popular Japanese retailer. But that’s not why we’re here. We’re here to talk to SnapLogic. They’re doing a lot of really interesting things, and they have been in data, and now they’re doing a lot of interesting things in integration. And we’re excited to have a many-time CUBE alum. He’s Greg Benson, let me get that title right, chief scientist at SnapLogic and of course a professor at University of San Francisco. Greg, great to see you.

>> Great to see you, Jeff.

>> So I think the last time we saw you was at Fleet Forward. Interesting open-source project, data, ad moves. The open-source technologies and the technologies available for you guys to use just continue to evolve at a crazy breakneck speed.

>> Yeah, it is. Open source in general, as you know, has really revolutionized all of computing, starting with Linux and what that’s done for the world. And, you know, in one sense it’s a boon, but it introduces a challenge, because how do you choose? And then even when you do choose, do you have the expertise to harness it? You know, the early social companies really leveraged off of Hadoop and Hadoop technology to drive their business and their objectives. And now we’ve seen a lot of that technology be commercialized and have a lot of service around it. And SnapLogic is doing that as well. We help reduce the complexity and make a lot of this open-source technology available to our customers.

>> So, I want to talk about a lot of different things. One of the things is Iris. So Iris is your guys’ leverage of machine learning and artificial intelligence to help make integration easier. Did I get that right?

>> That’s correct, yeah. Iris is the umbrella term for everything that we do with machine learning and how we use it to enhance the user experience. And one way to think about it is when you’re interacting with our product, we’ve made the SnapLogic designer a web-based UI, drag-and-drop interface to construct these integration pipelines. We connect these things called Snaps. It’s like building with Legos to build out these transformations on your data. And when you’re doing that, when you’re interacting with the designer, we would like to believe that we’ve made it one of the simplest interfaces to do this type of work, but even with that, there are many times you have to make decisions, like what type of transformation do you do next? How do you configure that transformation if you’re talking to an Oracle database? How do you configure it? What are your credentials if you talk to Salesforce? If I’m doing a transformation on data, which fields do I need? What kind of operations do I need to apply to those fields? So as you can imagine, there are lots of situations as you’re building out these data integration pipelines where you have to make decisions. And one way to think about Iris is Iris is there to help reduce the complexity, help reduce what kind of decision you have to make at any point in time. So it’s contextually aware of what you’re doing at that moment in time, based on mining our thousands of existing pipelines and scenarios in which SnapLogic has been used. We leverage that to train models to help make recommendations so that you can speed through whatever task you’re trying to do as quickly as possible.
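
As a rough illustration of the idea Benson describes, mining existing pipelines to suggest a next step, here is a minimal sketch that simply counts which step most often follows another. It is not SnapLogic's model: the frequency-count approach, the function names, and the sample pipelines are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def train_next_step_counts(pipelines):
    """Count how often each step is directly followed by another step."""
    followers = defaultdict(Counter)
    for steps in pipelines:
        for current, nxt in zip(steps, steps[1:]):
            followers[current][nxt] += 1
    return followers


def recommend_next(followers, current_step, k=3):
    """Return up to k of the most frequent follow-on steps, best first."""
    counts = followers.get(current_step, Counter())
    return [step for step, _ in counts.most_common(k)]


# Hypothetical pipeline history; the step names are illustrative labels only.
history = [
    ["Salesforce Read", "Filter", "Redshift Insert"],
    ["Salesforce Read", "Filter", "CSV Formatter", "File Writer"],
    ["File Reader", "CSV Parser", "Filter", "Redshift Insert"],
    ["Salesforce Read", "Mapper", "Redshift Insert"],
]

model = train_next_step_counts(history)
suggestions = recommend_next(model, "Salesforce Read")
print(suggestions)
```

A trained model, as described in the interview, would also take the wider context into account; the counting version only shows the shape of the problem.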

>> It’s such an important piece of information, because if I’m doing an integration project using the tool, I don’t have the experience of the vast thousands and thousands, and actually you’re doing now, what, a trillion document moves last month? I just don’t have that expertise. You guys have the expertise, and truth be told, as unique as I think I am, and as unique as I think my business processes are, probably a lot of them are pretty much the same as a lot of other people that are hooking up Salesforce to Oracle or hooking up Marketo to their CRM. So you guys have really taken advantage of that using the AI and ML to help guide me along, which is probably a pretty high-probability prediction of what my next move’s going to be.

>> Yeah, absolutely, and you know, back in the day, we used to consider, like, wizards or these sorts of things that would walk you through it. And really that was, it seemed intelligent, but it wasn’t really intelligence or machine learning. It was really just hard-coded facts or heuristics that hopefully would be right for certain situations. The difference today is we’re using real data, gigabytes of metadata that we can use to train our models. The nice thing about that is it’s not hard-coded, it’s adaptive. It’s adaptive both for new customers but also for existing customers. We have customers that have hundreds of people that just use SnapLogic to get their business objectives done. And as they’re building new pipelines, as they are putting in new expressions, we are learning that for them within their organization. So their coworkers, the next day, can come in and get the advantages: all the intellectual work that was done to figure something out will be learned and then made available through Iris.

>> Right. I love this idea of operationalizing machine learning and the augmented intelligence. So how do you apply it? Don’t just talk about it, don’t give it a name of some dead smart person, but actually apply it to an application where you can start to see the benefit. And that’s really what Iris is all about. So what’s changed the most in the last year since you launched it?

>> You know, one thing I’ll say: The most interesting thing that we discovered when we first launched Iris, and I should say one of the first Iris technologies that we introduced was something called the integration assistant. And this was an assistant that would make recommendations of the next Snap as you’re building out your pipeline, so the next transformation or the next connector, and before we launched it, we did lots of experimentation with different machine learning models. We did different training to get the best accuracy possible. And what we really thought was that this was going to be most useful for the new user, somebody who hasn’t really used the product. And it turns out, when we looked at our data, and we looked at how it got used, that yes, new users did use it, but existing or very skilled users were using it just as much if not more, ’cause it turned out that it was so good at making recommendations that it was like a shortcut. Like, even if they knew the product really well, it’s still actually a little more work to go through our catalog of 400 plus Snaps and pick something out when it’s just sitting right there and saying, “Hey, the next thing you need to do,” you don’t even have to think. You just have to click, and it’s right there. Then it just speeds up the expert user as well. That was an interesting sort of revelation about machine learning and our application of it. In terms of what’s changed over the last year, we’ve done a number of things. Probably the biggest is operationalizing it, so that instead of training off of a snapshot, we’re now training on a continuous basis so that we get that adaptive learning that I was talking about earlier.

The other thing that we have done, and this is kind of getting into the weeds, we were using a decision tree model, which is a type of machine learning algorithm, and we switched to neural nets, so now we use neural nets to achieve higher accuracy, and also a more adaptive learning experience. The neural net allowed us to bring in sort of like this organizational information so that your recommendations would be more tailored to your specific organization. The other thing we’re just on the cusp of releasing is, in the integration assistant, we were working with sort of a beginning-to-end type of recommendation, where you were kind of working forward. But what we found is, in talking to people in the field, and our customers who use the product, is there’s all kinds of different ways that people interact with a product. They might know where they want the data to go, and then they might want to work backwards. Or they might know that the most important thing I need this to do is to join some data. So like when you’re solving a puzzle with the family, you either work on the edges or you put some clumps in the middle and work to get to. And that puzzle-solving metaphor is where we’re moving the integration assistant so that you can fill in the pieces that you know, and then we help you work in any direction to make the puzzle complete. That’s something that we’ve been adding to. We recently started recommending, based on your context, the most common sources and destinations you might need, but we’re also about to introduce this idea of working backwards and then also working from the inside out.

>> We just had Gaurav on, and he’s talking about the next iteration of the vision is to get to autonomous, to get to where the thing not only can guess what you want to do, has a pretty good idea, but it actually starts to basically do it for you, and I guess it would flag you if there’s some strange thing or it needs assistance, and really almost full autonomy in this integration effort. It’s a good vision.

>> I’m the one who has to make that vision a reality. The way I like to explain it is that customers or users have a concept of what they want to achieve. And that concept is a thought in their head, and the goal is how to get that concept or thought into something that is machine executable. What’s the pathway to achieve that? Or if somebody’s using SnapLogic for a lot of their organizational operations or for their data integration, we can start looking at what you’re doing and make recommendations about other things you should or might be doing. So it’s kind of like this two-way thing where we can give you some suggestions but people also know what they want to do conceptually, but how do we make that realizable as something that’s executable? So I’m working on a number of research projects that are getting us closer to that vision. And one that I’ve been very excited about is we’re working a lot with NLP, Natural Language Processing, like many companies and other products are investigating. Our use in particular is in a couple of different ways. To be sort of concrete, we’ve been working on a research project in which, rather than, you know, having to know the name of a Snap. ’Cause right now, you get this thing called a Snap catalog, and like I said, 400 plus Snaps. To go through the whole list, it’s pretty long. You can start to type a name, and yeah, it’ll limit it, but you still have to know exactly what that Snap is called. What we’re doing is we’re applying machine learning in order to allow you to either speak or type what the intention is of what you’re looking for. I want to parse a CSV file. Now, we have a file reader, and we have a CSV parser, but if you just typed, parse a CSV file, it may not find what you’re looking for. But we’re trying to take the human description and then connect that with the actual Snaps that you might need to complete your task. That’s one thing we’re working on. I have two more.

The second one is a little bit more ambitious, but we have some preliminary work that demonstrates this idea of actually saying or typing what you want an entire pipeline to do. I might say I want to read data from Salesforce, I want to filter out only records from the last week, and then I want to put those records into Redshift. And if you were to just say or type what I just said, we would give you a pipeline that maybe isn’t entirely complete, but is working and allows you to evolve it from there. So you didn’t have to go through all the steps of finding each individual Snap and connecting them together. So this is still very early on, but we have some exciting results. And then the last thing we’re working on with NLP is, in SnapLogic, we have a nice UI, and it’s really good. A lot of the heavy lifting in building these pipelines, though, is in the actual manipulation of the data. And to actually manipulate the data, you need to construct expressions. And expressions in SnapLogic, we have a JavaScript expression language, so you have to write these expressions to do operations, right. One of our next goals is to use natural language to help you describe what you want those expressions to do and then generate those expressions for you. To get at that vision, we have to chisel. We have to break down the barriers on each one of these and then collectively, this will get us closer to that vision of truly autonomous integration.
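
To make the first of these ideas concrete, the sketch below matches a typed intent such as "parse a CSV file" against short descriptions of catalog entries instead of against exact Snap names. It is only a keyword-overlap illustration, not the machine learning approach Benson refers to; the catalog descriptions, the stopword list, and the function names are all assumptions.

```python
import re

# Hypothetical catalog: entry name -> short description (illustrative text only).
CATALOG = {
    "File Reader": "read a file from disk or a url",
    "CSV Parser": "parse csv text into records",
    "JSON Parser": "parse json text into records",
    "Filter": "keep only records matching a condition",
}

# Small, hand-picked stopword list (an assumption for this sketch).
STOPWORDS = {"a", "an", "the", "i", "want", "to", "from", "or", "into", "only"}


def tokens(text):
    """Lowercase the text, split it into words, and drop stopwords."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if w not in STOPWORDS}


def rank_snaps(query, catalog=CATALOG, k=2):
    """Rank catalog entries by how many query words appear in name or description."""
    q = tokens(query)
    scored = []
    for name, description in catalog.items():
        overlap = len(q & (tokens(name) | tokens(description)))
        if overlap:
            scored.append((overlap, name))
    # Highest overlap first; ties are broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for _, name in scored[:k]]


matches = rank_snaps("I want to parse a CSV file")
print(matches)
```

A plain name search for the whole phrase would match neither entry, which is the gap described above; a learned model would replace the overlap score with something far more robust.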

>> What’s so cool about it, and again, you say autonomous and I can’t help but think autonomous vehicles. We had a great interview, he said, if you have an accident in your car, you learn, the person you had an accident with learns a little bit, and maybe the insurance adjuster learns a little bit. But when you have an accident in an autonomous vehicle, everybody learns, the whole system learns. That learning is shared orders of magnitude greater, to greater benefit of the whole. And that’s really where you guys are sitting in this cloud situation. You’ve got all this integration going on with customers, you have all this translation and movement of data. Everybody benefits from the learning that’s gained by everybody’s participation. That’s what is so exciting, and why it’s such a great accelerator to how things used to be done before by yourself, in your little company, coding away trying to solve your problems. Very very different kind of paradigm, to leverage all that information of actual use cases, what’s actually happening with the platform. So it puts you guys in a pretty good situation.

>> I completely agree. Another analogy is, look, we’re not going to get rid of programmers anytime soon. Programming’s a complex, human endeavor. However, the Snap pipelines are kind of like programs, and what we’re doing in our domain, our space, is trying to achieve automated programming so that, you’re right, as you said, learning from the experience of others, learning from the crowd, learning from mistakes and capturing that knowledge in a way that when somebody is presented with a new task, we can either make it very quick for them to achieve that or actually provide them with exactly what they need. So yeah, it’s very exciting.

>> So we’re running out of time. Before I let you go, I wanted to tie it back to your professor job. How do you leverage that? How does that benefit what’s going on here at SnapLogic? ’Cause you’ve obviously been doing that for a long time, it’s important to you. Bill Schmarzo, great fan of theCUBE, I deemed him the dean of big data a couple of years ago, he’s now starting to teach. So there’s a lot of benefits to being involved in academe, so what are you doing there in academe, and how does it tie back to what you’re doing here in SnapLogic?

>> So yeah, I’ve been a professor for 20 years at the University of San Francisco. I’ve long done research in operating systems and distributed systems, parallel computing, programming languages, and I had the opportunity to start working with SnapLogic in 2010. And it was this great experience of, okay, I’ve done all this academic research, I’ve built systems, I’ve written research papers, and SnapLogic provided me with an opportunity to actually put a lot of this stuff in practice and work with real-world data. I think a lot of people on both sides of the industry-academia fence will tell you that a lot of the real interesting stuff in computer science happens in industry because a lot of what we do with computer science is practical. And so I started off bringing in my expertise in working on innovation and doing research projects, which I continue to do today. And at USF, we happened to have a vehicle already set up. All of our students, both undergraduates and graduates, have to do a capstone senior project or master’s project in which we pair up the students with industry sponsors to work on a project. And this is a time in their careers where they don’t have a lot of professional experience, but they have a lot of knowledge. And so we bring the students in, and we carve out a project idea. And the students, under my mentorship and working with the engineering team, work toward whatever project we set up. Those projects have resulted in numerous innovations now that are in the product. The most recent big one is Iris, which came out of one of these research projects.

>> Oh, it did?

>> It was a machine learning project that started around three years ago. We continuously have lots of other projects in the works. On the flip side, my experience with SnapLogic has allowed me to bring sort of this industry experience back to the classroom, both in terms of explaining to students and understanding what their expectations will be when they get out into industry, but also being able to make the examples more real and relevant in the classroom. For me, it’s been a great relationship that’s benefited both those roles.

>> Well, it’s such a big and important driver to what goes on in the Bay Area. USF doesn’t get enough credit. Clearly Stanford and Cal get a lot, they bring in a lot of smart people every year. They don’t leave, they love the weather. It is really a significant driver. Not to mention all the innovation that happens and cool startups that come out. Well, Greg, thanks for taking a few minutes out of your busy day to sit down with us.

>> Thank you, Jeff.

>> All right, he’s Greg, I’m Jeff. You’re watching theCUBE from SnapLogic in San Mateo, California. Thanks for watching.

