Every attendee will receive access to their Prezi on “Simplify Integration: Cloudify Your On-Premise Data” which includes an application inventory pre-flight checklist to help you get started right away.
Download: Cloudify On-Premise Data Prezi
Tony: Good morning, everyone. Hello and welcome to our webinar. I'm Anthony Young, and I'm the director of digital marketing at SnapLogic. We will get started, by way of introduction, with Zeb and Abhi shortly.
Today we'll talk about cloudification of on premise data. We'll talk about what steps are needed to cloudify data that's sitting in on premise applications. What are the options available? We'll follow with a short demo that shows one way to cloudify on prem data, and then we'll wrap up the discussion with general market demand. Joining me today is Abhi Dharia. Abhi has several years of experience working with data integration products like ETL, EAI, etc. Hi, Abhi. Would you like to say a few words about yourself?
Abhi: Thank you Zeb, thank you Tony. Hello everyone, my name is Abhi Dharia. I'm a principal consultant at SnapLogic, and as Tony mentioned, I've been working in the space of enterprise integration for the last 10 years. I've written commercial integration software, designed mission critical integration solutions, and I've also advised multiple Fortune 100 clients on their enterprise integration platforms. Right now, I'm mostly interested in focusing on cloud integration implementation and strategies. Thanks again for joining the webinar, and I look forward to hearing from you about your experiences, and really hearing about what you are currently looking to do in this space over the next half an hour or so. Back to you, Tony.
Tony: Awesome, thank you Abhi. Next up is Zeb. Zeb, can you tell us a little bit about yourself?
Zeb: Sure. I'm Zeb Mahmood, I've been in the enterprise space for 12, 15 years now, Tony. I've been in the master data management, or MDM, space for about 8 years. Then I moved over to the data integration space about a year and a half ago. To be more specific, I moved from data integration over to the cloud integration space. Abhi and I, for those of you who just joined, will be the main presenters. Let's get started. Let's see how we can unlock your on premise data while having some fun. Yes, as we get started, please note that we will be making this presentation publicly available on [inaudible 00:02:28]. If you're not familiar with that, it's just presentation software. Just enjoy the presentation. You probably don't have to take too many notes. We will also make the recording available [inaudible 00:02:41] as well.
Tony: Just a word to folks out there. Let us know if you have any problems hearing us or if there's any audio issues, and we'll try to make adjustments as well.
Zeb: Thank you. Let's jump into it right away. If you joined this webinar, then you've at least heard the term REST, right? Or RESTful. At least as far as this webinar is concerned, here is a very quick overview of REST in one page. It's one of the web services methodologies out there, SOAP being the other popular one. What we're talking about here is how to use a REST or RESTful API to cloudify your data. The most common example is the HTTP protocol, which, it's fair to assume, everybody is familiar with. We use it every day in our browsers. Imagine invoking an enterprise service, an enterprise API, using a URL with a combination of the GET or POST method. That's really the simplicity of REST. That is the beauty: it's platform, language and location independent. All you really need to know is the URL. That's your path to your API.
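As a rough illustration of "a URL plus a GET or POST method", here is a minimal Python sketch. The endpoint, the `category` filter and the product fields are made up for illustration and do not come from the webinar; the requests are only built, not sent.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoint; the webinar does not name a real URL.
BASE_URL = "https://integration.example.com/api/products"


def build_get(base_url, **filters):
    """Build a GET request: the query string carries the filters."""
    query = urllib.parse.urlencode(filters)
    url = f"{base_url}?{query}" if query else base_url
    return urllib.request.Request(url, method="GET")


def build_post(base_url, record):
    """Build a POST request: the JSON body carries the new record."""
    body = json.dumps(record).encode("utf-8")
    return urllib.request.Request(
        base_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


get_req = build_get(BASE_URL, category="widgets")
post_req = build_post(BASE_URL, {"sku": "W-100", "name": "Widget"})
print(get_req.get_method(), get_req.full_url)
print(post_req.get_method(), post_req.full_url)
# Sending either request would be urllib.request.urlopen(req); that call is
# left out because the endpoint is invented.
```

The caller needs nothing beyond the URL and the HTTP method, which is the platform and language independence described above.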
With that, let's talk about the problems that it has. In the past, accessing data from on premise applications meant having to deal with application specific API, or direct database calls. Mostly these APIs would be what, your C API, Java API, whatever your language of preference is. It will be some language specific API, which makes data not so easily accessible. In any integration scenario, you have at least two integration points, right? If that makes sense. You're pulling data out from one and pushing to the other. Now it's possible that the first application uses C API while the other one uses, I don't know, Python or Java API. You can see we have some dancing around to do, right?
What happens? End result, we all know. In most enterprises, we end up doing what? File data configuration, right? [inaudible 00:05:00], the most common format. If you get specific it's XML. That's what we're doing. Inherently what's happening is that these lockdowns that you have, based on the language preference of the API, end up making you conform to something which is common, and what's common is a file system. What's the current state? The current state is good. It's getting much better. We have all these cloud vendors coming up with web services. If you look at salesforce.com, if you look at Twitter, Facebook, they have SOAP API, they have REST API. That's great.
Hey, when you really think about it, it's really not that great, because you still have your old on premise applications. Now the challenge is to connect, or integrate, these on premise applications with these new interfaces. Right? To make things even more complex, you throw in something like big data. Because it's not just about the transfer protocol anymore, it is also about the type of data. Big data is mostly unstructured, and mostly you get value from big data when you're doing some kind of real-time processing, what's called streaming these days. The point being, things tend to change significantly from what you're probably used to with on premise data.
Then you move to the future. What's the future? Most organizations are so overwhelmed that they are in reactionary mode right now, as far as integration's concerned. They say, "Let's go ahead and do this one more integration. Just one more integration." There isn't really a big strategy out there. Where do we go from here? You've got your on premise data, you've got your data in ERP, e-commerce, PLM. You've got all of these typical enterprise applications. You want to make all of these data sources exposed to your enterprise ecosystem, so you're ready to roll.
Let's analyze what it is you want to accomplish, before you start on your journey. Let's take a look at the IT guy's Post-It notes on his desktop. What does he have there? Leverage API, leverage the application's API. Definitely a very good start. You don't want to go directly against the back end database as a first choice. First choice most of the time should be an API. There are some caveats, but let's go with that for now. Then you definitely want to avoid vendor lock-in. Lock-in here, by the way, refers mostly to the programming language lock-in that I was referring to, and to being locked into a preferred software or hardware stack. Then you have business. Of course technology is a tool, we all know that. It has to eventually assist, it has to benefit the business. Why make business readjustments because you picked a specific technology? Never, ever, ever underestimate the importance of the quality of the data.
As you embark on integration, you will realize that having clean data, harmonized data, is extremely important. You can have the sexiest integration solution out there, but if the data going in is garbage, then you've just got a pig with lipstick on it coming out the other side. Data quality is really critical, but having said that, you should also not jump to just getting a data quality tool. A lot of the integration programs out there have a clearly sufficient level of data cleansing and data harmonization tools built in. Make that call carefully: do you need a full [inaudible 00:08:44] data quality solution or not?
Lastly, the frequency of the API upgrades. Classic example, salesforce.com rolls out three releases a year, typically. In fact, that's the model that most applications have today, two to four releases a year. That was unheard of in the installed software world. If you look at, I don't know, IBM, they'll have one release a year, one release every 12 months. Compare that to three to four releases coming out every year from these cloud vendors. That's definitely something that you need to factor in. OK, let's keep moving forward. I said that the first preference for integration should be the API. As we look at that option, what if there is no decent API that exists? Then what you need to do is look to see if there are any scripts available. Be careful about that, because usually a lot of [inaudible 00:09:41] scripts are not available as part of the [inaudible 00:09:45] product. You have a good opportunity: you should call up tech support, and maybe they can help you get some scripts which are not part of the [inaudible 00:09:57] product.
Most of the time some kind of script, data extraction tools or reports would exist. Definitely look for those on [inaudible 00:10:06], etc. and find those. Those are still all what I would call front door accesses. What happens if there is no front door, or nobody's answering at the front door, there is no API? Let's go around the back. Let's go to the persistence layer, the database most of the time. As we go to that also, we realize that a lot of applications tend not to have their schemas exposed. Yeah, you have access to the database, but if they have not documented it, that's a big sign that they don't want you to access the back end database. If you do, then you will not get support from them. In which case, your best friend is probably your own in-house DBA. Really, that is one approach that you can plug into.
Like I mentioned earlier, the frequency of the data is also very important, especially when there's streaming data coming in. By its nature, streaming or real-time data tends to be communicated as deltas and other [inaudible 00:11:06] batches. The deltas have to be dealt with differently than a full load, and the deltas vary: you have deltas which are record level deltas and then you have ones which are C level. There are different ways to handle those. There are different technologies which understand those. What do we do? Get to the topic. No surprises here, but yeah. We cloudify the data. You should cloudify your data. Yes, give your on premise data the same characteristics you have in your browser. Just like any browser running on any operating system on pretty much any device these days can access any web page, your on premise data should be cloudified similarly.
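To make the earlier point about deltas versus a full load concrete, here is a minimal Python sketch. The record shape (`id`, `name`) and the `op` values `upsert` and `delete` are assumptions chosen for illustration; only record-level deltas are shown, since the talk does not spell out the other kind.

```python
def full_load(snapshot):
    """Full load: the target is rebuilt from the complete snapshot."""
    return {row["id"]: dict(row) for row in snapshot}


def apply_record_deltas(target, deltas):
    """Record-level deltas: each delta upserts or deletes one whole record.

    Returns a new dict and leaves `target` untouched.
    """
    result = {key: dict(row) for key, row in target.items()}
    for delta in deltas:
        if delta["op"] == "delete":
            result.pop(delta["id"], None)
        elif delta["op"] == "upsert":
            result[delta["id"]] = dict(delta["record"])
        else:
            raise ValueError(f"unknown delta op: {delta['op']!r}")
    return result


target = full_load([{"id": 1, "name": "Widget"}, {"id": 2, "name": "Gadget"}])
updated = apply_record_deltas(
    target,
    [
        {"op": "upsert", "id": 2, "record": {"id": 2, "name": "Gadget v2"}},
        {"op": "delete", "id": 1},
        {"op": "upsert", "id": 3, "record": {"id": 3, "name": "Gizmo"}},
    ],
)
```

A full load can simply overwrite the target, while deltas must be applied in order against the existing state, which is why the two need different handling.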
What I mean is: make your on premise data accessible from pretty much anywhere, from any calling program, and respond back in any format. Do not force constraints like programming language, data formats, or software stacks or hardware stacks onto [inaudible 00:12:10] program. Cloudify, what does that mean? The first step we're talking about here today is really the REST interface. We learned that cloudification is a good idea, and now we're saying that we are going to go ahead and put a REST API front end on the on premise data.
Let me repeat that. If you expose the data from your on premise applications with a REST API, in essence you have cloudified the on premise data. That's really the gist of the webinar in terms of what we're trying to do. Next, you've decided you want to cloudify your data. Now you have to pick the right tool. Like I said, cloudification is closely related to your ability to make your data RESTful. Which means that you must pick a tool that specifically works well with the REST concepts as well.
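The idea of putting a REST front end on on-premise data can be sketched in a few lines of Python using only the standard library. This is an illustration under stated assumptions, not the tooling discussed in the webinar: the in-memory SQLite table stands in for a legacy application's database, and the `/products` path, column names and sample rows are invented.

```python
import json
import sqlite3
from http.server import BaseHTTPRequestHandler, HTTPServer


def make_legacy_db():
    """Stand-in for an on-premise application's database (in-memory SQLite)."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE products (sku TEXT PRIMARY KEY, name TEXT, price REAL)")
    db.executemany(
        "INSERT INTO products VALUES (?, ?, ?)",
        [("W-100", "Widget", 9.5), ("G-200", "Gadget", 24.0)],
    )
    db.commit()
    return db


def respond(db, path):
    """Map a request path to (status, content_type, body_bytes)."""
    if path != "/products":
        error = json.dumps({"error": "unknown resource"}).encode("utf-8")
        return 404, "application/json", error
    rows = db.execute("SELECT sku, name, price FROM products ORDER BY sku").fetchall()
    payload = [{"sku": sku, "name": name, "price": price} for sku, name, price in rows]
    return 200, "application/json", json.dumps(payload).encode("utf-8")


def make_handler(db):
    """Build an HTTP handler class bound to one database connection."""

    class ProductsHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            status, content_type, body = respond(db, self.path)
            self.send_response(status)
            self.send_header("Content-Type", content_type)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

    return ProductsHandler


def serve(port=8000):
    """Run the REST front end until interrupted (not called automatically)."""
    server = HTTPServer(("127.0.0.1", port), make_handler(make_legacy_db()))
    server.serve_forever()
```

Calling `serve()` and then requesting `http://127.0.0.1:8000/products` from any HTTP client would return the rows as JSON; the caller never sees the database or its driver.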
Next you should estimate your performance needs, because the tool that you pick needs to have that much horsepower. Can it scale horizontally by throwing more hardware at it, for example? Is there a breaking point that's lower than the peak load requirements that you have in your enterprise? Data formats. Very important not to pick a tool which is going to force you to work with a handful of data formats. Don't lock in with the CSV or XML format. For example, when you're talking REST you're mostly talking the JSON format. The RESTful architecture is inherently what's called a hypermedia architecture. If you read into it in more detail, you'll realize that you need to start supporting varying data formats. You'll have the same API call return different data formats depending on what a given program is asking for.
Next is maintenance. Earlier I said that most cloud [inaudible 00:14:08] what, three to four releases a year? What it means is that when your business finds out about it, they're going to ask you, you being the IT guy, for three to four upgrades a year as well. Pick the tool that can react to that high frequency of upgrades. If you're going to be waiting on your vendor to give you an upgraded connector or adapter, that is going to have a negative impact on your business. That's related to the previous point, that is, can you connect [inaudible 00:14:39] build? You cannot rely solely on the vendor of the data integration tools to provide you all the connectors. There is no vendor today that can provide 100% coverage. Your best bet is probably that a vendor has a developer community based on [inaudible 00:14:56] there, so that simple connectors can be developed either by yourself in-house, or by a developer community which is building those and sharing those.
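The point about one API call returning different formats depending on what the caller asks for can be sketched as follows. This is a simplified illustration: the sample rows are invented, and the `Accept` handling takes the first supported media type in the order listed and ignores quality values, which a full HTTP implementation would honor.

```python
import csv
import io
import json
import xml.etree.ElementTree as ET

# Invented sample data; values are strings so every format renders them alike.
PRODUCTS = [
    {"sku": "W-100", "name": "Widget", "price": "9.50"},
    {"sku": "G-200", "name": "Gadget", "price": "24.00"},
]


def to_json(rows):
    return json.dumps(rows)


def to_xml(rows):
    root = ET.Element("products")
    for row in rows:
        item = ET.SubElement(root, "product")
        for key, value in row.items():
            ET.SubElement(item, key).text = value
    return ET.tostring(root, encoding="unicode")


def to_csv(rows):
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


RENDERERS = {
    "application/json": to_json,
    "application/xml": to_xml,
    "text/csv": to_csv,
}


def render(rows, accept="application/json"):
    """Return (content_type, body) for the first supported type in `accept`."""
    for part in accept.split(","):
        media_type = part.split(";")[0].strip()
        if media_type in RENDERERS:
            return media_type, RENDERERS[media_type](rows)
    # Fall back to JSON when nothing the caller asked for is supported.
    return "application/json", to_json(rows)
```

For example, `render(PRODUCTS, "text/csv")` and `render(PRODUCTS, "application/xml")` serve the same rows to a spreadsheet user and to an XML-based partner system without a second integration.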
Now that you've seen what to consider for your selection of your cloudification tools, let's go forth and conquer. Cloudify your on premise data, but [inaudible 00:15:19] piecemeal, that's something that we always do. I was recently [inaudible 00:15:25] to a potential customer, and they had 50 to 60 on premise applications that are their target for cloudification. I talked to them, and I found out that only about 12 were really high priority. Yeah, take the piecemeal approach, understand your priorities from a business angle, from a technical angle, and then take small bites.
Always look at all the integration options. Remember, what we were just talking about is that you have all these options: going through APIs, databases, using scripts. Look at all those options and weigh them in terms of time, complexity and investment to see which is the best option to go for. You have to pick the right tools. Based on the checklist that we just had in the previous slide, just be conscious that your tool has to address not just today's requirements, but you should be able to evaluate the tool based on your foreseeable future requirements also. To be more specific, when I talk to enterprises, most say, "Hey, we have SOAP today, and therefore we have to have some kind of SOAP connectivity, but where we're really going is REST." We'll see that trend with the data centers as well.
OK. Then after that, really just let the fun begin. Treat your on premise data like SSD, [inaudible 00:16:45], website commerce, [inaudible 00:16:46], as if the data was in [inaudible 00:16:50] applications. Make a REST call from anywhere. Enjoy your freedom. Lose your shackles with specific languages, C API, Java API, nothing like that. You're totally independent of what software you pick, what calling application you pick, because the data on your on premise applications, [inaudible 00:17:09], etc., has been cloudified. Remember this is really a big achievement in that sense, not how much effort will go in, but you realize once you've done this, accomplished this, it opens up so many more opportunities for businesses also to leverage the valuable data which is not [inaudible 00:17:29] applications. I hope by this time you have some sense of why it's important and what options are available. I think it's a good time for Abhi to walk us through a short demo and demonstrate how we can do this.
Abhi: Thank you, Zeb. Let me share my screen. All right. Let's talk about, can everybody hear me?
Zeb: We can hear you fine here.
Abhi: Perfect. Thank you, Zeb.
Does this scenario sound familiar? Your organization has a whole bunch of on premise applications, and you're slowly looking into cloud as a strategy. You now have a healthy mix of on premise applications and cloud applications. The big challenge, of course, is integrating these two sets of applications. Even though you have applications spread across the cloud as well as on premise, your organization is one entity, one company. You need to integrate the data spread all across the cloud.
You also need to give controlled access to your data to your business partners. Does that sound like something that you have come across? That was exactly the scenario that one of our clients was recently in. We worked with them, and the scenario that they were facing was that they had a legacy ERP system, and in that ERP system, they had information about all the products, all the widgets that they offered. As a part of the new cloud strategy, they were exploring building a new application using one of the more modern cloud platforms. This modern cloud platform application needed to have access to data in their ERP system.
What's the solution here? We had them cloudify the data. What did our client want? We worked with the features that the client was looking for in the integration platform. They were looking for support, they were looking to build integrations easily. Their big requirement was integration had to be a snap: integration shouldn't be a pain, integration shouldn't be a big overhead. You need to be able to deliver it quickly. That's number one. Number two is that being built quickly doesn't mean that it shouldn't be built to last. Even though it's built very quickly, it should be very maintainable, it should be extensible, and it should be able to support not just the requirements that are immediately known, but it should also support multiple output formats like XML, JSON and so on.
What we did for them was we built them what you have in front of you, a version of the integration that we built for them. We built a very, very simple integration for them that essentially read the data, applied some small filters and presented that data out as a well formatted XML. This is actual data that we are looking at. What you're looking at is the integration built completely using the browser, no special tools needed. Built completely in the browser, in the cloud. This is a SnapLogic server sitting in the cloud, accessible from everywhere.
This is the list of the products. Of course the data has been changed to make sure we don't [inaudible 00:21:36], but this is representative data of all the products. You can see that here, I can, on the fly, represent the data as JSON. If need be, on the fly I can represent the data as HTML. Of course it's quite easy to apply CSS on top of this data and format it according to the standards and the needs and the template. If need be, I can have the entire file be downloaded as a CSV that can then be loaded up with Excel. This is actually a great tool to very easily allow the entire data set to be downloaded. The important thing to note here is that all the features we saw are really being delivered by just this one simple integration. There's just one simple integration, built very, very quickly, that delivered all of these features. Any questions? Zeb, Tony, back to you.
Tony: Thanks, Abhi. I think Zeb will, if we can get his screen up, conclude his presentation and move on to some questions from the group after that.
Zeb: OK. Thank you. First of all, thanks Abhi, thank you for a really nice demo. That was really sweet. There's a scenario also, I know somebody pinged on the chat asking for the scenario that you were just walking us through. We will be sharing the entire presentation, so you will have that scenario written out for you to view as well. In the interest of time, I'll quickly skip that, since you've seen the demo, and go toward our computing kind of scenario, which I think is very relevant.
It's the brief history of integration. Because I looked at a few chat messages and people are asking questions like how do we differentiate from ESB, how do we work with ESB, because they also claim to cloudify the data, I think it's worth going through this. What happened in 1990, more like the mid 1990s, is that we started seeing all these tools coming out in the market like ESBs, ETL, EAI, which were doing an awesome job in terms of integrating data, integrating from on premise applications like ERP, e-commerce, PLM, CLM, all of those.
There were some distinct [inaudible 00:24:08] of this data. The data was mostly batched jobs, running at prescheduled times, and was structured data. All of these tools, ETL, ESB, etc. were built to work with that kind of data. That was fine, it still does work really well. But what happened when we moved into the enterprise cloud space? In 2000, companies like salesforce.com, [inaudible 00:24:35], they started being recognized as true players in the enterprise space. When we look closely at the enterprise cloud applications, we see that business loves it. Mostly because it requires little to no IT involvement, users could go sign up, pay a monthly subscription fee, everybody was happy.
Hey, is everybody really happy? That's the question, because now you have your data sitting out in silos, with either no or very limited connectivity across those applications. How can you be happy? IT is not happy, business is not happy. Now you have to manage integration between your cloud instances, SaaS applications, and your on premise applications. Then we move forward to let's say 2005, 2006, and even a little before that actually. I think the Amazon.com consumer review was really the first true crowdsourcing implementation that started providing a lot of insight not just to Amazon, but also to the general public as well.
Then we have more traditional, more talked about giants like Facebook, Twitter. Look at their data. What's happening here is this data is distinctly different from the data we're used to, that ETL is used to handling, that ESB is used to handling. There are three Vs that come out as challenges. The volume is huge. The velocity: it's streaming data, people are blogging, writing on Facebook in real time. The value really comes out when you process that in almost real time as well. Then the variety, the unstructured data. It's not a "first name, 20 characters long" kind of field anymore.
From that, we move and we come into the more recent past. [inaudible 00:26:27] IPO, right? Everybody talks about big data. Although I personally get involved in work like [inaudible 00:26:34], now known as [inaudible 00:26:36], RFID tags were being built and talked about, and Apple was going on to enforce those tags back in 2004, 2005.
In fact, Wal-mart had this 2005 date where they wanted all of their vendors to conform to that. The point is that big data was being generated. Big data was being generated, it was being promoted, but it was really Hadoop that was the game changer. Hadoop came along and the barrier to entry was so low that everybody started capturing big data. Pretty much any enterprise that you talk to today said yeah, we have at least one big data project going on. Then you ask the second question, what are you doing with your big data? They're like, 'Give us ideas.'
I'm not kidding. This is what's happening pretty much everywhere, and we're all guilty of that. The point is that, why are you asking this question? Big data by itself gives some insight, but unless you connect the dots, take it back to the on premise applications, your SaaS applications, all the other things in your ecosystem, it does not give you a complete picture. What's the requirement? The requirement is that you need to have data shared across applications, and you do that. You do that using the tools that you have available. You have ETL, ESB, and coding preferences as well, and you connect everything. Everything looks good?
I don't think it looks good, because what you have is a bunch of integrations without any immediate reaction. Whenever you throw something new into your enterprise, the IT guy is like, "Hey, I have to go and build a new integration." He's really not happy. You're talking to business and he's like, "I asked for this thing two months ago and now I'm getting the integration. By now I've changed my requirements." IT's not happy, business is not happy. Where are we going with this? I say no to these kinds of tools when you're going to cross different styles of data, different locations of the data. Although [inaudible 00:28:32] ETL, EAI, ESB, everything still works really well for what it was built to do. When you look at that first block, it should still be there. When you're going across, do not use this, please. Please, there are better options out there.
Coming back to the topic. Cloudify, use a REST API as a layer around your on premise data. Really use that as the prime interface for anything that you have [inaudible 00:28:59], because it's [inaudible 00:29:00] enterprise cloud, whether it's a consumer cloud, you're in big data. I understand big data mostly today is actually on premise, social data in the cloud.
The point is that you support it as REST, and whether it's on premise or in the cloud, that's how you get the data. Then everybody's happy. This is a good solution. IT guy's happy: "I've cloudified my data, I don't have to re-engineer every time somebody throws a new system at me." The business is so happy. They have this valuable data that they've been generating for the past ten years, 15 years, 20 years, in ERP, CRM, MDM, that's now accessible. They can connect it, they can share it. Share your data. Unlock your data from these on premise systems, bring it to your cloud environment, cloudify yourself. That's really the gist of it, and I think we're pretty much on time on that as well. I'm going to wrap up there and see if we can answer maybe, Tony, a couple of questions?
Tony: Yeah. We're going over just a bit. Everyone is welcome to hang on a little while longer. We've got just a few questions here. Feel free to pass along some questions through chat, and hopefully we'll be able to get to those. I have one question here, Zeb. It says that basically you emphasized REST. Isn't SOAP another viable mechanism to cloudify legacy applications?
Zeb: Sure, sure, absolutely. SOAP is definitely another way, and I did mention actually in the beginning of the webinar that REST is one way to do it. SOAP is definitely another alternative. Really, there is a whole debate going on with REST vs. SOAP. Do your research, you'll be convinced that REST is the way to go.
Just as two points, the two big giants, salesforce.com in the enterprise space started off with SOAP. Then they have REST. If you go and talk to them, they are heavily promoting REST. Although they say a lot of customers are locked into their SOAP API. You move to the consumer space, look at Twitter. Started out with SOAP, now talking about REST. Not just talking about REST, they have REST. That is a trend, and there's a reason for it, but yes, SOAP is another alternative that you can look into.
Tony: OK. Thanks, Zeb. One more question here. Most ETL tool vendors like IBM, Informatica and TIBCO offer connectors or adapters for on premise as well as software as a service applications. This person's company already has licenses for these technologies. Why should they bring yet another integration technology into the mix?
Zeb: Right, absolutely. All those names mentioned, they do have ETL tools, they've had those for a long time. Like I said, it costs a lot more money, it takes a lot more effort. These tools were really built for structured data, they were built for scheduled jobs, batch jobs. That's where their horsepower is, that's where the value is. They work really well there. Just to give you an example, most of them will have connectors or adapters which are upgraded once a year. Again, going back to my example of Salesforce, which will have three to four releases a year, how can they possibly keep up with it?
Inherently these technologies become band-aid approaches. You might be able to solve a problem for the next few months I would say, not even a few years, and then you would have to go with something modern which has a RESTful architecture [inaudible 00:32:31] architecture. The point is: make your investment now. Look into the foreseeable future and make a decision based on that, and not just because you have something in house you want to use.
Tony: Abhi, is there anything that you'd like to add to this question?
Abhi: No, not to this question. I'll ask for something later on. Thank you, Tony.
Tony: OK. OK. Just wanted to make sure we include both you guys. We have a third question here. Let's see. Let's say I built a REST API interface for my on prem ERP, and I decide to move it to a private cloud later, for example Rackspace. What happens to my integrations?
Zeb: Sure. This is actually a very good question. Feels like [inaudible 00:33:18] ask that question. Yeah, seriously, this is [inaudible 00:33:22]. You are accessing everything through a URL. What happens is when you physically move that box, your data, from on premise into let's say a private cloud like Rackspace that you mentioned, or Amazon, really, the calling applications should not be impacted. What they're calling is the URL. As long as they're calling the URL, everything should work fine. In reality you would have to take care of a few things like the firewall, the ports. It will be more a matter of some configuration that you have to take care of, but your integration should not have to be rebuilt. You should not have to redo any of that work, not at all.
Abhi: To add to that, in fact we have quite a few clients who have, say, test systems in the private cloud and production systems in the public cloud, or vice-versa. These kinds of integrations can easily be transferred from private cloud to public cloud. We have done that. It's not a big deal at all.
Tony: OK. Guys, I think we're out of time here. It looks like we still have a lot of great questions coming in through chat. Please feel free to forward these to us, and Zeb and Abhi will get to answering those. Definitely a lot of great questions here, guys. We want to follow up with each of you. Feel free to get in touch.
Abhi: Thanks, Tony. We'll wrap up here. Thank you all for joining, please feel free to share the content of this presentation. We will be sending out a link to the presentation on Prezi. We will also make the recording available, so the presentation will be without the voice over, and the recording will have that. Feel free to use it. You can even post your comments there. We were happy to prepare content for this presentation and I hope that you were happy to join in and listen in and got some value out of it.
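One way to picture why a move to a private cloud need not touch the integrations: if callers take the base URL from configuration, relocating the service becomes a configuration change. In this Python sketch the environment variable name `ERP_API_BASE_URL` and both host names are hypothetical, chosen only for illustration.

```python
import os
import urllib.parse

# Hypothetical on-premise address used when no override is configured.
DEFAULT_BASE_URL = "http://erp.internal.example.com:8080/api/"


def products_url(environ=os.environ):
    """Build the products endpoint from a configurable base URL."""
    base = environ.get("ERP_API_BASE_URL", DEFAULT_BASE_URL)
    # The trailing slash on the base keeps urljoin from dropping "api/".
    return urllib.parse.urljoin(base, "products")


on_prem = products_url({})
after_move = products_url({"ERP_API_BASE_URL": "https://erp.cloud.example.com/api/"})
print(on_prem)
print(after_move)
```

The calling code is identical before and after the move; firewall rules and ports still have to be handled separately, as noted in the answer above.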
Tony: Questions can be sent to Zeb at SnapLogic, or you can actually hit us up through our Twitter account, @SnapLogic on Twitter. Thank you all very much.
Zeb: Thank you, bye bye.