SnapLogic Live: Spark Data Pipelines and Strata+Hadoop World

This is a big month for big data. With an announcement around Spark in our recent Winter 2016 release and Strata+Hadoop World coming up in a few weeks in San Jose, big data is on our minds and Spark Data Pipelines will be the focus of next week’s SnapLogic Live demo session.

SnapLogic Live - Spark Data Pipelines Continue reading “SnapLogic Live: Spark Data Pipelines and Strata+Hadoop World”

SnapLogic on the Radar with MWD Advisors

MWD AdvisorsSnapLogic was recently reviewed by advisory firm MWD Advisors for our efforts to reinvent integration platform technology by creating one unified platform that can address many different kinds of application and data integration use cases.

A few highlights from the report:

  • An in-depth look at our multi-tenant, AWS-hosted platform which includes the SnapLogic Designer, Manager and Dashboard
  • The Snaplex execution environment; namely, the Cloudplex, Groundplex and Hadooplex
  • 3 competitive differentiators – deployment flexibility, a unified approach across multiple integration scenario types and both scalability and adaptability

Read the full review here, or take a look below. You can also check out other SnapLogic reviews on our website.

MWD Advisors is a specialist advisory firm providing practical industry insights to business leaders and technology professionals working to drive change with the help of digital technology.

Collaborations in Building Hybrid Cloud Computing and Data Integrations

Post first published by Ravi Dharnikota on LinkedIn.

It’s one thing to create application and data integrations; it’s an even bigger challenge to collaborate with other teams in the enterprise to reuse and repurpose and standardize on what has already been built.

The need for seamless content collaboration is a key ingredient for overall success in app and data integrations, just as it is in app development and delivery. A platform that allows for easy sharing of information between employees is the different between a platform’s adoption throughout the enterprise or becoming shelf-ware. Continue reading “Collaborations in Building Hybrid Cloud Computing and Data Integrations”

Webinar: Building the Enterprise Data Lake with Mark Madsen

DataLake_Webinar_Events_Page_Graphic“Without focusing on the data architecture of the lake, you will build a swamp with nice code.”

— Mark Madsen, President and Analyst, Third Nature

Join us for a live webinar on Thursday, December 8th to hear from industry analyst and big data thought leader Mark Madsen about the future of big data and the importance of the new Enterprise Data Lake reference architecture. Mark will be joined by Craig Stewart, SnapLogic’s Senior Director of Product Management, and Erin Curtis, SnapLogic’s Senior Director of Product Marketing.

As Mark says, “Building a data lake requires thinking about the capabilities needed in the system. This is a bigger problem than just installing and using open source projects on top of Hadoop. Just as data integration is the foundation of the data warehouse, an end-to-end data processing capability is the core of the data lake. The new environment needs a new workhorse.”

This webinar will cover:

  • What’s important when building a modern, multi-use data infrastructure and why the field of dreams approach doesn’t work
  • The difference between a Hadoop application vs. data lake infrastructure
  • An enterprise data lake reference architecture to get you started

Craig and Erin will also discuss how SnapLogic’s Elastic Integration Platform powers the new enterprise data lake reference architecture and some of the benefits of a modern data integration solution.

Who should join:

  • Enterprise architects
  • Data warehouse managers
  • Chief data officers
  • Business intelligence practitioners
  • Data integration managers
  • Anyone building or considering the new hybrid data architecture

Register here - we look forward to seeing you online on Tuesday, December 8th!


Mark MadsenAbout Mark Madsen

Mark Madsen, president and founder of Third Nature, is a well-known consultant and industry analyst. Mark frequently speaks at conferences and seminars in the US and Europe and writes for a number of leading industry publications. Mark is a former CTO andCIO with experience working for both IT and vendors, including a stint at a company used as a Harvard Business School case study. Over the past decade Mark has received awards for his work in analytics, business intelligence and data strategy from the american Productivity and Quality Center, the Smithsonian Institute and TDWI. He is co-author of several books and lectures and writes frequently on analytics and data topics.


Re-Thinking Data and App Integration With Elastic iPaaS

A few months ago we published a series about the new hybrid cloud and big data integration requirements. Here’s an update:

Elastic IntegrationTraditional approaches to data and application integration are being re-imagined thanks to:

  1. Enterprise “cloudification”: Cloud expansion has hit a tipping point and most IT organizations are either running to keep up or trying to get ahead of the pace of this transformation; and
  2. The need for speed: Cloud adoption and big data proliferation have led to business expectations for self-service and real-time data delivery.

As a result, the concept of integration platform as a service (iPaaS) has gained momentum with enterprise IT organizations who need to connect data, applications, and APIs faster. Typical iPaaS requirements include: an easier- to-use user experience, metadata-driven integrations, pre-built connectivity without coding, data transformation and other ETL operations, and support for hybrid deployments. Here are four additional iPaaS requirements that cannot be ignored.

  1. Loose Coupling to Manage Change: It is now expected to respond to changing business requirements immediately. These changes result in data changes that impact the integration layer. For example, a new column is added to a table, or a field to an API, to record or deliver additional information. Last generation ETL tools are strongly typed, requiring the developer to define the exact data structures that will be passing through integrations while designing them. Any departure from this structure results in the integration breaking because additional fields are not recognized. This brittle approach can bring today’s agile enterprise to its knees. The right iPaaS solution must be resilient enough to handle frequent updates and variations in stride. Look for “loose coupling” and a JSON-centric approach that doesn’t require rigid dependency on a pre-defined schema. The result is maximum re-use and the flexibly you need for integrations to continue to run even as endpoint data definitions change over time.
  2. Platform Architecture Matters: Your integration layer must seamlessly transition from connecting on-premises systems to cloud systems (and vice versa) while still ensuring a high degree of business continuity. Many legacy data integration vendors “cloud wash” their solutions by simply hosting their software, or by providing only some aspects of their solution as a multi-tenant cloud service. Some require on-premises ETL or ESB technologies for advanced integration development and administration. When looking at a hybrid cloud integration solution, look under the hood to ensure there’s more than a legacy “agent” running behind the firewall. Look for elastic scale and the ability to handle modern big (and small) data volume, variety, and velocity. And ensure that your iPaaS “respects data gravity” by running as close to the data as necessary, regardless of where it resides.
  3. Integration Innovation: Many enterprise IT organizations are still running old, unsupported versions of integration software because of the fear of upgrades and the mindset of “if it ain’t broke, don’t fix it.” Cumbersome manual upgrades of on-premises installations are error-prone and result in significant re-development, testing cycles, and downtime. The bigger the implementation, the bigger the upgrade challenge—and connector libraries can be equally painful. Modern iPaaS customers expect the vendor to shield them from as much upgrade complexity as possible. They are increasingly moving away from developer-centric desktop IDEs. Instead, they want self service—browser-based designers for building integrations, and automatic access to the latest and greatest functionality.
  4. Future Proofing: Many IT organizations are facing the Integrator’s Dilemma, where their legacy data and application integration technologies were built for last decade’s requirements and can no longer keep up. In order to be able to handle the new social, mobile, analytics, cloud, and Internet of Things (SMACT) requirements, a modern iPaaS must deliver elastic scale that expands and contracts its compute capacity to handle variable workloads. A hybrid cloud integration platform should move data in a lightweight format and add minimal overhead; JSON is regarded as that compact format of choice when compared to XML. A modern iPaaS should also be able to handle REST-based streaming APIs to continuously feed into an analytics infrastructure, whether it’s Hadoop, a cloud-based or traditional data warehouse environment. With iPaaS, data and application integration technologies are being re-imagined so don’t let legacy, segregated approaches be a barrier to enterprise cloud and big data success. Cloud applications like Salesforce and Workday continue to fuel worldwide software growth, while infrastructure as a service (IaaS) and platform as a service (PaaS) providers offer customers the flexibility to build up systems and tear them down in short cycles.


The History of Middleware

This post originally appeared on Glenn Donovon’s blog.

I’ve recently decided to take a hard look at cloud iPaaS (integration platform as a service) and in particular, SnapLogic due to a good friend of mine joining them to help build their named account program. It’s an interesting platform which I think has the potential to help IT with the “last mile” of cloud build-out in the enterprise, not just due its features, but rather because of the shift in software engineering and design occurring that started in places like Google, Amazon and Netflix – and startups that couldn’t afford and “enterprise technology stack” – and is now making its way into the enterprise.

However, while discussing this with my friend, it became clear that one has to understand the history of integration servers, EAI, SOA, ESB, and WS standards to put into context the lessons that have been learned along the way regarding actual IT agility. But let me qualify myself before we jump in. My POV is based on being an enterprise tech sales rep who sold low latency middleware, and standards based middleware, EAI, a SOA grid-messaging bus as well as applications like CRM, BPM and firm-wide market/credit risk management, which have massive system integration requirements. While I did some university and goofing off coding in my life (did some small biz database consulting for a while), I’m not an architect, coder or even a systems analyst. I saw up close and personal why these technologies were purchased, how they evolved over time, what clients got out of them and how all our plans and aspirations played out. So, I’m trying to put together a narrative here that connects the dots for people other than middleware developers and CTO/Architect types. Okay, that said, buckle up.

The Terrain
Let’s define some terms – middleware is a generic a term – I’m using it to refer to message buses/ESBs and integration servers (EAI). The history of those domains led us to our current state and helps make clear where we need to go next, and why the old way of doing systems integration and building SOA is falling away in favor of RESTful web services micro-services based design in general.

Integration servers – whether from IBM or challengers in those days like SeeBeyond (who I worked for), the point of an integration server was to stop the practice of hand writing code for each system/project to access data/systems. These were often referred to as “point to point” integrations in those days, and when building complex systems in large enterprises before integration servers, the data flows between systems often looked like a plate of spaghetti. One enterprise market risk project I worked on at Bank of New York always comes to mind when I think of this. The data flows of over 100 integration points from which the risk system consumed data had been laid out in one diagram and we actually laminated it on 11×17 paper, making a sort of placemat out of it. It became symbolic of this type of problem for me. It looked kind of like this:
integrationimageJohnSchmidt(Image attribution to John Schmidt, recognized authority in the integration governance field and author of books on Integration Competency Centre and Lean Integration)

So, along came the “integration server”. The purpose was to provide a common interface and tools to use to connect systems without writing code from scratch so integrations between systems would be easy to build and maintain, while also being secure and performing well. Another major motivation was to loosely couple target systems and data consuming systems to isolate both from changes in the underlying code in either. The resulting service was available essentially as an API, they were proprietary systems, and in a sense they were ultimately black boxes from which you consumed or contributed data. They did the transforms, managed traffic, loaded the data efficiently etc. Of course, there were also those dedicated to bulk/batch data as well such as Informatica and later on Ab Initio, but it’s funny, these two very related worlds never came together into a unified offering successfully.

This approach didn’t change how software systems were designed, developed or deployed in a radical way though. Rather, the point was to isolate integration requirements and do them better via a dedicated server and toolkit. This approach delivered a lot of value to enterprises in terms of building robust systems integrations, and also helped large firms adopt packaged software with a lot less code writing, but in the end it offered only marginal improvements in development efficiency, while injecting yet another system dependency (new tool, specialized staff, ongoing operations costs) into the “stack”. Sure you could build competency centers and “factories” but to this day, such approaches end up creating more bureaucracy, more dependencies and complexity while adding less and less value compared to what a developer can build him/herself with RESTful services/micro-services, and most things one wants to integrate with today already have well defined APIs so it’s often much easier to connect and share data anyway. Innovative ideas like “eventually consistent data” and the incredible cost advantages and computing advantages of open source versus proprietary technologies, in addition to public IaaS welcoming such computing workloads at the container level, well let’s just say there is much more changing than just integration. Smart people I talk to tell me that using an ESB as the backbone of a micro-services based system is contrary to the architecture.

Messaging bus/ESB – This is a system for managing messaging traffic on a general purpose bus from which a developer can publish, subscribe, broadcast and multi-cast messages from target and source services/systems. This platform predates the web, fyi, and was present in the late ’80s and ’90s in specialized high speed messaging systems for trading floors, as well as in manufacturing and other industrial settings where low latency was crucial and huge traffic spikes were the norm. Tibco’s early success in trading rooms was due to the fact they provided a services based architecture for consuming market data which scaled and also allowed them to build systems using services/message bus design. These allowed such systems to approach the near-real time requirements of such environments, but they were proprietary, hugely expensive and not at all applicable for general purpose computing. However, using a service and bus architecture with pub/sub, broadcast, multi-point and point to point communications available as services for developers was terrific for isolating apps from data dependencies and dealing with the traffic spikes of such data.

Over time, it became clear that building systems out of services had much general promise (inheriting basic ideas from object oriented design actually), and that’s when the idea of Web Services began to emerge. Entire standards bodies were formed and many open source and other projects spun off to support the standards based approach to web services system design and deployment. Service Oriented Architectures became all the rage – I worked on one project for Travelers Insurance while at SeeBeyond where we exposed many of their mainframe applications as web services. This approach was supposed to unleash agility in IT development.

But along the way, a problem became obvious. The cost, expertise, complexity and time involved in building such elegantly designed and governed systems frameworks ran counter to building systems fast. A good developer could get something done that worked and was high quality without resorting to using all those WS standardized services and conforming to its structure. Central problems included using XML to represent data because the processing power necessary for transforming these structures was always way too expensive for the payoff. SOAP was also a highly specialized protocol, injecting yet another layer of systems/knowledge/dependency/complexity to building systems with a high degree of integration.

The entire WS framework was too formalized, so enterprising developers said, ‘how could I use a service based architecture to my advantage when building a system without relying on all of that nonsense? This is where RESTful web services come in, and then JSON, which changes everything. Suddenly, new, complex and sophisticated web services began to be callable just by HTTP. Creating and using such services became trivially easy in comparison to the old way. Performance and security could be managed in other ways. What was lost was the idea of operating with “open systems” standards from a systems design standpoint, but it turns out that it was less valuable in practice given all the overhead.

The IT View
IT leadership already suspects that the messaging bus frameworks did not give them the most important benefit they were seeking in the first place – agility – yet they have all this messaging infrastructure and all these incredibly well behaved web services running. In a way, you can see this all as part of how IT organizations are learning what real agility looks like when using services based architectures. I think many are ready to dump all that highly complex and expensive overhead which came along with messaging buses when an enterprise class platform comes along that enables them to do so.

But IT still loves integration servers. I think they are eager for a legit, hybrid-cloud based integration server (iPaaS) that gives them an easy way to build and maintain interfaces for various projects at lower cost than on ESB-based solutions while running in the cloud and on-prem. It will need to provide the benefits of a service based architecture – contractual level relationships – without the complexity/overhead of messaging buses. It needs to leverage the flexibility of JSON while providing a meta-data level control over the services, along with a comprehensive operational governance and management framework. It also needs to scale massively, and be distributed in design in order to even be considered for this central role in enterprise systems.

The Real Driver of the Cloud
What gets lost sometimes with all of our technical focus is that economics are what is mostly driving cloud adoption. With the massive cost differentials opening up in public computing versus on-prem or even outsourced data centers due to the tens of billions IBM, Google, MSFT, AWS and others are pouring into their services, the public cloud will become the platform of choice for all computing going forward. All that’s up for debate inside any good IT organization is how long this transition is going to take.

Many large IT organizations clearly see the tipping point cost-wise is at hand already with regard to IaaS and PaaS. This has happened in lots of cloud markets already – Salesforce didn’t really crush Siebel until its annual subscription fees were less than Siebel’s on prem maintenance fees. Remember, there was no functional advantage in Salesforce over Siebel. Hint: Since the economics are the driver, it’s not about having the most elegant solution rather than being able to actually do the job somehow, even if not pretty or involving tradeoffs.

Deeper hint: Give a good systems engineer a choice between navigating a nest of tools and standards and services, along with all the performance penalties, overhead, functional compromises and dependencies they bring along, versus having to write a bit more code and being mindful of systems management, and she/he will choose the latter every time. Another way of saying this is that complexity and abstraction are very expensive when building systems and the benefits have to be really large for the tradeoff to make sense, and I don’t believe that in the end, ESBs and WS standards for the backbone of SOA ever paid off in the way it needed to. And never will. The industry paid a lot to learn that lesson, yet it seems that there is a big reluctance to discuss this openly. The lesson? Simplicity = IT Agility.

The most important question is what core enterprise workloads can move to the public cloud feasibly, and that is still a work in progress. Platforms like Apprenda and other hybrid cloud solutions are crucial part of that mix as one has to be able to assign data and process to appropriate resources from a security/privacy and performance/cost POV. A future single “cloud” app built by an enterprise may have some of its data in servers on Azure in three different data centers to manage data residency, other data in on prem data stores, be accessing data from SaaS apps, and accessing still other data in super cheap simple data stores on AWS. Essentially, this is a “policy” issue and I’d say that if a hybrid iPaaS like Snaplogic can fit into those policies, and behave well with them and facilitate connecting these systems, it has a great shot at being a central part of the great rebuild of core IT systems we are going to watch happen over the next 10 years. This is because it provides the enterprise with what it needs, an integration server while leveraging RESTful services, micro-services based architectures and JSON without an ESB.

This is all coming together now, so you will see growing interest in throwing out the old integration server/message bus architectures in organizations focused on transformation and agility as core values. I think the leadership of most IT organizations are ready to leave the old WS standards stuff behind, along with bus/message/service architectures, but their own technical organizations are still built around them so this will take some time. It’s also true that message buses will not be eliminated entirely, as there is a place for messaging services in systems development – just not as the glue for SOA. And of course, low latency buses will still apply to building near-real time apps in say engineering settings or on trading floors, but using message buses as a general purpose design pattern will just fade from view.

The bottom line is that IT leaders are just as frustrated they didn’t get the agility out of building systems using messaging bus/SOA patterns as business sponsors are about the costs and latency of these systems. All the constituencies are eager for the next paradigm and now is the time to engage in dialog about this new vision. To my thinking, this is one of the last crucial parts of IT infrastructure which needs to be modernized and cloud enabled to move the enterprise computing workloads to the cloud, and I think SnapLogic has a great shot at helping enterprises realize this vision.


About the Author

Glenn Donovon has advised, consulted to or been an employee of 23 startups, and he has extensive experience working with enterprise B2B IT organizations. To learn more about Glenn, please visit his website.

Hybrid Cloud Integration for the Digital Era

A few years ago I wrote a post on the blog entitled: Is Hybrid the New Black in The Era of Cloud Computing? “Cloud First” wasn’t exactly a mainstream concept for enterprise IT organizations at the time, but “more comprehensive (and collaborative) cloud computing strategies” were being considered. It was clear that hybrid was going to be the reality for the foreseeable future.

Fast forward to mid-2015 and:

hybrid_cloud_integration_451Given the dramatic acceleration of cloud and big data adoption in the enterprise, how companies connect all of their data sources, applications and APIs is also going through a re-imagining. Today’s increasingly hybrid cloud infrastructure, self-service end-user requirements and the need for speed and agility are at the center of the transformation.

So what is hybrid cloud integration and why is it so important?

In a recent SnapLogic webinar (Hybrid Cloud Integration: Why it’s Different and Why it Matters), 451 industry analyst Carl Lehman defined Hybrid Cloud Integration from two perspectives:

  • Tactical: enables the data and process flows between any number and type of cloud services with any number and type of in-place IT systems
  • Strategic: enables cloud services to be dynamically consumed within IT architecture and exploited on demand for their price/performance and elasticity advantages

He referred to the hybrid cloud integration reference architecture as, “a blueprint of goals, practices, tools and techniques used for data and application integration across clouds.” He also referred to it as, “a framework for assembling the integration technology needed for hybrid IT.”

Clearly legacy ETL and ESB technologies and approaches are not going to deliver on the modern hybrid cloud integration requirements. Industry expert David Linthicum noted in the whitepaper, The Death of Traditional Data Integration:

“It’s not a matter of “if” we’re moving in new directions that will challenge your existing approaches to data integration, it’s “when.” Those who think they can sit on the sidelines and wait for their data integration technology provider to create the solutions they require will be very disappointed.”

Cloud-ComputingAccording to Carl Lehman, there are three waves of hybrid cloud integration:

  1. The first wave addressed on-premises to SaaS data-loading: little data quality management; accessibly and security entrusted to the SaaS providers
  2. The second wave occurs when SaaS deployments accelerate, data quality begins to matter, cross-functional processes exchange data among on-premises and SaaS systems (hybrid IT), and other applications are offloaded to IaaS and PaaS (workloads shift)
  3. The third wave occurs when big data becomes a real strategic initiative in an enterprise; value can be derived from intelligently managing the Internet of Things (IoT) – operational technology (OT) automation; and IT meets OT

You can download the slides here and listen to the recording here.

So what is the recommendation for getting started with a hybrid cloud integration strategy? 

Seek platforms that unify data and application integration functions to:

  1. Simplify tooling and enable consistency for development, execution and management
  2. Enable the second wave of cloud integration (expanded private cloud, hybrid IT and ITaaS)
  3.  Support the third wave of cloud integration (actionable intelligence from big data)