As the volume, velocity and variety of corporate data grow, extract, transform and load (ETL) projects become more complex, challenging and costly. Traditional ETL tools are point-to-point solutions, built around rows, columns and source-to-target column mappings, that struggle with ever-increasing volumes of real-time, unstructured and hierarchical data. These traditional and homegrown, low-level, relational-database-focused ETL solutions require expensive cycles of development and maintenance on static architectures that cannot handle an ever-evolving environment demanding agility and elasticity.
SnapLogic’s approach to data integration focuses on data streams between applications and is differentiated by its flexibility to connect both cloud and on-premise systems. With SnapLogic, “citizen integrators” (end users, SaaS application users, application developers, database administrators, business analysts, etc.) can take advantage of powerful parallel processing and management capabilities via the multi-tenant cloud Designer, which can handle data flows (called pipelines) from any number of sources to any number of destinations. And because SnapLogic is 100% REST-based, each pipeline is abstracted and addressable – usable, consumable, triggerable and schedulable as a REST call – able to do the job of many traditional static integrations with considerable advantage.
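Because each pipeline is addressable as a REST endpoint, any HTTP client can trigger it. The sketch below shows the general shape of such a call from Python's standard library; the URL, path and token are hypothetical placeholders for illustration, not actual SnapLogic endpoints, and the request is built but not sent:

```python
import json
import urllib.request

# Hypothetical triggered-pipeline URL and bearer token -- placeholders only,
# not real SnapLogic endpoints or credentials.
TASK_URL = "https://example.com/api/1/rest/feed/my-org/my-project/load_orders"
TOKEN = "example-bearer-token"

def build_trigger_request(url, token, payload):
    """Build (without sending) the POST request that would trigger a pipeline run."""
    body = json.dumps(payload).encode("utf-8")
    req = urllib.request.Request(url, data=body, method="POST")
    req.add_header("Authorization", f"Bearer {token}")
    req.add_header("Content-Type", "application/json")
    return req

# Any scheduler, webhook or script that can issue HTTP can now kick off the flow.
req = build_trigger_request(TASK_URL, TOKEN, {"since": "2016-01-01"})
```

Sending the request (for example with `urllib.request.urlopen`) is all it would take to run the pipeline, which is what makes each integration consumable by schedulers, webhooks and other applications alike.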
SnapLogic’s Elastic Integration Platform provides a rich set of functionality including a cloud-based, intuitive, drag-and-drop graphical integration orchestration console with important ETL features such as pipeline versioning, aggregation, joins, unions, splitting, slowly changing dimensions (SCDs) and scheduling in a multimodal integration platform. SnapLogic elastically scales up or down to accommodate workload demand using collections of JVMs (called Snaplexes) that can reside on-premise and/or in the cloud. Whether your requirement is event-based, real-time, streaming or scheduled batch-oriented data integration, SnapLogic’s modern, JSON-centric, RESTful platform has a clean advantage over traditional ETL tools that were built for last generation’s structured data management challenges.
As the number of available data sources expanded and new business insights became available, requests were made for additional views from data sources with longer histories. Legacy data transformation systems began struggling under the pressure of these growing data volumes. The concept of extract, load and transform (ELT) was introduced to alleviate this problem by loading higher volumes of raw data directly into a warehouse staging area, where it could be processed and transformed after loading.
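The ELT pattern can be illustrated with a minimal sketch: raw records land in a staging table untransformed, and the transformation runs inside the warehouse afterward. An in-memory SQLite database stands in for the warehouse here, and all table and column names are invented for the example:

```python
import sqlite3

# Stand-in "warehouse": the staging-then-transform pattern is the point, not the engine.
conn = sqlite3.connect(":memory:")

# 1. Extract + Load: raw records are loaded as-is into a staging table.
conn.execute("CREATE TABLE stg_orders (id INTEGER, amount_cents TEXT, region TEXT)")
raw = [(1, "1250", "us-east"), (2, "990", "eu-west"), (3, "40", "us-east")]
conn.executemany("INSERT INTO stg_orders VALUES (?, ?, ?)", raw)

# 2. Transform: cleansing (type casts, unit conversion) happens inside the
#    warehouse, after loading, using the warehouse's own SQL engine.
conn.execute("""
    CREATE TABLE orders AS
    SELECT id, CAST(amount_cents AS INTEGER) / 100.0 AS amount_usd, region
    FROM stg_orders
""")

# Downstream queries run against the transformed table.
totals = dict(conn.execute(
    "SELECT region, SUM(amount_usd) FROM orders GROUP BY region"
).fetchall())
```

The design choice is the key point: because raw data is already inside the warehouse, the transform step can be rerun or revised without re-extracting from the source systems.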
As real-time analytics and insights become increasingly vital for the business, data scientists and business analysts need more control over the entire data lifecycle. Adds, moves and changes in the integration cycle need to be easy to orchestrate, so that time is preserved for analytics and insights rather than consumed by plumbing. Accomplishing this requires a modern, agile and simplified approach to ETL/ELT.
SnapLogic’s modern approach to Hadoop ETL/ELT begins by moving the extraction process beyond structured data sets, allowing queries across disparate data types and structures and then streaming all data as JSON documents. Rather than traditional point-to-point data loading, SnapLogic’s horizontally scalable elastic pipeline provides powerful multi-point, multimodal integration while hiding the underlying complexity of data integration from the user. SnapLogic’s easy-to-use integration platform helps you quickly ingest, prepare and deliver big data in Hadoop or Spark environments, regardless of the data’s velocity, variety and volume. With SnapLogic, the data scientist/business analyst is fully empowered.
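The streaming-documents idea can be sketched in a few lines: each stage consumes and yields JSON-style documents, so stages compose into a pipeline regardless of the original source's shape. This is a conceptual illustration only, with invented stage names and fields, not SnapLogic code:

```python
import json

def extract(lines):
    """Source stage: parse newline-delimited JSON into a stream of documents."""
    for line in lines:
        yield json.loads(line)

def enrich(docs):
    """Transform stage: add a derived field to every document that flows through."""
    for doc in docs:
        doc["amount_usd"] = doc["amount_cents"] / 100.0
        yield doc

def keep(docs, predicate):
    """Filter stage: pass along only documents matching the predicate."""
    for doc in docs:
        if predicate(doc):
            yield doc

# Documents stream through the stages one at a time; nothing is buffered in full.
ndjson = ['{"id": 1, "amount_cents": 1250}', '{"id": 2, "amount_cents": 40}']
pipeline = keep(enrich(extract(ndjson)), lambda d: d["amount_usd"] >= 1.0)
results = list(pipeline)
```

Because every stage speaks the same document format, a new source or destination only needs to produce or consume documents; the stages in between are unchanged.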