Deep Dive into SnapLogic Winter 2017 Snaps Release

By Pavan Venkatesh

Data streams with Confluent and migration to Hadoop: In my previous blog post, I explained how future data movement trends will look. In this post, I’ll dig into some of the exciting things we announced as part of the Winter 2017 (4.8) Snaps release. This will also address future data movement trends for customers who want to move data to the cloud from different systems or migrate to Hadoop.

Major highlights in the Winter 2017 release (4.8) include:

  • Support of Confluent Kafka – A distributed messaging system for streaming data
  • Teradata to Hadoop – A quick and easy way to migrate data
  • Enhancements to the Teradata Snap Pack: On the TPT front, customers can quickly load/update/delete data in Teradata
  • The Redshift Multi-Execute Snap – Allows multiple statements to be sequentially executed, so customers can maintain business logic
  • Enhancements to the MongoDB Snap Pack (Delete and Update) and the DynamoDB Snap Pack (Delete and Delete-item)
  • Workday Read output enhancements – Now it’s easier for the downstream systems to consume
  • NetSuite Snap Pack improvements – Users can now submit asynchronous operations
  • Security feature enhancements – Including SSL for MongoDB Snap Pack and invalidating database connection pools when account properties are modified
  • Major performance improvement while writing to an S3 bucket using S3 File Writer – Users can now configure a buffer size in the Snap so larger blocks are sent to S3 quickly
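To picture what the S3 File Writer buffer size in the last highlight is doing, here is a minimal Python sketch of write buffering. It is an illustration only, not the Snap's implementation: the `BufferedUploader` class and the `send_block` callback are hypothetical names, and no real S3 call is made.

```python
class BufferedUploader:
    """Collects small writes and forwards them in blocks of buffer_size bytes.

    Hypothetical helper for illustration; send_block stands in for whatever
    actually ships a block to the remote store.
    """

    def __init__(self, send_block, buffer_size):
        if buffer_size < 1:
            raise ValueError("buffer_size must be positive")
        self._send = send_block
        self._size = buffer_size
        self._buf = bytearray()

    def write(self, data: bytes) -> None:
        self._buf.extend(data)
        # Ship full blocks as soon as enough bytes have accumulated.
        while len(self._buf) >= self._size:
            self._send(bytes(self._buf[:self._size]))
            del self._buf[:self._size]

    def close(self) -> None:
        # Flush whatever is left as a final, possibly smaller, block.
        if self._buf:
            self._send(bytes(self._buf))
            self._buf.clear()


if __name__ == "__main__":
    blocks = []
    uploader = BufferedUploader(blocks.append, buffer_size=4)
    for chunk in (b"ab", b"cd", b"efg"):
        uploader.write(chunk)
    uploader.close()
    print(blocks)
```

A larger buffer means fewer, bigger blocks go over the network, which is the trade-off the configurable buffer size exposes.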

Confluent Kafka Snap Pack

Kafka is a distributed messaging system based on a publish/subscribe model, with high throughput and scalability. It is mainly used to ingest data from multiple sources and deliver it to multiple downstream systems. Use cases include website activity tracking, fraud analytics, log aggregation, sales analytics, and others. Confluent is the company that provides the enterprise capability and offering for open source Kafka.

Here at SnapLogic we have built Kafka Producer and Consumer Snaps as part of the Confluent Snap Pack. Before going into the Snap Pack or pipeline details, it helps to take a closer look at the Kafka architecture and how it works.

kafka-cluster

Kafka consists of one or more Producers, which produce messages from one or more upstream systems, and one or more Consumers, which consume messages as part of downstream systems. A Kafka cluster consists of one or more servers called Brokers. Messages (a key and a value, or just the value) are fed into a higher-level abstraction called a Topic. Each Topic can hold messages from different Producers, and users can define new Topics for new categories of messages. Producers write messages to Topics, and Consumers consume from one or more Topics. Topics are partitioned, replicated, and persisted across Brokers. Messages in a Topic are ordered within a partition, and each message has a sequential ID number called an offset. ZooKeeper, which Confluent refers to as the coordination kernel, usually maintains these offsets.
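The relationship between Topics, partitions, and offsets can be sketched with a toy in-memory model. This is not a Kafka client and makes simplifying assumptions: the `Topic` class is hypothetical, keyless messages are spread round-robin, and keyed messages are placed with a CRC32 hash purely for illustration (Kafka's own partitioner works differently).

```python
import zlib
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Topic:
    """Toy model of a partitioned Topic; each partition is an append-only log."""
    name: str
    num_partitions: int
    partitions: list = field(init=False)

    def __post_init__(self):
        self.partitions = [[] for _ in range(self.num_partitions)]
        self._next = 0  # round-robin cursor for messages without a key

    def produce(self, value: str, key: Optional[str] = None):
        """Append a message and return its (partition, offset)."""
        if key is None:
            partition = self._next % self.num_partitions
            self._next += 1
        else:
            # The same key always lands in the same partition (illustrative hash).
            partition = zlib.crc32(key.encode("utf-8")) % self.num_partitions
        log = self.partitions[partition]
        log.append((key, value))
        # The offset is simply the message's position within its partition.
        return partition, len(log) - 1


if __name__ == "__main__":
    leads = Topic("sales-leads", num_partitions=3)
    print(leads.produce("lead from web form", key="acme"))
    print(leads.produce("lead from trade show", key="acme"))
```

Because ordering is only guaranteed inside a partition, messages that share a key keep their relative order, while messages in different partitions carry independent offsets.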

Kafka also lets you configure a Consumer group, in which multiple Consumers share the work of consuming from a Topic.

With over 400 Snaps supporting various on-prem (relational databases, files, NoSQL databases, and others) and cloud products (NetSuite, Salesforce, Workday, Redshift, Anaplan, and others), the SnapLogic Elastic Integration Cloud together with the Confluent Kafka Snap Pack is a powerful combination for moving data to different systems in a fast, streaming manner. Customers can realize benefits and generate business outcomes quickly.

With respect to the Confluent Kafka Snap Pack, we support Confluent version 3.0.1 (Kafka v0.9). These Snaps abstract away the complexity, so users only have to provide configuration details to build a pipeline that moves data easily. Note that when multiple Consumer Snaps in a pipeline are configured with the same consumer group, each Consumer Snap is assigned a different subset of the partitions in the Topic.
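To make the consumer group behavior concrete, here is a small sketch of how partitions might be divided among the members of one group. It assumes a simple round-robin policy and a hypothetical `assign_partitions` function; Kafka's real assignment strategies are more involved and are not reproduced here.

```python
def assign_partitions(partitions, consumers):
    """Spread a Topic's partitions over the consumers of one group.

    Round-robin sketch: every partition goes to exactly one consumer,
    so no two consumers in the group read the same partition.
    """
    if not consumers:
        raise ValueError("a consumer group needs at least one consumer")
    ordered = sorted(consumers)
    assignment = {consumer: [] for consumer in ordered}
    for index, partition in enumerate(sorted(partitions)):
        assignment[ordered[index % len(ordered)]].append(partition)
    return assignment


if __name__ == "__main__":
    # Two Consumer Snaps in the same group, reading a six-partition Topic.
    print(assign_partitions(range(6), ["consumer-snap-a", "consumer-snap-b"]))
```

If the group has more consumers than the Topic has partitions, the extra consumers in this sketch simply receive an empty list.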

kafka-producer

kafka-consumer

pipeline1

In the above example, I built a pipeline where sales leads (messages) stored in local files and MySQL are sent to a Topic in Confluent Kafka via Confluent Kafka Producer Snaps. Downstream, a Confluent Kafka Consumer Snap reads these messages from that Topic and bulk loads them into Redshift for historical or auditing needs. The messages are also sent to Tableau, as another Consumer, to run analytics on how many leads were generated this year, so the customer can compare this against last year.

Easy migrations from Teradata to Hadoop

There has been a major shift in which customers are moving from expensive Teradata solutions to Hadoop or other data warehouses. Until now, there has not been an easy way to transfer large amounts of data from Teradata to Hadoop. With this release we have developed a Teradata Export to HDFS Snap with two goals in mind: 1) ease of use and 2) high performance. This Snap uses the Teradata Connector for Hadoop (TDCH v1.5.1). Customers just have to download this connector from the Teradata website, in addition to the regular JDBC JARs. No installation is required on either the Teradata or the Hadoop nodes.

TDCH uses MapReduce (MR) as its execution engine: queries are submitted to this framework, and the distributed processes launched by MapReduce make JDBC connections to the Teradata database. The fetched data is loaded directly into the defined HDFS location. The degree of parallelism for these TDCH jobs is defined by the number of mappers (a Snap configuration) used by the MapReduce job. The number of mappers also defines the number of files created in the HDFS location.
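One way to picture how the number of mappers drives both parallelism and the number of output files is to split a key range into one chunk per mapper. The sketch below is an assumption for illustration only: `split_key_range` is a hypothetical function, and TDCH's own split strategies are not reproduced here.

```python
def split_key_range(lo, hi, num_mappers):
    """Split the inclusive key range [lo, hi] into contiguous chunks, one per mapper.

    Each chunk stands for the rows one mapper would fetch over its own
    JDBC connection and write to its own file in HDFS.
    """
    if num_mappers < 1:
        raise ValueError("num_mappers must be at least 1")
    if hi < lo:
        raise ValueError("hi must not be smaller than lo")
    total = hi - lo + 1
    base, extra = divmod(total, num_mappers)
    ranges = []
    start = lo
    for index in range(num_mappers):
        size = base + (1 if index < extra else 0)
        if size == 0:
            break  # more mappers than keys: the remaining mappers get no work
        ranges.append((start, start + size - 1))
        start += size
    return ranges


if __name__ == "__main__":
    # Four mappers over keys 1..10 give four chunks, hence four output files.
    for mapper_id, key_range in enumerate(split_key_range(1, 10, 4)):
        print(mapper_id, key_range)
```

Raising the mapper count shortens each chunk but opens more concurrent connections to Teradata, so the setting is a balance between export speed and database load.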

The Snap account details, along with a sample query to extract data from Teradata and load it to HDFS, are shown below.

edit-account

terradata-export

 

The pipeline to this effect is as follows:

pipeline2

As you can see above, you use just one Snap to export data from Teradata and load it into HDFS. Customers can later use the HDFS Reader Snap to read the exported files.

The Winter 2017 release equips customers with many benefits, from data streams and easy migrations to enhanced security and better performance. More information on the SnapLogic Winter 2017 (4.8) release can be found in the release notes.

Pavan Venkatesh is Senior Product Manager at SnapLogic. Follow him on Twitter @pavankv.

Winter 2017 Release Is Now Available

As enterprises grow and adopt best of breed solutions based in the cloud, on-premises and/or hybrid, integrating data between varied applications, databases and data warehouses (used by the enterprise) continues to be a challenge. New solutions are rapidly adopted, and technical and non-technical users alike need help to meet the challenge of quickly integrating the data from multiple sources into one view to make decisions at the speed of business.

The release includes several new Snaps and Snap updates that make it faster and easier to integrate Workday, NetSuite and Amazon Redshift with other applications and data sources across the enterprise. All three systems are increasingly popular as businesses embrace the cloud to run their business, a “cloud shift” that Gartner says will drive more than $1 trillion in technology spending by 2020.

Here is a brief overview of new and enhanced Snaps:

  • Confluent Kafka: The need for streaming data keeps growing, and today about one-third of the Fortune 500 uses Kafka. SnapLogic is pleased to introduce a new Snap for Confluent’s distribution of Apache Kafka™, an enterprise-ready solution that connects data sources, applications and IoT devices in real time.
  • Teradata: Several new Snaps have been added to the Teradata Snap Pack, expanding support with the Teradata TPT Load Snap, the TPT Update Snap, and the Teradata Export to HDFS Snap, which allows customers to easily export data from Teradata to an HDFS cluster without the need for any additional installation or complex configuration.
  • Workday: Workday Read Snap has been enhanced to provide a simplified Workday output format making it even easier to be consumed by downstream systems.
  • NetSuite: Asynchronous operations support for NetSuite enables more efficient use of NetSuite’s capabilities through new Snaps, including NetSuite Async Upsert, Async Search, Async Delete List, Async GetList, Check Async Status and GetAsync Result.
  • Amazon Redshift: Our customers connect multiple on-premises data sources and applications to Redshift without any coding. The Winter 2017 release introduces a new Snap to execute multiple Redshift commands in one Snap, making Redshift data integration pipelines even easier to create and manage.
  • Amazon S3: The Winter 2017 release brings additional streaming performance improvements while writing to an Amazon S3 bucket.

Continued Enterprise Focus: Introducing Asset Search Functionality

SnapLogic continues to be the best platform for enterprise IT and LOB teams to integrate applications and data sources without any coding. Enterprises often have thousands of pipelines, files and accounts and it’s hard to search for a given asset. The Winter 2017 release allows customers to quickly search for assets and also filter search outputs.

Security and Performance Enhancements

Security and performance continue to be focus areas for SnapLogic. To further strengthen user passwords, the Winter 2017 release enforces enhanced password complexity requirements. Customers can also configure session timeout and idle timeout parameters. In addition, the MongoDB Snap Pack has been extended to support SSL.

SnapLogic is committed to supporting the growing enterprise’s needs. We hope you will find the new Confluent Kafka Snap, the expanded support for Workday, NetSuite and Amazon Redshift, and the enhanced search and security useful. Customers can start using the capabilities described in the Winter 2017 release right away. For more information on the Winter 2017 release, including demo videos, see www.snaplogic.com/winter2017.

Fall is Here, and So is Our Fall 2016 Release!

Fall is the time to move your clocks back, get a pumpkin latte and slow down with the approaching cold weather. But not for SnapLogic! We continue to deliver system integration solutions with full force. After our Summer 2016 release, which was feature-packed, the Fall 2016 release takes the SnapLogic platform to a whole new level by extending support for Teradata, introducing new Snap Packs for Snowflake and Azure Data Lake Store, adding more capabilities to Spark mode, and delivering several enhancements for security, performance and governance. As our VP of engineering, Vaikom Krishnan, aptly said, “We continue to make it easier and faster for organizations to connect any and all data sources – whether on premises, in the cloud, or in hybrid environments.”

New With the Spring 2016 Release: Data Ingest-Prep-Deliver for Microsoft HDInsight

SnapLogic continues to build on its momentum in cloud-based data management with new support for HDInsight, Microsoft’s big-data-as-a-service on Azure. This follows our other recent announcements regarding support for the Microsoft Azure and Cortana ecosystem including availability in the Azure Marketplace.

March 2016 Snap Update

Our latest SnapLogic Snap Update arrives this weekend, with many new and updated Snaps being delivered. Here’s a brief overview. Be sure to contact our Customer Success team if you have any questions about the release.

There is a new RabbitMQ Snap Pack available with this update, which contains a Consumer and a Producer Snap.

August 2015 Snap Update – From Active Directory to Zuora

With 300+ Snaps now available, we’re regularly updating and enhancing our intelligent connector library. Building on our recent Summer 2015 release, this weekend all SnapLogic customers will be updated with our August Snap update. Here’s a summary – from A to Z.

Updated Snaps include:

  • Active Directory
  • AWS Redshift
  • Anaplan
  • Binary
  • Concur
  • DynamoDB 
  • Email
  • Flow
  • JDBC
  • LDAP
  • MongoDB
  • MySQL
  • Oracle RDBMS
  • Oracle E-Business Suite
  • SQL Server
  • SOAP
  • Transform
  • Zuora

New Snaps include:

  • Google SpreadSheet Snap Pack contains Snaps for browsing Google SpreadSheets, reading worksheets, and writing to worksheets.
  • In the Binary Snap Pack there is a new File Poller Snap that polls a directory looking for files matching the specified pattern.
  • There are many new Snaps for AWS DynamoDB. Check out our recent AWS partner webinar with Earth Networks for a great customer overview.
  • The Flow Snap Pack contains a new Exit Snap, which forces a pipeline to stop with a failed status if it receives more records than the user-defined threshold.
  • The Transform Snap Pack contains a new Transcoder Snap, enabling a preview if a Snap contains special characters.

As always, be sure to contact our Support Team if you have any questions. If you’re new to SnapLogic, you can learn more about our Snaps here.  (Yes, that’s me in the video!)

Self-Service Integration for the Enterprise – SnapLogic Summer 2015 Release

I talk with customers as often as I am able, and one of the questions I typically ask is why they chose SnapLogic. “Ease of use” is one of the most common answers. The SnapLogic interface is sometimes described as “deceptively simple” in that it is easy and intuitive enough for the non-developer to use without assistance, yet powerful enough for the expert integration developer. The Summer 2015 release makes things even simpler for the self-service integrator, but enhances the structure and governance that enterprises require.

Why is it important for large enterprises to enable self-service integrators? In short: business agility. If app and data integration is the exclusive domain of a few highly trained experts, integration will always be a bottleneck. When new business applications or data sources are added to the mix to meet company objectives, manual, code-intensive integrations will slow the process. By opening integration to a wider group of employees — with the appropriate controls — enterprises can respond to change faster. In this way, the Summer 2015 enhancements to the SnapLogic platform improve our support for business agility.

The release features a variety of new capabilities and improvements including:

  • Impact analysis and modeling so both experts and non-experts can anticipate the impact of their changes
  • Enhanced governance so administrators can provision and monitor self-service users with confidence
  • New Snaps for Amazon DynamoDB, Apache Avro, Apache HBase and Google Spreadsheets
  • Updates to a variety of Snaps

To learn more: