SnapLogic Tips and Tricks: XML Generator Snap Overview (Part 3)

In part two of this series, we covered how to map to the JSON schema upstream. In this post, we will cover how to write the generated XML content out to a file.

Example 3: Writing the Generated Content to File
Sometimes one wants to write the generated XML to a file. For that use case, we provide the DocumentToBinary Snap, which takes the content and converts it to a binary data object that can then be written to a file, e.g. using a File Writer Snap.

xml-gen-5

Above, we map the XML to the content field of the DocumentToBinary Snap and set its Encode or Decode option to NONE.

This then outputs one binary document for each order, which we can write to a directory. Be careful here: since you could potentially be writing more than one document to the same file in that directory, you would either want to use the append option (append support for SnapLogic’s file system is coming soon) or use an expression such as Date.now() in the file name to write an individual file for each incoming binary data object.
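
As a minimal sketch of the second approach, the File Writer’s file name can be given as an expression that yields a unique path per document; the protocol and directory below are purely illustrative:

    "file:///tmp/orders/order_" + Date.now() + ".xml"

This way every incoming binary data object lands in its own file, so the append option is not needed.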

In the final part of this series, we will demonstrate how the XML Generator Snap can validate the generated XML against the XSD.


SnapLogic Tips and Tricks: XML Generator Snap Overview (Part 2)

In the first part of this series, we explained how to use the XML Generator Snap to generate XML based on an XSD. In this post, we will cover how to map to the JSON schema upstream.

Example 2: Mapping to XML Generator via XSD
Let’s use a JSON Generator to provide the input order data.
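
For illustration, the order data might look something like the following document (field names and values are made up here, chosen to match the mapping described below):

    {
      "orderId": "889923",
      "name": "Ola Nordmann",
      "address": "Langgt 23",
      "city": "4000 Stavanger",
      "country": "Norway",
      "items": [
        { "title": "Empire Burlesque", "note": "Special Edition", "quantity": 1, "price": 10.90 },
        { "title": "Hide your heart", "quantity": 1, "price": 9.90 }
      ]
    }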

We then map the data using the Mapper Snap, which has access to the XSD of the downstream XML Generator Snap of the previous example (now with an added input view).

xml-gen-3

Here we map the items to the item list on the target; the city, address, country and name to the shipTo object on the target; and finally the name to orderperson and the orderId to @orderId. The @ indicates that we are mapping to an XML attribute.
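
Put as a simple source-to-target sketch (the exact target paths depend on the JSON schema suggested from the XSD, so treat these as approximations):

    $items    ->  item
    $city     ->  shipTo.city
    $address  ->  shipTo.address
    $country  ->  shipTo.country
    $name     ->  shipTo.name
    $name     ->  orderperson
    $orderId  ->  @orderId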

Hint: the Mapper Snap was enhanced in the Fall 2014 release to allow viewing the input and output data while doing the mappings (at the bottom of the dialog, expanded via the arrow in the middle).

Let’s look at the output of the XML Generator Snap:

xml-gen-4

Here we see that each incoming order document was translated into an XML string. We include the original data from the input view, in case it is needed further downstream.
The XML Generator Snap can also validate the generated content, if needed, using the “Validate XML” property.
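
For a single order like the sample above, the serialized XML would look roughly as follows (shown pretty-printed here; the Snap emits it as one string, and the element names simply mirror the mapping targets used in this example):

    <?xml version="1.0" encoding="UTF-8"?>
    <shiporder orderId="889923">
      <orderperson>Ola Nordmann</orderperson>
      <shipTo>
        <name>Ola Nordmann</name>
        <address>Langgt 23</address>
        <city>4000 Stavanger</city>
        <country>Norway</country>
      </shipTo>
      <item>
        <title>Empire Burlesque</title>
        <note>Special Edition</note>
        <quantity>1</quantity>
        <price>10.90</price>
      </item>
    </shiporder>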

In our next post in this series, we will demonstrate how to write the generated XML content out to a file.


SnapLogic Tips and Tricks: The XML Generator Snap (Part 1)

The XML Generator Snap was introduced in the Summer 2014 release. In the Fall release, it was enhanced with XML generation based on a provided XSD and with suggestion of the JSON schema (derived from the XSD) to the upstream Snap. The XML Generator Snap is similar to the XML Formatter Snap, which formats incoming documents into XML; however, this Snap allows you to map to the XML content for more specific XML generation. This four-part series explains how to use the XML Generator Snap.

Example 1: XML Generation via XSD
For this first example, I created a simple pipeline to generate order data XML directly with the XML Generator Snap.

xml-gen-1

We provide the sample XSD (originating from: http://www.w3schools.com/schema/schema_example.asp) defined as:
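
Roughly, that shiporder schema looks like this (see the link above for the exact original):

    <?xml version="1.0" encoding="UTF-8"?>
    <xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
      <xs:element name="shiporder">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="orderperson" type="xs:string"/>
            <xs:element name="shipto">
              <xs:complexType>
                <xs:sequence>
                  <xs:element name="name" type="xs:string"/>
                  <xs:element name="address" type="xs:string"/>
                  <xs:element name="city" type="xs:string"/>
                  <xs:element name="country" type="xs:string"/>
                </xs:sequence>
              </xs:complexType>
            </xs:element>
            <xs:element name="item" maxOccurs="unbounded">
              <xs:complexType>
                <xs:sequence>
                  <xs:element name="title" type="xs:string"/>
                  <xs:element name="note" type="xs:string" minOccurs="0"/>
                  <xs:element name="quantity" type="xs:positiveInteger"/>
                  <xs:element name="price" type="xs:decimal"/>
                </xs:sequence>
              </xs:complexType>
            </xs:element>
          </xs:sequence>
          <xs:attribute name="orderid" type="xs:string" use="required"/>
        </xs:complexType>
      </xs:element>
    </xs:schema>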

We then use suggest on the XML root element, which returns {}shiporder.
Finally, we click on Edit XML, which automatically triggers the XML template generation based on the XSD, as seen below.

xml-gen-2

Now we could replace the variables with our own values to generate the XML on the output view or move on to the next example.

Note: Executing the Snap above will add an XML field to each output document, containing the serialized XML content as a string.
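
Put differently, each output document carries something along these lines (the field name and exact formatting are illustrative; the XML is one long string):

    {
      "xml": "<?xml version=\"1.0\" encoding=\"UTF-8\"?><shiporder orderid=\"...\">...</shiporder>"
    }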

In part two of this series, you will see how to use a JSON Generator to map to the XML Generator XSD.


SnapLogic Tips and Tricks: Understanding the Task Execute Snap

This article is brought to you by our Senior Director of Product Management, Craig Stewart.

SnapLogic’s Task Execute Snap was introduced in the Summer 2014 release. In the Fall 2014 release, the Task Execute Snap was enhanced with the addition of (transparent) compression and data-type propagation. This Snap is similar to the ForEach Snap (where a single execution of the pipeline is fired off for each incoming data document), but the Task Execute Snap:

  • sends the whole input document
  • can aggregate a number of input rows to stream to the target pipeline
  • sends the SnapLogic data-type information across to the target, preserving date/time, numeric and string types
  • compresses data as it is passed to the target pipeline for optimization of both network and memory use

The target pipeline should be configured to receive the input data (in the case of POSTing the data) or produce the data (in the case of a GET), much like a sub-pipeline, although this is even more loosely coupled. The target pipeline will be invoked once for each batch of input data. Let’s dig into some details. I’ve created three examples:

  1. POST-Type, where we push data to the target pipeline and expect no response
  2. GET-Type, where we get the output of a remotely executed task
  3. POST-and-GET type, where we combine inbound payload and retrieved payload

Example 1: POST-type

For my first example, I created a simple pipeline to consume the data, expecting it to be POSTed as the payload of the URL request. It is made up of just a JSON Formatter and a File Writer, although it could have been any other Snap with a document input:

JSONFormat

And then I created a triggered task in the Manager to invoke the target pipeline:

triggeredTask

And then I created a pipeline to send the data; in this case I’m selecting from my favourite Oracle database, limiting it to 50,102 rows (an arbitrary number). As you see, I have configured the Task Execute Snap to use the task I defined earlier, with a batch size of 10,000 rows, implying that it should make six calls: five batches of 10,000 rows plus one of 102. Each request is made synchronously. Note that as this is all within the same organisation, the Snap handles all of the authentication and authorisation for you.

task-execute

The Task is selected from the drop-down, which introspects the available metadata, showing the triggerable pipelines from both the current and shared projects. (Note: If the Use On-Premises URL option is checked, it will only show those pipelines where an on-premises URL is available, i.e. those running in a Groundplex.) If this option is selected and the Snaplexes are all on premises, no data will go out through the firewall; it all remains secure between the local nodes.

The Batch size can be adjusted to your requirements, balancing load and memory usage. Each pipeline invocation does have a certain overhead in preparing, executing and logging, and this should be considered if you are using a low number of rows per batch. The higher the number of rows per batch, the higher the memory consumption.

When I run the pipeline, the data is streamed from the source into the Task Execute Snap. With the batch size set to 10,000, the data is aggregated in memory until the Snap either reaches the end of the input stream or hits the batch size, at which point it sends the accumulated data to the target pipeline as the payload.

Here is the execution run log, where you can see the expected six calls passing the data to the target task. Compression takes place automatically, as the Snap knows it can gzip the content and preserve the data types.

stats-1

The output of the Task Execute Snap is just the HTTP return code given by the target pipeline:

preview-1

This is shown in the Dashboard pipeline display as follows:

dashboard-1

 

Example 2: GET-Type

In this example, I have changed the first pipeline to remove the input view, so it just executes the target task and receives its output data.

task-execute-2

In this case, the batch size is irrelevant. Next, I changed the called pipeline to be my data producer:

oracle-select-2

This time, the result is a smaller set of data out of my Oracle database. Next, I created a new Task, this time to my smaller, producer pipeline:

create-task-2

Now, when I execute the pipeline, the run-time stats are as follows:

stats-2

And the Dashboard Pipeline Display shows the following:

dashboard-2

 

Example 3: POST-and-GET-type

The executed pipeline can also be a data producer. In this example, I am using the same type of calling pipeline, although this time I limited the Oracle SELECT to 52 rows of data. The driving pipeline looks remarkably similar:

task-execute-3

Notice that I have a different target URL and a much lower batch size. The executed pipeline this time has an input stream, which takes the inbound payload and, in this case, doubles the data by copying and unioning the result. It then has an unterminated output, which is returned to the caller.

union

Again I created a task for it in the SnapLogic Integration Cloud Manager:

task-3

Now I have the complete set. The idea of this configuration is that I select a set of data out of my Oracle database, in this case 52 rows, and send it in batches of 10 to the target pipeline, getting the benefit of data-type propagation, compression, etc., as described previously. This time, though, I will actually get a result set streamed back, again preserving data types and formats.
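
Working through the numbers, and assuming the copy-and-union in the target pipeline simply doubles whatever it receives:

    52 input rows at a batch size of 10  ->  6 invocations (5 batches of 10 rows + 1 batch of 2 rows)
    each batch comes back doubled        ->  2 x 52 = 104 rows streamed back in total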

Here are the run-time execution stats:

stats-3

As you can see, this time I both sent and received payloads, allowing the SnapLogic Elastic Integration Platform to handle the authentication, authorisation and payload compression. No messing with headers or any other additional configuration. Here, you see the execution from the Dashboard Pipeline Display:

dashboard-3

Summary

In summary, the Task Execute Snap enables you to pass batches of data to and from target pipelines, automatically aggregating, authenticating, compressing the data payload, and waiting for successful completion. For more SnapLogic best practices and tips and tricks, be sure to check out our TechTalk webinars and recordings.

SnapLogic Tips and Tricks: REST Snap Compression Capabilities

This article is brought to you by our Senior Director of Product Management, Craig Stewart.

In the Fall 2014 release, SnapLogic added a number of new features across the broad range of Snaps. Amongst those was the ability for a REST GET operation to accept gzip-encoded data. When combined with a triggered pipeline in another Snaplex, this can significantly improve performance and reliability (the less time you spend moving data over the wire, the fewer packets are moved, the less scope there is for network errors, and the less time it should take).

As an example, I created a simple pipeline which outputs a set of data, in this case just an Oracle database query returning 101,000 rows of data:

Oracle Select

For this, I created a task so I could call it using the REST GET Snap in the other pipeline:

task

To call it, I created a pipeline using the REST GET Snap, which would call this URL:

rest-get

As the URLs for triggered pipelines require authentication, I created a Basic Auth account with my credentials and associated it with the REST GET Snap. The URL is copied and pasted from the task created previously. This was all possible in earlier versions of SnapLogic; the change in this version is the ability to add the relevant HTTP accept headers:

rest-get-headers

Now, if the Snap gets data in gzip format, it will automatically uncompress and process the received data (even when it does not come from a SnapLogic triggered pipeline). No additional Snaps are required. The clever bit is that the triggered pipeline will also notice that the caller is able to accept gzip format, so it will automatically send the data in that format.

In summary, you just need to add the HTTP headers to the REST GET Snap.
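
Under the covers, the request the REST GET Snap issues then amounts to something like the following (the task URL is illustrative, and the assumption is that the added header is the standard Accept-Encoding: gzip):

    GET /api/1/rest/slsched/feed/MyOrg/MyProject/MyProducerTask HTTP/1.1
    Host: elastic.snaplogic.com
    Authorization: Basic <base64-encoded credentials>
    Accept-Encoding: gzip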

As an aside, the Task Execute Snap will do this compression automatically, to be covered in a future post. For more SnapLogic Integration Cloud best practices and tips and tricks, be sure to check out our TechTalk webinars and recordings.

SnapLogic TechTalk – Data Transformations and Mappings

Every two weeks we’ve been hosting a TechTalk series where one of our cloud integration gurus dives into a specific topic and reviews best practices, as a way to provide more training and information for SnapLogic partners and customers. It’s a 30-minute, interactive, ask-the-experts training session, and topics come from the new SnapLogic Developer Community forum (requires customer login).

In our most recent integration platform as a service (iPaaS) TechTalk, we reviewed the SOAP Snap, the functionality it covers, and how to configure a data flow pipeline. We also reviewed what’s new in our Fall 2014 release in a webinar this week; be sure to check out the recording here. Join us on Thursday, October 2nd (tomorrow!) at 10:00am PST / 1:00pm EST for our next TechTalk, which will focus on Data Transformations and Mappings to help with your application and data integration development and deployment cycle. What you can expect to learn during this training session includes:

  • Mapping Expressions
  • Manipulating Data
  • JSON Path Examples
  • Aggregate Transformations
  • Scripting Transformations

Joining me this week will be Tim Lui, who is on our Product Team. We hope you can attend and learn something new.

And if there is a specific topic you’d like us to cover in the future, please let us know in the Developer Community (or in the comments section below) and we will review it in an upcoming session. Don’t forget to register, and check out some past recorded TechTalks here.