Data volumes are increasing exponentially, and many organizations are starting to realize how complex their growing data management and movement solutions have become. Data lives in many systems, and getting meaningful value out of it has become a major challenge for many companies. Most of this data is stored in relational systems such as MySQL, PostgreSQL and Oracle, the mainstream databases used primarily for OLTP. NoSQL systems such as Cassandra, MongoDB and DynamoDB, with their tunable consistency models, have also emerged to store some of this mission-critical data. Customers then typically move the data to much bigger systems such as Teradata and Hadoop (OLAP) that can store large amounts of data, so they can run analytics, reporting or complex queries against it. A more recent trend is moving some of this data to the cloud, especially to Amazon Redshift or Snowflake, and also to HDInsight or Azure SQL Data Warehouse.
This week, the SnapLogic team will be supporting one of our partners, Amazon Web Services, in Las Vegas for the annual AWS re:Invent conference. This gathering of the global AWS community will feature hands-on labs and bootcamps and cover topics such as infrastructure maintenance and improving developer productivity, network security and application performance.
Fall is the time to move your clocks back, get a pumpkin latte and slow down with the approaching cold weather. But not for SnapLogic! We continue to deliver integration tools with full force. After our Summer 2016 release, which was feature-packed, the Fall 2016 release takes the SnapLogic platform to a whole new level by extending support for Teradata, introducing new Snap Packs for Snowflake and Azure Data Lake Store, adding more capabilities to Spark mode, and delivering several enhancements for security, performance and governance. As our VP of engineering, Vaikom Krishnan, aptly said, “We continue to make it easier and faster for organizations to connect any and all data sources – whether on premises, in the cloud, or in hybrid environments.”
What is Apache Hive? Hive provides a mechanism to query, create and manage large datasets stored on Hadoop, using SQL-like statements. It also lets you add structure to existing data that resides on HDFS. In this post I’ll describe a practical approach to ingesting data into Hive with the SnapLogic Elastic Integration Platform, without the need to write code.
SnapLogic co-founder and CEO Gaurav Dhillon sat down recently with Scott Kupor, managing partner at Andreessen Horowitz, for a wide-ranging podcast discussion of all-things-data.
The two discussed how the data management landscape has changed in recent years, the rise of advanced analytics, the move from data warehouses to data lakes, and other changes that are enabling organizations to “take back their enterprise.”
SnapLogic was in New York this week for Strata + Hadoop World NYC, and our CTO James Markarian took the opportunity to sit down with Dave Vellante and George Gilbert, hosts of theCUBE, for a wide-ranging discussion on the shifting big data landscape.
One of the most common requests I hear from colleagues and customers is, “How do I estimate how many jobs I can run on a node and how fast will they run?” The immediate and most accurate answer is… it depends. While it may seem a flippant answer, it is a succinct response to a complex multidimensional problem. Let’s examine the variables.