Data volumes are increasing exponentially, and many organizations are starting to realize the complexity of their growing data management and movement solutions. Data exists in various systems, and getting meaningful value out of it has become a major challenge for many companies. Most of this data is stored in relational systems like MySQL, PostgreSQL, and Oracle, the mainstream databases used primarily for OLTP purposes. NoSQL systems like Cassandra, MongoDB, and DynamoDB have also emerged, with tunable consistency models, to store some of this mission-critical data. Customers then typically move this data to much larger OLAP systems like Teradata and Hadoop that can store large volumes of data, so they can run analytics, reporting, or complex queries against it. There is also a recent trend of moving some of this data to the cloud, especially to Amazon Redshift or Snowflake, and to Azure HDInsight or Azure SQL Data Warehouse.
What is Apache Hive? Hive provides a mechanism to query, create, and manage large datasets stored on Hadoop, using SQL-like statements. It also enables adding structure to existing data that resides on HDFS. In this post I'll describe a practical approach to ingesting data into Hive with the SnapLogic Elastic Integration Platform, without the need to write code.
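To make "adding structure to existing data on HDFS" concrete, here is a minimal sketch of the kind of HiveQL DDL such an ingestion step issues, generated from Python. The table name, HDFS path, and columns are hypothetical examples; a platform like SnapLogic produces equivalent statements for you rather than requiring hand-written code.

```python
# Sketch: build a CREATE EXTERNAL TABLE statement that layers a schema
# over delimited files already sitting on HDFS. External tables leave
# the underlying files in place; Hive only records the metadata.

def external_table_ddl(table, hdfs_path, columns, delimiter=","):
    """Return HiveQL DDL for an external table over delimited HDFS files."""
    cols = ",\n  ".join(f"{name} {ctype}" for name, ctype in columns)
    return (
        f"CREATE EXTERNAL TABLE IF NOT EXISTS {table} (\n  {cols}\n)\n"
        f"ROW FORMAT DELIMITED FIELDS TERMINATED BY '{delimiter}'\n"
        f"STORED AS TEXTFILE\n"
        f"LOCATION '{hdfs_path}';"
    )

# Hypothetical sales-data landing zone on HDFS
ddl = external_table_ddl(
    "sales_raw",
    "/data/landing/sales",
    [("order_id", "BIGINT"), ("amount", "DOUBLE"), ("order_ts", "STRING")],
)
print(ddl)
```

Because the table is declared `EXTERNAL`, dropping it later removes only the Hive metadata, not the files on HDFS, which is usually what you want for raw landing data.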
SnapLogic was in New York this week for Strata + Hadoop World NYC, and our CTO James Markarian took the opportunity to sit down with Dave Vellante and George Gilbert, hosts of theCUBE, for a wide-ranging discussion on the shifting big data landscape.
The future of big data processing lies in the adoption of commercial Hadoop distributions and their subsequent deployments. The macro use case for big data is the data lake: massive amounts of structured and unstructured data that do not carry the same restrictions as traditional data warehouses. Data lakes store everything, including every type of data, at any volume and scope, that may be used by enterprise data users for any reason.
Despite the power and potential of data lakes, many enterprises continue to approach this technology with the same data integration approaches and mechanisms they've used in the past, none of which work well. How can we tap into the power of the data lake?
Next up in our ongoing podcast series: an episode on the “lifecycle of data” featuring our guest, Enterprise Solution Architect Rich Dill. The series is hosted by our own head of enterprise architecture, Ravi Dharnikota.
In this episode, Ravi Dharnikota and Rich Dill discuss the lifecycle of data, including the transition of data storage and processing to the cloud, the implications of distributed data, a “multi-tiered data lifecycle,” and the evolution of the data lake.
You can view and subscribe to the entire series here.
We are pleased to announce our new podcast series called SnapTalk. The series will feature short, 10-15 min. episodes on topics relevant to big data, data management and app and data integration. Our host for the series is Ravi Dharnikota, SnapLogic’s head of enterprise architecture. Each episode features a special guest in conversation with Ravi, such as SnapLogic’s chief scientist, Greg Benson.
This project grew out of the great conversations we have at Snappy Hour. Eating lunch as a group at least a couple of times a week and our weekly happy hour (called Snappy Hour) are big parts of the SnapLogic culture. And, invariably, the conversations at these gatherings range from the lightweight, such as the latest episode of Game of Thrones, to the complex, such as the future of Spark and what makes streaming data streaming. This podcast series is intended to capture the essence of those ad hoc discussions, get people thinking, and hopefully inspire additional discussions.
The first episodes are posted now and cover topics such as Spark, streaming data, and Kafka. Stay tuned to this space for the next episode. The SnapTalk playlist is here and our new SoundCloud channel is here. I hope you'll subscribe, and we welcome your feedback.