What is Apache Hive? Hive provides a mechanism to query, create and manage large datasets stored on Hadoop, using SQL-like statements. It also lets you add structure to existing data that resides on HDFS. In this post I’ll describe a practical approach to ingesting data into Hive with the SnapLogic Elastic Integration Platform, without the need to write code.
SnapLogic CTO James Markarian recently appeared as a guest on DisrupTV, a weekly live-interview web series produced by analyst firm Constellation Research and hosted by R “Ray” Wang and Vala Afshar. The trio discussed a variety of enterprise topics, including modern data management, data lake strategy considerations and big data analytics.
Last week, part of the SnapLogic team was in New York City for the Strata/Hadoop World conference. It’s one of the largest big data events in the U.S. and has grown steadily larger over recent years. The agenda has shifted a bit as well – from largely academic discussions and how-to presentations by open source committers to real-world case studies by non-ISV enterprises.
With that in mind, I’d like to share a story from one of our enterprise customers. In fact, this customer is a financial institution more than 100 years old. Perhaps not a company that you would associate with the cutting edge of data management technologies… Due to the nature of their industry, I can’t share their name.
Like many established companies, this bank’s data processing and storage systems have been acquired or added over the years based on the most pressing needs and compliance requirements at the time. They ultimately found themselves trying to manage an unwieldy mix of 240+ interfaces and applications.
SnapLogic was in New York this week for Strata + Hadoop World NYC, and our CTO James Markarian took the opportunity to sit down with Dave Vellante and George Gilbert, hosts of theCUBE, for a wide-ranging discussion on the shifting big data landscape.
SnapLogic announced the availability of new pre-built intelligent connectors – called Snaps – for Microsoft Azure Data Lake Store. The new Snaps provide fast, self-service data ingestion and transformation from virtually any source – whether on-premises, in the cloud or in hybrid environments – to Microsoft’s highly scalable, cloud-based repository for big data analytics workloads. This latest integration between SnapLogic and Microsoft Azure helps enterprise customers gain new insights and unlock business value from their cloud-based big data initiatives.
Next week our team of integration experts will be in New York for Strata + Hadoop World to demonstrate how our big data integration platform as a service (iPaaS) allows customers to quickly ingest, prepare and deliver data to other sources within their IT ecosystems. We are also hosting a networking event for big data game-changers on demystifying data lakes, Hadoop and hybrid architecture.
In this episode of the SnapTalk podcast series, enterprise architect Ravi Dharnikota talks with Rakesh Raghavan, Director of Snap Engineering at SnapLogic. Rakesh comes to SnapLogic having designed, developed and managed data lakes for several leading online retailers and consumer-facing websites. He has successfully navigated enterprise data lakes using open source tools and manual techniques, and in this episode shares his first-hand experiences.
Ravi and Rakesh discuss the pitfalls of jumping into a data lake without a clear architecture, the challenges of supporting both traditional reporting and ad hoc data exploration use cases in the same environment, and the often-overlooked, often manual data engineering tasks involved in data lake implementation.
Subscribe to the series: https://soundcloud.com/snaplogic/sets/snaptalk