James Markarian: Was the Election a Referendum on Predictive Analytics?

In his decades working in the data and analytics industry, SnapLogic CTO James Markarian has witnessed few mainstream events that have sparked as much discussion – and raised as many questions about the value and accuracy of predictive analytics – as our recent election.

In a new blog post on Forbes, James examines where the nation’s top pollsters (who across the board predicted a different election outcome) possibly went wrong, why some predictions succeed and others fail, what businesses that have invested in data analytics can learn from the election, and how new technologies such as integration platform as a service (iPaaS) can help them make sense of all their data to make better predictions.

Be sure to read James’s blog, titled “What The Election Taught Us About Predictive Analytics”, on Forbes here.

SnapLogic CEO Gaurav Dhillon Shares His “Founder Story” with Ignition Partners

Founding a startup is not for the faint of heart. Raising capital, building product, managing teams, winning customers, fending off competitors — the list goes on and on. Every successful entrepreneur knows you can’t go it alone; you’ve got to find strategic, committed, supportive partners who will help you through the tough times and propel you forward in good times.

SnapLogic found exactly that in Ignition Partners back in 2012. In the trenches together now for the past four years, the two have grown SnapLogic into what it is today and look forward to helping the company along its path to becoming a world-class cloud software company for the ages.

Reflecting on the partnership, SnapLogic founder and CEO Gaurav Dhillon sat down with the team from Ignition to share his thoughts on the daunting fundraising process, overcoming bumps in the road, and why Ignition is the ideal partner for SnapLogic. Gaurav is joined in the “Founder Stories” video by other Ignition portfolio executives including Laura Mather of Unitive, Godfrey Sullivan and Erik Swan of Splunk, Kumar Sreekanti of BlueData, Jonathan Gray of Cask and Amy Chang of Accompany.

Below are excerpts from Gaurav’s conversation with Ignition Partners.

On the daunting fundraising process:

“Your product gets examined, people need to take apart the idea and put it together again in front of your eyes. For a lot of founders and entrepreneurs, it’s very difficult to have someone sort of decompose your baby right in front of you.”

On what makes for a good venture partner:

“There are certain firms and certain partners who approach you as ‘the boss’; no bad news is ever allowed. The joke in the Valley is if you take money from ‘that guy’ you need a ‘VP of Management of That Guy’.”

On overcoming bumps in the road:

“At SnapLogic, we had to make some tough choices, we had to really take a different product approach than we had started with.”

On why SnapLogic partnered with Ignition Partners:

“A lot of folks in the venture capital business are well connected but less and less of them have the sense for how to build a great company, how to build something that is durable, who really take pleasure in helping someone grow. That’s when you start to say – ‘who’s going to be the best person for us on our journey’. And make no mistake, all companies, especially successful ones, are on a journey and they have to deal with the twists and turns of what is going to happen in the marketplace. In the case of Ignition, this team had all come from operational backgrounds, they had been in my shoes, they had run big products and big companies.”

To watch the full videos, please click here and here.


Fall is Here, and So is Our Fall 2016 Release!

Fall is the time to move your clocks back, get a pumpkin latte and slow down with the approaching cold weather. But not for SnapLogic! We continue to deliver integration tools with full force. After our feature-packed Summer 2016 release, the Fall 2016 release takes the SnapLogic platform to a whole new level by extending support for Teradata, introducing new Snap Packs for Snowflake and Azure Data Lake Store, adding more capabilities to Spark mode, and delivering several enhancements for security, performance and governance. As our VP of Engineering, Vaikom Krishnan, aptly said, “We continue to make it easier and faster for organizations to connect any and all data sources – whether on premises, in the cloud, or in hybrid environments.”

Continue reading “Fall is Here, and So is Our Fall 2016 Release!”

Big Data Management: Doug Henschen Dives into the Data Lake

Yesterday SnapLogic hosted a webinar featuring Doug Henschen from Constellation Research called “Democratizing the Data Lake: The State of Big Data Management in the Enterprise.” Doug kicked things off by walking through where we were and where we are today, with some compelling examples from The Second Machine Age by Erik Brynjolfsson and Andrew McAfee. When it comes to the power of modern computing, for instance: in 1996 the U.S. ASCI Red supercomputer at Sandia Labs cost $55M, occupied 1,600 square feet and delivered 1.8 teraflops of computing power. In 2006 the Sony PlayStation 3 sold for $499, measured 4 x 12 x 10 inches and could handle the same 1.8 teraflops of computing power. Amazing! Doug went on to discuss the impact of distributed computing and how software has evolved (think: Kasparov vs. Deep Blue compared to the chess game on your laptop today). Continue reading “Big Data Management: Doug Henschen Dives into the Data Lake”

New Brand Campaign Is Here

SnapLogic had a terrific 2015. We more than doubled our bookings, added 300+ new customers and raised a new round of funding, among other milestones. While most of this success came from the hard work and creativity of the people at SnapLogic who build, manage, market and sell this platform, as well as our customers and partners, it also reflected a level of growth and maturity in the modern integration market.

What we heard from customers last year:

  • the old months-long, manual app and data integration projects were not conducive to business agility
  • legacy ESB- and ETL-type products could not keep up with the realities of on-prem/cloud hybrid environments (much less the demands of big data)
  • having integration tasks that could only be performed by a few experts was a bottleneck

Continue reading “New Brand Campaign Is Here”

SnapLogic and the Data Lake

In this final post in the series based on Mark Madsen’s whitepaper, Will the Data Lake Drown the Data Warehouse?, I’ll summarize SnapLogic’s role in the enterprise data lake.

SnapLogic is the only unified data and application integration platform as a service (iPaaS). The SnapLogic Elastic Integration Platform has 350+ pre-built intelligent connectors – called Snaps – to connect everything from AWS Redshift to Zuora, plus a streaming architecture that supports real-time, event-based and low-latency enterprise integration requirements as well as the high volume, variety and velocity of big data integration, all in the same easy-to-use, self-service interface. Continue reading “SnapLogic and the Data Lake”

Empty or Full: What Lies Beneath the Data Lake

With Hadoop Summit this week in San Jose and so many opinions (and survey results) being shared about whether the data lake and Hadoop are half full or half empty, I thought I’d repost an article I wrote that was first published on the Datanami site a few weeks ago. But first, a few of the half full/empty posts I’m referring to:

I’ll be at the Hadoop Summit this week with the SnapLogic Team (details here) and would love to explore these and other big data topics. Here’s my Datanami post: What Lies Beneath the Data Lake. Please let me know if you have feedback.

—-

Hadoop and the data lake represent a potential business breakthrough for enterprise big data goals, yet beneath the surface lies the murky reality of data chaos.

In big data circles, the “data lake” is one of the top buzzwords today. The premise: companies can collect and store massive volumes of data from the Web, sensors, devices and traditional systems, and easily ingest it in one place for analysis.

The data lake is a strategy from which business-changing big data projects can begin, revealing potential for new types of real-time analyses which have long been a mere fantasy. From connecting more meaningfully with customers while they’re on your site to optimizing pricing and inventory mix on-the-fly to designing smart products, executives are tapping their feet waiting for IT to deliver on the promise.

Until recently, though, even large companies couldn’t afford to continue investing in traditional data warehouse technologies to keep pace with the growing surge of data from across the Web. Maintaining a massive repository for cost-effectively holding terabytes of raw data from machines and websites as well as traditional structured data was technologically and economically impossible until Hadoop came along.

Hadoop, in its many iterations, has become a way to at last manage and merge these unlimited data types, unhindered by the rigid confines of relational database technology. The feasibility of an enterprise data lake has swiftly improved, thanks to Hadoop’s massive community of developers and vendor partners that are working valiantly to make it more enterprise friendly and secure.

Yet with the relative affordability and flexibility of this data lake comes a host of other problems: an environment where data is not organized or easily manageable, rife with quality problems and unable to quickly deliver business value. The worst-case scenario is that all that comes from the big data movement is data hoarding – companies will have stored petabytes of data, never to be used, eventually forgotten and someday deleted. This outcome is doubtful, given the growing investment in data discovery, visualization, predictive analytics and data scientists.

For now, there are several issues to be resolved to make the data lake clear and beautiful—rather than a polluted place where no one wants to swim.

Poor Data Quality
This one’s been debated for a while, and of course, it’s not a big data problem alone. Yet it’s one glaring reason why many enterprises are still buying and maintaining Oracle and Teradata systems, even alongside their Hadoop deployments. Relational databases are superb for maintaining data in structures that allow for rapid reporting, protection, and auditing. DBAs can ensure data is in good shape before it gets into the system. And, since such systems typically deal only with structured data in the first place, the challenge for data quality is not as vast.

In Hadoop, however, it’s a free-for-all: typically no one is monitoring anything in a standard way, and data is being ingested raw and ad hoc from log files, devices, sensors and social media feeds, among other unconventional sources. Duplicate and conflicting data sets are not uncommon in Hadoop. There has been some effort by new vendors to develop tools that incorporate machine learning for improved filtering and data preparation. Yet companies also need a foundation of people—skilled Hadoop technicians—and process to attack the data quality challenge.
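To make that cleanup work concrete, here is a minimal sketch of the kind of filtering and deduplication raw ingested data typically needs before it is useful for analysis. This is illustrative PySpark, not SnapLogic’s product; the paths, field names and rules are hypothetical.

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("data-lake-cleanup-sketch").getOrCreate()

# Hypothetical raw clickstream logs landed in the lake as JSON files.
raw = spark.read.json("hdfs:///lake/raw/clickstream/2016/06/*.json")

cleaned = (
    raw
    .dropDuplicates(["event_id"])           # collapse duplicate events from re-sent batches
    .filter(col("user_id").isNotNull())     # drop records missing a required key
    .filter(col("event_time").isNotNull())  # drop records with no timestamp
)

# Write a curated copy alongside the raw zone so the original files are preserved.
cleaned.write.mode("overwrite").parquet("hdfs:///lake/curated/clickstream/2016/06/")
```

Even a small job like this has to be written, scheduled and maintained by someone who knows both the data and the cluster, which is exactly why people and process matter as much as tools.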

Lack of Governance
Closely related to the quality issue is data governance. Hadoop’s flexible file system is also its downside. You can import endless data types into it, but making sense of the data later on isn’t easy. There have also been plenty of concerns about securing data (specifically access) within Hadoop. Another challenge is that there are no standard toolsets yet for importing data into Hadoop and extracting it later. This is a Wild West environment, which can lead to compliance problems as well as slow business impact.

To address the problem, industry initiatives have appeared, including the Hortonworks-sponsored Data Governance Initiative. The goal of DGI is to create a centralized approach to data governance by offering “fast, flexible and powerful metadata services, deep audit store and an advanced policy rules engine.” These efforts among others will help bring maturity to big data platforms and enable companies to experiment with new analytics programs.
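As a toy illustration of what even lightweight metadata management buys you, here is a hypothetical record one might keep for every dataset landed in the lake, so the data stays discoverable and accountable long after ingestion. A real deployment would rely on a metadata service like the one DGI describes rather than hand-rolled code; every name and field below is made up.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DatasetRecord:
    """One catalog entry describing a dataset landed in the data lake."""
    name: str            # logical dataset name
    path: str            # where the files live
    source_system: str   # where the data came from
    owner: str           # who is accountable for it
    schema: dict         # column name -> type, as ingested
    ingested_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

# Registering a hypothetical clickstream landing at ingest time.
catalog = [
    DatasetRecord(
        name="clickstream_raw",
        path="hdfs:///lake/raw/clickstream/2016/06/",
        source_system="web servers",
        owner="analytics-team@example.com",
        schema={"event_id": "string", "user_id": "string", "event_time": "timestamp"},
    )
]
```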

Skills Gaps
In a recent survey of enterprise IT leaders conducted by TechValidate and SnapLogic, the top barrier to big data ROI indicated by participants was a lack of skills and resources. Still today, there are a relatively small number of specialists skilled in Hadoop. This means that while the data lake can be a treasure chest, it’s one that is still somewhat under lock and key. Companies will need to invest in training and hiring of individuals who can serve as so-called “data lake administrators.” These data management experts have experience managing and working with Hadoop files and possess in-depth knowledge of the business and its various systems and data sources that will interact with Hadoop.

Transforming the data lake into a business strategy that benefits customers, revenue growth and innovation is going to be a long journey. Aside from adding process and management tools, as discussed above, companies will need to determine how to integrate old and new technologies. More than half of the IT leaders surveyed by TechValidate indicated that they weren’t sure how they were going to integrate big data investments with their existing data management infrastructure in the next few years. Participants also noted that the top big data investments they would be making in the near term are analytics and integration tools.

We’re confident that innovation will continue rapidly for new Big Data-friendly integration and management platforms, but there’s also a need to apply a different lens to the data lake. It’s time to think about how to apply processes, controls and management tools to this new environment, yet without weakening what makes the data lake such a powerful and flexible tool for exploration and delivering novel business insights.

—-

For more information about SnapLogic big data integration, visit www.snaplogic.com/bigdata. Please also take a minute to complete the Hadoop Maturity Survey for a chance to win an Amazon Gift Card.