9 Questions to Ask Before Kicking Off Any Big Data Project

What do you get when you combine rebranded analytics systems, a minefield of consultants turned “big data experts,” and insanely expensive “big data servers” that look suspiciously similar to commodity machines?

You get: the most complicated space for any business or decision maker to navigate.

Frankly, as one of the providers of what we feel are real big data solutions, it’s getting a little annoying. So if you are a decision maker, here are 9 questions to help you navigate your decision:

1. Do you have an actual business objective you are trying to optimize for?

What is your specific business goal? Big data projects aren’t magic and shouldn’t be viewed as such. To avoid an exercise in futility you need to narrow down your desired outcome.

For example: “We want to increase our conversion rate for first-time visitors,” or “we want to maximize revenue for the children’s clothing category,” or “we need to find out how to reduce our loyalty member churn.” You do NOT need a consultant to answer this question for you.

2. Have you identified which data sources are valuable after you have determined your business objectives?

It is important to identify which data sources you will pull from to achieve your desired goal. You can’t assume “the more the better.” The phrase “garbage in, garbage out” describes exactly how machine learning models behave: a model’s predictions are only as good as the data it is fed.
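As a rough illustration of screening a source before it reaches a model, here is a minimal sketch that measures how often each field is missing and keeps only the fields under a threshold. The field names, the sample rows, and the 25% cutoff are all made-up assumptions for the example, not a recommendation.

```python
def missing_rate(rows, field):
    """Fraction of rows where `field` is absent, None, or an empty string."""
    if not rows:
        return 1.0  # no rows at all: treat the field as entirely missing
    missing = sum(1 for row in rows if row.get(field) in (None, ""))
    return missing / len(rows)


def usable_fields(rows, fields, max_missing=0.2):
    """Keep only the fields whose missing rate is at or below max_missing."""
    return [f for f in fields if missing_rate(rows, f) <= max_missing]


# Hypothetical sample of visitor records from one data source.
sample = [
    {"visitor_id": "a1", "referrer": "search", "postcode": ""},
    {"visitor_id": "a2", "referrer": "email", "postcode": None},
    {"visitor_id": "a3", "referrer": "", "postcode": "M5V"},
    {"visitor_id": "a4", "referrer": "search"},
]

print(usable_fields(sample, ["visitor_id", "referrer", "postcode"], max_missing=0.25))
# prints ['visitor_id', 'referrer']
```

A check like this won’t tell you whether a source is valuable for your objective, but it can flag one that is too sparse to be worth the integration effort.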

3. Have you and your team had the internal debate about hosting data on an elastic compute platform or on dedicated hardware for your various analyses?

Have you decided whether you are comfortable moving your data off-premises and into a cloud? Many new software systems are hosted on elastic compute platforms like AWS, Azure, SoftLayer, Google Compute Engine, etc. Security is critical, but there are ways to securely move your data into an elastic compute platform (save that for another post). Your big data strategy and cloud strategy have to be discussed in parallel.

4. If you already have a business intelligence tool, are you able to answer the following questions?

  • What is the true ROI of the BI and analytics tools each year relative to what you are spending? Internal engagement is not enough ROI.
  • How well are the users in your organization using these tools? Do they feel satisfied with their usage?
  • Are there available open source plug-ins so that you don’t have to purchase from separate new vendors?
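For the ROI question, the arithmetic itself is simple; the hard part is being honest about the inputs. Here is a minimal sketch, assuming you can attribute a dollar gain to the tools. Every figure below is hypothetical.

```python
def annual_roi(attributable_gain: float, total_cost: float) -> float:
    """Return ROI as a fraction: (gain - cost) / cost."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (attributable_gain - total_cost) / total_cost


# Hypothetical yearly figures for one BI tool.
licence_cost = 120_000.0       # assumed licence and support fees
staff_cost = 80_000.0          # assumed internal time spent running the tool
attributable_gain = 250_000.0  # assumed revenue or savings traced to the tool

roi = annual_roi(attributable_gain, licence_cost + staff_cost)
print(f"ROI: {roi:.0%}")  # prints ROI: 25%
```

If you can’t put a defensible number on the gain, that is usually the answer to the question.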

It is also important to weed out the glorified analytics companies that are incorrectly calling themselves big data solutions.

5. Have you investigated flexible storage solutions for your data?

The database you choose is one of your most important decisions. There are both free open source databases and paid versions. Base the decision on time and cost savings and on the resources your team has available to manage it. Better yet, are there options that already build the database directly into their system, so that you don’t need to buy a separate DB?

6. Have you hired a Data Scientist? If so, what are their responsibilities? Are they helping you achieve #1, or just mining for random insights?

It is important to know the difference between a Data Analyst and a true Data Scientist. A Data Scientist can build a brand new model, whereas a Data Analyst can only use off-the-shelf models or software to comb through data. Good Data Scientists can usually code and hit the ground running with the raw frameworks themselves, rather than relying only on the packaged versions. With a Data Scientist on staff, you can measure their contribution by how close they get you to the overall goals set out in #1.

7. Have you looked at your technology vendor/VAR contracts to ensure you are using all the frameworks you have available to you?

It is a good idea to take a second look at your contract. Often you have access to other supported frameworks that aren’t being used. This can help determine the direction your project takes and eliminate some unnecessary costs. I can’t begin to count the number of analytics directors who complain they can’t get access to the Hadoop server that’s sitting “in our other data centre.”

8. Have you experimented with or adopted data visualization tools like Tableau, QlikView or others?

A good grasp of these tools will help your team get more out of its data. However, it is important to remember that while these tools are great for visualizing your data, they won’t necessarily give you actionable recommendations that move you closer to your objectives in question 1, and they do not replace the need for a predictive analytics tool.

9. Do you have a realistic “throwaway” budget for your big data projects over the next 24 months?

You need to have a realistic discretionary budget allocated to big data. Big data projects often uncover many opportunities that go beyond your original goals listed in #1. You really don’t know in advance which business units will ultimately benefit the most, so assigning budgets too early may cause conflict and confusion down the line. Realistically, this should come from net new budget, with the expectation of a net new outcome.

At the end of the day, any big data project should help you do two things:

  • Make a prediction on a valuable question that leads to a KPI change
  • Help turn your analytics and data processing from a cost centre to a profit centre

Everything else is just sand.