PolyStat: How should I start Big Data?

Thursday, June 25, 2015

Many companies are realizing the benefits of Big Data and are starting to use their data more effectively. They now have an opportunity to harness real-time information and use analytics to interact with their customers as events happen.

We know the advantages of Big Data: understanding your customers, improving customer loyalty, and gaining a competitive advantage. I recently started learning Big Data, and at the beginning I asked the most common question many times: “How should I start Big Data?”

It is a good question, because there are so many resources for learning Big Data that it is hard to choose one to start with. I therefore decided to write this post to summarize what I found.

So how do we start our journey to Big Data? Here are five tips I recommend to help you get started.

1-Learn the tools and technologies: start with any tool you have access to, such as Python, SAS, SPSS, SQL, or R (which is available as open source), and learn it at a deep and practical level. Once you have that foundation, you will know enough to search for and study related topics and keep growing your knowledge. Remember that people with in-depth knowledge of one tool are preferred over those who know a little bit about everything! So it is strongly recommended to master one tool and a few of its techniques, which gives you a better chance of getting opportunities and delivering on them. For example, you can try Introduction to Data Science from the University of Washington on Coursera. Just remember to plan in advance which tools and technologies you want to learn.

2-Learn the tricks: a supplementary step in mastering a tool is learning its tricks, either from an experienced colleague in your company or from professional courses. Note that self-study courses and tutorials mostly do not teach the practical tricks that are crucial for solving real-life problems.

3-Look for an opportunity to apply analytics in your own organization. The hard part is usually identifying where to start. If you know the sources of data and where the data for a certain business process is collected (for example, a data repository), then you have a good candidate for your first Big Data scenario. Start by generating simple insights that are not currently captured in the business reports, and create simple metrics that add value to the business and that you can show to the top managers interested in what you are doing. Remember that most organizations do not perform even the most obvious analyses of their data.

4-Create a case study of your work and show your analytics to your superiors. If they don’t support you, plan a job search at other companies that need your new skills.

5-Read more and more: it is strongly recommended to follow blogs and forums on Big Data, keep a close eye on companies in related domains, and take part in the latest Big Data discussions and events, for example on LinkedIn. This helps you stay aware of how Big Data is being applied in different business applications and functions, and it increases your knowledge.
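
To make tip 3 a little more concrete, here is a minimal Python sketch of turning raw records into two simple metrics. Everything in it is an assumption for illustration only: the sample order records, their (customer, amount) layout, and the function names `repeat_purchase_rate` and `average_order_value` are hypothetical, and in practice the records would come from whatever repository your business process writes to.

```python
from collections import Counter

# Hypothetical order records as (customer_id, order_amount) pairs.
# In a real setting these would be loaded from your company's data store.
orders = [
    ("c1", 40.0),
    ("c2", 15.0),
    ("c1", 25.0),
    ("c3", 60.0),
    ("c2", 30.0),
    ("c1", 10.0),
]


def repeat_purchase_rate(records):
    """Return the share of customers who placed more than one order."""
    counts = Counter(customer for customer, _ in records)
    if not counts:
        return 0.0
    repeaters = sum(1 for n in counts.values() if n > 1)
    return repeaters / len(counts)


def average_order_value(records):
    """Return the mean order amount across all orders."""
    if not records:
        return 0.0
    return sum(amount for _, amount in records) / len(records)


rate = repeat_purchase_rate(orders)
aov = average_order_value(orders)
print(f"Repeat purchase rate: {rate:.0%}")
print(f"Average order value: {aov:.2f}")
```

With the sample records above, two of the three customers ordered more than once and the average order value is 30.00. Metrics this simple are often enough for a first report, as long as they are not already in the existing business reports.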

7 comments:

Sajjad Ghaemi, June 26, 2015 at 1:44 PM
Thanks for your post.

Do you have any idea how large a volume of data has to be before we can call it Big Data?
Unknown, October 2, 2015 at 10:10 AM
I think the significance of "big data" has nothing to do with storage capabilities. Even if storage is considered one of several challenges of big data (along with capture, search, sharing, transfer, and analysis complexity, to name a few), the definition of "big" data is not a function of hardware. So what qualifies as "big" for hardware configuration A (disk, memory, ...) is still considered big data for hardware configuration B. The question is still relevant, though: is there a metric or threshold beyond which a data set qualifies as "big"? I think that, size-wise, 1 TB is commonly regarded as big but, as far as I know, there is no official size attribute for big data!