Many companies are realizing
the benefits of Big Data and are starting to use their data more effectively.
With Big Data, companies now have an opportunity to harness the
power of real-time information and use analytics to interact with their
consumers in real time.
We know the
advantages of Big Data: understanding your customers, improving customer loyalty,
and gaining a competitive advantage. I
have recently started to learn Big Data. At the beginning, I asked this most
popular question many times: “How should I start with Big Data?”
Actually, this
is a great question, since there are numerous resources for learning about Big Data
and it is difficult to choose one to start with. Therefore, I decided to write this
post to share a summary of those I found.
So how do we
start our journey to Big Data? Here are the five tips I recommend to help you
get started.
1-Learning the
tools and technologies: start with any tool that you have access to, such as Python,
SAS, SPSS, SQL, or R (which is available as open source), and try to learn it at a
deep and practical level. Once you have that foundation, you can
search for and study related topics to grow your
knowledge. Remember that people with deep knowledge of one particular
tool are often preferred over those who know a little bit about everything! So, it is
strongly recommended to master one tool and a few of its techniques to have
a better chance of getting opportunities and delivering on them. For
example, you can try Introduction to Data Science from the University of
Washington on the Coursera website. Just remember to plan in advance which tools
and technologies you want to learn.
2-Learning the
tricks: a supplementary step to mastering a tool is learning its tricks
from an experienced colleague in your company or from professional courses.
Note that self-study courses and tutorials mostly will not give you the key
secrets and tricks that are crucial for solving real-life problems.
3-Look for an
opportunity to apply analytics in your organization. Usually it
is difficult to identify where to start. If you know the sources of data and where
data is being collected (such as a data repository) for a certain
business process, then you have a good chance to use it in your first Big Data
scenario. Start by generating simple insights from the data that are not
presently captured in the business reports, and create simple metrics that
add value to the business and that you can show to the top management in your
company who are interested in what you are doing. Remember that many organizations do
not perform even the most obvious data analysis.
4-Create a
case study of your work and show your analytics to your superiors. If they don’t
support you, start a job search at outside companies where you can use your new skills.
5-Read more
and more: it is strongly recommended to join blogs and forums on Big Data, follow
companies in related domains closely, and participate in the latest discussions
and events on Big Data, for example on LinkedIn. This helps you stay aware of how Big
Data is being applied in different business applications and functions, and
increases your knowledge.
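As a rough illustration of tip 3, here is a minimal Python sketch of a "simple metric" that a standard business report might not include. The order records, the field names (`customer`, `amount`), and the two helper functions are all hypothetical; in practice the records would come from whatever data repository your business process feeds.

```python
from collections import defaultdict

# Hypothetical order records (assumption: one dict per order).
orders = [
    {"customer": "A", "amount": 120.0},
    {"customer": "B", "amount": 35.5},
    {"customer": "A", "amount": 60.0},
    {"customer": "C", "amount": 18.0},
    {"customer": "A", "amount": 42.5},
]


def customer_metrics(records):
    """Return the order count and total spend for each customer."""
    totals = defaultdict(lambda: {"orders": 0, "spend": 0.0})
    for record in records:
        entry = totals[record["customer"]]
        entry["orders"] += 1
        entry["spend"] += record["amount"]
    return dict(totals)


def repeat_customer_rate(records):
    """Return the share of customers who placed more than one order."""
    per_customer = customer_metrics(records)
    if not per_customer:
        return 0.0
    repeat = sum(1 for m in per_customer.values() if m["orders"] > 1)
    return repeat / len(per_customer)


metrics = customer_metrics(orders)
rate = repeat_customer_rate(orders)
print(metrics["A"])
print(f"Repeat-customer rate: {rate:.2f}")
```

A metric like the repeat-customer rate is simple to compute, yet it speaks directly to customer loyalty, one of the advantages mentioned at the start of this post.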
Thanks for your post.
Do you have any idea how large the volume of data must be before we can call it Big Data?
Once I heard that it must be big enough not to fit on your hard disk or in memory.
Big Data for your smartphone is not necessarily Big Data for a computing server.
Thanks for your reply. I heard that 1 TB is big, but this is a loose definition because it depends on how one encodes the data. Suppose the data is compressed and fits on the hard disk, but can no longer be held on it after decompressing! It looks like "Big Data" is a new hype term that has not yet been formally defined!
I think the significance of "big data" has nothing to do with storage capabilities. Even if storage is considered one of several challenges of big data (such as capture, search, sharing, transfer, and analysis complexity, just to name a few), the definition of "big" data is not a function of hardware. So what qualifies as "big" for hardware configuration A (disk, memory, ...) is still considered big data for hardware configuration B. Now the question is still relevant: is there a metric or a threshold above which one can qualify a data set as "big"? I think that, size-wise, it is commonly accepted that 1 TB is big but, as far as I know, there is no official size attribute of big data!
The common definition for big data (that I know of) is: not being able to store the data on the machine. Big data for my cell phone is not big data for my desktop, and big data for my desktop is not big data for my server. Big data prevents experts from analysing it using conventional methods.
In my opinion, big data has nothing to do with the size of data!
I can either change the encoding of a dataset (compression, low-rank factorization for sparse data) so that it fits on your device, or I can change the data structure of the existing one on your device so that it cannot fit any more! I think we need a more concrete definition in terms of mathematics and computer science so that it becomes hardware-independent. Anyway, it is still vague, and it is discussed more as a new term to attract attention than as a topic addressed by researchers in Machine Learning, Data Mining or Statistics. But in hardware/software engineering, MapReduce, cloud computing, Hadoop, Spark, ... are designed and implemented for this task.
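The point raised in the comments, that the size of a data set depends on how it is encoded, can be shown with a small Python sketch using the standard `zlib` module. The sample data is hypothetical and deliberately repetitive, so it compresses far better than typical real-world data would.

```python
import zlib

# Hypothetical, highly repetitive data set (roughly 1 MB of CSV-like text).
raw = b"customer_id,amount\n" + b"12345,19.99\n" * 90_000

# Same information, different encoding: compress, then restore.
compressed = zlib.compress(raw, 9)
restored = zlib.decompress(compressed)

print(f"raw bytes:        {len(raw)}")
print(f"compressed bytes: {len(compressed)}")
print(f"lossless:         {restored == raw}")
```

The same content takes up very different amounts of space depending on its encoding, which is one reason a size threshold alone makes for a loose definition of Big Data.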