Why should I become a data scientist? - The Big Data Pipeline for Beginners



Another article on Big Data? Seriously, another one?

We are tired of reading about Big Data everywhere. Everyone talks about it, but few people explain what this new universe is actually about or what it means to be a data scientist. The fact is that Big Data does not have a single end goal. Depending on where the large volume of data comes from and what type of data it is, we can predict whether a person will develop diabetes in the future, or show you what you are looking for before you have even lost it (yes, Amazon uses Big Data, like... a lot).

This is all fine and dandy, but how is it done?

You are stunning, but Big Data is OSUMING. Why are you stunning? As a curiosity, the data scientist role is known in the business world as the blue unicorn role. Personally, I prefer to say we are like detectives, or rather detective padawans, since we are beginners. We have to use our powers of deduction to find patterns in the data and predict how it will behave.

Also, we don’t just do Big Data without knowing what we are doing. There are six simple steps you can follow to succeed as quickly as possible.

O – Obtain the data.

S/C – Scrubbing/cleaning the data to remove null or incorrect values.

U – Understanding the data in order to find the patterns we were talking about earlier.

M – Modelling the data to be able to use our deduction power.

I – Interpreting the results.

G – Get the success (and the money, we all like that step).
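
For the curious padawan, here is a minimal sketch of what the first five steps could look like in Python, using only the standard library. Everything in it is an assumption made for illustration: the toy records (hours studied and exam scores) are invented, the function names are hypothetical, and the "model" is just a straight line fitted by least squares. The last step, getting the success, is left to you.

```python
import statistics

# Hypothetical toy records, invented purely for illustration.
RAW_ROWS = [
    {"hours": 1.0, "score": 52.0},
    {"hours": 2.0, "score": 54.0},
    {"hours": None, "score": 70.0},  # null value, removed when scrubbing
    {"hours": 3.0, "score": 56.0},
    {"hours": 4.0, "score": -1.0},   # impossible score, removed when scrubbing
    {"hours": 5.0, "score": 60.0},
]


def obtain():
    """O: obtain the data (here, simply a copy of the in-memory toy list)."""
    return list(RAW_ROWS)


def scrub(rows):
    """S/C: drop rows with null values or scores outside the 0-100 range."""
    return [
        row for row in rows
        if row["hours"] is not None
        and row["score"] is not None
        and 0 <= row["score"] <= 100
    ]


def understand(rows):
    """U: summarise the data to get a first feel for it."""
    return {
        "n": len(rows),
        "mean_hours": statistics.fmean(row["hours"] for row in rows),
        "mean_score": statistics.fmean(row["score"] for row in rows),
    }


def model(rows):
    """M: fit score = intercept + slope * hours by ordinary least squares."""
    xs = [row["hours"] for row in rows]
    ys = [row["score"] for row in rows]
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def interpret(slope, intercept):
    """I: turn the fitted numbers into a sentence a human can act on."""
    return (
        f"Each extra hour goes with about {slope:.1f} more points "
        f"(baseline {intercept:.1f})."
    )


if __name__ == "__main__":
    clean = scrub(obtain())
    print(understand(clean))
    print(interpret(*model(clean)))
```

Each function maps to one letter of the pipeline, so you can swap any step for something more serious (a real data source, a better model) without touching the others.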

The most time-consuming part is understanding our data. If you really want to find hidden patterns and get to the finish line, understanding the data should be your top priority. Figuring out where best to install air conditioning in subway cars is not the same as figuring out how the economy is evolving. About 80% of your time will go into understanding alone. Modelling will be easier (as a beginner) with Big Data software such as KNIME or R. The latter involves more programming, but nothing you can't pick up as a data scientist.

Now that you have the clues to find out what happened in the Big Data case, grab your detective license and make Sherlock proud.

If you are interested in training in this field, we at Qaleon suggest you look into Empleable for a training path towards the data scientist you want to be.