Data Scientist vs Data Analyst: Their Roles and Qualifications



Hi, welcome to Big Data Simplified dot blogspot dot in. Today we are going to look at the difference between data analysts and data scientists. This is the biggest question, and there is a big dilemma between these two, so we need to be clear about these two roles, because even people who work with data don't know the difference between the analyst and the scientist. So that's what we are going to do.

The main thing here is all about data. Okay, what is data? I am putting a question: what is data? Data is all about sharing information, or a normal message; that's all data is. So if I am giving some information to you, that is data. Now, who generates this data?

First of all, humans. The individual person generates data by participating in so many social media platforms, including Facebook, WhatsApp, LinkedIn, YouTube, Twitter, Gmail, etc. People have started uploading their information, and terabytes and terabytes of data per hour are getting uploaded to social media. So the first part of data generation always goes to humans.

Next to humans, who generates more data? Organizations. They generate a lot of data. For example, healthcare generates data based on patient details, historical details, diseases, the GPS location of health instruments, and so on, including pharmacy details. If you take retail, online shopping, and online marketing, there is purchase history, historical information about the customer, their likes and dislikes, etc. If you take transport, whether airways, roadways, railways, or waterways, the data is increasing tremendously day by day, and the same goes for oil and gas, and even the banking sector, with financial
fraud detection, incidents, and so on. So first, data is generated by humans; the second generation is all about organizations; and the third generation is all about machines. Machines have nowadays started generating data too, for example the ATM, the automated teller machine. And take the mobile device we have in our hand, a smartphone or whatever it is: we have an entire world in our hand. Through a smartphone we upload a lot of data, which means the mobile is generating a lot of data. If you take a photo, that is generated by your mobile phone, right? So if we go to a party or somewhere, data is getting generated. The same thing happens with our laptops: we are streaming, using the webcam, chatting, etc., and that too is generated by machines.

So machines have started generating data, organizations have started generating data, and humans have started generating data. A huge amount of data is getting generated, but the technology cannot handle this kind of data. What I am actually going to do with the data is store it and do some processing on it to get some outcome. If I have a problem with these two, storage and processing, then I am going to face these many V's, which are nothing but problems in my data. If you take volume, storage is my problem. Velocity: speed is my problem. Variety: handling structured, semi-structured, and unstructured data is a problem. Veracity: how can I trust my data? Validity: the validity of my data. If you take visualization, I need a bar chart, pie chart, etc. Then I have value: what is the value of my data? And volatility: the outcome I expected needs to be saved.

I am getting these many V's because my data is very big, so we call it big data. As a developer, I get tense: how can I handle these many problems? We call them the V's in big data. How can I solve
this? I have a storage problem and a processing problem, and I am getting these many V's. So big data is all about a problem. You can see at the bottom I have written that the size can be anything; it can even be an MB or a GB, whatever it is. Data which causes a problem is big data; it's not all about the size.

So I just got tense, and someone is there to tell me to calm down. Okay, is there any solution for me? I got tense with my data, and someone is saying: calm down, you have a solution for it. Of course I'm worried; I need to see what the solution is immediately. Okay, so the solution is that I have a data scientist and a data analyst to give me the solution. But I don't know what these two guys are doing. I don't know what the analyst will do and what the scientist will do; I know only the problem I am facing. So let us see what the analyst will do and what the data scientist will do.

In this picture you can see the data analyst, who is pointing to a representation. The data analyst breaks the large problem into small pieces for better understandability. Data analysts give the solution through a representation, such as a pie chart or a graph, based on what has happened so far. Okay, "what has happened so far": these words are very important.

But the data scientist sees the problem from the business point of view, and does predictive analysis: given what has happened, what is going to happen next year, and what is going to happen in the next [unclear]. That is what the data scientist will do. You can see here, under "data scientist", I have written something: he is a data scientist, and he is making a prediction. So, what is going to happen? I have to predict something. If you take healthcare or retail, I have to predict something. In healthcare, for example, I want to predict that something is going to happen
because of this data, so I have to predict it. For example, in day-to-day life you are acting as a data scientist. You can see here I have written within double quotes: "You are more likely to pass the exam if you start preparing earlier." So we are predicting that if he has been preparing earlier, he could hit the pass mark, right? That is actually statistical in nature. I want to approach the data, and I want to get knowledge from the data, through statistics and mathematics. That is what the data scientist will be doing. Data analysis is the process of breaking a complex topic or substance into smaller parts to gain a better understanding, for example as a graph, a pie chart, etc. So this is the difference between these two guys.

Now, what makes the data analyst team and what makes the data scientist team? You can see here: if I am going to build an analyst team, I need software engineers and data warehousing guys; that makes the data analyst team. For a data scientist team, on the right-hand side, I need software engineers, analysts, plus domain experts, especially in languages like Java, Scala, R, and Python, which deal with functional programming, mathematics, and graphical representation. The combination of domain experts, software engineers, and analysts makes a data science team, and the combination of software engineers and data warehousing makes the data analyst team. So the data analyst gives me a representation of what has happened so far; the data scientist gives me a prediction of what is going to happen hereafter, and only from a business point of view.

Now the roles that the data analyst and the data scientist will take on. The data analyst can become a data architect, database administrator, analyst, engineer, or operations guy. The data scientist role is all about being a data researcher and developer; being creative is very
important; and it also covers business people and technology.

Now the qualifications you need if you want to become a data scientist or a data analyst. A data scientist needs to know all the databases, whether you take the traditional systems or the big data field, Hadoop, Hive, all those things, and you need to know some NoSQL databases too. For a data analyst, it is about understanding business intelligence concepts, and that is enough. A data scientist needs to be familiar with programming paradigms like Java, Python, MapReduce, etc., and here it is very important to know SQL. And if you take a data scientist, you need to know mathematics, statistics, correlation, data mining, and predictive analysis, for better prediction for business decisions. But on the analyst side, you have to be perfect with the tools and components of data architecture, so you need to know the tools.

And here, knowing R is like the future for a data scientist. R is a programming language that deals with mathematics; they mainly use it in finance. And you need to know some statistical frameworks and machine learning, like Mahout, Bayesian methods, clustering, etc. The data analyst needs to be familiar with ETL tools and have proficiency in decision making.

So the qualifications may seem a bit tough for a guy who is watching this video, because data scientist is not a simple or small job. It's a combination: you are a domain expert, you need to know five-plus programming languages, you need to know as many databases as possible, and you need to know how to develop a product, which is what a software engineer does. The combination of all three builds a career as a data scientist. Data scientist is the current and future sexiest and hardest job. Of course the statement is very true; this statement was made by a top
economist who is working for Google. So if you are currently a data analyst too, then you need to be very strong in big data technologies like Hadoop and NoSQL databases, plus you need to know some domain-expert programming languages which deal with mathematics and statistics. What you studied in your college days, like integration, matrices, vectors, and statistics such as mean and median, all those things you need to revise. You have to gather knowledge from the data by using statistics and maths.

Thanks for watching Big Data Simplified dot blogspot dot in. Subscribe to our channel if you like these videos, and like us on Facebook.
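
The talk's contrast between the analyst's "what has happened so far" and the scientist's "what is going to happen" can be sketched in a few lines of Python. This is only an illustrative sketch and not something shown in the video: the exam records (days of preparation, mark), the pass mark, and the function names `describe` and `fit_line` are all made up for the example. The first function summarizes past results; the second fits a simple least-squares line so that a mark can be estimated for a student who starts preparing earlier.

```python
from statistics import mean

# Hypothetical records, invented for illustration: (days of preparation, exam mark).
history = [(2, 35), (5, 48), (8, 55), (12, 68), (15, 80)]
PASS_MARK = 40  # assumed pass mark, not from the talk


def describe(records, pass_mark):
    """Analyst-style view: summarize what has happened so far."""
    marks = [mark for _, mark in records]
    passed = sum(1 for mark in marks if mark >= pass_mark)
    return {"average_mark": mean(marks), "pass_rate": passed / len(marks)}


def fit_line(records):
    """Scientist-style view: least-squares fit of mark = slope * days + intercept."""
    days = [d for d, _ in records]
    marks = [m for _, m in records]
    d_bar, m_bar = mean(days), mean(marks)
    slope = (sum((d - d_bar) * (m - m_bar) for d, m in records)
             / sum((d - d_bar) ** 2 for d in days))
    intercept = m_bar - slope * d_bar
    return slope, intercept


def predict(days, slope, intercept):
    """Estimate the mark for a given number of preparation days."""
    return slope * days + intercept


summary = describe(history, PASS_MARK)
slope, intercept = fit_line(history)

print("So far:", summary)
print("Estimated mark after 10 days of preparation:",
      round(predict(10, slope, intercept), 1))
```

With real data, the descriptive part would typically feed a chart or report, while the predictive part would use a proper statistical or machine learning library rather than a hand-written line fit.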

6 Comments

  1. Mvs Sudha said:

    Awesome explanation

    June 26, 2019
  2. Shantanu Khanna said:

    i had done msc.maths. which course should I pursue to become data analyst?

    June 26, 2019
  3. PRATIBHA SRIDHAR said:

    hello sir,
    I am BE-2015 Graduate in Computer Science. I have also learnt SAS and Hadoop. Now I'm totally confused as to how to approach my career from here. I would like to go ahead with data science/analysis into finance or stock market research. But what should I do for that? What job should I be looking for?and is ut the right choice to make?

    June 26, 2019
  4. arunbm123 said:

    nice explanation

    June 26, 2019
  5. vishwa joshi said:

    i am MBA with marketing – can you tell me which course i should pursue to become data analyst?

    June 26, 2019
  6. aish aish said:

    nice explanation

    June 26, 2019