What is Predictive Analytics?



Hi, my name is Brad Hill. Thanks for joining me today for this introductory session titled "What is Predictive Analytics?" The three objectives I'll cover in this session are: what is predictive analytics, where is predictive analytics used, and how does predictive analytics work.

So what is predictive analytics? Here's a definition of predictive analytics from research company Gartner: predictive analytics helps connect data to effective action by drawing reliable conclusions about current conditions and future events. Predictive analytics encompasses a variety of techniques, such as simulation, statistics, game theory, and data mining, to do this analysis and make these predictions. These predictions enable organizations to use predictive models to exploit patterns found in historical data to address a business goal, and this goal could be something like customer attrition or consumer demand. These predictive models capture relationships among many factors to allow assessment of the risk or potential associated with a particular set of conditions, which allows decisions to be guided on a proactive basis, which then results in better outcomes.

So what kind of data do you use for this type of analysis? Most commonly it's descriptive and behavioral data. However, by adding interactional and attitudinal data you can experience a significant increase in the accuracy and performance of these predictive models.

As an example, if we were predicting whether a student would complete their study at, say, a college or university, descriptive data would be fields like gender, age, marital status, their address, whether they're an international student or part of an ethnic minority, or some other variable like that. The behavioral data would include the subjects that they're enrolled in, their assessment marks for the individual subjects, whether or not they're attending their lectures or their tutorials, and maybe how many years of study they've completed or how far they've progressed through the degree. So this
will provide a really solid foundation and a good basis for the analysis and prediction. However, if you think about it, if you knew their attitudes from, say, a student survey or from social media, if they provided some type of feedback or commentary that they were having difficulty coping with the workload, or if they even stated that they were considering dropping out, then that would give you a huge clue as to their state of mind and their likely actions. The last area is the interaction data, and that could be something like email correspondence about enrollment deadlines or inquiries about deferring studies, or whether there was some inactivity in logging into a uni's web account that may have been dormant since they enrolled, or whether they're very active in forums or other class activities.

In banking, I guess the descriptive and behavioral data would be who the customer is and their banking history, their credits and debits across all their accounts. Their interactional data would be the activity when they logged into their internet banking, or maybe the call center history when they inquired about fees for closing their credit card. And their attitudinal data could be gained through polls or surveys, or again through social media channels, to gain an insight into their opinions or desires, their preferences for new products and offerings, and how they would like to be contacted.

So, having given a few examples of where predictive analytics can be used, let's take a broader look at where it applies. Generally speaking, there are three main areas or pillars of predictive analytics: customer analytics, operational analytics, and threat and fraud analytics.

The most common of these is customer analytics, enabling organizations to better understand their customers and predict what it is that they're likely to do. So acquiring customers more efficiently, growing the value of these existing customers, and then retaining the profitable customers for
a longer period of time, all these goals are assisted by prediction. For example, whether an individual from a mailing list is likely to be a profitable customer, whether or not they're likely to respond to a campaign and be interested in a particular product, or whether someone's behavior indicates that they're thinking of switching to a different supplier altogether.

Operational analytics generally revolves around assets or processes, whether it be helping to manage a physical or virtual asset, from planning the right physical inventory to stock in your supply chain to assessing how many components to purchase to support particular production facilities. It enables organizations to manage physical infrastructure and their capital equipment by ensuring the allocation of people and cash in the most efficient manner to maximize their capital. For example, preventing unscheduled downtime of a truck by analyzing the service history to identify when certain parts are likely to fail, based on the conditions or the environment that it's working in, in combination with the intensity and the duration, rather than just saying that this part needs to be serviced every six or 12 months.

The third group of applications is around the area of threat and fraud analytics. Here analytics is used to detect suspicious or anomalous transactions, like a potentially fraudulent insurance claim, or to detect money laundering activities. In this case it's about monitoring your environment by including a wide variety of data sources across multiple areas, detecting suspicious behaviors to identify those threats, information breaches, or patterns in crime or fraud, and then controlling the outcomes to deliver the best response to reduce exposure, reduce loss, and maximize the impact of any action that's taken.

Looking in a little bit more detail at customer analytics, it's not just about treating each customer uniquely, and not every
customer is a good customer. So it's about being able to identify and acquire the ideal customer, that is, those that will be profitable throughout their entire lifecycle, allowing you to put targeted acquisition efforts in, and then growing that customer, increasing their lifetime value through personalized upsell and cross-sell efforts. Now, letting some of these customers go can be perfectly fine, because some of them won't pay on time and others are going to cost you money. Retention is not about keeping every single customer, but rather ensuring that you retain the most valuable customers by identifying the indicators that lead towards defection and proactively reaching out to them to make sure that they stay, as well as enhancing customer loyalty by turning those satisfied customers into brand advocates. So let those other customers that aren't profitable, the bad customers, go to a competitor, and that increases your competitive advantage.

Operational analytics is particularly relevant for manufacturing or supply chain or those in a services industry. When you think about operations a little differently, as people, processes, and assets, it opens itself up to a much wider area of applications: being able to plan operations by allocating future expenditures in the most efficient manner and having the right quantity of the right product available at the right time at the right location; then managing the day-to-day operations, identifying areas where you could improve existing operational processes and employee productivity and effectiveness; and therefore maximizing the longevity of infrastructure and equipment, and employee performance.

And finally, analytics to protect an organization from threat and fraud: monitoring your environment by including a wide variety of data across multiple sources within an organization, or even externally, to detect suspicious behaviors and identify these threats, whether they be information breaches, crime, fraud,
whatever the case may be, but allowing you to then control the outcomes to deliver the best response to reduce exposure or loss and maximize the impact of any action that's taken. For example, in the case of an insurance claim, if you identify a claim as fraudulent after you've paid out, it may prove difficult to recover the costs. However, identifying that the claim has a high likelihood of being false when the claim is actually made, at that first notification of loss, whether it be through a website or a call center, could mean that it's handled very differently or referred to an investigator immediately, before any payment is made. Conversely, this allows those claims with a high degree of legitimacy to be fast-tracked, minimizing the time for processing them and therefore increasing customer satisfaction.

Let's move now to look at how predictive analytics works. Let's take the example that I want to create a model that predicts the propensity of a customer to churn, that is, to voluntarily cancel their services and select an alternative supplier. We start with some historical data about our customers, in this case some demographic data and some transactional data. You'll see we have gender, recent activity, satisfaction, and marital status, and the data that we're using here contains the outcome that we're going to predict, in this case that last field, churn, which is colored blue. So the historical data contains known outcomes, which in this case indicate a customer who has churned, where churn is equal to T, and those who haven't left, where churn is equal to F. So we can build a model here to make predictions and determine when each outcome is most likely to occur.

Generally we'll start by splitting the historical data into two sets: the training set, which we'll use to build the model, and the testing set, also known as the validation set, which we use to test the model. This approach allows the model to be built and compared against similar but not
identical data for validation purposes, to minimize overfitting the model.

The next step is to build the model using the training set. An algorithm is applied to the training data to construct a set of rules that predict the value of this churn attribute, which is our target variable. That could also be called the predicted value or the dependent variable. The remaining variables are then used to construct the prediction logic, and those inputs are also called predictors or independent variables. So the result of this training process could be a set of rules that uses those input variables, the predictors, to generate a prediction for that target, which in this case is whether or not the customer is likely to churn. One technique may be a decision tree, which is what we can see at the bottom here.

Once this model is built, it needs to be tested against the testing or validation set. This is done by feeding the data through the model using the same attributes as before to generate a predicted value for that target variable, which in this case is churn. That predicted value is then recorded and compared to the actual values. Looking at that first case, the customer churn value was true, and the predicted value, that white value right at the end there, has also come up true, so it's been predicted correctly. The same with the second case, where it was predicted that the customer would not churn. The third case was predicted incorrectly, where they did in fact churn and the model predicted that they wouldn't. So if we take the ratio of correct predictions to the total number of actual outcomes, we can compare the accuracy of this model against others.

I often get asked how accurate a model needs to be to be used, and I guess the answer is that it really depends, but it also has a disclaimer attached to it. If we're talking about medical research or public safety, or an area that could affect the life of someone or general safety, then your model needs to be as
accurate as it possibly can be. Now, if we're talking about an area that doesn't involve any one of those, it just has to be better than what you're doing now. For example, if you're running a direct mail campaign and you need to create a list of prospects to target, and your response rate is currently two and a half percent, and you can create a model that's only forty percent accurate but it means you can get a response rate of five percent and you can mail to far fewer people, then it's helped you in two ways: it's increased your response rate, and it's also decreased the campaign cost, which in turn improves your ROI dramatically. So the accuracy doesn't need to be really high, it just needs to be better than what you're doing, unless of course you're dealing with medical research or public safety areas.

After we've built the predictive model, we can gain some insight into what's going on by examining the model, but the real value is in the deployment of the model, which is what we call scoring. In this example we can take some new data, which may be a list of customers that are up for renewal in the next month or next time period, and we want to predict which of those are likely to churn and go elsewhere. The data we feed into this model has to have the same attributes that we used in the creation of the model, but in this case we don't know the outcome. These are all current customers, but we don't know which ones are likely to churn or voluntarily cancel their contracts. So we can take this new data and feed it through that model, and the result is going to be a prediction for each customer as to whether or not they're going to churn. Those that are predicted to churn might then be contacted by the retention team and given a special offer, while those that are not may not even be followed up, or they may get an alternative offer for an additional product as a cross-sell. So this scenario uses a group of
cases to be assessed, which is known as batch scoring. This approach is appropriate when there's enough lead time before an event takes place, such as next month's renewals or a direct mail campaign. In the case where time is critical, one record may be used at a time, and in this case we're looking at real-time scoring. An example here is if someone was purchasing items online and submitting a basket, and they were presented with a relevant cross-sell offer specific to the goods that they just purchased, or when someone was lodging an application for credit online, having a risk assessment done to determine how likely they are to default on a loan.

This concludes today's session on predictive analytics. Thanks very much for viewing, and please feel free to leave a comment below or connect via Twitter or LinkedIn to hear about new sessions or to request a future topic. Cheers.
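
The split, train, test, and score steps described in the session can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the presenter's method or tool: the eight customer records are made up, the field names (`gender`, `satisfaction`, `churn`) only loosely follow the slide, and the "model" is a one-rule decision stump standing in for the decision tree mentioned in the talk. All function names are hypothetical.

```python
from collections import Counter

# Hypothetical historical records; "churn" is the known outcome (T or F).
HISTORY = [
    {"gender": "F", "satisfaction": "low",  "churn": "T"},
    {"gender": "M", "satisfaction": "low",  "churn": "T"},
    {"gender": "F", "satisfaction": "high", "churn": "F"},
    {"gender": "M", "satisfaction": "high", "churn": "F"},
    {"gender": "F", "satisfaction": "low",  "churn": "T"},
    {"gender": "M", "satisfaction": "high", "churn": "F"},
    {"gender": "M", "satisfaction": "low",  "churn": "F"},
    {"gender": "F", "satisfaction": "high", "churn": "F"},
]
TARGET = "churn"  # the target (dependent) variable


def split(records, train_fraction=0.75):
    """Split into a training set and a testing (validation) set.
    Real projects usually shuffle first; order is kept here for repeatability."""
    cut = int(len(records) * train_fraction)
    return records[:cut], records[cut:]


def train_stump(train, predictors):
    """Pick the single predictor whose per-value majority rule fits the training set best."""
    default = Counter(r[TARGET] for r in train).most_common(1)[0][0]
    best = None
    for attr in predictors:
        by_value = {}
        for r in train:
            by_value.setdefault(r[attr], Counter())[r[TARGET]] += 1
        rules = {value: counts.most_common(1)[0][0] for value, counts in by_value.items()}
        correct = sum(rules[r[attr]] == r[TARGET] for r in train)
        if best is None or correct > best[0]:
            best = (correct, attr, rules)
    _, attr, rules = best
    return {"attr": attr, "rules": rules, "default": default}


def score(model, record):
    """Real-time scoring: predict churn for one record."""
    return model["rules"].get(record[model["attr"]], model["default"])


def batch_score(model, records):
    """Batch scoring: predict churn for a group of records."""
    return [score(model, r) for r in records]


def accuracy(model, test):
    """Ratio of correct predictions to the number of test cases."""
    predictions = batch_score(model, test)
    return sum(p == r[TARGET] for p, r in zip(predictions, test)) / len(test)


train_set, test_set = split(HISTORY)
model = train_stump(train_set, ["gender", "satisfaction"])
test_accuracy = accuracy(model, test_set)

# New customers up for renewal: same attributes, outcome unknown.
renewals = [
    {"gender": "F", "satisfaction": "low"},
    {"gender": "M", "satisfaction": "high"},
]
renewal_predictions = batch_score(model, renewals)
```

On this toy data the stump splits on `satisfaction` and gets one of the two held-out cases right, which mirrors the point made in the session: the model is judged on data it was not built from, and the same model is then reused to score new customers either as a batch or one record at a time.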

24 Comments

  1. Wewerton Mendonça said:

    Thanks for sharing your knowledge!

    May 23, 2019
  2. Daniel Mathew said:

    anyone from god friended me

    May 23, 2019
  3. Sanapala Srikanth said:

    Global Predictive Analytics Market – Size, Outlook, Trends and Forecasts (2019 – 2025)

    The predictive analytics market value is estimated to reach up to $13.04 billion by the end of 2025 with a CAGR of 22.8% during the forecast period from $4.23 Billion

    in 2018.

    Keyplayers:

    @Oracle

    @HP

    @Dell

    @SAP

    Download a Sample Report at:- https://www.envisioninteligence.com/industry-report/global-predictive-analytics-market/?utm_source=youtube-Srikanth

    May 23, 2019
  4. Jevin tan Jin yi said:

    Try not to whisper next time

    May 23, 2019
  5. Saif Ali said:

    For learn Predictive analytics must visit kics.edu.pk/web

    May 23, 2019
  6. Hank Janké said:

    Let me summarize this: stereotype the shit out of people because, computers

    May 23, 2019
  7. Jm Pl said:

    In certain fields PA could also be considered formulae for creating specific desired outcomes .

    May 23, 2019
  8. Thiago Dalboni said:

    @bradhill14 Brad, thanks for the great video. In the process of building the model, is there a rule of thumb for separating the data into training and testing sets, in order to avoid biased data in one or both sets?

    May 23, 2019
  9. Anita Barb said:

    Very informative and easy to follow. Thanks!

    May 23, 2019
  10. Xiaoxiao Lei said:

    GREAT JOB!

    May 23, 2019
  11. Meyyappan Lakshmanan said:

    Just loved the presentation

    May 23, 2019
  12. Chandra G said:

    Thanks Brad – pretty good content and presentation.  Enjoyed it.

    May 23, 2019
  13. alvrChowdary said:

    Precise and perfect. Thank you

    May 23, 2019
  14. Bruno CATHERIN said:

    Nice content and a clear explanation of something which is becoming our future

    May 23, 2019
  15. venkatesh sree said:

    Very informative. Thank you. Is it possible for you to explain a few tools that perform predictive analytics?

    May 23, 2019
  16. Menna Mahmoud said:

    i am working on a graduation project about predictive maintenance could you help me in algorithms and data sets ??
    my mail : [email protected]
    i will be grateful

    May 23, 2019
  17. Dowta Dibalaba said:

    good

    May 23, 2019
  18. Radoslav Dzhidzhov said:

    Interesting and instructive 🙂 Well done, Brad !

    May 23, 2019
  19. Ak Abdulla said:

    Thank you. A good presentation on the reality around us which opens great opportunities for the world to make it a better place to live. Thank you again.

    May 23, 2019
  20. Valter Herman said:

    This was a great introduction, Brad!  I'd love to see some examples of how algorithms and eventual models are built.

    May 23, 2019
  21. Arpana Pathak said:

    The video's initial slides are good… but I didn't understand the "testing" and "training" slide… and its final implementation

    May 23, 2019
  22. Himanshu Gupta said:

    Good one…able to understand through slides.

    May 23, 2019
  23. manu khandelwal said:

    Thanks…Thats was good..

    May 23, 2019
  24. Johan Peralta said:

    Excellent.

    May 23, 2019
