Data Visualization Tools – Predictive Analytics(tm)



With ADVIZOR's Visual Discovery software, a business person can easily display 5, 10, 20, or even a hundred or more metrics from one or more data tables. Being able to display many metrics is great for navigation and exploration. For correlation, however, a human being can only grasp patterns across 5, 10, or maybe 15 different things. That's where predictive analytics comes in: it uses math to determine the causal factors behind something you've selected, and that complements what you see visually. Let's see how it works.

In this case we're working with the same data set that we've shown in several of the other short videos. It's a set of roughly 10,000 customers comprising 496 million dollars over the last year, and here's a list of all the customers. We've run a campaign against those customers: the yes responders were 10.2 percent, and roughly ninety percent did not respond to the campaign. We've looked at them by region, and forty-one percent, or 203 million dollars, came from the West region, a smaller amount from the South, and so forth, so this customer base is skewed towards the West. By industry, the biggest industry is finance and insurance, which comprises 16% of the total 496 million dollars of revenue, or 76 million out of the 496.

We've also taken the customers and spread them from low to high revenue and low to high margin. Each dot represents a customer, and this quadrant up here holds the customers that have high margin but low revenue. They're colored by region, so a blue customer up here would be one in the East, the gold is one in the South, and so forth. What we're trying to do is understand from the campaign how to take the high-margin accounts with lower revenue and drive them to the right so they have higher revenue. That was the purpose of this campaign. So we want to explore the high-margin, low-revenue accounts who responded yes to the campaign. To do that, first we've got to select the high-margin, low-revenue accounts, so we sweep over them
on the scatter plot with the mouse. We grab the accounts that are roughly fifteen percent or more in margin and under 120 thousand dollars in revenue over this period. That's roughly fourteen hundred accounts. Now let's set the selection so we intersect-select with the responders. So now I'm looking at the yes responders in that high-margin, lower-revenue category. Here they are: there's roughly 130 or so accounts, and I can drop down to just look at those. I see a skew towards the East, which is different from my overall population, which is skewed towards the West. Here the East comprises 55% of the revenue in this group, and it's concentrated in the government industry, so this is a very different profile from my overall population. Let's bring it back, and I'm going to create a bookmark up here so I can come back to this state later on. I'm going to call this "high margin low rev responders".

So I've seen visually that this data is skewed towards the East and the government industry, but what else matters? If I'm trying to profile these accounts to take action on them, what else would I need to know? Let's run some predictive analytics. I go to my task view, go down to Analyze Data, and click Predictive Modeling. It opens up a pane, and I'm going to run a new model, so I click New Model. I'm going to name the model "high revenue low margin responders". Then I make sure it's on the right table, the customers table. I can then target any of the fields that came in with the data, or I can target what I've selected visually, which is what I want to do: the intersection of the high-margin, low-revenue customers who responded to my campaign. All of the columns are possible explanatory variables for this, but I need to take off the things I've selected, or the model will tell me that's what's determining the subset. So I go down and take off response to the goal bundle campaign, I need to take off margin, and I need to take off revenue. Click OK. I'm now going to run a multivariate
regression model against this selected data. These are Vapnik algorithms from a French company, KXEN, which we've OEM'd into our product. It's rich multivariate regression. It runs a test model behind the scenes to validate that the data is complete enough, and then it runs a full model and comes back with the results. There are two little indicators. Is there enough information, are there enough rows of data? Yes, there are. Are there enough dimensions to properly analyze? Yes, there are.

So, the factors that came back. City explains 32.8% of this population, and it's Albuquerque, and the Bronx and Brooklyn, New York, so I've got one in the West and a couple in the East, followed by Baltimore and St. Louis. Then there are some cities down here with negative impacts: these are the ones with positive impact, and all of these have negative impact. I can see the impact of city as a factor by itself just by selecting it here, and I'll see it over here. So if I put it back on replace selection and select the city, I can see over here that it is explaining it fairly well by itself, but there are clearly other factors.

Let's go back to the bookmark, back to the original selection, and look at some of the other factors. The platinum credit card explains 8.9 percent of this population: if they've bought this platinum credit card, they're likely to be in this category; if not, they're not likely to be. Company size, the number of employees, came in third. Scrolling over this, it looks like 93 to 197 is a positive correlation, and one to ninety-two, so the smaller companies are generally in this sector, and it looks like the bigger companies aren't. That's an insight I hadn't even thought of before, so let's actually look at company size. Maybe that's a factor I want to bring up, and as I go through this, that's the kind of thing you would do. So let's modify the dashboard and bring up another factor. I can bring up a bar chart. I'm going to title it
on the bottom here, put the bar chart on the customers table, put the field on customer ID, and size it by the employees. So now I have the set of customers: there are all my customers and the selected ones. It's a little hard to see with the WebEx here, but they tend to be the smaller customers down in here. I can now exclude everything that's not in this category and drop down. Here is the profile by company size of my high-margin, low-revenue responders to the campaign. There's a handful of bigger customers, over 750 employees, who are in the West, but then it drops off precipitously, and everything in the East, the Midwest, and so forth are these smaller customers. They're generally under, this is 200 employees right about here, then it drops off precipitously, and there's a lot of 50- and 25-person companies in here. So this campaign has been responded to primarily by small companies in a few industries in the East. This actually may not have been the kind of campaign I wanted to run, if that was the impact of it. So I've learned something about this campaign, and this is maybe something I never would have even thought of looking at visually.

Now that I have this model, there's a couple of things I can do with it. Obviously I've got the factors out of it, and I can look at the other dimensions like I have here, so I've got a better characteristic description of my population of these high-margin, low-revenue yes responders. I also have a model. So if I have another region where I haven't run this campaign and I'm trying to predict response to the campaign, I can bring that data in and predict against it, and it will score the data set I brought in as to its propensity to be in this quadrant. And obviously I also now have the list of customers. Let's go back to the prime dashboard here. I can export this list to a mail engine or a campaign engine and take action on it. So I've kind of gone full cycle here, from bringing in and looking at
my customer base, and then trying to understand visually, and then with multivariate regression, what mattered to this campaign I was running. I have some insights, and I can now use this on another data set and follow up that way. Thank you very much for your time and attention.
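
The selection steps in the walkthrough (sweep the high-margin, low-revenue accounts, intersect with the yes responders, then take the selected fields off the list of explanatory variables) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows, field names, and function names below are hypothetical, and a simple per-value lift calculation stands in for the product's regression model, which is not reproduced here.

```python
# Illustrative rows only: field names and values are made up for this sketch.
customers = [
    {"id": 1, "region": "East",  "industry": "Government", "margin": 0.22, "revenue": 80_000,  "responded": True},
    {"id": 2, "region": "East",  "industry": "Government", "margin": 0.18, "revenue": 60_000,  "responded": True},
    {"id": 3, "region": "West",  "industry": "Finance",    "margin": 0.25, "revenue": 90_000,  "responded": False},
    {"id": 4, "region": "West",  "industry": "Finance",    "margin": 0.10, "revenue": 300_000, "responded": True},
    {"id": 5, "region": "East",  "industry": "Retail",     "margin": 0.30, "revenue": 110_000, "responded": True},
    {"id": 6, "region": "South", "industry": "Retail",     "margin": 0.05, "revenue": 500_000, "responded": False},
    {"id": 7, "region": "West",  "industry": "Finance",    "margin": 0.20, "revenue": 400_000, "responded": False},
    {"id": 8, "region": "South", "industry": "Government", "margin": 0.16, "revenue": 50_000,  "responded": False},
]


def sweep_select(rows, min_margin=0.15, max_revenue=120_000):
    """IDs of accounts inside the swept rectangle (thresholds from the talk)."""
    return {r["id"] for r in rows
            if r["margin"] >= min_margin and r["revenue"] < max_revenue}


def yes_responders(rows):
    """IDs of accounts that responded yes to the campaign."""
    return {r["id"] for r in rows if r["responded"]}


def explanatory_fields(rows, exclude):
    """Candidate fields, minus the ones that defined the selection itself."""
    return [f for f in rows[0] if f != "id" and f not in exclude]


def lift_by_value(rows, target_ids, field):
    """Target rate for each value of `field`, divided by the overall target rate.

    A stand-in for the regression in the talk: values above 1 are
    over-represented in the target group, values below 1 under-represented.
    """
    if not target_ids:
        return {}
    base_rate = len(target_ids) / len(rows)
    totals, hits = {}, {}
    for r in rows:
        value = r[field]
        totals[value] = totals.get(value, 0) + 1
        hits[value] = hits.get(value, 0) + (r["id"] in target_ids)
    return {v: (hits[v] / totals[v]) / base_rate for v in totals}


# Intersect the visual selection with the yes responders.
target = sweep_select(customers) & yes_responders(customers)

# Take off margin, revenue, and response, as in the talk; otherwise they
# would trivially "explain" the subset they were used to define.
fields = explanatory_fields(customers, exclude={"margin", "revenue", "responded"})

lifts = {f: lift_by_value(customers, target, f) for f in fields}
for f in fields:
    print(f, lifts[f])
```

Leaving margin, revenue, or response in the field list would make them dominate any ranking, since the target group was defined by them. That is the reason given in the talk for taking them off before running the model.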

One Comment

  1. Daniela Georgieva said:

    Very interesting, thanks for the nice presentation

    June 29, 2019
