Forecasting, Predictive Analytics and Big Data



Thanks for joining me as we take a look at forecasting, predictive analytics, and big data. Now, I'm betting you know what forecasting is if you're watching this video, but do you know what big data is? You may think so, but if all you've heard is the industry-standard four V's definition, then you don't have a good working idea of big data, and you won't have a real sense of how big data does and doesn't impact forecasting. I've covered this material in more detail elsewhere, so if you've already seen my Big Data and SQL or Big Data Beyond the Hype presentations, just skip ahead to around the 8-minute mark.

If you haven't, chances are this is what you think of as big data. The four V's is pretty much the industry-standard definition of big data: if you have a lot of volume, a lot of variety (particularly unstructured data, or the rather unhelpful category of semi-structured data), lots of velocity with data streaming in real time, and data quality issues, well, then you might have big data. It's not a very useful definition, not least because we've always had some combination of these issues, and we've always been able to handle them in our traditional relational database and analytics worlds. If Moore's law says processing power doubles every two years, my law says we've always had a little more data than we have had processing power. And yes, unstructured data is a challenge, but it's not at the heart of most big data problems. Ditto for velocity: doing anything in real time is hard, but it's hard from an IT perspective, not from a business or analytics perspective. And double ditto for data quality: I've been doing analytics for nearly 30 years, and I've never met a data set that wasn't dirty. There's nothing interesting in any of this from a forecasting perspective, because it doesn't matter how many rows you have, it doesn't matter how many different sources create columns, and it doesn't even matter how fast they're changing; it's all the same from a forecasting perspective. So if this is all there is to big data, it might be of interest to IT folks, but as a concept it seems pretty unimportant to analysts and forecasters.

But that isn't all there is, or even what there is at all. I think real big data emerges when you drop down a level in granularity and detail. This isn't so much a matter of having more data, though that is often a consequence; it's that the data now lives at a level where your traditional unit of analysis doesn't exist or doesn't have meaning. Indeed, the data often exists at a level where an individual measurement taken on its own has no meaning at all. When that happens, you have to get the meaning from understanding the relationship between events: specifically, the sequence, time, and pattern of those events.

Here's an example of what I mean from my world, digital data. In the first case, the customer saw an ad, visited the web site, and called: a classic pre-sales pattern. In the second case, the customer purchased, then visited the web site, then saw an ad, and then called. The sequence here is much less enlightening, and it's unclear what will come next. What's critical to realize is that, regardless of how you interpret this sequence, whether the ad appeared before or after the sale tells us everything meaningful about whether the ad might have worked. When you're analyzing a series of events, sequence matters.

But it isn't just sequence; time is almost always a big factor in event analysis. Here's another example from digital advertising: the lag between when you saw an ad and when a conversion event happens is critical for understanding how heavily we weight the influence of that ad.
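To make that concrete, here's a minimal sketch of what time-based weighting might look like in practice. The exponential decay function, the half-life parameter, and the event structure are all my own illustrative assumptions, not anything specific from this talk; the point is simply that both the lag and the sequence (before or after the sale) drive the weight.

```python
from datetime import datetime
import math

# Hypothetical event stream for one customer: (timestamp, event_type).
events = [
    (datetime(2016, 3, 1, 9, 0), "ad_impression"),
    (datetime(2016, 3, 1, 9, 45), "site_visit"),
    (datetime(2016, 3, 3, 14, 30), "conversion"),
]

def ad_influence(events, half_life_hours=48.0):
    """Weight each ad impression by how long before the conversion it
    occurred, using a simple exponential time decay (an illustrative
    choice, not a standard). Impressions *after* the conversion get
    zero weight -- sequence matters, not just proximity in time."""
    conversions = [t for t, e in events if e == "conversion"]
    if not conversions:
        return {}
    conversion_time = conversions[0]
    weights = {}
    for t, e in events:
        if e != "ad_impression":
            continue
        lag = (conversion_time - t).total_seconds() / 3600.0
        if lag < 0:
            weights[t] = 0.0  # ad shown after the sale: no credit
        else:
            weights[t] = math.exp(-math.log(2) * lag / half_life_hours)
    return weights

print(ad_influence(events))
# The 9:00 impression, ~53.5 hours before the conversion, gets a weight
# of roughly 0.46; an impression after the sale would get 0.0.
```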
And this isn't just limited to digital advertising. Almost any time you have a sequence of events, whether it's website pageviews, smart meter readings, Internet-of-Things health readings, or ad impressions, time is going to be a critical variable.

You can really think about sequence and time as a subset of a broader theme: pattern. Patterns can be time- or sequence-based, but there are other sorts too; there might be patterns around trend, velocity, timing, or position that are significant depending on the dataset or application you have. Here I've taken a consumption problem and shown two different patterns, one relatively steady and the other varying by day of week. Note that each pattern has the same average, but the forecast would be different for each. Note too that in a pattern like this, different smoothing techniques will yield different answers, but none is likely to be as accurate as an analysis based on understanding the actual pattern. Typically we'd solve a problem like this by creating two different forecasts. That approach still leaves us in a world that's pretty much the same from a forecasting perspective, but not every problem is amenable to that kind of treatment. We'll take a closer look at how we handle individual variation and pattern detection in a big data world when we dive into forecasting.

This shift to understanding patterns in data is true in every paradigm case of big data, from smart meters to digital (my own field) to CRM data. You can take smart meter data, add it up, and generate a forecast exactly like you used to with a three-month read; that's not really big data, though. Big data forecasting is when you use patterns in the data to answer forecast questions you couldn't before. And that shift to understanding patterns matters a lot, because most of our traditional tools aren't meant for this kind of work. In particular, SQL is very poor at incorporating sequence, time, or complex patterns into queries: it has no real capabilities for identifying patterns and no means of querying them in a performant manner. Sure, you can use SQL for this; as I sometimes say, you can use SQL for almost anything, but it's the wrong tool for the job. Similarly, many of the most common tools in our stats toolbox, things like correlation and regression, assume that the unit of meaning in the analysis is identical with the unit in the data. And our time-series techniques aren't really suited either; they're mostly designed to look at events aggregated into discrete units of time, and that's not usually what we have when it comes to big data. Finally, joins with detailed stream data aren't usually equi-joins, the kind of join we do when, for example, we append census data to a customer record. The vast majority of the joins we do in traditional analysis are equi-joins, but equi-joins don't work on unaggregated detailed data, making data integration far harder. And here's the thing: between SQL, our core stats tools, and equi-joins, that's about 99 percent of what we do in traditional statistical forecasting. If it's all broken in a big data world, no wonder it's hard.
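As a hedged illustration of that join problem: where an equi-join matches rows on exact key equality, event data usually needs a time-conditioned join. Here's a minimal sketch using pandas' merge_asof, which matches each conversion to the most recent prior ad impression per customer; the column names, toy data, and 7-day tolerance are my own assumptions.

```python
import pandas as pd

# Hypothetical event streams (column names are illustrative).
impressions = pd.DataFrame({
    "customer_id": [1, 1, 2],
    "ts": pd.to_datetime(["2016-03-01 09:00", "2016-03-02 12:00",
                          "2016-03-01 10:00"]),
}).sort_values("ts")
impressions["impression_ts"] = impressions["ts"]  # keep matched time visible

conversions = pd.DataFrame({
    "customer_id": [1, 2],
    "ts": pd.to_datetime(["2016-03-03 14:30", "2016-03-01 09:30"]),
}).sort_values("ts")

# A time-conditioned, non-equi join: each conversion is matched to the
# most recent *prior* impression for that customer, within a 7-day
# window. A plain equi-join on customer_id would pair every impression
# with every conversion and lose the sequence information entirely.
matched = pd.merge_asof(
    conversions, impressions,
    on="ts", by="customer_id",
    direction="backward",            # only impressions before the conversion
    tolerance=pd.Timedelta("7D"),
)
print(matched)
# Customer 2 converted before seeing any ad, so impression_ts is NaT.
```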
So that's what big data is, and if you didn't know before, now you know: big data happens when you take analytics down to a level where individual events aren't the fundamental unit of analysis, and we have to get meaning by studying their patterns and relationships over time. But what does this mean for forecasting and predictive analytics?

There are three common scenarios that challenge analysts and forecasters when it comes to big data. The first is when you have true unstructured data, usually text; I'll explain why this is a big data problem of a slightly different sort. The second common scenario is when you're trying to move from an aggregate forecast to aggregating individual forecasts; this is the most common big data scenario in forecasting, and it's pretty interesting. Finally, there are the cases where the data requires complex patterning before it can be understood at all; this is the case in things like the Internet of Things.

The first of these, text analytics, is important and common. If you need to work with unstructured text data, your problem is generating meaning from text. It is a problem in pattern identification, but it's a problem that requires tools specifically designed for the task. We often work with tools that are simple regex or wildcard-based pattern-matching tools, and for really understanding text, they suck. Text is a unique kind of pattern that requires a linguistic approach: something that can recognize key language constructs and how they're related. Any time you're working with raw text, your biggest task will be to structure that text into traditional categories that isolate what the text is about and perhaps the sentiments behind it. Once you've done that, your problems are mostly over; that now-structured data typically aggregates cleanly and can be used in a traditional manner. You'll need different tools to handle text analytics, and it's always a lot of work, but once you've categorized the data it doesn't usually change your forecasting or predictive techniques that much. The big data pattern-matching aspects of text analytics are all in the upfront data transformation.

That's not true in our second case, where we quite literally change the way we build forecasts. Remember that weekday/weekend slide I showed earlier? We've already touched on a situation where individuals might have different weekday and weekend patterns, a situation that in forecasting we usually handle by generating two separate forecasts, one for each pattern. But it's not always that simple. Consider this situation: we have a sample of three different consumers, and each has a different pattern of consumption, with the implication that there are lots of possible patterns living at the individual level of the data. I've labeled the patterns increasing, oscillating, and seasonal, and I don't mean to suggest these labels are absolute; C3, for example, might have just decreased and then held steady. If you have a bunch of customers with a variety of patterns, then the question is: what's the best technique for creating a forecast? Unlike the weekend/weekday case, these patterns aren't uniform, and they don't align by some exogenous factor that would allow us to sort every customer. We can't just produce a forecast for every pattern, not only because there'd be too many patterns, but because they vary over different time periods. Traditionally, we ignore this level of detail in the data; instead, we simply sum the periods across all customers and then generate a forecast using either smoothing or modeling techniques. Those techniques work in aggregate, and they will work here too. They'll work just fine provided the underlying mix of consumers and their patterns isn't changing that much; if it is, they may miss significantly. In addition, those aggregates introduce increasing amounts of error into the forecast as you try to refine that forecast into finer-grained segments, especially if you're trying to set specific expectations like forecasting revenue for a set of accounts.
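Here's a minimal sketch of that contrast, with entirely invented toy data: a few customers with rising consumption, a few declining, a trailing-mean stand-in for aggregate smoothing, and a per-customer trend fit. None of this is from the talk; it just shows why a shifting mix hurts the aggregate approach.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 8 periods of consumption for a shifting mix of customers
# (3 growers ramping up, 2 decliners ramping down). All values invented.
growers   = np.array([[10, 12, 14, 16, 18, 20, 22, 24]] * 3, dtype=float)
decliners = np.array([[24, 22, 20, 18, 16, 14, 12, 10]] * 2, dtype=float)
customers = np.vstack([growers, decliners]) + rng.normal(0, 0.5, (5, 8))

# Aggregate approach: sum everyone, then forecast the next period with
# a simple trailing mean (a stand-in for smoothing).
totals = customers.sum(axis=0)
aggregate_forecast = totals[-3:].mean()

# Individual approach: fit a trend line per customer, project one
# period ahead, then sum the individual forecasts.
periods = np.arange(8)
individual_forecasts = []
for series in customers:
    slope, intercept = np.polyfit(periods, series, deg=1)
    individual_forecasts.append(slope * 8 + intercept)
individual_sum = sum(individual_forecasts)

print(f"aggregate forecast:          {aggregate_forecast:.1f}")
print(f"sum of individual forecasts: {individual_sum:.1f}")
# With 3 growers and 2 decliners, the total drifts upward; the
# trailing-mean aggregate lags that drift, while the per-customer
# trend fits capture it -- and they can be re-summed at any segment
# level (e.g., growers only) without re-modeling.
```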
This part is important: big data doesn't change the nature of most forecasting, and it doesn't mean that everything you've been doing is wrong. It's almost always a lot easier to build a forecast without going down to the level of individual forecasts. But if you do go down to that level of detail, you may be able to create more accurate forecasts in times of significant market and customer change, and you can slice your forecast into finer-grained segments with more accuracy.

It's obvious that you'll only be able to create an individual forecast when you have enough data points for each individual. In fact, you'll have to think about how to handle the forecast for cases where you don't have enough detail to build any pattern. It may be that you'll need a default forecast technique for those folks; note that I didn't say a default forecast. It's likely that the group that doesn't have enough behavior will still have enough identifying characteristics for some segment-driven forecasting that will improve your individual accuracy. And after all, remember that the group of low-behavior or new individuals is most definitely not representative of your business. For other folks, you can build forecasts via segmentation (clustering people and then building forecasts per cluster segment), simple regression-line models, momentum and smoothing techniques pretty much similar to what we might do in the aggregate, and pattern-matching techniques to try to isolate the types of patterns I've shown in the previous slide. Once you've built a forecast for each record, you can sum those forecasts to generate any level of aggregation. You might also think about using probabilistic techniques here to generate a forecast that includes your understanding of potential variation; particularly in scenarios where small subsets may drive huge changes, things like B2B CRM prediction, this is probably a good idea.

Finally, we have the scenario where our data is essentially meaningless without patterning. What I have in mind here are cases like Internet-of-Things data, fitness monitoring data, equipment data, in-store tracking data via Wi-Fi signal, that sort of thing. In all of these cases, the raw data is almost completely unusable for analysis. Before it can be used, the deep patterns have to be identified so I can tell, for example, whether a series of fitness device readings indicates someone was hiking, biking, cooking, or riding an elevator. This scenario is big data in its purest form, and this is true data science: before anyone can really use this data, some pretty deep analysis of the patterning sort has to happen. It's also not that different in some respects from the text analytics case; you need specialized tools and knowledge to work at this level. What comes out of this level is then used by people building predictive analytics or forecasts. It's important to realize that it's usually not the same folks or the same techniques working at each level.
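To give a flavor of that patterning step (and only a flavor: the window size, features, labels, and classifier are all my own illustrative choices, and real work on this kind of data often uses deep learning, as I note below), here's a minimal sketch that turns raw sensor windows into labeled activities with a simple scikit-learn classifier.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

def featurize(window):
    """Reduce a window of raw readings to summary features. An
    individual reading means nothing on its own; the pattern across
    the window is what carries the signal."""
    return [window.mean(), window.std(), np.abs(np.diff(window)).mean()]

# Simulated accelerometer-magnitude windows (60 readings each) for two
# hypothetical activities: calm, low-variance 'elevator' traces vs.
# noisy, high-variance 'biking' traces. Entirely synthetic.
def make_windows(n, base, noise):
    return [base + rng.normal(0, noise, 60) for _ in range(n)]

windows = make_windows(50, 1.0, 0.05) + make_windows(50, 1.2, 0.4)
labels = ["elevator"] * 50 + ["biking"] * 50

X = np.array([featurize(w) for w in windows])
clf = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, labels)

# Classify a new, unseen window of raw readings.
new_window = 1.2 + rng.normal(0, 0.4, 60)
print(clf.predict([featurize(new_window)]))   # -> ['biking']
```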
So let's sum up. Big data isn't about having a lot of data; it's about having data at a level of detail where the meaning resides in the interpretation of the pattern of events. When that happens, a lot of our traditional IT and analytic machinery breaks down: SQL isn't performant, equi-joins don't work, and the most common statistical toolkit doesn't apply. That means, too, that most of our traditional forecasting techniques either don't apply or don't take advantage of the data. You can aggregate the data to produce the same old types of forecasts, but that's not going to improve your accuracy or your models. So there's an extra step to making this kind of data useful, and that extra step nearly always involves some form of patterning in the data analysis. Finally, there are three common scenarios of that patterning: patterning text using linguistic tools, patterning individual forecasts using fairly traditional techniques, and patterning complex machine data that's non-linguistic, usually with deep learning techniques. When you're done with that extra step, you can forecast away.

So happy predicting; I forecast lots of interesting work in your future. Here's my contact info; feel free to reach out to me and my team about big data, forecasting, predictive analytics, or all things digital. And I can't resist mentioning my new book, Measuring the Digital World; if you're looking for a deep study of the discipline of digital analytics, check it out. Thanks for listening.
