Simplifying Predictive Analytics with Power BI by Devin Knight



Welcome, everyone, to the Global Power BI community. My name is Sarah Genesis, and today we're going to have a session from Devin Knight of Pragmatic Works. First, we're going to give some highlights of our global Power BI community. Our community always has a session from an expert from the industry, and we have presenter and leader participation worldwide, with a presence in different countries. Our Facebook group has [count unclear] members, we have thousands-plus LinkedIn members, and we have 600-plus members on our community page.

I want to mention some upcoming sessions, so you guys need to start registering for them. The next session is going to be Power BI Sharing and Security Unleashed with [name unclear], and the session after that is going to be [title unclear] using Power BI with Leila. These sessions are going to happen on the 31st and in November.

And now, today we're going to be talking about Simplifying Predictive Analytics with Power BI with Devin Knight. Devin is a Microsoft MVP and the training director of Pragmatic Works Consulting. He's an author of six SQL Server books and a speaker at conferences like PASS Summit, PASS Business Analytics Conference, SQL Saturdays, and Code Camps. He's also a contributing member of the PASS Business Intelligence Virtual Chapter, and Devin is president of the local user group in Jacksonville.

How can you guys connect with us? You can connect on Twitter at [handle unclear], and also on Facebook, LinkedIn, and the Power BI community site. If you want to get all the updates from our community, we encourage you to go and register on our community page at [URL unclear]. And if you want to present for our community, you can reach out to [contact address unclear]. With this, I hand it over to Devin, and then we can start with the presentation. Thank you, everyone.

Sounds good. Let me share my screen, and let me know when you can see my screen. Hopefully it's showing the one I
want; it didn't really ask me which one. Perfect, you're seeing my slide deck, is that right? All right, well, thank you for having me, and thank you, everyone, for joining; a few people are still coming in.

In today's session we're going to be focusing, again, on R and predictive analytics. A lot of our session is actually going to be just showing what you can do with the R language and Power BI. Some of that isn't necessarily even predictive analytics, but it's things you can use to extend what Power BI has available to it using the R language. We're going to see a little bit of both: some of it just from a usability perspective, how Power BI and R work together, and then we're also going to look at how R can be used to enhance and go beyond what's typical in Power BI.

Thank you for the great introduction. If you're not familiar with who I am, like Eric said, I am the training director here at Pragmatic Works. That means I work with a lot of instructors; actually, we have at least one of them on the call here today, Thomas is one of our instructors. I work with a lot of our instructors, some of our internal folks and external folks. If you're ever interested in developing courses for us yourself, let me know; that's something we're interested in working with you on.

We do things like free training webinars like this every Tuesday, and we also do paid training as well. We're actually up to eight Power BI courses, and we're hoping by the end of the year we'll have ten different Power BI courses. Those vary from the basics of using Power BI Desktop, to an advanced Power BI course we're working on right now, to a Power BI administration course, to one for the data consumer, so one that's not really for those that do development but more for those end users that need to be able to interact with Power BI and understand how it works. So we have a lot of those, and we also have all
the kind of Excel variations of Power BI as well, so things like Power Pivot and Power Query; we have courses on those as well. So we do a lot of training, a lot of data training, not just Power BI; we actually have 30-plus courses in total.

I'm also a Microsoft Data Platform MVP, which means something, I guess, and I have also authored several books, all of them having to do with SQL Server, not on [inaudible]. I also run a local user group in Jacksonville, Florida. It's called the Jacksonville SQL Server User Group, and I've been running that for about five or six years now. We just had our big event a few months ago, actually in August, a SQL Saturday event. We had about 500 folks come out and learn all about how to work with data with Microsoft technologies.

I also blog at a website called devinknightsql.com. I do quite a bit of blogging there. If you follow any activity on the Facebook group, I usually post at least one thing in the Facebook group once a week, because I blog at minimum once a week about Power BI and I oftentimes post that in the Facebook group. It's typically around the Power BI custom visuals and what kinds of things you can do with those visuals. Because there are really no instructions on how to use them, I kind of took it upon myself to do a blog series on all the custom visuals that are available.

All right, so our plan for today. We have maybe about 50 minutes here together, and I have several demos; it's gonna be a really demo-intensive session. I have one slide after this one, and past that slide we're gonna spend a lot of time showing you what's possible with R, what makes R so special, and how you can enhance what you can do with Power BI. So I have one more slide after this, but then it'll be all demos. We're gonna focus on how we can do that, and then we'll actually, like I said, show you a lot of what it can do inside of Power BI Desktop.

So for those of
you that aren't really familiar with R and are just kind of learning a little bit about it, R is actually a free, open-source language. It's used for doing statistical analysis and visualizing graphics and things like that, but it's oftentimes thought of as a language that can be used for doing predictive analytics. You'll see a lot of people use it that way, whether you're using things like Azure Machine Learning, where you can plug in R scripts, or whether you're using Power BI or other tools and technologies. With a very large library of scripts that you can import, it has the ability to extend what's possible.

So the R language is, again, a tool set of a bunch of libraries. You can almost think of it like PowerShell, if you've worked with PowerShell, from this perspective: it's a coding language that you can import different libraries into, and then you have the ability to do extra data cleansing or extra predictive analytics. It has a lot built in that you can bring in.

So what do you need if you want to actually work with R, if you want to follow some of the samples I want to show you today? You'll need to at least install R for Windows. You can find R for Windows, assuming you're running a Windows machine, by going to cran.r-project.org, and from there you'll see where you can download the R installation for Windows. Once you do that, you'll also want to make sure you have a client that you can open up and run code against, and that could be one of two things. Really, there are a lot of different interfaces for writing R scripts; you can even do it in Visual Studio if you wanted to. But you'll probably want to install either the Microsoft R Client, which is okay, it's nice, it lets you run some scripts, but it's not as thorough as you might like as an IDE for running R, or RStudio. Really, RStudio is a great tool for working with R, and you
might see me open that a couple of times. I might actually show you a little bit of both, the Microsoft R Client and maybe some RStudio, just to show you the differences between them, but you'll want to be able to run R scripts in one of those two client tools. If you come from a SQL background, think of this like Management Studio: you would open Management Studio to run queries, and that's kind of the same thing with RStudio or the R Client by Microsoft.

You'll also likely want to get started by looking at some of the samples that are available to you. So if you're really not an R guy or girl, you don't work a lot with R, and you want to learn how to get started with it and maybe how it can integrate well with Power BI, there are a lot of examples out there. There are people that have been developing in R for all different platforms for quite a while now, and so you can actually go find some samples. If you go to the Microsoft community site, community.powerbi.com, you'll find there's a showcase, an R Script Showcase. If I launch this here I can show you what it looks like. Oh, I already had it open; let me bring that over.

This has a library, or showcase as they call it, of several different kinds of R scripts you can take a look at, most of them having to do with data visualizations. You can see you can actually create data visualizations that don't even exist in Power BI. An example: one of our customers that is really interested in this is looking at something like this schedule view. It's a hospital, and they want to be able to plot out operating room utilization. They think right now that their operating rooms are getting underutilized, because they're not really tracking the time that they're being used very well, and so they think that they could actually do a better job scheduling,
and they want to be able to build out some reports with an R script. They can use something like this as an example; they're actually building their own version of it, but that would allow them to plot out that information. It doesn't necessarily have anything predictive in nature, but it does use R to extend what Power BI has available.

I'm gonna show you, for example, one later where we're going to show you the map with connecting lines, and I'll show you how you can actually create this nice little map visual here using an R script. Now, I don't plan on going super deep and teaching you even the basics of R. I really just want you to get an idea of what's possible, so that hopefully it'll spur some ideas in yourself, and when you go back to your office, whether you're there now or whether you're gonna be there tomorrow, you'll be able to come up with some ideas of how you can use it in scenarios for yourself. So again, there are a lot of examples out here. You can search for the R Script Showcase and you should be able to find all these samples that are already out here, or you can spend some time learning it and submit your own.

By the way, I don't want to plug my own content too heavily, but at Pragmatic Works we have an Introduction to R course that's coming out as well. It's not so much integrated with Power BI, although we have a course for that coming as well, but it's more focused on just learning R in general, from the basics. So if you're interested in that, let me know; that should actually be out next month.

All right, so we've talked a bit about what it can do. It's going to help us extend what we can do in Power BI. There is a predictive analytics nature to it, and there's some statistical computing that's involved with it, or there can be, at least. Let's actually walk through several samples of what it can do. There are really three different
areas inside of Power BI itself where you can use R scripts, so you're able to easily take what you've already developed and enhance what you can do in a multitude of ways.

Okay, so the first thing I want to do is launch Power BI Desktop. I'm not gonna spend too much time giving you the basics of Power BI; I assume many of you are fairly familiar with Power BI already, so I don't want to waste too much of your time with that. What we're gonna do in Power BI is start by connecting to a data source. Of course, anytime we want to do something in Power BI we've got to start by connecting to our data source. In this case I'm gonna connect to a basic little CSV file, so I'm gonna connect to a text file, and for this example we're gonna be connecting to a file called European stock market. This is actually a sample that Microsoft put out, so you can go find it yourself; I'll be happy to send it to you if you're interested in seeing it as well.

Now, the problem that we're trying to solve with this data set, if I make this a little larger you'll see, is that we're getting, for each day, an entry for the stock market index for the European stock market. So we can see values coming back, except on the 15th day and the 20th day; for some reason those values weren't recorded. Because of that, it's gonna skew my reports quite a bit. I can try and build some reports off this, but it's certainly going to skew them some.

So what I'd like to do is use the R scripting language to actually predict what those values should have been. Obviously I don't have what they are here; it's not in the data. But you can use the R scripting language to basically replace missing values based on the values that are surrounding them. So it's gonna look at the pattern of the data around it, and based on that
pattern or influx or trend in the data around it, it's going to give you a prediction of what those values should have been.

All right, so if I want to do that, this is more of a transform that we're trying to do, because there are really three ways you can use R in Power BI. One, you can use it as a source. If you go underneath Get Data and hit More (I'm going to show you this in a moment) you'll find that there is an R script source here, so you can actually plug in an entire R script, which is nice. I'll show you one of the reasons why you might do that in a few moments. You can also use R as a visual. You'll notice over here, right here (hopefully that zoom shows up for you guys), where you can actually make a visual out of R as well. So you have a couple of different places you can use R: one is as a source, one is as a visual, and then the third place, I mentioned there are three, is as a transform. You'll find that inside the Query Editor.

So if I go back over to my source one more time: to use R as a transform, we're going to have to go into the Query Editor. Once this launches here for me (there it is) I would click Edit to take myself over to the Query Editor. Again, the problem we're trying to solve is replacing these missing values here. All right, so I'll hit Edit, and once I hit Edit it's gonna launch the Query Editor for me. Of course you have all the typical data transforms available to you from Power BI here, but what we want to do, again, is replace these values.

Now, the script that we're gonna use for this is actually fairly simple. I'm gonna go ahead and show it to you on the screen, and then I'm gonna talk you through how you would use it. So I have a script here called find missing values. It's just a tiny little script; the part that we actually care about is the part down here. But if we want to use this script inside Power BI, there are a few things that we need to tell Power BI before
we can use it. One, we have to tell Power BI where our script library is, where we are storing all of our R script libraries. That's the first thing we have to do. The second thing that we need to do is this top part, where it says install.packages("mice"). mice is the little library that we want to install here, the package that's going to have all of the necessary code for us. I need to actually install that. You don't install this in Power BI; you need to go open up something like RStudio or that Microsoft client I mentioned earlier to install the package.

Okay, so I'm gonna walk you through both of those things. One is that we need to connect Power BI and tell it where our R script libraries are (it's a lot of R's there), and second, we need to install this package so that we can run the code below.

Now, the code below is not too complicated. Just to highlight this briefly: basically we're passing the data set that we have into a little temporary set, kind of in memory, and then we're going to use that with this mice library to find missing values. It's actually going to produce two different outputs for us. One is gonna have the output where it has simply replaced the values, and then we'll have a second output that's going to have the values as they existed previously and a new column next to them with the replaced values. So we're gonna see what this can do for us here, but again, first things first, we have to do two steps: one, install this library, and two, connect Power BI to that library.

All right, so I'm gonna go ahead and copy out this installation line, and I'm gonna launch RStudio. That's gonna be the first thing we do. It'll take just a moment and it'll launch here for us. All right, you can see I have quite a few things going on here. I'm gonna go ahead and close that out and start
with a new script here. It's pretty simple; I already copied it, so all I really have to do is paste this in here. Let me Ctrl+V that. It's gonna install this package, and that's gonna have everything that's necessary to run the rest of the script that we have. So I'm gonna go ahead and hit Enter, and it really does all the work for me. It's gonna go download and install the package for me. You can see it's plugging it into a certain library here, and I've actually already installed it, so there's no real need to do anything special here.

Now, the other component of this, like I mentioned, is having this connected to Power BI. To connect R to Power BI we'll need to go to our File and Options menu. So I'll go up to File and Options, and you'll find the R scripting section here is what you need to configure. This section right here, where it says Detected R home directories, is where you need to make sure that you have R and Power BI connected to each other. It's looking for the location of that library, that package that we just installed a moment ago, and we have stored ours in this particular location here. If you didn't have that selected, you would come in and select Other, and you can find the location where yours is stored. Now, I've installed R several times here, so it has a bunch of different locations for me, but we're gonna be using the one here that's R-3.4.1.

You can also see that if you haven't already installed R, you can do that from here. So if you haven't already downloaded the client tool, or if you haven't downloaded the actual engine for R, you can download and install that here; that's easily available for you. All right, so in my case everything is already connected up and I'm good to go, so I'll go ahead and hit Cancel here. Then, for the transform that we want to apply, we're gonna go to the
Transform tab up top. You should be able to see that Transform tab up in the very top left, and then on the far right under the Transform tab you'll find there's this option here called Run R Script. It's on the far right of my screen right now. I'll go ahead and select that, and it's basically gonna open up a nice little... I shouldn't say nice, it's a pretty terrible editor. There's really nothing that you can do here; you can't test your code. It's basically an empty text box where you can drop your code in.

So what we're gonna do is drop our code in here. Let me pull it aside here; I want to make it so that you guys can kind of understand what's going on. Basically, this editor is an empty box, so all I really have to do is take the code that I had in my nice little text file and paste that in, and it's gonna do most of the magic here for me. It's doing some sampling to see where the data should be, and then based on that sampling it's gonna output the results of what I should have in those missing values.

All right, so I hit OK on this. Again, like I said early on, the purpose of this session isn't necessarily to teach you the details of how you can use R; it's more to show you what's possible with Power BI and R.

All right, so we have two data sets, or two tables, that have been output from this. You have one called completed data, and you can select the table option right here, the cell that has the table in it, and it'll show you what the values look like below. If you remember, we were missing values in row 15, so you can see there's a value in row 15 now, and I think the other one was 20 or something like that. You can scroll down and see where those missing values are now replaced. The other option here is called output. If you select that one, it's going to show you what the value was and what it got
replaced with. So it was set to null, and that null value got replaced with 1723.8, and then on day 20 it was null and we replaced it with 1723.1. So we can see where we were and what we've now replaced our values with, using these two different outputs that are provided to us.

Now, I don't necessarily care about the original data set here. In this case I just want to see what the completed values are once they've been replaced. So I can select the table link, and it will bring across the completed data set, which does not show me where the missing values were, since I don't really care about them. I can go ahead and hit Close and Apply and build a nice little visual out of this, maybe a simple line chart, to show me where all those values have been over time. Let's just go ahead and complete this example; we'll make a line chart here so that we can actually see how those values are shown.

Now, had I brought in the other data set that was missing values, we'd see some pretty sharp declines in here where those values had not been fixed or replaced. So this is a small example; again, a small amount of code, really only four or five lines, to find missing values, look at surrounding values, and replace them where necessary.

All right, cool, so that's our first example. For the second example I want to show you guys, I'm going to start off with a partially built-out example. So I'm going to open up a partially completed version here; let me bring this open for you guys. What I want to show you in this one is some of the built-in capabilities for predictive analytics. So this one isn't necessarily an R example, though some of that may be happening behind the scenes. What I really want to show you in this example is how you can use the built-in predictive analytics capabilities of Power BI. You don't have
to know any kind of scripting for this; you don't have to know any kind of code. It's all kind of right-clicks to create and do some predictive analytics. So there are two things we're going to show. One is forecasting; many of you may already be aware that you can do forecast lines inside Power BI. The other thing I want to show is clustering, and how you can actually run a clustering algorithm behind the scenes to group certain values into different clusters.

All right, so the first thing that we're gonna do is go up to the line chart that we have here. The data that we're looking at here, by the way, is baseball data. I'm not sure how many of you guys are baseball fans, but you can see these are different baseball teams, different cities of baseball teams, franchises I should say. Then on the top here, this line chart that we're looking at (let me make this a little larger) shows, by year, the number of home runs that the total league has had. We can see there are some big fluctuations in the number of home runs. One of the big periods for home runs was the late '90s and 2000s here, and 2016 was also a pretty big year for home runs.

But what I'd like to do is try and make some predictions on what the home runs might look like in the future. So I want to know what's gonna happen in the future. What I'm gonna do is have this line chart selected and go over to the Analytics pane. I'm not sure how many of you guys have played around with the analytics side of some of the visuals that are available to you, but if you select the analytics side here you'll see there are a bunch of different statistical lines that you can add: things like the max, the min, the trend line, median, percentile, and forecasting. There are all kinds of things that you can add into the chart itself, and in this example
what we're going to do is add in a forecasting line.

Now, one thing I want to make sure I highlight as you're looking at this is that there are some things you can do that will make this forecasting option not appear. One of those has to do with the granularity of the data itself. I believe it is under, let's see, here it is: under the X-axis you must have a continuous value. You'll see right here, underneath the X-axis, the Type is set to Continuous. It does have to be a continuous value if you want to use the forecasting option that we're looking at. If it's set to the Categorical option, like this, you can see it actually returns every single value, and as soon as you change it to Categorical you'll notice that the forecast line option is no longer there. So that's just something to be aware of. I'm gonna undo that and set it back to how it was, but just be aware that if you have more of a categorical value, it won't work.

Now, some of you might be asking: categorical versus continuous, what's the difference between those two? Probably the easiest way to explain this is that you can think of categorical (okay, here are our options) as just another word for discrete. I'm not sure how many of you guys have done things like SQL Server data mining in the past. Discrete values are ones that typically have a low number of distinct values. Think of things like a product category: there are only so many product categories you can have, and that's more of a discrete, or categorical, value. With a continuous value, you have so many distinct possibilities for what the data value might be that, for example, it's not something you would ever put in a drop-down filter. What I mean by that is, if you were going to allow your users to filter your reports, maybe with a slicer, you would never make it a drop-down slicer, because
there'd be so many distinct values that they would never be able to find the value that they actually need. That means there are so many distinct possible values. When you choose a continuous type, that data type does then allow you to do some more predictive analytics, and that's really one of the things that you have to have if you choose to do forecasting.

All right, so enough about that; let's actually go ahead and implement this. We'll select Forecast and add a forecast line here. You can see the forecast is already implemented here, and you can see it has kind of this cone. It's been hurricane season for a lot of people; I'm in Florida, so we've had a lot of this kind of look. It's kind of this cone of uncertainty here, where the values that it's predicting are gonna fall somewhere in between this cone.

You can see it's predicting out ten years into the future. How do I know it's ten years into the future? Well, one, you can look at it on the chart and see that my data set ended in 2016 and it's predicting all the way out to 2026. You can also see the number of units that it's predicting into the future if you look in the Analytics pane right here; you can see that it's predicting ten units, or ten points, into the future. So you can actually control that if you wanted to. Instead of doing ten units into the future, maybe make it something like five units into the future. If you make a change to that, though, you need to make sure you hit Apply here, and then it'll reapply the prediction that you've done.

The other thing that you can do, if you're uncertain whether or not this forecast is accurate, is tell it that you want to ignore a certain number of points, years in this case. So let's say I want to ignore the last three years. I can hit Apply, and you'll notice what it's done here is
it's actually pretended like those years didn't even happen, and my forecast line predicted what the values would have been for those years that did happen, as well as the two years after that. So it did as best it could. It couldn't predict this big sharp decline and then incline again, but it did find kind of a happy medium, and you can see that it actually lies within that range of values that we have. It's kind of interesting that you have that capability built into Power BI.

The other type of predictive analytics built into Power BI is around clustering. We have this scatter chart here on the bottom (let me go ahead and full-screen this for a moment). The scatter chart on the bottom here is showing us each of the different baseball franchises, and we're looking at the year on the play axis, so I can actually play this and see how each of the franchises has changed over time and how well they've done. But I can also see here the number of home runs they've had by team. My horizontal axis here is based on the number of home runs, my vertical axis is the number of wins, and then the size of the bubble, let's see, is represented by the number of runs that each team has scored. So home runs and runs don't necessarily equate to each other, but I can see the size of the bubble is represented by the number of runs.

Now, I kind of assume that if you're in the top right quadrant, meaning you have a lot of home runs and a lot of wins, you're probably a pretty good team. Frankly, if you have a lot of wins, if you're in the top half of the chart, you should be a pretty good team. So what I'd like to do is some kind of clustering in here. For a moment I'm going to take out the size and just look at each of the teams regardless of the number of runs they've scored. I want to look at home runs and wins here, and I want to try and organize
these together in different clusters. I would say anyone above a certain range is likely a playoff team, and anyone below a certain range is probably not a playoff team. So I'm going to go up to the top right corner of this chart where it says More options, and, let's see, what am I missing here? Oh, you know what, let me rebuild this chart real quickly. I'm going to rebuild it one more time just so we can see it beginning to end. So I'm going to add, let's do home runs again, and wins. Let's see, where did home runs go? Sorry guys, I realize there's one other thing I wanted to change on here. We're looking at this by franchise, there we go, and by wins, so wins obviously should be further down, right here. Okay, there we go, that's a little bit better representation of what I want to see. The previous one, by the way, was filtered to a particular range that I didn't want to filter to.

Now, looking at all the teams across all the ranges that we have, I can go up to the top right corner where it says More options, there we go, and you'll see there's this option called Automatically find clusters. It's another way you can do some predictive analytics that's built in: it runs a clustering algorithm behind the scenes to find how each of these values groups together best. When you tell it you want to create clusters, it's actually going to create a new field inside your data model, and I can tell it what I want to call that field. Right now it's going to call it "franchise ID clusters", which might be fine, but you can name it whatever you want. You can also tell it how many clusters you want. The number of clusters right now is set to automatic, but maybe I want there to be three clusters, so I
can type in the number three, and when I hit OK it's going to automatically cluster each of these into three separate groups. I would imagine the top right corner will be one group, the middle section here another group, and the bottom left my bottom-dwellers; these are the ones that are probably not going to be in the playoffs for a long time. So this is pretty interesting.

Now let me filter to one particular year, just so it's a little easier to see and we're not looking across too big a range. I'm going to add a quick filter: let me go down to the filter section here, make this a basic filter, and filter it to last year. Now, one of the problems with what I've done here is that I added the clustering prior to making that filter, so there are some inconsistencies that could happen if you do that. Probably what will be best is for me to rerun that clustering one more time. Let me delete it first; since I added the filter a little late, it does tend to mess things up. So let's create the cluster one more time, put it into three groups again, and there we get a little bit better breakdown. I can see cluster one in the top right are likely my playoff teams. I can hover over them and see who they were: Cleveland, yeah, they went to the World Series last year; Chicago, they won the World Series last year; the Washington Nationals, I believe they went to the playoffs. So I have quite a good little breakdown here.

Now, you'll notice the clusters here are called cluster one, cluster two, cluster three. You can of course go into the cluster: you can come over to the field list on the right-hand side right here, right-click, and edit one of your clusters. So I can edit the cluster there on
the bottom, and you can rename each of these clusters if you want to. I can rename cluster 1, let's call this "playoff teams", and, let's see where they are, cluster 2 is probably the bottom-dwellers here, "losing teams". Cluster 3 is kind of somewhere in the middle, but really some of them are losing quite a bit as well, so we'll just leave it as it is. The point is that you can change the names of those clusters, and you can see they reappear on the chart with those new names. So it's a pretty neat capability that's built in: you didn't have to write any code to make that happen.

The other thing you can do with a scatter chart, which is what we're looking at here, is add a nice background to it. I can go under the format paintbrush area, go down to Plot area, and tell it that I want to add an image to the background of the chart. So I go ahead and add an image here, tell it the location (I'll copy and paste my location here), and I can add a chart background image, something like this. Then you can tell it over here that you want the image to fill or fit the entirety of the chart. You can kind of see how these break down: my top half are my better teams, my bottom left are probably some of my worst teams, and in the bottom right there really isn't anybody, because if you're hitting a ton of home runs you're probably winning more than you're losing. So it's an interesting way to pull together a nice chart like this using some of the capabilities already in Power BI, without having to write new code.

All right, the next thing I want to show you is something else that you can do with R. We're going to take a step back over to what we can do with the R scripting
language. What we're going to show in this example really doesn't have a ton to do with predictive analytics, but it does have to do with what you can do with the R scripting language, which really enhances what's even possible inside Power BI. So here's what I want to show next. I'm going to close this out, and we'll go ahead and kill this as well. In this next example we're going to be pulling in a new data set, but I have a new kind of requirement here. It may be something that you need to do, but it's a little tougher to find ways to do it. Here's my scenario: I need to pull in a data set that's coming from a zip file. I have some data in the zip file and I want to easily bring it in. I don't necessarily even want to unzip it; I want to leave it in the zip file, bring it into Power BI, and start to use it as a data set. On top of that, my data set is not only zipped but it's also on the web. So I want to go download this data set, unzip it, and then bring in the particular table that I want for the example we're doing next. That really is one of the things that's possible with R: you can unzip things very easily, and it's not a ton of code to do it.

Let me show you what the code looks like for this example. Here's my code. If I were to smush it all together it's only four, five, six lines of code (there's a little cleanup task on the bottom as well), so it's really not a ton of code that we're working with. Basically, what we're doing in this example is creating a temporary file, and we're going to download this data set. It's a baseball data set (I have a couple of baseball samples today) that I have stored at
this URL. This is its file location; I have a baseball data set there. So I'm going to go download that data and store it in a temporary location, which is going to be cleaned up later on, so it's not going to keep the zip file; it's going to ditch it later. I'm then going to unzip it. This is a little unzip function here, unz (there are two different unzip functions that you can use in R), and I'm going to find a particular file called Teams. You could pull in multiple files if you wanted to, but I'm going to pull in just this one, the CSV file called Teams, bring it into Power BI, and start to use it as a data set. On the bottom here it does a little bit of cleanup: it's going to unlink, or remove, any of the temporary file locations that we have.

What I can do with this is copy it out and go back over to Power BI. You can see we're in a fresh instance of Power BI here, and again I want to use this script as a data source. I showed you very early on that you can use R scripts as a source if you go under the Get Data section. So if I go to Get Data, you can either type "R script" here or, if you don't feel like typing it, scroll all the way to the bottom and you'll find it there as well. I select R script and hit Connect, and again, very similar to what we saw earlier, it gives you this empty editor where you can paste code in and do whatever you want with it. This is a script you could also take into RStudio. I usually recommend that folks go to RStudio first and run scripts there to validate they're going to work before coming to this editor, which is not as nice as RStudio. But I
can take this and paste it in here (it's the same code I just had in my Notepad document a moment ago) and hit OK. What it's going to do is go download that zip file, hold it temporarily, and present the CSV file called Teams. One thing I'll point out, and it took me a while to figure out that I was making a mistake on this: some of this stuff is case-sensitive. For example, Teams has to have a capital T here, because the file name inside the zip file has a capital T. You'll notice it's coming through here as a lowercase t, but that's because somewhere else later I reference it with a lowercase t. So I've got this reference pulled in. I'm going to select teams, and you can see it's bringing this data set in. Again, this is all from a zip file stored on the web, on a file server somewhere. I can select teams, hit Load, and that brings it into Power BI Desktop. It's now in the data model and all in memory; I can use it and abuse it however I want. This is actually the same data set we were looking at earlier with the different baseball teams, so I could build out a nice report with it. I could build another scatter plot, but we've already done that.

All right, the next example I want to show you gets into some of those capabilities I was talking about with the R Script Showcase, so let me bring that back up for a moment. Here's the R Script Showcase again. I recommend taking a look at this if you're really trying to learn what's possible with R; it's a great place to start. There are a lot of different examples, primarily visuals, that you can work with. When you go to any of these, for example the decision tree one or the forecasting one, or, let's go to clustering for a moment. If I go to clustering, you can of
course see a screenshot of what it can do, but it also gives you some instructions on what you need to have downloaded. You'll need to have the R engine installed (we already talked about that earlier), and then you can download the PBIX file, the Power BI Desktop file, right here and use it. The R script should be embedded in the PBIX file. You can also see there's an R script listed separately here if you wanted to just take the script by itself; that's also an option.

In our example I'm going to open up a completed Power BI Desktop file so you don't have to watch me pull in the data, but it's going to be based on the example that had the map in it. Just as a reminder of which one we're talking about, let me take a step back and show you this one right here, the mapping with connecting lines. I want to show you in this example how you can do that yourself. Actually, in this case I opened the completed example; let me open the one that's not already done, right here. Once this opens up, I'm going to show you how you can use the R script visual to extend the types of visuals you can use.

Okay, so this is just a blank design surface with the data set in it. Here's what my data set looks like: this is travel data, with latitude and longitude, to see where flights are coming from and going to. A lot of them are going into Melbourne, Australia; it's Australian data. You saw the map a moment ago, but I want to take this data, pass it into an R visual, and use that to build a type of visual that I don't even have available in the visualizations gallery. Now, you do of course have custom visuals, and you can find R script visuals in the custom visuals section as well. I recommend you take a look at those too. So if you select this custom visual store, I'm a bit
of a proponent of this, obviously, as most of you know, and you'll find there are a bunch of R script visuals in here as well. If I search "R script", or let's just try "R", these are a set of visuals that are powered by R. Many of them are; this one doesn't happen to be, but a lot of these top ones here, the clustering one, association rules, correlation plot, are all powered by R. The nice thing about these is that you don't necessarily have to learn R to use them. You bring them in like any other custom visual and just plot data into them. You do have to have the R engine installed, and it may require you to install some packages (remember, I showed you how to install a package earlier), but oftentimes it will do that automatically for you.

The point of what we're doing, though, is not a custom visual; it's creating a visual from scratch. You can see you have this capability to create R visuals up in the Visualizations pane, and I want to take advantage of that, because it really opens up a lot of capabilities. Remember I mentioned you can do things like look at operating rooms and their utilization; in this case we're going to be looking at travel data. So I'm going to select the R script visual and tell it that I want to enable it. Yeah, that's fine. Now, one thing I've noticed with these: I would recommend resizing or reorienting the visual ahead of time, because the R script visual can be a little finicky if you resize it after you bring in the data. So go ahead and resize it ahead of time.

All right, yeah, I see your comment. Yeah, I
mentioned early on that some of this stuff won't necessarily be predictive; it's a little more about showing you R capabilities. The next one I'm going to show you, assuming we have time, is a decision tree, which is pretty predictive. So yeah, we'll get to that. This is more of a basic one first, then we'll get to one that's a little more focused on predictive analytics.

Okay, so in this case I'm going to bring in the different fields that I want to make part of this visual. I usually check every field here, because I want them to be available in the code we have at the bottom. If you don't check off a field, it's not going to be available to the R script editor. You'll notice at the bottom that this whole R script editor has popped up, and the way you use it is basically to drop in the code for the visual you're working on. In this case we have a map visual, so let me point out what's going on here. There are a couple of libraries up top, a maps library and a geosphere library, that you would install; you would do that inside RStudio, for example. The rest of it is focused on the visual itself. You can see it's making references to columns that we have, and you can do things like modify the colors: these red and orange colors are going to appear in the color palette we're working with. You can certainly take this and modify it quite a bit to do what you want, but in this case I'm going to pretty much take it as is, paste that code down here, and hit the little Run button right here. There's a little Run script button; it's kind of small, but it will execute the script. So if I
hit that Run script button, you'll see it builds out the visual. Again, you'll notice that if you resize it after the fact it's a little slower to react. But you can also have this R visual interact and do cross-highlighting with other visuals that you bring in. So if I have something like a basic bar chart in here too, and I bring in something like the years and a count of all the flights, I can have this R visual (you'll notice it's a little slower) interact and do cross-highlighting between the different charts that you have. So I select the lowest year here and see that the filter brings it down to a much smaller set of values. It's a nice little add-in that shows you what you can add into Power BI, even though there's no visual that does quite this mapping capability. But again, you'll notice there's a slow reaction in the cross-highlighting between multiple visuals.

Now, the last one I'll show you is definitely one that's focused on predictive analytics. This one is all around a decision tree example. If you've done any kind of machine learning or Azure ML examples, you'll oftentimes see this one: it's the Titanic survival data. It looks at different characteristics of someone who was on the Titanic, and based on those characteristics we're going to see whether or not they would likely have survived. So it's a little bit of a morbid example. This last example is showing the custom visual capabilities I talked about a few moments ago. I mentioned you have the capability of bringing in R visuals through the R script visual; you can also find them through the custom visual gallery up top here. I've already done that; you can see the decision tree visual right
here. But again, if you were doing this for the first time, you would come up to the From Store option and search for "decision tree". If you're not entirely familiar with what a decision tree algorithm is, the basic idea is: how did the end result come to that decision? Say, for example, you're running a store (retail is easy for this), and you're trying to determine why one customer bought from me while another customer did not. Well, there are probably a lot of factors: it might be where they live, their age, how many children they have, or whether or not they own a home. What a decision tree does is take in all of those different attributes that you're trying to evaluate as factors, and you also pass in some kind of a measure. The measure is like a yes or no: did they buy from me, or, in this case, did they survive; did they fill out a survey, yes or no. So you're typically trying to get a yes-or-no answer back, kind of a bit.

Then I would hit Add here to bring this visual in, and we would be able to pass our fields into it (in this case it already exists), and it would return the reasons why someone did or did not choose to buy from me, or in this case why someone did or did not survive the Titanic sinking. So we're going to bring in the decision tree visual, make this a little larger again, and pass in the different attributes we want to look at. The target variable is generally the outcome you want to look at, things like did they survive or not, or did they buy from me or not. The target variable is the decision you're trying to evaluate, and so I'm trying to determine
whether or not someone would have survived, so I drag and drop Survived into the target variable. For the input variables you pass in the different attributes that may have been a factor in that decision. Their age might have been a contributing factor; their gender might have been a contributing factor (you know, women and children first); and their passenger class, whether they were first class, second class, or third class, might have been a factor. As we bring those fields into the input variables, it generates a nice decision tree where I can see the reasons why something occurred. Let me make this a little larger so we can see it better. It looks like one of the major factors here is gender. You can see the split "gender equals male": yes goes to the left, so males go left and females go right, and we can see how that split actually had an impact. We see it split first on gender, then on age, and then, toward the very bottom, on passenger class. You can see passenger class here as well: third class likely did not survive, and you can look at the different ranges on the bottom to see which ones were more likely to survive than others.

Of course, like I mentioned in the previous example, all of this will work with other visuals that you have. So say I have something else in here, maybe a column chart where I'm looking at the survival rate based on passenger class. Let me make that into a value, there we go. I can see that in this case there were a lot who survived that were in third class, but much of this might be more a factor of the larger number of people in third class versus
the lower number of people in first and second class. The other thing we can do with this is use it for cross-highlighting. I can select first class and see how that impacts the visual: now we're looking at just first-class passengers and whether or not they survived. We can look at second class, and so on and so forth. So we can use this as a way to filter down our data set and return just the things we care about.

We only have about five minutes left, and I do want to save some time for questions. One other thing I should mention: I've mentioned Azure Machine Learning a couple of times. If you're working with Azure Machine Learning already, or you're interested in working with it, that also has some capabilities to integrate with Power BI. You can use it as a source: for example, if you've already created and tested a nice algorithm inside Azure ML, you can then use it inside Power BI as an R script source to bring those values in and show them here as well. What a lot of people also do with Azure ML is take the results of an Azure ML result set and pump them into something like an Azure SQL database, and then you can of course use an Azure SQL database as a source inside Power BI as well. That's fairly simple to do.

So, real quick, I'll take a look at what kind of questions we have. If you have any, go ahead and queue them up. "Is the course you're talking about a free course?" I believe I saw that pop up when I was talking about the R course. It is a paid course, but we do have a free trial if you want to test it out and see if it's something you like. That one is actually given by Brian Cafferty, a really good guy. And the other one, I think I already answered Eric's question.

We really liked the presentation, and, I mean, in the name of
the Global Power BI community, I hope we can have you back to do another presentation any time. This was really good stuff that you showed here, all the capabilities of R in Power BI. I've seen some presentations on this, but nothing like this, and I really liked this last presentation that you did.

Thank you, I appreciate it.

And then, Eric, someone's asking where they can find the recording. That will be on the channel, and we're also going to post a link on the Power BI community Facebook account that we have. And if you have any material that we can share, feel free to send it to me and I can forward it to the attendees. So, you know, happy to do one.
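
The R script from the zip-file demo is not reproduced in this transcript. As a rough sketch of the same steps described in the talk (fetch a zipped data set, read a single CSV out of the archive without unpacking it to disk, keep nothing behind afterwards), here is a Python version using only the standard library. It is not the speaker's code: the function names, the URL in the comment, and the sample rows are hypothetical placeholders, and a tiny in-memory archive stands in for the downloaded file so the example runs offline.

```python
import csv
import io
import urllib.request
import zipfile


def download_zip(url):
    """Fetch a zip archive over HTTP and return its raw bytes (held in memory only)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def read_csv_from_zip(zip_bytes, member):
    """Read one CSV member of a zip archive without extracting it to disk.

    Member names are case-sensitive: "Teams.csv" and "teams.csv" are different.
    A missing member raises KeyError.
    """
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open(member) as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


def make_sample_zip():
    """Build a small stand-in archive with made-up rows (hypothetical data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr(
            "Teams.csv",
            "yearID,franchID,W,HR\n2016,AAA,100,200\n2016,BBB,60,150\n",
        )
    return buffer.getvalue()


# Real use would be: zip_bytes = download_zip("https://example.com/baseball.zip")
# (hypothetical URL; the talk does not give the actual location).
zip_bytes = make_sample_zip()
teams = read_csv_from_zip(zip_bytes, "Teams.csv")
print(teams[0]["franchID"], teams[0]["W"])
```

Because the archive is only ever held in memory, there is no temporary file to remove afterwards, which plays the role of the cleanup step at the bottom of the script in the demo. Asking for `"teams.csv"` instead of `"Teams.csv"` raises a `KeyError`, the same kind of case-sensitivity mistake the speaker describes.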
