Technical Deep Dive Into Storage for High Performance Computing (Cloud Next '19)



Thanks for joining us. My name is Wyatt Gorman; I'm a high-performance computing specialist with Google Cloud, based in New York City. Thanks for joining us for this technical deep dive into storage for high-performance computing. Just a quick show of hands: how many people know what high-performance computing is and what kinds of workloads there are? Keep them up: how many of you are actually users of HPC, whether researchers, engineers, admins, or people who manage teams? OK, great, about half. Excellent.

First of all, if you have any questions up front or throughout the talk, please put them on our Dory. It's an internal system we've had for a long time that's really helpful for submitting questions you might want to ask and easily going through them.

We have a great lineup of speakers today. Again, I'm Wyatt, and with me are Dean from Google Cloud, as well as Sven and Min from DataDirect Networks (DDN); I'll let them tell you about what they're doing with Google Cloud in the Marketplace. But first I just want to talk about what HPC is, for those of you who aren't familiar with it. Here are some examples of industries where HPC is fundamental: manufacturing, with computer-aided design and chip design; life sciences, with genomics; financial services; climate modeling; and so on. As you might know if you do HPC in these industries, compute is the real driver. Questions are getting bigger all the time and demand for compute is growing, and we see this especially in the Top500 supercomputers. This is a graph of the growth of those systems in FLOPS, floating-point operations per second, and you can see that the list has now exceeded an exaflop of compute performance in total capacity, and we're going to see the first exascale system very shortly, around 2021.

All of this compute, and also storage, is constantly growing. If you look at the last ten years of storage growth, we started in 2008 with about three petabytes on the largest system, Roadrunner at Los Alamos, at about 55 gigabytes per second, which was pretty impressive for the time. In just ten years we have grown capacity 83x, to 250 petabytes in a single storage system, and gone up to 2.5 terabytes per second, which is an absolutely tremendous factor of scaling not only in compute but also in storage.

All of these things together, combining compute and storage and software of course, drive your mission. It's not just about infrastructure; it's really about your science and what you're trying to do. A great example of that is the Broad Institute, a genomics research and sequencing organization run in partnership with MIT, Harvard, and others. Using Google Cloud and the massive amount of compute we have, they were able to get the price of processing a single genome down to under $5, which is really tremendous for making genome sequencing accessible to the world and enabling things like 23andMe.

When you think about what it takes to actually do this work, it costs, first of all, a lot of money; that's why you see national laboratories owning these systems. Building them at massive scale is a huge undertaking, not only in infrastructure but also in hardware, buildings, maintenance, and everything like that. It's also very difficult to run these systems efficiently. On the right you can see a graph of typical utilization in a system: blue is the total capacity of your system and red is jobs. At times of very high utilization you have users sitting in the queue, waiting for their jobs to run and for other jobs to complete, and at other times there aren't enough jobs and you're just wasting money running your supercomputer. This is a problem even within Google, but we have gotten really, really good at bin packing, a term used in HPC that basically means utilizing your resources very efficiently. That's all driven by Borg, and we'll talk a little bit about how we do this in Google Cloud.
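For readers who want the bin-packing idea made concrete, here is a minimal first-fit-decreasing sketch in Python. This is just the classic heuristic as an illustration, not Borg's actual algorithm, and the job sizes and node capacity are invented for the example.

```python
# Minimal first-fit-decreasing bin-packing sketch: the classic heuristic
# behind the "bin packing" idea mentioned above (an illustration only,
# not Borg's actual scheduler).

def first_fit_decreasing(job_vcpus, node_capacity):
    """Pack jobs (vCPU demands) onto as few nodes as possible."""
    nodes = []       # remaining free vCPUs per node
    assignment = []  # node index chosen for each job (in sorted order)
    for demand in sorted(job_vcpus, reverse=True):
        for i, free in enumerate(nodes):
            if demand <= free:
                nodes[i] -= demand
                assignment.append(i)
                break
        else:
            # No existing node fits; open a new one.
            nodes.append(node_capacity - demand)
            assignment.append(len(nodes) - 1)
    return len(nodes), assignment

if __name__ == "__main__":
    jobs = [30, 10, 8, 8, 4, 2, 2]  # hypothetical vCPUs requested per job
    n_nodes, _ = first_fit_decreasing(jobs, node_capacity=32)
    print(f"{sum(jobs)} vCPUs of work packed onto {n_nodes} 32-vCPU nodes")
```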
To avoid having to deal with all of this on your own system on-premises, you can very easily turn to Google Cloud for fast provisioning, security, and storage, which we'll be talking about a little more. Another user that decided it would probably be a good idea to turn to Google was Twitter. As you'll probably hear at our keynotes this week, Twitter moved 300 petabytes of data to Google Cloud Storage, and they decided on Google Cloud for all the reasons I just showed you in terms of faster provisioning, flexibility, and security.

In terms of infrastructure and what it actually takes to run HPC, very quickly before we dive deeper into HPC storage: you really need three things, compute, storage, and networking. If we look at compute, the fundamental driver of compute in Google Cloud is Google Compute Engine, which is designed to be powerful and flexible for your workloads. We have up to 416 vCPUs in a single machine and 12 terabytes of RAM, which you just heard about this week; we have Intel Optane and local SSDs, which we'll talk about in a little bit. We also have something that's very key to HPC workloads called custom machine types. This is a way to size your instance by choosing very specifically the number of vCPUs and the amount of RAM you want, to perfectly fit every workload you run on Google Compute Engine. If you have one system that needs 12 terabytes of RAM and 400 CPUs and GPUs, you can do that; if you need one with 12 vCPUs and 12 gigabytes of RAM, you can build that too. We also have compute-optimized VMs, super-high-speed CPU machines that are great for HPC. We have GPUs, of course, from NVIDIA K80s all the way through T4s, as well as TPUs, which are awesome for machine learning use cases.

Preemptible VMs are an awesome tool for running checkpointed and quick-running code. They are up to 80% off the price of a standard instance; however, they live for a maximum of 24 hours and can be preempted at any time within that 24 hours with 30 seconds of notice. So checkpointed jobs, or jobs like genomic sequencing where a single job within a batch can fail and be restarted, are perfect for preemptible VMs. At the instance level, networking bandwidth scales at 2 gigabits per second per vCPU, up to 16 gigabits per second per instance, and we support 15,000 VMs per VPC. That's a lot of instances on a single network; if you need more than that, let me know.
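To make those two numbers concrete, here is a small Python sketch (ours, not an official tool) applying the rules quoted above: egress bandwidth of 2 Gbps per vCPU capped at 16 Gbps per instance, and a preemptible discount of up to 80%. The $1.00/hour price in the example is hypothetical.

```python
# Illustrative sketch: estimates the per-instance egress bandwidth cap
# described in the talk (2 Gbps per vCPU, capped at 16 Gbps) and the cost
# of a preemptible run at the quoted "up to 80% off".

def egress_cap_gbps(vcpus: int, per_vcpu_gbps: float = 2.0,
                    max_gbps: float = 16.0) -> float:
    """Network egress cap for an instance, per the scaling quoted in the talk."""
    return min(vcpus * per_vcpu_gbps, max_gbps)

def preemptible_cost(on_demand_hourly: float, hours: float,
                     discount: float = 0.80) -> float:
    """Cost of a preemptible run, assuming the maximum quoted discount."""
    return on_demand_hourly * hours * (1.0 - discount)

if __name__ == "__main__":
    for vcpus in (4, 8, 16, 32):
        print(f"{vcpus:>2} vCPUs -> {egress_cap_gbps(vcpus):.0f} Gbps egress cap")
    # Hypothetical $1.00/hour machine running a 10-hour checkpointed batch job:
    print(f"preemptible cost: ${preemptible_cost(1.00, 10):.2f}")
```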
Speaking of networking: for the global network you need for massive collaboration and massive data storage, we have exactly that. We've invested billions of dollars into our network, and when you think about a global network, you have to think about laying undersea cables and dealing with things like shark attacks; sharks like the electromagnetic signals coming off the wires, I don't know why, so we have a shark-proof network. That's just one benefit. And finally, in terms of storage, we have a wide array of options: not only Google Cloud Storage for object storage, but also persistent disk, local storage like local SSD, and then Filestore and other solutions that we'll be talking about in a little bit. To talk about HPC storage in Google Cloud at a larger scale, I'd like to introduce Dean.

Great, thanks, Wyatt. What I wanted to talk a little bit about is: all right, now we're thinking about doing HPC in the cloud. In many cases users already have an HPC infrastructure, a Linux cluster or supercomputer or something on their existing premises; they're not coming to this brand new, although in a lot of cases that can happen too. So what you end up with is a series of on-premises data centers, each with their own cluster, connected into a variety of Google Cloud regions. The trick then is how to get data into the regions and back again, and then, once a region is up and running, how to tailor-make, as I like to say, the compute side and the storage side for the workload you're trying to run.

The common models that we're seeing from a variety of HPC users boil down into three categories. One of them is what I like to call "launch": you're running maybe MATLAB or some smaller program on your laptop or a desktop somewhere, and you're not using a lot of compute because it's just a single machine or a couple of machines, but you want to dynamically burst your job into the cloud. There are partners that can enable that, where you're using two cores here, say, and you burst out to 200-something; it becomes a very dynamic, real-time type of bursting scenario. That's good for bursting compute but not so good for bursting storage, because if you're trying to do this in real time, you're going to be waiting for the data transfer while you're trying to dynamically burst.

The one I see a little more commonly is what I call the "burst by type" model, which essentially takes the workloads running on-premises and splits them into categories. Two categories might be the HPC (high-performance computing) workloads, which require super-low-latency networks and use MPI and a variety of those kinds of tools, and, separate from those, the HTC (high-throughput computing) workloads, which are the more embarrassingly parallel jobs or ensembles; in some communities they're running millions of smaller jobs on the system. What you can do is leave the on-premises system you have optimized for one set of jobs, say the HPC-type jobs, and burst the rest into the cloud, with the data stored permanently up in the cloud. You're really bifurcating your data plane as well as your jobs, and that way, instead of having to optimize on-premises for every workload, small-IOPS and big-bandwidth jobs alike, you can separate and target each one individually.
And the last one, of course, is all-in, which is obviously awesome and removes the hassle of trying to do data movement back and forth, but I think getting all the way to that phase is more of a process.

All right, so on the storage side, what you end up with is an on-premises data center, probably running your Linux cluster and your own storage system, and then a set of file systems you might want to target inside the cloud. Here, just as an example, we have our Google Cloud Filestore, and then we have a Lustre system as well. First, you can see that we can tailor-make the compute side of each of these clusters to whatever the workload is. Filestore is a smaller, NFS-based offering, so you'd have a smaller set of nodes accessing it, with whatever number of vCPUs they require, and you tailor the storage in Filestore to whatever the requirements are. Then, on the larger compute cluster, you'd target a Lustre system that's optimized for whatever the workload is: high bandwidth, high IOPS, or whatever is needed.

The challenge then becomes how to get the data back and forth. In a lot of cases, people are using Cloud Storage in the middle for that data transfer, because it provides the secure, high-bandwidth pipe you need to get the data up; from Cloud Storage you can then essentially burst out into your file systems, and the results can flow back the other way as needed.

Talking a little bit about our Cloud Storage offering, I think most of you might already know about it, but it can be used for a variety of purposes. You can use it as your production HPC storage offering; you can use it as a cold tier, storing the cold data in your system there and saving money by keeping it there; or you can just use it as a transitory place to copy data in and out as it flows back and forth from your on-premises system.
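As a minimal sketch of that staging pattern, the snippet below uses the google-cloud-storage Python client to copy data between a bucket and a mounted POSIX path. The bucket name and file paths are hypothetical, and a real workflow would likely use gsutil or the Storage Transfer Service for bulk moves.

```python
# Sketch of the staging pattern described above: push input data to a Cloud
# Storage bucket, then pull it down onto the file system the compute cluster
# mounts. Requires the `google-cloud-storage` library and credentials;
# bucket and path names here are hypothetical.
from google.cloud import storage

def stage_in(bucket_name: str, object_name: str, local_path: str) -> None:
    """Copy an object from Cloud Storage onto the cluster's POSIX file system."""
    client = storage.Client()
    client.bucket(bucket_name).blob(object_name).download_to_filename(local_path)

def stage_out(bucket_name: str, local_path: str, object_name: str) -> None:
    """Copy results back to Cloud Storage for transfer or cold storage."""
    client = storage.Client()
    client.bucket(bucket_name).blob(object_name).upload_from_filename(local_path)

# stage_in("my-hpc-transfer-bucket", "inputs/genome.fastq", "/lustre/in/genome.fastq")
# ... run the job against /lustre ...
# stage_out("my-hpc-transfer-bucket", "/lustre/out/results.bam", "results/run1.bam")
```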
So Cloud Storage is obviously awesome; it can do amazing things and it can scale across the globe. But in many cases, what you end up with is that your applications are not written for it: if you're writing applications for on-premises supercomputers or Linux clusters, you're probably not writing applications for object storage, in which case you end up needing some sort of file storage. I get asked this question: "What do you mean, file systems in the cloud? Aren't we going to modernize? This old file system stuff, POSIX, is from 50 years ago; why are we still there?" But the thing is, many HPC workflows running today rely on a POSIX-based file system. They rely on the performance, and they also rely on the behaviors that file storage can support, such as in-place updates and handling large numbers of small files, because file systems deliver latencies that are sub-millisecond, versus the many tens of milliseconds that object cloud storage provides. The other thing I like to remind people is that file is actually very cloud-friendly. NFS as a concept is built into Linux and available from every VM you launch, and it's a core part of the Kubernetes infrastructure; you can dynamically use NFS in Kubernetes without doing any extra work whatsoever. It's also very friendly to preemptible VMs, which, as Wyatt brought up, are another core part of HPC. And it's obviously very familiar: all of us understand what file is, we do this on our laptops every day, these are the apps we learned to write in university, so it's a very natural transition.

What I wanted to do next is go into some of the details. OK, so you're going to use the cloud and deploy a larger file system inside it; let's go through some of the general, bare-bones ways GCP works in terms of compute and storage. At the top there we have our compute VMs. As Wyatt said, you need to choose the right number of vCPUs, or GPUs or TPUs, whatever infrastructure you need for your compute layer. But these nodes are also your file system clients, whether over NFS or a POSIX-based client such as Lustre, so you need to make sure each one has the bandwidth it needs. At 2 gigabits per second per vCPU, you always need to make sure you have enough vCPUs to at least deliver the network performance; even if you didn't need them for compute, you're going to need them for the network.

The next step is the storage servers. These file system clients are doing some level of distributed I/O across a set of storage servers that you're going to launch, and these end up being VMs inside Google Cloud as well, so you need to keep the same things in mind. One is the networking: there's no point in putting up a whole bunch of small VMs; in many cases you're going to want the largest network capacity possible, so you should think about at least eight cores for those servers. You might then want to think about whether the file system supports some level of extra caching where you could put in local SSDs, and how much RAM you want to provide; all of that is configurable depending on the type of job you're running.

Then, finally, there's going to be some storage, which is the point of the file system. If you're going to store this data durably, you're going to be using persistent disk, and there are a few things to watch out for there. One is that performance scales linearly with capacity: you can look at your capacity requirements, then look at your performance requirements, and you need to provision the capacity that meets both of them. Some people don't like paying for extra capacity they're not going to use, but don't think of it that way; think of it as paying for performance. Some of that is just logical: the more capacity you have, the more spindles or SSDs you're engaging, which enables more bandwidth and IOPS to the system, very similarly to how on-premises systems work. The other thing to remember is that with PD-SSD there is a set of vCPU requirements, because the high IOPS rates of SSDs just drive cores; it's a tiered system, and you get the best performance using 32 vCPUs at that level. And remember that disks scale linearly but hit different limits at different capacities.
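Here is a back-of-envelope Python sketch of that "provision for performance" rule. The per-GB scaling rates are illustrative placeholders we've assumed for the example; real persistent disk rates vary by disk type and change over time, so check the current documentation.

```python
# Back-of-envelope sizing helper for the "provision for performance" point
# above. The per-GB rates below are illustrative assumptions, not official
# numbers: consult current persistent disk documentation for real values.

ILLUSTRATIVE_IOPS_PER_GB = 30.0   # assumed PD-SSD-style read IOPS per GB
ILLUSTRATIVE_MBPS_PER_GB = 0.48   # assumed read throughput (MB/s) per GB

def pd_capacity_gb(capacity_need_gb: float,
                   iops_need: float,
                   throughput_need_mbps: float) -> float:
    """Smallest capacity satisfying capacity, IOPS, and throughput needs."""
    return max(capacity_need_gb,
               iops_need / ILLUSTRATIVE_IOPS_PER_GB,
               throughput_need_mbps / ILLUSTRATIVE_MBPS_PER_GB)

if __name__ == "__main__":
    # Hypothetical server: 1 TB of data, 60k IOPS, 1,200 MB/s required.
    gb = pd_capacity_gb(1000, 60_000, 1200)
    print(f"provision ~{gb:,.0f} GB so performance, not capacity, is the limit")
```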
So it's easier just to choose the largest capacity; then you don't have to worry about whether you could somehow have eked out more performance. A common mistake I've seen is choosing smaller VMs and smaller capacities and having a lot of them; it's a lot easier to shrink the number of VMs and storage servers you're deploying and max out the bandwidth and IOPS of each of them.

All right, so we have a lot of storage options in Google, and it can be quite confusing to figure out which ones are right for you. The top row here is the structured storage options, which I always like to put up, but we're focusing on the bottom ones today. We've talked a little bit about Cloud Storage and persistent disk, and I'll mention Filestore briefly, but we also have a set of HPC storage partners we're working with, who are putting up different kinds of file systems inside Google that are really targeted at these HPC use cases. The first one to talk about is Google's own Filestore. It's great for small to medium-sized jobs, like I said, maybe a MATLAB job or some sort of smaller test system you're running, and it's fully managed and NFS-based, from Google. Next, if you need to go bigger than that, Elastifile is putting up a fully managed, NFS-based, scale-out file system that is really targeted as the next tier above Filestore and delivers a really great number of IOPS; if you have a high-IOPS workload, it's a great choice. And of course we have the NetApp Cloud Volumes Service: with it you get all the performance you'd normally expect from using NetApp on-premises, but with all the integration and ease of use of GCP. Just like the other two, it's fully integrated into the console and the billing system, and you can dynamically use it as if you were using any of the other storage options in Google. We are also now developing a partnership with DDN around their Lustre service and their other software storage offerings, and so now I'm going to pass it off to Sven from DDN to talk about that.

Hello, my name is Sven; I'm the chief research officer for DDN, and with me is Min, one of our senior software developers. Let me switch to the next slide. This week we actually announced a new offering on the Marketplace. DDN acquired Whamcloud last year, the organization that was running Lustre development, so DDN now essentially employs the majority of all Lustre developers worldwide. We brought this service online, and Min was instrumental in actually getting it up and running. While I'm talking over the next slides, I'm going to pass it over to Min for a second; he's going to start a demo to show you how easy it actually is to deploy this on Google Cloud, and while the deployment is running in the background, I'll talk through the rest of the slides.

Thanks. For the Lustre solution, the easiest place to find it is on the Marketplace: we just search for "lustre," and here is our offering, Lustre on GCP. As you see, it's similar to any other offering. When we launch, keep in mind that this solution spans multiple instances, so you may want to plan ahead of time, but we do provide step-by-step instructions and predefined profiles so you can launch in the easiest way. All you need is a couple of steps.
The deployment name is already pre-filled; if you have one deployment already, it will change the name to a different one. You can select your zone, and by default the network is just whatever network your zone is in; one thing you might want to consider is using a private Lustre network of your choosing, but for this demo I'll just use the default. If you want to access your cluster directly from the outside, you can have an external IP automatically allocated for you, or you can select "none" and a gateway will be created for you to jump through before you go into your cluster. You can change the file system name here. We have predefined a couple of profiles to help new users, or users with different levels of need: Quickstart, which is pretty small; Standard; Standard Plus; and Premium. For today's demo I'm going to select Premium. It can also be capacity-based: each profile has a minimum capacity requirement, and if you want more than that, you just enter it here. In step four, if you didn't select a profile in step three, you can make changes to your servers any way you want; some people want to tailor a certain server or a certain amount of memory, and you can do all that customization here. The last step is to select the number of clients to include in this deployment; for our purposes I'll pick one client. Then you just click Deploy, and this deployment takes approximately four minutes. While we're waiting for it, I'll hand things back to Sven, and we'll come back when it completes in about four or five minutes.

Yeah, thanks a lot, Min. A couple of things to mention: there is default 8x5 support associated with the service, so you can actually pick up the phone and call somebody if there is a problem. It is not fully managed yet; we're looking for feedback from customers on whether they also want a fully managed version of the service, and then we'll look into that. If you deploy the system as Min did, it's going to take somewhere around four minutes; we've also deployed large systems, hundreds of terabytes in size, and everything is basically a matter of a couple of minutes to get deployed. So it actually is pretty easy and pretty scalable if you need a very large amount of capacity in the cloud.

Let's talk a little bit more about Lustre and what DDN is actually bringing to the table here. DDN's Lustre is the number one parallel file system deployed in the world. We have a large number of AI customers, so we're not just playing in HPC; we have very specific offerings in the AI space as well. The whole purpose of Lustre is to provide very, very high performance where traditional file-serving solutions are typically lacking: while you can do a lot of things with NFS and SMB under normal workloads, as soon as you get to a massive amount of parallelism, they're typically not really your choice. If you look at the Top500 supercomputers list, a massive number of them are actually running the Lustre file system. There is a very interesting performance benchmark called IO500, which I'll talk a little more about, and what you see here essentially demonstrates the capabilities of this offering: we have customers with hundreds of millions and billions of files, and we can do hundreds of gigabytes, even terabytes, per second of throughput with this technology.
It's something that is proven in deployments at many, many sites. We're basically bringing this on-premises technology, which we deploy at many sites today, into the cloud, to give customers the choice not just to run it on-prem but also to get a very nice cloud experience with a very proven technology.

The thing that primarily differentiates Lustre from some of the other options you see on the Marketplace is really its superior scalability. As I just said, we're deploying systems with hundreds of millions, even billions, of files and petabytes of capacity; we have some systems on-prem today that are three digits of petabytes in size. We know the technology scales, so you can deploy the same technology in the cloud. The other nice thing is that you can scale it nicely for various types of workloads: it doesn't matter if you have massive random-read workloads or massively sequential reads or writes, small files or large files, there are lots of options. We try to keep it as simple as possible, which is why we have all these profiles you can pre-select from, but if you need something highly tailored to your environment, you can do that with this offering as well.

Something a lot of people don't know, and we've put a lot of effort into this area, is that many AI workloads in particular rely under the covers on mmap: many, many libraries, LMDB and multiple others, use mmap under the covers. Many other file store options actually have significant performance scaling issues with that, so we've put lots of investment into making this very, very fast. We believe that's why this option is particularly interesting if you're trying to run AI workloads in the cloud: a lot of the frameworks that ship out of the box on these systems are going to run significantly faster if they leverage mmap under the covers, because the file system is distributed and has optimizations for it.
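For readers unfamiliar with the access pattern being described, here is a tiny standard-library Python illustration of mmap. There is nothing Lustre-specific in it; it just shows the memory-mapped reads that libraries like LMDB do under the covers, which is where a file system's mmap performance matters.

```python
# Tiny stdlib illustration of the mmap access pattern mentioned above:
# libraries like LMDB map a file into memory and read it with pointer-style
# access instead of read() syscalls.
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "sample.bin")
with open(path, "wb") as f:
    f.write(b"header" + bytes(range(256)))  # 6-byte header + 256-byte payload

with open(path, "rb") as f:
    with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        print(mm[:6])      # slice the mapping directly, no explicit read()
        print(mm[6 + 42])  # byte 42 of the payload -> prints 42
```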
We also have, given the size of the deployments we already run, lots of expertise in the organization in managing massively scalable systems; we actually run very large systems on-site for lots of customers and provide these extra services to them on-prem. With this service offering and the knowledge we have in this area, we're trying to bring all of that into the cloud as well, and we're more than happy to have separate engagements and specialized services if you have very specific needs.

The technology is being heavily worked on, and as I said, we're tailoring this offering specifically for the cloud. There are lots of powerful tools that we're in the process of launching in subsequent releases over the next couple of quarters. Today you already have the ability to set quota allocations in the system, either by subdirectory or by user or by group, and we're going to add significant management capabilities around pooling, so you have much more fine-grained control as an end user. Based on data the system automatically collects about file heat, you'll be able to manage data placement across multiple pools of different types of storage inside the system, with full automation and the tools end users need to manage it. There are auditing capabilities as well, which lots of customers obviously require for governance purposes, and lots of security features in this solution too: you can do things like secure mounts, where people can only mount certain subdirectories of the system without seeing each other's data, so you can separate tenants out in the file system namespace.

We also have the ability to do very fast transfers from on-prem to cloud, because it's great to have an offering in the cloud or an offering on-prem, but if you can't figure out how to move data between these very large-scale systems, you need more than that; we have a special product for that, and I have one extra slide on it.

IO500, as I said, is a benchmark that tries to provide a very fair comparison between different storage solutions in the market. We've run a couple of IO500 tests in the cloud with Lustre, and this is just one example of what we deployed: a 120-client, 120-server system with 180 terabytes of capacity in total. We achieved 107 gigabytes per second of throughput on the system, and we showed essentially linear scalability going from the smallest configuration we tested all the way up. As Min already mentioned, and as we'll show in the demo in a few seconds, all of this was launched and executed in a couple of minutes.

These are the results of the benchmarks. We did different tests, scaling servers independently of clients; this is the Lustre server count scaling, and you can see how the IO500 score goes up as we add servers. I have to mention one point: the benchmark itself is very complicated, and the larger you make the benchmark data size, the more complex its execution becomes, so the higher you go in the scores, the tougher it gets as you increase the scale of the system. We also, as I said, increased the gigabytes per second just by adding individual nodes to the system. Here you see the scale going from 10 servers to 60 servers to 120 servers, with a very nice scaling of performance, and we believe that if you just add more nodes, the system will run faster and faster with the resources you add. The nice thing is, as I said, that you can scale capacity completely independently of metadata and independently of the IOPS or throughput of the system; whatever metrics you're trying to achieve, it's basically a matter of picking the right inputs for the deployment, and we can meet all of those needs.

Two more slides that are very specific. This is our cloud offering, but lots of our customers already have on-prem deployments, and customers that have a cloud offering also need an on-prem offering, so I want to show you two slides of what we deploy today on-prem. We just launched a new product called the SFA18K; it's essentially the fastest storage controller for file systems that you can purchase right now, a 4U device that delivers 80 gigabytes per second of throughput over the network, and it runs embedded the same Lustre file system software that we deploy on GCP. If you're looking for something to deploy on-prem, this is a very, very good offering for that.

The second slide I wanted to show you is a product called DataFlow. This is the one that essentially ties the on-prem technology together with the cloud technology, or cloud to cloud, or cloud back to on-premises. What DataFlow does is tightly integrate with multiple storage sources and targets: it's not actually limited to Lustre on-prem and Lustre in the cloud; it has targets for NFS, for SMB, for the IBM Spectrum Scale file system, and for multiple other sources and targets including S3, and it even has support for tape. What differentiates this technology from other simple file-movement and data-management solutions is that it's built specifically for massive scale. You can take a hundred-petabyte file system as a source and a multi-hundred-terabyte or petabyte target system in the cloud, and fan out the data movement across a large number of nodes and a large number of networks. Instead of running on a single node, it sends a request to the source system; you specify with rules what data you want moved; it finds the data very fast, using the source technology's built-in machinery where supported (for example, against an IBM Spectrum Scale cluster it uses the built-in policy engine to find the selected files very quickly); then it splits the work across a large number of workers and transfers the files in parallel into the cloud. That's one of the big challenges with massive data movement: you can't really move that much data between cloud and on-premises from a single node; you need a massively distributed system to do it, and DataFlow provides exactly that capability between all these different technologies.
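DataFlow itself is a proprietary product, but the fan-out idea is easy to sketch. The toy Python below selects files under a source tree, splits them across a pool of workers, and copies each one concurrently; shutil.copy stands in for a real transfer call, and a production tool would distribute this across many nodes rather than threads in one process.

```python
# Minimal sketch of the fan-out idea behind a tool like DataFlow: select the
# files to move, split the list across parallel workers, and transfer each
# one concurrently. This is an illustration, not DDN's implementation.
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
import shutil

def transfer_tree(src_root: str, dst_root: str, workers: int = 16) -> int:
    src, dst = Path(src_root), Path(dst_root)
    files = [p for p in src.rglob("*") if p.is_file()]  # "rules" stand-in

    def copy_one(p: Path) -> None:
        target = dst / p.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy(p, target)  # stand-in for a real upload/transfer call

    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(copy_one, files))  # list() surfaces any exceptions
    return len(files)

# Hypothetical paths:
# transfer_tree("/lustre/project/results", "/mnt/transfer-staging")
```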
OK, here is the screen where we left off; the deployment finished in approximately four to five minutes on this particular profile. On the right-hand side you can see some information. What we provide is a web console where you can see files, statistics, and the status of your Lustre system. It displays the network, the Lustre file system name, and also the NID, the network identifier you use if you're bringing a new client in to mount the file system, along with the mount point you want to use; all the information is there. We are running the latest version of Lustre, 2.12.0, on CentOS 7.6, and below is the information for the automated way we let you introduce a new client to mount the same file system. I won't read all of that, but one interesting thing I want to show you: if you visit the console, you will see the Lustre file system, the metadata targets, and the object storage targets that we just allocated, plus the total file system size; for this demo I launched 25 terabytes in just a few minutes. The cluster view here shows the OSSes, which are the object storage servers; we are using approximately ten of them. On the right-hand side are the reads and writes; right now I think it's doing reads, because I launched an IOR program on the client to demonstrate this activity, and you can see the OSSes reading at four hundred megabytes per second. If we go to the graphs, we include Ganglia graphs, so you can monitor these statistics for the last hour, day, or week, for the metadata server, the management server, and the OSSes.
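IOR itself is an MPI benchmark written in C; as a rough single-process analogue of the throughput number being read off that graph, here is a toy Python sketch that writes and re-reads a file and reports MB/s. The path is hypothetical, and numbers from a single client won't reflect what a parallel file system delivers in aggregate.

```python
# Toy, single-process analogue of the throughput measurement the IOR run in
# the demo performs at scale. It times a streaming write (with fsync) and a
# streaming re-read of one file and reports MB/s for each.
import os
import time

def toy_throughput(path: str, size_mb: int = 256, block_kb: int = 1024):
    block = os.urandom(block_kb * 1024)
    n_blocks = size_mb * 1024 // block_kb

    start = time.perf_counter()
    with open(path, "wb") as f:
        for _ in range(n_blocks):
            f.write(block)
        f.flush()
        os.fsync(f.fileno())  # force data to storage before stopping the clock
    write_mbps = size_mb / (time.perf_counter() - start)

    start = time.perf_counter()
    with open(path, "rb") as f:
        while f.read(block_kb * 1024):
            pass
    read_mbps = size_mb / (time.perf_counter() - start)
    return write_mbps, read_mbps

# w, r = toy_throughput("/lustre/scratch/iotest.bin")  # hypothetical mount
# print(f"write {w:.0f} MB/s, read {r:.0f} MB/s")
```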
Now I'm going to show you the OSSes: we have ten of them, and the first default screen is the load on the OSSes. You can see the load spike up because we just launched the IOR test. One of the interesting things is that you can see a lot of statistics, creates, writes, and reads, for each individual OST if you want to look at it, but we also total the reads and writes across all of them and draw a stacked graph with ten different colors, and if you click on an individual one you can actually see everything: CPU load, Lustre statistics, and so on. We can collapse this and look at the OST metrics; the statistics are there for you to examine, and while you run your application you can actually watch the load and everything else.

As you can see, there's lots of detailed information that we expose, so it's very easy to find potential bottlenecks where you might be limited; for example, if you know exactly what the network limit of an individual instance is, you know what to pay attention to when you deploy or expand a larger system. Min, thanks a lot for the demo; I'm going to hand it back.

Thanks, Sven; thanks, Min. I think this is where we want to get to. The last 30 years have been about optimizing how people do HPC on-premises, how they deploy file systems and compute servers and all these things, and DDN are the experts there; in the cloud, we're trying to get to the best way to do it there. We really want to change some of the general dynamics around having one massive file system that you're bin packing all of your workloads into, and really think about how to optimize the entire workflow: expanding the clusters when you need to, shrinking them down when they're not needed, targeting one set of jobs at file systems that are bandwidth-optimized and another set of jobs at file systems that deliver high IOPS, and really targeting the specific needs of the applications by doing the right configuration within Google Cloud.
