DATA & ANALYTICS – IoT – from small data to big data: Building solutions with connected devices



My name is Preston Holmes. I'm a solution architect on Google Cloud Platform, and I work on projects with connected devices, otherwise known as the Internet of Things, or IoT. I want to start by answering definitively a question that often comes up: do we have a consensus as to what IoT is? Definitively, I'll say no, we do not have any consensus about what exactly IoT means. So I'm going to give you my working definition that I find useful: IoT is really a period of transition and transformation. We have everyday objects, things in our lives, and they serve some primary function. At some point we begin to connect a few of these, in this sort of awkward phase where some of them are connected to the network and many aren't. This is really IoT, that awkward transition. Eventually, as all of these devices become connected by default, they're no longer really IoT; they just become the things that we have in our everyday lives. So it's this transition, with more and more of the things in our lives becoming connected by default, that defines IoT devices.

Now, if that's one working definition of IoT, then why do we want to connect all these devices? What's the point of it all? It's really about getting access to information that wasn't previously available. There's information in the environment all the time, all around us, but it's not really available to us as data. This information might be something like the speed of a delivery vehicle or the utilization of parking in a city setting; it might be information about the yield and defect rate of manufactured products. No matter which vertical or industry you're in, there's probably information that's valuable to you that you're not getting any access to today. So this is one goal for the Internet of Things: to go from information to insight.

Now, most of the data that we deal with today in a data context is born as
digital data: clicks on a website, application downloads, and online commerce transactions are all born as digital information. But the kind of information we're interested in in the Internet of Things is analog, and we need a way to capture that analog information and then relocate it to the places where we can work on it with the big data tools. So really the Internet of Things becomes the set of technologies that are the means to this end, and this is what I mean when I talk about an end-to-end solution around IoT.

Now, you've likely been hearing about the Internet of Things for some time, and the question comes up: why hasn't it been more fully realized? There are a number of reasons why people have been held back, but first and foremost, I think, is the complexity of the domain that's involved. You've got to choose some sort of hardware device platform. You've got to pick whether or not to run an operating system, and which operating system to run. The networking in the Internet of Things is a lot more varied and diverse; some of these networks are much more spotty and much more expensive, and there are whole new classes of networks being built specifically for IoT applications. You have mobile and cloud application development, each of which is its own complex domain. All of these devices have the potential to generate a lot of data, at a scale that might challenge traditional data infrastructure. And then finally, we all recognize that this has to be done securely. All eyes are on the security of IoT, and the good news is that by paying attention to it early, we have the chance to design security in up front rather than act reactively, which is a lot of what happened with the web.

So to solve this sort of end-to-end solution, I like to break down an architecture for IoT, and fundamentally we can recognize three primary components: the device, which is the physical object located out in the real world; there's often a gateway, which
is responsible for collecting data from devices and relaying it; and then Cloud Platform, which is increasingly the nervous system of your business or application, where this IoT data can be combined with other, non-IoT data to render useful insight into your applications. Now if we look at the other pieces of this technology stack that data touches, from bottom to top, you'll recognize a bunch of familiar Cloud Platform technology pieces in terms of pipelines, storage, analytics, etc. Across these things that touch data, we recognize a couple of cross-cutting concerns in the areas of device management and security. It's often unclear how our cloud pieces fit into this end-to-end stack, so a lot of what we'll be talking about today is the device and gateway pieces. With this map or framework we can break down some of the technical considerations, and we'll use it throughout the talk, starting with devices.

As I said, devices are the "thing" of the Internet of Things; it's the thing out in the world. But a thing might be as small and simple as a tiny sensor located inside a door hinge, or a thing might be as gigantic as an entire oil rig. These devices might be compound and mapped in a hierarchy, where you might have a ship, and on that ship is a pump, and both of those are related to each other. Both are things, and how those are graphed is going to depend on your specific domain model. But at the end of the day, a device is a physical thing out in the physical world.

Now, a device is really just a tiny computer, so it's both familiar and foreign. You can recognize all the familiar components of a computer inside any IoT device; fundamentally it's composed of hardware and software. But it is different from the computers of our everyday lives: the inputs and outputs are going to be very different and a lot more exotic than
the familiar mouse and keyboard and monitor that you have on your computer. For sensor inputs, there's a wide variety of sensors available, for gases, for infrared, etc., but an input might be as humble and simple as a button located in just the right place at just the right time for somebody to signal that some event has occurred. Not all devices are going to have outputs, but outputs can range as well. You may have something as simple as a buzzer to let somebody know that something's happening, or an output can be a gigantic hydraulic lift that can lift several tons. Again, these things are what really define IoT devices as different.

Now, communications, or comms, is technically just another form of input and output. I do call it out separately because the diversity of radios and communications available for IoT devices makes communication somewhat of a special case for the Internet of Things. Obviously, on top of this you run boot loaders, which potentially load an operating system, and then all of your libraries and application software run on this device.

Like I said, there's a real diversity and explosion of hardware platforms; it seems like almost every week there are new types of boards and new technology coming out. But at the end of the day, as a software developer, you probably think of the device in terms of the information it provides: how do I model and manage the information this device is working with? We can recognize four primary kinds of information.

The first is metadata. This is information about the device: things like a device identifier, and it might include a firmware version, installation location and date, etc.

Second is state. This is information about the condition of the device, not necessarily the condition of the environment, but the current condition of the device. It might be read-only or read-write. It is going to be something along the lines
of the set point for a thermostat or a sample rate or frequency, so this is not something you're going to be changing and resetting at very high frequencies.

Commands are a type of information that indicates some action to be taken by a device. This is usually done as some sort of handler or remote procedure call that's running on the device. These actions are often not idempotent, which means that if you send the same command multiple times, you might not get the same outcome every time. You can imagine a command that says "run self-cleaning cycle": you do that once and now you're clean; if you keep running it multiple times, it's not necessarily going to change the result the same way it changed it the first time, unlike state.

And then finally, telemetry. Telemetry is really the eyes and ears of the Internet of Things. It's the source of a lot of what we have as environmental data, and this telemetry can take advantage of that diversity of sensor types I mentioned earlier.

Now, if you're going to bother to define a device, pick hardware platforms, and model the information they produce or consume, the odds are you're going to have more than one device out there. As soon as you have more than one device, you need to consider how you're going to do device management. Device management broadly addresses the tasks required to make these devices functional and useful in your project, and it involves pieces of functionality that run on the device as well as in the cloud.

On Cloud Platform we're pretty versatile in terms of how we work with you to do device management. We have integrations with Brillo and Weave, which we'll be hearing more about later, that handle device management. We work with a number of device management platform partners who take care of all the edge communications, setup, and provisioning. And there are a number of open-source and do-it-yourself solutions. But
no matter which solution you choose for device management, we can recognize the key functions that it achieves. First, you need to provision a device onto the network: it needs to be able to connect to the local networking environment where that device sits. You need to register this device in some sort of registry so you can keep track of which devices you have under management. You need to authorize a device: you need to make sure that you're only speaking to devices that are part of your fleet, and that devices can only reach the back-end systems they're supposed to. You need to monitor and perform operational management across a fleet or groups of devices, so that you know that collectively and in aggregate your entire fleet is performing as you expect it to. And finally, you need to be able to ensure that devices are kept up to date. This is not only important for making sure the device gains new functions and features over time, but it's critically important for security that it's always being maintained with the latest security updates.

So let's talk about a couple of customer examples. I want to tell you about how Nest has been able to simplify the deployment of connected cameras as part of their ongoing migration from another cloud platform. Today, in the completely regionalized deployment, cameras must first contact a dispatcher in one region before connecting to a home server in another region. That data is stored per region, and when it's stored in a single region it gets copied to other regions for durability and availability. In contrast, the cameras that are now connecting to Google Cloud Platform use our global load balancing service to immediately reach the regional server that they ultimately stay connected to, and the data from that server is stored in services with global storage scope. So they avoid the complexity of needing to synchronize data across regions and still get that global availability.

Making
objects easy and pleasant to use and easy to orchestrate is something Philips Lighting has done successfully with the Hue color-changing light bulb. It's one of the more familiar and widely deployed IoT devices out in the world; many of you may have one. Philips, with the help of one of our partners, Q42, was able to build and scale the infrastructure to support the Hue application and API on Cloud Platform. This low-latency and reliable API is part of the Hue experience. Philips and Q42 were able to take advantage of features in Container Engine, such as rolling updates, to allow the back end of this system to be steadily improved and steadily released. They're also able to launch new and interesting small experiments very quickly on their existing Container Engine cluster, at almost no additional cost to run those experiments side by side. And all of that Container Engine infrastructure is then able to connect to our data, analytics, and monitoring platforms elsewhere on Cloud Platform.

The light bulb is a great example of a device which by itself cannot reach Cloud Platform. I mentioned gateways in the beginning, and all Hue light bulbs ship with a gateway or hub. As I said, this is often a key component in IoT architectures, and I want to explain a little bit more about why that's so. It comes down to this: there's a whole class of devices out there that are not able to reach the cloud for one reason or another. It might be that they do not have any routable connectivity to the internet; for example, Bluetooth is not directly routable to TCP/IP. It might be that they don't have the processing capability to perform the secure HTTP transport protocol that all of our Cloud Platform APIs require. It also might be that they don't have the power budget to support the networking; they may be very low power. And finally, there are perspectives in a
setting that require a device to look at multiple devices in that area and do some pre-computing. Any one device isn't going to give you that local aggregate and pre-processing, and that's part of what a gateway can do.

We have a demo running outside with Intel. One of our partners in the gateway space is Intel; they make an industrial IoT gateway platform, and we've worked with them on an integration with Cloud Pub/Sub. The scenario we've worked on is this idea of tracking the environmental conditions and locations of palletized goods. This starts with an Eddystone Bluetooth beacon. Eddystone is an open-source Bluetooth spec from Google that defines a couple of data types that are broadcast over Bluetooth, including an ID as well as telemetry frames for things like temperature or humidity. The gateway collects these from the Bluetooth beacon and then sends the information up to Pub/Sub. On the gateway there is also a developer experience, a sort of developer UI, with which you can easily connect various sensors directly to the gateway, and these automatically gain from this integration with Cloud Pub/Sub. The gateway also runs a set of libraries for doing runtime edge processing. This is useful when you have networking where you want to be somewhat miserly about what you're actually transmitting to the cloud: if you can do some degree of edge processing, you can send up the right quantity and a condensed version of the information.

This idea of connectivity being a specific challenge in IoT solutions comes up because the location of the information you're interested in is going to be in potentially who knows what kind of environment, and the world is not blanketed with Wi-Fi. Wi-Fi is limited, for the most part, to the places where people are today. So one of the ways we're looking at helping extend connectivity to Cloud Platform through partners is with a project we're doing with Particle. Now, Particle is one of the more
widely used Internet of Things platforms. They've got over 60,000 developers on their platform already, and most recently they released the Electron. The Electron is a small, powerful IoT device with built-in cellular connectivity, which gives you the ability to have this device connect to the cloud in over a hundred countries. We've partnered with Particle to make it easier than ever to pipe data from any connected Particle device directly into Google Cloud Pub/Sub.

Now, anyone who's visited Google may have seen these colorful bikes that we have all around campus, called GBikes. As Google's campus has grown, the GBike has become a key way that people get from building to building throughout the day. But the team that operates GBikes for Google has very little visibility into where these bikes tend to go during the day, how they tend to clump, and where they are. So working with Particle, we've attached some of these Particle Electrons to bikes, and we're able to collect the locations of how they move throughout campus during the day. We stream this through Pub/Sub into BigQuery, and then we can run queries that allow us to plot their locations with the heat map APIs in Google Maps. This is really invaluable for that GBike team to understand what the usage is and where these bikes are throughout the day.

So connectivity involves things like networks, but it also involves things like protocols. Yesterday we announced alpha support for gRPC for Pub/Sub. gRPC is an open-source binary protocol that we'll be supporting across many of our products. But in IoT there are other protocols that are very commonly found, and one of those is MQTT. A partner of ours, Agosto, has been working on building an MQTT solution on Compute Engine. MQTT as a protocol is well suited for IoT: it has a very low footprint in terms of network usage, and it is also available as libraries for many small microcontroller devices today. What Agosto has done is build
a custom MQTT broker that is based on RabbitMQ. It allows us to focus on the protocol translation from MQTT to Pub/Sub and lets Pub/Sub, as a managed service, handle all of the durability and managed scale of storage of events. So on Compute Engine we can focus on using the networking performance and CPU performance, but not necessarily worry about storing all of the events. This solution allows you to scale out from pilot to production very cost-effectively.

Now, what I've been talking about today is how things get solved today. This domain of complexity is all solvable, but it can definitely get better, and at Google we're working on making it better across the experience. It's been part of our DNA to work with devices, from Android phones to Chromecast and Chromebooks. So I'd like to invite Jeff Chen, product manager from Brillo and Weave, to tell you more about the Brillo and Weave project and how we're looking to improve devices.

Thank you, Preston. Good afternoon. My name is Jeff Chen; I'm one of the product managers on the Google Brillo and Weave team. I'm a device guy, the hardware guy, so I wanted to first introduce to you what Brillo and Weave are. This is Google's offering specifically in the IoT space. When we set out to build Brillo and Weave, we wanted to address three major objectives.

The first one is that we wanted to help device makers build for IoT. At Google we have been building connected devices (Android, Chromebooks, Chromecast, and so on), and we realized that third-party device manufacturers could benefit from these solutions, because they struggle with trying to figure out how to make devices cost-effectively or to build with security. So we wanted to help device makers prototype their device applications quickly and then, more importantly, port that over to production hardware without requiring major surgery. And it doesn't end there: with connected devices, it's really important to be in a position where you can
actually monitor the devices that are in the field and be able to push out updates to them, to resolve any security concerns or to add new features that continue to improve the user experience.

Secondly, we wanted to create an open ecosystem. As we observed the growth of these IoT ecosystems, we saw that there were subsets of siloed ecosystems that were causing more and more fragmentation. While there were some very good reasons for vendors to lock down the connectivity of these devices with proprietary protocols, it keeps us from getting to a truly open Internet of Things. The only way we can get to this Internet of Things is by embracing open and standardized communication protocols.

That leads us to the third and final objective: with an ecosystem of devices with open protocols, we can create a rich ecosystem of apps and services that could be transformative. Think about the web; think about mobile and Android. The intelligence and the smarts of these devices will not come from a single vendor, nor from Google. They come from you, developers: developers who want to build applications and services that can communicate across devices from different brands, and who can build rich analytics around these devices to help make them better and more useful to the user. So that's why we built Brillo and Weave, because ultimately we see that the magic comes from a distributed world of devices. We think that the power of this new internet requires the ecosystem to build devices cost-effectively and at scale, and these devices have to communicate with each other through open protocols, whether the smarts are in the device, on clients, or in the cloud.

So let me tell you about what Brillo is. Brillo is an operating system that's maintained by Google. It is a version of Android, it is supported by a variety of silicon vendors, and it has proven scale, reaching over 1 billion Android devices. We recognize that
deploying a full-blown Android stack is a little bit overkill for connected devices, so we built a stripped-down version targeted specifically at connected devices. Device manufacturers can leverage all of the benefits of the Android hardware ecosystem: if a chip vendor supports Android, it can essentially support Brillo.

The second aspect of Brillo is security. Brillo enables devices to be secure because we fundamentally believe security has to be built into the device for IoT technology. We believe that not only do you need to have it built in, but you also need a way to fix issues in the field as they come up. So we've built a number of security features into Brillo that address these problems, including verified boot, which means that the device will run trusted code from the manufacturer. We also make sure that access to the device, and to all the data that's produced through these devices, is controlled by the user, and that any data that is stored on the device or transmitted across the communication pathways is encrypted and secured.

Lastly, Brillo comes with services like analytics and updates. For example, device manufacturers can get a readout of the features and usage of their device population in aggregate, and figure out how users are using their devices. We also provide an infrastructure to push updates, so that as security threats come up, or as you see new features that you want to deploy, you have a pathway to push those updates to those devices.

Now for Weave. What is Weave? Weave is our open communications platform. It consists of three components: a device library, a client API, and a cloud server. If you build Weave into your device, it's easy for third-party application developers to access your Weave devices from their phone or a cloud service through the Weave mobile and cloud APIs. We believe in interoperability
so much that we built out device schemas for a specific set of device categories, so that third-party app developers can use these schemas to build apps that monitor device state, control the devices, and build analytics and richer experiences across all of these devices. In addition to offering third-party APIs, Weave also offers seamless integration into Google services such as Google's voice assistance.

So the Brillo and Weave platform enables ease of use, ease of building devices, and standardized device data. We expect these devices to generate data, and a lot of it. Google Cloud Platform offers a proven end-to-end solution for IoT data, to make sense of large amounts of data and produce meaningful insights. These insights can help manufacturers determine what features to add to or remove from a product, help app makers develop novel experiences and services, and help enterprises manage their fleet of devices for optimal performance and security.

That gives you a little bit of an overview of what Google is doing to enable the device ecosystem, with an operating system, Brillo, and a communications platform called Weave. Our team looks forward to all the amazing and awesome things that you are going to be doing on the cloud side to take advantage of these smart and connected devices. Thank you all. Handing back to Preston.

As I said earlier, much of what we've talked about today has been about the technology of the Internet of Things that runs outside of the context of Cloud Platform. This is important because unless you clearly understand how the on-ramps for IoT data work and how to solve for them, you can't take advantage of all the great big data tools we have on Cloud Platform. So it's important to understand how to get the data onto the cloud. But without a doubt, data and analytics is our strong suit on Cloud Platform, and the key here is that we provide a comprehensive platform for working with IoT data. And what's
interesting is that as we follow this data through the journey from the edge to the application, it begins perhaps as highly specific data that might be coming from a machine-specific industrial protocol or a healthcare device. But as it moves and is ingested into the platform, and is processed, normalized, and standardized into the structured data that's well suited for our big data toolchain, it begins to lose some of its IoT specificity. Temperatures just become integers, and cycle times on machines become just timestamps. So as our general-purpose big data toolchain processes and munches this IoT data, it becomes a bit more generic. As it moves out into the application is where it regains this domain specificity, because at the end of the day, the insights and the queries that you run against the data are what turn this volume of data into the useful, actionable insights that you need to present back to processes or users through applications. This is why the application is called out here separately.

I want to tell you about one such application that we've incubated at Google for Work and launched in a multi-store pilot with a retail customer. It is a solution that provides digital inventory visibility and accuracy. This works by putting RFID tags on every item in the store. Readers on the ceiling can then scan the store and generate a live inventory feed of what is in both the front of the store and the back of the store. This application sends a steady stream of the real-time view of the inventory up to Cloud Platform, where it's processed and analyzed and then made available as an application to store associates. Store associates can take this information, and it boosts the accuracy of their in-store inventory visibility to upwards of 95 percent, compared to the 60 to 70 percent typically gained through a standard point-of-sale system. This allows them to track
inventory from the back room to the front of the store, to make sure that product is always present where customers need it. For the corporate office, this also gives all sorts of visibility for changing and adapting marketing campaigns, adapting store HR hours, etc. And once infrastructure such as this is in place, it begets subsequent ideas for other applications that can bring about what were once considered far-off future ideas: virtual changing rooms and try-on stores, touchless payment and checkout systems, etc.

This is going to bring us back to security now. Security, like device management, is something that needs to touch the entire data handling stack end to end. If you were in the keynote earlier, hopefully you heard how much attention we pay to security on Cloud Platform. It's something that Google takes very seriously and that Cloud Platform builds into the fabric of all of our products and services. For IoT, it's important that we offer both secure APIs and methods for bringing the data onto the platform securely, as well as the security controls and foundations to make sure that you're able to keep that data secure once it's on the cloud.

On the device side, as you've heard about Brillo and Weave, device security needs to be built into the fundamental systems and technologies you're choosing. Sandboxing allows the protection of one process from another. Over-the-air updates and verified boot make sure that you're running secure code and are always able to push out the latest secure versions. Keeping user data secure on the device and in transit to the cloud is very important for consumer applications as well as for commercial applications. And finally, using secure Linux technologies such as mandatory access controls limits the power of any exploit to do more damage or take control of the device.

But it's not enough for us to just do what we can at Google. To really secure the Internet of Things requires collaboration
across a broad ecosystem of partners. One of the challenges in IoT applications is maintaining this end-to-end integrity and secure ownership of a device. So I want to tell you about a collaboration we've worked on with Intel and the technology they're calling Marshall Point. The idea is that by using a hardware proof of authentication, a technology Intel calls EPID, licensed to Atmel and Microchip, a new owner or a new servicer of a device can meet the manufacturer or the previous owner of that device at a virtual rendezvous point and, through some cryptographic signatures, digitally take new ownership of that device and securely provision it with application or cloud credentials. This is an example of what I think it takes to really secure IoT end to end, because if you have any gap in that chain, that becomes an unacceptable surface through which to breach your system overall.

So in conclusion, why do you want to get started building IoT now? First, it is a fantastic time for hardware: with the proliferation and diversity of sensors and platforms, it's never been easier to get started with hardware. Second, Google is bringing its expertise and experience working with both devices and data across the entire spectrum of the IoT problem space. And third, we do so with an utmost focus on security throughout, as you've been hearing. So I encourage you all to think about the analog information that's relevant to you and out there in the world today, and to go start building solutions to capture and process that information on cloud.

Thank you very much. We'll let you get to lunch. If anyone has questions, Jeff and I will be [inaudible]. Thank you.
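
As a rough illustration of the four kinds of device information described in the talk (metadata, state, commands, and telemetry), here is a minimal Python sketch. It is not taken from the talk or from any Google product: the `Device` class, its field names, and the `run_self_cleaning_cycle` command name are all hypothetical, chosen only to show why writing state is idempotent while repeating a command is not.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Device:
    """Hypothetical device model; every name here is an illustrative assumption."""

    # Metadata: information about the device itself (changes rarely).
    device_id: str
    firmware_version: str
    # State: the current condition of the device, e.g. a thermostat set point.
    state: Dict[str, float] = field(default_factory=dict)
    # Telemetry: (timestamp, sensor name, value) readings about the environment.
    telemetry: List[Tuple[float, str, float]] = field(default_factory=list)
    # Counter that makes the effect of repeated commands visible.
    cleaning_cycles_run: int = 0

    def set_state(self, key: str, value: float) -> None:
        """Idempotent: writing the same value twice leaves the same result."""
        self.state[key] = value

    def handle_command(self, name: str) -> None:
        """Not idempotent: each delivery of the command repeats the action."""
        if name == "run_self_cleaning_cycle":
            self.cleaning_cycles_run += 1
        else:
            raise ValueError(f"unknown command: {name}")

    def record_telemetry(self, timestamp: float, sensor: str, value: float) -> None:
        """Append one time-stamped sensor reading."""
        self.telemetry.append((timestamp, sensor, value))


thermostat = Device(device_id="thermostat-001", firmware_version="1.0.0")

# Setting the same state twice is harmless.
thermostat.set_state("set_point_c", 21.0)
thermostat.set_state("set_point_c", 21.0)

# Sending the same command twice runs the action twice.
thermostat.handle_command("run_self_cleaning_cycle")
thermostat.handle_command("run_self_cleaning_cycle")

thermostat.record_telemetry(0.0, "temperature_c", 20.4)
```

One consequence of this distinction is that a transport which may redeliver messages is safe for state updates, but commands would typically need some form of deduplication (for example, a command ID checked on the device) to avoid repeating the action.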

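The gateway section of the talk describes edge processing as a way to be "miserly" about what is transmitted to the cloud. The sketch below shows one possible form of that idea, assuming a gateway that has collected a window of temperature readings from several beacons and condenses them to one summary per beacon before upload. The reading format, the function names, and the JSON payload layout are assumptions made for illustration; the actual upload call to a service such as Cloud Pub/Sub is deliberately left out.

```python
import json
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

# Hypothetical reading format: (beacon_id, temperature in degrees Celsius).
Reading = Tuple[str, float]


def summarize_window(readings: Iterable[Reading]) -> List[Dict[str, object]]:
    """Condense a window of raw readings into one summary per beacon."""
    by_beacon: Dict[str, List[float]] = defaultdict(list)
    for beacon_id, temperature_c in readings:
        by_beacon[beacon_id].append(temperature_c)
    return [
        {
            "beacon_id": beacon_id,
            "count": len(values),
            "min_c": min(values),
            "max_c": max(values),
            "mean_c": mean(values),
        }
        for beacon_id, values in sorted(by_beacon.items())
    ]


def encode_for_upload(summaries: List[Dict[str, object]]) -> bytes:
    """Serialize summaries to bytes; the transport itself is out of scope here."""
    return json.dumps(summaries, sort_keys=True).encode("utf-8")


# Three raw readings collapse into two summaries, one per beacon.
window = [("pallet-a", 4.0), ("pallet-a", 6.0), ("pallet-b", 5.0)]
summaries = summarize_window(window)
payload = encode_for_upload(summaries)
```

The trade-off is that raw readings are discarded at the edge, so the window length and the chosen statistics have to be decided with the downstream queries in mind.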
4 Comments

  1. Dawit Uta said:

    thank you!! It is Very interesting!!! how can we get sensor data set specially (air quality sensor data set like CO, Temp, PM2.5, VOCs)
    thank you for responding me!!

    June 26, 2019
  2. Yaghiyah Brenner said:

    so much fighting, IOT is suffering severely from the same problems browsers did in the 90's, same with mobile. Why has none of these companies formed an ISO alliance for IOT protocol standardization like TCP, no! cause everyone is greedy and want to lock vendors on their platform. The IOT road is very long and bumpy.

    June 26, 2019
  3. Spawny Chancer said:

    Big Data is a loosely defined term for the effective management of a huge volume of information characterised by data that is too complex and/or unstructured. Big data solutions should contemplate approaches to structured, semi-structured and unstructured data.

    June 26, 2019
  4. Boris A Buchancow said:

    Interesting.

    June 26, 2019
