Urban Informatics and the High Frequency City



Well, thanks for joining us. My name is [inaudible], and we will resume in the fall, in early September, for this series, which is supported by the Office of the Provost and co-sponsored by a number of campus units. Now, with great honor and tremendous pleasure, I'm so delighted to introduce today's speaker, Dr. Mike, from University College London, where he's professor emeritus and the chairman of the Centre for Advanced Spatial Analysis. Mike needs no introduction at all. He's a renowned scholar across the globe, most well known for his very influential work on the modeling of urban systems and complex systems, and most recently he has advocated a new science of cities. He's successful in so many different ways, but I could at least mention one feat he recently completed: for the past several years, still publishing, travelling, and doing activities for the community. He continues to receive awards, although he has already received so many awards and accolades. Just to mention a few: he's a Fellow of the Royal Society, and several years ago he received the Nobel Prize of geography, something like that; I can't say the name, it's French. If I went through all these accomplishments, that would be a lecture in itself, so I think I should stop.

But there's just a personal story I would like to share. I have known Mike since I was a graduate student. I was presenting at a conference and he was in the audience. I was presenting a piece of visualization work, and I knew he's an expert on the visualization of urban systems. I was nervous, but I survived. After the conference I went to my wife's hometown for our marriage, which was a few hundred miles away, and my wife now and I were visiting a beautiful park, and we recognized each other, and I thought that was so amazing. From that point I have somehow believed there must be some magic force connecting me to Mike. Since then I have regarded him as my role model, and increasingly over the years I have found there are so many people regarding him as a role model. He's obviously so knowledgeable, that's not a surprise, but he's personable,
approachable, and humble, and whenever we need advice we ask him for it, so he's just so great.

Thank you very much, Shaowen. Thank you. Okay, so what you didn't say, and I thought you were going to say this, is that I have been here before. Thirty-three years ago I spent three months at the university here, and there's only one person in the room, I think, who was there then, and that's Lou Hopkins over there, who was then head of the planning school. We had a small grant, I think from the NCSA, to look at models, so they employed me during the summer in '86 for three months to do some work on that. The IBM PC extended model had just come out, and Larry Smarr, who was the director of the NCSA, had convinced IBM to give them 20 or 30 machines. Visualization was very big in those days, scientific visualization, and that's effectively what we did.

Okay, so the title of this talk is urban informatics and the high frequency city. Let me tell you what I'm going to talk about, to give you an idea of what we're going to cover. I'm going to say a bit about urban informatics and the smart city. We don't need to get hung up on the terminology: the smart city is a terrible term, and urban informatics is not much better, but generally speaking it's focused on the notion that computers have scaled down and have been embedded into everything around us, and that this is generating data in real time, and of course it has a major impact on the city. So I'll talk a little bit about urban informatics and the smart city, very briefly, and then I'll introduce this idea about the high and low frequency cities. By high frequency I really mean the 24-hour city. What's been happening is that large amounts of data are now available that in various ways are streamed to us in real time, and this has put the 24-hour city very high on the agenda. Previously, if you go back 10 or 15 years, this was not the case at all;
most people thought about cities in a low-frequency sense: we thought about how they changed over a matter of years, or even decades for that matter. Of course, there have always been people who've managed the city in a 24-hour context, but in some senses we're only now beginning to think there are theories about the city in this kind of high-frequency sense. Anyway, that's the context.

I then want to talk about a short history of big data. Big data is one of the great buzzwords of contemporary times. How big is big? So I'll say something about that, and also indicate that big data is both more than it seems and less than it seems.

Then I want to move on to what is really my main example, and that is looking at some of these ideas with respect to the real-time streaming of transit data in London, so it's related to mobility. We have a lot of big data from Transport for London. The Oyster card is the smart card that you use to travel on public transport in London. To give you an indication of how big this problem is: about 45 to 50 percent of people, perhaps a bit more, really 55 percent of people in Greater London, that's the city, the inner part of London, which is about eight million people, are travelling by public transport. About 38 percent of people in Greater London travel by car; the rest walk or bike. So, in other words, very much the dominant movement pattern in central London is by public transport, and on public transport you can use this smart card, which you top up with money, and you tap in and tap out on the tube. About 85 percent of people travelling are using an Oyster card of some sort; they're using the smart card. The other people travelling on the subway are probably using a Network Rail card, which is the old paper card, which you can't top up; it really relates to simply buying a ticket at the station.

Okay, I'll then talk about learning about mobility from this data set. It's a very interesting
data set, and I'll talk a little bit about variability. One of the key things in the data set is heterogeneity. One of the key issues about the modern age, I think, is that we're learning more and more that things are not homogeneous; they're highly heterogeneous, and we can certainly see this from this data. Then I'll talk about what we can do with this data. We can identify disruptions on the system, signal failures, stalled trains and so on, and we can also look at the variable locational dynamics of demand. So I'll spend most of the time talking about this kind of problem.

Then, very briefly, I'll just point you in the direction of other real-time data. Many of you here know about this. There's a lot of data now on the public bike schemes. I noticed that here you don't have the Chinese public bike schemes that have appeared in Britain recently, Mobikes and so on; there are plenty of those around, and that's playing havoc with regulation, of course, because our public bike schemes in London are docked bikes, which are organized by local government. So I'll talk a bit about bikes, then social media, maybe one or two things like that, but this will simply set the context.

Finally, a few notes about what we can learn: the limits to big data and the need for big theory. Another one of the themes, I think, is that, okay, we have big data and real-time data, but we need theory to actually mine it. The kind of mindless effort of data mining, machine intelligence, et cetera, has to be paralleled by ideas about theory. If we go in there and we don't know what we're looking for, we're never going to find anything. We find things by knowing about theory.

Okay, so let me introduce these ideas and first list a few things about the idea of the smart city. All of these points are things that we already know; I'm going to make them simply to, again, set the context. Computers
continue to scale down in size and cost and scale up in speed and memory. That's really Moore's law, so miniaturization shows no sign of stopping. That's the first point. As computers have got smaller, we can now embed them everywhere. In 1954 there were 200 computers in the world; there are probably 200 computers in this room at this particular point. And of course, because we've got real-time streaming, we've got the high-frequency city in this room: the video is streaming in real time, and we're reaching out to an audience that, I assume, certainly has more than 200 computers in their pockets and so on. So, in other words, this is really a kind of sea change.

It was always the case, back in the day. In 1949 or '50 Turing wrote a paper on computers and machine intelligence, something of that sort; "Computers and Intelligence", I think it was called. He basically said, and many people like von Neumann and Turing and so on had said this a number of times in the forties, that the computer was the universal machine. Of course it's taken a long time. I'm not even sure that von Neumann and Turing really believed, in the philosophy of it, that that was the case. What they were doing was clearly universal, but it continues to take us by surprise that we've invented something, that something has emerged, which is truly universal.

Okay, now the latest phase of this embedding of computers is into the environment all around us, into the built form. Again, this room is a good example of the embedding of computers. I'm using a computer, there's real-time streaming over there, and nobody's looking at their phones. I would have expected people to be looking at their phones, so don't feel embarrassed if you are. In some sense, again, this kind of computation and embedding has given us the ability to parallel process to
some extent. So the next stage, of course, is the embedding of these things, probably into us. Once the smart city, which is the embedding of computers into the city and the environment around us, runs its course and we accept it, then it's probably embedding into us. That's already happening, with chips inside us and so on, and of course in a way we're already doing this because we're carrying around devices all the time, our mobile phones and so on, which enable us to interact in different ways.

Colloquially, all of this is called the smart city. It's a very much overworked term in some senses, but it does give an index to the kind of thing we're really talking about. The term smart city has been around for about 20 years. There's a book called The Technopolis Phenomenon, in 2002 I think, which actually has a subtitle something like "smart city". But there are lots of other words for it: the digital city, the virtual city, the information city (Castells calls it the information city), the wired city (Bill Dutton calls it the wired city), and so on. All of these terms are really synonymous with this idea of the smart city, and in some senses this is important because it is an indication of the world of cities, which is changing.

Also, as these information technologies evolve, cities are getting more complex, and our theories always seem to be lagging behind. So one consequence of the smart city is this increasing complexity. Clearly human evolution is not just related to information technologies, there are lots of other things too, but nevertheless complexity is increasing quite dramatically. Again, it's a digression from my theme here, but to some extent our theories of the immediate past, and certainly theories of the last half-century, look increasingly dated in terms of making sense of what we see out there in the city, and that's a very, very big challenge, I think, to all of
us in the urban domain, but it's a challenge to science and social science more generally. So the 24-hour city has come onto the agenda quite dramatically, and so far we don't really have a clear idea about what all of this means. As Shaowen mentioned, I had done some work on developing what we call the new science of cities, which is really the application of physical ideas to the city, not so much to do with smart cities, but despite the development of these sorts of ideas, in some senses we really don't have a clear view about how to understand the city.

Okay, so a better way of saying all this is to show you a diagram. This is my kind of potted diagram of the smart city. In the top left-hand corner we've got the city, which is the thing we study, and in the bottom right-hand corner, in the orange kind of color, we've got our theories about the city. To some extent this circular diagram relates to the way we actually develop our ideas about the city. The city is something out there. Arguably we could talk about the notion that the orange box is inside the yellow box and vice versa; I'm not going to get into that, though there is an issue about it. But to some extent this represents the scientific method: we draw data from the city, which is the arrow coming out of the yellow box, and we then produce ideas about the city, which potentially we send back to the city in terms of things like planning, action, decision and things of that sort.

So that's the model that really dominated prior to the last 20 years or so. What has really changed in this time? We should say that computers have been used for the last 50 years to study this process of developing theories about the city, but probably about 10 or 15 years ago, what happened was
that computers and sensors were embedded into the city itself, and the data that those computers and sensors are producing is the real-time streamed data. So the two yellow boxes I've just introduced represent the change. To some extent, information and control is the reason why computers and sensors have been introduced into the city, and out of this comes data, basically real-time streamed data. It's big because it's continuous in this context, and the other important point to note is that it's like an information exhaust. Unlike much of the data we used to use to study cities, like census data and so on, this is highly unstructured data. One of the reasons why there's a strong emphasis on data mining, on looking for patterns in this data, is that it tends to be unstructured. It tends to be produced for a purpose which is not necessarily close to what we're interested in, so we need to mine it for patterns.

Okay, so let me just augment the box a little bit here. For example, we've got models in the middle; this is loosely called, these days, urban analytics. We've got strategic models, which to some extent are more about the low-frequency city, the longer term, but we also have routine models. In fact, in terms of the 24-hour city, the high-frequency city, these routinized models have been around for quite a long time, but they've had a recent massive increase in significance, largely because of the kind of data that's now available. So this is urban analytics in that sense, and the other feature is that we've got big data. So this gives you a kind of potted summary of what I'm talking about, just the general context.

Okay, so high and low frequency cities. I've already made this point and I'll make it again: the real issue about analytics is real-time streaming data, and this has given a strong momentum to the move to what we call the high-frequency city. It's the 24-hour city that in
the past has been managed in a rather low-key way, and increasingly it's being managed in a slightly higher-key way, largely because of the advent of this big data, which is providing us with a much richer sense of what cities are all about. Much of our urban theory and data in the past, and I made this point a moment ago, has been about the low-frequency city, so cities changing over years and decades. Now, of course, we're in the business of thinking about how cities change over seconds, minutes, hours and days, and that is revealing lots of interesting things about the structure of the city, and indeed about how we can plan the city to improve the quality of life.

Just a couple more basics before I move fairly swiftly to various examples. The way we access the smart city is through technologies that let us generate new data and information. We access it through mobile and fixed devices. Increasingly, I think, we're accessing the smart city through mobile devices, although my example will mainly be about fixed devices, which are sensors embedded into the transit system. In fact, much of the increase in data that's available on a continuing basis is from mobile devices, from our smartphones and so on. This, of course, complements rather than substitutes for the data that we've collected and used in the past, and it's introduced time into our thinking in a big way. It's all part and parcel of this increasing complexity of the city, which has generated more timescales, more opportunities, more networks and much greater diversity. So that's really the context.

Okay, now let me move to the big data part, and I'll spend about five to ten minutes on this before I move into my examples. Data that's streamed in real time is referred to as big. As long as the sensors are on, we never know how big the data is, because to some extent the data is only as big as the time for which the sensors have been switched on; once
they're switched off, we know how big it is. It's big with respect to its volume in this sense. The conventional definition in business is the so-called five Vs: volume, velocity, variety, veracity and value. It's a kind of business cliche, really. But to me, big data is probably defined in terms of volume, and a very good casual definition of big data is: will it fit into an Excel spreadsheet? If your data doesn't fit into an Excel spreadsheet, it's probably big. Excel spreadsheets have gone up to a million rows; however, if you packed an Excel spreadsheet with that amount of data, you'd crash your computer and probably spend half the day unscrambling it. But nevertheless you get the idea that this kind of data is data that requires special techniques, or techniques which are not really available on the desktop per se.

Now, the point I'm going to make in the next five minutes is that data can really be big even if it isn't streamed. Traditional data can be big if it's blown up into categories in some sense. For example, in cities there is a very obvious way of thinking about small data which becomes big very rapidly, and that is that if we have a number of locations and we start looking at interactions, at flows between locations, networks in that sense, then the data becomes big. Very quickly we can begin to scale up and demonstrate to ourselves that we have always had to look at big data in that sense. So we have to be careful about what big data is. It's big under different circumstances, and at the same time we've got to consider that small data is significant. So data can be big in the traditional sense but not in the high-frequency sense, if we have n locations and we then begin to look at interactions. Arguably the way we should look at cities is not in terms of locations, it's in terms of interactions, or rather we should do both. Now, this
idea of thinking about cities as interactions, which is small data turning into big, has been around a long time. Here are a couple of maps which are big data based on flows, big data historically. The map on the left-hand side is a map of traffic flow in the hinterland of Dublin, the capital of Ireland, in 1837. In 1837 the British Army was commissioned by the Treasury in London to do a survey of the amount of flow into the city of Dublin, or the town of Dublin as it were, from the hinterland. The scale of that left-hand map is probably about 20 miles across, east to west. You may ask the question: why was the British Army commissioned to produce a map? These flows into Dublin are things like horses and carts, people walking and so on. You've got to remember that this was the very early days of the Industrial Revolution.

The question I always ask the audience is: why was the map produced in 1837 by Lieutenant Harness of the Royal Engineers? It's a good test of your economic history; it would be a good test of economic history for high school students in British schools. Of course, in 1829 the first passenger railway was opened between Liverpool and Manchester, and the railway building era began in earnest. We're thinking about going through another railway building era in Britain now, because nothing has changed since about 1870 in terms of the railways. In other words, the Dublin, or rather the British, government wanted to know whether it was worthwhile building a railway, so they did a survey. This is an origin and destination survey. Lieutenant Harness, who ultimately became a general in the British Army, was well known in the Indian Mutiny in 1857, but Harness produced his maps in this context, showing that yes, indeed, there was some justification for a railway, and from that point on a railway indeed was built. The
other map, on the right, is from Ravenstein, who took migration data from the 1881 census and began to plot it. You can see I've taken Ireland again; he did this for the whole of Britain, the island of Ireland and Great Britain. You can see a kind of vector map in which Waldo Tobler has inked in, I think, what Ravenstein had done at one point in one of his papers. It's almost a kind of intuitive flow map, summarizing the main flows between the Irish counties.

Okay, now I'm going to tell you a little bit about this next example, which pertains to big data problems, but if you really want to look at the thing, copy down the YouTube movie. It's an excellent movie, it's about big data in 1955, and it's quite relevant to this question of flows. It's called The Great Railway Caper: Big Data in 1955.

Very quickly: during the Second World War various scientists worked at Bletchley Park. Of course, Turing was the person who mainly led Bletchley Park in cracking the Enigma codes that controlled the U-boats, the German U-boats in the North Atlantic, a big code-breaking operation. At the end of the war, various of the engineers went off to work for private companies and so on. In Britain there was a fast-food chain called the Lyons tea shops. Everybody in Britain in those days would have afternoon tea, and there were 250 tea shops, a bit like a McDonald's, and they had a big supply chain problem. They built a computer system in West London in the early 1950s, with many of the Bletchley engineers, and they had a link to Maurice Wilkes over at the computer lab in Cambridge and so on. Now, it's really primitive stuff: no operating system, all this sort of thing, punch cards, punch tape, intermediate outputs and so on. But at the end of the day they found they had idle cycles, and what do
you do if you have a computer system with idle cycles? You try and sell it. So they had a number of contracts with people, and in 1955 they got a contract with British Railways. British Railways wanted to price their freight, and they had 5,000 stations on the railway. There are only about 2,000 now, but there were 5,000 in 1955, and if you square 5,000 you get a matrix of 25 million. So there are 25 million cells. Well, it's symmetric, because you assume the price of freight is the same in both directions, and you've got to take away the 5,000 on the main diagonal of the matrix, but nevertheless it's a big problem. So small data suddenly becomes big data.

They wanted to price the freight, so they needed to work out the distances between every pair of stations and then produce a pricing system for freight. So they commissioned LEO, the LEO system, the Lyons Electronic Office, to do this job. The engineers began by looking at the problem, and of course they realised that what they had was a shortest route problem, so the first thing they had to do was to solve the shortest route problem. If you solve a shortest route problem today, you use an algorithm that was invented by a guy called Edsger Dijkstra, the Dijkstra algorithm. But of course this was 1955, and Dijkstra didn't invent his algorithm until 1959. So these guys invented the Dijkstra algorithm. They'd never heard of Dijkstra; long after this they went off to work for British Airways or something like that. But anyway, they invented the Dijkstra algorithm. They partitioned the matrix: there are three railway lines running from Scotland into England, so you could partition the problem and solve the shortest route problem in that way. They did all of this, with loads of printout, and they delivered all the output, in 1955, to British
Railways, and they never heard from British Railways ever again. They got paid, but they never heard anything. That's very typically British: you do something and then everybody ignores it. But anyway, the interesting feature is that here's a flow problem that's big data, and the data was so big that it forced them to invent a shortest route algorithm. I don't know whether Dijkstra (I think Dijkstra was probably still alive) was energized by big data in some sense to devise his algorithm. Anyway, this gives you an idea of big data.

Let me move on very quickly. I've got the time here, so I've been going for about 30 minutes. Let me go very quickly, and I'll show you some ideas about big data that I was involved in back in the 60s; you can see this is from my book Urban Modelling in '76. We looked at flow patterns. There are 50 zones there, in the part of North Lancashire north of Manchester. Then the bottom inset shows you the 33 London boroughs, that's the Greater London Authority divided into its 33 boroughs, and it shows you how to plot a matrix of flows at 33 squared, which is 1,089. That's not so big, but it's very hard to visualize. We still have enormous problems in visualisation. Scale the thing up: the top left is one of these Circos diagrams for the 33 boroughs. Those are very popular, and again they were invented in the last 10 years or so. But you can see the issue when you scale up. We've now got 633 by 633 zones, which is 400,000 flows, or potential flows. The matrix is sparse to some extent, but even at this scale it's not that sparse. You can see that I've plotted a couple of sets of flows: one from one of the zones in Westminster, in the center, in the middle, and the one on the edge is the flows out of Heathrow, which is one of the zones on the edge there. Okay.
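As a rough check on the counts quoted so far, the sketch below computes how many origin-destination cells a given number of locations generates, and then runs a textbook shortest-route search on a toy network. The function names and the four-station network are illustrative assumptions, not the LEO team's method or data; the `heapq` version shown is simply the usual modern formulation of Dijkstra's algorithm.

```python
import heapq


def potential_flows(n):
    """Cells in a full n-by-n origin-destination matrix."""
    return n * n


def symmetric_pairs(n):
    """Distinct pairs when flows are symmetric and the diagonal is dropped."""
    return n * (n - 1) // 2


def shortest_distances(graph, source):
    """Dijkstra's algorithm over a dict of {node: {neighbour: distance}}."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry; a shorter route was already found
        for neighbour, weight in graph[node].items():
            candidate = d + weight
            if candidate < dist.get(neighbour, float("inf")):
                dist[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return dist


# Hypothetical four-station network; the distances are made up for illustration.
toy_network = {
    "A": {"B": 4, "C": 1},
    "B": {"A": 4, "C": 2, "D": 5},
    "C": {"A": 1, "B": 2},
    "D": {"B": 5},
}

if __name__ == "__main__":
    # 33 boroughs, 633 zones, 5,000 stations: the sizes mentioned in the talk.
    for n in (33, 633, 5000):
        print(n, potential_flows(n), symmetric_pairs(n))
    print(shortest_distances(toy_network, "A"))
```

For 5,000 stations this gives 25,000,000 cells, or 12,497,500 distinct pairs once symmetry and the diagonal are taken out, which is still far too many to work out by hand.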
So this is looking like big data in this context. Once we begin to display it in some way, then it really does begin to confound everything. I'm not going to tell you what this scatter diagram is, simply to say that once we get orders of magnitude of this sort, half a million points or so, then we've got real structure in the data that much of our conventional statistics never picks up. You don't see scatter diagrams like this in books on statistics. When I learned statistics, you saw scatter diagrams with a hundred points, not a hundred thousand, and certainly not a million or 100 million, which is what we have now. So, in other words, small data scaled up has always generated big data in this context. You don't need real-time streaming for any of this.

Now let me give you another view of big data. We've taken our model and we've scaled it up. This is our standard land use transportation model; a land use transportation model is modeling these flows between origins and destinations. We've scaled it up for Britain, or Great Britain, I should say. What you see is the scale of the zones, which are equivalent really to blocks, I think, or probably block groups in your context, for England and Wales; I'll add Scotland in a moment, that's been added recently. So our model is quite big in terms of the number of interactions or flows. I'll tell you why we've scaled it up in a moment, but we're talking about something in the order of 7,200 squared, which is something in the order of nearly 50 million, and to run models very quickly with 50 million potential trips is quite a challenge in terms of programming. We've moved this model to the web for a variety of reasons, because we want to make it open to anybody, and
the key issue here is that we have big data in the model, but we have ever bigger data when our client runs the model. If you were to run the model now, and because it's online you could do it (there'll be a caveat about that in a moment), you would be a client. There's a server with the model. It's a big model: it's of England, Scotland and Wales. We had 7,200 zones in England and Wales, and when we extended it to Scotland we had eight and a half thousand zones, in the order of about 70 million flows. It's on the web because we want users to do it quickly, and we don't have any limit: we have an open number of clients, an open number of users, who can use it. Every time you run the model, it takes the data and it generates your outputs and your inputs. So if you think about this kind of big data being scaled by the number of clients, you get a sense of what it is. Now we've extended it to Scotland; you can see Scotland here, and that's employment and employment density and population counts and so on.

But let me give you a quick idea of running the model in this context. There's the block diagram of how it works. We have clients on the one side, which is the right-hand side, and we have all the server-side stuff over there. All the GIS and modeling is done on the server side, and some of the GIS drifts over to the client side. Anyway, that's the kind of structure of things that takes place. There are lots of things in this model that elaborate the big data we're talking about. It's a three-mode model. We have detailed networks, the basic equivalent of your TIGER files, I guess, from the Ordnance Survey data; really it's eight million links and three million nodes and so on. So we have detailed networks underneath all of this, which again have to be solved very quickly, because the thing must run instantly. Okay, so this is what you actually get if you log
on, and there's the actual web address, basically. Now we'll see how intelligent things are at the University of Illinois; let me see if I can click on this. No. OK. Oh yeah, OK, whoops, no. Not so good. OK, so I'm not going to click on this; I do have some slides of it. If you clicked on this, then I was hoping that the web link would come up in this particular context, but I don't see how we get PowerPoint to actually do it in terms of this interface. If you click on that, the thing comes up and you would be able to run the model. You can do it on your phone. I don't recommend you do it on your phone unless you just want to see the prototype. It's an alpha version in that sense; it's not really designed for a phone in this particular context. The graphics don't quite scale, but you can actually run it on your phone and it will actually work, basically, at that point. Now imagine all of you in this room logging on. Well, what would happen is our server would just collapse over in London, basically; it just couldn't handle that many. It probably wouldn't collapse; it'd probably just freeze, basically. But in other words, think about the kind of big data that's being generated that way. It's a bit like saying you're doing a Google search and 100 million other people are doing a Google search: how do you handle that? Well, it's the same kind of thing, and that generates in and of itself big data; you have small data being generated that way.

OK, now this is what you would actually get, basically. You can see a kind of sequence of things when you actually click on it in this context. You can see observed population, basically. If you click on this, then you can see Mapbox working to produce this thing. There are all sorts of layers in this: that's green belts around London, that's accessibility, job accessibility of some sort. These are flows, basically.
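As a rough check on the scale of those flows, here is a minimal sketch. It assumes the flows are counted as one cell per origin-destination pair of zones, which the talk does not state explicitly; the function names and the client count are hypothetical and only there for illustration.

```python
def od_pairs(zones: int) -> int:
    """Origin-destination pairs in a full zones x zones matrix (assumed layout)."""
    return zones * zones


def flows_generated(zones: int, clients: int) -> int:
    """Flows produced if each client runs the model once (illustrative only)."""
    return od_pairs(zones) * clients


if __name__ == "__main__":
    print(od_pairs(7_200))  # zone count quoted for England and Wales
    print(od_pairs(8_500))  # "eight and a half thousand" zones with Scotland added
    print(flows_generated(8_500, clients=100))  # 100 clients is a made-up number
```

Under that assumption, 8,500 zones give about 72 million origin-destination cells, which is in line with the "order of about 70 million flows" quoted in the talk, and every additional client run multiplies the volume of output again.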
This is an example of a scenario, because the idea is that anybody using this can actually test a scenario by changing the networks and adding employment. Here, for example, this is West Lancashire, Merseyside, and that little red spot that I just put up is not very far from where I was born and brought up, on the edge of Liverpool. Basically, at that point I'm putting 40,000 extra jobs into that particular context. The model runs, basically, and then you can look at the impact in various ways.

Let me show you where it really becomes serious with this particular model. As I said a moment ago, we're rebuilding the railway network in Britain, for the reason that the network was really built until about 1870, when it finished being built, and then it's progressively been scaled back as the Industrial Revolution turned into the post-industrial revolution. At the moment they're thinking of building a number of high-speed lines. This is the biggest infrastructure project in Europe. This is called Crossrail, which is the Elizabeth line. It's a new tube line, but a tube line with a difference: it runs really from the east, in Essex here, all the way to Reading. That bit in the middle is Heathrow; the red bit is under the center, basically, from Paddington through to Abbey Wood and out to the Olympic Games site. This is a massive project that's taken about, well, 15 years or so. But the interesting thing about it is that when Crossrail actually opens (it was supposed to open last December, but it's two years late, basically), many of the stations have been built and the tunnels have been built, with state-of-the-art trains and so on from, well, Hitachi or somewhere. If you look at the impact of Crossrail, it's actually throughout the network. This is the impact across the south of Britain, and you can see these different clusters. This is improved travel times, basically, at that point, and
then if we look at the population change associated with Crossrail, this is putting Crossrail in and watching everything else adjust; basically there's no total population increase. That's a better one, showing you how, once we introduce the different modes, because Crossrail is competing with the road mode and the bus mode (there are three modes in that, with the rail mode), it's pulling people off those modes, and at the same time, because it's a location model, it's actually pulling people towards it. So this is a measure, if you like, of impact. When we correlate this with house prices, for example, it's quite a high correlation, because to some extent what's happening is that Crossrail has been factored in already, basically, into house price change in London. We're doing the same with that, so that's the reason why we're doing it.

OK, now let me move on. I want to move on fairly quickly. Lots of data, and I'm going to move now to our high-frequency city stuff. Lots of data are available through APIs, which really relate to real-time streaming. So this is our London dashboard. I'm pretty sure I can't click on that either to bring it up; that doesn't matter. If you click on this, you find a variety of stuff, which is the state of the tube lines and so on. You probably can't read it from there, but there's a variety of things like web cameras, the FTSE, the Dow Jones Industrial Average, and a whole range of data which can be pulled off these APIs in a real-time context, and that updates constantly. We have another, well, there's a sort of dashboard wall, basically, but we have a particularly interesting site. It's a pity I can't bring this up. If you do have a phone, copy that web link down and try and bring it up, because basically what you will get is real-time streaming of these cameras. These are cameras in central London. You could pull the menu down in the middle and you'll get the real-time streaming of the buses. Now I tried this
about, I tried this earlier on this afternoon, and of course it's dark in London at the moment, so you'll probably see police cars running around and so on at these particular sites. I'm going to move on very quickly, but tell me if you've actually got pictures that work, because I'm quite conscious that I can tell you all this stuff and then you can say, well, it never works anyway.

OK, and here are some other kinds of real-time things. These are pictures of control rooms: Rio de Janeiro, the control room there; Tokyo; traffic control rooms and so on. And here are a couple of other dashboards which have been developed; they're very popular in this sense. This is the Rob Kitchin Dublin dashboard that he's built, which has got more GIS functionality in it. Most of these dashboards are fairly dumb, but they do give an indication of the kind of thing that is now being packaged, in a sense, in terms of real-time data. Anybody brought it up? No? OK. At the end I will try and bring it up, because then I can get out of this interface.

OK, now let me spend about ten minutes really talking about the real-time Oyster card streaming, and then we'll begin to close it up in some sense. Now, I've mentioned to you the fact that about 85% of passengers using the system use the Oyster card. There's a picture of the Oyster card with somebody tapping in. That's the card you buy over the counter. You can also top it up on your phone, you can use Apple Pay, and there's somebody very clumsily using an Apple Watch on the actual scanner. And this is particularly symptomatic, because this is a picture of somebody using the Oyster card, or really their credit card, from the Royal Bank of Scotland. Now, Royal Bank of Scotland is still owned by the British government. It's not really improved over the last 10 years; it was one of the ones that
really went bust back in the recession. It's somewhat symptomatic that we got that picture from London Transport. OK, that's demand for travel: you actually pay by tapping in and tapping out. We're going to look at the tube now. The other data set that we've actually got is the supply of trains. So think about demand for travel, you tapping in and out, and supply of trains. There's a lot of data that comes out of the train systems, and the one we're interested in is the top one, which is the tube, where we have Trackernet, which tells you the position and scheduling of every train on the network. So that's a picture of the network; those are the trains. The Oyster card also works on the buses, so we're talking about a pretty complex network. The tube network that you often see, the abstracted map, is embedded into that, basically. If you looked at Tokyo, the system would look more like this. So once we put all the networks together in terms of complexity, think about the bus network on top of this, think about the road network on top of it, think about email networks, social media networks, web networks and so on, people logging on and so on: all of that kind of activity is what we mean by increasing complexity in terms of the city.

OK, here are some pictures of the YouTube movie that you can have a look at, done by our group in New Zealand under the auspices of engineering, and then there are a couple of pictures of stuff from the movie there. I should point out what we've actually been able to do in this. If we have tap-in and tap-out data, we're able to figure out, using Dijkstra's algorithm on shortest routes, the most likely route that people travel. One of the basic problems we've got is that there are multiple routes you can travel through the tube system. If I go from Goodge Street to Euston, for example, or Goodge Street to King's Cross, I can go in a couple of
different ways, and you can actually walk there in the time. You can do it a couple of different ways; there are many paths through the network, and the key issue is to work that out. One of the other features, of course, is that the network is quite complicated. If you go into the network and you've never been there before, you go to Bank or King's Cross, then you will find that if I accompany you and I want to get somewhere, I'm much more likely to get there long before you, because you'll probably spend time learning about the network. So there's a real cognitive issue in this context. What we're really interested in is how we take the tap-in, tap-out data, the demand, and join it to the supply data. If I get into the network, I can get down to the train I want faster than you, because where there are seven lines interacting, like at King's Cross, I can get down to the train quicker than you can, because you're reading the signs and discovering it. Getting on what particular train is important, because if I want to look at disruption, I need to know what train that passenger is on, basically, and there's a mismatch. There's no Wi-Fi in the tube; they're trialling it, Transport for London, but it's illegal to take photographs in the tube and it's illegal to use Wi-Fi for purposes of tracking in some contexts. So there's a whole range of issues of not being able to connect demand and supply in that sense.

OK, here are some pictures that you actually get out. These are fairly standard. The horizontal represents the 24 hours. The red, the top left, represents the number of tube stations, something like 290 tube stations, and the morning and evening peaks. The morning peak is narrower and more elongated than the evening peak in this particular context. You can see the lag between tap in and tap out, in terms of entries and exits, in terms of the morning
and then you've got three pictures of different tube stations and what the profiles are. They're particularly interesting. Now, none of this is new: this data was collected years ago by hand, very slowly and painfully and not very well. It's now automatic in that sense. And we have three different tube stations. Where it says nightlife, that is Camden; you've got a mixture of journey to work, the morning and evening peaks, which are shown differentially. The one below, called work, is Bank station, Bank of England, and as you'd expect, dominantly the flows are work, as it says. Now the top right is interesting. I always ask the audience where they think this is. You've got these three big spikes. Now, you don't need to hold your breath on this one: that is the Emirates. The Emirates is the home of the Arsenal, and the two blue spikes are the two Saturday games. This is a data set done over a month; I think it's a synthetic data set, we've averaged things. So these are the two Saturday games, and the spike on the immediate right is in effect the Wednesday game. So in some senses what we're actually showing in these profiles is a lot of interesting things with respect to the actual activities surrounding this.

My postdoc, who is now a lecturer at King's, Chen Zhong, who did a PhD in the Future Cities Lab in Singapore, wrote some very interesting papers on Singapore with Gerhard Schmitt and myself and one or two other people, looking at tube data there. In one of our papers she inferred what the land use was around the tube stations from matching the tap-in, tap-out data in Singapore from the EZ-Link cards with some extra data that the Singapore transit authority had actually collected in terms of surveys. So you can begin to add value to these sorts of data sets by having the right linked data. OK, now I can't show you this because my interface won't work here, but if I clicked on the
top left you would see, over a week, the morning and evening peaks; it's a movie, basically. If you clicked on the one on the right, that's the position of the trains. Well, I'm not going to crash out of this PowerPoint, because I probably wouldn't get back in with this interface, but at the end I will just show you very briefly, if we can do this, because there are some quite interesting things about the quality of the data.

Now, very quickly then: we've done a number of projects on learning about mobility from this data. We've done a comparison between Beijing, Singapore and London, and what we find is there's much more churn, if you like, in the London system, much more variability, largely, we reckon, because the London system is very old. So if you go into the London system, say you're at Victoria, which is a big hub, Victoria may be closed two or three times a week; it'll be closed for half an hour, with people queuing to get into the tube, due simply to overcrowding, but also due to the fact that the capacity of the subway system is limited. The brief story is that the London subway system was really built between about 1870 and about 1920, although one or two new lines have been produced.

Learning about mobility from the data, again: disruptions, signal failures. So here are some pictures of crowding, physical crowding, in this particular context. That's the network. And here's a case of disruption. This was when the Circle and District lines were closed (that's the abstracted tube map, the Beck map), and these two lines were closed for four hours. Now, closing a tube line for four hours in a city like London is really a disaster. This is the situation where a variety of things happened, a kind of perfect storm really: signal failures, stalled trains
and so on. It was closed for something in the order of four hours on the 21st of June, on the 19th of July 2012, and we could figure out from the data that 1.23 million Oyster card holders were actually disadvantaged by this, in that they spent a longer time travelling. Now the issue is that when we compare the picture that we get in terms of travel from this particular point with the synthetic data set, the differences enable us to actually begin to look at what's actually happening: longer travel times in the network, people diverting onto buses, leaving the subway system and so on. We can figure out a lot of this from the data. In other words, we can look at the cascades that come from the data.

One of the problems, of course, is that on the bus system in London you don't tap out; you only tap in, for a whole variety of reasons. The drivers of these buses, these one-man buses, reckon they're able to actually see the passenger tapping in, but it's much harder to see the passenger tapping out, for cognitive reasons. So consequently you don't tap out on the buses, whereas on the Singapore system you have to tap in and out. Well, it being Singapore, you have to tap in all over the place. But in London they don't tap out, so that's a problem, because we can't really look at any integrated modal behavior, bimodal trips and so on.

OK, here are some pictures that Ed Manley has produced for us on different travel times. So these little circles represent origin changes; these are increased travel times, these are origin changes, these are destination changes. Here's a picture of relative regularity. You don't need to look at this in any detail; we're just saying that at the darkest spots we get much greater regularity of travel in this particular context. That's measuring regularity in terms of
stations, and that's really measuring regularity in the middle of the day. Previously this is the peak; this is the middle of the day. We've done lots of bits of work on looking at the dynamics of demand in this particular context. Let's see if this one works. I seem to have lost my mouse, but OK, don't worry about it. So we can't see the trains moving.

One of the interesting features: this is a map of the trains moving, basically. If you actually examine that map, if I was able to show you the movie, some of these trains would not be on the line. You can see the purple line here, which is the Metropolitan line going out to the northwest; one of the little dots is a train which is actually off the line. And the issue is, I said to our programmer, Richard, can you get the trains back on the line, just for the visualization, because people see these things and say the trains are going off. He said, Mike, this is a big data set. I could get the trains back on the line, but it'd take me a thousand years, because I'd have to pick up every individual train and place it back on the line. These were the glitches; this was the noise in Trackernet. The noise is such that these things spin off all over the place. If I showed you the movie, what happens is the clock goes round and round, and at about 9 o'clock or 10 o'clock there's a kind of glitch where the system is reset. So that's happening all the time. Trackernet is notoriously volatile in this sense.

OK, streams of data; I'm going to miss that. We've put together delays from Tube, National Rail and buses, fused in a sense, so there are some interesting things of that sort. One of the things that we're very interested in is looking at polycentricity in terms of the hubs in London, and what you see here is a number of hotspots. You actually see the morning and evening peak, which is the red and the other, and we can actually
examine how the stations change in terms of the morning and evening peak. I can't run the movie, unfortunately, because it's actually embedded in this, but from that we can actually begin to classify the stations in London into different kinds of hubs.

OK, now let me just show a couple more slides and then I'll finish off and we can open it up for questions. There's lots of real-time data out there. We've looked at bikes, for example. These are the fixed docking bikes, and were I able to run the movie, you could actually see the movement at these docking stations with the bikes in. That is actually in the morning sometime; I can't tell whether you can actually read it, but it's got a time on it. Basically that's the configuration of the bikes in the center of London. There are about 10,000 bikes, I think, in the docking stations, and were I to show the movie, what would happen is that during the peak these bikes would be picked up and moved into the central bit, and then in the evening peak they come out, because the mainline stations, the big rail stations, are all around the edge of the CBD: King's Cross, Paddington, Euston, Victoria, Waterloo and Liverpool Street. So you have this movement in and out, in terms of the tube system and so on. You have this sort of doughnut-like thing in the middle of the centre, and you have a lot of these cross patterns, which are interesting because to some extent until we had this data it wasn't super clear.

We've also done work on Twitter. Ed Manley and James Cheshire have done this work on Twitter languages; we plotted this sort of stuff and so on. And more recently we've done some work on Airbnb. That's the growth of Airbnb in London, and Zahratu Shabrina, one of our PhD students, is looking at this. She's more interested in how the location of Airbnb relates to the sort of attractions that
you see in the top left-hand corner, but in the process of doing that we found that Airbnb is very disruptive to the housing market, to tenure. It's taking a lot of rental accommodation out of the private rented sector, which is being turned over to Airbnb. Quite dramatic growth: I think there are 80,000 Airbnbs it has grown to in London. And we're talking here about a very fragile and vulnerable housing market, with very little housing being built over the last 20 or 30 years for all sorts of reasons, and also massive immigration, short-term immigration, into London from a variety of sources. So taking stuff out unwittingly from the tenured stock is problematic, and we've done a variety of things in this particular context.

OK, now the last thing. I'm going to leave this open: what can we learn, the limits to big data, the need for big theory. I've made all these points in a sense, and if we want we can discuss some of them. But to finish, let me give you some references. We did a special issue of the journal Built Environment, which was edited by myself, and you can see there's an article after the editorial by Carlo Ratti and his lot, and then there's one on finding pearls in London's oysters, which is our Oyster card project. Here's an arXiv paper we posted a couple of weeks ago, by the three of us who've looked at the Airbnb stuff, and there's one on the variability in temporal mining with Chen Zhong, and then here's a paper that she's written with one of our PhD students, in ISPRS I think. In terms of the modeling stuff, we've not really written anything on our big model yet, but the nearest you get to it will be an Environment and Planning paper by myself back in 2013. So at that point I will stop and hand it over to any questions. I don't know how far out of time we are. Question: there's discussion here in the U.S.
in terms of transportation systems being impacted by the rideshare industry. Is there anything in your research in London that confirms that kind of impact? Second question: I just heard, I think it was on the radio last week, that they've been building a new line in London's mass transit.

By rideshare you mean things like Uber? OK. I'm not aware of any work that's been done in London. First of all, there's a lot of concern in London about regulation; in fact at one point the mayor basically decided that Uber was to be banned from London. That was about a year and a half ago, and then the decision was appealed against and it was not actually banned; this was the proposal. One of the critical issues is the impact on public transit. In fact public transit is quite expensive in London, and the impact is probably a bit less than you might expect, but nevertheless there is concern that it will take transit away from people who can only afford transit and leave them sort of inaccessible in some sense. There's a very good bus service; it's a very good public transport system in London generally. You'd expect that with only, I think I said in the early slides (you were probably not there), something like only 38 percent of people travelling in Greater London travelling by road, and the rest mainly travelling by public transport. So it's a well-served system, but I take the point about Uber.

Now your second point: I actually did cover the example of the Elizabeth line, which is Crossrail, and that will actually be a very high-speed underground line, with trains every two minutes. The headways are such that they're able to do this, with many, many trains actually going from almost Essex all the way through to Reading, a new link down to Heathrow, and so on. So that should carry
a vast number of new passengers, basically, and there is a proposal for a north-south line called Crossrail 2. Many of the debates really relate not to whether we need Crossrail 1 or 2 (Crossrail 1 is built, basically, but not open); the biggest debates in Britain about these big infrastructure projects relate to the issue of whether London gets more infrastructure or whether the disenfranchised north, where people voted Brexit, gets it. Right, and actually I've just explained Brexit by saying that.

OK, Mike, if you could, on the big theory: can you give a clue about what piece of theory is missing, or what this is? I haven't figured out how the empirical goes to the theory.

OK, well, I mean, we have no theory at all about how an increasing number of transactions which relate to the digital world are impacting on our locational decisions. In other words, one of the obvious issues is email, used for a very wide variety of purposes, searching the web, etcetera: the impact of that on where we live, where we locate, working from home. We've got some sense of it, but it's not clear whether it's complementing, substituting, whatever. So it's that kind of thing. In other words, in the past when we looked at the city in terms of networks, we mainly looked at the transportation network, but there are lots of other networks that people are now involved in. How do we actually factor those sorts of things in? That's one thing. The online economy is the other. The housing market bears no resemblance whatsoever to what people like Bill Alonso, rest his name, said in the 1960s. I mean, that's great work in my view, but everything has changed in terms of the way those markets work, and you see that particularly in big cities, where there's massive stress in the housing market. So I'm thinking in terms of that kind of theory. It would be
really great for somebody to actually produce some very insightful way in which we could put a few things together. I wouldn't be ambitious enough for us to be able to put everything together, but to put a few of these things together in a different way. To some extent, the idea in the past was that form follows function; that kind of ended in architecture probably fifty years ago, but in this context you could now say that form no longer follows function. The functions have been disconnected to some extent from the form that we see, and that's part of globalization, the virtual world, the digital world and so on. So that's the sort of thing I'm getting at. I wouldn't like to go any further, because I don't really have a grand plan for it anyway.

I have a very superficial question relating to the big data you mentioned. Because there is some bias in big data, that makes me go back to the low-frequency data for some socioeconomic features or characteristics. So I wonder how we can build a bridge between the high-frequency data and the low-frequency data in the model and prediction part. Thank you.

OK, well, interestingly enough, of course, high-frequency data begins to turn into low-frequency data if we get enough of it. Transport for London donated our data set; I should have said, the big data set that we've got is for three months in 2012. The reason why they donated it to us was largely because they were interested in what would happen to the transport system when the Olympic Games came, which was in August 2012, and they feared that the system would gridlock for a variety of reasons. It didn't, and we were able to study that. But Transport for London are not very interested, in fact they're quite against, giving us a tranche of data over the whole time. The Oyster card system was introduced in 2003, so there's basically 16 years' worth of data. Now, 16 years' worth of data is beginning to look like
the low-frequency city; it no longer looks like high-frequency data, it's in the low. And Transport for London will not donate or let us look at that data, because it actually shows things that they don't want the public at large to know, and they're a public agency. In other words, the media is such these days that if they get hold of that data, you can see some subway stations declining in patronage, others growing massively, and it can lead to all kinds of, from the point of view of Transport for London, undesirable demands for change. But nevertheless, your point is how can high frequency be linked to low frequency. That's one way: getting more and more data, and that really is quite an exciting prospect if we can get it.

The other problem, of course, is that the systems may well change. We've got 16 years' worth of transit data; there is 16 years' worth of transit data available, but the question is, will the system change? There's no guarantee. Probably in the case of Transport for London, if the system changed they'd make it compatible, because there's enough expertise in Transport for London in terms of traffic modeling and things like that to be able to maintain it. But there are lots of other data services, real-time data, where they don't care at all about getting long-term stuff. I mean, Twitter are always changing their API and what you can actually access. So that's actually destroying the bridge between high frequency and the long term.

What you're also saying is that normally low-frequency data is things like census data. I don't really have any answers personally about how we would link that sort of data to high-frequency data, except that we have looked at the transit data, the Oyster card data, and on the Oyster card data we can classify the card by child,
disabled, season ticket holder, a couple of different types of season ticket holder, and also Freedom Pass. Now, I'm a Freedom Pass holder. Over 60, we get a free bus pass everywhere in Britain, and I'm over 60, so I get a Freedom Pass. They're worth their weight in gold, I tell you. If you've been on the London subway, you know how expensive it is. Anyway, the Freedom Pass: so we've got this geodemographics. Now, what we've worked out from the data set is that, relative to the census data in London, where we've actually got age cohorts and so we know where everybody is by age, it looks as though the representation of older people is much lower than what we would expect. That's probably because older people live in the suburbs rather than the centre, and they probably drive cars more, ironically. That's very characteristic of London's situation. So in other words, we are beginning to think there are bits and pieces of traditional data that help us. The basic problem in all this is that it's all inferential. You can only get an intuitive probability that that indeed is the case. Now that's the problem with big data: when you get the pattern, it's just an inference, basically.

Right, right. Well, I mean, to some extent [inaudible]. Although I think, on the question of the theoretical aspect, for the high-frequency city, traditional theory like central location theory, I think it's broken down. A nice opportunity for you.

Well, I think the idea of hotspots in cities, and hubs and so on where activities happen, could well be reinvigorated by this high-frequency data. In other words, it's a bit like saying that central place theory will not go away; it's at many different levels, etc. It may be changing in some sense. So there is a
sense in which all is not lost in terms of old traditional theories. Those central spots? Exactly, and they're much more volatile, there's much more change involved, much more rapidity, and that's part of the increasing complexity. [Applause]
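The route inference described in the talk, taking a tap-in and tap-out pair and using Dijkstra's algorithm to find the most likely route between the two stations, can be sketched as follows. This is only an illustration under stated assumptions: the toy network, its links, and the travel times in minutes are all invented for the example and are not Transport for London data, and the function name `shortest_route` is hypothetical rather than anything from the speaker's own code.

```python
import heapq

# Toy network: station -> {neighbouring station: minutes}.
# Links and times are invented for illustration, not real TfL data.
TOY_NETWORK = {
    "Goodge Street": {"Warren Street": 2, "Tottenham Court Road": 2},
    "Warren Street": {"Goodge Street": 2, "Euston": 2},
    "Tottenham Court Road": {"Goodge Street": 2, "Holborn": 3},
    "Holborn": {"Tottenham Court Road": 3, "King's Cross": 5},
    "Euston": {"Warren Street": 2, "King's Cross": 2},
    "King's Cross": {"Euston": 2, "Holborn": 5},
}


def shortest_route(graph, origin, destination):
    """Dijkstra's algorithm: return (total_minutes, stations) or (inf, [])."""
    queue = [(0, origin, [origin])]  # priority queue ordered by minutes so far
    best = {origin: 0}               # cheapest known cost to each station
    while queue:
        cost, station, path = heapq.heappop(queue)
        if station == destination:
            return cost, path
        if cost > best.get(station, float("inf")):
            continue  # stale queue entry; a cheaper path was already found
        for neighbour, minutes in graph.get(station, {}).items():
            new_cost = cost + minutes
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour, path + [neighbour]))
    return float("inf"), []


if __name__ == "__main__":
    # A tap-in at Goodge Street and a tap-out at King's Cross.
    print(shortest_route(TOY_NETWORK, "Goodge Street", "King's Cross"))
```

In this toy network the route via Warren Street and Euston takes 6 minutes against 10 via Holborn, so it is the one assigned to the trip. As the talk points out, real passengers do not always take the shortest path, so a single shortest route is at best a first guess at where a tap-in and tap-out pair actually travelled.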
