Commutes, Greenspace, Parks and how to grow your own decision tree

Ask any decent estate agent and they will pride themselves on knowing every local detail of their stomping ground, from amenities to zoos. They will know about transportation, schools, hospitals, council tax bands, shopping and parks: where they are and how good they are. However, if you were to ask them exactly how much each of these characteristics is actually worth in pounds per square foot, they would probably, and quite rightly, reply that it is all in the eye of the beholder, since buyers all have very different motivations.

So while we all know that public resources such as transportation, hospitals, parks and schools are very important factors in housing prices, we cannot say how much value each of these factors actually contributes. We do know that, in aggregate, if enough buyers rate, say, proximity to a school or to transportation as sufficiently important, properties will be more or less desirable, and hence more or less expensive, depending on how close they are.

Re-engineering The Decision Tree

Of course, if all buyers had exactly the same motivations, then describing how these variables influence price would be a trivial exercise. In this respect, property portals do not help at all, since the decision tree for selecting a property is so restrictive that it effectively forces all buyers into the same space, with the same decision tree. For example, the Zoopla/Rightmove search order goes a) define location b) set the price band c) choose between flat, terrace, semi or detached house d) define number of bedrooms. To see how unhelpful this really is, imagine these property portals were selling paintings. The listings would appear as follows: ‘Rectangle landscape, size 120 sqft, blue and green with lilies and water’, or ‘sought after portrait, size 1 sqft, mostly dark with funny smile’. Well that's a Claude Monet and a Leonardo da Vinci marketed, job done!

What if buying decisions could be based on a person’s actual unique motivations? This is where Big Data and Artificial Intelligence (AI) come in. To see why, consider what people really mean when they say it’s all about location, location, location. This translates to all those things that make up a place, which is everything you can measure: shopping footfall, reported crime, schools, health, sports clubs, aspect, environment, air quality, noise pollution, transportation and so on. So far we have identified around fifty different objective measures and, while it is very unlikely that a buyer will list all 50 of these, each buyer may well have a very different top four or five.

For example, consider a buyer interested in flats with a maximum commute time to central London of 45 minutes, who values green space and transportation and wants to be close to a park. Where in London could they look to buy?
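A search driven by the buyer's own criteria can be sketched as a simple filter. This is only an illustration, not the Houseprice.AI method: the Listing fields, the sample listings and every threshold other than the 45 minute commute are hypothetical values chosen for the example.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    address: str
    property_type: str
    commute_minutes: float     # travel time to central London
    green_space_pct: float     # percentage of green space around the property
    station_distance_m: float  # distance to the nearest train or tube station
    park_distance_m: float     # distance to the nearest park


def matches_buyer(listing, max_commute=45, min_green_pct=20.0,
                  max_station_m=1000, max_park_m=500):
    """Return True when a listing meets every one of the buyer's criteria.

    Only the 45 minute commute comes from the example buyer; the other
    thresholds are assumptions made for this sketch.
    """
    return (
        listing.property_type == "flat"
        and listing.commute_minutes <= max_commute
        and listing.green_space_pct >= min_green_pct
        and listing.station_distance_m <= max_station_m
        and listing.park_distance_m <= max_park_m
    )


# Made-up listings, purely for illustration.
listings = [
    Listing("Flat A", "flat", 38, 27.5, 450, 300),
    Listing("Flat B", "flat", 52, 35.0, 300, 200),      # commute too long
    Listing("House C", "terrace", 30, 30.0, 400, 250),  # not a flat
    Listing("Flat D", "flat", 44, 12.0, 600, 400),      # too little green space
]

shortlist = [item.address for item in listings if matches_buyer(item)]
print(shortlist)
```

Because each criterion is a separate parameter, a different buyer with a different top four or five simply supplies different thresholds rather than being forced down the same search order.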

To date, studies on property value have largely concentrated on transportation effects, and only a few have focused on the effect that green space has on property values. In these, researchers have mainly focused on specific parks, for example Hyde Park or Green Park, within different communities, rather than on parks in general, to study the average impact of green space on housing prices. Using both parks and actual visible green space, we have been able to use AI to quantify the effect of public resources on property value, especially green space and parks.

Machine Learning

To do this we took transaction price data and the structural attributes of 84,747 properties in and around London that sold over the last 18 months. These were then cross-referenced with additional supporting data, which included structural attributes, location variables and environmental variables. In this study, Inner London is defined as the 150 inner London postcodes. Outer London is classed as between 16 and 35 km from central London. While these studies are quite basic, they demonstrate the way Machine Learning evolves to arrive at a precise value.


Graph 1

Graph 2

Graphs 1 and 2 show that the average percentage of green space has far more impact on price per square meter (PSQM) the closer you are to the center of London. This is as one would expect, with green space being more highly prized the more built up a city becomes.

Graph 3

Graph 4

As graphs 3 and 4 show, London property price per square meter is inversely related to commute distance, which is not a surprise. It is also more widely dispersed the further you radiate out, representing cheaper PSQM for similar commute times.

Graph 5


Graph 6

Graphs 5 and 6 show the effect of being near a train or tube station. What is interesting is that being too close reduces a property's price per square meter, as noise disturbance becomes a factor to consider. In Outer London, you can see very large differences in PSQM up to 2 km away from a station or tube line, which again shows the opportunities that exist in these areas.

Graph 7

Graph 8

Graphs 7 and 8 show the considerable effect of living close to a park. In Inner London, properties fall by an average of £4,264 PSQM for every 1000 meters you move away from a park! This effect is lessened in Outer London, though it is still clearly a factor, as graph 8 shows.

In fact, if we look at just one property type, we get an even more impressive demonstration of how certain factors drive values. Here we are looking only at flats, although we hold all property types listed by the Land Registry, with commute distances of up to 90 km to central London. We have also expressed prices as the natural log of the price per square meter, so the very clear correlation at work can be seen in graph 9, with an R squared of 0.72.


Graph 9
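For readers who want to see how a figure such as an R squared is arrived at, here is a minimal sketch of fitting a straight line to the natural log of PSQM against commute distance. It assumes an ordinary least squares fit, which the article does not confirm, and the data points are made up for illustration rather than taken from the study, so the numbers it prints will not match graph 9.

```python
import math


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in y that the fitted line explains."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical flats: commute distance in km and price per square meter in pounds.
commute_km = [2, 5, 10, 20, 35, 50, 70, 90]
psqm = [12000, 9500, 8200, 6000, 4800, 3900, 3500, 2600]

# Regress the natural log of PSQM on distance, as described in the text.
log_psqm = [math.log(p) for p in psqm]
slope, intercept = fit_line(commute_km, log_psqm)
r2 = r_squared(commute_km, log_psqm, slope, intercept)

# In a log-linear model the slope reads as a percentage change per km.
pct_change_per_km = (math.exp(slope) - 1) * 100
print(f"slope={slope:.4f}, R squared={r2:.2f}, change per km={pct_change_per_km:.2f}%")
```

Working in logs means each extra kilometer is read as a percentage change in price rather than a fixed number of pounds, which suits a market where prices span a very wide range.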

We can see the effect of proximity to a park on flats as well in graph 10.


Graph 10

Living in a Bubble

Graph 11

Graph 12

The last two graphs, 11 and 12, show that as we pull together all the unique drivers our buyer has chosen, we start to narrow down to the specific areas and property types that meet the criteria. In fact, we now have the means to select individual properties that meet the precise requirements the buyer seeks. We arbitrarily chose central London and 5 particular factors out of a much longer list, but with Houseprice.AI you could do this for any city in the UK and for 40+ other drivers.

By using AI and Machine Learning, we can assess any property purely on the basis of objective values, thanks to the breadth, detail and quality of our data. In fact, this attention to detail leads to a more natural and holistic approach to finding property. Property searches can be made far more intuitive, personalised and quantifiable, allowing buyers to identify and locate properties far more efficiently and objectively, and to verify value for money on both an absolute and a comparative basis.

Factoids and Snippets about Property Valuation Methods

Herman Hollerith (1860-1929)

In the 21st century, Big Data processing and analytics are part of our lives, but how would you compute even one set of data if your calculator could only add up one string of values at a time? Herman Hollerith, an ingenious inventor, started the tabulator industry. He invented an electrical system that could process thousands of transactions in a single run. His system of blanks and holes was basically a binary system. The concept of automated data processing had been born.

In 1880 Hollerith worked for the US Census Office, where results were recorded by hand. He realised there had to be a quicker way and had a lightbulb moment: store information as holes punched in paper, as railway conductors did with tickets. His then girlfriend’s dad suggested a card device to automate the count, like those used for Jacquard looms.

Hollerith got cracking designing a machine that used the completion of an electrical circuit through holes on cards to advance a counter on a dial and tally overall numbers, individual characteristics and even cross-tabulations. He tested his machine in 1887, when the hand-counted 1880 census was finally completed, and won a contract from the Census Office for the 1890 census, which is reputed to have saved the American Government $5 million.
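The counting principle can be sketched in a few lines of modern code. This is a loose illustration only: the card fields below are hypothetical and are not Hollerith's actual 1890 card layout. Each card is modelled as the set of positions where a hole is punched, and a hole "closing the circuit" simply advances the matching dial.

```python
from collections import Counter

# Hypothetical cards: each set lists the positions punched on one card.
cards = [
    {"male", "age_20_29", "farmer"},
    {"female", "age_30_39", "teacher"},
    {"male", "age_30_39", "farmer"},
    {"female", "age_20_29", "farmer"},
]


def tabulate(cards, cross_tabs=()):
    """Advance one dial per punched hole, plus one per requested pair of holes."""
    dials = Counter()
    for card in cards:
        dials["total"] += 1            # overall count of cards read
        for hole in card:
            dials[hole] += 1           # tally of an individual characteristic
        for first, second in cross_tabs:
            if first in card and second in card:
                dials[(first, second)] += 1  # cross-tabulation: both holes present
    return dials


dials = tabulate(cards, cross_tabs=[("male", "farmer")])
print(dials["total"], dials["farmer"], dials[("male", "farmer")])
```

A cross-tabulation here is just a dial that advances only when two holes are present on the same card, which is why a single pass over the cards can answer several questions at once.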

The Hollerith Electric Tabulator - never needed a reboot

That Tabulator was hardwired to operate only on 1890 Census cards. The advantages of the technology were immediately apparent for all sorts of industries, and Hollerith founded the Tabulating Machine Company in 1896. In 1906 Hollerith made the first step toward the foundation of our modern information processing industry by adding a control panel, which allowed the machine to do different jobs without being rebuilt.

His firm merged with others to form the Computing Tabulating Recording Corporation, renamed International Business Machines (IBM) in 1924. It spawned a larger class of devices known as unit record equipment and the whole data processing industry. The term "Super Computing" was first used by the New York World newspaper in 1929 to refer to huge custom-built tabulators IBM made for Columbia University.

After the mechanical computing era waned in the 1960s, punched cards were used for input, but were replaced by magnetic tape, then disks for data storage and manipulation. Punched cards are now almost obsolete but were used in the American elections as recently as 2014.

Herman Hollerith’s designs influenced the computing field for nearly an entire century. He is remembered as one of the founding fathers of modern programming, the father of information processing, and the world’s first statistical engineer.

'IBM's Early Computers', Charles J. Bashe, Lyle R. Johnson, John H. Palmer and Emerson W. Pugh, MIT Press (1985)
'A Computer Perspective: Background to the Computer Age', Charles and Ray Eames, Harvard University Press, 2nd Edition (1990)
'The American Patriot’s Almanac: Daily Readings on America', William J. Bennett and John Cribb, Kindle Edition (2008)
United States Census Bureau

The Remains of the Day

Kazuo Ishiguro’s Booker-winning story of unspoken love, for anyone who’s ever held their true feelings back, seems to sum up the experience of many Remainers, who were forced into a stubborn silence. Feeling bullied by the Government’s increasingly priapic rush to a Hard Brexit and vilified as Saboteurs and Remoaners, the 48% had their revenge through the ballot box by overturning Theresa May’s majority and returning a well hung parliament. This, it would seem, is the most frequently voiced synopsis of last week’s election: the Remains had their day. But is this really the explanation?

Whilst the Labour Party did pick up from the Conservatives a huge number of seats that had voted to stay in the EU, this would not explain the phoenix-like rise of the Conservatives in Scotland, a part of the UK that had voted to remain in the EU by a very significant margin. Labour too has been crowing about its results in England, but is particularly taciturn about its performance in Scotland. It wasn’t just about Brexit.

What then might also have been a significant factor in the election?
Could Big Data help us, we asked ourselves? So we decided to put our resident boffins to work, to see if HOUSEPRICE.AI could come up with something that might explain these unexpected electoral results. Across England & Wales there were 28 swings to Labour from the Conservatives. The average swing for those parliamentary constituencies was 12.14%, which is historically very large.

Brighton, Kemptown              58.3   19.2
Bristol North West              50.6   16.3
Bury North                      53.6   4
Cardiff North                   50.1   11.9
Colne Valley                    47.7   12.7
Crewe and Nantwich              47.1   9.4
Croydon Central                 52.3   9.7
Derby North                     48.5   12
Enfield, Southgate              51.7   12.7
High Peak                       49.7   14.3
Peterborough                    48.1   12.5
Plymouth, Sutton and Devonport  53.4   16.7
Portsmouth South                41     21.5
Reading East                    49     16
Stockton South                  48.5   11.5
Vale of Clwyd                   50.2   11.9
Warrington South                48.4   9.3
Warwick and Leamington          46.7   11.8
Weaver Vale                     51.5   10.1
But this simple table does not tell us much about the 2017 election. What kind of correlations are there between the housing market and the electoral results? Residential home sales worth £20.77 billion were transacted in these 28 new MPs’ constituencies over the last year with, incredibly, just two, Battersea and Kensington, accounting for more than 30% of that total. There were 62,602 transactions in total, with an average value of £325,629 and an average price per square meter of £3,412.60. The chart below shows the percentage swing to Labour from the Conservatives against the price paid per square meter, along with the aggregate size of transactions in all residential property in each constituency over the last year.
So what can we conclude from this quick analysis?

Battersea also stands out in having many more property transactions than the other constituencies, transacting over 74% more than the average of 2236.

Firstly, it is no surprise that there is no correlation between house prices and political outlook in this sample. The value of a home does not determine its owner’s political outlook. This will be a boon to pollsters and Labour political canvassers, who can legitimately door knock in Kensington and Bury North with equal hope.
Secondly, estate agents in Battersea must be worried about the rise of internet agents: 1.5% fees on an average Battersea property of £936,187 equate to over £14,000 and, with such a high number of transactions, that looks a sitting duckhouse.
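For anyone wanting to check the arithmetic behind these figures, a minimal sketch follows. It uses only numbers quoted above; the variable names are ours, and the Battersea transaction count is only a lower bound implied by "over 74% more than the average", not a reported figure.

```python
# Figures quoted in the article.
total_transactions = 62_602
constituencies = 28
average_battersea_price = 936_187  # pounds
fee_rate = 0.015                   # a 1.5% agency fee

# Average number of transactions per constituency.
average_transactions = total_transactions / constituencies

# "Over 74% more than the average" implies at least this many Battersea sales.
battersea_minimum = average_transactions * 1.74

# Fee on an average Battersea property.
fee = average_battersea_price * fee_rate

print(f"average transactions per constituency: {average_transactions:.0f}")
print(f"implied Battersea transactions: more than {battersea_minimum:.0f}")
print(f"fee on an average Battersea sale: £{fee:,.0f}")
```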