• Data Science for Social Good programme helps five not-for-profits mine data
• Time to get services to rough sleepers could be cut thanks to data science
• Ofsted will be able to improve inspections of foster agencies
• Data scientists highlight where public transport lets down most vulnerable

Services for homeless people could be improved greatly through the use of data science, thanks to the UK’s inaugural 12-week Data Science for Social Good (DSSG) programme.

The programme was organised and sponsored by The Alan Turing Institute, Warwick Business School and the Maths and Computer Science departments at the University of Warwick. Data scientists from industry, along with students from all over the world, worked with charities and organisations such as Ofsted, Homeless Link and the West Midlands Combined Authority, bringing cutting-edge data science solutions to help the under-resourced not-for-profits take advantage of their mass of data.

Homeless Link created an app called StreetLink in 2012 so anybody can send the organisation alerts with a geo-tag when they spot a rough sleeper. The charity can then send services to help them.

Since it started, the app has had 280,000 alerts, which volunteers review to see if there is sufficient information to send local services to the rough sleeper. But with up to 1,000 alerts a day during winter, reviewing them all takes valuable time, and the rough sleeper may have moved on before help arrives: just 14 per cent are found by local services.

The team of DSSG data scientists who worked with StreetLink’s volunteers built a machine learning pipeline that aims to raise the proportion of rough sleepers found in time to 18 per cent.

Zoe Kimpel, a data scientist from Oklahoma, USA, told the DSSG presentation at Warwick Business School’s London base at The Shard: “Using data science we will be able to help StreetLink minimise the amount of time vulnerable people are left sleeping on the street and help implement positive policy change.

“By using StreetLink’s data we were able to build a model that prioritises the alerts with the highest chance of the rough sleeper being found, so the reviewers do not waste time on alerts that don’t have enough information.

“This means StreetLink can dispatch high-quality referrals more quickly, so services will get to them quicker, while, by quantifying where rough sleepers are, Homeless Link will have better quality data to argue their case with policymakers and make more informed decisions.”
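The prioritisation Kimpel describes can be pictured with a minimal sketch. It is not the DSSG team's pipeline: it simply assumes some model has already given each alert a probability that the rough sleeper will be found, and shows reviewers working from the top of that ranking. The `Alert` class, the `found_probability` field, the `prioritise` function and the example values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    alert_id: str
    # Hypothetical model output: estimated chance (0 to 1) that local
    # services will find the rough sleeper described in this alert.
    found_probability: float


def prioritise(alerts, review_capacity):
    """Return the alerts reviewers should look at first, best chance first."""
    ranked = sorted(alerts, key=lambda a: a.found_probability, reverse=True)
    return ranked[:review_capacity]


# Illustrative values only; they are not taken from StreetLink's data.
alerts = [
    Alert("a1", 0.08),
    Alert("a2", 0.41),
    Alert("a3", 0.19),
    Alert("a4", 0.33),
]

top = prioritise(alerts, review_capacity=2)
print([a.alert_id for a in top])  # prints ['a2', 'a4']
```

Ranking the queue rather than discarding low-scoring alerts means nothing is lost: volunteers can still reach the remaining alerts when time allows.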

UK Government data estimates there are 4,700 rough sleepers in England on any one night, a total that has more than doubled since 2010.

Using data science to help the homeless

Matt Harrison, director of business and social enterprise at Homeless Link, said: “Implementing the model into our software will take some time and we will need to trial it first, but it is something we definitely want to add to our systems – it will really help.

“But even before we do that there are three things we can take from this project and use straight away. At the moment our local services partners take two weeks to follow up a referral, but we now have the data to suggest that this should be cut.

“Secondly, the data scientists have highlighted some biases that our system has, such as locations we tend to ignore that are actually good indicators of quality referrals.

“And thirdly, it has picked out the important features of an alert that will allow our reviewers to deal with them more efficiently. For example, at the moment our reviewers tend to downgrade alerts that don’t specify the gender of a rough sleeper, but the data scientists have found that this being left blank is actually a very reliable indicator of a good quality alert.”

Set up by the University of Chicago, the DSSG programme had additional sponsorship from Accenture and Microsoft. Students, PhDs, graduates and data scientists already in industry applied from all over the world to spend the summer using their skills to help society.

Among the other projects, one team produced an early warning system that will allow over-stretched staff at Ofsted to spot high-risk foster agencies sooner and so significantly cut the three-year gap between visits.

A group working with Cochrane, which summarises medical research to keep doctors and GPs up to date, produced a model that sorts academic papers into review groups using keywords and abstracts, so its volunteers can concentrate their efforts on just 22 per cent of the mass of research produced. This will significantly reduce the time volunteers take to produce a summary of current research.
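As a rough illustration of sorting papers by keywords and abstracts, the sketch below assigns an abstract to whichever review group shares the most keywords with it. This is a deliberately simple stand-in, not the model the team built for Cochrane; the group names, keyword sets and the `assign_review_group` function are assumptions made for the example.

```python
import re

# Hypothetical review groups and keyword sets, for illustration only.
REVIEW_GROUP_KEYWORDS = {
    "heart": {"cardiac", "heart", "hypertension"},
    "airways": {"asthma", "lung", "inhaler"},
}


def assign_review_group(abstract):
    """Return the group whose keywords overlap most with the abstract.

    Returns None when no keyword matches, so the paper can be left
    for a volunteer to sort by hand.
    """
    words = set(re.findall(r"[a-z]+", abstract.lower()))
    scores = {
        group: len(words & keywords)
        for group, keywords in REVIEW_GROUP_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(assign_review_group("A trial of inhaler use in childhood asthma"))
```

A real classifier would learn which words matter from papers that have already been sorted, rather than relying on a hand-written keyword list.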

Data science tackling corruption in Paraguay

In Paraguay, the Government’s procurement office, the National Directorate of Public Procurement (DNCP), tackles corruption, but with 10,000 applications a year and just 30 staff, only 30 per cent of dodgy-looking papers are spotted. A DSSG team built a machine learning model that will raise that to 80 per cent and potentially save the Paraguayan Government $90 million a year, which it can spend on building more schools and hospitals.

With 1.45m over-65s in the UK struggling to get to hospital because of a lack of public transport, a team of data scientists also worked with Transport for West Midlands (TfWM), part of the West Midlands Combined Authority, to map who is underserved and where.

They wanted to map which low-income areas struggle to reach a job centre because of a lack of a bus service, or where the elderly have poor access to hospitals.

The DSSG team’s work now enables local government officers to identify, through heatmaps, the clusters where different groups in society struggle to get to the services they need as a result of high cost or lengthy journeys.
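The idea of flagging areas with poor access can be sketched in a few lines. The example assumes a table of public transport journey times from each area to its nearest hospital and marks an area as underserved when the journey exceeds a chosen threshold or no route exists. The area names, times, threshold and the `underserved_areas` function are hypothetical and are not drawn from TfWM's data or the team's code.

```python
def underserved_areas(journey_minutes, max_minutes):
    """Return, in alphabetical order, the areas with poor access.

    journey_minutes maps an area name to the public transport journey
    time in minutes to the nearest hospital, or None when there is no
    route at all.
    """
    return sorted(
        area
        for area, minutes in journey_minutes.items()
        if minutes is None or minutes > max_minutes
    )


# Illustrative figures only.
journey_minutes = {"Area A": 25, "Area B": 70, "Area C": None}

print(underserved_areas(journey_minutes, max_minutes=45))
# prints ['Area B', 'Area C']
```

Plotting the flagged areas on a map, coloured by journey time or cost, would give the kind of heatmap the article describes.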

Stuart Lester, Data Insights Manager at TfWM, said: “This data will be absolutely vital for us in putting together policies that have a real impact in alleviating social isolation and improving access to services. It is this kind of analysis that ensures the people who need affordable public transport are actually getting it.

“We will be able to build on this with our other data to refine it and learn more about the needs of our community. This is an open source project so we will be making the data and code available for developers and data scientists to add to.”
