Creative industries in south-east Scotland – mapped

Recently, I created online maps of creative companies in Scottish Borders, the Lothians, Edinburgh and Fife (collectively ‘south-east Scotland’). This was commissioned by the Creative Informatics programme, which aims ‘to explore how data can be used to drive ground-breaking new products, businesses and experiences’, among other good things.

Without further ado, here are the maps:

Below I rant about the problems encountered in this project – they are almost all about the data.

By the way, Creative Informatics is also providing most of the funding for the Platform to Platform project which I will lead next year, so I had better say nice things about it and its contributors. And in truth, I can say nice things: it has been fun to work with Inge Panneels and Ingi Helgason. They answered my many questions during the project clearly and quickly, and were very responsive to my suggestions. They have written nice things about me, and they were very quick to sign off my invoice – thank you!

However, I can only say nasty things about the data they inherited and supplied to me. (That is, it’s not Ingi’s, Inge’s or Creative Informatics’ fault that I spent longer making the data usable than writing the mapping code.)

How’s it done?

It’s all based on the wonderful Leaflet library, and uses the MarkerCluster plugin. Leaflet provides a great tutorial: following that, and given the data, anyone could have produced these maps. I had a head-start: I’d already used this software to map community councils, SFC-funded GCRF projects and RIVAL network members. So I just had to adapt my codebase from one of these previous maps. Then I collated the data into a large spreadsheet, and used concatenation functions in Excel to produce lines of JavaScript, one for each row of data. Then there is a JavaScript ‘programme’ for each map. The programmes are called by the web-pages: they ingest the data and invoke Leaflet to draw maps and markers on the web-pages.
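As a rough illustration of that concatenation step, here is a minimal Python sketch that turns each row of data into one line of JavaScript. The project itself used Excel formulas, not Python. The sketch assumes the page has already created a cluster group with Leaflet.markercluster's L.markerClusterGroup() and stored it in a variable called markers; that variable name, the function name and the sample rows are all invented for illustration.

```python
import json

def marker_line(name, lat, lng, group="markers"):
    """Return one line of JavaScript that adds a popup marker to a cluster group."""
    # json.dumps produces a safely quoted string literal for the popup text.
    popup = json.dumps(name)
    return f"{group}.addLayer(L.marker([{lat}, {lng}]).bindPopup({popup}));"

# Made-up rows (name, latitude, longitude), not project data.
rows = [
    ("Example Studio Ltd", 55.9533, -3.1883),
    ('Another "Creative" Co', 56.3398, -2.7967),
]

lines = [marker_line(*row) for row in rows]
print("\n".join(lines))
```

Quoting the name with json.dumps means a company name containing quotation marks does not break the generated line, which is easy to get wrong with plain string concatenation.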

Data source

I’m told that the data came from FAME from Bureau van Dijk, who in turn get their data from Companies House and supplement it with 118 market data. Companies House only collects data on companies, so the data mostly excludes sole traders, who make up a significant part of the creative workforce. There are other data sources, but most are tightly gatekept.

Issues I could fix

For a start, the data was supplied in 48 separate spreadsheets, some of which had columns in different orders. So copying it all into a single spreadsheet for processing was less fun than it should have been.
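For anyone facing the same chore, a small script can do the copying by reading each file by column name and ignoring column position. This is only a sketch under assumptions: it supposes the spreadsheets have been exported as CSV with consistent header names, and the column names and sample rows below are invented.

```python
import csv
import io

def merge_tables(files, columns):
    """Read each CSV by header name and return rows in one fixed column order."""
    merged = []
    for handle in files:
        for record in csv.DictReader(handle):
            # Values are looked up by column name, so the order in each file is irrelevant.
            merged.append([(record.get(col) or "").strip() for col in columns])
    return merged

# Two tiny stand-ins for the supplied spreadsheets, with columns in different orders.
file_a = io.StringIO("Company,Postcode,SCCI code\nAlpha Ltd,EH1 1AA,design\n")
file_b = io.StringIO("SCCI code,Company,Postcode\nmusic,Beta Ltd,KY16 9AJ\n")

combined = merge_tables([file_a, file_b], ["Company", "Postcode", "SCCI code"])
print(combined)
```

A column that is missing from one file simply comes through as an empty cell, which makes gaps easy to spot afterwards.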

Then there was the SIC (standard industrial classification of economic activities) code data. This was supplied in this format. (Not that SIC codes are currently used in the maps, but I don’t like throwing away data.)

Company             All SIC codes    Primary UK SIC code
<company name 1>    71111            71111
<company name 2>    62012            62012
                    62020
                    62030
                    62090

It took a lot of manual work to get to

Company             primary SIC code    2nd SIC code    3rd SIC code    4th SIC code
<company name 1>    71111
<company name 2>    62012               62020           62030           62090
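The reshaping was done by hand, but the rule behind it is simple enough to sketch in Python. The sketch assumes each row has been read as a (company, SIC code, primary SIC code) tuple, and that a blank company cell means 'another code for the company above'; the function name is invented.

```python
def widen_sic_codes(rows):
    """Collapse continuation rows into one row per company, primary code first."""
    companies = []
    for company, sic_code, primary in rows:
        if company:
            # A named row starts a new company; its primary code leads the list.
            companies.append([company, primary])
            if sic_code != primary:
                companies[-1].append(sic_code)
        elif companies and sic_code not in companies[-1][1:]:
            # A blank company cell: attach the code to the most recent company.
            companies[-1].append(sic_code)
    return companies

rows = [
    ("<company name 1>", "71111", "71111"),
    ("<company name 2>", "62012", "62012"),
    ("", "62020", ""),
    ("", "62030", ""),
    ("", "62090", ""),
]

wide = widen_sic_codes(rows)
print(wide)
```

Each inner list is a company name followed by its primary code and then any further codes, matching the columns of the second table.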

A few extra rows of data had been sourced using a Google form. This presented the data in another format that needed manual cutting and pasting to get it into the right format.

The company data also said whether each company was in ‘Edinburgh and the Lothians’, ‘Fife’ or ‘Scottish Borders’. Fortunately, Doogal’s batch-geocoder, which I used to convert postcodes to latitudes and longitudes, also states which local authority each postcode is in. This showed that some companies in the data are outwith south-east Scotland. (Some were in other parts of Scotland, but some were in south-east England!) So I deleted these ‘irrelevant’ companies from the data.
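The filtering step amounts to checking each geocoded row against the six council areas that make up south-east Scotland. A minimal Python sketch follows. It assumes each geocoded row has been loaded as a dictionary with a local_authority field; that field name, the exact spelling of the council names and the sample records are assumptions, and the geocoder's real output may label things differently.

```python
# Council areas covered by the maps; spellings are an assumption about the geocoder output.
SOUTH_EAST_AUTHORITIES = {
    "City of Edinburgh", "East Lothian", "Midlothian",
    "West Lothian", "Fife", "Scottish Borders",
}

def split_by_area(records):
    """Separate geocoded records into those inside and outside south-east Scotland."""
    inside, outside = [], []
    for record in records:
        in_area = record["local_authority"] in SOUTH_EAST_AUTHORITIES
        (inside if in_area else outside).append(record)
    return inside, outside

# Made-up records for illustration only.
records = [
    {"company": "Alpha Ltd", "local_authority": "Fife"},
    {"company": "Beta Ltd", "local_authority": "Glasgow City"},
    {"company": "Gamma Ltd", "local_authority": "Brighton and Hove"},
]

kept, dropped = split_by_area(records)
print(len(kept), "kept,", len(dropped), "dropped")
```

Keeping the dropped rows in their own list, not deleting them outright, makes it possible to check later that nothing relevant was removed by a spelling mismatch.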

Some company data omitted postcodes, but this was fairly quickly fixed by web-searching for the companies on the Companies House website and other online resources. But in at least one case, the Companies House website stated a street address that does not exist.

There are issues when converting SCCI codes (which were supplied for each company) to DCMS codes (which were not):

SCCI codes                          DCMS codes
advertising                         advertising and marketing
architecture                        architecture
visual art                          museums, galleries and libraries
craft and antiques                  crafts
fashion and textiles                design (product, graphic, fashion etc)
design                              design (product, graphic, fashion etc)
performing arts                     music, performing & visual arts
music                               music, performing & visual arts
photography                         film, TV, video, radio & photography
film and video                      film, TV, video, radio & photography
computer games                      no DCMS code
radio and tv                        film, TV, video, radio & photography
writing and publishing              publishing
libraries and archives              no DCMS code
software and electronic publishing  tech – IT, software, hardware and computer services
cultural education                  no DCMS code
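The conversion table translates directly into a lookup. Here is a minimal Python sketch of it, with the three SCCI codes that have no DCMS equivalent stored as None. The variable and function names are invented; the labels are copied from the table.

```python
SCCI_TO_DCMS = {
    "advertising": "advertising and marketing",
    "architecture": "architecture",
    "visual art": "museums, galleries and libraries",
    "craft and antiques": "crafts",
    "fashion and textiles": "design (product, graphic, fashion etc)",
    "design": "design (product, graphic, fashion etc)",
    "performing arts": "music, performing & visual arts",
    "music": "music, performing & visual arts",
    "photography": "film, TV, video, radio & photography",
    "film and video": "film, TV, video, radio & photography",
    "computer games": None,
    "radio and tv": "film, TV, video, radio & photography",
    "writing and publishing": "publishing",
    "libraries and archives": None,
    "software and electronic publishing": "tech – IT, software, hardware and computer services",
    "cultural education": None,
}

def dcms_code(scci_code):
    """Return the DCMS label for an SCCI code, or 'no DCMS code' where the table has none."""
    # An SCCI code that is not in the table raises KeyError, so typos surface early.
    label = SCCI_TO_DCMS[scci_code.strip().lower()]
    return label if label is not None else "no DCMS code"

# The SCCI codes that cannot be placed on a DCMS-coded map.
unmapped = sorted(code for code, label in SCCI_TO_DCMS.items() if label is None)
print(unmapped)
```

The lookup also makes the many-to-one problem visible: sixteen SCCI codes collapse into far fewer DCMS labels, so distinct SCCI rows can end up looking identical.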

Remaining issues

Duplication

The data still contains a large number of duplicates, leading to many cases where a company has two or more markers on the SCCI map, one for each SCCI code in that company’s data. There is no obvious ‘programmatic’ way to resolve this duplication – I cannot decide which SCCI code(s) should be removed. For example:

Company             SCCI code                           DCMS code
<company name 3>    computer games                      no DCMS code
<company name 3>    film and video                      film, TV, video, radio & photography
<company name 3>    software and electronic publishing  tech etc
<company name 3>    visual art                          museums, galleries and libraries
<company name 3>    writing and publishing              publishing

This problem also occurs on the DCMS map, because each SCCI code has an equivalent DCMS code. So there are two markers on each map for many companies. On the DCMS map, a company may have two identical markers, even if its markers are different on the SCCI map. For example:

Company             SCCI code                           DCMS code
<company name 4>    design                              design (product, graphic, fashion etc)
<company name 4>    fashion and textiles                design (product, graphic, fashion etc)
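Detecting the identical markers is at least straightforward, even if deciding which SCCI code to keep is not. The Python sketch below counts how often each (company, DCMS code) pair occurs, assuming the rows are available as (company, SCCI code, DCMS code) tuples. It only reports the repeats; it does not resolve them, and it is not something the maps currently do.

```python
from collections import Counter

def repeated_markers(rows):
    """Count (company, DCMS code) pairs that would be drawn more than once."""
    counts = Counter((company, dcms) for company, _scci, dcms in rows)
    return {pair: n for pair, n in counts.items() if n > 1}

rows = [
    ("<company name 4>", "design", "design (product, graphic, fashion etc)"),
    ("<company name 4>", "fashion and textiles", "design (product, graphic, fashion etc)"),
    ("<company name 3>", "computer games", "no DCMS code"),
    ("<company name 3>", "film and video", "film, TV, video, radio & photography"),
]

repeats = repeated_markers(rows)
print(repeats)
```

Here only <company name 4> is reported, because its two SCCI codes map to the same DCMS label. <company name 3> still gets several markers, but they differ from each other, which is the harder case described above.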

WordPress

There are plugins to make WordPress use Leaflet. If I’d got these to work, the maps could have been on Creative Informatics’ WordPress-based website. However, the WordPress plugins don’t support (as far as I can see) clustering or the selector-panels in the top-right of my maps. Also, the plugins will display 10, or 100, or even 1000 markers, but when I tried to display all 9000 markers, my browsers crashed. Hence the maps are hosted on the Edinburgh Napier University School of Computing ‘projects’ server as HTML, CSS and JavaScript files.

Missing data?

I am convinced the data isn’t complete. For example, I cannot believe that there are no creative companies in St Andrews. (I lived there for a long time – I think I’d have noticed if there weren’t any.) What about the south-west of Scottish Borders?

Conclusion

I am not convinced the maps are that helpful, mostly because they have so many duplicate markers and probably have a lot of missing data. Instead, I believe they give a rough flavour of what’s happening. They are also useful for showing the paucity of data available to government. This is important to me: without decent data, how can any government create and implement the right policies, or do the right amount of whatever it chooses to do? In brief:

Does the government know what it’s doing?
