We have been investigating IBM Watson’s new social media analytics features to help us understand the job market. We wanted to compare news sources and social media chatter against the CV and job posting data we hold, to see how they match the realities of the job market and the 15 future technical pathways set out in the Government skills report on the future of technical education, a white paper published in April 2016.
To get started we added the feature to our existing subscription – this gives us a social media analysis project in addition to our other data and dashboard types.
Once we had the project, we configured it by adding sources such as News and Twitter and choosing the languages to include (we just picked English).
We then moved on to creating our Topics, which were the following:
- Jobs – We want to understand how people and the news talk about jobs
- Employment – Picking up another theme around employment, to see whether general employment is discussed differently from jobs
- Skills – Anecdotally, skills seem to be discussed everywhere, but we want to put this to the test
- Apprenticeships – Is there a rise in apprenticeship chatter compared to jobs and skills?
- Education – The foundation topic: Education, Education, Education. We want to understand how this is linked to the other topics
- Training – Do people want / not want training?
- College / Universities – How do people view these institutions?
Topics from IBM Watson
For each of the Topics we need to be able to pivot by Themes, which provides some context and also lets us compare the Topics by Theme.
Themes from IBM Watson
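To make the idea of pivoting Topics by Themes concrete, here is a minimal sketch in Python. It is not how IBM Watson does it internally; the `mentions` list and the `pivot_topics_by_themes` helper are hypothetical names, and the data is invented purely for illustration.

```python
from collections import Counter

# Hypothetical (topic, theme) pairs, one per mention. Invented for illustration,
# not taken from our Watson project.
mentions = [
    ("Jobs", "Manufacturing"),
    ("Jobs", "Legal"),
    ("Skills", "Manufacturing"),
    ("Jobs", "Manufacturing"),
    ("Training", "Employment"),
]


def pivot_topics_by_themes(pairs):
    """Return a nested dict: topic -> theme -> number of mentions."""
    counts = Counter(pairs)
    table = {}
    for (topic, theme), n in counts.items():
        table.setdefault(topic, {})[theme] = n
    return table


table = pivot_topics_by_themes(mentions)
for topic, themes in sorted(table.items()):
    print(topic, themes)
```

Reading across a row shows which Themes give context to one Topic; reading down a Theme shows how the Topics compare on it.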
We then set the date periods to analyse. We picked two: one from a year ago, and the current period from 1st Jan 2017 to 31st Jan 2017.
Date Period Chooser

We selected all sources, but you can switch these on and off.
Sources options

We then executed the analysis, which told us that from over 5 million documents our trial version could analyse 25,000. The image below shows the total documents and mentions, and the split of mentions (“Share of Voice”) across the topics, which is fairly even between College, Jobs, Training, Education and Skills. Employment, Universities and Apprenticeships were very small by comparison, and the College vs. University difference can be attributed to the mostly US focus of the data sources used.
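As a rough illustration of what “Share of Voice” means, it can be thought of as each topic's mentions as a percentage of all mentions. The sketch below assumes that definition; the `share_of_voice` function is our own hypothetical helper and the counts are invented, not the figures from our run.

```python
def share_of_voice(mention_counts):
    """Return each topic's mentions as a percentage of all mentions."""
    total = sum(mention_counts.values())
    if total == 0:
        # No mentions at all: every topic has a zero share.
        return {topic: 0.0 for topic in mention_counts}
    return {topic: 100.0 * n / total for topic, n in mention_counts.items()}


# Invented counts for illustration only (they sum to 1,000).
example_counts = {
    "College": 200,
    "Jobs": 200,
    "Training": 200,
    "Education": 200,
    "Skills": 150,
    "Employment": 30,
    "Universities": 10,
    "Apprenticeships": 10,
}

shares = share_of_voice(example_counts)
for topic, pct in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{topic}: {pct:.1f}%")
```

Because the shares always sum to 100%, a topic can lose share of voice even when its raw mention count rises, so it is worth looking at both numbers.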
The most useful results are the Topics / Themes analysis.
For each of the Themes you can view the underlying detail, called Mentions, and validate the content categorisation, which is useful.
- Legal, Manufacturing and Employment themes – This data is from January 2017, and the rise of Legal and Manufacturing appears directly linked to President Trump and the discussions around legal / illegal immigrants and manufacturing jobs in the US.
- Interestingly, semantic analysis of the data also reveals a mostly positive outlook when comparing Legal and Manufacturing, although it should be noted that the semantic analysis is not 100% accurate, as you can see from the posts below.
Correct semantically analysed post
Incorrect semantics – although it depends if you are Democrat or Republican
From our initial look it seems promising. The data is skewed to US sources, but that could be configured in the future, and the Alchemy APIs may prove a good way to programmatically analyse the data.
We will be back with more analysis as we uncover the features of the data and products within IBM Watson.
Geek Talent Team…