We have been investigating IBM Watson’s new social media analytics features as a way to understand the job market. We wanted to see how news sources and social media chatter compare with the CV and job posting data we hold, with the realities of the job market, and with the 15 future technical pathways set out in the Government skills report on the future of technical education, the white paper published in April 2016.

To get started, we added the feature to our existing subscription. This gives us a social media analysis project in addition to our other data and dashboard types.

Once we had the project, we configured it by adding sources such as News and Twitter and choosing the languages to cover (we just picked English).

We then moved on to creating our Topics, which were the following:

  • Jobs – We want to understand how people and the news talk about jobs
  • Employment – Picking up another theme around employment, to see whether general employment is discussed differently from jobs
  • Skills – Anecdotally, skills seem to be discussed everywhere, but we want to put this to the test
  • Apprenticeships – Is there a rise in apprenticeship chatter compared to jobs and skills?
  • Education – The foundation topic: Education, Education, Education. We want to understand how this is linked to the other topics
  • Training – Do people want or not want training?
  • College / Universities – How do people regard these institutions?

Topics from IBM Watson

For each of the Topics we need to be able to pivot by Themes, which provides some context and also lets us compare the Topics across Themes.

Themes from IBM Watson
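To make the idea of pivoting Topics by Themes concrete, here is a minimal sketch in Python. It is not how IBM Watson does it internally. The `mentions` records and the `pivot_by_theme` helper are our own illustrative assumptions, and the rows are made up rather than taken from our results.

```python
from collections import Counter

# Hypothetical (topic, theme) pairs, one per mention.
# Illustrative rows only, not real output from the tool.
mentions = [
    ("Jobs", "Manufacturing"),
    ("Jobs", "Legal"),
    ("Jobs", "Manufacturing"),
    ("Skills", "Employment"),
    ("Training", "Employment"),
]

def pivot_by_theme(records):
    """Count mentions for each (topic, theme) combination."""
    table = {}
    for topic, theme in records:
        # One Counter of themes per topic
        table.setdefault(topic, Counter())[theme] += 1
    return table

pivot = pivot_by_theme(mentions)
for topic, themes in sorted(pivot.items()):
    print(topic, dict(themes))
```

Each topic ends up with its own tally of themes, so two topics can be compared theme by theme.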

We then set the date periods to analyse. We picked two: one from a year ago and the current period, from 1st Jan 2017 to 31st Jan 2017.

Date Period Chooser
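If you wanted to work out the two comparison periods in code rather than in the date chooser, a rough sketch could look like the following. It assumes the earlier period is the same calendar range shifted back one year, and the `same_period_previous_year` helper is our own name, not part of the product.

```python
from datetime import date

def same_period_previous_year(start, end):
    """Shift a date range back by one calendar year.

    Assumption: neither date is 29 February, since date.replace()
    would raise ValueError if the target year is not a leap year.
    """
    return start.replace(year=start.year - 1), end.replace(year=end.year - 1)

# Current period used in this post
current = (date(2017, 1, 1), date(2017, 1, 31))
previous = same_period_previous_year(*current)
print(previous)
```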

We selected all sources, but you can switch these on and off.

Sources options

We then executed the analysis, which told us that out of more than 5 million documents our trial version could analyse 25,000. The image below shows the total documents and mentions, along with the “Share of Voice” split of mentions across the topics, which is fairly even between College, Jobs, Training, Education and Skills. Employment, Universities and Apprenticeships were very small by comparison, and the College vs. University difference can probably be attributed to the mostly US focus of the data sources used.
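As we understand it, “Share of Voice” is each topic's mentions as a percentage of all mentions. A minimal sketch of that calculation is below. The `share_of_voice` function is our own, and the counts are placeholder numbers chosen for easy arithmetic, not the figures from our analysis.

```python
def share_of_voice(mention_counts):
    """Return each topic's share of total mentions, as a percentage."""
    total = sum(mention_counts.values())
    if total == 0:
        # Avoid dividing by zero when there are no mentions at all
        return {topic: 0.0 for topic in mention_counts}
    return {topic: 100.0 * count / total for topic, count in mention_counts.items()}

# Placeholder counts, NOT the real results from our project
example_counts = {"College": 200, "Jobs": 200, "Employment": 50, "Apprenticeships": 50}
shares = share_of_voice(example_counts)
for topic, pct in shares.items():
    print(f"{topic}: {pct:.1f}%")
```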

The most useful results are the Topics / Themes analysis.

For each of the Themes you can view the detail behind it, called Mentions, and validate how the content has been categorised, which is useful.

  • Legal, Manufacturing and Employment themes – This data is from January 2017, and the rise of Legal and Manufacturing appears to be directly linked to President Trump and the discussions around legal / illegal immigrants and manufacturing jobs in the US.
    • Interestingly, sentiment analysis of the data also reveals a mostly positive outlook when comparing Legal and Manufacturing, although it should be noted that the sentiment analysis is not perfect, as you can see from the posts below

      Post with correctly analysed sentiment

      Post with incorrectly analysed sentiment – although it depends if you are Democrat or Republican
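One way to put a number on the manual validation of Mentions described above is to review a sample by hand and measure how often the tool's sentiment label matches the reviewer's. The sketch below assumes you have both sets of labels as plain lists. The `agreement_rate` helper and the sample labels are hypothetical and were not exported from IBM Watson.

```python
def agreement_rate(tool_labels, reviewer_labels):
    """Fraction of mentions where the tool and the human reviewer agree."""
    if len(tool_labels) != len(reviewer_labels):
        raise ValueError("Both label lists must have the same length")
    if not tool_labels:
        return 0.0
    matches = sum(t == r for t, r in zip(tool_labels, reviewer_labels))
    return matches / len(tool_labels)

# Hypothetical labels for four reviewed mentions
tool = ["positive", "positive", "negative", "neutral"]
reviewer = ["positive", "negative", "negative", "neutral"]
rate = agreement_rate(tool, reviewer)
print(f"Agreement: {rate:.0%}")
```

As the second post shows, reviewers may disagree among themselves on political content, so a single reviewer's labels are only a rough benchmark.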

From our initial look the tool seems promising. It is skewed towards US sources, but that could be configured in the future, and the Alchemy APIs may prove a good way to analyse the data programmatically.

We will be back with more analysis as we uncover the features of the data and products within IBM Watson.

Geek Talent Team…
