ETC Conference Papers 2023

An integrated framework of social media opinion mining and category system to analyse public opinion towards transportation technologies and services

Seminar
Day 1 (6 Sep 2023), Session 3, Data Collection 1, 16:30 - 18:00

Status
Accepted, documents submitted

Submitted by / Abstract owner
Xuefen Cai

Authors
Katherine Cai Xuefen, Yuxiong Ji and Yu Shen

Short abstract
Social media opinion mining offers an alternative to conventional methods of collecting and analysing public opinion on new transportation technologies and public services, in support of transportation policy planning and management.

Abstract
Robust content analysis of new transportation technologies and public transport services through social media opinion mining offers an alternative to conventional methods of collecting and analysing public opinion, in support of transportation policy planning and management.

The objective of this research is to demonstrate: a) the capability of sentiment analysis and topic modelling as an efficient alternative for capturing the temporal trend of sentiments and opinions; b) the potential of this technique to reveal latent factors or relationships that may influence user perceptions of the research object; and c) the insights obtained from Twitter data that can inform the discussion of policy implications and the recommendation of measures.

A novel framework is proposed that integrates the opinion mining results from unsupervised machine learning models (Valence Aware Dictionary and sEntiment Reasoner, VADER, and Latent Dirichlet Allocation, LDA) into the category system. The framework extracts public opinions and sentiments that aid in understanding users’ key concerns regarding the research object, autonomous vehicles (AV).

A two-year (2018-2019) sample of 2,376,839 tweets was extracted from the Twitter data pool via generic hashtag queries. After removing irrelevant tweets, a total of 359,118 tweets from 2018 and 205,040 tweets from 2019 were used for data processing. The performance of the proposed models was validated against a held-out dataset of 167 manually annotated tweets, on which 50% accuracy was achieved for sentiment classification and 34% accuracy for topic inference at the document level. Among the five human annotators, accuracy ranged from 44% to 51% for sentiment classification and from 51% to 66% for topic classification.
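The validation step above can be illustrated with a minimal sketch of document-level accuracy scoring. Everything in it is an assumption made for illustration: the abstract does not state how the reference labels were derived from the five annotators, so a majority vote is used here, and the labels, the number of annotators and the tweets are hypothetical toy data, not the study's.

```python
from collections import Counter


def accuracy(predicted, reference):
    """Fraction of items whose predicted label equals the reference label."""
    if len(predicted) != len(reference):
        raise ValueError("label lists must have the same length")
    if not reference:
        raise ValueError("cannot score an empty dataset")
    matches = sum(p == r for p, r in zip(predicted, reference))
    return matches / len(reference)


def majority_label(votes):
    """Most common label among annotators (assumed aggregation rule).

    Ties resolve to the label that appears first in the vote list.
    """
    return Counter(votes).most_common(1)[0][0]


# Hypothetical annotations: one row per tweet, one column per annotator.
annotations = [
    ["pos", "pos", "neg"],
    ["neu", "neu", "neu"],
    ["neg", "pos", "neg"],
    ["pos", "neu", "neu"],
]
# Hypothetical model output for the same four tweets.
model_labels = ["pos", "neu", "pos", "neu"]

reference = [majority_label(votes) for votes in annotations]
model_accuracy = accuracy(model_labels, reference)

# Score each annotator against the same reference, giving a range
# comparable in spirit to the per-annotator figures quoted above.
annotator_accuracy = [
    accuracy([row[i] for row in annotations], reference)
    for i in range(len(annotations[0]))
]
print(model_accuracy, annotator_accuracy)
```

Scoring the model and each annotator against one shared reference puts the model's accuracy and the spread of human accuracy on the same scale, which is how the figures in the paragraph above are being compared.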

Factors that could have influenced users’ sentiments were analysed for their impact on posterior polarity and topic inference using one-way ANOVA hypothesis testing. Content analysis was achieved by integrating computational linguistics into the category system.
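As a rough illustration of the one-way ANOVA mentioned above, the sketch below computes the F statistic for sentiment scores grouped by a single factor. The grouping factor and the scores are hypothetical, and the function name `one_way_anova_f` is introduced here for illustration only. Obtaining a p-value additionally requires the F distribution, which a statistics library such as SciPy (`scipy.stats.f_oneway`) provides.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    if k < 2 or any(len(g) == 0 for g in groups) or n <= k:
        raise ValueError("need at least two non-empty groups and n > k")

    grand_mean = sum(sum(g) for g in groups) / n
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")

    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical polarity scores for three levels of one factor.
scores_by_level = [
    [0.1, 0.2, 0.3],
    [0.2, 0.3, 0.4],
    [0.5, 0.6, 0.7],
]
f_stat, df_between, df_within = one_way_anova_f(scores_by_level)
print(round(f_stat, 4), df_between, df_within)
```

A large F relative to the F distribution with (df_between, df_within) degrees of freedom suggests that mean polarity differs across the levels of the factor.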

The research demonstrated that the proposed framework, which integrates computational linguistics and the category system using social media data, is a promising way to monitor users’ preferences towards the research object in a cost-effective, time-saving and resource-efficient manner, which could aid interested stakeholders in planning and decision-making.

The research concludes by discussing the policy implications and recommendations, the limitations of the methodology and the areas where further research is required.

Programme committee
Data

Topic
Cities and transport – integrated planning, liveable cities, active transport