• Student Retention Pattern Prediction Employing Linguistic Features Extracted from Admission Application Essays
Presented at the IEEE International Conference on Machine Learning and Applications (ICMLA 2017), Paper ID #379 | Cancun, Mexico, December 18-21, 2017
Abstract: This paper investigates the use of linguistic features extracted from the application essays of students enrolled in a university academic program to predict their retention patterns. Three sets of linguistic features are generated through text analysis: (1) latent Dirichlet allocation (LDA) based topic modeling with varying numbers of topics, (2) Linguistic Inquiry and Word Count (LIWC), and (3) part-of-speech (POS) distribution. Various classification experiments are conducted to evaluate how well these three feature sets and their combinations predict student retention patterns. The results show that the POS distribution features yield the best prediction performance among the three, while neither the LDA features nor ensemble methods improve predictive performance, a finding that runs contrary to the manual analysis methods admission experts use in the conventional admission process.
Conference Link . . .
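As an illustration of the third feature set named in the abstract above, the sketch below turns an already-tagged essay into a POS distribution vector (the relative frequency of each tag). It is a minimal sketch under stated assumptions, not the paper's pipeline: the coarse tag inventory `TAGS`, the function name `pos_distribution`, and the hand-tagged sample essay are all hypothetical, and a real system would obtain tags from a POS tagger.

```python
from collections import Counter

# Hypothetical coarse tag inventory; the paper's actual tag set is not stated here.
TAGS = ("NOUN", "VERB", "ADJ", "ADV", "PRON", "OTHER")


def pos_distribution(tagged_tokens, tags=TAGS):
    """Return the relative frequency of each tag in `tags`, in order.

    `tagged_tokens` is a list of (word, tag) pairs. Tags outside `tags`
    are counted under "OTHER".
    """
    counts = Counter(tag if tag in tags else "OTHER" for _, tag in tagged_tokens)
    total = sum(counts.values())
    if total == 0:
        # An empty essay yields an all-zero vector rather than dividing by zero.
        return [0.0] * len(tags)
    return [counts[t] / total for t in tags]


# Hand-tagged toy input (an assumption; stands in for tagger output).
essay = [("I", "PRON"), ("love", "VERB"), ("hard", "ADJ"), ("problems", "NOUN")]
features = pos_distribution(essay)
```

A vector like `features` has a fixed length regardless of essay length, so it could be passed to any standard classifier alongside, or instead of, the LDA and LIWC features.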
• Multimodal Content Analysis for Effective Advertisements on YouTube
Presented at the IEEE International Conference on Data Mining (ICDM 2017), ID #DM714 | New Orleans, LA, November 18-21, 2017
Abstract: The recent advancement of web-scale digital advertising has brought a paradigm shift from the conventional focus on digital advertisement distribution towards integrating digital processes and methodologies into a seamless workflow of advertisement design, production, distribution, and effectiveness monitoring. In this work, we implemented a computational framework for the predictive analysis of content-based features extracted from advertisement video files and of various effectiveness metrics, to aid the design and production processes of commercial advertisements. Our proposed predictive analysis framework extracts multi-dimensional temporal patterns from the content of advertisement videos using multimedia signal processing and natural language processing tools. The pattern analysis part employs a cross-modality feature learning architecture in which data streams from different feature dimensions are used to train separate neural network models, and these models are then fused to learn a shared representation. Subsequently, a neural network model trained on this joint representation is used as a classifier for predicting advertisement effectiveness. Based on the predictive patterns identified between the content features and the effectiveness metrics of advertisements, we have elicited a useful set of auditory, visual, and textual patterns that is strongly correlated with the proposed effectiveness metrics and that can be readily implemented in the design and production processes of commercial advertisements. We validate our approach using subjective ratings from a dedicated user study, the text sentiment strength of online viewer comments, and a viewer opinion metric based on the likes/views ratio of each advertisement on the YouTube video-sharing website.
Read the full article . . .
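The fusion idea described in the abstract above (separate per-modality models whose outputs feed a shared representation, followed by a classifier) can be sketched as a forward pass. This is only a rough illustration under stated assumptions: the modality names, feature sizes, layer widths, and the helpers `dense` and `random_layer` are hypothetical, the weights are random and untrained, and the paper's actual architecture and training procedure are not reproduced here.

```python
import math
import random


def dense(inputs, weights, biases):
    """One fully connected layer with tanh activation."""
    return [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def random_layer(rng, n_in, n_out):
    """Untrained placeholder weights; a real model would learn these."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


rng = random.Random(0)

# Hypothetical per-modality input sizes and layer widths.
sizes = {"audio": 4, "visual": 6, "text": 3}
hidden = 2        # width of each modality-specific encoding
joint_width = 3   # width of the shared (joint) representation

encoders = {name: random_layer(rng, n_in, hidden) for name, n_in in sizes.items()}
shared = random_layer(rng, hidden * len(sizes), joint_width)
classifier = random_layer(rng, joint_width, 1)


def predict(features):
    """Encode each modality separately, fuse, then classify."""
    encoded = []
    for name in sizes:
        w, b = encoders[name]
        encoded.extend(dense(features[name], w, b))  # concatenate encodings
    joint = dense(encoded, *shared)                  # shared representation
    w, b = classifier
    logit = sum(wi * ji for wi, ji in zip(w[0], joint)) + b[0]
    return joint, 1.0 / (1.0 + math.exp(-logit))     # sigmoid score in (0, 1)


# Toy input: one constant feature vector per modality.
sample = {name: [0.1] * n for name, n in sizes.items()}
joint, score = predict(sample)
```

Here `score` plays the role of a predicted effectiveness probability; with untrained weights its value carries no meaning, and the sketch only shows how the data flows from separate modality encoders into one joint representation.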