Technical Description


Images have always been a convenient way to convey information in all types of online social interaction. Automatically analyzing this visual content to discover patterns, similarities and trends is therefore both important and challenging. In addition, images can provoke a wide range of emotional reactions. Sentiment-based image retrieval refers to the task of associating low-level visual features with emotions or sentiment in order to enhance categorization, indexing and retrieval.

One of the first applications of image sentiment analysis addressed the extraction of semantics from art paintings and video commercials (Colombo, Del Bimbo, & Pala, 1999). That work proposed a technique for extracting a high-level representation from which emotional semantics, e.g., relaxation or joy, can be derived. To this end, mostly colour-based features were adopted, following the observation that artists (un)consciously use particular colour combinations to produce optical sensations. Several other visual features have also been used in image sentiment analysis. For example, in (Siersdorfer, Minack, Deng, & Hare, 2010) colour histograms and SIFT-based representations are adopted for binary sentiment classification of web images, while in (Yuan, Mcdonough, You, & Luo, 2010) histograms of oriented gradients (HoG) are used and special focus is given to images that contain faces during the sentiment classification process. Apart from the typical binary sentiment categorization, some methods (e.g., (Schmidt & Stock, 2009)) treat the problem as a multi-class classification task in which more emotions are distinguished, e.g., anger, disgust, etc.
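As a rough illustration of how a colour histogram can serve as a low-level feature for binary sentiment classification, the following minimal sketch quantises RGB pixels into a joint histogram and assigns a label by nearest class centroid. It is not the method of any of the works cited above: the nearest-centroid rule is a deliberately simple stand-in for a trained classifier, and the function names, the 4-bins-per-channel setting and the toy "warm" and "cold" images are all assumptions made for the example.

```python
from collections import Counter


def colour_histogram(pixels, bins_per_channel=4):
    """Return a normalised joint RGB histogram as a flat list.

    pixels: iterable of (r, g, b) tuples with 8-bit channel values (0-255).
    """
    b = bins_per_channel
    counts = Counter()
    total = 0
    for r, g, bl in pixels:
        # Map each 0-255 channel value to a bin index in 0..b-1.
        index = ((r * b // 256) * b + (g * b // 256)) * b + (bl * b // 256)
        counts[index] += 1
        total += 1
    if total == 0:
        raise ValueError("image has no pixels")
    return [counts[i] / total for i in range(b ** 3)]


def centroid(histograms):
    """Element-wise mean of a non-empty list of equally sized histograms."""
    n = len(histograms)
    return [sum(column) / n for column in zip(*histograms)]


def classify(histogram, positive_centroid, negative_centroid):
    """Label a histogram by its nearer class centroid (squared Euclidean)."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    d_pos = squared_distance(histogram, positive_centroid)
    d_neg = squared_distance(histogram, negative_centroid)
    return "positive" if d_pos <= d_neg else "negative"


# Hypothetical toy data: one warm-coloured and one cold-coloured "image".
warm_image = [(250, 200, 40)] * 10
cold_image = [(10, 20, 60)] * 10
positive_centroid = centroid([colour_histogram(warm_image)])
negative_centroid = centroid([colour_histogram(cold_image)])

query = colour_histogram([(240, 210, 30)] * 5)
label = classify(query, positive_centroid, negative_centroid)
```

In practice the centroids would be estimated from many labelled images, and the histogram would typically be combined with other descriptors before classification.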

In the context of the SentIMAGi project, we aim to:

  1. study and implement methods for the automatic recognition of sentiment stemming from images;
  2. adopt state-of-the-art low-level visual features for the given task and study new features that either stem from raw image information or are based on semiotic observations that correlate with high-level semantics related to emotions;
  3. use supervised and semi-supervised techniques that exploit the accompanying text, which is semantically richer than the image content.
