Abstract: The growing use of the Internet and web-enabled devices has made it easier to access any kind of information at any time and from anywhere. People are curious about feedback on the products and services they are interested in, and they welcome feedback from others, so opinions matter in our decision making. Reviews written by different people can help with many difficult decisions, and there are now many sources of such feedback. Online social media shows how differently information is produced, represented and consumed. Because information in social media is transmitted through social interactions, we have to understand how that information influences personal behavior. The Web is a huge database where users can access information, interact with different groups of people and business organizations, and create content. We can use the ratings and feedback of thousands of users to extract their attitudes and sentiment towards any product or service, and use that information for future market and business domain analysis. Using online social media to gather people’s opinions, appraisals, attitudes and approval regarding products and services makes decision making easier, but it adds complexity to the information processing needed to extract the correct opinions.
Index Terms: Social media, Social media analysis, Sentiment analysis, Opinion, Web.
I. INTRODUCTION
Social media allows people to create, share and exchange information, ideas, pictures and videos in a network [1]. Social media is now a part of everyday life, and many people use it at least once a day. Sentiment analysis extracts subjective information from the text or sentences being studied. More specifically, sentiment analysis determines the attitude of an individual towards some topic. The attitude can be positive, negative or neutral. More advanced forms of sentiment analysis determine emotional states such as happy, sad and angry. Early work in sentiment analysis focused mainly on detecting the polarity of product reviews and movie reviews. It is a very important tool in the field of computational mining for understanding people’s ideas and attitudes towards issues, topics and events, and their response to the products or services being observed. Business organizations are always looking to improve themselves and their services, and want feedback from customers so that they can improve in the future. Hence social media analytics tools are used by many big companies, such as Bank of Canada and Whirlpool [1], to interact with customers and get feedback from them. The sentiment analysis procedure helps to detect the source, target and type of an attitude and to distinguish the attitudes expressed in texts. Potential customers always want to have some idea about a service or product before buying it. Individuals and organizations increasingly use opinions from media sources such as personal blogs, review sites and social networks in their decision making. However, the current web consists of millions of social media sources, each with diverse opinions, and it is always challenging to summarize the opinions and information from those sources accurately. Human beings differ in nature and hold different opinions about the same products or services.
Opinions on the same content differ from individual to individual and from time to time. It is human nature to take our own opinions seriously because they are close to our thoughts. It is therefore always a big challenge to obtain a consistent opinion about a product or a service. Some mechanism is needed that automatically summarizes all those human opinions to give unbiased and correct information.
In this chapter, we deal with the basics of social media analytics as applied to the opinion mining problem, and we review the research and challenges in the field of sentiment analysis, including the key technical issues that need to be addressed. We then describe various sentiment analysis techniques that have been studied in the research literature, together with their representation techniques. After that, we discuss the methods used to evaluate sentiment and the current and future issues of sentiment analysis.
II. OPINION DEFINITION
Opinions can be given on anything, and a given opinion can be summarized as expressing a positive, negative or neutral attitude in the text or sentence under consideration. Extracting opinions from millions of pieces of web content and summarizing them correctly is one of the big challenges of sentiment analysis. There is still a need for a clear and better mechanism to mine these huge amounts of data and get correct results. Opinion-mining systems analyze a text or sentence to determine who the author is, what the opinion is and which part of the text mainly expresses it [2].
Sentiment analysis determines the subjectivity, polarity and polarity strength of the content of a text or sentence. The polarity defines positivity or negativity, whereas the polarity strength defines how strongly the opinion is held, such as weakly positive, mildly positive or strongly positive. Sentiment analysis approaches can be categorized into keyword spotting, statistical analysis, lexical affinity and concept-level methods [3]. Earlier sentiment analysis focused mainly on product reviews and movie reviews, but it is now applied to a plethora of sources including forums, social networks and blogs. The main objective of sentiment analysis is to detect subjective information and determine the mindset of the author towards the subject being studied.
Figure 1: Conceptual Model of Sentiment Analysis
Sentiment analysis determines attitudes on the basis of the holder of the attitude, the target of the attitude, and the type of attitude, which is drawn either from a set of types such as love and hate or simply from a polarity such as positive, negative or neutral. The analysis can be applied to a piece of text or, in some cases, to whole documents.
III. WHY SENTIMENT ANALYSIS
Sentiment analysis is increasingly important because of the emergence of social media, and it can be used in all sorts of tasks. It is used to gauge public sentiment, for example to measure customer confidence or to learn what people think about new products, and in politics to learn what people think about candidates and their issues. Some companies use sentiment analysis for market analysis and prediction, and the movie industry uses it on movie reviews to learn whether audiences have a positive, negative or neutral view.
A. Object and Features
An object denotes the target entity. An object can have a set of components and a set of attributes, and these components and attributes are known as the features of that object. For example, in ‘I like Samsung galaxy III mini. It has a great touch screen’, the first sentence expresses a positive opinion on the Samsung phone and the second sentence expresses a positive opinion on its touch screen, which is a feature of the galaxy III mini [4].
B. Opinion Holder
The opinion holder is the one who expresses the opinion about the entity being observed. The opinion holder can be the author of a post or an organization that holds a particular opinion about the product or service [4].
C. Opinion and Orientation
An opinion on a feature can be a positive, negative or neutral view of that feature. Positive, negative and neutral views are called opinion orientations. The object model, the model of an opinionated text and the mining objectives are collectively called the feature-based sentiment analysis model. Opinions can be of two types, namely regular opinions and comparative opinions [4]. A regular opinion expresses an opinion on a target entity.
• Direct Opinion:
Liu mathematically represented an opinion as a quintuple (o, f, so, h, t), where ‘o’ is an object; ‘f’ is a feature of the object ‘o’; ‘so’ is the orientation or polarity of the opinion on feature ‘f’ of object ‘o’; ‘h’ is the opinion holder; and ‘t’ is the time when the opinion is expressed [5]. The opinion orientation can be positive, negative or neutral, e.g. ‘The keypad is really cool’.
A comparative opinion compares more than one entity to determine the sentiment.
• Indirect Opinion:
It is expressed using comparative opinions between two or more objects. It is usually conveyed using the comparative or superlative form of an adjective or an adverb, e.g. ‘Coffee is better than tea’. Another example is ‘After movies I am feeling energetic’.
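The quintuple (o, f, so, h, t) maps naturally onto a small record type. The following Python sketch is only an illustration: the class name, field names, sample opinions and the `summarize` helper are assumptions made for this example and do not come from [5].

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Opinion:
    """One opinion in the quintuple form (o, f, so, h, t)."""
    obj: str          # o: the object being evaluated
    feature: str      # f: the feature of the object
    orientation: str  # so: "positive", "negative" or "neutral"
    holder: str       # h: who expressed the opinion
    time: date        # t: when the opinion was expressed


def summarize(opinions):
    """Count opinion orientations for each (object, feature) pair."""
    summary = {}
    for op in opinions:
        counts = summary.setdefault(
            (op.obj, op.feature),
            {"positive": 0, "negative": 0, "neutral": 0},
        )
        counts[op.orientation] += 1
    return summary


# Hypothetical sample data for illustration only.
opinions = [
    Opinion("phone", "keypad", "positive", "user_a", date(2014, 11, 15)),
    Opinion("phone", "keypad", "negative", "user_b", date(2014, 11, 16)),
    Opinion("phone", "touch screen", "positive", "user_a", date(2014, 11, 15)),
]
keypad_counts = summarize(opinions)[("phone", "keypad")]
```

Grouping by object and feature is one simple way to turn many individual quintuples into a feature-level summary.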
IV. TASKS PERFORMED IN SENTIMENT ANALYSIS
A. Subjectivity Classification
In this task, the given text or sentences are divided into two classes, namely objective and subjective. The subjectivity of a word or phrase depends on its context. The results of Su [6] depend largely on the definition of subjectivity used when annotating texts. However, Pang [7] showed that removing objective sentences from a document before classifying its polarity helped improve performance.
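The idea of discarding objective sentences before polarity classification can be sketched as a simple filter. This is not the method used in the cited work; it is a keyword-spotting stand-in, and the clue list, the sentence splitter and all function names are assumptions made for this example.

```python
import re

# Hypothetical clue list; a real system would learn these from annotated text.
SUBJECTIVE_CLUES = {"great", "love", "like", "terrible", "awful", "cool", "hate", "better"}


def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?'."""
    return [s.strip() for s in re.split(r"[.!?]+", text) if s.strip()]


def is_subjective(sentence):
    """Treat a sentence as subjective if it contains at least one clue word."""
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    return bool(words & SUBJECTIVE_CLUES)


def keep_subjective(text):
    """Return only the sentences that look subjective."""
    return [s for s in split_sentences(text) if is_subjective(s)]


review = "The phone has a 4 inch screen. I love the touch screen! It ships with a charger."
subjective = keep_subjective(review)
```

Only the surviving sentences would then be passed to a polarity classifier. Because subjectivity depends on context, a fixed clue list like this one will misclassify many sentences.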
B. Sentiment Classification
It classifies a text as expressing a positive or a negative opinion. Sentiment classification can be binary classification (positive or negative) [8], multi-class classification (extremely negative, negative, neutral, positive or extremely positive), regression or ranking [9]. Depending on the application of sentiment analysis, the sub-tasks of opinion holder extraction and object feature extraction are optional.
C. Opinion Holder Classification
The sentiment analysis approach involves tasks such as opinion holder extraction, i.e. the discovery of opinion holders or sources [10]. Opinion holder detection recognizes the direct or indirect sources of an opinion. It is very important in cases where the same opinion holder can express multiple opinions. In those cases opinion holders are identified by name and login credentials.
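The difference between the binary and the multi-class formulation can be illustrated by mapping a numeric sentiment score to labels. The sketch below assumes a score in the range [-1, 1] produced by some upstream model; the cut-off values are arbitrary choices for this example, not values taken from [8] or [9].

```python
def to_binary(score):
    """Binary polarity; a score of exactly zero is treated as positive (arbitrary choice)."""
    return "positive" if score >= 0 else "negative"


def to_five_class(score):
    """Map a score in [-1, 1] to five ordered classes using illustrative cut-offs."""
    if not -1.0 <= score <= 1.0:
        raise ValueError("score must lie in [-1, 1]")
    if score <= -0.6:
        return "extremely negative"
    if score < -0.2:
        return "negative"
    if score <= 0.2:
        return "neutral"
    if score < 0.6:
        return "positive"
    return "extremely positive"


labels = [to_five_class(s) for s in (-0.9, -0.3, 0.0, 0.4, 0.8)]
```

In the regression formulation the score itself would be the prediction, and the mapping step would be dropped.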
D. Object/Features Extraction
The main task when analyzing sentiment is to determine the target entity. Opinions in blogs, social media and review sites are directed towards a specific topic, so finding the target entity is necessary in such scenarios in order to extract the features. A reviewer can have different opinions about the features and components of the target entity, so feature-based analysis is an important issue in sentiment analysis [11].
V. LEVELS OF SENTIMENT ANALYSIS
A. Document Level
The document whose sentiment has to be determined is treated as the basic unit of analysis. This approach assumes that a single opinion holder holds the opinion. Positive and negative reviews can be classified using various machine learning approaches. Pang et al. [8] experimented with three classifiers (Naive Bayes, maximum entropy, and support vector machines) and with features such as unigrams, bigrams, term frequency, term presence and position, and parts of speech. They concluded that the SVM classifier works best and that unigram presence information was most effective. Pang and Lee formulated document-level sentiment analysis as a regression problem [9], using supervised learning to predict rating scores.
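To make the notion of unigram presence features concrete, here is a minimal sketch of a document-level classifier. It uses Naive Bayes, one of the three classifiers named above, in a Bernoulli form with add-one smoothing; that specific variant, the whitespace tokenizer and the four toy training documents are assumptions for this example and are not taken from the cited experiments.

```python
import math
from collections import Counter


def presence(text):
    """Unigram presence: each distinct lower-cased token counts once per document."""
    return set(text.lower().split())


def train(docs):
    """docs is a list of (text, label) pairs. Returns the fitted counts."""
    class_docs = Counter()   # number of documents per class
    word_docs = {}           # per class: number of documents containing each word
    vocab = set()
    for text, label in docs:
        class_docs[label] += 1
        words = presence(text)
        vocab |= words
        word_docs.setdefault(label, Counter()).update(words)
    return class_docs, word_docs, vocab


def predict(text, model):
    """Return the class with the highest Bernoulli Naive Bayes log-probability."""
    class_docs, word_docs, vocab = model
    total = sum(class_docs.values())
    words = presence(text) & vocab  # words never seen in training are ignored
    best_label, best_score = None, -math.inf
    for label, n_docs in class_docs.items():
        score = math.log(n_docs / total)
        for w in vocab:
            # Probability that a document of this class contains w (add-one smoothing).
            p = (word_docs[label][w] + 1) / (n_docs + 2)
            score += math.log(p if w in words else 1 - p)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data, invented for illustration.
docs = [
    ("great phone love it", "pos"),
    ("great screen", "pos"),
    ("terrible battery", "neg"),
    ("awful phone terrible screen", "neg"),
]
model = train(docs)
```

With presence features a word contributes the same evidence whether it appears once or ten times in a review, which is the distinction from term frequency mentioned above.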
B. Sentence Level
This method is based on identifying the subjective sentences in a mixture of sentences. The main problem with document-level analysis is that it can extract information from objective sentences, so sentence-level sentiment analysis is needed for subjective analysis. A supervised learning method is used to identify the subjective sentences.
C. Word Level
It mostly uses adjectives as features, although researchers also use some verbs, nouns and adverbs [7, 12]. The two methods of automatically annotating sentiment at the word level are:
• Dictionary-based approaches and
• Corpus-based approaches.
1) Dictionary based
This approach is based on a list of words with prior polarity. A list of words is created and then extended with synonyms and antonyms from an online dictionary. The sentiment of an unseen word is determined by how it relates to the previously defined words in the list. The positive and negative sentiment of words and their orientation are calculated with the help of lexical relations. ‘The semantic orientation of a word was calculated by its relative distance from the two seed terms, good and bad. The values ranged from [-1, 1] with the absolute value indicating the strength of the orientation’ [16]. The dictionary method is not domain specific, so it faces a polarity classification problem: it is difficult to decide whether a certain word is positive or negative in a certain situation [13].
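The relative-distance idea can be sketched with a shortest-path computation over a synonym graph. The formula used below, (d(w, bad) - d(w, good)) / d(good, bad), is one common way to obtain a value in [-1, 1] and is stated here as an assumption about [16]; the tiny synonym graph is invented for the example and does not reflect any real dictionary.

```python
from collections import deque

# Hypothetical synonym graph; a real system would read these links from a lexical database.
SYNONYMS = {
    "good": ["nice"],
    "nice": ["good", "fine"],
    "fine": ["nice", "average"],
    "average": ["fine", "poor"],
    "poor": ["average", "bad"],
    "bad": ["poor"],
}


def distance(graph, start, goal):
    """Length of the shortest synonym path, found by breadth-first search."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        word, dist = queue.popleft()
        if word == goal:
            return dist
        for nxt in graph.get(word, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    raise ValueError(f"no path from {start!r} to {goal!r}")


def orientation(graph, word, pos_seed="good", neg_seed="bad"):
    """Relative distance of a word from the two seeds; positive values lean towards pos_seed."""
    to_neg = distance(graph, word, neg_seed)
    to_pos = distance(graph, word, pos_seed)
    return (to_neg - to_pos) / distance(graph, pos_seed, neg_seed)


nice_score = orientation(SYNONYMS, "nice")
poor_score = orientation(SYNONYMS, "poor")
```

In this toy graph "nice" sits one step from "good" and four steps from "bad", so it receives a positive score, while "poor" receives the mirror-image negative score.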
2) Corpus based
Corpus-based methods start from words of known polarity and rely on syntactic and statistical techniques. The association between an unknown word and a set of manually selected seeds (such as ‘excellent’ and ‘poor’) was used to classify it as positive or negative. The degree of association between the unknown word and the seed words was determined by counting the number of results returned by web searches in the AltaVista search engine that joined the words with the NEAR operator, and then calculating the point-wise mutual information between them [13].
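The hit-count calculation can be written out as a short function. Taking the difference between the point-wise mutual information of the word with the positive seed and with the negative seed, the word's own hit count cancels, leaving a ratio of four counts. The sketch below assumes that formulation; the small smoothing constant and all hit counts are made-up values for illustration, and no search engine is actually queried.

```python
import math


def so_pmi(near_pos, near_neg, hits_pos, hits_neg, smoothing=0.01):
    """Semantic orientation as PMI(word, positive seed) minus PMI(word, negative seed).

    near_pos: hits for the word NEAR the positive seed
    near_neg: hits for the word NEAR the negative seed
    hits_pos: hits for the positive seed alone
    hits_neg: hits for the negative seed alone
    smoothing: small constant that avoids division by zero (an assumption here)
    """
    numerator = (near_pos + smoothing) * hits_neg
    denominator = (near_neg + smoothing) * hits_pos
    return math.log2(numerator / denominator)


def classify(score):
    """Positive orientation if the score is above zero, otherwise negative."""
    return "positive" if score > 0 else "negative"


# Hypothetical hit counts: the word co-occurs four times as often with the positive seed.
score = so_pmi(near_pos=400, near_neg=100, hits_pos=1000, hits_neg=1000, smoothing=0.0)
```

A word that appears equally often near both seeds scores zero, so the sign of the score gives the polarity and its magnitude gives the strength of the association.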
D. Feature based
The document-level and sentence-level approaches only label a review as positive or negative overall, but in some cases the reviewer likes some features and dislikes others. Hence feature-based analysis is required, which extracts the features of the product. Yi et al. [12] restricted the candidate words further by extracting only base noun phrases, noun phrases preceded by a definite article, and definite base noun phrases at the beginning of a sentence followed by a verb phrase. For each sentiment phrase detected, its target and final polarity are determined using a sentiment pattern database. Hu and Liu [7] use a heuristic method to extract the most frequent nouns or noun phrases using association mining.
They assigned the nearest opinion word to a feature to determine the sentiment orientation. Popescu and Etzioni [6] greatly improved the task of extracting features by differentiating between a part of an object and a property of the object, using WordNet’s hierarchy and morphological clues. Their algorithm tries to eliminate noun phrases that are probably not product features.
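Two of the steps described above, keeping frequent candidate nouns and assigning each feature the nearest opinion word, can be sketched as follows. This is a simplified illustration: plain frequency counting stands in for association mining, the candidate nouns are assumed to have been extracted already by a part-of-speech tagger, and the opinion lexicon and function names are hypothetical.

```python
from collections import Counter

# Hypothetical opinion lexicon for illustration.
OPINION_WORDS = {
    "great": "positive",
    "cool": "positive",
    "poor": "negative",
    "terrible": "negative",
}


def frequent_features(candidate_nouns_per_review, min_support):
    """Keep nouns that occur in at least min_support reviews."""
    counts = Counter(
        noun for nouns in candidate_nouns_per_review for noun in set(nouns)
    )
    return {noun for noun, count in counts.items() if count >= min_support}


def feature_orientations(tokens, features):
    """Give each feature the orientation of the nearest opinion word in the sentence."""
    opinion_positions = [
        (i, OPINION_WORDS[tok]) for i, tok in enumerate(tokens) if tok in OPINION_WORDS
    ]
    result = {}
    for i, tok in enumerate(tokens):
        if tok in features and opinion_positions:
            _, orient = min(opinion_positions, key=lambda pos: abs(pos[0] - i))
            result[tok] = orient
    return result


features = frequent_features([["screen", "battery"], ["screen", "battery"], ["strap"]], 2)
tokens = "the screen is great but the battery is terrible".split()
result = feature_orientations(tokens, features)
```

The example sentence shows why feature-level analysis matters: a single review sentence yields a positive orientation for one feature and a negative orientation for another.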
VI. RECENT TRENDS
Recent trends show that automated methods are the most widely used techniques for analyzing opinions. They are based on fields such as Natural Language Processing, Text Mining, Machine Learning and Artificial Intelligence, Automated Content Analysis, and Voting Advice Applications. The increased use of social media leads to an increase in the quantity of unstructured data, and the adoption of social media makes this data available for machine learning algorithms to be trained on. Because of the combination of a larger volume of available data and more complex concepts to analyze, recent years have seen a decrease in interest in semantic-based applications and an increase in the use of statistics and visualization. Just like any other scientific discipline, automated content analysis is becoming a data-intensive science.
A. Challenges
• Detection of reliable content: Identifying fake reviews and spam, and finding methods to eliminate them, for effective sentiment analysis.
• Named Entity Recognition: Determining what the opinion is about.
• Limitation of filtering: A filter bubble yields irrelevant opinion sets and results in a false summarization of sentiment.
• Domain independence: Because sentiment words are domain dependent, a method may perform well in one domain and poorly in another.
• Opinion orientation: Whether the words in a comparison give positive or negative feedback depends entirely on their context, and it is not easy to determine the context in which the comparison is made.
• Language: Statements that express both positive and negative, or mixed, views about a product or service are difficult to interpret.
VII. CURRENT EXAMPLES OF SENTIMENT ANALYSIS
Sentiment analysis is used in areas ranging from online retail to blogging, and in various applications in politics. Customer-facing businesses now use sentiment analysis not just for product reviews but also for customer service and brand reputation management. Similarly, political parties find sentiment analysis useful for getting feedback from citizens, identifying hot issues, and spreading campaign messages and policy announcements.
VIII. METHODS OF DETERMINING SENTIMENT
The growing use of online media makes sentiment evaluation difficult.
A. Scaling System
Figure 2: Five-star ratings
This method uses a rating system to capture personal appreciation. Most movie review sites and online shopping sites use a rating scale from one to four or five.
B. Subjectivity/Objectivity Identification
This method is based on some type of documentation. A text or sentence can express a subjective or an objective opinion, and differentiating the two is a difficult task. The same text or sentence can carry different meanings in different situations, contexts and scenarios. Hence this is one of the more difficult methods to implement [15].
C. Bales Interaction Process Analysis
This method identifies and records the nature of each interaction; it is not used to measure the content of the interaction. Bales IPA scores interactions in ‘units’ of interaction or communication. The scores are applied to a predetermined set of categories, and the analysis is based on the scores in each category. A unit is typically one simple sentence expressing one idea. Complex sentences expressing more than one idea are scored according to the number of independent clauses they contain. Sentence fragments can be scored as communication, one ‘point’ each, but they are hard to interpret and must be taken in context in order to be categorized. In oral or physical studies, simple sounds such as grunts or sighs can typically be categorized, and so can facial expressions if the observer feels they convey enough meaning [14].
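The scoring rule described above amounts to a tally per category. The sketch below assumes that an observer has already assigned each unit to a category and counted its independent clauses; the category names are illustrative placeholders rather than the official category list, and placing a whole complex sentence in one category is a simplification.

```python
from collections import Counter


def score_units(units):
    """Tally interaction scores per category.

    units is a list of (category, independent_clauses) pairs. Simple sentences
    and fragments count as one point; complex sentences count one point per
    independent clause.
    """
    tally = Counter()
    for category, clauses in units:
        tally[category] += max(1, clauses)
    return tally


# Hypothetical observations for illustration.
observed = [
    ("agrees", 1),            # simple sentence
    ("gives opinion", 2),     # complex sentence with two independent clauses
    ("agrees", 0),            # fragment, still scored as one point
    ("asks for opinion", 1),
]
tally = score_units(observed)
```

The resulting per-category totals are what the analysis would then compare across participants or sessions.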
IX. PROBLEM WITH SENTIMENT ANALYSIS
Sentiment analysis poses both personal and technical challenges. Opinions are expressed by many people, and whenever many people are involved there is always a chance of multiple opinions on the same subject, so interpreting moods is difficult not only for humans but also for computers. Because opinions differ from person to person, analyzing a particular text or sentence raises technical challenges. Consider the statement that the coffee machine is beside the reception desk. This statement may be positive, negative or neutral depending on the situation and the people involved. Analyzing those differences from person to person and from time to time is always a great challenge. Various experiments show that automated sentiment analysis is a useful tool, but it cannot always be trusted to give an accurate analysis of the data. It is not a perfect solution in every case because it has difficulty differentiating between positive, negative and neutral data [17].
X. CURRENT SITUATION AND FUTURE RESEARCH
Most research in the field of sentiment analysis has focused on product reviews and movie reviews, and we are still far from a good model that understands human language and interprets it well. Further work can expand the techniques and algorithms to handle more general writing, to analyze short texts such as abbreviations, and to perform cross-domain analysis. Sentiment analysis algorithms can also be used for spam detection, context detection, evaluation of expressions and detection of human language. Further research can improve word identification, the handling of bipolar sentiment and the development of fully automatic tools. Sentiment analysis algorithms use simple terms and expressions, but the large number of opinions, the different opinion orientations and the different contexts together make it a big task for computers to extract the correct sentiment.
XI. CONCLUSION
Opinions are important to everyone because whenever we need to make a decision we want to hear other people’s opinions. This is true not only for individuals but also for organizations. In the past, when people needed to make a decision, they typically asked friends and family for their opinions. When an organization wanted to find out what the general public thought about its products and services, it conducted surveys and focus groups. With the explosive growth of social media content on the Web in the past few years, the situation has changed. People can now post reviews of products on merchant sites and express their views on almost anything in discussion forums, blogs and social network sites. Someone who wants to buy a product is no longer limited to the reviews of friends and family because there are many user reviews on the Web. A company may no longer need to conduct surveys or focus groups to gather consumers’ opinions about its products and those of its competitors because plenty of publicly available resources provide that information. However, finding opinion sites and monitoring them on the Web can still be a formidable task because there are a large number of diverse sites, and each site may have a huge volume of opinionated text. It is difficult for a human reader to find relevant sites, extract related sentences with opinions, read them, summarize them, and organize them into usable forms. Sentiment analysis is a field with a large area of application, and it offers researchers and academic organizations many research challenges.
With the rapid growth of the Internet and Internet-enabled applications, sentiment analysis has become popular among different communities, so more innovative, automated and effective summarization techniques are required to overcome the current challenges faced by sentiment analysis.
REFERENCES
[1] Social media analytics [online]: http://en.wikipedia.org/wiki/Social_analytics (last accessed on 15 November 2014).
[2] [Online]: http://www.w3.org/2012/06/pmod/opinionmining.pdf (last accessed on 15 November 2014).
[3] Cambria E. Schuller B. Xia Y. Havasi C. New Avenues in Opinion Mining and Sentiment Analysis.
[4] [Online]: http://www.cs.uic.edu/~liub/FBS/IEEE-Intell-Sentiment-Analysis.pdf (last accessed on 15 November 2014).
[5] Liu B. Sentiment Analysis and Subjectivity. Handbook of Natural Language Processing, Second edition, 2010
[6] Popescu A-M. Etzioni O. Extracting product features and opinions from reviews, Proceedings of the Human Language Technology Conference and the Conference on Empirical Methods in Natural Language Processing (HLT/EMNLP), 2005
[7] Aue A. Gamon M. Customizing sentiment classifiers to new domains: A case study. Proceedings of Recent Advances in Natural Language Processing (RANLP), 2005.
[8] Pang B. Lee L. Vaithyanathan S. Sentiment classification using machine learning techniques. In Proceedings of the Conference on Empirical Methods in Natural Language Processing (EMNLP), 2002:79–86.
[9] Pang B. Lee L. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. Proceedings of the Association for Computational Linguistics (ACL), 2005:115–124.
[10] Bethard S. Yu H. Thornton A. Hatzivassiloglou V. and Jurafsky D. Automatic extraction of opinion propositions and their holders. Proceedings of the AAAI Spring Symposium on Exploring Attitude and Affect in Text, 2004.
[11] Hu M. Liu B. Mining opinion features in customer reviews. In Proceedings of AAAI, 2004:755–760.
[12] Esuli A. Sebastiani F. Determining the semantic orientation of terms through gloss classification. In Proceedings of CIKM-05, the ACM SIGIR conference on information and knowledge management, Bremen, DE, 2005.
[13] Turney P. Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews. In Proceedings of the Association for Computational Linguistics (ACL), 2002:417–424.
[14] Bales, Robert. "Interaction Process Analysis Article." Interaction Process Analysis. Web. 25 Mar. 2012.
[15] [Online]: http://web.njit.edu/~da225/NetHelp/default.htm?turl=Documents%2Fsubjectivityobjectivityidentification.htm (last accessed on 15 December 2014).
[16] Kamps J. Marx M. Mokken R.J. de Rijke M. Using WordNet to measure semantic orientation of adjectives. In Language Resources and Evaluation (LREC), 2004.
[17] Sentiment analysis and opinion mining [Online]: http://stp.lingfil.uu.se/~santinim/sais/2014/bingliu.pdf (last accessed on 18 December 2014)
Essay: Social media analytics – sentiment analysis
- Published: 30 September 2015*
- Last Modified: 23 July 2024
Essay Sauce, Social media analytics – sentiment analysis. Available from:<https://www.essaysauce.com/computer-science-essays/essay-social-media-analytics-sentiment-analysis/> [Accessed 18-12-24].
* This essay may have been previously published on EssaySauce.com and/or Essay.uk.com at an earlier date than indicated.