Advances in data science continue to have a major impact on the way corporations handle and analyze business and consumer data. They add value to the data that companies harvest and contribute to increased business profitability.
A major emerging use of big data in commercial settings is sentiment analysis. Sentiment analysis is the process of evaluating consumer sentiment toward a product or service. Businesses have a vested interest in ensuring their consumers are satisfied, but satisfaction can be very hard to gauge because product reviews are often the only direct feedback businesses receive on their products.
There has been growing interest in providing sentiment analysis services to companies, with firms such as CloudFactory, LionBridge and SocialBakers entering this rapidly growing space. Most often, the analysis is performed on data acquired by scraping social media websites. Sentiment analysis relies on statistical techniques to extract user sentiment about a particular product or service (i.e. how users feel about it). There are three main stages involved: tokenization, feature extraction and, finally, classification.
Tokenization refers to the transformation of user text into a series of tokens that can be easily processed by computer algorithms. There are many different tokenization mechanisms, but they all transform each word into a symbol that different algorithms can easily process. After tokenization, different features are extracted from the tokens. A feature is a particular attribute of the data and could be anything from the word count or word length to the frequency distribution of particular tokens.
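The first two stages can be illustrated with a minimal Python sketch. It assumes a very simple tokenizer (lowercasing and splitting on non-word characters) and a small set of features; the function names `tokenize` and `extract_features` and the sample review are hypothetical and chosen only for illustration. Production systems typically use more sophisticated tokenizers.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def extract_features(tokens):
    """Compute a few basic features from a list of tokens."""
    token_count = len(tokens)
    mean_length = sum(len(t) for t in tokens) / token_count if tokens else 0.0
    return {
        "token_count": token_count,            # how many tokens the text has
        "mean_token_length": mean_length,      # average token length
        "token_frequencies": Counter(tokens),  # how often each token occurs
    }


# Hypothetical review text, used only for illustration.
review = "The battery is great, but the screen is not great."
tokens = tokenize(review)
features = extract_features(tokens)
```

Here `features["token_frequencies"]` maps each token to the number of times it occurs, which is the kind of input a downstream classifier would consume.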
The final step is classification: the processed text is assigned to different classes of sentiment (e.g. happy, dissatisfied) and those classifications are then evaluated. The classification mechanisms used are conventional ones, such as naive Bayes or support vector machine-based classification. Sentiment analysis is now starting to be used extensively by companies to assess how the general public feels about their products and services, or about the company's brand.
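To make the classification stage concrete, the following is a minimal sketch of a multinomial naive Bayes classifier with add-one smoothing, written in plain Python. The class name `NaiveBayesSentiment`, the labels and the four training sentences are all hypothetical; a real system would be trained on far more data and would normally rely on an established machine learning library.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


class NaiveBayesSentiment:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def __init__(self):
        self.class_counts = Counter()              # documents seen per label
        self.token_counts = defaultdict(Counter)   # token counts per label
        self.vocabulary = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for token in tokenize(text):
                self.token_counts[label][token] += 1
                self.vocabulary.add(token)

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Start from the log prior of the class.
            score = math.log(doc_count / total_docs)
            label_total = sum(self.token_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocabulary:
                    continue  # ignore tokens never seen during training
                count = self.token_counts[label][token]
                score += math.log((count + 1) / (label_total + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training data, used only for illustration.
train_texts = [
    "I love this product, it works great",
    "Great quality and fast delivery",
    "Terrible experience, it broke quickly",
    "Very poor quality, I am disappointed",
]
train_labels = ["happy", "happy", "dissatisfied", "dissatisfied"]

classifier = NaiveBayesSentiment()
classifier.fit(train_texts, train_labels)
positive = classifier.predict("great product, love it")
negative = classifier.predict("poor quality, broke quickly")
```

Working in log probabilities avoids numerical underflow when many token probabilities are multiplied together, and the smoothing keeps a token that is unseen for one class from driving that class's probability to zero.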
Figure 1: Data Analytics Framework
The most common techniques used to analyze large amounts of data are machine learning-based approaches, many of which are built on artificial neural network architectures. Artificial neural networks are algorithms whose structure mimics that of biological neural networks in the brain. They use statistical techniques to find underlying patterns in large swathes of data. Machine learning-based approaches have proved very effective for big data and are now the de facto standard for big data processing.
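As a toy illustration of the idea, the sketch below trains a single artificial neuron (a weighted sum followed by a sigmoid activation) to reproduce the logical OR pattern using gradient descent. This is a deliberately tiny stand-in for the much larger networks used in practice; the function names, learning rate and number of epochs are assumptions chosen for illustration.

```python
import math


def sigmoid(z):
    """Squash a weighted sum into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(samples, epochs=2000, learning_rate=0.5):
    """Fit one neuron with two inputs using stochastic gradient descent."""
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            output = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
            error = target - output
            # Nudge each weight in the direction that reduces the error.
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


def predict(weights, bias, inputs):
    """Return 1 if the neuron's output is at least 0.5, else 0."""
    activation = sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
    return 1 if activation >= 0.5 else 0


# The logical OR pattern as (inputs, target) pairs.
or_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
weights, bias = train_neuron(or_samples)
predictions = [predict(weights, bias, inputs) for inputs, _ in or_samples]
```

The neuron is never told the rule; it recovers the pattern from the examples alone, which is the same principle that larger networks apply to much bigger datasets.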
These cutting-edge analytics applications can also be deployed using a wide variety of new cloud deployment methods. In addition to the conventional method of deploying analytics engines on software-as-a-service (SaaS) platforms such as SAP cloud services, analytics services can be run on disposable servers on cloud platforms such as Amazon AWS and Google Compute Engine, with the computing power of the underlying server adjusted on the fly according to the user's data processing requirements.
Figure 2: Trends in Data Analytics
In other words, one does not need to buy computing resources that are not used all the time. One can instead rent processing power on the fly, for a short, limited amount of time, only when it is needed, and this process can be automated.
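The automation can be as simple as a rule that converts the current workload into a number of servers to rent. The sketch below shows one such hypothetical rule; the function name `desired_server_count`, its parameters and the limits are assumptions made for illustration, and it deliberately does not call any real cloud provider's API.

```python
import math


def desired_server_count(pending_jobs, jobs_per_server,
                         min_servers=0, max_servers=10):
    """Return how many servers to rent for the current workload.

    pending_jobs    -- number of data processing jobs waiting to run
    jobs_per_server -- how many jobs one server can handle at a time
    min_servers     -- lower bound (0 means scale down to nothing when idle)
    max_servers     -- upper bound that caps spending
    """
    needed = math.ceil(pending_jobs / jobs_per_server)
    return max(min_servers, min(max_servers, needed))


# Hypothetical workloads at three points in time.
idle = desired_server_count(pending_jobs=0, jobs_per_server=50)
busy = desired_server_count(pending_jobs=120, jobs_per_server=50)
peak = desired_server_count(pending_jobs=10000, jobs_per_server=50)
```

A scheduler could evaluate such a rule every few minutes and ask the cloud platform to add or release servers accordingly, so that nothing is paid for while the system is idle.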
The data held on cloud platforms deployed for business intelligence can also be anonymized and handed to external specialist data analytics consultancies for advanced analysis. This allows data analytics to be done by external parties while limiting user privacy concerns. Data analytics companies thus now have the option of aggregating data collected by many different companies and using it to identify and analyze underlying trends. The analysis of these massive data collections can provide very important insights into consumer behaviors and trends.
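One simple way to prepare records before sharing them is sketched below: direct identifiers are dropped and the customer ID is replaced with a keyed hash. Strictly speaking this is pseudonymization rather than full anonymization, and the field names, the secret key and the function name `pseudonymize` are all hypothetical. Whether such a step is sufficient depends on the data and on the applicable privacy rules.

```python
import hashlib
import hmac


def pseudonymize(record, secret_key, id_field="customer_id",
                 drop_fields=("name", "email")):
    """Return a copy of the record that is safer to share externally.

    The ID is replaced by an HMAC-SHA256 token, so the same customer always
    maps to the same token, but the token cannot be reversed without the key.
    """
    token = hmac.new(secret_key,
                     str(record[id_field]).encode("utf-8"),
                     hashlib.sha256).hexdigest()
    cleaned = {key: value for key, value in record.items()
               if key != id_field and key not in drop_fields}
    cleaned["customer_token"] = token
    return cleaned


# Hypothetical key and records, used only for illustration.
key = b"replace-with-a-real-secret"
first = pseudonymize(
    {"customer_id": 42, "name": "A. Customer", "email": "a@example.com",
     "review": "Great product"}, key)
second = pseudonymize(
    {"customer_id": 42, "name": "A. Customer", "email": "a@example.com",
     "review": "Fast delivery"}, key)
```

Because both records receive the same `customer_token`, an external analyst can still link reviews written by the same customer without ever seeing who that customer is.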
(Ramakrishnan Ramanathaiah is the Director of SAP Practices at Miraclesoft, a Michigan, USA-based IT solutions firm. He is also an author, blogger, speaker, advisor, mentor, entrepreneur and investor.)
(Disclaimer: The opinions expressed are the personal views of the author. The facts and opinions appearing in the article do not reflect the views of Devdiscourse and Devdiscourse does not claim any responsibility for the same.)