1) Data mining is an essential technique for extracting and analysing figures from homogeneous (structured) data. It is mainly applied to account-dependent activities such as accounting, purchasing, supply chain and CRM. Once the algorithm is defined, a solution can be found quickly.
Text mining is the technique for extracting data from heterogeneous document formats such as text documents, emails and social media posts. The process includes several linguistic stages of analysis, such as language guessing, tokenization, segmentation, morpho-syntactic analysis, disambiguation and cross-referencing. These steps help to structure the unstructured data so that it can feed domain-specific applications.
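As a rough illustration of two of the stages named above (segmentation and tokenization), the following minimal Python sketch turns free text into one token list per sentence. The function names and the regular expressions are assumptions chosen for illustration; real text mining tools use far more careful rules.

```python
import re

def segment_sentences(text):
    # Naive segmentation: split after '.', '!' or '?' when followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    # Lowercase the sentence and keep runs of letters, digits and apostrophes.
    return re.findall(r"[a-z0-9']+", sentence.lower())

def structure(text):
    # Unstructured text in, structured rows out: one token list per sentence.
    return [tokenize(s) for s in segment_sentences(text)]

sample = "The delivery was late. Support replied quickly!"
rows = structure(sample)
print(rows)
```

Each row can then be passed on to later stages such as morpho-syntactic analysis or disambiguation, which are not covered by this sketch.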
Sentiment analysis is a popular text mining method that analyses data to understand the opinion it expresses, along with other key factors such as modality and mood. The two main types of sentiment analysis are subjectivity/objectivity identification and feature/aspect-based sentiment analysis (Saxena, N.).
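A very small sketch of subjectivity/objectivity identification combined with polarity scoring might look like the following. It assumes a tiny hand-made word list (POSITIVE and NEGATIVE are hypothetical, not taken from any real lexicon) and simply counts matches, so it should be read as an illustration of the idea rather than a usable classifier.

```python
import re

# Hypothetical mini-lexicon for illustration; real systems use much larger resources.
POSITIVE = {"good", "great", "fast", "helpful"}
NEGATIVE = {"bad", "slow", "broken", "rude"}

def analyse(text):
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    # A text with no opinion words is treated as objective.
    subjective = (pos + neg) > 0
    if pos > neg:
        polarity = "positive"
    elif neg > pos:
        polarity = "negative"
    else:
        polarity = "neutral"
    return {"subjective": subjective, "polarity": polarity}

print(analyse("The support team was helpful and fast"))
print(analyse("The parcel arrived on Monday"))
```

Feature/aspect-based analysis would go one step further and attach the polarity to a specific feature (for example "support team"), which this sketch does not attempt.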
2) Text mining is a data mining technique for deriving high-quality information from text.
It is the process of analysing unstructured/heterogeneous text, extracting relevant information and transforming it into useful business intelligence. The important techniques involved in text mining are information extraction, information retrieval, categorization, clustering and summarization.
Popular applications of text mining are:
1) Customer care service
Companies benefit greatly from text mining techniques, particularly NLP, which are becoming increasingly important in the field of customer care.
Companies are investing in text analytics software to access text data from varied sources such as surveys, customer feedback and customer calls. They aim to reduce the response time and to address customer grievances raised through feedback more quickly and efficiently.
2) Personalized advertising
People increasingly see ads on Facebook or YouTube that are very relevant or close to their needs. This is because their browsing data is run through text mining software to understand their needs, and advertisements related to those needs are then displayed on social media platforms or through ad-sense managers (Williams, J., 2018).
3) Text mining is an inductive approach to finding patterns and trends across data. The data supplied for text mining is in an unstructured format, so it must be converted into a structured format for predictive modelling and other types of analysis. The process includes several linguistic stages of analysis, such as language guessing, tokenization, segmentation, morpho-syntactic analysis, disambiguation and cross-referencing. This process helps to extract knowledge from text-based data.
Other possible ways of inducing structure in text in order to extract knowledge are:
1) Classification – Grouping terms into predefined categories
2) Clustering – Finding natural groupings
3) Association rule learning – Finding frequent combinations of terms in the data
4) Trend analysis – Recognizing concept distributions based on a specific collection of documents
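To make the association rule learning item a little more concrete, here is a minimal sketch of its first step: counting which pairs of terms appear together in at least a given number of documents. The function name frequent_pairs, the min_support parameter and the sample documents are all assumptions for illustration, and a full association rule learner would go on to derive rules with confidence values from these counts.

```python
from collections import Counter
from itertools import combinations

def frequent_pairs(docs, min_support=2):
    counts = Counter()
    for doc in docs:
        # Use each distinct term once per document, in a stable order.
        terms = sorted(set(doc.lower().split()))
        counts.update(combinations(terms, 2))
    # Keep only the pairs seen in at least min_support documents.
    return {pair: n for pair, n in counts.items() if n >= min_support}

docs = ["late delivery refund", "late delivery complaint", "fast delivery"]
pairs = frequent_pairs(docs)
print(pairs)
```

In this toy collection only "delivery" and "late" occur together in two documents, so that is the only pair that passes the support threshold.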
4) Text mining is an artificial intelligence technology that uses natural language processing to transform the free (unstructured) text in documents and databases into normalized, structured data suitable for analysis or for driving machine learning (ML) algorithms.
Natural language processing (NLP) studies the problem of “understanding” natural human language, that is, converting it into a form of numeric or symbolic data that is easier for computer programs to manipulate.
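One simple way to picture the conversion of text into numeric data is a bag-of-words count vector. The sketch below is an assumption about how such a representation could be built with the Python standard library (the helper names build_vocabulary and vectorize are hypothetical); it ignores grammar and word order entirely.

```python
from collections import Counter

def build_vocabulary(docs):
    # Sorted list of every distinct word across the collection.
    return sorted({w for d in docs for w in d.lower().split()})

def vectorize(doc, vocab):
    # One count per vocabulary word, in vocabulary order.
    counts = Counter(doc.lower().split())
    return [counts[w] for w in vocab]

docs = ["the cat sat", "the cat saw the dog"]
vocab = build_vocabulary(docs)
vectors = [vectorize(d, vocab) for d in docs]
print(vocab)
print(vectors)
```

Because the vectors discard word order and context, they show why such numeric representations are easy for programs to manipulate but still fall short of real understanding.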
The main objective of NLP is to move text processing towards a true understanding of natural language, one that considers grammatical and semantic constraints while maintaining context.
One limitation of text mining is copyright: most repositories, authors, publishers and other interested parties carefully manage their data, and it can be highly expensive to obtain.
Another limitation, this time of NLP, is the difficulty of representing a sentence or group of concepts with absolute precision. The realities of computer software and hardware limitations make this challenge nearly insurmountable. The realistic amount of data necessary to perform NLP at the human level requires memory space and processing capacity beyond even the most powerful computer processors (Thomas-Ogbuji, C., 2001).
Internet Exercise:
I went through kdnuggets.com and explored the software section, which provides information on packages for data mining and text mining.
Packages for text mining:
1) Clarabridge – Text mining software that provides an end-to-end solution for customer experience professionals who want to transform customer feedback into marketing, service and product improvements.
2) Expert System – Software that uses the proprietary COGITO platform for semantic comprehension of language to perform knowledge management of unstructured information (Text Analysis, Text Mining, and Information Retrieval Software, 2020).
Packages for data mining:
3) BayesiaLab – A software tool based on Bayesian networks, covering data preparation, missing values imputation, data and variables clustering, and unsupervised and supervised learning.
4) Advanced Miner – It provides a wide range of tools for data transformation, data mining models, data analysis and reporting (Software Suites/Platforms for Analytics, Data Mining, Data Science, and Machine Learning).