The new Moto g6 is now on the market. Let us make a word cloud from its customer reviews to see whether the Moto g6 is worth buying.
A word cloud, also known as a tag cloud or text cloud, is an image that displays words taken from a particular text. The words appear in different sizes; their size depends on how frequently they occur in that text.
The greater the frequency, the larger the font.
We will create a word cloud from the customer reviews of the new Moto g6 on Amazon.
Step1:- We will create a text file containing all the customer reviews. Just visit amazon.in and copy all the reviews for the Moto g6 into a text document.
Step2:- Manually remove unnecessary items such as bullets, smileys and the like; anything that is left over will be handled later. This is what your text document will look like.
Step3:- Loading the packages
We will now load all the required packages.
If any of them is not installed, install it with the simple command install.packages("Package Name").
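As a minimal sketch of this step: the post names the tm package and the wordcloud() function, so the package list below (tm, wordcloud and RColorBrewer) is an assumption rather than the exact list used originally. The variable names `required` and `to_install` are illustrative.

```r
# Assumed package list for this walkthrough; adjust it to your own needs.
required <- c("tm", "wordcloud", "RColorBrewer")

# Install only the packages that are not already present.
to_install <- setdiff(required, rownames(installed.packages()))
if (length(to_install) > 0) install.packages(to_install)

# Load each package by name.
for (pkg in required) library(pkg, character.only = TRUE)
```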
Step4:- Reading the file
We will now import our text document into RStudio so that we can process it further.
Store the path of the text document in the "reviews" variable, then import the document with the readLines() function.
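The reading step might look roughly like this. The file name and the two sample lines are made-up placeholders so that the snippet runs on its own; in practice `reviews` would hold the path of your own review file. The name `text` for the result is also an assumption.

```r
# Hypothetical path: a temporary file stands in for your real reviews file.
reviews <- file.path(tempdir(), "moto_g6_reviews.txt")

# Placeholder lines (not real reviews), written only to make the sketch runnable.
writeLines(c("Great camera and battery life.",
             "Battery drains fast, but the display is good."), reviews)

# Read the document; each line of the file becomes one element of the vector.
text <- readLines(reviews, encoding = "UTF-8")
length(text)
```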
Step5:- Creating corpus
We will now convert our text document into a corpus. This conversion matters because the corpus is the format that the tm package processes.
A corpus is a collection of text documents on some subject. We will use the Corpus() function to convert our text document into a corpus.
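A small sketch of the conversion, assuming the lines read in the previous step are held in a character vector called `text` (placeholder strings are used here so the snippet is self-contained). VectorSource() tells tm to treat each element of the vector as one document.

```r
library(tm)

# Placeholder input standing in for the lines read from the reviews file.
text <- c("Great camera and battery life.",
          "Battery drains fast, but the display is good.")

# Each element of the vector becomes one document in the corpus.
corpus <- Corpus(VectorSource(text))
length(corpus)
```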
Step6:- Cleaning corpus
We will now clean our corpus by removing everything that is not needed for the analysis.
We remove extra whitespace, numbers, punctuation and stop words, as they play no role in the analysis.
Stop words are extremely common words such as "the", "are", "be", "been" and many others; they carry little meaning of their own, so we remove them.
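One possible way to write the cleaning step uses tm_map() with tm's built-in transformations. Lowercasing is an extra step not mentioned above; it is added here because the English stop word list in tm is lowercase. The order of the transformations and the sample strings are assumptions made for illustration.

```r
library(tm)

# Placeholder corpus standing in for the one built from the reviews.
corpus <- Corpus(VectorSource(c("The Battery is GOOD, lasts 2 days!!",
                                "Camera   is   the best.")))

# Lowercase first so that stop words such as "The" match the lowercase list.
corpus <- tm_map(corpus, content_transformer(tolower))
# Remove stop words before punctuation so that entries like "don't" still match.
corpus <- tm_map(corpus, removeWords, stopwords("english"))
corpus <- tm_map(corpus, removePunctuation)
corpus <- tm_map(corpus, removeNumbers)
# Collapse the runs of spaces left behind by the removals above.
corpus <- tm_map(corpus, stripWhitespace)
```

With some versions of tm, tm_map() may emit a warning about dropped documents on this kind of corpus; the cleaned text is still produced.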
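Step7:- Term Document Matrix
We will now create a term document matrix.
A term document matrix (TDM) is a matrix with one row per term and one column per document; each cell holds the number of times that term occurs in that document, so it tells us the frequency of every word in our corpus.
To convert a corpus into a TDM, use the TermDocumentMatrix() function, then pass the result to as.matrix() to get an ordinary matrix.
This step is optional: the wordcloud() function used in Step8 accepts the corpus directly, so you can skip it.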
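As an illustration of this step, the sketch below builds the matrix for a tiny placeholder corpus and sums each row to get the total count per word. The names `tdm`, `m` and `freq` are assumptions.

```r
library(tm)

# Placeholder corpus of already-cleaned text.
corpus <- Corpus(VectorSource(c("battery good battery", "camera good")))

# Rows are terms, columns are documents, cells are counts.
tdm <- TermDocumentMatrix(corpus)
m <- as.matrix(tdm)

# Total frequency of each term across all documents, most frequent first.
freq <- sort(rowSums(m), decreasing = TRUE)
head(freq)
```

Inspecting `freq` is a quick way to check which words will dominate the cloud before drawing it.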
Step8:- Creating Word Cloud
We pass our corpus to the wordcloud() function to create the word cloud. The scale argument is a vector of two numbers giving the range of word sizes, from the largest word to the smallest.
random.order is set to FALSE because we do not want the words placed in random order; with FALSE they are plotted in decreasing order of frequency.
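A call along these lines would draw the cloud. Only scale and random.order come from the description above; the values chosen for scale, min.freq and max.words, the colour palette, the seed and the placeholder corpus are all assumptions made for illustration.

```r
library(tm)
library(wordcloud)
library(RColorBrewer)

# Placeholder corpus standing in for the cleaned reviews.
corpus <- Corpus(VectorSource(c("battery good battery camera",
                                "camera good display battery")))

# Fix the seed so the layout is the same on every run.
set.seed(1234)

wordcloud(corpus,
          scale = c(4, 0.5),        # size of the largest and smallest words
          min.freq = 1,             # keep words that occur at least once
          max.words = 100,          # upper limit on the number of words drawn
          random.order = FALSE,     # plot most frequent words first
          colors = brewer.pal(8, "Dark2"))
```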
This is what our word cloud will look like:-