Day 13: gtrendsR

Welcome back for the 13th day of the #packagecalendar. Today we will be taking a look at the gtrendsR package by Philippe Massicotte. The gtrendsR package provides an interface for retrieving and displaying Google Trends information.

The package is available from CRAN, but due to recent changes to the Google Trends API, the CRAN version of the package is no longer working. If you want to continue to query Google Trends, you have to install the development version of the package:

remotes::install_github("PMassicotte/gtrendsR")

We have already seen the use of gtrendsR in previous posts. The main function is gtrends(), which takes a handful of parameters. For our purposes the main arguments are keyword and time. keyword is a character vector of the search terms you want to get the trend for. time is a string specifying the span of the query; the output further down shows "today+5-y", which is the value we get when time is left unset.
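As a rough sketch of how the two arguments fit together, the snippet below builds the arguments for a single query. The keyword and the "today 12-m" span (past 12 months, as listed in the package documentation) are chosen only for illustration, and the names query_args and one_year are made up for this example. The call itself is guarded because it needs network access and an installed copy of gtrendsR.

```r
# Arguments for a single query; keyword and time span are illustrative
query_args <- list(
  keyword = "O Holy Night",
  time    = "today 12-m"  # past 12 months, per the package documentation
)

# Only run the query in an interactive session with gtrendsR installed,
# since the call needs network access
if (interactive() && requireNamespace("gtrendsR", quietly = TRUE)) {
  one_year <- do.call(gtrendsR::gtrends, query_args)
  head(one_year$interest_over_time)
}
```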

Let’s use the carol names we found yesterday. Note that we are not able to pass more than 5 keywords to gtrends() in a single call.

library(gtrendsR)
carrol_trends <- gtrends(carrols)
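If you have more than 5 keywords, one way around the limit is to split them into batches and query each batch separately. Here is a minimal sketch in base R; the vector carrols_all and most of the names in it are hypothetical stand-ins, not the list from yesterday's post. Keep in mind that hits are scaled within each query, so values from different batches are not directly comparable.

```r
# Hypothetical stand-in for a longer list of carol names
carrols_all <- c("O Holy Night", "Silent Night", "Jingle Bells",
                 "Deck the Halls", "Joy to the World", "Away in a Manger",
                 "The First Noel")

# Assign each keyword, in order, to a group of at most five
group_id <- ceiling(seq_along(carrols_all) / 5)
keyword_groups <- split(carrols_all, group_id)

# Number of keywords in each batch
lengths(keyword_groups)
```

Each element of keyword_groups could then be passed to gtrends() on its own, for example with lapply().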

After running the query we get quite a lot of information back:

names(carrol_trends)
## [1] "interest_over_time"  "interest_by_country" "interest_by_region" 
## [4] "interest_by_dma"     "interest_by_city"    "related_topics"     
## [7] "related_queries"

Let’s first look at the interest over time:

head(carrol_trends$interest_over_time)
##         date hits   geo      time      keyword gprop category
## 1 2014-12-14   29 world today+5-y O Holy Night   web        0
## 2 2014-12-21   37 world today+5-y O Holy Night   web        0
## 3 2014-12-28    5 world today+5-y O Holy Night   web        0
## 4 2015-01-04    2 world today+5-y O Holy Night   web        0
## 5 2015-01-11    2 world today+5-y O Holy Night   web        0
## 6 2015-01-18    1 world today+5-y O Holy Night   web        0

Here hits comes back as a character column, since some of the entries are "<1". Let’s correct that and plot the values over time by keyword:

library(tidyverse)

carrol_trends$interest_over_time %>%
  mutate(hits = as.numeric(if_else(hits == "<1", "0", hits))) %>%
  ggplot(aes(date, hits, color = keyword)) +
  geom_line()

Note how the maximal value is 100 and every other value is scaled relative to it. That is why some of the values are 0: those searches occurred less than 1% as often as the maximum.
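To make the scaling concrete, here is a small worked example on made-up numbers. It is a simplified model of what is described above, not Google's actual computation, and the raw volumes are invented purely for illustration.

```r
# Made-up raw search volumes for two keywords over four weeks (not real data)
raw <- data.frame(
  keyword = rep(c("a", "b"), each = 4),
  volume  = c(60, 200, 400, 20, 1, 40, 2, 1)
)

# Express every value as a percentage of the single largest value
share <- 100 * raw$volume / max(raw$volume)

# Anything below 1% of the maximum is reported as "<1" rather than a number
raw$hits <- ifelse(share < 1, "<1", as.character(round(share)))

raw
```

The largest volume (400) becomes 100, a volume of 200 becomes 50, and the volumes of 1 and 2 fall under 1% of the maximum, so they show up as "<1" and turn into 0 once we convert hits to numeric.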

We can also take a look at the related queries. Here I have shown them using the reactable package.

library(reactable)
reactable(carrol_trends$related_queries)

There are many more things we can look at, but this is all I have for you today!