How to Create Alerts for Adobe Analytics Using R and Slack

Intelligent alerting has been one of the most popular features of Adobe Analytics since its release years ago. It’s impossible to keep an eye on your data 24/7, and alerts are an excellent way to prevent missing out on important trends in your metrics over time. And while there are lots of amazing things you can do with the built-in alerts, there’s no way to find trending dimension items (e.g., trending videos, pages, or products) in your data. Additionally, Adobe Customer Journey Analytics and Adobe Experience Platform Query Service don’t have built-in alerting as of this writing.

In this post, I’m going to show you how to create more sophisticated alerting using three awesome R libraries: adobeanalyticsr (or, in my case, cjar), slackr, and cronR. But first, if you’re wondering whether you need R just to send intelligent alerts from Adobe Analytics to Slack, you don’t. It’s quite simple to create an email address tied to a Slack channel or conversation and then add that address as a recipient in your alert setup.

But, if you’re ready to go a step beyond the basics, then read on!

Setting Up R for Alerting

The basic steps to follow are:

  1. Define the query you want to run, whether with adobeanalyticsr, cjar, or the Query Service techniques I showed in a previous blog post.
  2. If the query results indicate that an alert should be triggered, use the slackr library to send a message to Slack.
  3. Use the cronR library to run the script on a regular cadence so it doesn’t have to be triggered manually every time.

Querying the Data

Being able to alert on any report opens up tons of cool and interesting possibilities, such as alerting when a new value appears in (or disappears from) a dimension, when a dimension item moves up or down dramatically in rank, or when a trended metric makes a significant jump or drop over time.

Let’s start with detecting new values in a dimension. I frequently use this to catch new values that need to be classified in a report, or to get alerted when a new business account has started using our product. For a retailer, this could be useful for seeing when a new product is purchased for the first time. For a media company, it’s useful for getting alerted when users start watching a new video.

The basic query looks something like this:

library(cjar) # you can also use adobeanalyticsr
library(dplyr)

cja_auth()
cja_dv = "my_dataview_id"

# Grab all values from the last 30 days
previous_values = cja_freeform_table(
  date_range =  c(Sys.Date()-30, Sys.Date()-1),
  dataviewId = cja_dv,
  dimensions = c("my_dimension"),
  metrics = "visits",
  top=50000
)

# Grab all the values from today
current_values = cja_freeform_table(
  date_range =  c(Sys.Date(), Sys.Date()),
  dataviewId = cja_dv,
  dimensions = c("my_dimension"),
  metrics = "visits",
  top=50000
)

# Isolate values that exist today that didn't exist before
prev_and_current = previous_values %>%
  full_join(current_values, by="my_dimension") %>%
  filter(is.na(visits.x) & !is.na(visits.y))

Another trick I’ll often use is applying a filter or segment to the report to look at just a subset of the values (rather than looking at all possible values). This is useful if you have URLs and only want to get alerted on those from a particular domain (say, a new campaign landing page), or if you have different product categories and want to be alerted about new “shoes” but not new “pants,” for example.
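If your query doesn’t apply the filter for you, a simple alternative is to narrow the results after the fact with dplyr. Here’s a minimal sketch, where "example.com" is just a placeholder pattern for whatever domain or category you care about:

# Keep only dimension items matching a pattern before comparing time periods
# ("example.com" is a placeholder; swap in your own domain or category)
previous_values = previous_values %>%
  filter(grepl("example.com", my_dimension, fixed = TRUE))

current_values = current_values %>%
  filter(grepl("example.com", my_dimension, fixed = TRUE))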

Conversely, if it’s more interesting for your use case, you can flip the final filter to isolate values that disappeared from the data: filter(!is.na(visits.x) & is.na(visits.y)). This is handy for detecting that tracking has died on a particular page, for example, or that something is preventing users from completing a conversion on your site.
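For reference, that variant looks like this end to end, reusing the previous_values and current_values data frames from above (disappeared_values is just an illustrative name):

# Isolate values present in the prior 30 days but missing today
disappeared_values = previous_values %>%
  full_join(current_values, by="my_dimension") %>%
  filter(!is.na(visits.x) & is.na(visits.y))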

In the next example, let’s find changes in how dimension items are ranked. We can take a similar approach, but this time we’ll leverage the row_number() function to find large changes in rank:

# Grab yesterday's rankings:
yesterday_values = cja_freeform_table(
  date_range =  c(Sys.Date()-1, Sys.Date()-1),
  dataviewId = cja_dv,
  dimensions = c("my_dimension"),

  # Pick any metric to rank by like "orders" or "views":
  metrics = "my_metric", 
  top=50000
) %>%
  transmute(
    my_dimension,
    # Results are sorted by the metric (descending by default), so row_number() is the rank
    rank = row_number()
  )

# Grab today's rankings:
today_values = cja_freeform_table(
  date_range =  c(Sys.Date(), Sys.Date()),
  dataviewId = cja_dv,
  dimensions = c("my_dimension"),
  metrics = "my_metric",
  top=50000
) %>%
  transmute(
    my_dimension,
    rank = row_number()
  )

# Calculate change in ranking:
prev_and_current = yesterday_values %>%
  full_join(today_values, by="my_dimension") %>%
  transmute(
    my_dimension,
    # Positive = moved up in rank since yesterday; negative = dropped
    rank_change = rank.x - rank.y
  )

# Grab the top winners:
top_winners = prev_and_current %>%
  filter(rank_change > 0) %>%
  arrange(-rank_change)

# Grab the biggest losers:
top_losers = prev_and_current %>%
  filter(rank_change < 0) %>%
  arrange(rank_change)

This is incredibly useful for identifying the top trends in your product purchases or videos/pages that are trending on your site. Using the cross-channel capabilities of Customer Journey Analytics, you can even get alerted when specific pages are causing more calls to your call center than others or when specific products have a sudden jump in in-store returns.

Lastly, you can look for anomalies in trended metrics. This is somewhat duplicative of the alerting functionality already present in Adobe Analytics but still useful if you want to be able to achieve the same types of alerting in Customer Journey Analytics or from other reporting systems like Query Service (or anything else you might use).

Delving deeply into the science of anomaly detection is outside the scope of this post; however, if you’re interested, Arka Ghosh has a really nice article about using the anomalize package that you can read here. Here’s a simple example of how to apply the package to the data from Adobe Analytics or Customer Journey Analytics:

library(anomalize)
library(tidyverse)

trended_metric = cja_freeform_table(
  date_range =  c(Sys.Date()-90, Sys.Date()-1),
  dataviewId = cja_dv,
  dimensions = c("daterangeday"),
  metrics = "visits"
) %>%
  arrange(daterangeday)

anomalized_visits <- as_tibble(trended_metric) %>%
  time_decompose(visits, merge = TRUE) %>%
  anomalize(remainder) %>%
  time_recompose()

plot = anomalized_visits %>% 
  plot_anomalies(ncol = 3, alpha_dots = 0.75, time_recomposed = TRUE)

This creates an R data frame where anomalies have been flagged in the “anomaly” column. It also creates a very nice-looking plot where the anomalies are highlighted for you.
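Since the alerting logic in the next section keys off of whether a query returned any rows, it can also be handy to pull out just the flagged rows. Here’s a small sketch using the anomaly column that anomalize adds (anomalous_days is just an illustrative name):

# Keep only the rows that anomalize() flagged as anomalous ("Yes"/"No" column)
anomalous_days = anomalized_visits %>%
  filter(anomaly == "Yes")

# nrow(anomalous_days) > 0 can then drive the Slack alert shown below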

Now that we have some example queries, we can move on to the next step: sending alerts to Slack.

Sending Alerts to Slack

To send messages to Slack, we’re going to leverage the slackr library. The setup is not terribly difficult, and you can read the setup instructions at this link. For this article, we’ll just be using the simplest option: a single-channel bot. The setup steps can also be found by running vignette('scoped-bot-setup', package = 'slackr') from the R console.

Once you’ve got the bot set up and added to your Slack channel, posting messages is as easy as:

library(slackr)

token = "your_token_here"
channel = "#alert-channel"
message = "I'm alerting to Slack!"

slackr_msg(message, channel = channel, token = token)

For the example above where we’re detecting new values in a dimension, you only need to add a bit of logic to test whether the query results contain any rows. For example, to alert when new values have appeared in our dimension:

# If new values have appeared, then send a Slack alert
if(nrow(prev_and_current) > 0){
  message = paste0("There are new dimension items!", "\n", 
      paste(prev_and_current$my_dimension, collapse = "\n"))

  slackr_msg(message, channel = channel, token = token)
}

slackr also has a very convenient function for converting a data frame to a CSV and sending it to Slack as well. For the rank change example above, you can do something like this:

# Send a csv of the top "winners" to Slack 
slackr_csv(
  top_winners[1:10,],
  token = "your_token_here",
  channels = "#cja-bot",
  quote = FALSE,
  row.names = FALSE,
  title = "Top Winners Today"
)

Lastly, you can even send plots over to Slack using ggslackr and the same plot object we created above:

ggslackr(
  plot = plot,
  channels = "#cja-bot",
  token = "your_token_here",
  title = "Anomaly Plot"
)

In Slack, it should look something like this – awesome!

Scheduling Alerts with cronR

Now that we’ve got the R queries and the Slack messages we need, it’s just a matter of running these queries and sending alerts to Slack automatically. The cronR package is awesome for this (on Windows you can try taskscheduleR).

After installing the cronR package, you can run the following from the R console to set up a schedule:

> library(cronR)
> cron_rstudioaddin()

This should bring up a nice GUI that lets you select the R script we created above and decide how frequently you want it to run.
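If you prefer to script the schedule rather than use the add-in, cronR exposes the same functionality directly. Here’s a minimal sketch, with the script path, time, and job id as placeholder values:

library(cronR)

# Build the shell command that runs the script, then register a daily cron job
cmd = cron_rscript("/path/to/alert_script.R")  # placeholder path to your saved script
cron_add(command = cmd, frequency = "daily", at = "08:00",
         id = "cja_slack_alerts", description = "Send CJA alerts to Slack")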

And that’s all there is to it. Make sure your machine is connected to the internet whenever you’ve scheduled your script to run, and you should start seeing your Slack channel populate with interesting data.

Happy alerting!

Trevor Paulsen

Trevor is a group product manager for Adobe's Customer Journey Analytics (CJA). With a background in aerospace engineering and robotics, he has a strong foundation in estimation theory and data mining. Before leading Adobe's data science consulting team, Trevor used these skills to drive innovation in the fields of aerospace and robotics. When he's not working, Trevor enjoys engaging in big data projects and statistical analyses as a hobby. He is also a father of five and enjoys bike rides and music. All views expressed are his own.