Reclassification of gabapentinoids as Schedule 3 controlled substances in the UK: an interrupted time series analysis
Peter Kamerman
3 July 2020
Gabapentinoids (pregabalin and gabapentin) are used for the management of neuropathic pain, some types of epileptic seizures, and anxiety.
Because of the rising number of drug-related fatalities linked to the use of gabapentinoids¹, and evidence of their misuse, the drugs were reclassified in the UK in April 2019 as Class C substances, and scheduled as Schedule 3 items². This reclassification placed tighter controls on the prescription of gabapentinoids.
Because the date on which the new prescribing requirements came into force is known exactly (1 April 2019), interrupted time series analysis can be used to look for a change in prescribing from that point onwards. This analysis requires data on drug prescriptions over time, data that are freely available for the National Health Service (NHS) England through the OpenPrescribing platform.
In this blog post I work through my first attempt at conducting an interrupted time series analysis using piecewise regression (also called segmented regression). I use two methods of piecewise regression:
Manually re-specifying the data to incorporate a breakpoint (knot).
Using linear splines to specify the knot.
Get some data
The first thing to do was to get some data. Using the OpenPrescribing API, I downloaded data on the total number of items of gabapentin and pregabalin prescribed each month for the period 2015-05-01 to 2020-04-01 (please see this GitHub Gist for the R script I used).
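A minimal sketch of the kind of download involved looks something like the code below; the endpoint, query parameters, and BNF chemical codes shown here are illustrative assumptions rather than a copy of the Gist.
# Sketch only -- not the Gist itself. Download monthly prescribing counts for
# each drug from the OpenPrescribing spending endpoint, combine them, and save
# the result. The BNF chemical codes and the returned column names (e.g. 'items')
# are assumptions; check them against the OpenPrescribing API documentation.
library(dplyr)

gabapentin <- read.csv('https://openprescribing.net/api/1.0/spending/?code=0408010G0&format=csv')
pregabalin <- read.csv('https://openprescribing.net/api/1.0/spending/?code=0408010AE&format=csv')

gabapentinoids <- gabapentin %>%
    inner_join(pregabalin, by = 'date', suffix = c('_gabapentin', '_pregabalin')) %>%
    filter(between(as.Date(date), as.Date('2015-05-01'), as.Date('2020-04-01'))) %>%
    transmute(date,
              time_months = row_number(),
              prescriptions_gabapentin = items_gabapentin,
              prescriptions_pregabalin = items_pregabalin,
              prescriptions_total = prescriptions_gabapentin + prescriptions_pregabalin)

write.csv(gabapentinoids, '_data/2020-07-03-piecewise-regression/gabapentinoids.csv', row.names = FALSE)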
# Import data
data <- read.csv('_data/2020-07-03-piecewise-regression/gabapentinoids.csv')
Look at the data
# Describe data
head(data)
## date time_months prescriptions_gabapentin prescriptions_pregabalin
## 1 2015-05-01 1 450908 375350
## 2 2015-06-01 2 479752 399022
## 3 2015-07-01 3 502738 419985
## 4 2015-08-01 4 460096 385838
## 5 2015-09-01 5 497057 414772
## 6 2015-10-01 6 505418 423735
## prescriptions_total
## 1 826258
## 2 878774
## 3 922723
## 4 845934
## 5 911829
## 6 929153
str(data)
## 'data.frame': 60 obs. of 5 variables:
## $ date : chr "2015-05-01" "2015-06-01" "2015-07-01" "2015-08-01" ...
## $ time_months : int 1 2 3 4 5 6 7 8 9 10 ...
## $ prescriptions_gabapentin: int 450908 479752 502738 460096 497057 505418 488061 543067 489551 490390 ...
## $ prescriptions_pregabalin: int 375350 399022 419985 385838 414772 423735 410291 457160 414782 416315 ...
## $ prescriptions_total : int 826258 878774 922723 845934 911829 929153 898352 1000227 904333 906705 ...
The data had five columns: a date column (date), a column giving the time in months since the start of the series (time_months), and three columns giving the number of items prescribed per month of gabapentin (prescriptions_gabapentin), pregabalin (prescriptions_pregabalin), and the two drugs combined (prescriptions_total).
# Load packages
library(dplyr)
library(tidyr)
library(stringr)
library(ggplot2)
# Process data into long format
data_long <- data %>%
    pivot_longer(cols = starts_with('prescription'),
                 names_to = 'names',
                 values_to = 'values') %>%
    mutate(names = str_remove(names, 'prescriptions_'))
# Plot data (make it pretty)
ggplot(data = data_long) +
aes(x = time_months,
y = values,
colour = names,
fill = names) +
geom_point(shape = 21,
size = 3) +
geom_vline(xintercept = 48,
linetype = 2) +
labs(caption = 'Vertical dashed line: date of reclassification (2019-04-01)',
y = 'Number of prescribed items',
x = 'Date') +
scale_x_continuous(labels = c('May 2015', 'Dec 2016', 'Aug 2018', 'Apr 2020')) +
scale_fill_manual(values = c('#67A0C9', '#FDA568', '#70BB70')) +
scale_colour_manual(values = c('#2678B2', '#FD7F28', '#339F34')) +
theme_minimal(base_size = 18) +
theme(legend.title = element_blank(),
legend.position = 'top',
panel.grid = element_blank(),
axis.line = element_line(size = 0.5),
axis.ticks = element_line(size = 0.5))
Looking at the plot, the numbers of prescribed items for the individual drugs, and therefore the overall total of gabapentinoids, have been increasing steeply over the past five years. There is, however, a hint (especially for pregabalin) of prescriptions levelling off since the April 2019 change in prescribing regulations.
Using piecewise regression, I wanted to determine whether there was indeed a change in prescribed items from April 2019. I did this by fitting the data using simple linear regression and piecewise regression (using the two methods mentioned above), and then comparing the models. I did this modelling for each drug and for the total.
Simple linear regression
# load packages
library(purrr)
# Generate a palette for some pretty colours
colours_fill <- data.frame(fill = c('#67A0C9', '#FDA568', '#70BB70'),
                           colour = c('#2678B2', '#FD7F28', '#339F34'))
# Purrr over the data (see: https://www.painblogr.org/2020-06-19-purring-through-exploratory-analyses)
data_long <- data_long %>%
    # Nest the data by drug 'name' (pregabalin, gabapentin, and total)
group_by(names) %>%
nest() %>%
# Add the colour and fill columns
bind_cols(colours_fill) %>%
# Make the names sentence case.
mutate(names = str_to_sentence(names))
data_long <- data_long %>%
    # Add model
mutate(mod_1 = map(.x = data,
~ lm(values ~ time_months, data = .x))) %>%
# Plot model
mutate(plot_1 = pmap(.l = list(data, names, fill, colour),
~ ggplot(data = ..1) +
aes(x = time_months,
y = values) +
geom_point(shape = 21,
size = 3,
fill = ..3,
colour = ..4) +
geom_smooth(method = 'lm',
se = FALSE,
colour = ..4) +
geom_vline(xintercept = 48,
linetype = 2) +
labs(title = str_glue('{..2}'),
subtitle = 'Simple linear regression (no knots)',
caption = 'Vertical dashed line: date of reclassification (2019-04-01)',
y = 'Number of prescribed items',
x = 'Date') +
scale_x_continuous(labels = c('May 2015', 'Dec 2016', 'Aug 2018', 'Apr 2020')) +
theme_minimal(base_size = 18) +
theme(legend.title = element_blank(),
legend.position = 'none',
panel.grid = element_blank(),
axis.line = element_line(size = 0.5),
axis.ticks = element_line(size = 0.5))))
# Print plots and lm outputs
## GABAPENTIN
### Plot
data_long$plot_1[[1]]
### LM summary
summary(data_long$mod_1[[1]])
##
## Call:
## lm(formula = values ~ time_months, data = .x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -70794 -21651 3032 24697 51188
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 503528.8 7728.0 65.16 < 0.0000000000000002 ***
## time_months 2371.7 220.3 10.76 0.00000000000000189 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 29560 on 58 degrees of freedom
## Multiple R-squared: 0.6664, Adjusted R-squared: 0.6606
## F-statistic: 115.9 on 1 and 58 DF, p-value: 0.00000000000000189
## PREGABALIN
### Plot
data_long$plot_1[[2]]
### LM summary
summary(data_long$mod_1[[2]])
##
## Call:
## lm(formula = values ~ time_months, data = .x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -57154 -12165 4972 16396 42945
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 395574.9 5587.7 70.79 <0.0000000000000002 ***
## time_months 4465.3 159.3 28.03 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 21370 on 58 degrees of freedom
## Multiple R-squared: 0.9312, Adjusted R-squared: 0.9301
## F-statistic: 785.6 on 1 and 58 DF, p-value: < 0.00000000000000022
## TOTAL
### Plot
data_long$plot_1[[3]]
### LM summary
summary(data_long$mod_1[[3]])
##
## Call:
## lm(formula = values ~ time_months, data = .x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -127948 -34376 5254 42910 83841
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 899103.7 12992.2 69.20 <0.0000000000000002 ***
## time_months 6837.0 370.4 18.46 <0.0000000000000002 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 49690 on 58 degrees of freedom
## Multiple R-squared: 0.8545, Adjusted R-squared: 0.852
## F-statistic: 340.7 on 1 and 58 DF, p-value: < 0.00000000000000022
Piecewise regression
Manually re-specify the data to create two segments around a knot
To create a knot I needed to include a second slope term in the linear regression model, so that each segment could have its own slope.
Original model:
\[y = \beta_0 + \beta_1 x\]
New model:
\[y = \beta_0 + \beta_1 x + \beta_2 (x - x^{(k)}) x_k\]
Where:
\(x_k\) is a dummy knot variable, which takes a value of 1 when \(x \geq x^{(k)}\) and 0 when \(x < x^{(k)}\)
\((x - x^{(k)})\) is the value of \(x\) minus the knot value (\(x^{(k)}\))
Thus, for \(x\) values that fall below the knot value (\(x^{(k)}\)), the second term will be 0, and for \(x\) values greater than the knot value the second term will be \((x - x^{(k)})\).
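To see why (further down) I calculate the slope of the second segment as the sum of the two slope coefficients, substitute \(x_k = 1\) (i.e., \(x \geq x^{(k)}\)) into the new model:

\[y = \beta_0 + \beta_1 x + \beta_2 (x - x^{(k)}) = (\beta_0 - \beta_2 x^{(k)}) + (\beta_1 + \beta_2) x\]

So before the knot the slope is \(\beta_1\), from the knot onwards the slope is \(\beta_1 + \beta_2\), and \(\beta_2\) is the change in slope at the knot.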
For my data, the knot point was 48 months, the time_months value that coincided with the implementation of the new regulations (2019-04-01). Therefore I re-specified the data into two terms: time_months (the original time variable) and time_months2 (which is non-zero only for time values \(\geq\) 48 months).
The code block below shows how I added the second term for each of the three datasets (Gabapentin, Pregabalin, and Total). First I created a dummy knot (\(x_k\); dummy_knot) at 48 months, then I subtracted the knot value (\(x^{(k)} = 48\)) from time_months (\(x\)) to produce x_minus_knot, and finally I multiplied dummy_knot by x_minus_knot to yield the second term of the model, namely, time_months2.
data_long <- data_long %>%
    mutate(data_2 = map(.x = data,
~ .x %>%
# Create a dummy knot
mutate(dummy_knot = ifelse(time_months < 48,
yes = 0,
no = 1)) %>%
# Subtract knot value from x
mutate(x_minus_knot = time_months - 48) %>%
# Calculate time_months2
mutate(time_months2 = dummy_knot * x_minus_knot)))
# Top-and-tail GABAPENTIN to check what the data looks like
head(data_long$data_2[[1]])
## # A tibble: 6 x 6
## date time_months values dummy_knot x_minus_knot time_months2
## <chr> <int> <int> <dbl> <dbl> <dbl>
## 1 2015-05-01 1 450908 0 -47 0
## 2 2015-06-01 2 479752 0 -46 0
## 3 2015-07-01 3 502738 0 -45 0
## 4 2015-08-01 4 460096 0 -44 0
## 5 2015-09-01 5 497057 0 -43 0
## 6 2015-10-01 6 505418 0 -42 0
tail(data_long$data_2[[1]])
## # A tibble: 6 x 6
## date time_months values dummy_knot x_minus_knot time_months2
## <chr> <int> <int> <dbl> <dbl> <dbl>
## 1 2019-11-01 55 602927 1 7 7
## 2 2019-12-01 56 618615 1 8 8
## 3 2020-01-01 57 629109 1 9 9
## 4 2020-02-01 58 570291 1 10 10
## 5 2020-03-01 59 642909 1 11 11
## 6 2020-04-01 60 616384 1 12 12
Then I ran the model using the updated data.
data_long <- data_long %>%
    mutate(mod_2 = map(.x = data_2,
~ lm(values ~ time_months + time_months2, data = .x)))
I then generated predicted values from the model and plotted the data.
data_long <- data_long %>%
    # Generate predicted data
mutate(data_3 = map2(.x = data_2,
.y = mod_2,
~ .x %>%
mutate(predicted = predict(.y)))) %>%
# Generate plots
mutate(plot_2 = map2(.x = data_3,
.y = names,
~ ggplot(data = .x) +
aes(x = time_months,
y = values,
colour = dummy_knot,
fill = dummy_knot) +
geom_point(shape = 21,
size = 3) +
geom_line(aes(y = predicted)) +
geom_vline(xintercept = 48,
linetype = 2) +
labs(title = .y,
subtitle = 'Segmented regression (1 knot)',
caption = 'Vertical dashed line: date of reclassification (2019-04-01)',
y = 'Number of prescribed items',
x = 'Date') +
scale_x_continuous(labels = c('May 2015', 'Dec 2016', 'Aug 2018', 'Apr 2020')) +
theme_minimal(base_size = 18) +
theme(legend.title = element_blank(),
legend.position = 'none',
panel.grid = element_blank(),
axis.line = element_line(size = 0.5),
axis.ticks = element_line(size = 0.5))))
# Print plots and lm outputs, and slopes
## GABAPENTIN
### Plot
data_long$plot_2[[1]]
### LM summary
summary(data_long$mod_2[[1]])
##
## Call:
## lm(formula = values ~ time_months + time_months2, data = .x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -76545 -19315 3516 19892 48666
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 491934.6 7687.0 63.996 < 0.0000000000000002 ***
## time_months 2985.4 260.5 11.462 < 0.0000000000000002 ***
## time_months2 -5481.2 1491.8 -3.674 0.000528 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 26810 on 57 degrees of freedom
## Multiple R-squared: 0.7303, Adjusted R-squared: 0.7208
## F-statistic: 77.17 on 2 and 57 DF, p-value: < 0.00000000000000022
### Slopes of individual segments
#### Segment 1
coef(data_long$mod_2[[1]])[[2]]
## [1] 2985.421
#### Segment 2 (the sum of the two regression coefficients)
coef(data_long$mod_2[[1]])[[2]] + coef(data_long$mod_2[[1]])[[3]]
## [1] -2495.8
## PREGABALIN
### Plot
data_long$plot_2[[2]]
### LM summary
summary(data_long$mod_2[[2]])
##
## Call:
## lm(formula = values ~ time_months + time_months2, data = .x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -59624 -12687 6007 13611 35680
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 389636.5 5876.8 66.300 <0.0000000000000002 ***
## time_months 4779.7 199.1 24.003 <0.0000000000000002 ***
## time_months2 -2807.4 1140.5 -2.462 0.0169 *
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 20500 on 57 degrees of freedom
## Multiple R-squared: 0.9379, Adjusted R-squared: 0.9357
## F-statistic: 430.1 on 2 and 57 DF, p-value: < 0.00000000000000022
### Slopes of individual segments
#### Segment 1
coef(data_long$mod_2[[2]])[[2]]
## [1] 4779.71
#### Segment 2 (the sum of the two regression coefficients)
coef(data_long$mod_2[[2]])[[2]] + coef(data_long$mod_2[[2]])[[3]]
## [1] 1972.273
## TOTAL
### Plot
data_long$plot_2[[3]]
### LM summary
summary(data_long$mod_2[[3]])
##
## Call:
## lm(formula = values ~ time_months + time_months2, data = .x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -136169 -37501 12546 32975 75616
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 881571.1 13212.1 66.724 < 0.0000000000000002 ***
## time_months 7765.1 447.7 17.345 < 0.0000000000000002 ***
## time_months2 -8288.7 2564.0 -3.233 0.00204 **
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 46080 on 57 degrees of freedom
## Multiple R-squared: 0.8771, Adjusted R-squared: 0.8727
## F-statistic: 203.3 on 2 and 57 DF, p-value: < 0.00000000000000022
### Slopes of individual segments
#### Segment 1
coef(data_long$mod_2[[3]])[[2]]
## [1] 7765.131
#### Segment 2 (the sum of the two regression coefficients)
coef(data_long$mod_2[[3]])[[2]] + coef(data_long$mod_2[[3]])[[3]]
## [1] -523.5277
Use linear splines to specify two segments around a knot
You can do the same thing as manually re-specifying the data into two segments by using the lspline function from the lspline package. The lspline function is designed to "compute the basis of piecewise-linear spline".
The arguments for the function are:
lspline(x, knots = NULL, marginal = FALSE)
Where:
x is a numeric vector (the x values)
knots is a numeric vector of knot positions
marginal is a logical that determines whether the coefficients can be interpreted as slopes of consecutive spline segments (FALSE), or as the change in slope at consecutive knots (TRUE).
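To make this concrete, here is a small toy example (an illustrative sketch, not part of the original analysis) of the basis matrix that lspline() builds around a knot at 4.
# Toy illustration only. With marginal = FALSE each column of the basis maps
# onto one segment, so the lm() coefficients fitted on these columns are the
# slopes of the consecutive segments.
library(lspline)

x_toy <- 1:6
lspline(x = x_toy, knots = 4, marginal = FALSE)

# With marginal = TRUE the second column is instead the 'excess over the knot'
# term (like time_months2 above), so its coefficient is the change in slope.
lspline(x = x_toy, knots = 4, marginal = TRUE)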
# Load package
library(lspline)
data_long <- data_long %>%
    # Generate the model using lspline
mutate(mod_3 = map(.x = data,
~ lm(values ~ lspline(x = time_months,
knots = 48,
marginal = FALSE), data = .x))) %>%
# Get predicted values
mutate(data_4 = map2(.x = data_3,
.y = mod_3,
~ .x %>%
mutate(predicted_spl = predict(.y)))) %>%
# Generate plots
mutate(plot_3 = map2(.x = data_4,
.y = names,
~ ggplot(data = .x) +
aes(x = time_months,
y = values,
colour = dummy_knot,
fill = dummy_knot) +
geom_point(shape = 21,
size = 3) +
geom_line(aes(y = predicted_spl)) +
geom_vline(xintercept = 48,
linetype = 2) +
labs(title = .y,
subtitle = 'Piecewise linear spline (1 knot)',
caption = 'Vertical dashed line: date of reclassification (2019-04-01)',
y = 'Number of prescribed items',
x = 'Date') +
scale_x_continuous(labels = c('May 2015', 'Dec 2016', 'Aug 2018', 'Apr 2020')) +
theme_minimal(base_size = 18) +
theme(legend.title = element_blank(),
legend.position = 'none',
panel.grid = element_blank(),
axis.line = element_line(size = 0.5),
axis.ticks = element_line(size = 0.5))))
# Print plots and lm outputs, and slopes
## GABAPENTIN
### Plot
data_long$plot_3[[1]]
### LM summary
summary(data_long$mod_3[[1]])
##
## Call:
## lm(formula = values ~ lspline(x = time_months, knots = 48, marginal = FALSE),
## data = .x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -76545 -19315 3516 19892 48666
##
## Coefficients:
## Estimate Std. Error
## (Intercept) 491934.6 7687.0
## lspline(x = time_months, knots = 48, marginal = FALSE)1 2985.4 260.5
## lspline(x = time_months, knots = 48, marginal = FALSE)2 -2495.8 1339.7
## t value
## (Intercept) 63.996
## lspline(x = time_months, knots = 48, marginal = FALSE)1 11.462
## lspline(x = time_months, knots = 48, marginal = FALSE)2 -1.863
## Pr(>|t|)
## (Intercept) <0.0000000000000002 ***
## lspline(x = time_months, knots = 48, marginal = FALSE)1 <0.0000000000000002 ***
## lspline(x = time_months, knots = 48, marginal = FALSE)2 0.0676 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 26810 on 57 degrees of freedom
## Multiple R-squared: 0.7303, Adjusted R-squared: 0.7208
## F-statistic: 77.17 on 2 and 57 DF, p-value: < 0.00000000000000022
### Slopes of individual segments
#### Segment 1
coef(data_long$mod_3[[1]])[[2]]
## [1] 2985.421
#### Segment 2
#### (marginal = FALSE, therefore coefficients can be interpreted as slopes of consecutive spline segments)
coef(data_long$mod_3[[1]])[[3]]
## [1] -2495.8
## PREGABALIN
### Plot
data_long$plot_3[[2]]
### LM summary
summary(data_long$mod_3[[2]])
##
## Call:
## lm(formula = values ~ lspline(x = time_months, knots = 48, marginal = FALSE),
## data = .x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -59624 -12687 6007 13611 35680
##
## Coefficients:
## Estimate Std. Error
## (Intercept) 389636.5 5876.8
## lspline(x = time_months, knots = 48, marginal = FALSE)1 4779.7 199.1
## lspline(x = time_months, knots = 48, marginal = FALSE)2 1972.3 1024.2
## t value
## (Intercept) 66.300
## lspline(x = time_months, knots = 48, marginal = FALSE)1 24.003
## lspline(x = time_months, knots = 48, marginal = FALSE)2 1.926
## Pr(>|t|)
## (Intercept) <0.0000000000000002 ***
## lspline(x = time_months, knots = 48, marginal = FALSE)1 <0.0000000000000002 ***
## lspline(x = time_months, knots = 48, marginal = FALSE)2 0.0591 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 20500 on 57 degrees of freedom
## Multiple R-squared: 0.9379, Adjusted R-squared: 0.9357
## F-statistic: 430.1 on 2 and 57 DF, p-value: < 0.00000000000000022
### Slopes of individual segments
#### Segment 1
coef(data_long$mod_3[[2]])[[2]]
## [1] 4779.71
#### Segment 2
#### (marginal = FALSE, therefore coefficients can be interpreted as slopes of consecutive spline segments)
coef(data_long$mod_3[[2]])[[3]]
## [1] 1972.273
## TOTAL
### Plot
data_long$plot_3[[3]]
### LM summary
summary(data_long$mod_3[[3]])
##
## Call:
## lm(formula = values ~ lspline(x = time_months, knots = 48, marginal = FALSE),
## data = .x)
##
## Residuals:
## Min 1Q Median 3Q Max
## -136169 -37501 12546 32975 75616
##
## Coefficients:
## Estimate Std. Error
## (Intercept) 881571.1 13212.1
## lspline(x = time_months, knots = 48, marginal = FALSE)1 7765.1 447.7
## lspline(x = time_months, knots = 48, marginal = FALSE)2 -523.5 2302.6
## t value
## (Intercept) 66.724
## lspline(x = time_months, knots = 48, marginal = FALSE)1 17.345
## lspline(x = time_months, knots = 48, marginal = FALSE)2 -0.227
## Pr(>|t|)
## (Intercept) <0.0000000000000002 ***
## lspline(x = time_months, knots = 48, marginal = FALSE)1 <0.0000000000000002 ***
## lspline(x = time_months, knots = 48, marginal = FALSE)2 0.821
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 46080 on 57 degrees of freedom
## Multiple R-squared: 0.8771, Adjusted R-squared: 0.8727
## F-statistic: 203.3 on 2 and 57 DF, p-value: < 0.00000000000000022
### Slopes of individual segments
#### Segment 1
coef(data_long$mod_3[[3]])[[2]]
## [1] 7765.131
#### Segment 2
#### (marginal = FALSE, therefore coefficients can be interpreted as slopes of consecutive spline segments)
coef(data_long$mod_3[[3]])[[3]]
## [1] -523.5277
Summary
If you go back and compare the plots and slopes generated by the manually specified segments with those generated using lspline, you find that they are identical, which is what you would expect, but using lspline saved quite a bit of data-processing code compared with the manual method.
More important for the question I asked (did the change in prescribing regulations lead to a change in prescribed items?) was whether the simple linear regression fitted the data worse than the piecewise regression models. For assessing goodness-of-fit, I used the adjusted \(R^2\), and for model selection I used the \(AIC\).
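One way to pull these values out of the stored models (a minimal sketch using the mod_1 and mod_2 list-columns created earlier, not the code used to produce the numbers below) is:
# Sketch only: extract adjusted R^2 and AIC for the simple (mod_1) and
# segmented (mod_2) models stored in the nested data frame.
model_fit <- data_long %>%
    ungroup() %>%
    transmute(names,
              adj_r2_simple    = map_dbl(mod_1, ~ summary(.x)$adj.r.squared),
              adj_r2_segmented = map_dbl(mod_2, ~ summary(.x)$adj.r.squared),
              AIC_simple       = map_dbl(mod_1, AIC),
              AIC_segmented    = map_dbl(mod_2, AIC))

model_fit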
For number of gabapentin items prescribed:
adjusted \(R^2_{simple}\) = 0.66 and adjusted \(R^2_{segmented}\) = 0.72
\(AIC_{simple}\) = 1410 and \(AIC_{segmented}\) = 1399
So neither model is a great fit, and it's obvious why when you look at the plots: the data follow a curvilinear path that isn't fully captured by either the simple linear regression or the linear spline. Based on the \(AIC\), the quality of the segmented model was relatively better than that of the simple linear regression model.
For number of pregabalin items prescribed:
adjusted \(R^2_{simple}\) = 0.93 and adjusted \(R^2_{segmented}\) = 0.94
\(AIC_{simple}\) = 1371 and \(AIC_{segmented}\) = 1367
So by looking at the plots and assessing the adjusted \(R^2\) values, you can see that both models are fairly good fits. Based on the \(AIC\), the quality of the segmented model is relatively better than that of the simple linear regression model.
For total number of items prescribed:
adjusted \(R^2_{simple}\) = 0.85 and adjusted \(R^2_{segmented}\) = 0.87
\(AIC_{simple}\) = 1472 and \(AIC_{segmented}\) = 1464
So neither model is a great fit, with the fit lying somewhere between the adjusted \(R^2\) values obtained for pregabalin and those obtained for gabapentin. If you look at the plots, the data follow a curvilinear path, reflecting the influence of the gabapentin data on the total number of prescribed items. Based on the \(AIC\), the quality of the segmented model is relatively better than that of the simple linear regression model.
Based on these results, I feel that the piecewise regression models were marginally better than the models produced by simple linear regression, and that there has been a change in prescribing of gabapentinoids since the introduction of the new regulations in April 2019. However, the changes are not marked.
Thanks to Blair H Smith for giving me the idea to perform these analyses.
For more cool R content, visit: R Weekly
Session information
## R version 4.0.4 (2021-02-15)
## Platform: x86_64-apple-darwin17.0 (64-bit)
## Running under: macOS Catalina 10.15.7
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
##
## Random number generation:
## RNG: Mersenne-Twister
## Normal: Inversion
## Sample: Rounding
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## attached base packages:
## [1] grid stats graphics grDevices utils datasets methods
## [8] base
##
## other attached packages:
## [1] lspline_1.0-0 carrots_0.1.0 geofacet_0.2.0
## [4] survey_4.0 survival_3.2-9 Matrix_1.3-2
## [7] sf_0.9-7 fiftystater_1.0.1 skimr_2.1.3
## [10] magrittr_2.0.1 geojsonio_0.9.4 lubridate_1.7.10
## [13] xml2_1.3.2 sp_1.4-5 leaflet_2.0.4.1
## [16] highcharter_0.8.2 flexdashboard_0.5.2 gganimate_1.0.7
## [19] gdtools_0.2.3 ggiraph_0.7.8 svglite_2.0.0
## [22] pander_0.6.3 knitr_1.31 forcats_0.5.1
## [25] stringr_1.4.0 dplyr_1.0.5 purrr_0.3.4
## [28] readr_1.4.0 tidyr_1.1.3 tibble_3.1.0
## [31] ggplot2_3.3.3 tidyverse_1.3.0
##
## loaded via a namespace (and not attached):
## [1] colorspace_2.0-0 ellipsis_0.3.1 class_7.3-18
## [4] rgdal_1.5-23 base64enc_0.1-3 fs_1.5.0
## [7] rstudioapi_0.13 httpcode_0.3.0 farver_2.1.0
## [10] ggrepel_0.9.1 fansi_0.4.2 splines_4.0.4
## [13] rlist_0.4.6.1 jsonlite_1.7.2 broom_0.7.5
## [16] dbplyr_2.1.0 png_0.1-7 rgeos_0.5-5
## [19] compiler_4.0.4 httr_1.4.2 backports_1.2.1
## [22] assertthat_0.2.1 lazyeval_0.2.2 cli_2.3.1
## [25] tweenr_1.0.1 htmltools_0.5.1.1 prettyunits_1.1.1
## [28] tools_4.0.4 igraph_1.2.6 gtable_0.3.0
## [31] glue_1.4.2 geojson_0.3.4 V8_3.4.0
## [34] Rcpp_1.0.6 imguR_1.0.3 cellranger_1.1.0
## [37] jquerylib_0.1.3 vctrs_0.3.6 crul_1.1.0
## [40] nlme_3.1-152 crosstalk_1.1.1 xfun_0.22
## [43] rvest_1.0.0 lifecycle_1.0.0 jqr_1.2.0
## [46] zoo_1.8-9 scales_1.1.1 hms_1.0.0
## [49] yaml_2.2.1 quantmod_0.4.18 curl_4.3
## [52] gridExtra_2.3 sass_0.3.1 stringi_1.5.3
## [55] highr_0.8 maptools_1.0-2 e1071_1.7-4
## [58] TTR_0.24.2 boot_1.3-27 repr_1.1.3
## [61] rlang_0.4.10 pkgconfig_2.0.3 systemfonts_1.0.1
## [64] geogrid_0.1.1 evaluate_0.14 lattice_0.20-41
## [67] htmlwidgets_1.5.3 labeling_0.4.2 tidyselect_1.1.0
## [70] geojsonsf_2.0.1 R6_2.5.0 magick_2.7.0
## [73] generics_0.1.0 DBI_1.1.1 mgcv_1.8-34
## [76] pillar_1.5.1 haven_2.3.1 foreign_0.8-81
## [79] withr_2.4.1 units_0.7-0 xts_0.12.1
## [82] modelr_0.1.8 crayon_1.4.1 uuid_0.1-4
## [85] KernSmooth_2.23-18 utf8_1.2.1 rmarkdown_2.7
## [88] jpeg_0.1-8.1 progress_1.2.2 rnaturalearth_0.1.0
## [91] readxl_1.3.1 data.table_1.14.0 reprex_1.0.0
## [94] digest_0.6.27 classInt_0.4-3 webshot_0.5.2
## [97] munsell_0.5.0 viridisLite_0.3.0 kableExtra_1.3.4
## [100] bslib_0.2.4 mitools_2.4
1. Torrance N, Veluchamy A, Zhou Y, Fletcher EH, Moir E, Hebert HL, Donnan PT, Watson J, Colvin LA, Smith BH. Trends in gabapentinoid prescribing, co-prescribing of opioids and benzodiazepines, and associated deaths in Scotland. Br J Anaesth [in press]. doi: 10.1016/j.bja.2020.05.017.
2. Rescheduling of Gabapentin and Pregabalin as Schedule 3 Controlled Drugs. NHS England. Accessed: 2020-06-29.