library(segclust2d)

Both segmentation() and segclust() return objects of class segmentation, for which several functions are available (see below).
There are two types of functions: (1) some are general and show the likelihood for all the different segmentations; (2) others are specific to a given segmentation and require selecting a number of segments and, if applicable, of clusters.
For the functions specific to a given segmentation, if you do not provide the number of segments and of clusters as arguments, the functions automatically select them based on a penalized log-likelihood, as follows:
for outputs of segmentation(), the optimal number of segments is selected with Lavielle’s criterion. Another number of segments may be provided with argument nseg.
for outputs of segclust(), the optimal numbers of clusters and segments are selected with a BIC-based penalized criterion. Other values may be provided with arguments nseg and ncluster. It is recommended to choose the number of clusters manually, based on biological knowledge or careful exploration of the BIC-based penalized likelihood. Once the number of clusters has been chosen (either manually or automatically), it is recommended to select the number of segments using the automatic BIC-based penalized likelihood criterion.
All plot methods use the ggplot2 package and return ggplot objects that can be further modified and customized with the usual ggplot2 functions (see the ggplot2 function reference).

order

If you provide argument order = TRUE to a function specific to a segmentation, the segments or clusters are numbered in the order of the variable provided as order.var in the segmentation() or segclust() call.
For a specific segmentation:
plot.segmentation shows the segmented time-series, and clusters if applicable.
segmap shows the results of the segmentation as a labelled path (if applicable).
stateplot plots summary statistics for all segments or clusters.

Summary for all segmentations:
plot_likelihood, for segmentation(), shows the log-likelihood of the segmentation for all numbers of segments.
plot_BIC, for segclust(), shows the BIC-based penalized log-likelihood of the segmentation/clustering for all numbers of segments and clusters.

For a specific segmentation:
augment returns a data.frame with the original data as well as the segment or cluster associated with each data point.
segment returns a data.frame with the beginning and end of each segment.
states, for segclust(), provides a data.frame with summary statistics for all clusters.

Summary for all segmentations:
logLik, for segmentation(), returns a data.frame with the log-likelihood for all numbers of segments.
BIC, for segclust(), returns a data.frame with the BIC-based penalized log-likelihood for all numbers of clusters and segments.

As the functions for segmentation and segmentation/clustering are very similar, the examples below mostly use segmentation/clustering outputs. The usage is the same for segmentation outputs: the argument ncluster just needs to be omitted.
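As a minimal sketch of the segmentation-only case, the same kind of call can be made without ncluster. The object names mode_seg, seg_auto and seg_5 and the value nseg = 5 are illustrative assumptions, not part of the original analysis; the data preparation is the one used for the segmentation/clustering example.

```r
library(segclust2d)

# Same data preparation as used for the segmentation/clustering example
data(simulmode)
simulmode$abs_spatial_angle <- abs(simulmode$spatial_angle)
simulmode <- simulmode[!is.na(simulmode$abs_spatial_angle), ]

# Segmentation only: there is no ncluster argument
mode_seg <- segmentation(simulmode,
                         Kmax = 20, lmin = 10,
                         seg.var = c("speed", "abs_spatial_angle"))

# Number of segments selected automatically (Lavielle's criterion)
seg_auto <- segment(mode_seg)

# Number of segments imposed by the user (nseg = 5 is an arbitrary choice)
seg_5 <- segment(mode_seg, nseg = 5)
```

Comparing seg_auto and seg_5 is one way to see how much the automatic selection differs from a manually imposed number of segments.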
data(simulmode)
simulmode$abs_spatial_angle <- abs(simulmode$spatial_angle)
simulmode <- simulmode[!is.na(simulmode$abs_spatial_angle), ]
mode_segclust <- segclust(simulmode,
                          Kmax = 20, lmin = 10, ncluster = c(2, 3),
                          seg.var = c("speed", "abs_spatial_angle"),
                          scale.variable = TRUE)

plot.segmentation for segmented time-series

plot(mode_segclust, ncluster = 3)

segmap() plots the results of the segmentation as a labelled path. This can be done only if the data have a geographic meaning. Coordinate names are “x” and “y” by default, but others can be provided through argument coord.names.
segmap(mode_segclust, ncluster = 3)

stateplot() shows statistics for each state or segment.

stateplot(mode_segclust, ncluster = 3)

augment.segmentation() is a method for broom::augment. It returns an augmented data.frame with the outputs of the model - here, the assignment of each data point to a segment or cluster.

augment(mode_segclust, ncluster = 3)

segment() retrieves information on the different segments for a given segmentation. Each segment is associated with the mean and standard deviation of each variable, the state (equivalent to the segment number for segmentation()) and the state ordered according to a variable - by default the first variable given in seg.var. The variable used for ordering states can be specified through the order.var argument of segmentation() and segclust().

segment(mode_segclust, ncluster = 3)

states() returns information on the different states of the segmentation. For segmentation() it is quite similar to segment(). For segclust(), however, it gives the different clusters found and the associated statistics.

states(mode_segclust, ncluster = 3)

logLik.segmentation() returns information on the log-likelihood of the different possible segmentations. It returns a data.frame with the number of segments and the log-likelihood.
data("simulshift")
shift_seg <- segmentation(simulshift,
                          seg.var = c("x", "y"),
                          lmin = 240, Kmax = 25,
                          subsample_by = 60)

logLik(shift_seg)

plot_likelihood() plots the log-likelihood of the segmentation for all the tested numbers of segments.
plot_likelihood(shift_seg)

BIC.segmentation() returns information on the BIC-based penalized log-likelihood of the different possible segmentations. It returns a data.frame with the number of segments, the BIC-based penalized log-likelihood and the number of clusters. It is available for segclust() outputs only. Note that this is not a true BIC: here the highest values are favored, unlike a standard BIC, for which the lowest values are preferred.

BIC(mode_segclust)

plot_BIC() plots the BIC-based penalized log-likelihood of the segmentation for all the tested numbers of segments and clusters.
plot_BIC(mode_segclust)
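
The arguments nseg and order and the ggplot2 customization described earlier can be combined on a segmentation/clustering output. The following is a sketch only: it refits the model so that the block runs on its own, nseg = 10 is an arbitrary illustrative value, and the theme and title are assumptions chosen for the example. It also assumes that plot() accepts order = TRUE in the way described in the order section.

```r
library(segclust2d)
library(ggplot2)

# Refit the segmentation/clustering so that this block runs on its own
data(simulmode)
simulmode$abs_spatial_angle <- abs(simulmode$spatial_angle)
simulmode <- simulmode[!is.na(simulmode$abs_spatial_angle), ]
mode_segclust <- segclust(simulmode,
                          Kmax = 20, lmin = 10, ncluster = c(2, 3),
                          seg.var = c("speed", "abs_spatial_angle"),
                          scale.variable = TRUE)

# Impose both the number of clusters and the number of segments
# (nseg = 10 is an arbitrary value used for illustration)
seg_tab <- segment(mode_segclust, ncluster = 3, nseg = 10)

# Number the states by the ordering variable, then customize the
# returned ggplot object with standard ggplot2 layers
p <- plot(mode_segclust, ncluster = 3, order = TRUE) +
  theme_minimal() +
  ggtitle("Segmentation/clustering with 3 clusters")
p
```

Because the plot methods return ggplot objects, any other ggplot2 layer, scale or theme can be added in the same way.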