

In the document Mobility, Data Mining and Privacy (pages 178-181)

Trajectory Database Systems

6.6 Handling Trajectory Compression

As noted in [36], ubiquitous positioning devices are expected to eventually generate an unprecedented data stream of time-stamped positions.

Sooner or later, such enormous volumes of data will lead to storage, transmission, computation, and display challenges. Hence, the need for compression techniques arises. However, existing work in this domain is relatively limited [10, 36, 50, 51], and mainly guided by advances in the fields of line simplification, cartographic generalization, and data series compression. According to [36], the objectives for trajectory data compression are:

– To obtain a lasting reduction in data size

– To obtain a data series that still allows various computations at acceptable (low) complexity

– To obtain a data series with known, small margins of error, which are preferably parametrically adjustable

As a consequence, our interest is with lossy compression techniques, which eliminate some redundant or unnecessary information under well-defined error bounds.

In general, all of the compression algorithms examined in this section deal with the compression of trajectory data in unrestricted spaces.

To the best of our knowledge, the case of compression under network constraints has not yet been examined in the research literature; it is consequently discussed in Sect. 6.7.3. Meratnia and By [36] exploit existing algorithms used in

Fig. 6.13 Top–down Douglas–Peucker algorithm used for trajectory compression. The original trajectory is presented with dotted lines and the compressed trajectory with a solid line [36]

Fig. 6.14 The synchronous Euclidean distance (SED): the distance is calculated between the point under examination, Pi(xi, yi, ti), and the point Pi′(xi′, yi′, ti), which is determined as the point on the line (Ps, Pe) at the time instance ti [36]

the line generalization field, presenting one top–down and one opening window algorithm, which can be directly applied to spatiotemporal trajectories. The top–down algorithm, named TD-TR, is based on the well-known Douglas–Peucker [18] algorithm (Fig. 6.13) introduced by geographers in cartography. This algorithm calculates the perpendicular distance of each internal point from the line connecting the first and the last point of the polyline (line AB in Fig. 6.13) and finds the point with the greatest perpendicular distance (point C). It then creates lines AC and CB and recursively checks these new lines against the remaining points with the same method. When the distance of all remaining points from the currently examined line is less than a given threshold (e.g., all the points following C against line CB in Fig. 6.13), the algorithm stops and returns this line segment as part of the new, compressed polyline. Since trajectories are polylines evolving in time, the algorithm presented in [36] replaces the perpendicular distance used in the DP algorithm with the so-called synchronous Euclidean distance (SED), also discussed in [10, 51]: the distance between the currently examined point (Pi in Fig. 6.14) and the point of the line (Ps, Pe) where the moving object would lie, had it been moving along this line, at the time instance ti determined by the point under examination. The time complexity of such an algorithm is O(N log N).

Although the experimental study presented in [36] shows that the TD-TR algorithm is significantly better than the opening window in terms of both quality and compression (since it globally optimizes the compression process), it has the main disadvantage of not being an online algorithm: it cannot be applied directly to trajectory segments at the time they are fed into a spatiotemporal database, since it requires a priori knowledge of the entire moving object trajectory.

172 E. Frentzos et al.

Fig. 6.15 Opening window algorithm used for trajectory compression. Original data points are represented by closed circles [36]

On the contrary, under the previously described conditions of online operation, the opening window (OW) class of algorithms can be easily applied. These algorithms start by anchoring the first trajectory point and attempt to approximate the subsequent data points with one gradually longer segment (Fig. 6.15). As long as all distances of the subsequent data points from the segment are below the distance threshold, the float (the segment's end point) is moved one position up in the data series, i.e., the window opens further. When the threshold is exceeded, one of two strategies can be applied: either the point causing the violation (normal opening window, NOPW) or the point just before it (before opening window, BOPW) becomes the end point of the current segment, as well as the anchor of the next segment. The algorithm carries on until the trajectory's last point, by which time the whole trajectory has been transformed into a linear approximation. In the original OW class of algorithms, each distance is calculated as the perpendicular distance from the point to the segment under examination, while in the OPW-TR algorithm presented in [36] the SED distance is evaluated instead.
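The NOPW strategy with SED can be sketched as below. This is an illustrative reading of the procedure, not the published OPW-TR code; the point format (x, y, t) and names are our own, and bookkeeping details may differ from [36].

```python
# Sketch of an opening-window algorithm (NOPW strategy) using the
# synchronous Euclidean distance (SED).

def sed(p, ps, pe):
    """SED of p = (x, y, t) w.r.t. the time-synchronized position on ps -> pe."""
    (xs, ys, ts), (xe, ye, te), (x, y, t) = ps, pe, p
    r = 0.0 if te == ts else (t - ts) / (te - ts)
    xi, yi = xs + r * (xe - xs), ys + r * (ye - ys)
    return ((x - xi) ** 2 + (y - yi) ** 2) ** 0.5

def opw_tr(points, threshold):
    """NOPW: the point causing the violation closes the current segment
    and becomes the anchor of the next one."""
    if len(points) <= 2:
        return list(points)
    result = [points[0]]
    anchor = 0
    i = anchor + 2                      # float starts two positions past the anchor
    while i < len(points):
        # check every point strictly inside the window (anchor, i)
        violated = any(sed(points[j], points[anchor], points[i]) > threshold
                       for j in range(anchor + 1, i))
        if violated:
            result.append(points[i])    # NOPW: violating point ends the segment
            anchor = i
            i = anchor + 2
        else:
            i += 1                      # open the window further
    if result[-1] != points[-1]:
        result.append(points[-1])       # always keep the trajectory's last point
    return result
```

Note the O(N) work per window extension, which yields the O(N²) overall cost discussed next.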

Although OW algorithms are computationally expensive, since their time complexity is O(N²), they are very popular: they are online algorithms, and they work reasonably well in the presence of noise (though only for relatively short data series). Moreover, the O(N²) complexity concerns the compression of the full data series only; when dealing with each point update, that is, in the online case, the complexity of determining whether each incoming point will be the float or the next anchor is O(N).

Recently, Potamias et al. [51] proposed several techniques based on uniform and spatiotemporal sampling to compress trajectory streams under different memory availability settings: fixed memory, logarithmically or linearly increasing memory, or memory not known in advance. Their major contributions are two compression algorithms, namely STTrace and Thresholds. According to this work, there are two basic requirements when dealing with trajectory streams: the need to process incoming points at high rates, and the need for locally or globally constant allocated memory. To deal with the first requirement, they propose the Thresholds method with O(1) time complexity. This method uses the current object's position, speed, and direction to predict a safe area where the next trajectory point will be located; when this area actually contains the next reported point, that point can be approximated by the current moving point settings. The authors propose two methods for calculating the safe area: the first one, named sample-based safe

Fig. 6.16 Safe area used by the Thresholds algorithm: the joint safe area is the intersection of the SAS and SAT areas

area, is calculated using each object's current position, speed, and direction in every case, regardless of whether the object's current position was or was not eliminated by the heuristic. On the contrary, the second approach, named trajectory-based, calculates the safe area using each object's last recorded position, speed, and direction. Because both approaches demonstrate certain limitations, the safe area employed by the algorithm is calculated as the planar intersection of the sample-based and the trajectory-based one (the SAS and SAT areas, respectively, in Fig. 6.16).

The main advantage of the proposed algorithm over the opening window presented in [36] is its low time complexity; however, although their results would possibly be comparable, the authors do not provide any experimental comparison between the two algorithms in terms of actual execution time, compression rate, and quality.
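The O(1) Thresholds test can be illustrated with a deliberately simplified safe area: a circle around the position dead-reckoned from a sample's speed and heading. The actual algorithm in [51] derives the area from separate speed and direction thresholds and intersects the sample-based and trajectory-based variants (Fig. 6.16); the `radius` parameter and the sample format used here are illustrative stand-ins.

```python
# Simplified sketch of the Thresholds idea: discard an incoming point when it
# falls inside a circular "safe area" around the predicted position.
import math

def predict(sample, t):
    """Dead-reckoned position at time t from sample = (x, y, t0, speed, heading)."""
    x, y, t0, speed, heading = sample
    d = speed * (t - t0)                          # distance travelled since t0
    return (x + d * math.cos(heading), y + d * math.sin(heading))

def in_safe_area(sample, point, radius):
    """O(1) test: the point can be approximated (discarded) if the
    prediction lands within `radius` of it."""
    x, y, t = point
    px, py = predict(sample, t)
    return math.hypot(x - px, y - py) <= radius
```

An object moving as predicted keeps falling inside the safe area, so its stream compresses to sparse samples; a turn or speed change breaks the test and forces a new sample to be recorded.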

The second algorithm proposed in [51] is designed to fulfill the requirement of a preset amount of memory. The proposed algorithm, named STTrace, utilizes a constant amount of memory M for each trajectory. It starts by inserting into the allocated memory the first M recorded positions, along with each position's SED with respect to its predecessor and successor in the sample. As soon as the allocated memory is exhausted and a new point is examined for possible insertion, the sample is searched for the item with the lowest SED, which represents the least possible loss of information if it is discarded. The algorithm then checks whether the inserted point has an SED larger than the minimum one found in the sample and, if so, the currently processed point is inserted into the sample at the expense of the point with the lowest SED. Finally, the SED attributes of the neighboring points of the removed one are recalculated, and a search is triggered in the sample for the new minimum SED. The proposed algorithm can easily be applied in the multiple-trajectory case by simply calculating a global minimum SED over all the trajectories stored inside the allocated memory.
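The STTrace buffer policy can be sketched as below. This is a simplified variant, assuming M ≥ 3: the newest point is always admitted and the minimum-SED interior point is evicted, whereas the original also compares the incoming point's SED against the current minimum before admitting it; recomputing SEDs on demand stands in for the explicit neighbor recalculation.

```python
# Simplified sketch of the STTrace fixed-memory policy: keep at most M points
# per trajectory, evicting the point whose removal loses the least
# information (the interior point with minimum SED).
import math

def sed(p, ps, pe):
    """SED of p = (x, y, t) w.r.t. the time-synchronized position on ps -> pe."""
    (xs, ys, ts), (xe, ye, te), (x, y, t) = ps, pe, p
    r = 0.0 if te == ts else (t - ts) / (te - ts)
    return math.hypot(x - (xs + r * (xe - xs)), y - (ys + r * (ye - ys)))

class STTrace:
    def __init__(self, m):
        self.m = m          # memory budget, in points (assumed >= 3)
        self.buf = []       # retained positions, in time order

    def _argmin_sed(self):
        # interior points only: endpoints lack a predecessor/successor pair
        return min(range(1, len(self.buf) - 1),
                   key=lambda i: sed(self.buf[i], self.buf[i - 1], self.buf[i + 1]))

    def insert(self, p):
        self.buf.append(p)
        if len(self.buf) > self.m:
            # Evict the least informative point; because SEDs are recomputed
            # on demand, the neighbors' SEDs are implicitly refreshed.
            del self.buf[self._argmin_sed()]
```

The multiple-trajectory extension mentioned above would search for the global minimum SED across all buffers instead of within a single one.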

6.7 Open Issues: Roadmap
