• Aucun résultat trouvé

Time Series Data

N/A
N/A
Protected

Academic year: 2022

Partager "Time Series Data"

Copied!
26
0
0

Texte intégral

(1)

Time Series Data

CS 7450 - Information Visualization February 10, 2004

John Stasko

Spring 2004 CS 7450 2

Time Series Data

• Fundamental chronological component to the data set

• Random sample of 4000 graphics from 15 of world’s newspapers and magazines from ’74-’80 found that 75% of graphics published were time series

Tufte, Vol. 1

(2)

Spring 2004 CS 7450 3

Data Sets

• Each data case is likely an event of some kind

• One of the variables can be the date and time of the event

• Ex: sunspot activity, baseball games, medicines taken, cities visited, stock prices, etc.

Meta Level

• Consider multiple stocks being examined

• Is each stock a data case, or is a price on a particular day a case, with the stock name as one of the other variables?

• Confusion between data entity and data cases

(3)

Spring 2004 CS 7450 5

Data Mining

• Data mining domain has techniques for algorithmically examining time series data, looking for patterns, etc.

• Good when objective is known a priori

• But what if not?

Which questions should I be asking?

InfoVis better for that

Spring 2004 CS 7450 6

Tasks

• What kinds of questions do people ask about time series data?

(4)

Spring 2004 CS 7450 7

Time Series User Tasks

• Examples

When was something greatest/least?

Is there a pattern?

Are two series similar?

Do any of the series match a pattern?

Provide simpler, faster access to the series

Standard Presentation

• Present time data as a 2D line graph with time on x-axis and some other variable on y-axis

(5)

Spring 2004 CS 7450 9

Classic Views

Spring 2004 CS 7450 10

Today’s Focus

• Examination of a number of case studies

(6)

Spring 2004 CS 7450 11

Case Study 1

• Calendar visualization

• Present series of events in context of calendar

Tasks

• See commonly available times for group of people

• Show both details and broader context

(7)

Spring 2004 CS 7450 13

Design Ideas

Spring 2004 CS 7450 14

One Solution

Mackinlay, Robertson & DeLine UIST ‘94

Spiral Calendar

(8)

Spring 2004 CS 7450 15

Another View

Time Lattice

x - days y - hours z - people Uses projected shadows on walls

Case Study 2

• Personal histories

Consider a chronological series of events in someone’s life

Present an overview of the events

Examples Medical history

Educational background Criminal history

(9)

Spring 2004 CS 7450 17

Tasks

• Put together complete story

• Garner information for decision-making

• Notice trends

• Gain an overview of the events to grasp the big picture

Spring 2004 CS 7450 18

Design Ideas

(10)

Spring 2004 CS 7450 19

Lifelines Project

Visualize personal history in some domain

Video Demo

Plaisant et al CHI ‘96

Medical Display

(11)

Spring 2004 CS 7450 21

Features

• Different colors for different event types

• Line thickness can correspond to another variable

• Interaction: Clicking on an event produces more details

• Certainly could also incorporate some Spotfire-like dynamic query capabilities

Spring 2004 CS 7450 22

Benefits

• Reduce chances of missing information

• Facilitate spotting trends or anomalies

• Streamline access to details

• Remain simple and tailorable to various applications

(12)

Spring 2004 CS 7450 23

Challenges

• Scalability

• Can multiple records be visualized in parallel (well)?

Case Study 3

• People’s movements in some space

• Situation:

Workers punch in and punch out of a factory

Want to understand the presence patterns over a calendar year

(13)

Spring 2004 CS 7450 25

Design Ideas

Spring 2004 CS 7450 26

One Idea

Good

Typical daily pattern Seasonal trends Bad

Weekly pattern Details

van Wijk & van Selow InfoVis ‘99

(14)

Spring 2004 CS 7450 27

Another Approach

• Cluster analysis

Find two most similar days, make into one new composite

Keep repeating until some preset number left or some condition met

• How can this be visualized?

Display

(15)

Spring 2004 CS 7450 29

Insights

Traditional office hours followed

Most employees present in late morning

Fewer people are present on summer Fridays

Just a few people work holidays

When were the holidays

School vacations occurred May 3-11, Oct 11-19, Dec 21- 31

Many people take off day after holiday

Many people leave at 4pm on December 5

Spring 2004 CS 7450 30

Interaction

• Click on day, see its graph

• Select a day, see similar ones

• Add/remove clusters

(16)

Spring 2004 CS 7450 31

Case Study 4

• Serial, periodic data

• Data with chronological aspect, but repeats and follows a pattern over time

Hinted at in last case study

Using Spirals

• Standard x-y timeline or tabular display is problematic for periodic data

It has endpoints

• Use spiral to help display data

One loop corresponds to one period

• Discuss…

(17)

Spring 2004 CS 7450 33

Basic Spiral Display

One year per loop

Same month on radial bars Quantity represented by size

of blob

Is it as easy to see serial data as periodic data?

Spring 2004 CS 7450 34

Advanced Spiral

Same mapping as previous one

Different foods represented by different colors and drawn at different heights Can you still see serial and periodic attributes?

As with all 3-D, requires navigation

(18)

Spring 2004 CS 7450 35

Compare with Spotfire

Another standard spiral display Color mapped to movie type +/- compared to Spotfire?

Unknown Periods

What if a data set doesn’t have a regular temporal period?

(19)

Spring 2004 CS 7450 37

Case Study 5

• Computer system logs

• Potentially huge amount of data

Tedious to examine the text

• Looking for unusual circumstances, patterns, etc.

Spring 2004 CS 7450 38

MieLog

• System to help computer systems administrators examine log files

• Interesting characteristics

Discuss

Takada & Koike LISA ‘02

(20)

Spring 2004 CS 7450 39

System View

Tag area block for each unique tag, with color representing frequency (blue-high,

red-low)

Time area days, hours, &

frequency histogram (grayscale, white-high)

Message area actual log messages (red – predefined

keywords blue – low

frequency words) Outline area

pixel per character

Another View

Alternate color mappings?

(21)

Spring 2004 CS 7450 41

Interactions

Tag area

Click on tag shows only those messages

Time area

Click on tiles to show those times

Can put line on histogram to filter on values above/below

Outline area

Can filter based on message length

Just highlight messages to show them in text

Message area

Can filter on specific words

Spring 2004 CS 7450 42

Thoughts

• Strengths/weaknesses?

• Other domains in which a similar system could be used?

(22)

Spring 2004 CS 7450 43

Case Study 6

• Most systems focus on visualization and navigation of time series data

• How about querying?

TimeFinder

Can create rectangles that function as matching regions

Grayed region is data envelope that shows extreme values of queries matching criteria

(23)

Spring 2004 CS 7450 45

Limitations

• Can you think of a fundamental limitation of such an approach?

Spring 2004 CS 7450 46

Problematic Example

0 5 10 15 20 25

97.5

98

98.5

99

99.5

100

100.5

101

101.5

0 5 10 15 20 25

0 5 10 15 20 25

A)

C) B)

Hodgkins patients exhibit double spike in temperature…

But that can be with differing amounts of time in-between

(24)

Spring 2004 CS 7450 47

Solution

TimeSearcher

Augmented TimeBox Mechanism

(x1, y1) (x2, y2)

0 5

10

15

20

25

0 5

10 15 20 25

0 5 10 15 20 25

C)

B)

(x1, y1)

(x2, y2)

R TimeBox

A)

Variable Time Timeboxes

Allow time boxes with deltas on each side

TimeSearcher Interface

(25)

Spring 2004 CS 7450 49

Drawing Queries

Query Sketch

You specify a timeline query by drawing a rough pattern for it, the system brings back near

matches User-drawn query

Responses

M. Wattenberg, CHI ‘01

Spring 2004 CS 7450 50

Final Project

• Description has been updated with correct information

• Be forming your teams and thinking about your topics

• Initial deliverable due next Thursday

(26)

Spring 2004 CS 7450 51

Upcoming

• Overview and Detail

Reading Chapter 7

Plaisant, Carr & Shneiderman

• Focus & Context

References

• Spence and CMS books

• All referred to articles

Références

Documents relatifs

tslearn is a general-purpose Python machine learning library for time series that offers tools for pre-processing and feature extraction as well as dedicated models for

Data Augmentation for Time Series Classification using Convolutional Neural Networks.. Arthur Le Guennec, Simon Malinowski,

It is demonstrated how road safety developments of the traffic volume, the number of accidents and the number of fatalities can be linked to the developments

Here with the Paris area as a case study, three products were selected for their comparable spatial resolution of one kilometer, their daily frequency, and a minimum 15 year

Chapter 6 and 7 addressed the characterization and prediction of brain states in Zebrafish from LFP signals. The study involved the development of 1) an automatic seizure

Philippe Esling, Carlos Agon. Time-series data mining.. In almost every scientific field, measurements are performed over time. These observations lead to a collection of organized

Because of this, we pro- pose to use known estimators of extreme value theory, such as the ex- tremogram, which we generalize to time series with the presence of missing

The next theorem proves that the basic scheme meets the AO security notion in the random oracle model, based on the DDH assumption.. T