# Mining Heterogeneous Multidimensional Sequential Patterns


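Table 1: An example of a database of patient trajectories.

| Patient | Event | Hospital | Diagnosis | Procedures |
|---|---|---|---|---|
| 1 | 1 | p | 1 | {mp111, mp221} |
| 1 | 2 | p | 1 | {mp222} |
| 1 | 3 | l | 1 | {mp221, mp311} |
| 2 | 1 | n | 1 | {mp111} |
| 2 | 2 | n | 2 | {mp111, mp211} |
| 2 | 3 | l | 1 | {mp222, mp312} |
| 3 | 1 | n | 3 | {mp112, mp211} |
| 3 | 2 | l | 2 | {mp222, mp311} |
| 4 | 1 | p | 2 | {mp112, mp222} |
| 4 | 2 | p | 2 | {mp221, mp312} |
| 4 | 3 | p | 2 | {mp221, mp312} |

Hospital and diagnosis entries are shown by subscript only (p, n, l and 1, 2, 3); their symbol prefixes are not legible in the source.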

Healthcare institution (Hosp):

- h
  - uh
    - uhp
    - uhn
  - gh
    - ghp
    - ghl

Diagnosis (Diag):

- d
  - r
    - r1
    - r2
  - ca
    - ca1
    - ca2
    - ca3

Medical procedures:

- mp
  - mp1
    - mp11
      - mp111
      - mp112
    - mp12
      - mp121
  - mp2
    - mp21
      - mp211
    - mp22
      - mp221
      - mp222
  - mp3
    - mp31
      - mp311
      - mp312
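The three taxonomies above can be held in a single child-to-parent map. The following is a minimal sketch, assuming that parent and child links follow the naming of the nodes (for example, `mp222` under `mp22` under `mp2` under `mp`); the names `PARENT`, `ancestors` and `is_ancestor_or_self` are illustrative and do not come from the paper.

```python
# Child -> parent links for the example taxonomies (roots: h, d, mp).
PARENT = {
    "uh": "h", "gh": "h",
    "uhp": "uh", "uhn": "uh", "ghp": "gh", "ghl": "gh",
    "r": "d", "ca": "d",
    "r1": "r", "r2": "r", "ca1": "ca", "ca2": "ca", "ca3": "ca",
    "mp1": "mp", "mp2": "mp", "mp3": "mp",
    "mp11": "mp1", "mp12": "mp1", "mp21": "mp2", "mp22": "mp2", "mp31": "mp3",
    "mp111": "mp11", "mp112": "mp11", "mp121": "mp12", "mp211": "mp21",
    "mp221": "mp22", "mp222": "mp22", "mp311": "mp31", "mp312": "mp31",
}


def ancestors(item):
    """Return the path from `item` up to its root, `item` included."""
    path = [item]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def is_ancestor_or_self(general, specific):
    """True if `general` is `specific` or lies on its path to the root."""
    return general in ancestors(specific)


print(ancestors("mp222"))  # ['mp222', 'mp22', 'mp2', 'mp']
```

A flat map like this keeps the generalization test to a simple upward walk, which is enough for small taxonomies such as these.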


... which are frequent and most specific. Table 2 shows the set of most specific frequent elementary vectors which are extracted from (FE, ≤).

| id | Elementary vector |
|---|---|
| 1 | (uh, ca, {mp11, mp2}) |
| 2 | (gh, r, {mp22, mp31}) |
| 3 | (h, d, {mp222}) |
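To make the role of these vectors concrete, the sketch below tests whether an elementary vector covers a single event. It assumes the following reading of the order relation, which is not spelled out in this excerpt: a vector covers an event when its hospital and diagnosis are equal to or ancestors of the event's, and each of its procedures is equal to or an ancestor of at least one procedure of the event. The event used in the example is hypothetical, and `generalizes` and `vector_matches` are illustrative names.

```python
# Child -> parent links for the example taxonomies (roots: h, d, mp).
PARENT = {
    "uh": "h", "gh": "h",
    "uhp": "uh", "uhn": "uh", "ghp": "gh", "ghl": "gh",
    "r": "d", "ca": "d",
    "r1": "r", "r2": "r", "ca1": "ca", "ca2": "ca", "ca3": "ca",
    "mp1": "mp", "mp2": "mp", "mp3": "mp",
    "mp11": "mp1", "mp12": "mp1", "mp21": "mp2", "mp22": "mp2", "mp31": "mp3",
    "mp111": "mp11", "mp112": "mp11", "mp121": "mp12", "mp211": "mp21",
    "mp221": "mp22", "mp222": "mp22", "mp311": "mp31", "mp312": "mp31",
}

# Most specific frequent elementary vectors, as listed in Table 2.
MSFEV = {
    1: ("uh", "ca", {"mp11", "mp2"}),
    2: ("gh", "r", {"mp22", "mp31"}),
    3: ("h", "d", {"mp222"}),
}


def generalizes(general, specific):
    """True if `general` equals `specific` or is one of its ancestors."""
    while specific != general and specific in PARENT:
        specific = PARENT[specific]
    return specific == general


def vector_matches(vector, event):
    """Assumed order relation: every component of the vector covers the event."""
    hosp, diag, procs = vector
    e_hosp, e_diag, e_procs = event
    return (
        generalizes(hosp, e_hosp)
        and generalizes(diag, e_diag)
        and all(any(generalizes(p, q) for q in e_procs) for p in procs)
    )


# Hypothetical event, used only to exercise the check.
event = ("uhp", "ca1", {"mp111", "mp221"})
print(sorted(i for i, v in MSFEV.items() if vector_matches(v, event)))  # [1]
```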


| Patients | Trajectories |
|---|---|
| s1 | {(uh, ca, {mp11, mp2})}, {(h, d, {mp222})}, {(gh, r, {mp22, mp31})} |
| s2 | {(uh, ca, {mp11, mp2})}, {(h, d, {mp222}), (gh, r, {mp22, mp31})} |
| s3 | {(uh, ca, {mp11, mp2})}, {(h, d, {mp222}), (gh, r, {mp22, mp31})} |
| s4 | {(uh, ca, {mp11, mp2}), (h, d, {mp222})}, {(gh, r, {mp22, mp31})}, {(gh, r, {mp22, mp31})} |


... in Table 3 by using the identifiers of all most specific frequent elementary vectors (MSFEV) in Table 2.

| Patients | Trajectories |
|---|---|
| s1 | {1}, {3}, {2} |
| s2 | {1}, {2, 3} |
| s3 | {1}, {2, 3} |
| s4 | {1, 3}, {2}, {2} |
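The identifier encoding is a plain lookup. The sketch below rebuilds the identifier form of the four trajectories from their vector form, using the identifiers of Table 2; the names `MSFEV_ID`, `VECTOR_FORM` and `encode` are illustrative.

```python
# Most specific frequent elementary vectors and their identifiers (Table 2).
V1 = ("uh", "ca", frozenset({"mp11", "mp2"}))
V2 = ("gh", "r", frozenset({"mp22", "mp31"}))
V3 = ("h", "d", frozenset({"mp222"}))
MSFEV_ID = {V1: 1, V2: 2, V3: 3}

# Trajectories written as sequences of sets of elementary vectors.
VECTOR_FORM = {
    "s1": [[V1], [V3], [V2]],
    "s2": [[V1], [V3, V2]],
    "s3": [[V1], [V3, V2]],
    "s4": [[V1, V3], [V2], [V2]],
}


def encode(sequence):
    """Replace every vector by its identifier, keeping the itemset order."""
    return [sorted(MSFEV_ID[v] for v in itemset) for itemset in sequence]


encoded = {sid: encode(seq) for sid, seq in VECTOR_FORM.items()}
print(encoded["s2"])  # [[1], [2, 3]]
```

After this step the data is an ordinary sequence database over integer items, so any standard sequential pattern miner can be applied to it.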

We use CloSpan [11] as the sequential pattern mining algorithm to extract sequential patterns from Table 4. Table 5 displays all sequential patterns in their transformed format, together with the frequent patient trajectories obtained by replacing the identifiers with their actual values, with σ = 3.

| Sequential patterns | mds-patterns | Support |
|---|---|---|
| {3} | (h, d, {mp222}) | 4 |
| {1}, {2} | (uh, ca, {mp11, mp2}), (gh, r, {mp22, mp31}) | 4 |
| {1}, {3} | (uh, ca, {mp11, mp2}), (h, d, {mp222}) | 3 |
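The supports reported for these three patterns can be checked by hand or with a few lines of code. The sketch below is not CloSpan; it only counts, for a given pattern, how many identifier sequences contain it as a subsequence (each itemset of the pattern included in a later itemset of the sequence, order preserved). `SEQUENCES`, `contains` and `support` are illustrative names.

```python
# Identifier form of the four patient trajectories.
SEQUENCES = {
    "s1": [{1}, {3}, {2}],
    "s2": [{1}, {2, 3}],
    "s3": [{1}, {2, 3}],
    "s4": [{1, 3}, {2}, {2}],
}


def contains(sequence, pattern):
    """True if `pattern` is a subsequence of `sequence` under itemset inclusion."""
    pos = 0
    for itemset in pattern:
        # Greedily take the earliest itemset that includes the pattern itemset.
        while pos < len(sequence) and not itemset <= sequence[pos]:
            pos += 1
        if pos == len(sequence):
            return False
        pos += 1
    return True


def support(pattern):
    """Number of sequences that contain `pattern`."""
    return sum(contains(seq, pattern) for seq in SEQUENCES.values())


for pattern in ([{3}], [{1}, {2}], [{1}, {3}]):
    print(pattern, support(pattern))
# [{3}] 4
# [{1}, {2}] 4
# [{1}, {3}] 3
```

Sequence s4 does not contain {1}, {3} because identifiers 1 and 3 occur in the same itemset there, which is why that pattern has support 3 rather than 4.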


(h, d, mp)

(uh, d, mp) (h, ca, mp) (h, r, mp) (gh, d, mp) (h, d, {mp1}) (h, d, {mp2}) (h, d, {mp3})

(uhp, d, mp) NonFrequent

(uhn, d, mp) NonFrequent

(uh, ca, mp) (uh, r, mp) NonFrequent

(uh, d, {mp1}) (uh, d, {mp2}) (uh, d, {mp3}) NonFrequent

(uh, ca1, mp) NonFrequent

(uh, ca2, mp) NonFrequent

(uh, ca3, mp) NonFrequent

(uh, ca, {mp1}) (uh, ca, {mp2})

(uh, ca, {mp11}) (uh, ca, {mp12}) NonFrequent

(uh, ca, {mp1, mp2})

(uh, ca, {mp111}) NonFrequent

(uh, ca, {mp112}) NonFrequent

(uh, ca, {mp11, mp2})

(uh, ca, {mp11, mp21}) NonFrequent

(uh, ca, {mp11, mp22}) NonFrequent
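The vectors listed above are reached from (h, d, mp) by making one component one step more specific at a time. The sketch below shows one possible one-step specialization operator. It is an assumption, not the authors' exact procedure: it specializes the hospital or the diagnosis to a child, replaces one procedure by one of its children, or adds a top-level procedure that is not already covered, and it makes no attempt to avoid generating the same vector along several paths. All names are illustrative.

```python
from collections import defaultdict

# Child -> parent links for the example taxonomies (roots: h, d, mp).
PARENT = {
    "uh": "h", "gh": "h",
    "uhp": "uh", "uhn": "uh", "ghp": "gh", "ghl": "gh",
    "r": "d", "ca": "d",
    "r1": "r", "r2": "r", "ca1": "ca", "ca2": "ca", "ca3": "ca",
    "mp1": "mp", "mp2": "mp", "mp3": "mp",
    "mp11": "mp1", "mp12": "mp1", "mp21": "mp2", "mp22": "mp2", "mp31": "mp3",
    "mp111": "mp11", "mp112": "mp11", "mp121": "mp12", "mp211": "mp21",
    "mp221": "mp22", "mp222": "mp22", "mp311": "mp31", "mp312": "mp31",
}
CHILDREN = defaultdict(list)
for child, parent in PARENT.items():
    CHILDREN[parent].append(child)


def generalizes(general, specific):
    """True if `general` equals `specific` or is one of its ancestors."""
    while specific != general and specific in PARENT:
        specific = PARENT[specific]
    return specific == general


def specializations(vector):
    """One-step specializations of (hospital, diagnosis, frozenset of procedures)."""
    hosp, diag, procs = vector
    out = [(c, diag, procs) for c in CHILDREN[hosp]]
    out += [(hosp, c, procs) for c in CHILDREN[diag]]
    for p in procs:
        # Replace one procedure by one of its children.
        out += [(hosp, diag, (procs - {p}) | {c}) for c in CHILDREN[p]]
    if "mp" not in procs:
        # Add a top-level procedure that covers none of the current ones.
        out += [(hosp, diag, procs | {top}) for top in CHILDREN["mp"]
                if not any(generalizes(top, p) for p in procs)]
    return out


root = ("h", "d", frozenset({"mp"}))
print(len(specializations(root)))  # 7
```

Applied to the root, this yields the seven vectors of the second row above. In a levelwise search, a vector marked NonFrequent would not be specialized further, since none of its specializations can be frequent.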


... C341) where he underwent a chest X-ray (code: ZBQK). Then, he was hospitalized in a private clinic in Metz (code: 570023630), for a chemotherapy session (code: Z511) where he had a chest X-ray and pneumonectomy (code: GFFA).

| Patients | Trajectories |
|---|---|
| Patient1 | (540002078, C341, {ZBQK}), (570023630, Z511, {ZBQK, GFFA}) |
| Patient2 | (100000017, C770, {ZBQK}), (210780581, C770, {ZZQK, YYYY}) |
| Patient3 | (210780110, H259, {YYYY}), (210780110, H259, {ZZQK}) |


[Figure: number of patterns (scale 0 to 100,000) per support interval in %, from [100,90[ to [30,20], broken down by pattern length (1 to 7).]

... neoplasm of bronchus and lung”. In the branch of “factors influencing health status and contact with health services”, items of depth 4 (“chemotherapy session for neoplasm”) have been extracted. Children of “Malignant neoplasm of bronchus and lung” are not frequent enough to be extracted, but “chemotherapy session” appears in a sufficient proportion of trajectories to be seen. Such results cannot be obtained by representing items at an arbitrary pre-determined level.

| ICD10 level | Diagnosis taxonomy |
|---|---|
| 0 | Root |
| 1 | Neoplasms |
| 2 | Malignant neoplasms of respiratory and intrathoracic organs (C30–C39) |
| 3 | Malignant neoplasm of bronchus and lung (C34) |
| 1 | Factors influencing health status and contact with health services |
| 2 | Persons encountering health services for specific procedures and health care (Z40–Z54) |
| 3 | Other medical care (Z51) |
| 4 | Chemotherapy session for neoplasm (Z511) |
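The limitation of a fixed level can be seen on the two branches above. The sketch below generalizes raw diagnosis codes to a chosen depth; it assumes that C341, the code used in the patient example, sits directly under C34 at depth 4, as the ICD10 code structure suggests. `PATHS` and `at_level` are illustrative names.

```python
# Root-to-code paths for two raw diagnosis codes, following the levels above.
PATHS = {
    # Assumption: C341 is a depth-4 child of C34.
    "C341": ["Root", "Neoplasms", "C30-C39", "C34", "C341"],
    "Z511": [
        "Root",
        "Factors influencing health status and contact with health services",
        "Z40-Z54",
        "Z51",
        "Z511",
    ],
}


def at_level(code, level):
    """Generalize `code` to the given depth (or keep it if it is shallower)."""
    path = PATHS[code]
    return path[min(level, len(path) - 1)]


for level in (3, 4):
    print(level, [at_level(code, level) for code in PATHS])
# 3 ['C34', 'Z51']
# 4 ['C341', 'Z511']
```

Fixing the level at 3 loses the depth-4 item Z511, while fixing it at 4 keeps only children of C34, which the text reports as not frequent enough. The mixed result (C34 together with Z511) needs a level chosen per branch.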
