Mathematical Tools for Data Mining

Advanced Information and Knowledge Processing

Dan A. Simovici Chabane Djeraba

Mathematical Tools for Data Mining

Set Theory, Partial Orders, Combinatorics

Second Edition

For further volumes:

http://www.springer.com/series/4738

Advanced Information and Knowledge Processing

Series editors Professor Lakhmi Jain lakhmi.jain@unisa.edu.au Professor Xindong Wu xwu@cs.uvm.edu

Dan A. Simovici MS, MS, Ph.D.

University of Massachusetts Boston

USA

Chabane Djeraba BSc, MSc, Ph.D.

University of Sciences and Technologies of Lille

Villeneuve d’Ascq France

ISSN 1610-3947

ISBN 978-1-4471-6406-7 ISBN 978-1-4471-6407-4 (eBook) DOI 10.1007/978-1-4471-6407-4

Springer London Heidelberg New York Dordrecht

Library of Congress Control Number: 2014933940 Springer-Verlag London 2008, 2014

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work. Duplication of this publication or parts thereof is permitted only under the provisions of the Copyright Law of the Publisher’s location, in its current version, and permission for use must always be obtained from Springer. Permissions for use may be obtained through RightsLink at the Copyright Clearance Center. Violations are liable to prosecution under the respective Copyright Law.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

While the advice and information in this book are believed to be true and accurate at the date of publication, neither the authors nor the editors nor the publisher can accept any legal responsibility for any errors or omissions that may be made. The publisher makes no warranty, express or implied, with respect to the material contained herein.

Printed on acid-free paper

Springer is part of Springer Science+Business Media (www.springer.com)

Preface

The data mining literature contains many excellent titles that address the needs of users with a variety of interests ranging from decision making to pattern investigation in biological data. However, these books do not deal with the mathematical tools that are currently needed by data mining researchers and doctoral students, and we felt that it is timely to produce a new version of our book that integrates the mathematics of data mining with its applications. We emphasize that this book is about mathematical tools for data mining and not about data mining itself; despite this, many substantial applications of mathematical concepts in data mining are included. The book is intended as a reference for the working data miner.

We present several areas of mathematics that, in our opinion, are vital for data mining: set theory, including partially ordered sets and combinatorics; linear algebra, with its many applications in linear algorithms; topology, which is used in understanding and structuring data; and graph theory, which provides a powerful tool for constructing data models.

Our set theory chapter begins with a study of functions and relations. Applications of these fundamental concepts to such issues as equivalences and partitions are discussed. We have also included a précis of universal algebra that covers the needs of subsequent chapters.

Partially ordered sets are important on their own and serve in the study of certain algebraic structures, namely lattices and Boolean algebras. This is continued with a combinatorics chapter that includes such topics as the inclusion–exclusion principle, combinatorics of partitions, counting problems related to collections of sets, and the Vapnik–Chervonenkis dimension of collections of sets.

An introduction to topology and measure theory is followed by a study of the topology of metric spaces, and of various types of generalizations and specializations of the notion of metric. The dimension theory of metric spaces is essential for recent preoccupations of data mining researchers with the applications of fractal theory to data mining.

A variety of applications in data mining are discussed, such as the notion of entropy, presented in a new algebraic framework related to partitions rather than random distributions, level-wise algorithms that generalize the Apriori technique, and generalized measures and their use in the study of frequent item sets.

Linear algebra is present in this new edition with three chapters that treat linear spaces, norms and inner products, and spectral theory. The inclusion of these chapters allowed us to expand our treatment of graph theory and include many new applications.

A final chapter is dedicated to clustering that includes basic types of clustering algorithms, techniques for evaluating cluster quality, and spectral clustering.

The text of this second edition, which appears 7 years after the publication of the first edition, was reorganized, corrected, and substantially amplified.

Each chapter ends with suggestions for further reading. Over 700 exercises and supplements are included; they form an integral part of the material. Some of the exercises are in reality supplemental material. For these, we include solutions.

The mathematics required for making the best use of our book is a typical three-semester sequence in calculus.

Boston, January 2014 Dan A. Simovici

Villeneuve d’Ascq Chabane Djeraba

Contents

1 Sets, Relations, and Functions
1.1 Introduction
1.2 Sets and Collections
1.3 Relations and Functions
1.3.1 Cartesian Products of Sets
1.3.2 Relations
1.3.3 Functions
1.3.4 Finite and Infinite Sets
1.3.5 Generalized Set Products and Sequences
1.3.6 Equivalence Relations
1.3.7 Partitions and Covers
1.4 Countable Sets
1.5 Multisets
1.6 Operations and Algebras
1.7 Morphisms, Congruences, and Subalgebras
1.8 Closure and Interior Systems
1.9 Dissimilarities and Metrics
1.10 Rough Sets
1.11 Closure Operators and Rough Sets
References

2 Partially Ordered Sets
2.1 Introduction
2.2 Partial Orders
2.3 The Poset of Real Numbers
2.4 Chains and Antichains
2.5 Poset Product
2.6 Functions and Posets
2.7 The Poset of Equivalences and the Poset of Partitions
2.8 Posets and Zorn’s Lemma
References

3 Combinatorics
3.1 Introduction
3.2 Permutations
3.3 The Power Set of a Finite Set
3.4 The Inclusion–Exclusion Principle
3.5 Locally Finite Posets and Möbius Functions
3.6 Ramsey’s Theorem
3.7 Combinatorics of Partitions
3.8 Combinatorics of Collections of Sets
3.9 The Vapnik-Chervonenkis Dimension
3.10 The Sauer–Shelah Theorem
References

4 Topologies and Measures
4.1 Introduction
4.2 Topologies
4.3 Closure and Interior Operators in Topological Spaces
4.4 Bases
4.5 Compactness
4.6 Continuous Functions
4.7 Connected Topological Spaces
4.8 Separation Hierarchy of Topological Spaces
4.9 Products of Topological Spaces
4.10 Fields of Sets
4.11 Measures
References

5 Linear Spaces
5.1 Introduction
5.2 Linear Mappings
5.3 Matrices
5.4 Rank
5.5 Multilinear Forms
5.6 Linear Systems
5.7 Determinants
5.8 Partitioned Matrices and Determinants
5.9 The Kronecker and Hadamard products
5.10 Topological Linear Spaces
References

6 Norms and Inner Products
6.1 Introduction
6.2 Inequalities on Linear Spaces
6.3 Norms on Linear Spaces
6.4 Inner Products
6.5 Orthogonality
6.6 Unitary and Orthogonal Matrices
6.7 The Topology of Normed Linear Spaces
6.8 Norms for Matrices
6.9 Projection on Subspaces
6.10 Positive Definite and Positive Semidefinite Matrices
6.11 The Gram-Schmidt Orthogonalization Algorithm
References

7 Spectral Properties of Matrices
7.1 Introduction
7.2 Eigenvalues and Eigenvectors
7.3 Geometric and Algebraic Multiplicities of Eigenvalues
7.4 Spectra of Special Matrices
7.5 Variational Characterizations of Spectra
7.6 Matrix Norms and Spectral Radii
7.7 Singular Values of Matrices
References

8 Metric Spaces Topologies and Measures
8.1 Introduction
8.2 Metric Space Topologies
8.3 Continuous Functions in Metric Spaces
8.4 Separation Properties of Metric Spaces
8.5 Sequences in Metric Spaces
8.5.1 Sequences of Real Numbers
8.6 Completeness of Metric Spaces
8.7 Contractions and Fixed Points
8.7.1 The Hausdorff Metric Hyperspace of Compact Subsets
8.8 Measures in Metric Spaces
8.9 Embeddings of Metric Spaces
References

9 Convex Sets and Convex Functions
9.1 Introduction
9.2 Convex Sets
9.3 Convex Functions
9.3.1 Convexity of One-Argument Functions
9.3.2 Jensen’s Inequality
References

10 Graphs and Matrices
10.1 Introduction
10.2 Graphs and Directed Graphs
10.2.1 Directed Graphs
10.2.2 Graph Connectivity
10.2.3 Variable Adjacency Matrices
10.3 Trees
10.4 Bipartite Graphs
10.5 Digraphs of Matrices
10.6 Spectra of Non-negative Matrices
10.7 Fiedler’s Classes of Matrices
10.8 Flows in Digraphs
10.9 The Ordinary Spectrum of a Graph
References

11 Lattices and Boolean Algebras
11.1 Introduction
11.2 Lattices as Partially Ordered Sets and Algebras
11.3 Special Classes of Lattices
11.4 Complete Lattices
11.5 Boolean Algebras and Boolean Functions
References

12 Applications to Databases and Data Mining
12.1 Introduction
12.2 Relational Databases
12.3 Partitions and Functional Dependencies
12.4 Partition Entropy
12.5 Generalized Measures and Data Mining
12.6 Differential Constraints
12.7 Decision Systems and Decision Trees
12.8 Logical Data Analysis
12.9 Perceptrons
References

13 Frequent Item Sets and Association Rules
13.1 Introduction
13.2 Frequent Item Sets
13.3 Borders of Collections of Sets
13.4 Association Rules
13.5 Levelwise Algorithms and Posets
13.6 Lattices and Frequent Item Sets
References

14 Special Metrics
14.1 Introduction
14.2 Ultrametrics and Ultrametric Spaces
14.2.1 Hierarchies and Ultrametrics
14.2.2 The Poset of Ultrametrics
14.3 Tree Metrics
14.4 Metrics on Collections of Sets
14.5 Metrics on Partitions
14.6 Metrics on Sequences
14.7 Searches in Metric Spaces
References

15 Dimensions of Metric Spaces
15.1 Introduction
15.2 The Euler Functions and the Volume of a Sphere
15.3 The Dimensionality Curse
15.4 Inductive Dimensions of Topological Metric Spaces
15.5 The Covering Dimension
15.6 The Cantor Set
15.7 The Box-Counting Dimension
15.8 The Hausdorff-Besicovitch Dimension
15.9 Similarity Dimension
References

16 Clustering
16.1 Introduction
16.2 Hierarchical Clustering
16.3 The k-Means Algorithm
16.4 The PAM Algorithm
16.5 The Laplacian Spectrum of a Graph
16.5.1 Laplacian Spectra of Special Graphs
16.5.2 Graph Connectivity
16.6 Spectral Clustering Algorithms
16.6.1 Spectral Clustering by Cut Ratio
16.6.2 Spectral Clustering by Normalized Cuts
References

Index

Chapter 1

Sets, Relations, and Functions

1.1 Introduction

In this chapter, dedicated to set-theoretical bases of data mining, we assume that the reader is familiar with the notion of a set, membership of an element in a set, and elementary set theory. After a brief review of set-theoretical operations we discuss collections of sets, ordered pairs, and set products.

Countable and uncountable sets are presented in Sect. 1.4. An introductory section on elementary combinatorics is expanded in Chap. 3.

We present succinctly several algebraic structures to the extent that they are necessary for the material presented in the subsequent chapters. We emphasize notions like operations, morphisms, and congruences that are of interest for the study of any algebraic structure. Finally, we discuss closure and interior systems, topics that have multiple applications in topology, algebra, and data mining.

1.2 Sets and Collections

The membership of x in a set S is denoted by x ∈ S; if x is not a member of the set S, we write x ∉ S.

Throughout this book, we use standardized notations for certain important sets of numbers:

C the set of complex numbers
R the set of real numbers
R≥0 the set of nonnegative real numbers
R>0 the set of positive real numbers
R̂≥0 the set R≥0 ∪ {+∞}
R̂ the set R ∪ {−∞, +∞}
Q the set of rational numbers
I the set of irrational numbers
Z the set of integers
N the set of natural numbers

D. A. Simovici and C. Djeraba, Mathematical Tools for Data Mining, Advanced Information and Knowledge Processing, DOI: 10.1007/978-1-4471-6407-4_1, © Springer-Verlag London 2014

The usual order of real numbers is extended to the set R̂ by −∞ < x < +∞ for every x ∈ R. In addition, we assume that

x + ∞ = ∞ + x = +∞ and x − ∞ = −∞ + x = −∞

for every x ∈ R. Also,

x · ∞ = ∞ · x = +∞ if x > 0, and −∞ if x < 0,

and

x · (−∞) = (−∞) · x = −∞ if x > 0, and +∞ if x < 0.

Note that the product of 0 with either +∞ or −∞ is not defined. Division is extended by x/(+∞) = x/(−∞) = 0 for every x ∈ R.
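As an informal illustration, IEEE 754 floating-point arithmetic follows the same conventions for infinite operands. The sketch below uses Python's `math.inf` as a stand-in for +∞; the variable names are illustrative only, and floating-point numbers merely approximate the reals.

```python
import math

INF = math.inf  # stands in for +∞; -INF stands in for −∞

x = 2.5
sums = (x + INF, x - INF)        # x + ∞ and x − ∞
products = (x * INF, -x * INF)   # positive and negative x times +∞
quotients = (x / INF, x / -INF)  # division by ±∞ gives zero
undefined = 0 * INF              # IEEE 754 returns NaN, mirroring "not defined"
```

The product of 0 and an infinity evaluates to NaN rather than raising an error, which is the floating-point counterpart of leaving that product undefined.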

If S is a finite set, we denote by |S| the number of elements of S.

Sets may contain other sets as elements. For example, the set

𝒞 = {∅, {0}, {0, 1}, {0, 2}, {1, 2, 3}}

contains the empty set ∅ and {0}, {0, 1}, {0, 2}, {1, 2, 3} as its elements. We refer to such sets as collections of sets or simply collections. In general, we use calligraphic letters 𝒞, 𝒟, . . . to denote collections of sets.

If 𝒞 and 𝒟 are two collections, we say that 𝒞 is included in 𝒟, or that 𝒞 is a subcollection of 𝒟, if every member of 𝒞 is a member of 𝒟. This is denoted by 𝒞 ⊆ 𝒟.

Two collections 𝒞 and 𝒟 are equal if we have both 𝒞 ⊆ 𝒟 and 𝒟 ⊆ 𝒞. This is denoted by 𝒞 = 𝒟.

Definition 1.1 Let 𝒞 be a collection of sets. The union of 𝒞, denoted by ⋃𝒞, is the set defined by

⋃𝒞 = {x | x ∈ S for some S ∈ 𝒞}.

If 𝒞 is a nonempty collection, its intersection is the set ⋂𝒞 given by

⋂𝒞 = {x | x ∈ S for every S ∈ 𝒞}.

If 𝒞 = {S, T}, we have x ∈ ⋃𝒞 if and only if x ∈ S or x ∈ T, and x ∈ ⋂𝒞 if and only if x ∈ S and x ∈ T. The union and the intersection of this two-set collection are denoted by S ∪ T and S ∩ T and are referred to as the union and the intersection of S and T, respectively.
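The union and intersection of a finite collection can be sketched in Python by modeling a collection as a set of `frozenset` objects. The helper names `union_of` and `intersection_of` are hypothetical, chosen only for this illustration.

```python
from functools import reduce

def union_of(collection):
    """Elements that belong to some member of the collection."""
    return frozenset().union(*collection)

def intersection_of(collection):
    """Elements that belong to every member; the collection must be nonempty."""
    members = list(collection)
    if not members:
        raise ValueError("intersection of an empty collection is not defined here")
    return reduce(frozenset.intersection, members)

# A small collection of sets, each member stored as a frozenset.
C = {frozenset({0, 1}), frozenset({0, 2}), frozenset({0, 1, 2, 3})}
```

The explicit error for an empty collection reflects the restriction to nonempty collections in the definition of the intersection.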

We give, without proof, several properties of union and intersection of sets:

1. S ∪ (T ∪ U) = (S ∪ T) ∪ U (associativity of union),
2. S ∪ T = T ∪ S (commutativity of union),
3. S ∪ S = S (idempotency of union),
4. S ∪ ∅ = S,
5. S ∩ (T ∩ U) = (S ∩ T) ∩ U (associativity of intersection),
6. S ∩ T = T ∩ S (commutativity of intersection),
7. S ∩ S = S (idempotency of intersection),
8. S ∩ ∅ = ∅,

for all sets S, T, U.

The associativity of union and intersection allows us to denote unambiguously the union of three sets S, T, U by S ∪ T ∪ U and the intersection of three sets S, T, U by S ∩ T ∩ U.

Definition 1.2 The sets S and T are disjoint if S ∩ T = ∅.

A collection of sets 𝒞 is said to be a collection of pairwise disjoint sets if, for every two distinct sets S and T in 𝒞, S and T are disjoint.

Definition 1.3 Let S and T be two sets. The difference of S and T is the set S − T defined by S − T = {x ∈ S | x ∉ T}.

When the set S is understood from the context, we write T̄ for S − T, and we refer to the set T̄ as the complement of T with respect to S or simply the complement of T. The relationship between set difference and set union and intersection is given in the following theorem.

Theorem 1.4 For every set S and nonempty collection 𝒞 of sets, we have

S − ⋃𝒞 = ⋂{S − C | C ∈ 𝒞} and S − ⋂𝒞 = ⋃{S − C | C ∈ 𝒞}.

Proof We leave the proof of these equalities to the reader.

Corollary 1.5 For any sets S, T, U, we have

S − (T ∪ U) = (S − T) ∩ (S − U) and S − (T ∩ U) = (S − T) ∪ (S − U).

Proof Apply Theorem 1.4 to 𝒞 = {T, U}.

With the notation previously introduced for the complement of a set, the equalities of Corollary 1.5 state that the complement of T ∪ U is T̄ ∩ Ū and the complement of T ∩ U is T̄ ∪ Ū.
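The two equalities relating set difference to the union and intersection of a collection can be spot-checked on a small instance. The sets below are arbitrary examples chosen for this sketch; a check on one instance illustrates the statement but does not prove it.

```python
S = frozenset(range(10))
C = [frozenset({1, 2, 3}), frozenset({2, 3, 4}), frozenset({3, 5})]

big_union = frozenset().union(*C)
big_intersection = frozenset.intersection(*C)

# S minus the union equals the intersection of the differences.
lhs_union = S - big_union
rhs_union = frozenset.intersection(*[S - c for c in C])

# S minus the intersection equals the union of the differences.
lhs_inter = S - big_intersection
rhs_inter = frozenset().union(*[S - c for c in C])
```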

The link between union and intersection is given by the distributivity properties contained in the following theorem.

Theorem 1.6 For any collection of sets 𝒞 and set T, we have

(⋃𝒞) ∩ T = ⋃{C ∩ T | C ∈ 𝒞}.

If 𝒞 is nonempty, we also have

(⋂𝒞) ∪ T = ⋂{C ∪ T | C ∈ 𝒞}.

Proof We prove only the first equality; the proof of the second one is left as an exercise for the reader.

Let x ∈ (⋃𝒞) ∩ T. This means that x ∈ ⋃𝒞 and x ∈ T. There is a set C ∈ 𝒞 such that x ∈ C; hence, x ∈ C ∩ T, which implies x ∈ ⋃{C ∩ T | C ∈ 𝒞}.

Conversely, if x ∈ ⋃{C ∩ T | C ∈ 𝒞}, there exists a member C ∩ T of this collection such that x ∈ C ∩ T, so x ∈ C and x ∈ T. It follows that x ∈ ⋃𝒞, and this, in turn, gives x ∈ (⋃𝒞) ∩ T.

Corollary 1.7 For any sets T, U, V, we have

(U ∪ V) ∩ T = (U ∩ T) ∪ (V ∩ T) and (U ∩ V) ∪ T = (U ∪ T) ∩ (V ∪ T).

Proof The corollary follows immediately by choosing 𝒞 = {U, V} in Theorem 1.6.

Note that if 𝒞 and 𝒟 are two collections such that 𝒞 ⊆ 𝒟, then ⋃𝒞 ⊆ ⋃𝒟 and ⋂𝒟 ⊆ ⋂𝒞.

We initially excluded the empty collection from the definition of the intersection of a collection. However, within the framework of collections of subsets of a given set S, we will extend the previous definition by taking ⋂∅ = S for the empty collection of subsets of S. This is consistent with the fact that ∅ ⊆ 𝒞 implies ⋂𝒞 ⊆ S.

The symmetric difference of sets, denoted by ⊕, is defined by U ⊕ V = (U − V) ∪ (V − U) for all sets U, V.

Theorem 1.8 For all sets U, V, T, we have
(i) U ⊕ U = ∅;
(ii) U ⊕ V = V ⊕ U;
(iii) (U ⊕ V) ⊕ T = U ⊕ (V ⊕ T).

Proof The first two parts of the theorem are direct applications of the definition of ⊕. We leave to the reader the proof of the third part (the associativity of ⊕).
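The three properties of the symmetric difference can be checked exhaustively over a small family of sets. The function `sym_diff` below is a hypothetical helper that follows the definition literally; Python's built-in `^` operator on sets computes the same result.

```python
from itertools import product

def sym_diff(u, v):
    """Symmetric difference computed as (U − V) ∪ (V − U)."""
    return (u - v) | (v - u)

# An arbitrary small family of sets used as test inputs.
family = [frozenset(s) for s in ([], [1], [2], [1, 2], [2, 3], [1, 2, 3])]

checks = all(
    sym_diff(u, u) == frozenset()                                   # part (i)
    and sym_diff(u, v) == sym_diff(v, u)                            # part (ii)
    and sym_diff(sym_diff(u, v), t) == sym_diff(u, sym_diff(v, t))  # part (iii)
    and sym_diff(u, v) == u ^ v                                     # built-in operator
    for u, v, t in product(family, repeat=3)
)
```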

The next theorem allows us to introduce a type of set collection of fundamental importance.

Theorem 1.9 Let {{x, y}, {x}} and {{u, v}, {u}} be two collections such that {{x, y}, {x}} = {{u, v}, {u}}. Then, we have x = u and y = v.

Proof Suppose that {{x, y}, {x}} = {{u, v}, {u}}.

If x = y, the collection {{x, y}, {x}} consists of a single set, {x}, so the collection {{u, v}, {u}} will also consist of a single set. This means that {u, v} = {u}, which implies u = v. Therefore, x = u, which gives the desired conclusion because we also have y = v.

If x ≠ y, then neither of the collections {{x, y}, {x}} and {{u, v}, {u}} is a singleton. However, each of them contains exactly one singleton, namely {x} and {u}, respectively, so x = u. Each of them also contains exactly one two-element set, namely {x, y} and {u, v}, respectively, and these sets must be equal. Since v ∈ {x, y} and v ≠ u = x, we conclude that v = y.

Definition 1.10 An ordered pair is a collection of sets {{x, y}, {x}}.

Theorem 1.9 implies that for an ordered pair {{x, y}, {x}}, x and y are uniquely determined. This justifies the following definition.

Definition 1.11 Let p = {{x, y}, {x}} be an ordered pair. Then x is the first component of p and y is the second component of p.

From now on, an ordered pair {{x, y}, {x}} will be denoted by (x, y). If both x, y ∈ S, we refer to (x, y) as an ordered pair on the set S.
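The set-theoretic encoding of ordered pairs can be mirrored with nested `frozenset` objects. The functions `pair` and `components` are hypothetical names for this sketch; `components` recovers x and y following the two cases of the proof of Theorem 1.9.

```python
def pair(x, y):
    """Encode the ordered pair (x, y) as the collection {{x, y}, {x}}."""
    return frozenset({frozenset({x, y}), frozenset({x})})

def components(p):
    """Recover (first, second) from an encoded pair."""
    sets = sorted(p, key=len)
    (first,) = sets[0]          # the unique singleton {x}
    rest = sets[-1] - sets[0]   # {y} when x != y, empty when x == y
    second = next(iter(rest)) if rest else first
    return first, second
```

When x = y the collection collapses to the single set {x}, so `sets[0]` and `sets[-1]` coincide and both components equal x.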

Definition 1.12 Let 𝒞 and 𝒟 be two collections of sets such that ⋃𝒞 = ⋃𝒟. 𝒟 is a refinement of 𝒞 if, for every D ∈ 𝒟, there exists C ∈ 𝒞 such that D ⊆ C.

This is denoted by 𝒞 ⊑ 𝒟.

Example 1.13 Consider the collections 𝒞 = {(a, ∞) | a ∈ R} and 𝒟 = {(a, b) | a, b ∈ R, a < b}. It is clear that ⋃𝒞 = ⋃𝒟 = R.

Since we have (a, b) ⊆ (a, ∞) for every a, b ∈ R such that a < b, it follows that 𝒟 is a refinement of 𝒞.
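For finite collections, the refinement condition can be tested directly. The following is a minimal sketch; the function name `is_refinement` and the two finite collections are assumptions made for illustration and are unrelated to the interval collections of Example 1.13.

```python
def is_refinement(D, C):
    """True if D refines C: equal unions, and each D-member lies inside some C-member."""
    same_union = frozenset().union(*D) == frozenset().union(*C)
    return same_union and all(any(d <= c for c in C) for d in D)

C = [frozenset({1, 2, 3}), frozenset({3, 4, 5})]
D = [frozenset({1, 2}), frozenset({3}), frozenset({4, 5})]
```

Here `d <= c` is Python's subset test on sets, so the nested `all`/`any` reads exactly like "for every D there exists C".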

Definition 1.14 A collection of sets 𝒞 is hereditary if U ∈ 𝒞 and W ⊆ U imply W ∈ 𝒞.

Example 1.15 Let S be a set. The collection of subsets of S, denoted by 𝒫(S), is a hereditary collection of sets since a subset of a subset T of S is itself a subset of S.

The set of subsets of S that contain k elements is denoted by 𝒫_k(S). Clearly, for every set S, we have 𝒫_0(S) = {∅} because there is only one subset of S that contains 0 elements, namely the empty set. The set of all finite subsets of a set S is denoted by 𝒫_fin(S). It is clear that 𝒫_fin(S) = ⋃{𝒫_k(S) | k ∈ N}.

Example 1.16 If S = {a, b, c}, then 𝒫(S) consists of the following eight sets:

∅, {a}, {b}, {c}, {a, b}, {a, c}, {b, c}, {a, b, c}.

For the empty set, we have 𝒫(∅) = {∅}.
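For a finite set, the subsets of each size and the full power set can be enumerated with `itertools.combinations`. The helper names `subsets_of_size` and `power_set` are hypothetical; the sketch builds the power set as the union of the size-k layers.

```python
from itertools import combinations

def subsets_of_size(s, k):
    """All k-element subsets of the finite set s."""
    return {frozenset(c) for c in combinations(s, k)}

def power_set(s):
    """All subsets of the finite set s, as the union of its size-k layers."""
    items = list(s)
    layers = (subsets_of_size(items, k) for k in range(len(items) + 1))
    return set().union(*layers)
```

Applied to a three-element set this yields eight subsets, and applied to the empty set it yields the one-element collection containing only the empty set.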

Definition 1.17 Let 𝒞 be a collection of sets and let U be a set. The trace of the collection 𝒞 on the set U is the collection 𝒞_U = {U ∩ C | C ∈ 𝒞}.

We conclude this presentation of collections of sets with three more operations on collections of sets.

Definition 1.18 Let 𝒞 and 𝒟 be two collections of sets. The collections 𝒞 ∨ 𝒟, 𝒞 ∧ 𝒟, and 𝒞 − 𝒟 are given by

𝒞 ∨ 𝒟 = {C ∪ D | C ∈ 𝒞 and D ∈ 𝒟},
𝒞 ∧ 𝒟 = {C ∩ D | C ∈ 𝒞 and D ∈ 𝒟},
𝒞 − 𝒟 = {C − D | C ∈ 𝒞 and D ∈ 𝒟}.

Example 1.19 Let 𝒞 and 𝒟 be the collections of sets defined by

𝒞 = {{x}, {y, z}, {x, y}, {x, y, z}},
𝒟 = {{y}, {x, y}, {u, y, z}}.

We have

𝒞 ∨ 𝒟 = {{x, y}, {y, z}, {x, y, z}, {u, y, z}, {u, x, y, z}},
𝒞 ∧ 𝒟 = {∅, {x}, {y}, {x, y}, {y, z}},
𝒞 − 𝒟 = {∅, {x}, {z}, {x, z}},
𝒟 − 𝒞 = {∅, {u}, {x}, {y}, {u, z}, {u, y, z}}.

Unlike “∪” and “∩”, the operations “∨” and “∧” between collections of sets are not idempotent. Indeed, we have, for example,

𝒟 ∨ 𝒟 = {{y}, {x, y}, {u, y, z}, {u, x, y, z}} ≠ 𝒟.

The trace 𝒞_K of a collection 𝒞 on K can be written as 𝒞_K = 𝒞 ∧ {K}.
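The operations on collections from Definition 1.18 translate directly into set comprehensions, which makes it easy to recompute Example 1.19. The function names `join`, `meet`, `diff`, `trace`, and `fs` are assumptions made for this sketch; elements are represented by one-letter strings.

```python
def fs(*items):
    """Shorthand for building a frozenset from its elements."""
    return frozenset(items)

def join(C, D):
    """All unions of a member of C with a member of D."""
    return {c | d for c in C for d in D}

def meet(C, D):
    """All intersections of a member of C with a member of D."""
    return {c & d for c in C for d in D}

def diff(C, D):
    """All differences of a member of C and a member of D."""
    return {c - d for c in C for d in D}

def trace(C, K):
    """Trace of C on K, computed as the meet of C with the collection {K}."""
    return meet(C, {frozenset(K)})

C = {fs("x"), fs("y", "z"), fs("x", "y"), fs("x", "y", "z")}
D = {fs("y"), fs("x", "y"), fs("u", "y", "z")}
```

Because the results are Python sets, duplicate unions or intersections collapse automatically, just as they do in the collections listed in the example.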

1.3 Relations and Functions

This section covers a number of topics that are derived from the notion of relation.

1.3.1 Cartesian Products of Sets

Definition 1.20 Let X and Y be two sets. The Cartesian product of X and Y is the set X × Y, which consists of all pairs (x, y) such that x ∈ X and y ∈ Y.

If either X = ∅ or Y = ∅, then X × Y = ∅.

Example 1.21 Consider the sets X = {a, b, c} and Y = {0, 1}. Their Cartesian product is the set X × Y = {(a, 0), (b, 0), (c, 0), (a, 1), (b, 1), (c, 1)}.

Example 1.22 The Cartesian product R × R consists of all ordered pairs of real numbers (x, y). Geometrically, each such ordered pair corresponds to a point in a plane equipped with a system of coordinates. Namely, the pair (u, v) ∈ R × R is represented by the point P whose x-coordinate is u and y-coordinate is v (see Fig. 1.1).

Fig. 1.1 Cartesian representation of the pair (x, y)

The Cartesian product is distributive over union, intersection, and difference of sets.

Theorem 1.23 If ν is one of ∪, ∩, or −, then for any sets R, S, and T, we have

(R ν S) × T = (R × T) ν (S × T) and T × (R ν S) = (T × R) ν (T × S).

Proof We prove only that (R − S) × T = (R × T) − (S × T). Let (x, y) ∈ (R − S) × T. We have x ∈ R − S and y ∈ T. Therefore, (x, y) ∈ R × T and (x, y) ∉ S × T, which show that (x, y) ∈ (R × T) − (S × T).

Conversely, (x, y) ∈ (R × T) − (S × T) implies x ∈ R and y ∈ T and also (x, y) ∉ S × T. Thus, we have x ∉ S, so (x, y) ∈ (R − S) × T.

It is not difficult to see that if R ⊆ R′ and S ⊆ S′, then R × S ⊆ R′ × S′. We refer to this property as the monotonicity of the Cartesian product with respect to set inclusion.
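The distributivity of the Cartesian product over union, intersection, and difference can be spot-checked with `itertools.product`. The helper `cart` and the three small sets are assumptions for this sketch.

```python
from itertools import product

def cart(X, Y):
    """Cartesian product of two finite sets, as a set of Python tuples."""
    return set(product(X, Y))

R, S, T = {1, 2, 3}, {2, 3, 4}, {"a", "b"}

distributes = (
    cart(R | S, T) == cart(R, T) | cart(S, T)
    and cart(R & S, T) == cart(R, T) & cart(S, T)
    and cart(R - S, T) == cart(R, T) - cart(S, T)
    and cart(T, R - S) == cart(T, R) - cart(T, S)
)
```

Python tuples play the role of ordered pairs here, so the set-theoretic encoding of pairs is not needed for the check.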

1.3.2 Relations

Definition 1.24 A relation is a set of ordered pairs.

If S and T are sets and ρ is a relation such that ρ ⊆ S × T, then we refer to ρ as a relation from S to T.

A relation from S to S is called a relation on S.

𝒫(S × T) is the set of all relations from S to T.

Among the relations from S to T, we distinguish the empty relation ∅ and the full relation S × T.

The identity relation of a set S is the relation ι_S ⊆ S × S defined by ι_S = {(x, x) | x ∈ S}. The full relation on S is θ_S = S × S.

If (x, y) ∈ ρ, we sometimes denote this fact by x ρ y, and we write x ρ̸ y instead of (x, y) ∉ ρ.

Example 1.25 Let S ⊆ R. The relation “less than or equal to” on S is given by {(x, y) | x, y ∈ S and y = x + z for some z ∈ R≥0}.

Example 1.26 Consider the relation ν ⊆ Z × Q given by

ν = {(n, q) | n ∈ Z, q ∈ Q, and n ≤ q < n + 1}.

We have (−3, −2.3) ∈ ν and (2, 2.3) ∈ ν. Clearly, (n, q) ∈ ν if and only if n is the integral part of the rational number q.

Example 1.27 The relation δ is defined by

δ = {(m, n) ∈ N × N | n = km for some k ∈ N}.

We have (m, n) ∈ δ if m divides n evenly.

Note that if S ⊆ T, then ι_S ⊆ ι_T and θ_S ⊆ θ_T.

Definition 1.28 The domain of a relation ρ from S to T is the set

Dom(ρ) = {x ∈ S | (x, y) ∈ ρ for some y ∈ T}.

The range of ρ from S to T is the set

Ran(ρ) = {y ∈ T | (x, y) ∈ ρ for some x ∈ S}.

If ρ is a relation and S and T are sets, then ρ is a relation from S to T if and only if Dom(ρ) ⊆ S and Ran(ρ) ⊆ T. Clearly, ρ is always a relation from Dom(ρ) to Ran(ρ).

If ρ and σ are relations and ρ ⊆ σ, then Dom(ρ) ⊆ Dom(σ) and Ran(ρ) ⊆ Ran(σ).

If ρ and σ are relations, then so are ρ ∪ σ, ρ ∩ σ, and ρ − σ, and in fact if ρ and σ are both relations from S to T, then these relations are also relations from S to T.

Definition 1.29 Let ρ be a relation. The inverse of ρ is the relation ρ⁻¹ given by

ρ⁻¹ = {(y, x) | (x, y) ∈ ρ}.

The proofs of the following simple properties are left to the reader:

(i) Dom(ρ⁻¹) = Ran(ρ),
(ii) Ran(ρ⁻¹) = Dom(ρ),
(iii) if ρ is a relation from A to B, then ρ⁻¹ is a relation from B to A, and
(iv) (ρ⁻¹)⁻¹ = ρ

for every relation ρ. Furthermore, if ρ and σ are two relations such that ρ ⊆ σ, then ρ⁻¹ ⊆ σ⁻¹ (monotonicity of the inverse).
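A finite relation can be modeled as a Python set of tuples, after which the domain, range, and inverse are one-line comprehensions. The names `dom`, `ran`, and `inverse` and the sample relation are hypothetical, introduced only for this sketch.

```python
def dom(rho):
    """First components of the pairs of the relation."""
    return {x for x, _ in rho}

def ran(rho):
    """Second components of the pairs of the relation."""
    return {y for _, y in rho}

def inverse(rho):
    """Relation obtained by swapping the components of every pair."""
    return {(y, x) for x, y in rho}

rho = {(1, "a"), (1, "b"), (2, "b")}
sigma = rho | {(3, "c")}  # a relation that includes rho
```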

Definition 1.30 Let ρ and σ be relations. The product of ρ and σ is the relation ρσ, where ρσ = {(x, z) | for some y, (x, y) ∈ ρ and (y, z) ∈ σ}.

It is easy to see that Dom(ρσ) ⊆ Dom(ρ) and Ran(ρσ) ⊆ Ran(σ). Further, if ρ is a relation from A to B and σ is a relation from B to C, then ρσ is a relation from A to C.

Several properties of the relation product are given in the following theorem.

Theorem 1.31 Let ρ₁, ρ₂, and ρ₃ be relations. We have
(i) ρ₁(ρ₂ρ₃) = (ρ₁ρ₂)ρ₃ (associativity of relation product).
(ii) ρ₁(ρ₂ ∪ ρ₃) = (ρ₁ρ₂) ∪ (ρ₁ρ₃) and (ρ₁ ∪ ρ₂)ρ₃ = (ρ₁ρ₃) ∪ (ρ₂ρ₃) (distributivity of relation product over union).
(iii) (ρ₁ρ₂)⁻¹ = ρ₂⁻¹ρ₁⁻¹.
(iv) If ρ₂ ⊆ ρ₃, then ρ₁ρ₂ ⊆ ρ₁ρ₃ and ρ₂ρ₁ ⊆ ρ₃ρ₁ (monotonicity of relation product).
(v) If S and T are any sets, then ι_S ρ₁ ⊆ ρ₁ and ρ₁ ι_T ⊆ ρ₁. Further, ι_S ρ₁ = ρ₁ if and only if Dom(ρ₁) ⊆ S, and ρ₁ ι_T = ρ₁ if and only if Ran(ρ₁) ⊆ T. (Thus, ρ₁ is a relation from S to T if and only if ι_S ρ₁ = ρ₁ = ρ₁ ι_T.)

Proof We prove (i), (ii), and (iv) and leave the other parts as exercises.

To prove Part (i), let (a, d) ∈ ρ₁(ρ₂ρ₃). There is a b such that (a, b) ∈ ρ₁ and (b, d) ∈ ρ₂ρ₃. This means that there exists c such that (b, c) ∈ ρ₂ and (c, d) ∈ ρ₃. Therefore, we have (a, c) ∈ ρ₁ρ₂, which implies (a, d) ∈ (ρ₁ρ₂)ρ₃. This shows that ρ₁(ρ₂ρ₃) ⊆ (ρ₁ρ₂)ρ₃.

Conversely, let (a, d) ∈ (ρ₁ρ₂)ρ₃. There is a c such that (a, c) ∈ ρ₁ρ₂ and (c, d) ∈ ρ₃. This implies the existence of a b for which (a, b) ∈ ρ₁ and (b, c) ∈ ρ₂. For this b, we have (b, d) ∈ ρ₂ρ₃, which gives (a, d) ∈ ρ₁(ρ₂ρ₃). We have proven the reverse inclusion, (ρ₁ρ₂)ρ₃ ⊆ ρ₁(ρ₂ρ₃), which gives the associativity of relation product.

For Part (ii), let (a, c) ∈ ρ₁(ρ₂ ∪ ρ₃). Then, there is a b such that (a, b) ∈ ρ₁ and (b, c) ∈ ρ₂ or (b, c) ∈ ρ₃. In the first case, we have (a, c) ∈ ρ₁ρ₂; in the second, (a, c) ∈ ρ₁ρ₃. Therefore, we have (a, c) ∈ (ρ₁ρ₂) ∪ (ρ₁ρ₃) in either case, so ρ₁(ρ₂ ∪ ρ₃) ⊆ (ρ₁ρ₂) ∪ (ρ₁ρ₃).

Let (a, c) ∈ (ρ₁ρ₂) ∪ (ρ₁ρ₃). We have either (a, c) ∈ ρ₁ρ₂ or (a, c) ∈ ρ₁ρ₃. In the first case, there is a b such that (a, b) ∈ ρ₁ and (b, c) ∈ ρ₂ ⊆ ρ₂ ∪ ρ₃. Therefore, (a, c) ∈ ρ₁(ρ₂ ∪ ρ₃). The second case is handled similarly. This establishes (ρ₁ρ₂) ∪ (ρ₁ρ₃) ⊆ ρ₁(ρ₂ ∪ ρ₃).

The other distributivity property has a similar argument.

Finally, for Part (iv), let ρ₂ and ρ₃ be such that ρ₂ ⊆ ρ₃. Since ρ₂ ∪ ρ₃ = ρ₃, we obtain from (ii) that

ρ₁ρ₃ = (ρ₁ρ₂) ∪ (ρ₁ρ₃),

which shows that ρ₁ρ₂ ⊆ ρ₁ρ₃. The second inclusion is proven similarly.
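The relation product and the first three parts of Theorem 1.31 can be illustrated on small finite relations. The function names `compose` and `inverse` and the three sample relations are assumptions made for this sketch; the checks confirm the identities on one instance only.

```python
def compose(rho, sigma):
    """Relation product: pairs (x, z) linked through some middle element y."""
    return {(x, z) for x, y in rho for y2, z in sigma if y == y2}

def inverse(rho):
    """Relation obtained by swapping the components of every pair."""
    return {(y, x) for x, y in rho}

r1 = {(1, 2), (1, 3), (2, 3)}
r2 = {(2, 4), (3, 4), (3, 5)}
r3 = {(4, 6), (5, 6), (3, 7)}

associative = compose(r1, compose(r2, r3)) == compose(compose(r1, r2), r3)
distributive = compose(r1, r2 | r3) == compose(r1, r2) | compose(r1, r3)
inverse_rule = inverse(compose(r1, r2)) == compose(inverse(r2), inverse(r1))
```

Note that the product applies the left factor first, so `compose(rho, sigma)` corresponds to ρσ in the notation of the text.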

Definition 1.32 The n-power of a relation ρ ⊆ S × S is defined inductively by ρ⁰ = ι_S and ρⁿ⁺¹ = ρⁿρ for n ∈ N.

Note that ρ¹ = ρ⁰ρ = ι_S ρ = ρ for any relation ρ.

Example 1.33 Let ρ ⊆ R × R be the relation defined by

ρ = {(x, x + 1) | x ∈ R}.

The zero-th power of ρ is the relation ι_R. The second power of ρ is

ρ² = ρ · ρ = {(x, y) ∈ R × R | (x, z) ∈ ρ and (z, y) ∈ ρ for some z ∈ R}
= {(x, x + 2) | x ∈ R}.

In general, ρⁿ = {(x, x + n) | x ∈ R}.
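The inductive definition of powers can be followed step by step on a finite relation. Since a relation on all real numbers cannot be enumerated, the sketch below uses a finite analogue, the successor relation on {0, ..., 10}; this restriction and the helper names `compose` and `power` are assumptions of the example.

```python
def compose(rho, sigma):
    """Relation product: pairs (x, z) linked through some middle element y."""
    return {(x, z) for x, y in rho for y2, z in sigma if y == y2}

def power(rho, n, S):
    """n-th power of a relation on S: start from the identity, multiply by rho n times."""
    result = {(s, s) for s in S}
    for _ in range(n):
        result = compose(result, rho)
    return result

S = range(11)
rho = {(x, x + 1) for x in range(10)}  # finite analogue of x -> x + 1
```

On this finite set the n-th power consists of the pairs (x, x + n) that still fit inside S, so the relation shrinks as n grows, unlike its counterpart on the reals.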

Definition 1.34 A relation ρ is a function if, for all x, y, z, (x, y) ∈ ρ and (x, z) ∈ ρ imply y = z; ρ is a one-to-one relation if, for all x, x′, and y, (x, y) ∈ ρ and (x′, y) ∈ ρ imply x = x′.

Observe that ∅ is a function (referred to in this context as the empty function) because ∅ satisfies vacuously the defining condition for being a function.

Example 1.35 Let S be a set. The relation ρ ⊆ S × 𝒫(S) given by ρ = {(x, {x}) | x ∈ S} is a function.

Example 1.36 For every set S, the relation ι_S is both a function and a one-to-one relation. The relation ν from Example 1.26 is a one-to-one relation, but it is not a function.

Theorem 1.37 For any relation ρ, ρ is a function if and only if ρ⁻¹ is a one-to-one relation.

Proof Let ρ be a function, and let (y₁, x), (y₂, x) ∈ ρ⁻¹. Definition 1.29 implies that (x, y₁), (x, y₂) ∈ ρ, so y₁ = y₂, so ρ⁻¹ is one-to-one.

Conversely, assume that ρ⁻¹ is one-to-one and let (x, y₁), (x, y₂) ∈ ρ. Applying Definition 1.29, we obtain (y₁, x), (y₂, x) ∈ ρ⁻¹ and, since ρ⁻¹ is one-to-one, we have y₁ = y₂. This shows that ρ is a function.

Example 1.38 We observed that the relation ν introduced in Example 1.26 is one-to-one. Therefore, its inverse ν⁻¹ ⊆ Q × Z is a function. In fact, ν⁻¹ associates to each rational number q its integer part ⌊q⌋.
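The function and one-to-one conditions can be tested on a finite sample of the relation ν. The sketch below is illustrative: the helper names are hypothetical, the four rational numbers are arbitrary, and the sample is built with `math.floor`, relying on the fact stated above that n ≤ q < n + 1 exactly when n is the integer part of q.

```python
from fractions import Fraction
from math import floor

def is_function(rho):
    """No first component is paired with two different second components."""
    return all(y == z for x, y in rho for x2, z in rho if x == x2)

def is_one_to_one(rho):
    """No second component is paired with two different first components."""
    return is_function({(y, x) for x, y in rho})

qs = [Fraction(-23, 10), Fraction(23, 10), Fraction(5, 2), Fraction(3)]
nu = {(floor(q), q) for q in qs}      # finite sample of the relation nu
nu_inverse = {(q, n) for n, q in nu}  # sends each q to its integer part
```

In this sample the integer 2 is related to both 23/10 and 5/2, which is why the sample of ν fails to be a function while its inverse is one.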

Definition 1.39 A relation ρ from S to T is total if Dom(ρ) = S and is onto if Ran(ρ) = T.

Any relation ρ is a total and onto relation from Dom(ρ) to Ran(ρ). If both S and T are nonempty, then S × T is a total and onto relation from S to T.

It is easy to prove that a relation ρ from S to T is a total relation from S to T if and only if ρ⁻¹ is an onto relation from T to S.

If ρ is a relation, then one can determine whether or not ρ is a function or is one-to-one just by looking at the ordered pairs of ρ. Whether ρ is a total or onto relation from A to B depends on what A and B are.

Theorem 1.40 Let ρ and σ be relations.

(i) If ρ and σ are functions, then ρσ is also a function;
(ii) if ρ and σ are one-to-one relations, then ρσ is also a one-to-one relation;
(iii) if ρ is a total relation from R to S and σ is a total relation from S to T, then ρσ is a total relation from R to T;
(iv) if ρ is an onto relation from R to S and σ is an onto relation from S to T, then ρσ is an onto relation from R to T.

Proof To show Part (i), suppose that ρ and σ are both functions and that (x, z₁) and (x, z₂) both belong to ρσ. Then, there exists a y₁ such that (x, y₁) ∈ ρ and (y₁, z₁) ∈ σ, and there exists a y₂ such that (x, y₂) ∈ ρ and (y₂, z₂) ∈ σ. Since ρ is a function, y₁ = y₂, and hence, since σ is a function, z₁ = z₂, as desired.

Part (ii) follows easily from Part (i). Suppose that relations ρ and σ are one-to-one (and hence that ρ⁻¹ and σ⁻¹ are both functions). To show that ρσ is one-to-one, it suffices to show that (ρσ)⁻¹ = σ⁻¹ρ⁻¹ is a function. This follows immediately from Part (i).

We leave the proofs for the last two parts of the theorem to the reader.

The properties of relations defined next allow us to define important classes of relations.

Definition 1.41 Let S be a set and let ρ ⊆ S × S be a relation. The relation ρ is:

(i) reflexive if (s, s) ∈ ρ for every s ∈ S;

(ii) irreflexive if (s, s) ∉ ρ for every s ∈ S;

(iii) symmetric if (s, s′) ∈ ρ implies (s′, s) ∈ ρ for s, s′ ∈ S;

(iv) antisymmetric if (s, s′), (s′, s) ∈ ρ implies s = s′ for s, s′ ∈ S;

(v) asymmetric if (s, s′) ∈ ρ implies (s′, s) ∉ ρ; and

(vi) transitive if (s, s′), (s′, s″) ∈ ρ implies (s, s″) ∈ ρ.

Example 1.42 The relation ι_S is reflexive, symmetric, antisymmetric, and transitive for any set S.

Example 1.43 The relation δ introduced in Example 1.27 is reflexive since n · 1 = n for any n ∈ N.

Suppose that (m, n), (n, m) ∈ δ. There are p, q ∈ N such that mp = n and nq = m.

If n = 0, then this also implies m = 0; hence, m = n. Let us assume that n ≠ 0. The previous equalities imply nqp = n, and since n ≠ 0, we have qp = 1. In view of the fact that both p and q belong to N, we have p = q = 1; hence, m = n, which proves the antisymmetry of δ.

Let (m, n), (n, r) ∈ δ. We can write n = mp and r = nq for some p, q ∈ N, which gives r = mpq. This means that (m, r) ∈ δ, which shows that δ is also transitive.
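The properties of Definition 1.41 can be tested directly on a finite relation. The Python sketch below does so for the divisibility relation restricted to {0, ..., 12}; the restriction to a finite set, the bound 12, and all function names are assumptions made for this illustration, and a finite check does not replace the argument given above.

```python
def is_reflexive(S, rho):
    return all((s, s) in rho for s in S)

def is_irreflexive(S, rho):
    return all((s, s) not in rho for s in S)

def is_symmetric(rho):
    return all((t, s) in rho for (s, t) in rho)

def is_antisymmetric(rho):
    return all(s == t for (s, t) in rho if (t, s) in rho)

def is_asymmetric(rho):
    return all((t, s) not in rho for (s, t) in rho)

def is_transitive(rho):
    return all((s, u) in rho
               for (s, t) in rho for (t2, u) in rho if t == t2)

# Divisibility on a finite initial segment of N: (m, n) is in delta
# when n = m * p for some natural number p. For m, n <= 12 it is
# enough to search p in the same range.
S = range(13)
delta = {(m, n) for m in S for n in S if any(m * p == n for p in S)}

print(is_reflexive(S, delta), is_antisymmetric(delta), is_transitive(delta))
print(is_symmetric(delta), is_asymmetric(delta), is_irreflexive(S, delta))
```

On this finite fragment the first line reports that all three properties hold and the second that none of the others does, in agreement with Example 1.43.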

Definition 1.44 Let S and T be two sets and let ρ ⊆ S × T be a relation.

The image of an element s ∈ S under the relation ρ is the set ρ(s) = {t ∈ T | (s, t) ∈ ρ}.

The preimage of an element t ∈ T under ρ is the set {s ∈ S | (s, t) ∈ ρ}, which equals ρ⁻¹(t), using the previous notation.

The collection of images of S under ρ is

IM_ρ = {ρ(s) | s ∈ S},

while the collection of preimages of T is

PIM_ρ = IM_{ρ⁻¹} = {ρ⁻¹(t) | t ∈ T}.

If 𝒞 and 𝒞′ are two collections of subsets of S and T, respectively, and 𝒞′ = IM_ρ and 𝒞 = PIM_ρ for some relation ρ ⊆ S × T, we refer to 𝒞′ as the dual class relative to ρ of 𝒞.

Example 1.45 Any collection 𝒞 of subsets of S can be regarded as the collection of preimages under a suitable relation, that is, as the collection of images under its inverse. Indeed, define the relation ρ ⊆ S × 𝒞 as ρ = {(s, C) | s ∈ S, C ∈ 𝒞, and s ∈ C}. Then, IM_ρ consists of all members of P(𝒞) of the form ρ(s) = {C ∈ 𝒞 | s ∈ C} for s ∈ S. It is easy to see that PIM_ρ = 𝒞, since ρ⁻¹(C) = C for every C ∈ 𝒞.

The collection IM_ρ defined in this example is referred to as the bi-dual collection of 𝒞.
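Images, preimages, and the construction of Example 1.45 can be computed explicitly for a small collection. In the Python sketch below, the set S, the collection coll, and the function names image, preimage, IM, and PIM are assumptions chosen for illustration; frozenset is used so that sets can themselves be members of sets.

```python
def image(rho, s):
    """rho(s): all t with (s, t) in rho."""
    return frozenset(t for (a, t) in rho if a == s)

def preimage(rho, t):
    """All s with (s, t) in rho."""
    return frozenset(s for (s, b) in rho if b == t)

def IM(rho, S):
    """Collection of images of the elements of S."""
    return {image(rho, s) for s in S}

def PIM(rho, T):
    """Collection of preimages of the elements of T."""
    return {preimage(rho, t) for t in T}

S = {1, 2, 3}
coll = {frozenset({1, 2}), frozenset({2, 3})}

# Membership relation between elements of S and sets of the collection.
rho = {(s, C) for s in S for C in coll if s in C}

print(PIM(rho, coll) == coll)   # each set is recovered as its own preimage
print(image(rho, 2) == coll)    # the element 2 belongs to both sets
```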

1.3.3 Functions

We saw that a function is a relation ρ such that, for every x in Dom(ρ), there is only one y such that (x, y) ∈ ρ. In other words, a function assigns a unique value to each member of its domain.

From now on, we will use the letters f, g, h, and k to denote functions, and we will denote the identity relation ι_S, which we have already remarked is a function, by 1_S.

If f is a function, then, for each x in Dom(f), we let f(x) denote the unique y with (x, y) ∈ f, and we refer to f(x) as the image of x under f.

Definition 1.46 Let S and T be sets. A partial function from S to T is a relation from S to T that is a function.

A total function from S to T (also called a function from S to T or a mapping from S to T) is a partial function from S to T that is a total relation from S to T.

The set of all partial functions from S to T is denoted by S ⇝ T and the set of all total functions from S to T by S −→ T. We have S −→ T ⊆ S ⇝ T for all sets S and T.

The fact that f is a partial function from S to T is indicated by writing f : S ⇝ T rather than f ∈ S ⇝ T. Similarly, instead of writing f ∈ S −→ T, we use the notation f : S −→ T.

For any sets S and T, we have ∅ ∈ S ⇝ T. If either S or T is empty, then ∅ is the only partial function from S to T. If S = ∅, then the empty function is a total function from S to any T. Thus, for any sets S and T, we have

S ⇝ ∅ = {∅}, ∅ ⇝ T = {∅}, and ∅ −→ T = {∅}.

Furthermore, if S is nonempty, then there can be no (total) function from S to the empty set, so we have S −→ ∅ = ∅ if S ≠ ∅.
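The boundary cases involving the empty set can be confirmed by enumerating all partial and total functions between small finite sets. The Python sketch below represents a partial function as a dict whose keys form its domain; this representation and the names partial_functions and total_functions are assumptions of the illustration.

```python
from itertools import product

_UNDEFINED = object()  # marker for "no value assigned to this argument"

def partial_functions(S, T):
    """All partial functions from S to T, each as a dict."""
    S, T = list(S), list(T)
    result = []
    for choice in product([_UNDEFINED] + T, repeat=len(S)):
        result.append({s: t for s, t in zip(S, choice)
                       if t is not _UNDEFINED})
    return result

def total_functions(S, T):
    """Partial functions from S to T whose domain is all of S."""
    S = list(S)
    return [f for f in partial_functions(S, T) if len(f) == len(S)]

print(partial_functions({1, 2}, set()))       # only the empty function
print(partial_functions(set(), {"a", "b"}))   # only the empty function
print(total_functions(set(), {"a", "b"}))     # the empty function is total
print(total_functions({1}, set()))            # no total function exists
print(len(partial_functions({1, 2}, {"a", "b"})),
      len(total_functions({1, 2}, {"a", "b"})))
```

For finite sets the enumeration yields (|T| + 1)^|S| partial functions and |T|^|S| total ones, since each argument is either left undefined or sent to one element of T.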

Definition 1.47 A one-to-one function is called an injection.

A function f : S ⇝ T is called a surjection (from S to T) if f is an onto relation from S to T, and it is called a bijection (from S to T) or a one-to-one correspondence between S and T if it is total, an injection, and a surjection.

Using our notation for functions, we can restate the definition of injection as follows: f is an injection if for all s, s′ ∈ Dom(f), f(s) = f(s′) implies s = s′. Likewise, f : S ⇝ T is a surjection if for every t ∈ T there is an s ∈ S with f(s) = t.

Example 1.48 Let S and T be two sets and assume that S ⊆ T. The containment mapping c : S −→ T defined by c(s) = s for s ∈ S is an injection. We denote such a containment by c : S ↪ T.

Example 1.49 Let m ∈ N be a natural number, m ≥ 2. Consider the function r_m : N −→ {0, …, m − 1}, where r_m(n) is the remainder when n is divided by m.

Obviously, r_m is well-defined since the remainder p when a natural number is divided by m satisfies 0 ≤ p ≤ m − 1. The function r_m is onto because of the fact that, for any p ∈ {0, …, m − 1}, we have r_m(km + p) = p for any k ∈ N.

For instance, if m = 4, we have r_4(0) = r_4(4) = r_4(8) = ··· = 0, r_4(1) = r_4(5) = r_4(9) = ··· = 1, r_4(2) = r_4(6) = r_4(10) = ··· = 2, and r_4(3) = r_4(7) = r_4(11) = ··· = 3.
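The remainder function of Example 1.49 is easy to experiment with. The Python sketch below builds r_m for a given m and groups the first few natural numbers by remainder; the names make_r and classes, and the cutoff of 12 values, are assumptions made for illustration.

```python
def make_r(m):
    """Return the remainder function r_m for a natural number m >= 2."""
    if m < 2:
        raise ValueError("m must be at least 2")
    return lambda n: n % m

r4 = make_r(4)

# Group 0, ..., 11 by their remainder modulo 4.
classes = {}
for n in range(12):
    classes.setdefault(r4(n), []).append(n)

print(classes)
# Every value in {0, 1, 2, 3} is attained, so r4 is onto that set.
print(set(classes) == set(range(4)))
```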

Example 1.50 Let P_fin(N) be the set of finite subsets of N. Define the function φ : P_fin(N) −→ N as

\[
\phi(K) =
\begin{cases}
0 & \text{if } K = \emptyset, \\
\sum_{i=1}^{p} 2^{n_i} & \text{if } K = \{n_1, \ldots, n_p\}.
\end{cases}
\]

It is easy to see that φ is a bijection.
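The bijection of Example 1.50 amounts to reading a finite set of naturals as the positions of the 1 bits in a binary numeral. The Python sketch below implements φ and a candidate inverse and checks on an initial segment of N that they undo each other; the names phi and phi_inverse and the bound 256 are assumptions of this illustration.

```python
def phi(K):
    """Map a finite set of naturals to the sum of 2**n over n in K."""
    return sum(2 ** n for n in K)

def phi_inverse(v):
    """Recover the set of bit positions that are 1 in the natural number v."""
    return frozenset(i for i in range(v.bit_length()) if (v >> i) & 1)

print(phi(set()))           # 0
print(phi({0, 2, 3}))       # 1 + 4 + 8 = 13
print(sorted(phi_inverse(13)))

# Round trip on 0, ..., 255: phi(phi_inverse(v)) == v for every v.
print(all(phi(phi_inverse(v)) == v for v in range(256)))
```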

Since a function is a relation, the ideas introduced in the previous section for relations in general can equally well be applied to functions. In particular, we can consider the inverse of a function and the product of two functions.

If f is a function, then, by Theorem 1.37, f⁻¹ is a one-to-one relation; however, f⁻¹ is not necessarily a function. In fact, by the same theorem, if f is a function, then f⁻¹ is a function if and only if f is an injection.

Suppose now that f : S ⇝ T is an injection. Then, f⁻¹ : T ⇝ S is also an injection. Further, f⁻¹ : T ⇝ S is total if and only if f : S ⇝ T is a surjection, and f⁻¹ : T ⇝ S is a surjection if and only if f : S ⇝ T is total. It follows that f : S ⇝ T is a bijection if and only if f⁻¹ : T ⇝ S is a bijection.

If f and g are functions, then we will always use the alternative notation gf instead of the notation fg used for the relation product. We will refer to gf as the composition of f and g rather than the product.

By Theorem 1.40, the composition of two functions is a function. In fact, it follows from the definition of composition that

Dom(gf) = {s ∈ Dom(f) | f(s) ∈ Dom(g)}

and, for all s ∈ Dom(gf),

gf(s) = g(f(s)).

This explains why we use gf rather than fg. If we used the other notation, the previous equation would become fg(s) = g(f(s)), which is rather confusing.
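The description of Dom(gf) can be made concrete with partial functions on finite sets. In the Python sketch below a partial function is a dict whose keys are its domain; this encoding and the name compose are assumptions made for illustration.

```python
def compose(g, f):
    """Composition gf of partial functions given as dicts.

    The result is defined exactly on those s in Dom(f) for which
    f(s) lies in Dom(g), and sends such an s to g(f(s)).
    """
    return {s: g[f[s]] for s in f if f[s] in g}

f = {1: "a", 2: "b", 3: "c"}
g = {"a": 10, "b": 20}       # g is undefined at "c"

gf = compose(g, f)
print(gf)                    # 3 is dropped because f(3) is not in Dom(g)
print(set(gf) == {s for s in f if f[s] in g})
```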

Definition 1.51 Let f : S −→ T. A left inverse (relative to S and T) for f is a function g : T −→ S such that gf = 1_S. A right inverse (relative to S and T) for f is a function g : T −→ S such that fg = 1_T.

Theorem 1.52 A function f : S −→ T is a surjection if and only if f has a right inverse (relative to S and T).

If S is nonempty, then f is an injection if and only if f has a left inverse (relative to S and T).

Proof Suppose that f : S −→ T is a surjection. Define a function g : T −→ S as follows: For each y ∈ T, let g(y) be some arbitrarily chosen element x ∈ S such that f(x) = y. (Such an x exists because f is surjective.) Then, by definition, f(g(y)) = y for all y ∈ T, so g is a right inverse for f. Conversely, suppose that f has a right inverse g. Let y ∈ T and let x = g(y). Then, we have f(x) = f(g(y)) = 1_T(y) = y. Thus, f is surjective.

To prove the second part, suppose that f : S −→ T is an injection and that S is nonempty. Let x₀ be some fixed element of S. Define a function g : T −→ S as follows: If y ∈ Ran(f), then, since f is an injection, there is a unique element x ∈ S such that f(x) = y. Define g(y) to be this x. If y ∈ T − Ran(f), define g(y) = x₀. Then, it is immediate from the definition of g that, for all x ∈ S, g(f(x)) = x, so g is a left inverse for f. Conversely, suppose that f has a left inverse g. For all x₁, x₂ ∈ S, if f(x₁) = f(x₂), we have x₁ = 1_S(x₁) = g(f(x₁)) = g(f(x₂)) = 1_S(x₂) = x₂. Hence, f is an injection.
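Both constructions in the proof of Theorem 1.52 are effective when S and T are finite. The Python sketch below follows them closely, representing a total function f : S −→ T as a dict whose keys are the elements of S; the names right_inverse and left_inverse are assumptions, and the "arbitrary choice" in the surjective case is resolved by keeping the first argument encountered.

```python
def right_inverse(f, T):
    """For a surjection f onto T, pick one x with f(x) = y for each y in T."""
    g = {}
    for x, y in f.items():
        g.setdefault(y, x)           # keep the first x found for each y
    if set(g) != set(T):
        raise ValueError("f is not a surjection onto T")
    return g

def left_inverse(f, T, x0):
    """For an injection f into T, send f(x) back to x and all other y to x0."""
    if len(set(f.values())) != len(f):
        raise ValueError("f is not an injection")
    g = {y: x0 for y in T}           # default value outside Ran(f)
    g.update({y: x for x, y in f.items()})
    return g

surj = {1: "a", 2: "b", 3: "a"}      # surjection from {1, 2, 3} onto {"a", "b"}
g = right_inverse(surj, {"a", "b"})
print(all(surj[g[y]] == y for y in {"a", "b"}))   # f(g(y)) = y

inj = {1: "a", 2: "c"}               # injection from {1, 2} into {"a", "b", "c"}
h = left_inverse(inj, {"a", "b", "c"}, x0=1)
print(all(h[inj[x]] == x for x in inj))           # g(f(x)) = x
```

Note that the left inverse needs the fixed element x₀ only to assign a value to the points of T outside Ran(f), which is why the nonemptiness of S is required in the theorem.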

Theorem 1.53 Let f : S −→ T. Then, the following statements are equivalent:

(i) f is a bijection;

(ii) there is a function g : T −→ S that is both a left and a right inverse for f;

(iii) f has both a left inverse and a right inverse.

Furthermore, if f is a bijection, then f⁻¹ is the only left inverse that f has, and it is the only right inverse that f has.

Proof (i) implies (ii): If f : S −→ T is a bijection, then f⁻¹ : T −→ S is both a left and a right inverse for f.

(ii) implies (iii): This implication is obvious.

(iii) implies (i): If f has both a left inverse and a right inverse and S ≠ ∅, then it follows immediately from Theorem 1.52 that f is both injective and surjective, so f is a bijection. If S = ∅, then the existence of a left inverse function from T to S implies that T is also empty; this means that f is the empty function, which is a bijection from the empty set to itself.

Finally, suppose that f : S −→ T is a bijection and that g : T −→ S is a left inverse for f. Then, we have

f⁻¹ = 1_S f⁻¹ = (gf) f⁻¹ = g (f f⁻¹) = g 1_T = g.

Thus, f⁻¹ is the unique left inverse for f. A similar proof shows that f⁻¹ is the unique right inverse for f.

To prove that f : S −→ T is a bijection one could prove directly that f is both one-to-one and onto. Theorem 1.53 provides an alternative way. If we can define a function g : T −→ S and show that g is both a left and a right inverse for f, then f is a bijection and g = f⁻¹.

The next definition provides another way of viewing a subset of a set S.

Definition 1.54 Let S be a set. An indicator function over S is a function I : S −→ {0, 1}.

If P is a subset of S, then the indicator function of P (as a subset of S) is the function I_P : S −→ {0, 1} given by

\[
I_P(x) =
\begin{cases}
1 & \text{if } x \in P, \\
0 & \text{otherwise,}
\end{cases}
\]

for every x ∈ S.

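It is easy to see that

\[
\begin{aligned}
I_{P \cap Q}(x) &= I_P(x) \cdot I_Q(x), \\
I_{P \cup Q}(x) &= I_P(x) + I_Q(x) - I_P(x) \cdot I_Q(x), \\
I_{\bar{P}}(x) &= 1 - I_P(x),
\end{aligned}
\]

for every P, Q ⊆ S and x ∈ S.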
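These three identities can be verified pointwise on any finite example. The Python sketch below does this for given sets S, P, and Q; the names indicator and identities_hold, as well as the sample sets, are assumptions of the illustration.

```python
def indicator(P):
    """Indicator function of the set P."""
    return lambda x: 1 if x in P else 0

def identities_hold(S, P, Q):
    """Check the intersection, union, and complement identities on S."""
    I_P, I_Q = indicator(P), indicator(Q)
    I_and = indicator(P & Q)
    I_or = indicator(P | Q)
    I_not = indicator(S - P)         # complement of P relative to S
    return all(
        I_and(x) == I_P(x) * I_Q(x)
        and I_or(x) == I_P(x) + I_Q(x) - I_P(x) * I_Q(x)
        and I_not(x) == 1 - I_P(x)
        for x in S
    )

print(identities_hold(set(range(10)), {1, 2, 3}, {3, 4, 5}))
```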

The relationship between the subsets of a set and indicator functions defined on that set is discussed next.

Theorem 1.55 There is a bijection Ψ : P(S) −→ (S −→ {0, 1}) between the set of subsets of S and the set of indicator functions defined on S.

Proof For P ∈ P(S), define Ψ(P) = I_P. The mapping Ψ is one-to-one. Indeed, assume that I_P = I_Q, where P, Q ∈ P(S). We have x ∈ P if and only if I_P(x) = 1, which is equivalent to I_Q(x) = 1. This happens if and only if x ∈ Q; hence, P = Q so Ψ is one-to-one.

Let f : S −→ {0, 1} be an arbitrary function. Define the set T_f = {x ∈ S | f(x) = 1}. It is easy to see that f is the indicator function of the set T_f. Hence, Ψ(T_f) = f, which shows that the mapping Ψ is also onto and hence it is a bijection.
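For a finite set S the correspondence of Theorem 1.55 can be enumerated in full. The Python sketch below encodes a function S −→ {0, 1} as a tuple of values in a fixed order of S; this encoding, the three-element set, and the names psi and psi_inverse are assumptions chosen for illustration.

```python
from itertools import combinations, product

S = ("a", "b", "c")                  # a fixed ordering of a small set

def psi(P):
    """Indicator function of P, encoded as a tuple of 0/1 values over S."""
    return tuple(1 if x in P else 0 for x in S)

def psi_inverse(f):
    """The set T_f of all x in S at which f takes the value 1."""
    return frozenset(x for x, v in zip(S, f) if v == 1)

subsets = [frozenset(c) for k in range(len(S) + 1)
           for c in combinations(S, k)]
functions = set(product((0, 1), repeat=len(S)))

print(len(subsets), len(functions))                  # both have 2**3 members
print({psi(P) for P in subsets} == functions)        # psi is onto
print(all(psi_inverse(psi(P)) == P for P in subsets))  # psi is one-to-one
```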

Definition 1.56 A simple function on a set S is a function f : S −→ R that has a finite range.

Simple functions are linear combinations of indicator functions, as we show next.

Theorem 1.57 Let f : S −→ R be a simple function such that Ran(f) = {y_1, …, y_n} ⊆ R. Then,

\[
f = \sum_{i=1}^{n} y_i I_{f^{-1}(y_i)}.
\]

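Proof Let x ∈ S. If f(x) = y_j, then

\[
I_{f^{-1}(y_\ell)}(x) =
\begin{cases}
1 & \text{if } \ell = j, \\
0 & \text{otherwise.}
\end{cases}
\]

Thus,

\[
\left( \sum_{i=1}^{n} y_i I_{f^{-1}(y_i)} \right)(x) = y_j,
\]

which shows that f(x) = \left( \sum_{i=1}^{n} y_i I_{f^{-1}(y_i)} \right)(x).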
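The decomposition of Theorem 1.57 can be carried out mechanically when S is finite. The Python sketch below computes the sets f⁻¹(y_i), rebuilds f as a weighted sum of their indicator functions, and compares the two pointwise; the sample function and the names decompose and recombine are assumptions made for illustration.

```python
def decompose(f, S):
    """Map each value y in Ran(f) to the set of all x in S with f(x) = y."""
    fibers = {}
    for x in S:
        fibers.setdefault(f(x), set()).add(x)
    return fibers

def recombine(fibers):
    """Sum over y of y times the indicator function of the set for y."""
    return lambda x: sum(y * (1 if x in P else 0)
                         for y, P in fibers.items())

S = range(-3, 4)
f = lambda x: (x % 3) * 0.5          # takes only the values 0.0, 0.5, 1.0

fibers = decompose(f, S)
g = recombine(fibers)

print(sorted(fibers))                # the finite range of f
print(all(g(x) == f(x) for x in S))  # the sum reproduces f on S
```

Because the sets f⁻¹(y_i) are pairwise disjoint, exactly one term of the sum is nonzero at each point, which is the observation on which the proof rests.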

Theorem 1.58 Let f_1, …, f_k be k simple functions defined on a set S. If g : R^k −→ R is an arbitrary function, then g(f_1, …, f_k) is a simple function on S and we have
