Uncertain Data Management
Relational Probabilistic Database Models
Antoine Amarilli1, Silviu Maniu2 December 12th, 2016
1Télécom ParisTech
2LRI
1/48
Table of contents
Probabilistic instances TID
BID pc-tables
Uncertain instances Remember fromlast class:
• Fix a finite set ofpossible tuplesof same arity
• Apossible world: a subset of thepossible tuples
• A (finite)uncertain relation: set ofpossible worlds
U1
date teacher room 04 Silviu C017 04 Antoine C017 04 Antoine C47 11 Silviu C017 11 Silviu C47 11 Antoine C017
U2
date teacher room 04 Silviu C017 04 Antoine C017 04 Antoine C47 11 Silviu C017 11 Silviu C47 11 Antoine C017
Uncertain instances Remember fromlast class:
• Fix a finite set ofpossible tuplesof same arity
• Apossible world: a subset of thepossible tuples
• A (finite)uncertain relation: set ofpossible worlds U1
date teacher room 04 Silviu C017 04 Antoine C017 04 Antoine C47 11 Silviu C017
U2
date teacher room 04 Silviu C017 04 Antoine C017 04 Antoine C47 11 Silviu C017
Probabilistic instances
• SupportU: uncertain relation
• Probability distributionπonU:
• FunctionfromU to reals in[0,1]
• It mustsum upto1:P
I∈Uπ(I) =1
U1
date teacher room 04 Silviu C017 04 Antoine C017 04 Antoine C47 11 Silviu C017 11 Silviu C47 11 Antoine C017
π(U1) =0.8
U2
date teacher room 04 Silviu C017 04 Antoine C017 04 Antoine C47 11 Silviu C017 11 Silviu C47 11 Antoine C017
π(U2) =0.2
Probabilistic instances
• SupportU: uncertain relation
• Probability distributionπonU:
• FunctionfromU to reals in[0,1]
• It mustsum upto1:P
I∈Uπ(I) =1
U1
date teacher room 04 Silviu C017 04 Antoine C017 04 Antoine C47 11 Silviu C017 11 Silviu C47 11 Antoine C017
π(U1) =0.8
U2
date teacher room 04 Silviu C017 04 Antoine C017 04 Antoine C47 11 Silviu C017 11 Silviu C47 11 Antoine C017
π(U2) =0.2
Probabilistic instances
• SupportU: uncertain relation
• Probability distributionπonU:
• FunctionfromU to reals in[0,1]
• It mustsum upto1:P
I∈Uπ(I) =1
U1
date teacher room 04 Silviu C017 04 Antoine C017 04 Antoine C47 11 Silviu C017 11 Silviu C47 11 Antoine C017
π(U1) =0.8
U2
date teacher room 04 Silviu C017 04 Antoine C017 04 Antoine C47 11 Silviu C017 11 Silviu C47 11 Antoine C017
π(U2) =0.2
Probabilistic instances
• SupportU: uncertain relation
• Probability distributionπonU:
• FunctionfromU to reals in[0,1]
• It mustsum upto1:P
I∈Uπ(I) =1
U1
date teacher room 04 Silviu C017 04 Antoine C017 04 Antoine C47 11 Silviu C017
π(U1) =0.8
U2
date teacher room 04 Silviu C017 04 Antoine C017 04 Antoine C47 11 Silviu C017
π(U2) =0.2
Probabilistic instances
• SupportU: uncertain relation
• Probability distributionπonU:
• FunctionfromU to reals in[0,1]
• It mustsum upto1:P
I∈Uπ(I) =1
U1
date teacher room 04 Silviu C017 04 Antoine C017 04 Antoine C47 11 Silviu C017 11 Silviu C47 11 Antoine C017
U2
date teacher room 04 Silviu C017 04 Antoine C017 04 Antoine C47 11 Silviu C017 11 Silviu C47 11 Antoine C017
What aboutNULLs?
Remember that last time we saw:
• Codd-tablesandv-tablesandc-tables, withNULLs
• Boolean c-tables, withNULLsonly in conditions
→ Boolean variables
→ We focus for probabilities on models likeBoolean c-tables
→ Easier to define probabilities on afinitespace!
What aboutNULLs?
Remember that last time we saw:
• Codd-tablesandv-tablesandc-tables, withNULLs
• Boolean c-tables, withNULLsonly in conditions
→ Boolean variables
→ We focus for probabilities on models likeBoolean c-tables
→ Easier to define probabilities on afinitespace!
Relational algebra on uncertain instances Remember fromlast class:
• Extend relational algebra operators touncertain instances
• Thepossible worldsof theresultshould be...
• take allpossible worldsin the supports of the inputs
• apply the operation and get thepossible outputs
U1
04 S. C017 11 S. C47
U2
11 A. C017
∪
V1
V2
11 A. C017
=
04 S. C017 11 S. C47 04 S. C017 11 S. C47 11 A. C017 11 A. C017
Relational algebra on uncertain instances Remember fromlast class:
• Extend relational algebra operators touncertain instances
• Thepossible worldsof theresultshould be...
• take allpossible worldsin the supports of the inputs
• apply the operation and get thepossible outputs
U1
04 S. C017 11 S. C47
U2
11 A. C017
∪
V1
V2
11 A. C017
=
04 S. C017 11 S. C47 04 S. C017 11 S. C47 11 A. C017 11 A. C017
Relational algebra on uncertain instances Remember fromlast class:
• Extend relational algebra operators touncertain instances
• Thepossible worldsof theresultshould be...
• take allpossible worldsin the supports of the inputs
• apply the operation and get thepossible outputs
U1
04 S. C017
11 S. C47
∪
V1
V2
11 A. C017
=
04 S. C017 11 S. C47 04 S. C017 11 S. C47 11 A. C017 11 A. C017
Relational algebra on uncertain instances Remember fromlast class:
• Extend relational algebra operators touncertain instances
• Thepossible worldsof theresultshould be...
• take allpossible worldsin the supports of the inputs
• apply the operation and get thepossible outputs
U1
04 S. C017 11 S. C47
U2
11 A. C017
∪
V1
V2
11 A. C017
=
04 S. C017 11 S. C47 04 S. C017 11 S. C47 11 A. C017 11 A. C017
Relational algebra on uncertain instances Remember fromlast class:
• Extend relational algebra operators touncertain instances
• Thepossible worldsof theresultshould be...
• take allpossible worldsin the supports of the inputs
• apply the operation and get thepossible outputs
U1
04 S. C017
11 S. C47
∪
V1
V
=
04 S. C017 11 S. C47 04 S. C017 11 S. C47 11 A. C017 11 A. C017
Relational algebra on uncertain instances Remember fromlast class:
• Extend relational algebra operators touncertain instances
• Thepossible worldsof theresultshould be...
• take allpossible worldsin the supports of the inputs
• apply the operation and get thepossible outputs
U1
04 S. C017 11 S. C47
U2
11 A. C017
∪
V1
V2
11 A. C017
=
04 S. C017 11 S. C47 04 S. C017 11 S. C47 11 A. C017 11 A. C017
Relational algebra on probabilistic instances
• Let’s adapt relational algebra toprobabilistic instances
• Thepossible worldsof theresultshould be...
• take allpossible worldsof the inputs
• apply the operation and get apossible output
• Theprobabilityof each possible world should be...
• considerall input possible worldsthat give it
• sum up theirprobabilities
Relational algebra on probabilistic instances
• Let’s adapt relational algebra toprobabilistic instances
• Thepossible worldsof theresultshould be...
• take allpossible worldsof the inputs
• apply the operation and get apossible output
• Theprobabilityof each possible world should be...
• considerall input possible worldsthat give it
• sum up theirprobabilities
Relational algebra on probabilistic instances
• Let’s adapt relational algebra toprobabilistic instances
• Thepossible worldsof theresultshould be...
• take allpossible worldsof the inputs
• apply the operation and get apossible output
• Theprobabilityof each possible world should be...
• considerall input possible worldsthat give it
• sum up theirprobabilities
Example of relational algebra on probabilistic instances
U1
04 S. C017 11 S. C47
π(U1) =0.8
U2
11 A. C017
π(U2) =0.2
∪
V1
π(V1) =0.9
V2
11 A. C017
π(V2) =0.1
=
W1
04 S. C017 11 S. C47
π(W1) =0.8·0.9
W2
04 S. C017 11 S. C47 11 A. C017
π(W1) =0.8·0.1
W3
11 A. C017
π(W1) =0.2·0.9 +0.2·0.1
Example of relational algebra on probabilistic instances
U1
04 S. C017 11 S. C47
π(U1) =0.8
U2
11 A. C017
π(U2) =0.2
∪
V1
π(V1) =0.9
V2
11 A. C017
π(V2) =0.1
=
W1
04 S. C017 11 S. C47
π(W1) =0.8·0.9
W2
04 S. C017 11 S. C47 11 A. C017
π(W1) =0.8·0.1
W3
11 A. C017
π(W1) =0.2·0.9 +0.2·0.1
Example of relational algebra on probabilistic instances
U1
04 S. C017 11 S. C47
π(U1) =0.8
U2
11 A. C017
π(U2) =0.2
∪
V1
π(V1) =0.9
V2
11 A. C017
π(V2) =0.1
=
W1
04 S. C017 11 S. C47
π(W1) =0.8·0.9
W2
04 S. C017 11 S. C47 11 A. C017
π(W1) =0.8·0.1
W3
11 A. C017
π(W1) =0.2·0.9 +0.2·0.1
Example of relational algebra on probabilistic instances
U1
04 S. C017 11 S. C47
π(U1) =0.8
U2
11 A. C017
π(U2) =0.2
∪
V1
π(V1) =0.9
V2
11 A. C017
π(V2) =0.1
=
W1
04 S. C017 11 S. C47
π(W1) =0.8·0.9
W2
04 S. C017 11 S. C47 11 A. C017
π(W1) =0.8·0.1
W3
11 A. C017
π(W1) =0.2·0.9 +0.2·0.1
Example of relational algebra on probabilistic instances
U1
04 S. C017 11 S. C47
π(U1) =0.8
U2
11 A. C017 π(U2) =0.2
∪
V1
π(V1) =0.9
V2
11 A. C017 π(V2) =0.1
=
W1
04 S. C017 11 S. C47
π(W1) =0.8·0.9
W2
04 S. C017 11 S. C47 11 A. C017
π(W1) =0.8·0.1
W3
11 A. C017
π(W1) =0.2·0.9 +0.2·0.1
Example of relational algebra on probabilistic instances
U1
04 S. C017 11 S. C47
π(U1) =0.8
U2
11 A. C017 π(U ) =0.2
∪
V1
π(V1) =0.9
V2
11 A. C017 π(V2) =0.1
=
W1
04 S. C017 11 S. C47
π(W1) =0.8·0.9
W2
04 S. C017 11 S. C47 11 A. C017
π(W1) =0.8·0.1
W3
11 A. C017
π(W1) =0.2·0.9 +0.2·0.1
Example of relational algebra on probabilistic instances
U1
04 S. C017 11 S. C47
π(U1) =0.8
U2
11 A. C017 π(U2) =0.2
∪
V1
π(V1) =0.9
V2
11 A. C017 π(V2) =0.1
=
W1
04 S. C017 11 S. C47
π(W1) =0.8·0.9 W2
04 S. C017 11 S. C47 11 A. C017
π(W1) =0.8·0.1
W3
11 A. C017
π(W1) =0.2·0.9 +0.2·0.1
Example of relational algebra on probabilistic instances
U1
04 S. C017 11 S. C47
π(U1) =0.8
U2
11 A. C017 π(U ) =0.2
∪
V1
π(V1) =0.9
V2
11 A. C017 π(V2) =0.1
=
W1
04 S. C017 11 S. C47
π(W1) =0.8·0.9
W2
04 S. C017 11 S. C47 11 A. C017
π(W1) =0.8·0.1 W3
11 A. C017
π(W1) =0.2·0.9 +0.2·0.1
Example of relational algebra on probabilistic instances
U1
04 S. C017 11 S. C47
π(U1) =0.8
U2
11 A. C017 π(U2) =0.2
∪
V1
π(V1) =0.9
V2
11 A. C017 π(V2) =0.1
=
W1
04 S. C017 11 S. C47
π(W1) =0.8·0.9
W2
04 S. C017 11 S. C47 11 A. C017
π(W1) =0.8·0.1
W3
11 A. C017
π(W1) =0.2·0.9 +0.2·0.1
Example of relational algebra on probabilistic instances
U1
04 S. C017 11 S. C47
π(U1) =0.8
U2
11 A. C017 π(U ) =0.2
∪
V1
π(V1) =0.9
V2
11 A. C017 π(V2) =0.1
=
W1
04 S. C017 11 S. C47 π(W1) =0.8·0.9
W2
04 S. C017 11 S. C47 11 A. C017
π(W1) =0.8·0.1
W3
π(W1) =0.2·0.9 +0.2·0.1
Example of relational algebra on probabilistic instances
U1
04 S. C017 11 S. C47
π(U1) =0.8
U2
11 A. C017 π(U2) =0.2
∪
V1
π(V1) =0.9
V2
11 A. C017 π(V2) =0.1
=
W1
04 S. C017 11 S. C47 π(W1) =0.8·0.9
W2
04 S. C017 11 S. C47 11 A. C017 π(W1) =0.8·0.1
W3
11 A. C017
π(W1) =0.2·0.9 +0.2·0.1
Example of relational algebra on probabilistic instances
U1
04 S. C017 11 S. C47
π(U1) =0.8
U2
11 A. C017 π(U ) =0.2
∪
V1
π(V1) =0.9
V2
11 A. C017 π(V2) =0.1
=
W1
04 S. C017 11 S. C47 π(W1) =0.8·0.9
W2
04 S. C017 11 S. C47 11 A. C017 π(W1) =0.8·0.1
W3
+0.2·0.1
Example of relational algebra on probabilistic instances
U1
04 S. C017 11 S. C47
π(U1) =0.8
U2
11 A. C017 π(U2) =0.2
∪
V1
π(V1) =0.9
V2
11 A. C017 π(V2) =0.1
=
W1
04 S. C017 11 S. C47 π(W1) =0.8·0.9
W2
04 S. C017 11 S. C47 11 A. C017 π(W1) =0.8·0.1
W3
11 A. C017 π(W1) =0.2·0.9
Representation system
• Remember that if we haveNpossible tuples
→ there are2Npossible instances
→ there are22N possible uncertain instances
→ writing outan uncertain instance isexponential
→ Last time we sawBoolean c-tablesas aconcise way torepresentuncertain instances
• Forprobabilistic instances:
→ there areinfinitely manypossible instances
→ writing outa probabilistic instance is stillexponential
→ How torepresentprobabilistic instances?
Representation system
• Remember that if we haveNpossible tuples
→ there are2Npossible instances
→ there are22N possible uncertain instances
→ writing outan uncertain instance isexponential
→ Last time we sawBoolean c-tablesas aconcise way torepresentuncertain instances
• Forprobabilistic instances:
→ there areinfinitely manypossible instances
→ writing outa probabilistic instance is stillexponential
→ How torepresentprobabilistic instances?
Representation system
• Remember that if we haveNpossible tuples
→ there are2Npossible instances
→ there are22N possible uncertain instances
→ writing outan uncertain instance isexponential
→ Last time we sawBoolean c-tablesas aconcise way torepresentuncertain instances
• Forprobabilistic instances:
→ there areinfinitely manypossible instances
→ writing outa probabilistic instance is stillexponential
→ How torepresentprobabilistic instances?
Representation system
• Remember that if we haveNpossible tuples
→ there are2Npossible instances
→ there are22N possible uncertain instances
→ writing outan uncertain instance isexponential
→ Last time we sawBoolean c-tablesas aconcise way torepresentuncertain instances
• Forprobabilistic instances:
→ there areinfinitely manypossible instances
→ writing outa probabilistic instance is stillexponential
→ How torepresentprobabilistic instances?
Representation system
• Remember that if we haveNpossible tuples
→ there are2Npossible instances
→ there are22N possible uncertain instances
→ writing outan uncertain instance isexponential
→ Last time we sawBoolean c-tablesas aconcise way torepresentuncertain instances
• Forprobabilistic instances:
→ there areinfinitely manypossible instances
→ writing outa probabilistic instance is stillexponential
→ How torepresentprobabilistic instances?
Representation system
• Remember that if we haveNpossible tuples
→ there are2Npossible instances
→ there are22N possible uncertain instances
→ writing outan uncertain instance isexponential
→ Last time we sawBoolean c-tablesas aconcise way torepresentuncertain instances
• Forprobabilistic instances:
→ there areinfinitely manypossible instances
→ writing outa probabilistic instance is stillexponential
→ How torepresentprobabilistic instances?
Representation system
• Remember that if we haveNpossible tuples
→ there are2Npossible instances
→ there are22N possible uncertain instances
→ writing outan uncertain instance isexponential
→ Last time we sawBoolean c-tablesas aconcise way torepresentuncertain instances
• Forprobabilistic instances:
→ there areinfinitely manypossible instances a probabilistic instance is still
Table of contents
Probabilistic instances
TID BID pc-tables Conclusion
Tuple-independent databases
• Thesimplestmodel: tuple-independent databases
• Annotate eachinstance factwith aprobability
U
date teacher room 04 Silviu C017
0.8
04 Antoine C017
0.2
11 Silviu C017
1
→ Assumeindependencebetween tuples
(Silviu and Antoine may teach at the same time)
Tuple-independent databases
• Thesimplestmodel: tuple-independent databases
• Annotate eachinstance factwith aprobability U
date teacher room 04 Silviu C017
0.8
04 Antoine C017
0.2
11 Silviu C017
1
→ Assumeindependencebetween tuples
(Silviu and Antoine may teach at the same time)
Tuple-independent databases
• Thesimplestmodel: tuple-independent databases
• Annotate eachinstance factwith aprobability U
date teacher room 04 Silviu C017 0.8 04 Antoine C017 0.2
11 Silviu C017 1
→ Assumeindependencebetween tuples
(Silviu and Antoine may teach at the same time)
Tuple-independent databases
• Thesimplestmodel: tuple-independent databases
• Annotate eachinstance factwith aprobability U
date teacher room 04 Silviu C017 0.8 04 Antoine C017 0.2
11 Silviu C017 1
→ Assumeindependencebetween tuples
(Silviu and Antoine may teach at the same time)
Semantics of TID
• Each tuple iskeptordiscardedwith the probability
• Probabilistic choices areindependentacross tuples
U
date teacher room 04 Silviu C017 0.8 04 Antoine C017 0.2
11 Silviu C017 1
U
date teacher room 04 Silviu C017 04 Antoine C017 11 Silviu C017 What’s theprobabilityof this outcome?
0.8×(1−0.2)×1
Semantics of TID
• Each tuple iskeptordiscardedwith the probability
• Probabilistic choices areindependentacross tuples U
date teacher room 04 Silviu C017 0.8 04 Antoine C017 0.2
11 Silviu C017 1
U
date teacher room 04 Silviu C017 04 Antoine C017 11 Silviu C017 What’s theprobabilityof this outcome?
0.8×(1−0.2)×1
Semantics of TID
• Each tuple iskeptordiscardedwith the probability
• Probabilistic choices areindependentacross tuples U
date teacher room 04 Silviu C017 0.8 04 Antoine C017 0.2
11 Silviu C017 1
U
date teacher room
04 Silviu C017 04 Antoine C017 11 Silviu C017 What’s theprobabilityof this outcome?
0.8×(1−0.2)×1
Semantics of TID
• Each tuple iskeptordiscardedwith the probability
• Probabilistic choices areindependentacross tuples U
date teacher room 04 Silviu C017 0.8 04 Antoine C017 0.2
11 Silviu C017 1
U
date teacher room 04 Silviu C017
04 Antoine C017 11 Silviu C017 What’s theprobabilityof this outcome?
0.8×(1−0.2)×1
Semantics of TID
• Each tuple iskeptordiscardedwith the probability
• Probabilistic choices areindependentacross tuples U
date teacher room 04 Silviu C017 0.8 04 Antoine C017 0.2
11 Silviu C017 1
U
date teacher room 04 Silviu C017 04 Antoine C017
11 Silviu C017 What’s theprobabilityof this outcome?
0.8×(1−0.2)×1
Semantics of TID
• Each tuple iskeptordiscardedwith the probability
• Probabilistic choices areindependentacross tuples U
date teacher room 04 Silviu C017 0.8 04 Antoine C017 0.2
11 Silviu C017 1
U
date teacher room 04 Silviu C017 04 Antoine C017 11 Silviu C017
What’s theprobabilityof this outcome?
0.8×(1−0.2)×1
Semantics of TID
• Each tuple iskeptordiscardedwith the probability
• Probabilistic choices areindependentacross tuples U
date teacher room 04 Silviu C017 0.8 04 Antoine C017 0.2
11 Silviu C017 1
U
date teacher room 04 Silviu C017 04 Antoine C017 11 Silviu C017 What’s theprobabilityof this outcome?
0.8×(1−0.2)×1
Semantics of TID
• Each tuple iskeptordiscardedwith the probability
• Probabilistic choices areindependentacross tuples U
date teacher room 04 Silviu C017 0.8 04 Antoine C017 0.2
11 Silviu C017 1
U
date teacher room 04 Silviu C017 04 Antoine C017 11 Silviu C017 What’s theprobabilityof this outcome?
0.8
×(1−0.2)×1
Semantics of TID
• Each tuple iskeptordiscardedwith the probability
• Probabilistic choices areindependentacross tuples U
date teacher room 04 Silviu C017 0.8 04 Antoine C017 0.2
11 Silviu C017 1
U
date teacher room 04 Silviu C017 04 Antoine C017 11 Silviu C017 What’s theprobabilityof this outcome?
(1−0.2)×1
Semantics of TID
• Each tuple iskeptordiscardedwith the probability
• Probabilistic choices areindependentacross tuples U
date teacher room 04 Silviu C017 0.8 04 Antoine C017 0.2
11 Silviu C017 1
U
date teacher room 04 Silviu C017 04 Antoine C017 11 Silviu C017 What’s theprobabilityof this outcome?
0.8×(1−0.2)
×1
Semantics of TID
• Each tuple iskeptordiscardedwith the probability
• Probabilistic choices areindependentacross tuples U
date teacher room 04 Silviu C017 0.8 04 Antoine C017 0.2
11 Silviu C017 1
U
date teacher room 04 Silviu C017 04 Antoine C017 11 Silviu C017 What’s theprobabilityof this outcome?
Getting a probability distribution
Thesemanticsof a TID instance is aprobabilistic instance...
→ thepossible worldsare the subsets
→ always keeping tuples withprobability 1 Formally, for a TID instanceI, theprobabilityofJ:
• we must haveJ⊆I
• product ofptfor each tupletkeptinJ
• product of1−pt for each tupletnot keptinJ
Getting a probability distribution
Thesemanticsof a TID instance is aprobabilistic instance...
→ thepossible worldsare the subsets
→ always keeping tuples withprobability 1
Formally, for a TID instanceI, theprobabilityofJ:
• we must haveJ⊆I
• product ofptfor each tupletkeptinJ
• product of1−pt for each tupletnot keptinJ
Getting a probability distribution
Thesemanticsof a TID instance is aprobabilistic instance...
→ thepossible worldsare the subsets
→ always keeping tuples withprobability 1 Formally, for a TID instanceI, theprobabilityofJ:
• we must haveJ⊆I
• product ofptfor each tupletkeptinJ
• product of1−pt for each tupletnot keptinJ
Getting a probability distribution
Thesemanticsof a TID instance is aprobabilistic instance...
→ thepossible worldsare the subsets
→ always keeping tuples withprobability 1 Formally, for a TID instanceI, theprobabilityofJ:
• we must haveJ⊆I
• product ofptfor each tupletkeptinJ
• product of1−p for each tupletnot keptinJ
Is it a probability distribution?
Do the probabilities alwayssum to 1?
• 0 tuples:vacuous
• 1 tuple:p+ (1−p) =1
• 2 tuples:p1p2+p1(1−p2) + (1−p1)p2+ (1−p1)(1−p2)
→ p1(p2+ (1−p2)) + (1−p1)(p2+ (1−p2))
→ p1×1+ (1−p1)×1
→ 1
• More tuples:p1(· · ·) + (1−p1)(· · ·)andinduction hypothesis
Is it a probability distribution?
Do the probabilities alwayssum to 1?
• 0 tuples:vacuous
• 1 tuple:p+ (1−p) =1
• 2 tuples:p1p2+p1(1−p2) + (1−p1)p2+ (1−p1)(1−p2)
→ p1(p2+ (1−p2)) + (1−p1)(p2+ (1−p2))
→ p1×1+ (1−p1)×1
→ 1
• More tuples:p1(· · ·) + (1−p1)(· · ·)andinduction hypothesis
Is it a probability distribution?
Do the probabilities alwayssum to 1?
• 0 tuples:vacuous
• 1 tuple:p+ (1−p) =1
• 2 tuples:p1p2+p1(1−p2) + (1−p1)p2+ (1−p1)(1−p2)
→ p1(p2+ (1−p2)) + (1−p1)(p2+ (1−p2))
→ p1×1+ (1−p1)×1
→ 1
• More tuples:p1(· · ·) + (1−p1)(· · ·)andinduction hypothesis
Is it a probability distribution?
Do the probabilities alwayssum to 1?
• 0 tuples:vacuous
• 1 tuple:p+ (1−p) =1
• 2 tuples:p1p2+p1(1−p2) + (1−p1)p2+ (1−p1)(1−p2)
→ p1(p2+ (1−p2)) + (1−p1)(p2+ (1−p2))
→ p1×1+ (1−p1)×1
→ 1
• More tuples:p1(· · ·) + (1−p1)(· · ·)andinduction hypothesis
Is it a probability distribution?
Do the probabilities alwayssum to 1?
• 0 tuples:vacuous
• 1 tuple:p+ (1−p) =1
• 2 tuples:p1p2+p1(1−p2) + (1−p1)p2+ (1−p1)(1−p2)
→ p1(p2+ (1−p2)) + (1−p1)(p2+ (1−p2))
→ p1×1+ (1−p1)×1
→ 1
• More tuples:p1(· · ·) + (1−p1)(· · ·)andinduction hypothesis
Is it a probability distribution?
Do the probabilities alwayssum to 1?
• 0 tuples:vacuous
• 1 tuple:p+ (1−p) =1
• 2 tuples:p1p2+p1(1−p2) + (1−p1)p2+ (1−p1)(1−p2)
→ p1(p2+ (1−p2)) + (1−p1)(p2+ (1−p2))
→ p1×1+ (1−p1)×1
→ 1
• More tuples:p1(· · ·) + (1−p1)(· · ·)andinduction hypothesis
Is it a probability distribution?
Do the probabilities alwayssum to 1?
• 0 tuples:vacuous
• 1 tuple:p+ (1−p) =1
• 2 tuples:p1p2+p1(1−p2) + (1−p1)p2+ (1−p1)(1−p2)
→ p1(p2+ (1−p2)) + (1−p1)(p2+ (1−p2))
→ p1×1+ (1−p1)×1
→ 1
• More tuples:p1(· · ·) + (1−p1)(· · ·)andinduction hypothesis
Is it a probability distribution?
Do the probabilities alwayssum to 1?
• 0 tuples:vacuous
• 1 tuple:p+ (1−p) =1
• 2 tuples:p1p2+p1(1−p2) + (1−p1)p2+ (1−p1)(1−p2)
→ p1(p2+ (1−p2)) + (1−p1)(p2+ (1−p2))
→ p1×1+ (1−p1)×1
→ 1
Strong representation system
Remember fromlast class:
Uncertain instance: set of possible worlds
Uncertainty framework: concise way to represent uncertain instances Query language: here, relational algebra Definition (Strong representation system) For any query in the language,
on uncertain instances represented in the framework, the uncertain instance obtained by evaluating the query can also be represented in the framework.
→ Are TID instances astrong representation system?
Strong representation system
Remember fromlast class:
Uncertain instance: set of possible worlds Uncertainty framework: concise way to represent
uncertain instances
Query language: here, relational algebra Definition (Strong representation system) For any query in the language,
on uncertain instances represented in the framework, the uncertain instance obtained by evaluating the query can also be represented in the framework.
→ Are TID instances astrong representation system?
Strong representation system
Remember fromlast class:
Uncertain instance: set of possible worlds Uncertainty framework: concise way to represent
uncertain instances Query language: here, relational algebra
Definition (Strong representation system) For any query in the language,
on uncertain instances represented in the framework, the uncertain instance obtained by evaluating the query can also be represented in the framework.
→ Are TID instances astrong representation system?
Strong representation system
Remember fromlast class:
Uncertain instance: set of possible worlds Uncertainty framework: concise way to represent
uncertain instances Query language: here, relational algebra Definition (Strong representation system) For any query in the language,
on uncertain instances represented in the framework, the uncertain instance obtained by evaluating the query can also be represented in the framework.
→ Are TID instances astrong representation system?
Strong representation system
Remember fromlast class:
Uncertain instance: set of possible worlds Uncertainty framework: concise way to represent
uncertain instances Query language: here, relational algebra Definition (Strong representation system) For any query in the language,
on uncertain instances represented in the framework,
the uncertain instance obtained by evaluating the query can also be represented in the framework.
→ Are TID instances astrong representation system?
Strong representation system
Remember fromlast class:
Uncertain instance: set of possible worlds Uncertainty framework: concise way to represent
uncertain instances Query language: here, relational algebra Definition (Strong representation system) For any query in the language,
on uncertain instances represented in the framework, the uncertain instance obtained by evaluating the query
can also be represented in the framework.
→ Are TID instances astrong representation system?
Strong representation system
Remember fromlast class:
Uncertain instance: set of possible worlds Uncertainty framework: concise way to represent
uncertain instances Query language: here, relational algebra Definition (Strong representation system) For any query in the language,
on uncertain instances represented in the framework, the uncertain instance obtained by evaluating the query can also be represented in the framework.
→ Are TID instances astrong representation system?
Strong representation system
Remember fromlast class:
Uncertain instance: set of possible worlds Uncertainty framework: concise way to represent
uncertain instances Query language: here, relational algebra Definition (Strong representation system) For any query in the language,
on uncertain instances represented in the framework, the uncertain instance obtained by evaluating the query
Implementing select
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
11 Silviu C47 1
σteacher=“Silviu”(U) date teacher room
04 Silviu C47 0.8
11 Silviu C47 1
→ Is thiscorrect?... So far,so good.
Implementing select
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
11 Silviu C47 1
σteacher=“Silviu”(U) date teacher room
04 Silviu C47 0.8
11 Silviu C47 1
→ Is thiscorrect?... So far,so good.
Implementing select
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
11 Silviu C47 1
σteacher=“Silviu”(U) date teacher room
04 Silviu C47 0.8
11 Silviu C47 1
→ Is thiscorrect?... So far,so good.
Implementing select
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
11 Silviu C47 1
σteacher=“Silviu”(U) date teacher room
04 Silviu C47 0.8
11 Silviu C47 1
So far,so good.
Implementing select
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
11 Silviu C47 1
σteacher=“Silviu”(U) date teacher room
04 Silviu C47 0.8
11 Silviu C47 1
→ Is thiscorrect?... So far,so good.
Implementing project
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
11 Silviu C47 1
11 Antoine C47 0.1
18 Silviu C47 0.9
πdate(U) date
04
1−(1−0.2)·(1−0.8)
11
1
18
0.9
→ Is thiscorrect?... So far,so good.
Implementing project
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
11 Silviu C47 1
11 Antoine C47 0.1
18 Silviu C47 0.9
πdate(U) date
04
1−(1−0.2)·(1−0.8)
11
1
18
0.9
→ Is thiscorrect?... So far,so good.
Implementing project
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
11 Silviu C47 1
11 Antoine C47 0.1
18 Silviu C47 0.9
πdate(U) date
04
1−(1−0.2)·(1−0.8) 1
0.9
→ Is thiscorrect?... So far,so good.
Implementing project
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
11 Silviu C47 1
11 Antoine C47 0.1
18 Silviu C47 0.9
πdate(U) date
04
1−(1−0.2)·(1−0.8)
11
1
18 0.9
→ Is thiscorrect?... So far,so good.
Implementing project
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
11 Silviu C47 1
11 Antoine C47 0.1
18 Silviu C47 0.9
πdate(U) date
04
1−(1−0.2)·(1−0.8)
→ Is thiscorrect?... So far,so good.
Implementing project
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
11 Silviu C47 1
11 Antoine C47 0.1
18 Silviu C47 0.9
πdate(U) date
04 1−(1−0.2)·(1−0.8)
11 1
18 0.9
→ Is thiscorrect?... So far,so good.
Implementing project
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
11 Silviu C47 1
11 Antoine C47 0.1
18 Silviu C47 0.9
πdate(U) date
04 1−(1−0.2)·(1−0.8)
So far,so good.
Implementing project
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
11 Silviu C47 1
11 Antoine C47 0.1
18 Silviu C47 0.9
πdate(U) date
04 1−(1−0.2)·(1−0.8)
11 1
18 0.9
Implementing product
...OR NOT!
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
Repair room cause C47 leopard0.1
U×Repair date teacher room cause 04 Silviu C47 leopard
0.8×0.1
04 Antoine C47 leopard
0.2×0.1
→ Is thiscorrect?
→ It’sWRONG!
Implementing product
...OR NOT!
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
Repair room cause C47 leopard0.1
U×Repair date teacher room cause 04 Silviu C47 leopard
0.8×0.1
04 Antoine C47 leopard
0.2×0.1
→ Is thiscorrect?
→ It’sWRONG!
Implementing product
...OR NOT!
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
Repair room cause C47 leopard0.1
U×Repair date teacher room cause
04 Silviu C47 leopard
0.8×0.1
04 Antoine C47 leopard
0.2×0.1
→ Is thiscorrect?
→ It’sWRONG!
Implementing product
...OR NOT!
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
Repair room cause C47 leopard0.1
U×Repair date teacher room cause 04 Silviu C47 leopard
0.8×0.1
04 Antoine C47 leopard
0.2×0.1
→ Is thiscorrect?
→ It’sWRONG!
Implementing product
...OR NOT!
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
Repair room cause C47 leopard0.1
U×Repair date teacher room cause
04 Silviu C47 leopard 0.8×0.1 04 Antoine C47 leopard 0.2×0.1
→ Is thiscorrect?
→ It’sWRONG!
Implementing product
...OR NOT!
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
Repair room cause C47 leopard0.1
U×Repair date teacher room cause
04 Silviu C47 leopard 0.8×0.1 04 Antoine C47 leopard 0.2×0.1
→ Is thiscorrect?
→ It’sWRONG!
Implementing product ...OR NOT!
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
Repair room cause C47 leopard0.1
U×Repair date teacher room cause
04 Silviu C47 leopard 0.8×0.1 04 Antoine C47 leopard 0.2×0.1
Why is it wrong?
U
date teacher room
04 Silviu C47 1
04 Antoine C47 1
Repair room cause
C47 leopard 1/2
U×Repair date teacher room cause 04 Silviu C47 leopard
1/2
04 Antoine C47 leopard
1/2
→ The two tuples arenot independent!
→ The first is thereiffthe second is there.
Why is it wrong?
U
date teacher room
04 Silviu C47 1
04 Antoine C47 1
Repair room cause
C47 leopard 1/2
U×Repair date teacher room cause 04 Silviu C47 leopard
1/2
04 Antoine C47 leopard
1/2
→ The two tuples arenot independent!
→ The first is thereiffthe second is there.
Why is it wrong?
U
date teacher room
04 Silviu C47 1
04 Antoine C47 1
Repair room cause
C47 leopard 1/2
U×Repair date teacher room cause 04 Silviu C47 leopard
1/2
04 Antoine C47 leopard
1/2
→ The two tuples arenot independent!
→ The first is thereiffthe second is there.
Why is it wrong?
U
date teacher room
04 Silviu C47 1
04 Antoine C47 1
Repair room cause
C47 leopard 1/2
U×Repair date teacher room cause
04 Silviu C47 leopard 1/2 04 Antoine C47 leopard 1/2
→ The two tuples arenot independent!
→ The first is thereiffthe second is there.
Why is it wrong?
U
date teacher room
04 Silviu C47 1
04 Antoine C47 1
Repair room cause
C47 leopard 1/2
U×Repair date teacher room cause
04 Silviu C47 leopard 1/2 04 Antoine C47 leopard 1/2
→ The two tuples arenot independent!
→ The first is thereiffthe second is there.
Why does it matter?
U×Repair date teacher room cause
04 Silviu C47 leopard 1/2 04 Antoine C47 leopard 1/2
πroom(U×Repair) room
C47 1−(1−1/2)×(1−1/2)
→ Probability of3/4...
→ But the leopard had probability1/2!
Why does it matter?
U×Repair date teacher room cause
04 Silviu C47 leopard 1/2 04 Antoine C47 leopard 1/2
πroom(U×Repair) room
C47
1−(1−1/2)×(1−1/2)
→ Probability of3/4...
→ But the leopard had probability1/2!
Why does it matter?
U×Repair date teacher room cause
04 Silviu C47 leopard 1/2 04 Antoine C47 leopard 1/2
πroom(U×Repair) room
C47 1−(1−1/2)×(1−1/2)
→ Probability of3/4...
→ But the leopard had probability1/2!
Why does it matter?
U×Repair date teacher room cause
04 Silviu C47 leopard 1/2 04 Antoine C47 leopard 1/2
πroom(U×Repair) room
C47 1−(1−1/2)×(1−1/2)
→ Probability of3/4...
→ But the leopard had probability1/2!
Why does it matter?
U×Repair date teacher room cause
04 Silviu C47 leopard 1/2 04 Antoine C47 leopard 1/2
πroom(U×Repair) room
C47 1−(1−1/2)×(1−1/2)
TID are not a strong representation system
• Remember howCodd tablesrequirednamed nulls?
• The result of a query on TID maynotbe a TID
→ We will see that the correlations can becomplex
• How toevaluatequeries on a TID then?
→ List allpossible worldsand count the probabilities
TID are not a strong representation system
• Remember howCodd tablesrequirednamed nulls?
• The result of a query on TID maynotbe a TID
→ We will see that the correlations can becomplex
• How toevaluatequeries on a TID then?
→ List allpossible worldsand count the probabilities
Query evalation done right
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
Repair room cause C47 leopard0.1
πroom(U×Repair) room
C47
• Eitherthere is no leopard and then no result...
• Orthere is a leopard and then...
• Non-empty result:1−(1−0.8)(1−0.2) =0.84
• Thequery probabilityis:0.1×0.84
Query evalation done right
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
Repair room cause C47 leopard0.1
πroom(U×Repair) room
C47
• Eitherthere is no leopard and then no result...
• Orthere is a leopard and then...
• Non-empty result:1−(1−0.8)(1−0.2) =0.84
• Thequery probabilityis:0.1×0.84
Query evalation done right
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
Repair room cause C47 leopard0.1
πroom(U×Repair) room
C47
• Eitherthere is no leopard and then no result...
• Orthere is a leopard and then...
• Non-empty result:1−(1−0.8)(1−0.2) =0.84
• Thequery probabilityis:0.1×0.84
Query evalation done right
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
Repair room cause C47 leopard0.1
πroom(U×Repair) room
C47 ???
• Eitherthere is no leopard and then no result...
• Orthere is a leopard and then...
• Non-empty result:1−(1−0.8)(1−0.2) =0.84
• Thequery probabilityis:0.1×0.84
Query evalation done right
U
date teacher room
04 Silviu C47 0.8
04 Antoine C47 0.2
Repair room cause C47 leopard0.1
πroom(U×Repair) room
C47 ???
• Eitherthere is no leopard and then no result...
• Orthere is a leopard and then...
• Non-empty result:1−(1−0.8)(1−0.2) =0.84
• Thequery probabilityis:0.1×0.84