• Aucun résultat trouvé

Balancing Model Usability and Verifiability with SBVR and Answer Set Programming

N/A
N/A
Protected

Academic year: 2022

Partager "Balancing Model Usability and Verifiability with SBVR and Answer Set Programming"

Copied!
4
0
0

Texte intégral

(1)

Balancing Model Usability and Verifiability with SBVR and Answer Set Programming

Deepali Kholkar deepali.kholkar@tcs.com

Dushyanthi Mulpuru dushyanthi.mulpuru@tcs.com

Vinay Kulkarni vinay.vkulkarni@tcs.com

TCS Research Pune, India

Abstract

Model driven engineering (MDE) approaches necessitate verification and validation (V&V) of the models used. Balancing usability of modeling languages with verifiability of the specification presents several challenges. We present an approach and early results using the Semantics of Business Vocabulary and Business Rules (SBVR) standard as modeling notation and answer set programming (ASP) logic paradigm for verification of domain mod- els as an attempt to address these challenges, illustrated in the problem context of regula- tory compliance for business enterprises with an example from a real-life regulation.

1 Introduction

MDE seeks to enable development of large, complex systems using models as basis, necessitating V&V of the models. The vocabulary and business rules of the problem domain need to be captured as the first step in model building. It is desirable that models are pop- ulated directly by subject matter experts (SMEs) to minimize errors and provide them with greater con- trol over the specification. Usability and verifiability are therefore two key requirements from modeling lan- guages. In practice, we find that modeling languages that are easy to use by SMEs are less formal, there- fore harder to verify e.g. Unified Modeling Language (UML), while formal modeling languages such as Al- loy, Kodkod, CSP, and SAT specification languages that have automated solvers available are hard for SMEs to use. Models facilitate V&V since they can be used to generate the formal specification for vali- dating themselves that can be solved using off-the-shelf solvers as in [GD17, ABGR07, CCR08, FSB04, FF08].

In our earlier work we built a generic MDE frame- work for rule checking that we applied to the complex real-world problem of regulatory compliance by enter- prises [RSKK17]. Our framework illustrated in Figure 1 comprises three parts: a) automated extraction from natural language (NL) text of regulations to aid SMEs in authoring a model of regulation rules [RSKK17], b) model-based generation of formal logic specifica- tion of rules [KSK17] and extraction of relevant en- terprise data, and c) automated checking of extracted data for rule compliance using an off-the-shelf rule en- gine. V&V of the models used was not dealt with in this work, and is the focus of the current paper.

Direct derivation of formal logic rules from NL reg- ulation text is not feasible [LN13], therefore controlled natural language (CNL) and a rule model are selected as logical intermediate steps. Attempto Controlled English is a CNL that translates directly to formal specification while we need the intermediate model for step b). We choose the SBVR modeling standard by Object Management Group (OMG) to build the rule model since SBVR a) is an expressive notation for modeling business vocabulary and rules of any domain and b) provides a CNL interface called SBVR Struc- tured English (SBVR SE) usable by SMEs to populate the model that is grounded in first-order logic (FOL).

Automated validation of models created using a flexible CNL notation such as SBVR SE however poses several problems. SAT solvers used in existing model validation approaches [GD17, ABGR07, FSB04, FF08]

use programming paradigms that are much less expres- sive than SBVR and do not support aggregates, rela- tions, or functions. Alloy, Kodkod, and Object Con- trol Language (OCL) support encoding of aggregates, relations and constraints but not of rules with modal- ities such as obligations and permissions, and default behavior with exceptions, all of which frequently oc- cur in real-world problem contexts such as regulation,

(2)

MDE framework

2 Model-based Generation

SBVR model of regulation

Solution: Tradeid / '1010000023TATA' Success Rules:

[defeasible(r_1, obligation, tradeIs_included_inMiFID_rep orting(Tradeid,MiFID_reportin gid), [rule1(Tradeid,Acquisitionid,D isposalid,MiFID_reportingid)]).

] fact(tradeTradetype(Tradeid,T radetype)):- fact(trade(Tradeid,_,_,Tradety pe,_,_,_,_,_)).

Success Facts:

fact(trade('1010000023TATA', _,_,'buy',_,_,_,_,_)).

fact(tradeHasBuyer('1010000 023TATA','YFXS5XCH7N0Y05 NIXW11')).

1 Model Population

3 Compliance Checking Generated artefacts

Data Program Code Rules

Legal text regulationof

Compliance report

Feedback {Inconsistencies, Solutions}

Extension of MDE for validation NLP+ML

extraction

Validation spec in ASP Constraints Rules

ASP solver

Figure 1: Extension of our MDE Framework for Model Validation policy or contract compliance. SBVR supports these

features, however validating an SBVR model using any of these languages requires implementation in terms of the language primitives, a non-trivial task. To over- come these limitations, we propose a V&V approach for domain models built in SBVR using the ASP logic programming paradigm detailed below.

2 Our V&V Approach for Domain Models

We extend our earlier developed MDE framework to generate the formal logic specification for V&V of the SBVR model in the ASP paradigm, as shown in Figure 1. Model verification seeks to check that the model does what it is expected to do, usually through an implementing program1. We propose inconsistency checking, scenario generation, and testing using a stan- dard set of test cases and test data as verification tech- niques in our approach.

The objective of model validation is to ascertain that the model is a faithful representation of the prob- lem space it represents1. We implement validation by having SMEs certify scenarios generated from the model for validity and coverage. We briefly touch upon the SBVR and ASP technologies before explain- ing the approach using an illustration from the KYC regulation for Indian banks. KYC is an anti-money laundering regulation that lays down guidelines for ac- ceptance of new customers including risk categoriza- tion and identification procedures.

2.1 SBVR

SBVR is a fact-oriented modeling notation [Nij07] that encodes rules as logical formulations over fact types that are relations betweenconcepts2. Rules are stated in SBVR SE to populate the model. Listing 1 shows a few SBVR SE rules from the KYC example. Rules r6 and r7 define obligations on the bank to open an account for a customer. Rule r6 is populated in the

1http://www.inf.ed.ac.uk/teaching/courses/pm/Note16.pdf

2https://www.omg.org/spec/SBVR/1.3

SBVR model as arule,meant byanobligationformula- tionembedding animplication logical formulationover fact types bank opensAccountFor customer and bank verifiesIdentityOf customer that relate concepts bank andcustomer.

Listing 1: Section of SBVR SE Rules from the KYC Example

r u l e r 1 I t i s n e c e s s a r y t h a t c u s t o m e r @ i s h i g h R i s k C u s t | | c u s t o m e r @ i s l o w R i s k C u s t r u l e r 2 I t i s n e c e s s a r y t h a t c u s t o m e r @has

s o u r c e O f W e a l t h E a s i l y I d e n t i f i e d i f c u s t o m e r

@ i s l o w R i s k C u s t

r u l e r 4 I t i s n e c e s s a r y t h a t c u s t o m e r @ i s s a l a r i e d | | c u s t o m e r @ i s g o v t D e p t i f c u s t o m e r @has

s o u r c e O f W e a l t h E a s i l y I d e n t i f i e d r u l e r 6 I t i s o b l i g a t o r y t h a t bank

@ o p e n s A c c o u n t F o r c u s t o m e r i f bank

@ v e r i f i e s I d e n t i t y O f c u s t o m e r r u l e r 7 I t i s o b l i g a t o r y t h a t bank n o t

@ o p e n s A c c o u n t F o r c u s t o m e r i f c u s t o m e r

@ a p p r o a c h e s bank && c u s t o m e r @ i s b a n n e d C u s t o m e r

r u l e r 8 I t i s n e c e s s a r y t h a t bank

@ v e r i f i e s I d e n t i t y O f c u s t o m e r i f c u s t o m e r

@ a p p r o a c h e s bank && c u s t o m e r @ s u b m i t s document

2.2 ASP

We select ASP for automated verification since it a) is a powerful, highly expressive logic programming nota- tion supporting aggregates, functions, and optimiza- tions and b) maps directly onto SBVR’s fact-oriented model and underlying first-order logic formalism.

Rules in SBVR are translated as rules in ASP of the form Head :- Body, whereBody is the antecedent that implies the consequent Head. Statement-type rules without antecedents are translated as constraints i.e. rules with empty head, as :- Body. Relations in SBVR become predicates in ASP, with related con- cepts as arguments referred by their ids e.g. bank

@opensAccountFor customer becomes opensAccount- For(BankId,CustomerId). Our model-based generator uses Eclipse Modeling Framework (EMF) generated model traversal functions to traverse the SBVR model and translate it to ASP rules and constraints. Listing

(3)

2 shows a fragment from the ASP program generated for the KYC example.

Listing 2: ASP Fragment for KYC Example

h i g h R i s k C u s t ( C u s t o m e r I d ) v l o w R i s k C u s t ( C u s t o m e r I d ) :− c u s t o m e r ( C u s t o m e r I d ) . s o u r c e O f W e a l t h E a s i l y I d e n t i f i e d ( C u s t o m e r I d ) :−

l o w R i s k C u s t ( C u s t o m e r I d ) .

s a l a r i e d ( C u s t o m e r I d ) v g o v t D e p t ( C u s t o m e r I d ) :−

s o u r c e O f W e a l t h E a s i l y I d e n t i f i e d ( C u s t o m e r I d ) .

o p e n s A c c o u n t F o r ( BankId , C u s t o m e r I d ) :−

v e r i f i e s I d e n t i t y O f ( BankId , C u s t o m e r I d ) .

−o p e n s A c c o u n t F o r ( BankId , C u s t o m e r I d ) :−

a p p r o a c h e s ( C u s t o m e r I d , B a n k I d ) , b a n n e d C u s t o m e r ( C u s t o m e r I d ) .

v e r i f i e s I d e n t i t y O f ( BankId , C u s t o m e r I d ) :−

a p p r o a c h e s ( C u s t o m e r I d , B a n k I d ) , s u b m i t s ( C u s t o m e r I d , DocumentId ) .

s u b m i t s ( C u s t o m e r I d , DocumentId ) :−s a l a r i e d ( C u s t o m e r I d ) , l e t t e r O f I d e n t i t y F r o m C o r p o r a t e ( DocumentId ) .

ASP supports encoding of extended logic programs with choice, cardinality, classical negation, and dis- junction in the head of rules, not supported by normal logic programs, enabling verification of these features that are supported in SBVR SE. Disjunction in rule heads is a powerful feature used to model mutually exclusive paths, such as high and low risk categoriza- tion in rule r1 and and two types of low risk customers in rule r4 in Listing1. Classical negation allows ex- plicit statement or derivation of negative knowledge, e.g. rule r7 in Listing1.

ASP solvers perform automated analysis byground- ing the program i.e. generating variable-free rules given a set of ground facts and using techniques simi- lar to SAT solving to generate solutions or answer sets of the program.

2.3 V&V using ASP

We use the DLV system [LPF+06] as the solver for our generated ASP programs. The solver performs con- sistency checking of the rule base, indicating conflict- ing rules or constraints and generating answer sets for paths where no conflicts exist. Answer sets by defini- tion are minimal, i.e. no answer set can be contained in another [Lif08], and represent the minimal set of unique scenarios for the input model. These are pre- sented to SMEs to check whether the generated sce- narios are valid and whether all important scenarios have been covered.

The KYC program of Listing2 generates four an- swer sets corresponding to two risk categories and four customer types. Listing3 shows ground facts provided for a single customer with id 301 and a sample answer set for salaried customer.

Listing 3: Input Ground Facts and Sample Answer Set

% Ground F a c t s :

c u s t o m e r ( 3 0 1 ) . a p p r o a c h e s ( 3 0 1 , 2 0 1 ) .

l e t t e r O f I d e n t i t y F r o m C o r p o r a t e ( 4 0 1 ) .

%b a n n e d C u s t o m e r ( 3 0 1 ) .

% S o l u t i o n s :

{c u s t o m e r ( 3 0 1 ) , a p p r o a c h e s ( 3 0 1 , 2 0 1 ) , l e t t e r O f I d e n t i t y F r o m C o r p o r a t e ( 4 0 1 ) , l o w R i s k C u s t ( 3 0 1 ) ,

s o u r c e O f W e a l t h E a s i l y I d e n t i f i e d ( 3 0 1 ) , s a l a r i e d ( 3 0 1 ) , s u b m i t s ( 3 0 1 , 4 0 1 ) , v e r i f i e s I d e n t i t y O f ( 2 0 1 , 3 0 1 ) , o p e n s A c c o u n t F o r ( 2 0 1 , 3 0 1 )}

Although minimal, number of answer sets can grow combinatorially with conditions and ground instances.

To constrain number of scenarios, only critically im- portant conditions are modeled as disjunctions that create independent scenarios for each condition, while non-critical conditions such as id proof submitted by a customer e.g. letterOfIdentityFromCorporate are modeled as normal clauses and minimal number of ground facts are provided to prevent combinatorial ex- plosion.

Another verification step is testing the ASP pro- gram using a set of manually created positive and neg- ative test cases with test data provided as ground facts and checking generated scenarios against expected re- sults. When test data in the form of ground fact bannedCustomer(301) is added, rule r7 becomes true, conflicting with rule r6 and displays the inconsistency as shown in Listing 4.

Listing 4: Inconsistencies Shown by ASP Solver

% F a c t s d e r i v e d a s t r u e i n e v e r y a n s w e r s e t :

−o p e n s A c c o u n t F o r ( 2 0 1 , 3 0 1 ) .

% I n c o n s i s t e n c i e s :

:− o p e n s A c c o u n t F o r ( 2 0 1 , 3 0 1 ) .

Inconsistencies need to be corrected in the model and solutions re-generated iteratively until no anomalies remain.

2.4 Discussion

In our experience of using our framework over two years on sections from three real-life regulations, we trained several groups of SMEs who found SBVR SE to be an easily usable notation to understand and spec- ify rules without requiring knowledge of underlying technologies. Using a general-purpose CNL and model saves us the effort of implementing a DSL, also keeps the approach generic and usable for any problem do- main. CNL being more flexible, verification is harder than in case of a DSL. Our SBVR SE language imple- mentation therefore places some restrictions on sen- tence structure and keywords for ease of verification, taking care not to compromise on feature support.

It is desirable that V&V ensures that specified rules form a correct inferencing hierarchy. Work to supple- ment the approach described here with static analysis

(4)

of the model to flag rules that are not explicated fur- ther as possible cases of missing rules, check for cycles and reachability is ongoing. However the same clause may get specified in multiple ways. The extra work to be performed for verification is compensated by other advantages of the SBVR-ASP combination mentioned earlier, also minimal translation effort from SBVR to ASP. Not having to implement an input DSL that re- duces the development and testing effort on language engineering and model transformations. SBVR helps identify and map to actual data if available, else gen- erate mock data in our framework [RSKK17].

Having so far worked with partial regulations, we plan to encode a few complete regulations that specify rules at different levels of abstraction in order to test adequacy of the SBVR and ASP notations.

3 Conclusion

Usability and verifiability of modeling languages are both critical in any MDE approach, however often conflict in practice. We proposed using CNL as a stepping stone to populate models and reviewed the challenges faced in verification of such a model us- ing commonly used model checking languages. We described our V&V approach using SBVR and ASP to address these challenges and illustrated it with an example from the KYC regulation for Indian banks.

Ongoing investigations include managing conflicts be- tween rules using priorities and automated generation of positive and negative ground data for verification.

4 References References

[ABGR07] Kyriakos Anastasakis, Behzad Bordbar, Geri Georg, and Indrakshi Ray. Uml2alloy:

A challenging model transformation. In Model Driven Engineering Languages and Systems, 10th International Conference, MoDELS 2007, Nashville, USA, Septem- ber 30 - October 5, 2007, Proceedings, 2007.

[CCR08] Jordi Cabot, Robert Claris´o, and Daniel Riera. Verification of UML/OCL class diagrams using constraint programming.

InFirst International Conference on Soft- ware Testing Verification and Validation, ICST 2008, Lillehammer, Norway, April 9-11, 2008, Workshops Proceedings, 2008.

[FF08] S. Feja and D. Ftsch. Model checking with graphical validation rules. In 15th An- nual IEEE International Conference and

Workshop on the Engineering of Computer Based Systems (ecbs 2008), 2008.

[FSB04] F. Fleurey, J. Steel, and B. Baudry. Vali- dation in model-driven engineering: test- ing model transformations. In Proceed- ings. 2004 First International Workshop on Model, Design and Validation., 2004.

[GD17] Martin Gogolla and Khanh-Hoang Doan.

Quality improvement of conceptual UML and OCL schemata through model valida- tion and verification. In Conceptual Mod- eling Perspectives., pages 155–168, 2017.

[KSK17] Deepali Kholkar, Sagar Sunkle, and Vinay Kulkarni. Towards automated generation of regulation rule bases using MDA. In Proceedings of the 5th International Con- ference on Model-Driven Engineering and Software Development, MODELSWARD 2017, Porto, Portugal, February 19-21, 2017., 2017.

[Lif08] Vladimir Lifschitz. What is answer set pro- gramming? InProceedings of the Twenty- Third AAAI Conference on Artificial In- telligence,AAAI 2008, Chicago, Illinois, USA, July 13-17, 2008.

[LN13] Fran¸cois L´evy and Adeline Nazarenko.

Formalization of natural language regula- tions through SBVR structured english - (tutorial). InTheory, Practice, and Appli- cations of Rules on the Web - 7th Interna- tional Symposium, RuleML 2013, Seattle, WA, USA, July 11-13, 2013. Proceedings, pages 19–33, 2013.

[LPF+06] Nicola Leone, Gerald Pfeifer, Wolfgang Faber, Thomas Eiter, Georg Gottlob, Si- mona Perri, and Francesco Scarcello. The DLV system for knowledge representation and reasoning.ACM Trans. Comput. Log., 7(3):499–562, 2006.

[Nij07] Sjir Nijssen. SBVR: Semantics for busi- ness. 2007.

[RSKK17] Suman Roychoudhury, Sagar Sunkle, Deepali Kholkar, and Vinay Kulkarni.

From natural language to SBVR model au- thoring using structured english for com- pliance checking. In 21st IEEE Interna- tional Enterprise Distributed Object Com- puting Conference, EDOC 2017, Quebec City, QC, Canada, October 10-13, 2017.

Références

Documents relatifs

The logical foundations of Answer Set Programming (ASP; [21]) rest upon the logic of Here-and-There (HT; [18]), or more precisely its equilibrium models [22] that correspond to

The main con- tributions of s(CASP) are: (i) a generic interface to connect the disequality constraint solver CLP(6=) with different constraint solvers (e.g., CLP( Q ), a

Recently, a model of Interdependent Scheduling Games (ISGs) in which each player controls a set of services that they schedule independently has been proposed [1].. Indeed, to

Gebser, M., Leone, N., Maratea, M., Perri, S., Ricca, F., Schaub, T.: Evaluation techniques and systems for answer set programming: a survey.. Gebser, M., Maratea, M., Ricca, F.:

It has been shown that Logic Programs with Annotated Disjunctions (LPADs) (Ven- nekens et al. 2009) can be turned into ProbLog, and thus can be embedded in LP MLN as well. It is

The verification of business process compliance to business rules and regulations has gained a lot of interest in recent years and it has led to the development to a process

Since the same graphical feature model can be represented both with the textual feature modeling language (Kumbang) as well as with the product configuration language (PCML), it

Furthermore, we need a notion for a number to be in a square, and conse- quently rule out answer sets where a number does not occur in every square.. in square(S, N ) :- cell (X, Y,