• Aucun résultat trouvé

Form Understanding Program-analysis-based ProFoUnd:

N/A
N/A
Protected

Academic year: 2022

Partager "Form Understanding Program-analysis-based ProFoUnd:"

Copied!
7
0
0

Texte intégral

(1)

ProFoUnd:

Program-analysis-based Form Understanding

Andreas Savvides

(work done with M. Benedikt, T.Furche and P. Senellart)

(2)

Looking deep under the surface

• Surface Web:

– Bread & butter of traditional search engines – All about hyperlinks

• Deep Web:

– Search engines struggle at surfacing hidden content

– Plethora of valuable information behind Web services and HTML forms

– Valuable data for information extraction systems, e.g. DIADEM

2

(3)

Dealing with a deep Web search interface

1. Find relevant websites

2. Identify a search interface

3. Deduce input fields of the search interface and

corresponding meta-data

4. Query a hidden database 3

(4)

JavaScript and the Deep Web

• Client-side integrity constraints enforced using JavaScript

4

(5)

ProFoUnd

The first system to provide integrity

constraint identification for deep Web search interfaces based on JavaScript analysis.

ProFoUnd’s Interface

ProFoUnd’s Architecture 5

(6)

Evaluation & Moving Forward

• 70 randomly selected real-estate websites

– 100% precision and 63% recall

• Where did we fall short?

– eval, obfuscation, system limitations

• What about…

– Runtime execution?

– Ajax?

– Server-side constraints?

6

(7)

Demonstration?

Come find us at our demo booth all day long on Friday for a showcase of the

system or simply to learn more!

7

Références

Documents relatifs

At the crossroad of CP and MCTS, this paper presents the Bandit Search for Constraint Program- ming ( BaSCoP ) algorithm, adapting MCTS to the specifics of the CP search.

For this purpose, given an environment for which electronic forms are a privileged way to exchange information and stakeholders familiar with form-based (com- puter) interaction,

To evaluate our method on the dataset, we used the network shown in Fig.1 and described in section 2, which consists of a band-pass filter, CSP spatial projection,

In our work, we proposed to assess data completeness, which in UI labeling equates the number of identified UI elements, via predicting this expected number with metrics

A common strategy in deep convolutional neural network for semantic segmentation tasks requires the down-sampling of an image between convolutional and ReLU layers, and then

Sentilo 2 is a semantic SA system introduced in [10] that analyses the senti- ment of a sentence: it identifies the holder of an opinion, the topics and sub-topics of that opinion,

The current highlights comprise modules for RDF, SPARQL, data ac- cess, conversion of result sets to objects and faceted browsing, together with corresponding user interface

For this case, we give an efficient algorithm to compute all the values attained by a given quantity (e.g. execution time) over all possible paths — i.e., the support of