
Web Service and Open Data




By Lélia Blin - ProgRes 2018 - lelia.blin@lip6.fr

Thanks to Quentin Bramas

(2)

What is a Web Service?

A Web Service is a method of communication between two electronic devices over the Web.

HTTP is the protocol Web Services typically use to communicate.

(3)

What is a Web Service?

[Figure: a Device sends a Request to an HTTP Server, which sends back a Response.]

(4)

Application Programming Interface (API)

(5)

An Interface used by Programs to interact with an Application

(6)

An API exposes a service. Developers write a program which consumes the service.

(7)

Examples

(8)

Example: Twitter API

(9)

Example: geocoding APIs

(10)

Geocoding APIs

OpenStreetMap API

Google Maps API

Adresse Data Gouv

….

(11)

What is the format of the response?

[Figure: a Device sends a Request to an HTTP Server, which sends back a Response.]

(12)

Web API - example

https://maps.googleapis.com/maps/api/geocode/xml?address=25%20rue%20lang%20france

https://maps.googleapis.com/maps/api/geocode/json?address=25%20rue%20lang%20france
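The two URLs differ only in the format segment (xml or json) and share the same percent-encoded address parameter. A minimal sketch of how such a URL could be assembled with the standard library; the helper name `geocode_url` is an assumption made for illustration, and the real service may require extra parameters (such as an API key) that are not shown on the slide.

```python
from urllib.parse import urlencode, quote

BASE = "https://maps.googleapis.com/maps/api/geocode/"


def geocode_url(address, fmt="json"):
    """Build a geocoding URL; fmt selects the response format (json or xml)."""
    # quote_via=quote encodes spaces as %20 (the default would use '+')
    query = urlencode({"address": address}, quote_via=quote)
    return BASE + fmt + "?" + query


xml_url = geocode_url("25 rue lang france", "xml")
json_url = geocode_url("25 rue lang france", "json")
print(xml_url)
print(json_url)
```

Only the path segment changes between the two calls, which is how the client tells the server which response format it wants.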

(13)

REpresentational State Transfer

(14)

REST Web API

A REST Web API is a web service that uses simpler, REpresentational State Transfer (REST) based communication.

A request is just an HTTP method applied to a URI.

The response is typically JSON or XML.
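To make "an HTTP method applied to a URI, answered with JSON" concrete, here is a minimal sketch that runs entirely on the local machine with the standard library. The handler class, the `/hello` path and the JSON fields are assumptions chosen for the example; they do not come from any real API.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Answer any GET with a small JSON document echoing the requested path.
        body = json.dumps({"path": self.path, "message": "hello"}).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example output quiet


def fetch_once(path):
    """Start a throwaway local server, send one GET, return (status, parsed JSON)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0: the OS picks a free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port)
        conn.request("GET", path)           # the request: a method applied to a URI
        response = conn.getresponse()       # the response: status, headers, body
        data = json.loads(response.read().decode("utf-8"))
        conn.close()
        return response.status, data
    finally:
        server.shutdown()
        server.server_close()


status, data = fetch_once("/hello")
print(status, data)
```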

(15)

HTTP Request

GET, POST, PUT, DELETE

(16)

Request Python

>>> import requests

>>> r = requests.get("http://linuxfr.org/")

>>> print(r.text)

<!DOCTYPE html>

<html lang="fr">

<head>

<meta charset="utf-8">

<title>Accueil - LinuxFr.org</title>

<style type="text/css">header#branding h1 { background-image: url('/images/logos/linuxfr2_mountain.png') }</style>

>>> r = requests.put("http://linuxfr.org/")
>>> r = requests.delete("http://linuxfr.org/")
>>> r = requests.patch("http://linuxfr.org/")
>>> r = requests.post("http://linuxfr.org/")
>>> r = requests.head("http://linuxfr.org/")
>>> r = requests.options("http://linuxfr.org/")

(17)

Request Python

Send data

data = {"first_name": "Richard", "second_name": "Stallman"}
r = requests.post("http://linuxfr.org", data=data)

Picture

file = {'file': open("photo.png", "rb")}
r = requests.post("http://linuxfr.org", files=file)

r.text      # Return the content as text (unicode)
r.content   # Return the content as bytes
r.json()    # Return the content parsed as JSON (a method call)
r.headers   # Return the response headers (dict-like)
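When `data` is a dict, requests sends it as a form-encoded body. The sketch below uses only the standard library to show what that encoding looks like for the dict above and how a server could decode it again; it illustrates the wire format and is not the internals of requests.

```python
from urllib.parse import urlencode, parse_qs

data = {"first_name": "Richard", "second_name": "Stallman"}

# Form encoding: key=value pairs joined with '&'
body = urlencode(data)
print(body)

# The receiving side decodes the same string; each key maps to a list of values
decoded = parse_qs(body)
print(decoded)
```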

(18)

Resources

URI/Resource based:

ex: Facebook Graph API:

GET: /{photo-id} to retrieve the info of a photo
GET: /{photo-id}/likes to retrieve the people who like it
POST: /{photo-id} to update the photo
DELETE: /{photo-id} to delete the photo

ex: Google Calendar API:

GET: /calendars/{calendarId} to retrieve the info of a calendar
PUT: /calendars/{calendarId} to update a calendar
DELETE: /calendars/{calendarId} to delete a calendar
POST: /calendars to create a calendar
GET: /calendars/{calendarId}/events/{eventId} to retrieve an event of a calendar
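The calendar routes above can be read as a table from (method, URI) to action. A minimal in-memory sketch of that mapping follows; `CalendarStore`, its id scheme and the exact status codes (201, 204, 405) are assumptions made for the example and are not taken from the Google Calendar API.

```python
class CalendarStore:
    """Hypothetical in-memory stand-in for a resource-based API."""

    def __init__(self):
        self.calendars = {}   # calendarId -> calendar data
        self.next_id = 1

    def handle(self, method, path, body=None):
        """Map an HTTP method and a URI path to an action; return (status, payload)."""
        parts = [p for p in path.split("/") if p]
        if not parts or parts[0] != "calendars" or len(parts) > 2:
            return 404, {"error": "not found"}

        if len(parts) == 1:                      # the collection: /calendars
            if method != "POST":
                return 405, {"error": "method not allowed"}
            calendar_id = str(self.next_id)
            self.next_id += 1
            self.calendars[calendar_id] = dict(body or {})
            return 201, {"id": calendar_id}

        calendar_id = parts[1]                   # one resource: /calendars/{calendarId}
        if calendar_id not in self.calendars:
            return 404, {"error": "not found"}
        if method == "GET":
            return 200, self.calendars[calendar_id]
        if method == "PUT":
            self.calendars[calendar_id] = dict(body or {})
            return 200, self.calendars[calendar_id]
        if method == "DELETE":
            del self.calendars[calendar_id]
            return 204, None
        return 405, {"error": "method not allowed"}


store = CalendarStore()
print(store.handle("POST", "/calendars", {"summary": "ProgRes"}))
print(store.handle("GET", "/calendars/1"))
print(store.handle("PUT", "/calendars/1", {"summary": "ProgRes 2018"}))
print(store.handle("DELETE", "/calendars/1"))
print(store.handle("GET", "/calendars/1"))
```

The same URI appears with several methods: the URI names the resource, the method names what to do with it.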


(19)

Response

HTTP Response:

200: OK
3xx: Redirection
404: Not Found (4xx: something went wrong with what you tried to access)
5xx: Server Error

API Response:

Flickr:

{ "stat": "fail", "code": 1, "message": "User not found" }
{ "galleries": { ... }, "stat": "ok" }

Google Calendar:

{ "error": {"code": 403, "message": "User Rate Limit Exceeded" } }
{ "kind": "calendar#events", "summary": ..., "description": ...
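A client therefore has two things to check: the HTTP status code, and any error reported inside the body by the API itself. A minimal sketch, assuming a Flickr-style body with a `stat` field as in the example above; the function names are hypothetical.

```python
import json


def status_class(code):
    """Classify an HTTP status code by its first digit."""
    if 200 <= code < 300:
        return "success"
    if 300 <= code < 400:
        return "redirection"
    if 400 <= code < 500:
        return "client error"
    if 500 <= code < 600:
        return "server error"
    return "other"


def flickr_style_result(body_text):
    """Return (ok, message) for a JSON body that carries a 'stat' field."""
    payload = json.loads(body_text)
    if payload.get("stat") == "ok":
        return True, None
    return False, payload.get("message")


print(status_class(200), status_class(404), status_class(503))
print(flickr_style_result('{ "stat": "fail", "code": 1, "message": "User not found" }'))
print(flickr_style_result('{ "galleries": {}, "stat": "ok" }'))
```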

(20)

Response

Content-Type:

text/plain

text/html

text/xml or application/xml

application/json

image/png

...
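The Content-Type header tells the client how to interpret the body. A minimal sketch of choosing a parser from that header; `parse_body` is a hypothetical helper and only covers the text types listed above.

```python
import json
import xml.etree.ElementTree as ET


def parse_body(content_type, body):
    """Pick a parser from the Content-Type header value; body is a str."""
    # Drop parameters such as '; charset=utf-8' and normalise the case
    media_type = content_type.split(";")[0].strip().lower()
    if media_type == "application/json":
        return json.loads(body)
    if media_type in ("text/xml", "application/xml"):
        return ET.fromstring(body)
    return body  # text/plain, text/html, ...: keep the text as it is


print(parse_body("application/json; charset=utf-8", '{"stat": "ok"}'))
print(parse_body("text/xml", "<status>OK</status>").text)
print(parse_body("text/plain", "hello"))
```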

(21)

Python: JSON and XML Parsing

(22)

JSON Parsing

Use the json package:

>>> import json
>>> obj = json.loads('{"attr1": "v1", "attr2": 42}')
>>> obj['attr1']
'v1'
>>> obj['attr2']
42
>>> obj = {'id': 1, 'data': [1, 2, 3, 4]}
>>> json.dumps(obj)  # returns a string
'{"id": 1, "data": [1, 2, 3, 4]}'

(23)

Convert JSON to Python Object (Dict)

Use the json package:

import json

json_data = '{"name": "Marie", "city": "Paris"}'
python_obj = json.loads(json_data)  # parse the JSON text into a dict

print(python_obj["name"])
print(python_obj["city"])

Result

> python3 01_Json.py
Marie
Paris

(24)

Convert JSON to Python Object (List)

import json

json_data = '{"persons": [{"name": "Marie", "city": "Paris"}, {"name": "Pierre", "city": "Lyon"} ] }'
python_obj = json.loads(json_data)

# Pretty-print the parsed object with sorted keys
print(json.dumps(python_obj, sort_keys=True, indent=4))

Result

> python3 02_Json.py
{
    "persons": [
        {
            "city": "Paris",
            "name": "Marie"
        },
        {
            "city": "Lyon",
            "name": "Pierre"
        }
    ]
}

(25)

Convert JSON to Python Object

import json

json_input = '{"persons": [{"name": "Marie", "city": "Paris"}, {"name": "Pierre", "city": "Lyon"} ] }'

try:
    decoded = json.loads(json_input)
    # Access data
    for x in decoded['persons']:
        print(x['name'])
except (ValueError, KeyError, TypeError):
    # Invalid JSON text, or not the expected structure
    print("JSON format error")

Result

> python3 03_Json.py
Marie
Pierre

(26)

Use JSON file

import json

try:
    # Read and parse the file; the with block closes it afterwards
    with open('lang.json', encoding='utf-8') as f:
        data = json.load(f)
    # Access data
    for x in data['results']:
        print(x['formatted_address'])
except (ValueError, KeyError, TypeError):
    print("JSON format error")

Result

>python3 04_Json.py

25 Rue Cité Lang, 68560 Hirsingue, France

25 Rue Raphaël Lang, 54500 Vandœuvre-lès-Nancy, France

(27)

XML Parsing

With xml.etree.ElementTree, xml.sax, or html.parser

import xml.etree.ElementTree as ET
tree = ET.parse('countryXML.xml')

xml.etree.ElementTree loads the whole file; you can then navigate the tree structure.

(28)

$ python
Python 2.7.10 (default, Feb 7 2017, 00:08:15)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.34)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import xml.etree.ElementTree as ET
>>> tree = ET.parse('countryXML.xml')
>>> root = tree.getroot()
>>> root.tag
'data'
>>> root.attrib
{}
>>> for child in root:
...     print child.tag, child.attrib
...
country {'name': 'Liechtenstein'}
country {'name': 'Singapore'}
country {'name': 'Panama'}
>>> root[0][1].text
'2008'
>>> for n in root.iter('neighbor'):
...     print n.attrib
...
{'direction': 'E', 'name': 'Austria'}
{'direction': 'W', 'name': 'Switzerland'}
{'direction': 'N', 'name': 'Malaysia'}
{'direction': 'W', 'name': 'Costa Rica'}
{'direction': 'E', 'name': 'Colombia'}
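The session above is a Python 2 transcript and depends on a file that is not shown. The sketch below repeats the same navigation in Python 3 on an inline document; the XML text is an assumed, shortened stand-in for countryXML.xml, so the rank values and the second year are illustrative only.

```python
import xml.etree.ElementTree as ET

# Assumed, shortened stand-in for countryXML.xml (the real file is not in the slides)
COUNTRY_XML = """<data>
  <country name="Liechtenstein">
    <rank>1</rank>
    <year>2008</year>
    <neighbor name="Austria" direction="E"/>
    <neighbor name="Switzerland" direction="W"/>
  </country>
  <country name="Singapore">
    <rank>4</rank>
    <year>2011</year>
    <neighbor name="Malaysia" direction="N"/>
  </country>
</data>"""

root = ET.fromstring(COUNTRY_XML)          # parse from a string instead of a file

countries = [child.get("name") for child in root]   # direct children of <data>
first_year = root[0][1].text                         # second child of the first country
neighbors = [(n.get("name"), n.get("direction")) for n in root.iter("neighbor")]
singapore_rank = root.find("./country[@name='Singapore']/rank").text

print(root.tag, countries)
print(first_year)
print(neighbors)
print(singapore_rank)
```

Indexing (`root[0][1]`) depends on element order, while `find` with a path keeps working if the order of children changes.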

(29)

Simple API for XML: SAX

import xml.sax

class MyHandler(xml.sax.ContentHandler):
    def __init__(self):
        xml.sax.ContentHandler.__init__(self)
        self.element_name2count = {}

    def startElement(self, name, attrs):
        # Called once for every opening tag: count the occurrences of each name
        self.element_name2count[name] = self.element_name2count.get(name, 0) + 1

filename = "lang.xml"
handler = MyHandler()
xml.sax.parse(filename, handler)

# Sort elements according to their count, highest first
to_sort = [(count, name) for name, count in handler.element_name2count.items()]
to_sort.sort(reverse=True)
for count, name in to_sort:
    print("%s: %d" % (name, count))

(30)

Simple API for XML: SAX

Result

type: 24
short_name: 14
long_name: 14
address_component: 14
lng: 6
lat: 6
viewport: 2
southwest: 2
result: 2
place_id: 2
partial_match: 2
northeast: 2
location_type: 2
location: 2
geometry: 2
formatted_address: 2
status: 1
GeocodeResponse: 1

(31)

Beautiful Soup (HTML parser)

import requests
from bs4 import BeautifulSoup

r = requests.get("https://fr.wikipedia.org/wiki/Beautiful_Soup")
soup = BeautifulSoup(r.content, "html.parser")
#print(soup)
print(soup.title)

> python3 06_BS.py

<title>Python — Wikipédia</title>
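For comparison, the same kind of title extraction can be sketched with the standard library's html.parser alone, without Beautiful Soup. The `TitleParser` class and the inline page are assumptions made for the example; a real page would come from `r.text`.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text found between <title> and </title>."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


page = "<html><head><title>Accueil - LinuxFr.org</title></head><body><p>Hi</p></body></html>"
parser = TitleParser()
parser.feed(page)
print(parser.title)
```

The event-driven style is more verbose than `soup.title`, which is one reason to prefer Beautiful Soup for anything beyond small extractions.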

(32)

Regular Expressions

(33)

Regular Expressions

Regular expressions are a powerful language for matching text patterns.

The Python "re" module provides regular expression support.

>>> import re

>>> re.findall("([0-9]+)", "Bonjour 111 Aurevoir 222")
['111', '222']

(34)

Regular Expressions

• a, X, 9, < -- ordinary characters just match themselves exactly. The meta-characters which do not match themselves because they have special meanings are: . ^ $ * + ? { [ ] \ | ( ) (details below)

• . (a period) -- matches any single character except newline '\n'

• \w -- (lowercase w) matches a "word" character: a letter or digit or underbar [a-zA-Z0-9_]. Note that although "word" is the mnemonic for this, it only matches a single word char, not a whole word. \W (upper case W) matches any non-word character.

• \b -- boundary between word and non-word

• \s -- (lowercase s) matches a single whitespace character -- space, newline, return, tab, form feed [ \n\r\t\f]. \S (upper case S) matches any non-whitespace character.

• \t, \n, \r -- tab, newline, return

• \d -- decimal digit [0-9] (some older regex utilities do not support \d, but they all support \w and \s)

• ^ = start, $ = end -- match the start or end of the string

• \ -- inhibit the "specialness" of a character. So, for example, use \. to match a period or \\ to match a backslash. If you are unsure if a character has special meaning, such as '@', you can put a backslash in front of it, \@, to make sure it is treated just as a character.
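A few of the rules above, tried on a short sample string. This is a minimal sketch: the sample text is made up for the example, and raw strings (`r"..."`) are used so that the backslashes reach the re module unchanged.

```python
import re

text = "Bonjour 111 Aurevoir 222. Fin"

digits = re.findall(r"\d+", text)                 # runs of decimal digits
words = re.findall(r"\w+", text)                  # runs of word characters
starts_with_a = re.findall(r"\bA\w*", text)       # words starting with 'A' at a word boundary
at_start = re.search(r"^Bonjour", text) is not None   # ^ anchors at the start
at_end = re.search(r"Fin$", text) is not None          # $ anchors at the end
literal_dots = re.findall(r"\.", text)            # escaped period: a literal '.'
any_chars = re.findall(r".", "a.b")               # unescaped period: any character
fields = re.split(r"\s+", "a  b\tc\nd")           # split on runs of whitespace

print(digits)
print(words)
print(starts_with_a)
print(at_start, at_end)
print(literal_dots, any_chars)
print(fields)
```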

(35)

Regular Expressions

Extract Email Information:

lelia.blin@lip6.fr

([^@]+)@([^@]+)

[^@]  a character that is not the at symbol
@     the at symbol
+     at least one of this character

>>> m = re.match('([^@]+)@([^@]+)', 'lelia.blin@lip6.fr')
>>> m.group(1)
'lelia.blin'
>>> m.group(2)
'lip6.fr'

(36)

Create an API with Python

(37)

Create an API

Available libraries/frameworks in Python:

Django: Powerful web framework with a lot of modules. Great to build a complete website.

Flask: Small framework to build simple websites.

Bottle: Similar to Flask, but even simpler. Perfect to build an API.

(38)

Create an API

The Bottle Framework (single file module, no dependencies)

Routing: Requests to function-call mapping with support for clean and dynamic URLs.

Templates: Fast and pythonic built-in template engine.

Utilities: Convenient access to form data, file uploads, cookies, headers and other HTTP-related metadata.

Server: Built-in HTTP development server and support for other WSGI-capable HTTP servers. (WSGI is the Web Server Gateway Interface, a specification of the interface between web servers and Python web applications.)
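To show what the WSGI interface mentioned above looks like underneath a framework such as Bottle, here is a minimal sketch using only the standard library. `hello_app` and `call_app` are hypothetical names; the application is called directly with a test environ, the way a server would call it, so no socket is opened.

```python
from wsgiref.util import setup_testing_defaults


def hello_app(environ, start_response):
    """A WSGI application: a callable taking the request environ and a start_response callback."""
    body = b"Hello world"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]  # an iterable of byte strings


def call_app(app, path):
    """Call a WSGI app the way a server would and return (status, body)."""
    environ = {"PATH_INFO": path}
    setup_testing_defaults(environ)   # fill in the other required environ keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body


result = call_app(hello_app, "/hello")
print(result)
```

To serve the same application over HTTP, the standard library offers `wsgiref.simple_server.make_server("localhost", 8080, hello_app).serve_forever()`.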

(39)

Create an API

Hello world example:

from bottle import route, run

@route('/hello')
def hello():
    return 'Hello world'

run(host='localhost', port=8080)

(40)

>python3 07_Hello.py

Bottle v0.12.13 server starting up (using WSGIRefServer())...

Listening on http://localhost:8080/

Hit Ctrl-C to quit.

http://localhost:8080/hello

(41)

Id in URL

from bottle import route, run

# <name> is a dynamic part of the URL, passed to the function as an argument
@route('/hello/<name>')
def hello(name):
    return 'Hello ' + name

run(host='localhost', port=8080)

File: 08_HelloName.py

URL: http://localhost:8080/hello/Marie

(42)

Id in URL

from bottle import route, run

@route('/hello/<name>')
def hello(name):
    return 'Hello ' + name
# http://localhost:8080/hello/Marie

@route('/bonjour/<name>')
def bonjour(name):
    return 'Bonjour ' + name
# http://localhost:8080/bonjour/Marie

@route('/buenas/<name>')
def buenas(name):
    return 'Buenos dias ' + name
# http://localhost:8080/buenas/Marie

run(host='localhost', port=8080)

File: 09_HelloPL.py

(43)

Query parameter in URL

from bottle import Bottle, run, request

app = Bottle()

@app.route('/jemesure')
def jemesure():
    # request.params.taille reads the "taille" (height) query parameter
    return "Je mesure " + request.params.taille + " cm"

run(app, host='localhost', port=8080)  # add reloader=True to restart on code changes

File: 10_Taille.py

URL: http://localhost:8080/jemesure?taille=133

(44)

Static content

#!/usr/bin/env python
# -*- coding: utf-8 -*-

from bottle import Bottle, run, static_file

app = Bottle()

# Serve any file below the current directory under /static/
@app.route('/static/<filename:path>')
def server_static(filename):
    return static_file(filename, root='.')

run(app, host='localhost', port=8080, reloader=True)

File: 11_Img.py

URL: http://localhost:8080/static/cube.png

(45)

Open Data

(46)

Open Data

Publicly available API / Dataset about:

Education

Public Transport

Economy

Sport Results
