• Aucun résultat trouvé

We discussed that the challenges related to linked open sensor data time series are not limited to the volume and velocity of the time series but also to their variety

interop-erability (IOP) challenges, which are crucial when combining air quality data from differ-ent sources as well as linking them to other datasets such as traffic or weather data. We addressed the different IOP levels, namely the legal, organisational, technical and seman-tic level. Linked Data facilitates IOP on both a technical and a semanseman-tic level. Context information, such as temperature, humidity and spatial information, enriches air quality data by re-using existing machine-readable RDF vocabularies. The principles of Linked Data make the data self-describing and machine-readable, which allows autonomous agents to reason on the sensor data. Linked Data builds upon the architecture of the World Wide Web and uses typed links between data entities from disparate sources, described using the Resource Description Framework.

Figure 25.Overview of the benchmark with the query response time (latency) of Linked Time Series Server, publishing the average sensor values in a time interval (hour) that has ended (b4).

The benchmark showed that the Linked Time Series approach lowers the cost for publishing air quality data and raises the availability because of a better caching strategy.

The results of the benchmark ascertain that—with an increasing number of clients—the LTS API has a lower CPU load than the FIWARE QL API. Additionally, as the load of the LTS is constant for historic data, the cost becomes predictable, which is crucial for public bodies that need to determine their expenses beforehand. Building on the strength of the World Wide Web—by using HTTP caching on the level of clients and servers—created not only a cost–benefit but also contributed to the stability of the air quality endpoints. The FIWARE QL API starts “sputtering” at a load of 400 clients (Figure24), in contrast with the LDF interface that still obtains good results at 10,000 requests (Figure25).

The bandwidth of the LDF endpoint is slightly higher, but the number of resources a single request consumes—such as the load on the CPU and memory—is significantly lower, due to the cache re-use. The cache re-use is the ratio of items that are requested more than once from the cache instead of consuming server resources. The literature is consistent with our findings from limiting the server interface that becomes more scalable due to Web caching [65,69]. The scenario that calculates the average sensor values in a time interval that has ended demonstrates that a part of the workload is shifted to the client but is still acceptable for the client. A non-measurable benefit is that the client can evaluate any given query on the client side, without having to rely on server-side functionality other than downloading the right fragments. Next, the Open World Assumption (OWA) becomes applicable: more data can always be downloaded in order to obtain a more precise answer [89,90].

We discussed that the challenges related to linked open sensor data time series are not limited to the volume and velocity of the time series but also to their variety interoperability (IOP) challenges, which are crucial when combining air quality data from different sources as well as linking them to other datasets such as traffic or weather data. We addressed the different IOP levels, namely the legal, organisational, technical and semantic level. Linked Data facilitates IOP on both a technical and a semantic level. Context information, such as temperature, humidity and spatial information, enriches air quality data by re-using existing machine-readable RDF vocabularies. The principles of Linked Data make the data self-describing and machine-readable, which allows autonomous agents to reason on the sensor data. Linked Data builds upon the architecture of the World Wide Web and uses typed links between data entities from disparate sources, described using the Resource Description Framework.

Based on the above, we have three recommendations about archivability, indexing and interoperability for future research. First, it is crucial to ensure that time series are

Data2021,6, 93 27 of 32

still accessible and usable for future generations, as they are valuable for research (e.g., on environmental changes). Therefore, a strategy that outlines which subsets of the data should be preserved in order to reduce the storage cost of the data in a digital archive needs to be defined. Second, research should consider a more dynamic method to fragment and index different types of time series, considering the available budget of the publisher.

Finally, on the level of interoperability, future research should explore how to bridge between the NGSI—that redefines a knowledge representation in its own—and the existing semantic assets.

Although our results are promising, there are limitations to our research. First, as the scenarios are limited, further validation of this method in a large variety of use cases and different types of sensor data time series is necessary to extrapolate our conclusions.

Second, we should evaluate this method for real-time data streams.

During this research, we determined extra challenges related to cross-domain interop-erability and architectural flexibility, which we discuss in the next paragraph.

5.2. Railway Infrastructure Data

Similar to public authorities publishing sensor-based data, ERA seeks to publish the European railway infrastructure data in an interoperable way to maximize re-use by different applications serving diverse use cases. To increase interoperability, the ERA follows the Linked Data principles and publishes a knowledge graph containing the different elements that conform to the railway infrastructure, while including semantic annotations.

We showed that publishing railway infrastructure data is feasible with an LDF-based approach, bringing the same cost-efficiency and scalability benefits as the ones measured by our benchmark on air quality sensor data. In this case, we applied the same principles of well-known vector tiling strategies for geospatial data, but we further enriched the tiles with semantic annotations based on hypermedia description vocabularies (hydra and TREE). Such annotations enable client applications to interpret the data interfaces and to traverse the underlying knowledge graph.

Moreover, our proposed data publishing approach enables route calculations to be performed on the client side, while aiming to increase the cacheability of the server re-quests and including semantic annotations describing the data interfaces via hypermedia controls [91]. This architectural design allows for greater flexibility in client application design at the cost of increased complexity for application implementation. Developers are able to implement and customize any kind of algorithm and business logic, tailored to their use cases. This is in contrast to the dedicated server-side solution systems that limit the application to the supported capabilities and features.

6. Conclusions

In this article, we presented insights on the implementation of a sustainable method for publishing open data, in a specific sensor data time series on air quality and railway infrastructure data.

Our study demonstrates how Linked Data can supportinteroperabilityat the technical and semantical level for an air quality time series. Linked Data principles not only provide interoperability towards external stakeholders but also foster a more sustainable and cost-effective architecture. We have shown that exposing Linked Data Fragments for a sensor data time series lowers the publishers’ server cost compared with the FIWARE QuantumLeap API. By monitoring the query response time of the client, we showed that the Linked Time Series (LTS) interface can support more clients (10,000 versus 400 clients with historic data), while having a lower latency. Especially for the time series queries where the time interval has ended, HTTP caching lowers the server CPU cost significantly.

As such, the LTS interface can serve as a valuable extension of the FIWARE stack to provide scalabilitywhen publishing data on the Web.

The pilot on railway infrastructure data shows that this architectural approach—which shifts the data integration to the client—facilitatesflexibility. Clients are able to perform any further processing on the data to support specific use cases. In our case, the client implements a shortest path finding algorithm, which can be tailored and adjusted for more specific needs unlike traditional server-side APIs, which limit clients to their supported features and algorithms. Moreover, context can be added at the client level without the need for rewiring the server, which often affects the entire ecosystem. However, such architectural design imposes a heavier burden on the clients, which may be reflected as poorer performance on query solving tasks. Optimized caching strategies are required to improve the overall performance of the addressed use cases.

The main difference between the two different explored use cases, lies in the nature of the data fragmentations that were applied to each case. On the one hand, for the air quality data, a time-based fragmentation was applied, which was aligned with the nature of the data and with the query requirements that needed to be supported. On the other hand, for the railway infrastructure data, we applied a geospatial fragmentation that again suited the query requirements brought forth by the use case we wanted to support.

Besides the difference in fragmentation criteria, both approaches are built following the same architectural principles, where caching plays a fundamental role in reducing the operational costs of data interfaces. We, thus, showed how this approach can be applied to independent domains and over different types of data and demonstrated how data providers can develop a sustainable method for publishing open data.

According to President von der Leyen, common data spaces are an enabler for in-novation and new jobs [8]. Interoperability within and between European common data spaces will be crucial to avoid data silos and increase the re-use of data [92,93]. We hope that insights from this article can speed-up the process of opening datasets by public and private organisations on a Web-scale and can catalyse the European Data Strategy. As such, our contributions, which build upon the principles of Linked Open Data, can be valuable for governments, organisations and researchers wanting to publish interoperable data on a Web-scale in a cost-efficient and flexible way.

Author Contributions:Conceptualization, R.B. and D.V.L.; methodology, E.V. and R.B.; software, B.V.d.V. and J.R.M.; validation, R.B. and B.V.d.V.; writing—original draft preparation, R.B., D.V.L. and J.R.M.; writing—review and editing, P.C., P.M. and M.V.C.; visualization, R.B. and J.R.M.; supervision, P.M., S.L. and E.M. All authors have read and agreed to the published version of the manuscript.

Funding:This research received no external funding.

Institutional Review Board Statement:Not applicable.

Informed Consent Statement:Not applicable.

Acknowledgments:The research activities presented in this article were funded by imec—IDLab Ghent University, Belgium and imec City of Things, Belgium. We also would like to thank Quincy Oeyen from the Agency for Facility Operations (responsible for the Digital Archive Flanders), Christophe Stroobants from the Flanders Environment Agency, Ziggy Vanlishout from the Dig-ital Flanders Agency and Philippe Michiels from imec City of Things for their valuable insights. We also extend our gratitude to the team at ERA for facilitating access to their data. Finally, we would like to thank Esther De Loof and Arthur Macpherson for proofreading this manuscript.

Conflicts of Interest:The authors declare no conflict of interest.

References

1. United Nations.2018 Revision of World Urbanization Prospects; United Nations: New York, NY, USA, 2018.

2. Chourabi, H.; Nam, T.; Walker, S.; Gil-Garcia, J.R.; Mellouli, S.; Nahon, K.; Pardo, T.A.; Scholl, H.J. Understanding Smart Cities:

An Integrative Framework. In Proceedings of the 2012 45th Hawaii international conference on system sciences, Maui, HI, USA, 4–7 January 2012; pp. 2289–2297.

3. Nam, T.; Pardo, T.A. Smart city as urban innovation: Focusing on management, policy, and context. In Proceedings of the Proceedings of the 5th international Conference on Theory and Practice of Electronic Governance, Tallin, Estonia, 26–28 September 2011; pp. 185–194.

Data2021,6, 93 29 of 32

4. Mehta, Y.; Pai, M.M.M.; Mallissery, S.; Singh, S. Cloud enabled air quality detection, analysis and prediction-a smart city application for smart health. In Proceedings of the 2016 3rd MEC International Conference on Big Data and Smart City (ICBDSC), Muscat, Oman, 15–16 March 2016; pp. 1–7.

5. Castell, N.; Viana, M.; Minguillón, M.C.; Guerreiro, C.; Querol, X. Real-world application of new sensor technologies for air quality monitoring.ETC/ACM Tech. Pap.2013,16, 34.

6. Hasenfratz, D.; Arn, T.; de Concini, I.; Saukh, O.; Thiele, L. Health-optimal routing in urban areas. In Proceedings of the Proceedings of the 14th International Conference on Information Processing in Sensor Networks, Seattle, DC, USA, 13–16 April 2015; pp. 398–399.

7. Almarabeh, T.; Majdalawi, Y.K.; Mohammad, H. Cloud computing of e-government.Commun. Netw.2016,8, 1–8. [CrossRef]

8. Der Leyen, U. State of the union address by President von der Leyen at the European Parliament plenary.Retrieved Oct.2020,14, 2020.

9. European Commission.A European Strategy for Data; European Commission: Luxembourg, 2020.

10. Bokolo, A.J.; Petersen, S.A.; Ahlers, D.; Krogstie, J. Big data driven multi-tier architecture for electric mobility as a service in smart cities: A design science approach.Int. J. Energy Sect. Manag.2020,14, 1023–1047.

11. Escobar, P.; Roldán-Garcia, M.D.M.; Peral, J.; Candela, G.; Garcia-Nieto, J. An Ontology-Based Framework for Publishing and Exploiting Linked Open Data: A Use Case on Water Resources Management.Appl. Sci.2020,10, 779. [CrossRef]

12. Buyle, R.; Vanlishout, Z.; Coetzee, S.; De Paepe, D.; Van Compernolle, M.; Thijs, G.; Van Nuffelen, B.; De Vocht, L.; Mechant, P.;

De Vidts, B. Raising interoperability among base registries: The evolution of the Linked Base Registry for addresses in Flanders.J.

Web Semant.2019,55, 86–101. [CrossRef]

13. Carbonaro, A. Linked data and semantic web technologies to model context information for policy-making.J. Ambient Intell.

Humaniz. Comput.2019,12, 4395–4406. [CrossRef]

14. ETSI NGSI-LD.Context Information Management (CIM) and Application Programming Interface (API); ETSI: Sophia Antipolis, France, 2019.

15. Obe, R.; Hsu, L. PostGIS in action.GEO Inform.2011,14, 30.

16. Webber, J. A programmatic introduction to neo4j. In Proceedings of the 3rd Annual Conference on Systems, Programming, and Applications: Software for Humanity, Tucson, AZ, USA, 19–26 October 2012; pp. 217–218.

17. Bizer, C.; Heath, T.; Berners-Lee, T. Linked data: The story so far. InSemantic Services, Interoperability and Web Applications:

Emerging Concepts; IGI Global: Hershey, PA, USA, 2011; pp. 205–227.

18. Ballon, P.Smart Cities: Hoe Technologie Onze Steden Leefbaar Houdt en Slimmer Maakt; Lannoo: Meulenhoff, Belgium, 2016; ISBN 9401429456.

19. Makra, L.; Brimblecombe, P. Selections from the history of environmental pollution, with special attention to air pollution. Part 1.

Int. J. Environ. Pollut.2004,22, 641–656. [CrossRef]

20. Mieck, I. Aerem corrumpere non licet. Luftverunreinigung und Immissionsschutz Preußen bis zur Gewerbeordnung 1869.

Technikgeschichte1967,34, 36–78.

21. World Health Organisation.Ambient Air Pollution: A Global Assessment of Exposure and Burden of Disease; World Health Organisation:

Geneva, Switzerland, 2016.

22. Penza, M.; Suriano, D.; Villani, M.G.; Spinelle, L.; Gerboles, M. Towards air quality indices in smart cities by calibrated low-cost sensors applied to networks. In Proceedings of the Sensors, 2014 IEEE, Valencia, Spain, 2–5 November 2014; pp. 2012–2017.

23. European Commission. Digital Single Market: EU Negotiators Agree on New Rules for Sharing of Public Sector Data; European Comission: Luxembourg, 2019.

24. Vtvn. ‘Zonnekaart’ Opnieuw Online Na Teveel Bezoekers Op Dag Van Lancering. Available online:https://www.nieuwsblad.b e/cnt/dmf20170321_02791525(accessed on 26 July 2021).

25. Balazinska, M.; Deshpande, A.; Franklin, M.J.; Gibbons, P.B.; Gray, J.; Hansen, M.; Liebhold, M.; Nath, S.; Szalay, A.; Tao, V. Data management in the worldwide sensor web.IEEE Pervasive Comput.2007,6, 30–40. [CrossRef]

26. European Commission.New European Interoperability Framework (EIF); European Comission: Luxembourg, 2017. [CrossRef]

27. European Commission. European Interoperability Framework (EIF) for European Public Services: Annex 2; European Comission:

Luxembourg, 2010.

28. European Environment Agency.Air Quality in Europe—2018 Report; Publications Office of the European Union: Luxembourg, 2018.

29. Taylor, M.D. Low-cost air quality monitors: Modeling and characterization of sensor drift in optical particle counters. In Proceedings of the 2016 IEEE SENSORS, Orlando, FL, USA, 30 October–2 November 2016; pp. 1–3.

30. Nittel, S. Real-time sensor data streams.SIGSPATIAL Spec.2015,7, 22–28. [CrossRef]

31. Zimmerman, N.; Presto, A.A.; Kumar, S.P.N.; Gu, J.; Hauryliuk, A.; Robinson, E.S.; Robinson, A.L.; Subramanian, R. A machine learning calibration model using random forests to improve sensor performance for lower-cost air quality monitoring.Atmos.

Meas. Tech.2018,11, 291–313. [CrossRef]

32. European Central Bank.Payment’s Statistics: 2018; European Central Bank: Frankfurt, Germany, 2019.

33. Van Compernolle, M.; Buyle, R.; Mannens, E.; Vanlishout, Z.; Vlassenroot, E.; Mechant, P. Technology readiness and acceptance model” as a predictor for the use intention of data standards in smart cities.Media Commun.2018,6, 127–139.

34. Kubicek, H.; Cimander, R.; Scholl, H.J. Layers of interoperability. InOrganizational Interoperability in E-Government; Springer:

Berlin/Heidelberg, Germany, 2011; pp. 85–96.

35. Weber, R.H.Legal Interoperability as a Tool for Combatting Fragmentation; Global Commission on Internet Governance: Chatham, UK, 2014.

36. European Commission.European Interoperability Framework–Implementation Strategy; European Comission: Luxembourg, 2017.

37. Li, F.; Vögler, M.; Claeßens, M.; Dustdar, S. Efficient and scalable IoT service delivery on cloud. In Proceedings of the 2013 IEEE 6th International Conference on Cloud Computing, Santa Clara, CA, USA, 28 June–3 July 2013; pp. 740–747.

38. Gubbi, J.; Buyya, R.; Marusic, S.; Palaniswami, M. Internet of Things (IoT): A vision, architectural elements, and future directions.

Futur. Gener. Comput. Syst.2013,29, 1645–1660. [CrossRef]

39. Liu, J.; Li, Y.; Chen, M.; Dong, W.; Jin, D. Software-defined internet of things for smart urban sensing.IEEE Commun. Mag.2015, 53, 55–63. [CrossRef]

40. Su, X.; Riekki, J.; Nurminen, J.K.; Nieminen, J.; Koskimies, M. Adding semantics to internet of things.Concurr. Comput. Pract. Exp.

2015,27, 1844–1860. [CrossRef]

41. Hendler, J. Data integration for heterogenous datasets.Big Data2014,2, 205–215. [CrossRef]

42. Toma, I.; Simperl, E.; Hench, G. A joint roadmap for semantic technologies and the internet of things. In Proceedings of the 3rd STI Roadmapping Workshop, Crete, Greece, 1 June 2009; Volume 1, pp. 140–153.

43. Compton, M.; Barnaghi, P.; Bermudez, L.; GarcíA-Castro, R.; Corcho, O.; Cox, S.; Graybeal, J.; Hauswirth, M.; Henson, C.; Herzog, A. The SSN ontology of the W3C semantic sensor network incubator group.J. Web Semant.2012,17, 25–32. [CrossRef]

44. Klyne, G.; Carroll, J.M. Resource Description Framework (RDF): Concepts and Abstract Syntax. Available online: https:

//www.w3.org/TR/2004/REC-rdf-concepts-20040210/(accessed on 26 July 2021).

45. Fielding, R.T.; Gettys, J.; Mogul, J.; Frystyk, H.; Masinter, L.; Leach, P.; Berners-Lee, T.Hypertext Transfer Protocol—HTTP/1.1.

Request for Comments; The Internet Society: Reston, VA, USA, 1999.

46. Berners-Lee, T.; Fielding, R.; Masinter, L.Uniform Resource Identifier (URI): Generic Syntax. Request for Comments: 3986; The Internet Society: Reston, VA, USA, 2005.

47. De Souza, S.C.; Pereira Filho, J.G. Semantic interoperability in iot: A systematic mapping. InInternet of Things, Smart Spaces, and Next Generation Networks and Systems; Springer: Cham, Switzerland, 2019; pp. 53–64.

48. Barbosa, L.; Pham, K.; Silva, C.; Vieira, M.R.; Freire, J. Structured open urban data: Understanding the landscape.Big Data2014,2, 144–154. [CrossRef]

49. Haller, A.; Janowicz, K.; Cox, S.J.D.; Lefrançois, M.; Taylor, K.; Le Phuoc, D.; Lieberman, J.; García-Castro, R.; Atkinson, R.; Stadler, C. The modular SSN ontology: A joint W3C and OGC standard specifying the semantics of sensors, observations, sampling, and actuation.Semant. Web2019,10, 9–32. [CrossRef]

50. Janowicz, K.; Haller, A.; Cox, S.J.D.; Le Phuoc, D.; Lefrançois, M. SOSA: A lightweight ontology for sensors, observations, samples, and actuators.J. Web Semant.2019,56, 1–10. [CrossRef]

51. European Commission.INSPIRE Maintenance and Implementation Group (MIG). Guidelines for the Use of Observations & Measurements and Sensor Web Enablement-Related Standards in INSPIRE; European Commission: Luxembourg, 2008.

52. Barnaghi, P.; Presser, M.; Moessner, K. Publishing linked sensor data. In Proceedings of the 3rd International Workshop on Semantic Sensor Networks CEUR Workshop, Organised in Conjunction with the International Semantic Web Conference, Shanghai, China, 7 November 2010; Volume 668.

53. European Commission.Directive 2008/50/EC of the European Parliament and of the Council of 21 May 2008 on Ambient Air Quality and Cleaner Air for Europe; European Comission: Luxembourg, 2008.

54. Minghini, M.; Cetl, V.; Kotsev, A.; Tomas, R.; Lutz, M. INSPIRE: The entry point to Europe’s Big Geospatial Data Infrastructure. In Handbook of Big Geospatial Data; Springer: Berlin/Heidelberg, Germany, 2020.

55. Abbas, A.; Privat, G. Bridging Property Graphs and RDF for IoT Information Management. In Proceedings of the Workshop on Scalable Semantic Web Knowledge Base Systems co-located with 17th International Semantic Web Conference, Monterey, CA, USA, 9 October 2018; pp. 77–92.

56. Haller, A.; Janowicz, K.; Cox, S.; Le Phuoc, D.; Taylor, K.; Lefrançois, M. Semantic Sensor Network Ontology. Available online:

https://www.w3.org/TR/vocab-ssn/(accessed on 26 July 2021).

57. European Commission. Commission Implementing Regulation (EU) 2019/773 of 16 May 2019 on the Technical Specification for Interoperability Relating to the Operation and Traffic Management Subsystem of the Rail System within the European Union and Repealing Decision; European Commission: Luxembourg, 2019.

58. Łukasik, Z.; Nowakowski, W.; Ciszewski, T. Definition of data exchange standard for railway applications.Pr. Nauk. Politech.

Warsz. Transp.2016,113, 319–326.

59. Ciszewski, T.; Nowakowski, W.; Chrzan, M. RailTopoModel and RailML–data exchange standards in railway sector.Arch. Transp.

Syst. Telemat.2017,10, 10–15.

60. Bischof, S.; Schenner, G. Towards a Railway Topology Ontology to Integrate and Query Rail Data Silos. InISWC; Siemens Corporate Technology: Vienna, Austria, 2020.

61. Sarretta, A.; Minghini, M.; Napolitano, M.; Kotsev, A.; Mooney, P. OpenStreetMap: An opportunity for Citizen Science.Collect.

61. Sarretta, A.; Minghini, M.; Napolitano, M.; Kotsev, A.; Mooney, P. OpenStreetMap: An opportunity for Citizen Science.Collect.

Documents relatifs