• Aucun résultat trouvé

Integration of a Flexible Analytics Workbench with a Learning Platform for Medical Specialty Training

N/A
N/A
Protected

Academic year: 2022

Partager "Integration of a Flexible Analytics Workbench with a Learning Platform for Medical Specialty Training"

Copied!
6
0
0

Texte intégral

(1)

Integration of a Flexible Analytics Workbench with a Learning Platform for Medical Specialty Training

Tilman Göhnert

University of Duisburg-Essen Lotharstr. 63/65 47048 Duisburg, Germany

goehnert@collide.info

Sabrina Ziebarth

University of Duisburg-Essen Lotharstr. 63/65 47048 Duisburg, Germany

ziebarth@collide.info

Per Verheyen

University of Duisburg-Essen Lotharstr. 63/65 47048 Duisburg, Germany

verheyen@collide.info H. Ulrich Hoppe

University of Duisburg-Essen Lotharstr. 63/65 47048 Duisburg, Germany

hoppe@collide.info

ABSTRACT

In this paper we present a generic and extensible analytics workbench and show how it can be connected to learning platforms in order to analyze the activities on the platform from different perspectives. We show that as the analytics workbench already supports a wide range of analyses like network analysis, statistical analysis, and analysis of activity logs, the main effort needed for connecting a learning plat- form to it lies in transporting the log data of the platform into the workbench. However the analytics workbench is also designed for extensibility so if desired more specific analysis capabilities can be added to it easily. We present an analysis of the online platform of the KOLEGEA research project for supporting medical doctors in their specialty training as a case example for how the integration can be done, for what kind of analyses can be conducted, and for how the analysis results can help both the developers and the operators of such a learning platform.

Categories and Subject Descriptors

D.2.13 [Software Engineering]: Reusable Software; D.2.11 [Software Engineering]: Software Architectures; K.3.1 [Computers and Education]: Computer Uses in Edu- cation

General Terms

Design, Measurement, Experimentation, Human Factors

1. INTRODUCTION

Software tools for supporting learning often do not provide support for learning analytics. If there is analytic support (e.g. in larger open source projects like Moodle1 or com- mercial solutions), it is mostly integrated into the specific learning environment and cannot be easily reused for other environments. Thus, much effort is invested in implementing even basic analytical features again and again for different learning environments.

In the context of web analytics the need for simple and eco- nomic solutions for getting (statistical) information about

1http://moodle.com/

the visitors’ behavior on web pages led to generic tools like Piwik2 or Google Analytics3, which can be easily applied to arbitrary web pages and portals. Regarding learning ana- lytics there are first approaches for generic analysis environ- ments. For example, the LeMo application provides tools for monitoring of learning processes on arbitrary learning management systems (LMS) [1]. However, for using these tools the contents and logs of the LMS that should be an- alyzed have to be transformed into the comprehensive and very specific LeMo data base structures. There are only few connectors, e.g. for Moodle or Clix, thus for most LMS there would be much effort to transform the data. Further- more, LeMo is focussed on analysis of LMS, which are only one aspect of learning environments. In general, the essen- tial challenge is the flexible integration of a generic analysis module with different types of learning environments.

We have developed an analytics workbench4 [5], which cov- ers a wide spectrum of analyses and can be easily applied to a wide range of data. This includes data from arbitrary learn- ing environments, since lots of standard data formats are supported and the internal data representations are simple and generic, thus most data can be mapped. The workbench can be used exploratively by analysts, but it can also be used for performing stored analysis workflows automatically and thus for providing analysis results for other user groups [7].

In this paper, we present the analytics workbench and show how it can be easily coupled with an existing learning plat- form and used for meaningful analyses regarding different target groups. As a case example we use the integration of the analytics workbench with the KOLEGEA occupational learning platform for doctors in specialty training [10].

2. THE ANALYTICS WORKBENCH

Major parts of the analytics workbench as it is now have been developed in the context of the SiSOB5 project. This project, funded by the European Community under the Sci- ence in Society (SIS) theme, aimed at measuring impact of

2http://piwik.org/

3http://www.google.de/intl/en/analytics/

4http://workbench.collide.info/

5http://sisob.lcc.uma.es/

(2)

science and research on society. The overall goal of the work- bench development in this project was building a generic and extensible analysis framework with an integrated user inter- face that would enable even non-computer experts to access the full analytical power behind the tool and that would also allow reusing and sharing the created analysis workflows.

As depicted in Figure 1 the workbench offers a web-based user interface for designing analysis processes. The work- flows are represented in a visual language based on a pipes- and-filters metaphor, in which modules of the language rep- resent analysis steps and links between these modules de- scribe the data flow. In this representation workflows can be stored, loaded, and also shared with other users of the tool. This user interface is backed by a multi-agent system and each of the modules in the visual language corresponds to one agent in the backend.

A wide range of analysis modules are currently available.

Apart from quite generic components for putting data into the workbench, retrieving data from the workbench, or du- plicating data for splitting a workflow into parallel branches, there are also over ten different modules connected to pro- cessing and analyzing graphs (including several modules for community detection and a module offering a wide range of centrality measures), around ten modules connected to processing and analyzing activity logs (including modules for deriving statistical information, creating networks, and doing sequence analysis), and a wide range of different visu- alizations for both graphs and statistical information.

The evaluation studies of the analytics workbench conducted during the SiSOB project6showed that researchers from dif- ferent fields working in the area of scientometric research experienced the workbench as a tool they could imagine to use in their day-to-day work. One aspect that was specifi- cally highlighted were the advantages of the explicit work- flow representation in the pipes-and-filters metaphor. The study participants saw re-usability of workflows, the possi- bility to exchange workflows with other researchers, and the possibility to communicate about the workflows based on this representation as something that would be very ben- eficial for themselves. First results of an evaluation study conducted in the context of teaching social network analysis using the workbench in a master level university course con- firm these findings.

2.1 Data handling

In principle the formats used for data exchange between the individual analysis components may be chosen completely arbitrary as long as each agent in an analysis workflow un- derstands the input it is given. However in order to facilitate an easy data exchange between the individual agents, some formats have been chosen as main exchange formats. The format used for data tables and the format used for graph data both stem from the SiSOB project7. Both are very flexible JSON based formats which allow converting to and from almost any other data format available for these kinds of data.

6http://sisob.lcc.uma.es/repositorio/deliverables/

D63-Final.pdf

7http://sisob.lcc.uma.es/repositorio/deliverables/

SISOB-D52.pdf

Currently the workbench offers transformations between the SiSOB Graph Format (SGF) and the edge list graph format used by the Pajek8 network analysis tool, the adjacency matrix format of the UCINET9 tool, and GML10, a very widespread and expressive graph format, which can be used for example as input and output format of the igraph11net- work analysis library. The SiSOB Datatable Format (SDT) can be converted to and constructed from comma-separated value (CSV) files in order to allow data exchange with al- most any tool that can handle tabular data.

The third main data exchange format used in the workbench is the Activity Streams format12. It is also a JSON based format and offers a quite flexible representation for activity log data.

2.2 Architecture and extensibility

As already stated the workbench combines a web-based user interface with a multi-agent system as analysis backend. The system uses an SQLSpaces server [9] as communication and data exchange platform. The SQLSpaces system is an open source implementation of the “tuple space” [4] concept fo- cusing especially on ease of development using the server as communication platform for distributed multi-agent sys- tems and on the support of language-heterogeneity in these systems. Supported programming languages include Java, Python, and JavaScript. Figure 2 shows an overview of the architecture.

Figure 2: Architecture of the SiSOB analytics framework.

The communication between the elements of the analytics framework is based on simple tuple-based protocols. There are specific tuple formats for the different communication purposes with the most important being those for initiating and controlling analysis processes or those for data exchange between the agents.

While agents offering analysis techniques are the majority in the system, agents also control input and output of data to and from the analysis process. Therefore connecting the system to new data sources, feeding output of workflows to other systems, and adding new analysis techniques can

8http://pajek.imfm.si/doku.php?id=pajek

9https://sites.google.com/site/ucinetsoftware/

10http://www.fim.uni-passau.de/fileadmin/

files/lehrstuhl/brandenburg/projekte/gml/

gml-technical-report.pdf

11http://igraph.sourceforge.net/

12http://activitystrea.ms/

(3)

Figure 1: SiSOB analytics workbench user interface.

be realized by adding new agents to the analytics work- bench. Apart from the technical basis, the connection to the SQLSpaces, a newly developed agent only needs to comply with the tuple protocols used in the system. There is no fur- ther restriction regarding the structure or the functionality of the agents.

Another possible entry point for adding new functionality to the analytics workbench is theR-Analysis agent, which acts as a wrapper for scripts to be executed in R13, a lan- guage for statistical computing. As there is a wide range of R libraries for different analysis purposes including network analysis available, this allowed and still allows easy inte- gration of analysis features based on these libraries. One example for the integration of experimental or newly devel- oped analysis techniques based on R is the integration of a novel approach for multi-relational blockmodeling [5].

3. THE KOLEGEA LEARNING PLATFORM

The project KOLEGEA14, which is facilitated by the Ger- man Federal Ministry of Education and Research, aims to support doctors specializing in family medicine (general prac- tice (GP)) by providing a platform for collaborative learn- ing in occupational, social communities. It thus addresses the problem of missing opportunities for networking, which highly restricts occupational knowledge exchange as well as collaborative learning in peer communities among young doctors. This problem results amongst others from the many job changes which are required by the broad curriculum of family medicine. Since GP specialty training is based on learning by solving real problems/cases in every day work- ing life, learning is problem-based, self-directed and based on intrinsic motivation. Thus, KOLEGEA’s pedagogical ap- proach (see also [10]) is focused on collaboratively work- ing with user-generated cases in the spirit of problem-based

13http://www.r-project.org/

14http://www.kolegea.de/

learning (PBL) (see e.g. [2]). Users can share and discuss cases in self-regulated or mentor-supported small groups and share the results with the community. Cases can be enriched with media like pictures or videos, tags and links to medical guidelines or external online information like recent publica- tions. Apart from the work in small groups, there are tools for community support like forums, which can be used for occupational, but also social exchange.

The current version of the KOLEGEA system consists of a web portal15 and mobile applications (tablet16 and smart pen) for multimedia note taking and editing. Smart phone apps for (simple) note taking and accessing the system are currently in development.

The portal provides access to the virtual community and its knowledge artifacts (user-generated content as well as medi- cal guidelines and links to artifacts of current research). Fig- ure 3 shows an example of a case description, its discussion and metadata (e.g. tags). The case description is structured based on the phases of the medical consultation. Following Steinkuehler et al. [8], we consider group discussions to be asynchronous and threaded, since this can produce discus- sions and results of higher quality than synchronous com- munication and is more flexible regarding time. Discussions can take place directly connected to cases, but also in the community’s forums.

“An important issue in groupware and CSCW is awareness – generally having some feeling for what other people are doing or having been doing” [3], since “the success of a col- laborative learning experience depends on the informed in- volvement of curriculum designer, teacher, evaluator, and students” [6]. Each of these roles needs different aware- ness information. In KOLEGEA target groups for aware-

15https://beta.kolegea.org/kolegea/

16https://play.google.com/store/apps/details?id=

info.collide.kolegea.npad.views&hl=en

(4)

Figure 3: KOLEGEA Web Portal – case represen- tation with discussion.

ness information are the researchers designing and evaluat- ing the platform, the platform operators, the mentors and of course the trainees. Researchers and platform operators are interested in the success of the designed tools, processes and stimuli to further improve the system. Furthermore, platform operators need input for operating the platform like which case to select as ”case of the month” or which trainees have the potential for supporting (or ”tutoring”) self-regulated groups or even become mentors after becom- ing medical specialists. Mentors e.g. need information on how the members of their groups work together to support collaboration. Trainees might be interested in the activity of groups and their topics before asking for membership.

These are just a few examples for awareness needs in the KOLEGEA project.

4. INTEGRATION DESIGN

The analytics workbench provides many standard analysis modules for SNA and statistical analysis of log files. Further- more, it can be easily extended for specific analyses needed for the actors of the KOLEGEA system. Since it is more efficient to reuse existing software (modules) instead of de- veloping it from scratch (e.g. reduced implementation time, less bugs due to prior testing and usage) the analytics work- bench was coupled with the KOLEGEA system for perform- ing the needed data analysis.

In KOLEGEA all textual contents as well as the logging information are stored in a PostgreSQL data base, media files like images or videos are stored in the file system of the web server. While the data is directly accessed by the web

Figure 4: Planned coupling of KOLEGEA sys- tem (orange) and SISOB workbench (blue) – the backchannel from the workbench to the platform is not yet available.

platform, the mobile applications use a web service interface (REST) to write and access data. The mobile devices also send their log files to the central data base using the web service interface. The KOLEGEA logging format is propri- etary.

To integrate the analytics workbench with the KOLEGEA system the KOLEGEA WebServices were extended to pro- vide raw data for the analysis (log files as well as informa- tion on objects and users stored in the data base). This data is preprocessed by a specialized module which converts the KOLEGEA log file format into the Activity Streams format that is supported by the workbench. Furthermore, there is a specialized module for creating actor-artifact networks.

After this preprocessing, mainly generic modules for log ana- lysis and social network analysis provided by the workbench are used for analyzing the data.

At the moment the analysis results are manually compiled into reports for the project members. We plan to extend the KOLEGEA WebServices to receive analysis results that are stored in the internal data base and provided by the plat- form. The complete analysis process (see Figure 4) will then contain the following steps: Analysts define analysis work- flows in the workbench and save them in the tuple space. A scheduling agent takes the workflow definition and initiates its execution by writing the appropriate command tuples in defined intervals. In each workflow run specific input agents access the KOLEGEA WebServices to retrieve the data needed for the analysis. The specified analysis agents prepare and analyze this data and visualization agents cre- ate appropriate visual representations of the results (e.g. so- ciograms). The textual and visual results are transferred to the platform by an output agent, which accesses the KOLE- GEA WebServices. These save the results in the platform’s database to be retrieved and visualized on demand.

5. DATA ANALYSIS

The following sections describe exemplary analyses on differ- ent levels conducted with the analysis workbench to support different types of actors regarding the KOLEGEA system.

(5)

5.1 Effects of stimuli

Log statistics over time can be used to identify effects of stimuli. Figure 5 shows the number of distinct users per day in the “closed beta” phase of the project. Since the platform was only used by a closed group of users, there was no registration via the platform, but the login information was send via mail to the preselected test users. On this day, there was the highest amount of distinct users. After a week an activation mail was send to the trainees informing about new cases. This resulted in a higher number of distinct users.

Direct contacts with the test users on the DEGAM day of family medicine as well as the later activation mail do not seem to have had much effect.

Figure 5: Distinct users per day in “closed beta”.

This indicator was similarly used to evaluate the effects of events advertising KOLEGEA regarding registration num- bers in the “open beta” phase in January 2014. Further- more, we used login statistics for identifiying a good weekly date for sending activation emails with information on recent platform events.

5.2 Recommendations for “case of the month”

In KOLEGEA each month one case is selected as “case of the month” and depicted on the top of the welcome page.

While the case is ultimately selected by a group of doctors of the institute for family medicine of the Charit´e Berlin to guarantee the quality of its content, not all available cases should be considered to reduce their effort.

Figure 6: Indicators for “case of the month”.

Indicators for interesting cases are a high number of com- ments and read events as well as distinct users and distinct returning users accessing the case. While most of this infor- mation could have been gained by using the existing generic components for log statistics, to have all information in one

table a specific “KOLEGEA Case of the Month” component was implemented. This component is applied after loading the logfile and filtering (excluding) actors of the role “ad- min” by existing components. The results (see Figure 6) are displayed by the table viewer, which allows sorting of the table rows by selecting a column caption.

The statistics were provided to the doctors as a basis for se- lecting the “case of the month” at the end of January 2014 for the first time. They chose a case with many comments and medium accesses, which was created by a doctor in training.

5.3 Identification of user roles

While the previous examples are mainly based on descrip- tive statistics generated from logfiles, in this analysis the log information is aggregated into a network for network ana- lysis.

Figure 7: KOLEGEA community (three-mode net- work).

Figure 7 shows a three-mode network of the complete KO- LEGEA community containing actors (green dots), cases (blue squares) and forum threads (orange triangles). There are three different types of edges: green edges indicate that an actor only read an artifact, orange edges show that the actor at least added one comment or tag to a case/forum thread and blue edges identify the creators of the artifacts.

Thus, comments and tags are already aggregated in the edges.

Most actors only interact with artifacts by reading them, which is typical for online communities. The community has a core of actors being connected with many artifacts and a periphery of actors being only connected to few artifacts.

The core contains two mentors, a KOLEGEA team member and ten users. Users with a high degree centrality regarding reading edges and no or little writing edges seem to be very interested in the artifacts and might need just a little nudge to participate actively.

Regarding the identification of potential tutors and men- tors highly active users are interesting. Considering only

(6)

create (case/forum tread/comment) edges, the degree cen- trality respectively strength (weighted degree centrality) is an indicator for the level of activity in the platform. The the degree centrality of the cases in the multi-mode network corresponds to the number of distinct users in the “case of the month” statistics in section 5.2 .

Figure 8 shows an one-mode actor network, which is cre- ated by “folding” the actor-artifact network ignoring “read”- edges. From the point of view of data aggregation this fold- ing operation transforms the previous multi-mode network.

The nodes are sized regarding their betweenness centrality.

A high betweenness centrality is an indicator that the re- spective actors have a mediator role in the community, thus these might especially qualify for becoming tutors or men- tors. The four most central actors are three mentors and one user. This supports that betweenness centrality is a good in- dicator for identifying mentors/tutors. The high centrality of the mentors is a consequence of building a new commu- nity (”cold start problem”). Furthermore, user325 appears to be a good candidate for becoming a tutor.

Figure 8: KOLEGEA community (one-mode actor network).

6. SUMMARY AND PROSPECTS

In this paper, we have shown how a generic and extensi- ble analytics workbench can be integrated with an online learning platform for medical doctors in specialty training for performing analyses. We also have presented how the results of different analysis processes with different perspec- tives on the available data can be used to inform researchers developing online learning platforms and operators of such platforms. The analyses addressed miscellaneous levels of data aggregation ranging from basic log data to one-mode networks. There are relations between the analyses, e.g. the number of distinct users linked to a case can be extracted directly from the log file or as the degree centrality from a case-actor network. An actor-artifact network, which is based on a logfile, can be aggregated into an actor-actor network for getting a more condensed view on user interac- tions.

In our future work, we plan to extend the work presented here in several ways. One direction will be the extension of the workbench with further analysis modules focused on the analysis of activity logs. Another direction will be the application of more of the already implemented analysis ap-

proaches on the log data of online platforms, e.g. the avail- able sequence analysis modules. We also plan to build a collection of reusable analysis workflows that can be applied to different online platforms and that can support different target groups. In addition to researchers developing new ap- proaches for learning platforms and operators of such plat- forms, we also plan to support the users of the platforms directly with awareness features. A last goal for both the development of the analytics workbench and the KOLEGEA platform will be integrating the two in such a way that it is possible to directly view the results of the workflows in the platform and thus giving especially the users of the platform an easy access to the analysis results.

7. REFERENCES

[1] L. Beuster, M. Elkina, A. Fortenbacher, L. Kappe, A. Merceron, A. Pursian, S. Schwarzrock, and B. Wenzlaff. Lemo: An analytics tool for lmss as well as portals. Poster Presentation at the LAK 2013, 2013. Online available athttp://lemo.htw-berlin.

de/public/publication/LAK13_Beitrag.pdf, last checked on 2014-01-26.

[2] L. H. Distlehorst and H. S. Barrows. A new tool for problem-based, self-directed learning.Journal of Medical Education, (57):486–488, 1982.

[3] A. Dix, J. Finlay, G. Abowd, and R. Beale.

Human-Computer Interaction. Prentice Hall, 2004.

[4] D. Gelernter. Generative communication in linda.

ACM Trans. Program. Lang. Syst., 7(1):80–112, jan 1985.

[5] T. G¨ohnert, A. Harrer, T. Hecking, and H. U. Hoppe.

A workbench to construct and re-use network analysis workflows - concept, implementation, and example case. InProceedings of the 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, 2013.

[6] C. Gutwin, G. Stark, and S. Greenberg. Support for workspace awareness in educational groupware. InThe First International Conference on Computer Support for Collaborative Learning, pages 147–156, 1995.

[7] N. Malzahn, T. Ganster, N. Str¨afling, N. Kr¨amer, and H. U. Hoppe. Motivating students or teachers?

challenges for a successful implementation of

online-learning in industry-related vocational training.

InEC-TEL 2013, pages 191–204, 2013.

[8] C. A. Steinkuehler, S. J. Derry, D. K. Woods, and C. E. Hmelo-Silver. The step environment for distributed problem-based learning on the world wide web. InProceedings of the Conference on Computer Support for Collaborative Learning: Foundations for a CSCL Community, pages 217–226, 2002.

[9] S. Weinbrenner.SQLSpaces – A Platform for Flexible Language-Heterogeneous Multi-Agent Systems. PhD thesis, Universit¨at Duisburg-Essen, 2012.

[10] S. Ziebarth, A. K¨otteritzsch, H. U. Hoppe, L. Dini, S. Schr¨oder, and J. Novak. Design of a collaborative learning platform for medical doctors specializing in family medicine. InTo See the World and a Grain of Sand: Learning across Levels of Space, Time, and Scale: CSCL 2013 Conference Proceedings, pages 205–208, 2013.

Références

Documents relatifs

The Data Analysis WorkbeNch (DAWN) was developed to address the challenge of providing a single visualization and analysis platform for data from any synchrotron experiment

At the minimum, figures and tables used to communicate results in scientific publications are produced with the help of software, but more elaborate computational data

Les données d'une entreprise revêtent plusieurs formes : des nouvelles données transactionnelles sous forme de commandes entrantes à partir d'un site Internet, des

the knowledge embedding is to learn entities and relations in KGs as vector representations in the low-dimensional space. Such representations can be used in many applications, e.g.

Property shared token similarity relates the set of tokens of the labels of all direct property labels of two concepts: At first, all direct properties of a concept are de-

It was found that adding a visual clue (virtual ray) to fill the gap between the hand and the cursor lowers user performance, and also that using a 1.5 scale fac- tor to map

Akvo is a software project for processing, modelling, and inversion of surface nuclear magnetic resonance (sNMR) data which is being released as a resource for the community..

L’archive ouverte pluridisciplinaire HAL, est destinée au dépôt et à la diffusion de documents scientifiques de niveau recherche, publiés ou non, émanant des