
Automatic Web data collection from non-reactive sources by means of normative systems and Semantic Web Technologies

People

Leaders

Fornara N.

(Responsible)

Collaborators

Nguyen T. T. V.

(Collaborator)

Marfia F.

(Collaborator)

Abstract

Web-based data collection is becoming increasingly important in many social science fields. It is not restricted to Web surveys: it also includes non-reactive data, collected by means of various techniques from heterogeneous Web sources. A scientific methodology for Web-based data collection has not yet been developed. Such a methodology has to address two perspectives. For those who will analyse the data, the relevant components are data validity, reliability, and quality. For data providers, it is essential to be able to constrain access to their data and to know how the data will be stored and used. Guidelines covering these requirements are currently expressed in natural language. As a result, when large amounts of data are extracted automatically by specialized software, complying with those norms becomes very difficult. Developing new technologies to support Web-based data collection is therefore an important open issue. In this project we propose to tackle it by developing models and techniques to express Web-based data collection guidelines, rules, and policies at different levels of abstraction. We want to develop new techniques for automatic data extraction from the Web that rely on Semantic Web technologies and automatic reasoning to plan actions compliant with the guidelines. Finally, we plan to implement a demonstration tool that uses these technologies for Web-based data collection.

Additional information

Start date
01.01.2013
End date
30.06.2015
Duration
30 Months
Status
Ended

Publications