Computing has been an enormous accelerator for science and industry alike, and it has led
to an information explosion in many different fields. The unprecedented volume of data
acquired by sensors, derived by simulations and analysis processes, and shared on the Web
opens up new opportunities, but it also creates many challenges when it comes to managing
and analyzing these data. In this talk, I discuss the importance of maintaining detailed
provenance (also referred to as lineage and pedigree) for digital data. Provenance provides
important documentation that is key to preserving data, to determining the data's quality
and authorship, and to understanding, reproducing, and validating results. Besides presenting
techniques we have developed to efficiently manage and re-use provenance information,
I will give an overview of the provenance infrastructure we have built for the open-source
VisTrails system (http://www.vistrails.org). I will also describe emerging applications and
novel uses of provenance for enabling collaborative data analysis, teaching science, and
publishing reproducible results.
Managing Provenance for Reproducibility and Beyond

Date: 22.06.2011
Time: 16:00 - 17:00
Location: "Alkiviades C. Payatakes" Seminar Room, FORTH, Heraklion, Crete
Host: Martin Doerr, Irini Fundulaki, Yannis Tzitzikas, ISL / ICS-FORTH
Juliana Freire is a Professor of Computer Science at the Polytechnic Institute of New York
University (NYU Poly). An important theme in Professor Freire's work is the development
of data management technology to address new problems introduced by emerging
applications, including the Web and e-Science. Her research interests include provenance,
scientific data management, information integration and Web mining. She is a recipient of
an NSF CAREER Award and an IBM Faculty Award. Her research has been funded by the National
Science Foundation, Department of Energy, National Institutes of Health, the University of
Utah, IBM, Microsoft and Yahoo!