Valentine in action: matching tabular data at scale

Christos Koutras, Kyriakos Psarakis, George Siachamis, Andra Ionescu, Marios Fragkoulis, Angela Bonifati, Asterios Katsifodimos

Published in VLDB Demo, 2021

Our recently proposed Valentine is the first system to offer an open-source experiment suite to organize, execute and orchestrate large-scale matching experiments. In this demonstration we present its functionalities and enhancements: i) a scalable system, with a user-centric GUI, that enables the fabrication of datasets and the evaluation of matching methods on schema matching scenarios tailored to the scope of tabular dataset discovery, ii) a scalable holistic matching system that can receive tabular datasets from heterogeneous sources and provide with similarity scores among their columns, in order to facilitate modern procedures in data lakes, such as dataset discovery.

  • Bibtex:
    @article{koutras2021valentine,
    title={Valentine in action: matching tabular data at scale},
    author={Koutras, Christos and Psarakis, Kyriakos and Siachamis, George and Ionescu, Andra and Fragkoulis, Marios and Bonifati, Angela and Katsifodimos, Asterios},
    journal={Proceedings of the VLDB Endowment},
    volume={14},
    number={12},
    pages={2871--2874},
    year={2021},
    publisher={VLDB Endowment}
    }