Jorge Carretero

Institute for High Energy Physics IFAE/PIC, Barcelona, Spain


Title: Massive cosmological data analysis, distribution and generation using a Big Data platform

Galaxy surveys require support from massive datasets in order to achieve precise estimates of cosmological parameters. The CosmoHub platform, a web portal for interactive analysis of massive cosmological data, and the SciPIC pipeline have been developed at the Port d’Informació Científica (PIC) to provide this support, achieving nearly interactive performance in the processing of multi-terabyte datasets.

Cosmology projects currently supported include the European Space Agency's Euclid space mission, the Dark Energy Survey (DES), the Physics of the Accelerating Universe (PAU) survey and the Marenostrum Institut de Ciències de l’Espai Simulations (MICE).

CosmoHub enables users to interactively explore and distribute data without any SQL knowledge. It is built on top of Apache Hive, part of the Apache Hadoop ecosystem, which facilitates reading, writing, and managing large datasets. More than 50 billion objects, from public and private data, as well as observed and simulated data, are available.

The SciPIC scientific pipeline has been developed to efficiently generate mock galaxy catalogs using a dark matter halo population as input. It runs on top of the Hadoop platform using Apache Spark, an open-source cluster-computing framework. The pipeline is currently being calibrated to populate the full-sky Flagship dark matter halo catalog produced by the University of Zürich, which contains about 44 billion dark matter haloes in a box size of 3.78 Gpc/h. The resulting mock galaxy catalog is stored directly in the CosmoHub platform.



Jorge graduated in Physics (Theoretical Physics branch) from the Universidad Autónoma de Madrid (UAM) in 2005. He joined the Institut de Ciències de l’Espai in 2008 and obtained his PhD in 2013 with a thesis on building mock galaxy catalogues from dark matter N-body simulations as a tool for current and future galaxy surveys.

He joined the Astrophysics and Cosmology group at PIC in 2012, during the last year of his thesis, to work on data management for the PAU project. Since then he has collaborated in different areas of several cosmological projects, such as PAU, MICE, DES and Euclid.

