Gaia Data Validation Engineer [J 3218]



An opportunity for an experienced Gaia data engineer to work on the data validation support platform.


The engineer will build and operate a data validation support platform for CU8, in preparation for Gaia Data Release 3.


The processing and presentation of the Gaia data is the responsibility of nine coordination units (CUs) that together form the Gaia Data Processing and Analysis Consortium (DPAC). Three units do basic data analysis on the processed data: CU4 for double stars, orbital binaries and solar system objects; CU7 for variable stars; and CU8 for spectral classification. CU9 takes care of the intermediate and final publication of the Gaia data.

Gaia Data Release 3 (DR3) is planned in two stages: in mid-2020 (EDR3) and in the second half of 2021 (DR3). It will be substantially larger in data volume and much richer in content than previous public releases. DPAC’s CU8 will, for the first time, contribute astrophysical parameters for a large number of stars (effective temperatures, surface gravities, stellar radii, etc.), computed from various data processing products of other CUs (e.g. photometry from CU5, variability information from CU7).

The validation of such data represents a formidable technical challenge in terms of hardware and database resources, and requires specific expertise in data mining, data modelling, data querying and related topics, all with an astrophysical and Gaia-specific emphasis.

The main objective of the requested service is to build and operate a data validation support platform which will allow CU8 scientists to carry out all needed validation activities in preparation for Gaia Data Release 3 (DR3). Once this key objective has been met, the service is expected to shift towards supporting ongoing efforts within the Gaia SOC to provide state-of-the-art data mining capabilities for DR3 end-users.


Responsibilities / Duties

Duties will include but are not limited to:

  • Build and operate a data validation support platform for CU8:
    • Formulating requirements for the envisaged validation platform in close interaction with CU8 and SOC personnel
    • Defining the hardware and software environment needed to fulfil the requirements of the CU8 validation system
    • Supporting the process of purchasing the needed resources for the platform
    • Supporting the deployment of the validation platform
    • Aiding CU8 in defining data models and data mapping rules
    • Supporting CU8 scientists in all activities related to their usage of the provided platform
    • Pro-actively advising CU8 scientists in optimizing validation strategies
  • Contribute to provision of state-of-the-art data mining capabilities for DR3 end-users:
    • Find synergies between the data mining needs of the CU8 and CU9 validation groups
    • Develop, in close cooperation with and under the authority of the CU9 data mining lead, the CU8 and CU9 internal services into a unified framework with a front-end portal, ultimately providing data mining services to the end-users

Qualifications / Experience

  • A PhD or MSc or equivalent qualification in Astronomy, Physics, Mathematics or a related discipline.
  • At least 4 years of relevant work experience.
  • Experience in working with large astronomical data sets.

Essential Skills

  • Experience working in an international environment
  • Excellent communication skills, notably in interfacing with stakeholders from diverse backgrounds (for example, scientists)
  • Knowledge of, and software engineering experience in, Python and Java
  • Knowledge of HTML5, CSS and JavaScript, for development of the data mining front-end, is an asset
  • Knowledge of and practical experience with managing relational database management systems serving TB-scale datasets, specifically knowledge of PostgreSQL
  • Knowledge of, and ideally experience with, state-of-the-art data mining concepts and related technologies, including practical experience with Apache Spark
  • Knowledge of the Jupyter Notebook, JupyterLab and JupyterHub frameworks, their APIs and kernels
  • Positive, “can-do” approach

Further Details

All roles within Telespazio VEGA have a defined closing date; however, if a successful candidate is found before the advert expires, the role will be closed early. We therefore advise candidates to apply as early as possible to avoid disappointment.


The closing date for applications is 22nd March 2019.


The position is located at ESAC, Madrid area, Spain.



Competitive Package



In line with Asylum & Immigration Legislation, all applicants must be eligible to live and work in the EU. Documented evidence of eligibility will be required from candidates as part of the recruitment process. Furthermore, in view of the nature of the company's work, all potential employees will undergo stringent reference and identity checks.


To apply : Please send your CV and covering letter to

By sending your CV to you give your consent for Telespazio VEGA to hold and process your personal information for the purpose of the application.

2019/03/06 11:15