I am a particle physicist by training, and a scientist by calling. Becoming a physicist was my dream, which I achieved by graduating from the prestigious NPAC program (Master's in Nuclei, Particles, Astroparticles and Cosmology) at Paris-Saclay in 2006 and earning my PhD from Sorbonne University in 2009. For those interested, I specialized in CP violation, the study of the asymmetry between matter and antimatter. My research led me to extract the tiniest signals from very large datasets by tuning some of the most sophisticated particle detectors on the planet, running complex Monte Carlo simulations, and applying a technique called Dalitz Plot Analysis, which I am one of very few people to master.

In 2008, I took on the role of DIRC commissioner for the BaBar Collaboration, in charge of the quality of the data we collected and tasked with reconstructing particle decays from raw data (electrical signals and hits in the drift chamber). (Un)fortunately, that’s when the Great Recession started, and the DoE told us we had six months, instead of years, to conclude data collection. So the Collaboration decided to increase the luminosity of PEP-II (the particle accelerator hosted at the Stanford Linear Accelerator Center), which basically means the accelerator was pushed to its limits to generate more collisions, and hence more data. Within days, I realized that while the volume of data was higher, the reconstruction process was yielding fewer useful particle decays: in other words, we were collecting more data, but the density of information was much lower and, frankly, of lower quality. I complained but wasn’t immediately heard. Still, the experience planted a seed in my mind: data collection truly isn’t about maximizing volume, but about finding the right balance between volume and quality to optimize the quantity of effective information.