EDGES

The Experiment to Detect the Global EoR Signal (EDGES) is a focused experiment that aims to detect the “global” signal from neutral hydrogen during the Cosmic Dawn and Epoch of Reionization (CD/EoR). It measures a single frequency spectrum averaged over the whole sky, which traces the average temperature of the intergalactic medium (IGM) as it evolves through cosmic time. The three distinct antennas deployed by EDGES together cover redshifts between 6 and 27, i.e. when the Universe was roughly 100 million to one billion years old.

EDGES is the first 21cm experiment to report evidence for the first stars. The surprising amplitude and shape of the signal have generated substantial interest, with proposed explanations invoking exotic dark matter properties and the first generation of black holes.

My focus is on statistical validation of the result: building sophisticated Bayesian forward-models of the instrument to understand how confident we should be in the detection. I’m also completely re-writing the analysis code to make it open-source and reproducible.

The IGM temperature is a melting pot of physics, and allows us to determine a host of interesting characteristics of the evolving Universe: the extent to which primordial black holes heat the gas (if at all), the role of dark matter in cooling the gas, whether extra “background” radiation is present (above the cosmic microwave background), the timing of the formation of the first galaxies, and how strongly those galaxies were able to emit ionizing radiation, to name a few. All of these physical factors affect that single number per redshift: the brightness temperature of the neutral hydrogen.
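To make this concrete: in its simplest standard form (neglecting density fluctuations, and with a numerical prefactor that assumes fiducial cosmological parameters), the sky-averaged brightness temperature can be written as

$$
\delta T_b(z) \;\approx\; 27\, x_{\rm HI} \left(1 - \frac{T_\gamma(z)}{T_S}\right) \sqrt{\frac{1+z}{10}} \;\; {\rm mK},
$$

where $x_{\rm HI}$ is the neutral hydrogen fraction, $T_\gamma$ is the temperature of the background radiation (usually just the CMB), and $T_S$ is the 21cm spin temperature, which the first stars couple to the gas temperature. Every effect in the list above enters through one of these few quantities.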

EDGES has the distinction of being the first (and currently only) 21cm experiment to have reported a detection of the Cosmic Dawn. The discovery has been met with both excitement and skepticism. The detection awaits confirmation from independent global experiments (like SARAS3, PRIZM or REACH), from interferometric experiments (like HERA; see below!), and even from its own upgraded system. In the meantime, the really important thing to note is just how much excitement the reported detection has generated: theoretical physicists have offered explanations in terms of exotic dark matter properties, while astrophysicists have considered the effect of the first generation of black holes. Meanwhile, the EDGES team itself is hard at work tightening the analyses and taking more data to verify the result.

I joined the (relatively small) EDGES team in early 2019 to bring my expertise in statistical inference (in particular, Bayesian inference) to bear on the confirmation effort. I quickly learned that to do so effectively, I needed to bring my software development skills as well. I set to work re-writing or re-factoring essentially the entire analysis pipeline, from lab-based calibration through to parameter estimation, modularising the components into repositories hosted at the edges-collab GitHub organization. The intention is to make these packages easy to use for both our collaboration and the wider community, and to enable reproduction of our results outside the collaboration well into the future. At the same time, I spearheaded an effort to restructure our data (especially data from the lab) to make it simpler to access and less error-prone to load. While this was originally motivated by my own needs as a new team member trying to use data that the rest of the team already knew how to navigate, the intention is that it will prepare the way for opening the data to the public in the future.

In the process of restructuring the code and data, I’ve learned a lot about the signal chain of EDGES, receivers, and their calibration, all of which are a lot more fascinating than I had anticipated! My primary goal with EDGES is to turn the whole analysis pipeline inside out (or upside down?). Instead of taking the raw data, calibrating it with our “best bet” calibration, and finding the best-fit model parameters using that once-and-for-all calibrated data, I’m building a framework for understanding how our uncertainty in calibration (and in other analysis steps, like the antenna beam and the sky model) propagates all the way through to the uncertainties on the final parameters. To do this, we’ll create corrupted models of the data by applying all the instrumental effects (given arbitrary choices of their parameters), and compare them to the data directly. We’ll do this hundreds of thousands of times, varying all the parameters (both the instrumental ones we don’t really care about and the ones that tell us about the Cosmic Dawn), until, via Bayesian statistics, we have a full posterior distribution over the parameters we care about.
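To sketch what I mean in a purely illustrative way, here is a toy version of that loop in Python, using the emcee sampler. The absorption profile, the foreground model, and especially the two-parameter “gain” are hypothetical stand-ins for demonstration, not the actual EDGES models:

```python
import numpy as np
import emcee

freq = np.linspace(50, 100, 256)  # MHz

def signal_model(freq, amp, nu0, width):
    """A simple Gaussian absorption trough (stand-in for the 21cm signal)."""
    return -amp * np.exp(-0.5 * ((freq - nu0) / width) ** 2)

def foreground_model(freq, *coeffs):
    """Low-order polynomial in log-frequency (stand-in for the foreground)."""
    return np.exp(np.polyval(coeffs, np.log(freq / 75.0)))

def forward_model(theta, freq):
    """Corrupt the ideal sky model with a (toy) instrumental gain,
    rather than de-corrupting the data."""
    amp, nu0, width, g0, g1, *fg = theta
    sky = foreground_model(freq, *fg) + signal_model(freq, amp, nu0, width)
    gain = g0 + g1 * (freq / 75.0 - 1)  # hypothetical linear gain error
    return gain * sky

def log_prob(theta, freq, data, sigma):
    amp, nu0, width = theta[:3]
    if not (0.0 <= amp < 2.0 and 60 < nu0 < 90 and 1 < width < 20):
        return -np.inf  # crude flat priors on the signal parameters
    resid = data - forward_model(theta, freq)
    return -0.5 * np.sum((resid / sigma) ** 2)

# Simulate "observed" data from a chosen truth, with radiometer noise.
truth = np.array([0.5, 78.0, 6.0, 1.0, 0.01, 0.0, -2.5, 7.5])
rng = np.random.default_rng(0)
data = forward_model(truth, freq) + rng.normal(0, 0.025, freq.size)

# Sample signal AND instrument parameters jointly.
ndim, nwalkers = len(truth), 32
p0 = truth + 1e-4 * rng.standard_normal((nwalkers, ndim))
sampler = emcee.EnsembleSampler(nwalkers, ndim, log_prob, args=(freq, data, 0.025))
sampler.run_mcmc(p0, 5_000)
```

The key structural point is that the instrumental parameters are sampled alongside the signal parameters, so marginalising over them folds our calibration uncertainty directly into the final constraints on the signal.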

My starting point has been to investigate the calibration parameters, which tell us how much the input temperature from the sky has been amplified by the receiver before we get to take a measurement (an amplification that depends on frequency and time). We need to “divide out” this “gain” so that our reported temperature is correct. Using a clever method based on switching between multiple input sources, we are able to remove most of the gain (especially the part that changes over time). But there are more detailed parts of the gain which depend on properties of the internal circuit (its reflection parameters), and these must be measured in the lab back at ASU. None of these measurements is perfect, so the resulting estimate of the gain is not perfect either: it has some uncertainty. At the moment, this uncertainty is not being captured and propagated through to our final estimates of the IGM temperature, but that is soon to change!
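As a toy illustration of the kind of propagation I mean (the gain model and all the numbers here are invented; the real receiver calibration involves noise-wave parameters and several distinct lab measurements):

```python
import numpy as np

rng = np.random.default_rng(42)

freq = np.linspace(50, 100, 256)                  # MHz
uncal = 1800.0 * (freq / 75.0) ** -2.5            # made-up uncalibrated spectrum (K)

# Lab measurement of a reflection coefficient: a central value plus an
# assumed measurement uncertainty from the vector network analyser.
gamma_measured, gamma_sigma = 0.02, 0.002

def gain(freq, gamma):
    """Hypothetical frequency-dependent gain ripple induced by the
    reflection coefficient; a stand-in for the real corrections."""
    return 1.0 + gamma * np.cos(2 * np.pi * freq / 20.0)

# Instead of dividing out a single best-estimate gain, draw many plausible
# gains consistent with the lab measurement and calibrate under each one.
gammas = rng.normal(gamma_measured, gamma_sigma, 10_000)
calibrated = uncal / gain(freq, gammas[:, None])  # shape (n_samples, n_freq)

# The spread across samples is the calibration contribution to the error
# budget on the reported temperature at each frequency.
t_err = calibrated.std(axis=0)
print(f"Calibration uncertainty at 78 MHz: {t_err[np.argmin(abs(freq - 78))]:.2f} K")
```

In the full framework these draws become nuisance parameters inside the Bayesian fit, rather than a post-hoc Monte Carlo, but the spirit is the same: imperfect lab measurements show up as honest error bars on the IGM temperature.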