
In everyday language, “astronomical” means “very large” as often as it means having to do with astronomy. That’s for good reason: the known universe contains a lot of stars, galaxies, and other objects. Modern observations gather huge amounts of data on those objects, requiring researchers to come up with new ways to process that information. Astrostatistics is the way astronomers measure the reliability of their measurements, quantify the uncertainties in theoretical models, and turn the raw numbers from observations into something useful.

Our Work

Center for Astrophysics | Harvard & Smithsonian scientists develop and refine new data analysis methods to interpret massive datasets. That work includes:

  • Collaborating with other institutions to develop methods for analyzing astronomical data, and sharing those techniques freely with the wider community. That effort includes the CHASC AstroStatistics Center, an international collaboration including statisticians and astronomers from a number of different universities.

  • Analyzing the huge amounts of data from surveys of galaxies, quasars, and other large populations. That includes current projects like the Baryon Oscillation Spectroscopic Survey (BOSS), as well as upcoming observatories like the Large Synoptic Survey Telescope (LSST), which will produce as much as 20 terabytes of data in a night of observations.
    A One-Percent Measure of Galaxies Half the Universe Away

  • Processing the 1.5 terabytes of data collected daily by the space-based Solar Dynamics Observatory (SDO). Astrostatistics is necessary to turn that huge amount of information into something that helps astronomers identify — and possibly predict — solar flares.
    Solar Dynamics Observatory

  • Identifying new exoplanets around a wide variety of stars using NASA's Transiting Exoplanet Survey Satellite (TESS) and other observatories. TESS was designed to map roughly 85% of the sky over its two-year primary mission, watching each sector of the sky for about 27 days. That is a lot of stars to monitor for planets, and astrostatistics is needed to make sense of the data.
    NASA Prepares to Launch Next Mission to Search the Sky for New Worlds

Hubble image of galaxy cluster from SDSS dataset

The galaxy cluster SDSS J0150+2725 as seen by NASA's Hubble Space Telescope. This cluster was identified by the Sloan Digital Sky Survey (SDSS), which has cataloged millions of objects, requiring the development of powerful statistical techniques to track them all.

Credit: ESA/Hubble & NASA; Judy Schmidt

Responding to a Universe of Data

The beautiful images coming from telescopes are only a small part of the scientific story. Underneath their amazing colors are numbers, and lots of them. The colors and brightness of the various pixels are part of the data collected by telescopes, and it’s that data that contains the scientific information astronomers need.

With ever-larger observatories and more emphasis on monitoring big swaths of the sky for long periods of time, astronomical datasets are growing very quickly. Researchers are developing more powerful statistical methods to handle this data and to draw the big picture out of all the details. Astrostatistics puts numbers on what we learn from these observations and, equally important, on how reliable both the measurements and our interpretations are. Along with machine learning, astrostatistics allows astronomers to find patterns that might otherwise be missed.
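To make "putting numbers on how good a measurement is" concrete, here is a minimal sketch of one of the simplest tools of the trade: combining repeated measurements into an average and a standard error. The brightness values and the function name `mean_with_uncertainty` are hypothetical and chosen only for illustration. Real analyses typically also have to account for instrument noise, calibration, and systematic effects.

```python
import math
import statistics


def mean_with_uncertainty(values):
    """Return the sample mean and the standard error of that mean."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two measurements")
    mean = statistics.fmean(values)
    # Sample standard deviation (n - 1 in the denominator).
    spread = statistics.stdev(values)
    # The standard error of the mean shrinks as 1 / sqrt(n).
    return mean, spread / math.sqrt(n)


# Hypothetical repeated brightness measurements of one star (arbitrary units).
measurements = [10.2, 9.8, 10.1, 10.0, 9.9]
mean, error = mean_with_uncertainty(measurements)
print(f"brightness = {mean:.2f} +/- {error:.2f}")
```

The useful point is the scaling: under these simple assumptions, quadrupling the number of independent measurements halves the uncertainty, which is one reason large datasets are so valuable.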

The advent of “big data” in astronomy came through large-scale surveys of galaxies, hunts for exoplanets, maps of the cosmic microwave background, and other observations where the goal is to study many objects simultaneously, rather than as individuals. For example, with approximately 100 billion galaxies in the known universe, even an event that occurs in only one out of every billion galaxies should be happening in about 100 of them. Astrostatistics allows researchers to find and study such rare events.
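The arithmetic behind that rare-event example can be sketched in a few lines. The galaxy count and the one-in-a-billion rate come from the text above. The survey sizes and the Poisson model (events are independent and equally likely in every galaxy) are simplifying assumptions made here for illustration, and the function names are hypothetical.

```python
import math


def expected_events(n_objects, rate_per_object):
    """Expected number of events among n_objects."""
    return n_objects * rate_per_object


def prob_at_least_one(n_objects, rate_per_object):
    """Chance of catching one or more events, using a Poisson model."""
    # P(at least one) = 1 - exp(-expected); expm1 keeps precision
    # when the expected count is very small.
    return -math.expm1(-expected_events(n_objects, rate_per_object))


total_galaxies = 100e9  # approximate figure quoted in the text
rate = 1e-9             # one event per billion galaxies

print(f"expected events overall: {expected_events(total_galaxies, rate):.0f}")

# Hypothetical survey sizes, to show how the odds grow with sample size.
for sample in (1e6, 1e9, 1e10):
    p = prob_at_least_one(sample, rate)
    print(f"survey of {sample:.0e} galaxies: P(at least one) = {p:.4f}")
```

Under these assumptions, a survey of a million galaxies would almost never catch such an event, while a survey of ten billion almost certainly would. This is why studying many objects at once opens up science that individual observations cannot reach.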

Next-generation observatories such as NASA's Transiting Exoplanet Survey Satellite (TESS) and the Large Synoptic Survey Telescope (LSST) are designed to produce maps of large chunks of the sky on a regular basis to find exoplanets and other objects. With many terabytes of data coming through during each period of observation, advanced techniques in astrostatistics are necessary to turn this information into something astronomers can use.