Astronomy with Python

Updated May 2026
Python has become the primary programming language for astronomical research because Astropy provides a comprehensive framework for coordinate systems, unit conversions, FITS file handling, and time representations, while the broader scientific Python ecosystem handles image processing, statistical analysis, and visualization. From processing raw telescope images and measuring stellar brightness to querying catalog databases and analyzing spectral data, Python covers the entire observational astronomy workflow.

Astronomy has unique computational requirements: celestial coordinates span 360 degrees of right ascension and plus or minus 90 degrees of declination with multiple reference frames that rotate and precess over time. Time is measured in Julian dates, modified Julian dates, Barycentric Julian dates, and sidereal time, depending on the context. Physical quantities carry units that span enormous ranges (parsecs, solar masses, ergs per second per square centimeter per angstrom). Astropy handles all of these astronomical conventions natively, preventing the unit conversion errors and coordinate system mistakes that plagued earlier approaches.

Step 1: Set Up Astropy and Coordinates

Install Astropy with pip install astropy. The units system prevents dimensional errors: from astropy import units as u. distance = 4.2 * u.lightyear creates a distance with units. distance.to(u.parsec) converts to parsecs (1.29 pc). distance.to(u.m) converts to meters. Arithmetic respects units: velocity = distance / (10 * u.year) produces a velocity in lightyears per year, convertible to km/s. Constants: from astropy import constants as const, const.G (gravitational constant), const.c (speed of light), const.M_sun (solar mass), all with proper units that propagate through calculations.

Celestial coordinates use the SkyCoord class. from astropy.coordinates import SkyCoord. coord = SkyCoord(ra=10.684*u.degree, dec=41.269*u.degree, frame='icrs') creates a coordinate in the International Celestial Reference System. coord = SkyCoord.from_name('M31') resolves an object name to coordinates using the SIMBAD database. coord.galactic converts to Galactic coordinates. coord.separation(other_coord) computes the angular distance between two objects. coord.transform_to('fk5') transforms to the FK5 reference frame. These transformations handle the precession, nutation, and aberration corrections that would be error-prone to compute manually.

Time representation with Astropy handles the multiple time systems used in astronomy. from astropy.time import Time. t = Time('2026-05-19T12:00:00', scale='utc') creates a UTC time. t.jd gives the Julian Date. t.mjd gives the Modified Julian Date. t.sidereal_time('mean', longitude='-118d') gives the local mean sidereal time. t + 30 * u.day adds 30 days. Time('58000', format='mjd') creates from a Modified Julian Date. These conversions are critical for scheduling observations, computing barycentric corrections, and combining data taken at different times.

Step 2: Process FITS Files

FITS (Flexible Image Transport System) is the standard file format for astronomical data, containing both image data and metadata headers. from astropy.io import fits. hdul = fits.open('image.fits') opens the file and returns a list of HDUs (Header/Data Units). hdul[0].header shows the metadata (telescope, instrument, exposure time, filter, coordinates, pixel scale, date). hdul[0].data returns the image as a NumPy array. Common header keywords: NAXIS1/NAXIS2 (image dimensions), EXPTIME (exposure time in seconds), FILTER (filter name), RA/DEC (pointing coordinates), CD1_1/CD2_2 (pixel scale in degrees/pixel).

World Coordinate System (WCS) maps pixel coordinates to sky coordinates. from astropy.wcs import WCS. wcs = WCS(header) creates a WCS object from the FITS header. sky = wcs.pixel_to_world(x_pixel, y_pixel) converts pixel coordinates to celestial coordinates. pixel = wcs.world_to_pixel(sky_coord) converts the other way. Plot images with correct coordinates: fig, ax = plt.subplots(subplot_kw={'projection': wcs}), ax.imshow(data, cmap='gray', vmin=low, vmax=high), ax.set_xlabel('RA'), ax.set_ylabel('Dec'). The WCS projection ensures that axis labels show right ascension and declination rather than pixel numbers.

Image combination stacks multiple exposures to increase signal-to-noise ratio. Load multiple FITS files into a list of arrays. Align images using astroalign (pip install astroalign): aligned, footprint = astroalign.register(image, reference). Combine with median stacking (rejects cosmic rays and satellite trails): combined = np.median(np.array(aligned_images), axis=0). Or use sigma-clipped mean for better signal-to-noise: from astropy.stats import sigma_clip, masked = sigma_clip(stack, sigma=3, axis=0), combined = np.ma.mean(masked, axis=0). Subtract dark frames and divide by flat fields for proper calibration before combining science frames.

Step 3: Analyze Astronomical Images

Background estimation and subtraction is the first step in source detection. from photutils.background import Background2D, MedianBackground. bkg = Background2D(data, box_size=50, filter_size=3, bkg_estimator=MedianBackground()). bkg.background gives the 2D background model. data_subtracted = data - bkg.background removes the background. bkg.background_rms gives the pixel-to-pixel noise level, needed to set detection thresholds. Proper background subtraction is critical: sources detected without it will have biased photometry, and faint sources near bright nebulosity will be missed entirely.

Source detection finds stars and other objects in the image. from photutils.detection import DAOStarFinder. daofind = DAOStarFinder(fwhm=3.0, threshold=5.0 * bkg.background_rms_median). sources = daofind(data_subtracted). The result is an Astropy Table with x and y centroid positions, peak values, and shape parameters for each detected source. fwhm is the expected full width at half maximum of the point spread function (PSF) in pixels. threshold sets the detection limit in units of the background noise: threshold=5 means only sources 5 sigma above the background are detected.

Aperture photometry measures the brightness of detected sources. from photutils.aperture import CircularAperture, aperture_photometry. positions = np.column_stack([sources['xcentroid'], sources['ycentroid']]). apertures = CircularAperture(positions, r=5.0). phot_table = aperture_photometry(data_subtracted, apertures). The flux column gives the total counts within each aperture. Convert to instrumental magnitudes: mag_inst = -2.5 * np.log10(phot_table['aperture_sum'] / exptime). Calibrate to standard magnitudes by matching detected sources against a reference catalog and computing the zero-point offset.

PSF photometry provides more accurate measurements for crowded fields where apertures overlap. Fit a model PSF (Gaussian, Moffat, or empirical) to each source simultaneously, properly deconvolving overlapping profiles. photutils.psf provides IterativePSFPhotometry for this purpose. PSF photometry recovers accurate fluxes for sources that are partially blended, which aperture photometry cannot do. For extremely crowded fields (globular clusters, galactic center), PSF photometry can detect and measure sources 2 to 3 magnitudes fainter than aperture photometry on the same image.

Step 4: Work with Catalogs and Surveys

Astroquery provides Python access to astronomical databases and archives. from astroquery.simbad import Simbad. result = Simbad.query_object('M31') returns basic data for the Andromeda galaxy. from astroquery.vizier import Vizier. catalog = Vizier.query_region(coord, radius=5*u.arcmin, catalog='II/246') queries the 2MASS catalog within 5 arcminutes of a position. from astroquery.gaia import Gaia. job = Gaia.launch_job("SELECT * FROM gaiadr3.gaia_source WHERE DISTANCE(ra, dec, 10.68, 41.27) < 0.1") queries the Gaia DR3 catalog using ADQL (Astronomical Data Query Language).

Catalog cross-matching associates sources detected in your image with entries in reference catalogs. from astropy.coordinates import match_coordinates_sky. idx, sep, _ = match_coordinates_sky(your_coords, catalog_coords) finds the nearest catalog entry for each of your sources. Filter by separation: good_matches = sep < 2 * u.arcsec keeps only matches within 2 arcseconds. Cross-matching against photometric catalogs (2MASS, Pan-STARRS, SDSS) enables magnitude calibration. Cross-matching against astrometric catalogs (Gaia) enables precise coordinate calibration.

Survey data access retrieves images and spectra from astronomical archives. from astroquery.skyview import SkyView. images = SkyView.get_images(position='M31', survey='DSS2 Red', pixels=1000) downloads a 1000x1000 pixel image from the Digitized Sky Survey. from astroquery.mast import Observations. obs = Observations.query_object('M31', radius=0.1*u.deg) searches the Mikulski Archive for Space Telescopes (MAST) for Hubble, JWST, and other space telescope observations. These archive queries let you work with data from telescopes you could never access directly.

Step 5: Analyze Spectra and Time Series

Spectral analysis processes the wavelength-dependent intensity of astronomical sources. Load a 1D spectrum from a FITS file: from specutils import Spectrum1D. spec = Spectrum1D.read('spectrum.fits'). spec.wavelength gives the wavelength array. spec.flux gives the intensity array. Identify spectral lines: from specutils.analysis import find_lines_threshold, lines = find_lines_threshold(spec, noise_factor=3). Measure line properties: from specutils.analysis import equivalent_width, centroid, fwhm. equivalent_width(spec, region) gives the line strength. These measurements determine chemical composition, temperature, density, and radial velocity of astronomical sources.

Radial velocity measurement uses the Doppler shift of spectral lines. Fit a Gaussian to an emission or absorption line: from specutils.fitting import fit_lines, from astropy.modeling.models import Gaussian1D. Compare the observed line center to the rest wavelength: delta_lambda = observed - rest. velocity = const.c * delta_lambda / rest. Convert to km/s: velocity.to(u.km/u.s). For precision radial velocities (exoplanet detection), cross-correlate the observed spectrum against a template spectrum to determine the velocity shift with sub-km/s precision.

Light curve analysis studies how an object's brightness changes over time. Load time and magnitude data into arrays. from astropy.timeseries import TimeSeries. ts = TimeSeries(time=Time(dates, format='mjd'), data={'mag': magnitudes}). Plot the light curve: plt.scatter(ts.time.mjd, ts['mag'], s=1), plt.gca().invert_yaxis() (astronomers plot brighter magnitudes as lower numbers). For periodic variables, compute a periodogram: from astropy.timeseries import LombScargle, frequency, power = LombScargle(times, magnitudes).autopower(). The highest peak in the periodogram gives the period. Phase-fold the light curve: phase = (times - t0) / period % 1 to reveal the shape of the periodic signal.

Transit detection for exoplanets looks for periodic dips in stellar brightness. Detrend the light curve to remove systematic trends (instrumental effects, stellar variability): use a polynomial fit or a Gaussian process. Search for periodic box-shaped dips using the Box Least Squares (BLS) algorithm: from astropy.timeseries import BoxLeastSquares, bls = BoxLeastSquares(times, fluxes), results = bls.autopower(duration=0.1). The transit depth gives the planet-to-star radius ratio: Rp/Rs = sqrt(depth). The transit period gives the orbital period. The transit duration constrains the orbital inclination and stellar density. These measurements from photometry alone constrain the physical properties of planets orbiting other stars.

Key Takeaway

Astropy's coordinate, unit, and time systems prevent the conversion errors that have historically plagued astronomical calculations. Always use Astropy units and coordinates rather than raw numbers to keep dimensional analysis automatic.