The Social Security Administration’s Baby Names Data
Meet the U.S. government's most popular dataset.

Every year around Mother’s Day, the Social Security Administration publishes the frequency of first names for every child born in the previous calendar year, down to any name that appeared at least five times that year.

Since the data goes back to 1880, it’s possible to chart any given name over time.

Students of American history may recall that the Social Security Act itself did not pass until 1935. So how does the data go back to 1880? Simple: When people applied for benefits after the law passed, they listed their birth date and name in the application, making it easy to trace name data backward in time. But not everyone who was eligible applied, so the older data does not cover the full population born in those years.

Limitations

Since the Social Security Administration withholds data on names that show up fewer than five times, we don’t have a complete picture of rare names. Fortunately, the agency also publishes data on the total number of new Social Security cards issued each year by gender, so we at least have information on the total number of new babies each year independent of whether their names end up in the public dataset.
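To illustrate how the yearly totals help, here is a minimal Python sketch that estimates what share of babies in a given year had names too rare to be published. The function name uncovered_share and the numbers in the example are made up for illustration; real inputs would be the published counts for one year (and one gender, if desired) and the matching total from the agency.

```python
def uncovered_share(published_counts, total_cards):
    """Estimate the share of babies whose names are missing from the public data.

    published_counts: counts for every published name in one year (hypothetical input)
    total_cards: total new Social Security cards issued for that same year
    """
    if total_cards <= 0:
        raise ValueError("total_cards must be positive")
    published = sum(published_counts)
    # Whatever is not accounted for by published names belongs to withheld names.
    return (total_cards - published) / total_cards


# Made-up numbers: 900 of 1,000 babies have names that appear in the data.
print(uncovered_share([600, 300], 1000))
```

The same calculation can be run separately for each gender, since the totals are published that way.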

Format

The data is released as a series of text files, one for each year. Each line contains the name, gender, and number of new Social Security cards issued to babies of that name and gender in the given year. Here are a few lines from the 2013 data, for example:

Sophia,F,21075
Emma,F,20788
Olivia,F,18256
Isabella,F,17490
Ava,F,15129
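
Because each line is just three comma-separated fields, parsing it takes very little code. The following is a minimal Python sketch, not the project's own code; NameRecord and parse_line are names invented here for illustration.

```python
from typing import NamedTuple


class NameRecord(NamedTuple):
    name: str
    gender: str
    count: int


def parse_line(line: str) -> NameRecord:
    """Split one 'name,gender,count' line into typed fields."""
    name, gender, count = line.strip().split(",")
    return NameRecord(name, gender, int(count))


# The sample lines from the 2013 data shown above.
sample = """Sophia,F,21075
Emma,F,20788
Olivia,F,18256
Isabella,F,17490
Ava,F,15129"""

records = [parse_line(line) for line in sample.splitlines()]
print(records[0])  # NameRecord(name='Sophia', gender='F', count=21075)
```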

To compare a single name across the years, you have to find it in each file. But don’t do that by hand! Use our code, which makes it easy to generate name-by-name files or one big file with all the data in one place.
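
For a rough idea of what combining the files involves, here is a Python sketch. It is an illustration only and not the code referred to above. It assumes the yearly files sit in one directory and are named like yob2013.txt; the function names combine_years and name_over_time and the output column names are hypothetical.

```python
import csv
import re
from pathlib import Path


def combine_years(data_dir, out_path):
    """Merge every yearly file into one CSV with a leading year column."""
    pattern = re.compile(r"yob(\d{4})\.txt$")  # assumed file naming scheme
    rows_written = 0
    with open(out_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["year", "name", "gender", "count"])
        for path in sorted(Path(data_dir).glob("yob*.txt")):
            match = pattern.search(path.name)
            if not match:
                continue
            year = int(match.group(1))
            with open(path, newline="") as f:
                for row in csv.reader(f):
                    if len(row) != 3:
                        continue  # skip blank or malformed lines
                    name, gender, count = row
                    writer.writerow([year, name, gender, int(count)])
                    rows_written += 1
    return rows_written


def name_over_time(combined_path, name, gender):
    """Return {year: count} for one name and gender from the combined file."""
    series = {}
    with open(combined_path, newline="") as f:
        for row in csv.DictReader(f):
            if row["name"] == name and row["gender"] == gender:
                series[int(row["year"])] = int(row["count"])
    return series
```

A year missing from the result means the name appeared fewer than five times that year (or not at all), not necessarily zero times.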