Is riding the subway safe?

Comparing safety across different modes of transportation within New York City.

I attended a Metropolitan Transportation Authority (MTA) Open Data event a few weeks ago. At the event and with some follow-up research, I learned a few things:

  1. The MTA is North America's largest transportation network, serving a population of 15.3 million people across a 5,000-square-mile travel area.

  2. The MTA network comprises the nation’s largest bus fleet and more subway and commuter rail cars than all other U.S. transit systems combined.

  3. As of 2021, the MTA is required by law to publish data, and they have an entire (and talented) team dedicated to producing wide-ranging, quality datasets on topics such as ridership, incidents, and performance.

Every few years, the MTA Open Data team holds a data challenge, where community members are encouraged to submit creative stories with some of the 100+ datasets curated by the Open Data team.

In 2021, New York Governor Hochul signed the MTA Open Data Act, codifying the MTA’s commitment to sharing data with the public. As a result, datasets on ridership, needs assessments, performance metrics, and more can now be accessed and interpreted by a broad range of community stakeholders, who have used them to generate data-driven proposals to study and improve New York City’s transportation system. This led to the launch of the first-of-its-kind MTA Open Data Challenge in 2024.

For the event's keynote, one of the challenge’s finalists, the Subway Stories team, showcased an interactive map that visualized the flow of ridership across the city at different hours of the day and included anecdotal stories about the major trends they discovered. For example, they noticed many people leave deep Brooklyn and arrive in Chinatown early on Saturday mornings. However, they did not just accept this as fact; they sought to understand the roots behind this trend through in-person interviews. In turn, they discovered that many younger people of Chinese descent live in Brooklyn but still attend community gatherings in Chinatown.

My major takeaway from the keynote was the importance of layering context onto data. We data people often talk about the almighty actionable insights and storytelling, and this was a masterclass in building a quality data product while articulating context and nuance. Sure, creating an interactive map of ridership movement throughout the day is interesting, but so what? Why are people behaving this way, and (in this case) how can it influence policy decisions like the prioritization of subway infrastructure projects?

Parsing the truth from the narrative

There have been several pretty horrific subway incidents in recent years. I will save your mental harmony and refrain from linking them here, but if you were an out-of-towner (or my mom), you might think riding the subway is equivalent to scaling a 61-story building without a rope.

Further, New York City is very well-known, and as the Subway Stories team pointed out, it’s often politicized. While one powerful person may say that the subway is extremely safe:

another powerful person may say otherwise:

I’m constantly inundated with news alerts, texts from friends and family, etc. about something bad that happened on the subway. While I’ve observed my fair share of subway disruptions, I am frustrated with the emotionally driven, isolated stories of the largest functional transportation network on the continent. Regardless of its imperfections (dirty, late, and unsettling at times), the subway is still a semi-reliable, relatively cheap, and environmentally efficient means of transport.

But is it safe? And before that, is it even possible to drown out the noise and compare riding the subway to the other ways I navigate New York City?

Let’s find out 🚆

Approach

I approached this project with the following principles:

  1. Use the highest-quality and most recent data available. When only poor-quality or older data is available, apply a “best guess” methodology to ground an insight. Use news reporting to provide context where necessary.

  2. Focus on an all-encompassing safety-related incidents metric that includes injuries from the mode of transport (e.g. a car or bus accident) and human-caused harm (e.g. physical assault) to produce a complete reflection of safety.

  3. Normalize safety-related incidents by the number of trips whenever possible (for example, “incidents per 1M trips”) so that modes with very different trip volumes can be compared fairly.

  4. Include all NYC boroughs whenever possible.
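To make principles #2 and #3 concrete, here is a minimal sketch of the metric: add the two injury categories together, then scale by trips. The `ModeMonth` class, its field names, and the numbers are hypothetical illustrations, not the actual schema or figures from the MTA datasets.

```python
from dataclasses import dataclass


@dataclass
class ModeMonth:
    """One month of data for one mode of transport (hypothetical schema)."""
    mode: str
    transport_injuries: int      # injuries caused by the mode itself, e.g. collisions
    human_caused_injuries: int   # injuries caused by people, e.g. assaults
    trips: int


def incidents_per_million(row: ModeMonth) -> float:
    """Combine both injury categories and normalize to incidents per 1M trips."""
    total_incidents = row.transport_injuries + row.human_caused_injuries
    return total_incidents / row.trips * 1_000_000


# Made-up numbers, purely for illustration.
example = ModeMonth("subway", transport_injuries=150,
                    human_caused_injuries=250, trips=100_000_000)
rate = incidents_per_million(example)
print(f"{example.mode}: {rate:.1f} incidents per 1M trips")
```

Normalizing this way is what lets a mode with billions of trips be compared with one that has far fewer.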

Using a combination of the learnings from the Open Data event, navigating directly to sources within Perplexity, and traditional Google search, I gathered the following data:

  1. The MTA Open Data team produces ridership, injuries from the mode of transport, and human-caused injuries for both the subway and bus systems. While they don’t aggregate all injury data into one “incident” number, it’s a pretty easy aggregation (more on that process shortly).

  2. The NYC Open Data team produces a motor vehicle collision report for cars, bicycles, and pedestrians.

However, there’s no measure of total trips for cars, bikes, and pedestrians. That makes sense when you give it a second thought - how could you possibly record every time someone walks from their apartment to the bodega down the street? Or rides their bike to work? I considered using commuting data produced by NYC’s Environmental and Health Department, but the data ranges from 2017-2021, and I hypothesized that COVID significantly impacted commuting patterns. Ultimately, I landed on:

  1. A 2022 Citywide Mobility Survey report produced for the NYC Department of Transportation (DOT) to determine the rough distribution of trips by mode of transportation:

Based on a weighted sample of ~3k respondents. Not perfect, but better than nothing!

Theoretically, I could apply these distributions to the transit data and back into a trip count for cars, bikes, and pedestrians. For example, if there are 5M bus and subway trips in a day, we can assume that there were 10M car trips because the ratio of car trips to public transit trips is 2x in the 2022 survey above (34% vs. 17%). (Yes, this is an assumption. See principle #1 above 🙂).
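The back-of-the-envelope estimate above can be sketched in a few lines. The function name `estimate_trips` is made up for illustration; the 34% and 17% shares and the 5M transit trips are the example figures from the paragraph above. This is a sketch of the assumption, not the exact code from the repo.

```python
def estimate_trips(transit_trips: float, transit_share: float, mode_share: float) -> float:
    """Scale a known transit trip count by the ratio of survey mode shares.

    Assumes the survey's mode shares hold for the period being estimated.
    """
    if transit_share <= 0:
        raise ValueError("transit_share must be positive")
    return transit_trips * (mode_share / transit_share)


TRANSIT_SHARE = 0.17  # bus + subway share, from the 2022 survey example above
CAR_SHARE = 0.34      # car share, from the 2022 survey example above

car_trips = estimate_trips(5_000_000, TRANSIT_SHARE, CAR_SHARE)
print(f"Estimated car trips: {car_trips:,.0f}")
```

The same function would apply to bikes and pedestrians by swapping in their survey shares. The weak point is the assumption itself: any error in the survey shares carries straight through to the trip estimate.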

Tech stuff

A few of the datasets were pretty large and therefore unmanageable in Sheets. As a result, I loaded the data into a local DuckDB warehouse and performed some necessary joins and aggregations there (you can see all of the above in this messy GitHub Repo). I then built a local Streamlit dashboard for visualization and analysis.

Streamlit rocks 💪
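For a rough idea of the kind of join and aggregation involved, here is a small sketch. To keep it runnable without extra installs it uses Python's built-in `sqlite3` rather than DuckDB, though the SQL would look much the same in either. The table names, column names, and values are all hypothetical, not the real dataset schemas.

```python
import sqlite3

# In-memory database standing in for the local warehouse (assumption).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE ridership (month TEXT, mode TEXT, trips INTEGER);
CREATE TABLE incidents (month TEXT, mode TEXT, category TEXT, injuries INTEGER);

-- Made-up rows, purely for illustration.
INSERT INTO ridership VALUES
    ('2024-01', 'subway', 100000000),
    ('2024-01', 'bus',     40000000);
INSERT INTO incidents VALUES
    ('2024-01', 'subway', 'transport',    200),
    ('2024-01', 'subway', 'human_caused', 400),
    ('2024-01', 'bus',    'transport',     60),
    ('2024-01', 'bus',    'human_caused',  20);
""")

# Sum both injury categories per month and mode, then join to ridership
# and normalize to incidents per 1M trips.
query = """
WITH monthly_incidents AS (
    SELECT month, mode, SUM(injuries) AS incidents
    FROM incidents
    GROUP BY month, mode
)
SELECT r.month, r.mode, 1000000.0 * i.incidents / r.trips AS incidents_per_1m
FROM ridership AS r
JOIN monthly_incidents AS i
  ON i.month = r.month AND i.mode = r.mode
ORDER BY r.mode
"""
rows = con.execute(query).fetchall()
for month, mode, rate in rows:
    print(month, mode, round(rate, 2))
```

Aggregating incidents before the join keeps ridership rows from being duplicated across injury categories.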

Results & Analysis

After visualizing the data, a few insights bubbled to the surface.

Walking, bussing, and subwaying are much safer than driving and biking.

Safety Incidents per 1M trips, by month.

The levels of safety can be grouped into three tiers:

Tier 1: Walking, bussing, and subwaying are by far the least dangerous forms of transport, with about 2 incidents per 1M trips for walking and bussing, and 3-4 incidents per 1M trips for the subway.
Tier 2: Driving is about 4-7x more dangerous than Tier 1.
Tier 3: Biking is the most dangerous form of transport (about 8-10x more dangerous than Tier 1). This is unsurprising to anyone who has ridden a bike in NYC.¹

Subway safety has increased by roughly 40% since January 2024.

Safety Incidents per 1M, by month, mode = Subway.

Subway incidents declined from 6.0 to 3.5-4.0 incidents per 1M trips over the year. The important context here is that Mayor Adams increased subway police presence in February 2024. While it’s difficult to say if the resulting decline in incidents is causal or not, New York politicians certainly believe so, as Kathy Hochul announced similar changes in January 2025:

Partnering With NYPD, Every Subway Train Will Have a Uniformed Officer Onboard Between 9:00 p.m. and 5:00 a.m.

New Protective Barriers Will Be Installed on Platforms in More Than 100 Stations; LED Lighting Will Be Installed In All Subway Stations To Increase Visibility

Fare Gates or Exits Will Be Modernized In More Than 150 Stations To Crack Down on Fare Evasion
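As a quick sanity check on the "roughly 40%" headline, the decline can be computed directly from the endpoints quoted above (6.0 falling to 3.5-4.0 incidents per 1M trips). The helper name `percent_decline` is just for illustration.

```python
def percent_decline(start: float, end: float) -> float:
    """Percentage drop from start to end."""
    return (start - end) / start * 100


START_RATE = 6.0              # incidents per 1M trips at the start of the period
END_RATE_RANGE = (3.5, 4.0)   # range of incidents per 1M trips at the end

smallest_decline = percent_decline(START_RATE, max(END_RATE_RANGE))
largest_decline = percent_decline(START_RATE, min(END_RATE_RANGE))
print(f"Decline: {smallest_decline:.0f}% to {largest_decline:.0f}%")
```

So the decline falls somewhere between about a third and a little over 40%, depending on which end of the range you take.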

Car trips may be overestimated.

Number of trips by month, mode = Car.

I estimated an average of ~250M car trips per month, or 8M trips per day. Given that there are about 8M people in NYC, this seems high, particularly when taking into account commuter surveys where public transit is as common as driving. If the number of car trips is overestimated, the subsequent incidents per trip metric would be underestimated. TLDR: cars might be more dangerous than we already know they are.
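Here is a small sketch of why an overestimated trip count understates the danger. The 250M monthly figure comes from the estimate above; the reported rate of 10 incidents per 1M trips and the 2x overestimate are hypothetical values chosen only to show the mechanics.

```python
ESTIMATED_MONTHLY_CAR_TRIPS = 250_000_000  # estimate from the text above
AVG_DAYS_PER_MONTH = 365 / 12

daily_trips = ESTIMATED_MONTHLY_CAR_TRIPS / AVG_DAYS_PER_MONTH
print(f"Implied daily car trips: {daily_trips / 1e6:.1f}M")


def corrected_rate(reported_rate: float, overestimate_factor: float) -> float:
    """Adjust an incidents-per-trip rate when trips were overestimated.

    If estimated trips = overestimate_factor * true trips, the incident count
    is unchanged, so the true rate is the reported rate times that factor.
    """
    return reported_rate * overestimate_factor


# Hypothetical: a reported 10 incidents per 1M trips, with trips overstated by 2x.
print(corrected_rate(10.0, 2.0), "incidents per 1M trips after correction")
```

The incident counts come from reported collisions, so only the denominator moves: halve the trips and the rate doubles.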

Closing thoughts

Riding the subway is nearly as safe as walking, and much safer than driving a car. This is unsurprising for any Daniel Kahneman fans: extreme, infrequent events drive emotion and remove sensibility from our brain’s decision-making process, particularly when estimating the likelihood of these events.

Biking is unsafe relative to other modes of transport. A few months ago, I attended a Transportation Alternatives community meetup in Carroll Gardens where I learned about the fight to daylight intersections. It opened my eyes to the dangers of biking and some of the ongoing initiatives to make NYC more rideable. Fortunately, there’s a bill in the NYC City Council to bring universal daylighting to every intersection in the city.

We need a better measure of walking, biking, and driving activity. Anchoring trips to a 2022 survey of 3k participants is flimsy. I would love to see the NYC DOT invest in another study here, as better trip estimates can lead to refined safety metrics and more informed transportation policy.

And lastly, there is SO much opportunity for additional analysis:

  1. NYC in its entirety is too large a geographic area to analyze as one unit. There’s nuance within each borough, so these metrics should be broken down to a narrower scope.

  2. Congestion pricing should impact these numbers starting January ‘25. I expect pedestrian, car, and bike incidents per trip to decline (fewer cars on the road causing incidents).²

  3. Additional modes of transport like micromobility and even planes (not within NYC, but still interesting as a commentary on the human psyche) should be considered. I found a 2023 FAA report that measured 0.1 fatalities per 100M persons onboard, which is significantly safer than all of the modes of transport above but does not account for less severe injuries.

¹ There is clear seasonality in the biking data. Remember that bike trips are pegged to transit trips, and there’s less variation in transit than biking throughout the year. However, the data shows fewer bike incidents in colder months than in warmer months. This drives incidents per trip down in the winter and up in the summer.

² RIP congestion pricing, 1/5/2025-2/19/2025.
