Looking for historical bus delay data up to 2022

Hi all, I’m looking for historical bus delay data for a project. I’d like to answer questions such as:

  • “How often were buses delayed?”
  • “How large were these delays?”
  • “Is there any discernible pattern in these delays (e.g. time/location/route)?”
  • et cetera.
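
To make these questions concrete, here is a minimal sketch of the kind of summary I have in mind. It assumes the data has already been reduced to (route, hour of day, delay in seconds) records. The sample values, the 2-minute lateness threshold, and the function names are all made up for illustration and do not come from any real dataset.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical records: (route, hour_of_day, delay_seconds).
# Values are invented for illustration; negative means the bus ran early.
records = [
    ("A", 8, 240),
    ("A", 8, 30),
    ("A", 14, 0),
    ("B", 8, 420),
    ("B", 17, 90),
    ("B", 17, -60),
]

# Assumption: a bus counts as "delayed" if it is more than 2 minutes late.
LATE_THRESHOLD_S = 120


def summarise(rows, threshold=LATE_THRESHOLD_S):
    """How often were buses delayed, and how large were those delays?"""
    late = [delay for _, _, delay in rows if delay > threshold]
    return {
        "share_late": len(late) / len(rows) if rows else 0.0,
        "mean_late_delay_s": mean(late) if late else 0.0,
        "median_late_delay_s": median(late) if late else 0.0,
    }


def share_late_by(rows, key_index, threshold=LATE_THRESHOLD_S):
    """Share of delayed observations grouped by one field (0 = route, 1 = hour)."""
    totals = defaultdict(int)
    lates = defaultdict(int)
    for row in rows:
        key = row[key_index]
        totals[key] += 1
        if row[2] > threshold:
            lates[key] += 1
    return {key: lates[key] / totals[key] for key in totals}


overall = summarise(records)
by_route = share_late_by(records, 0)
by_hour = share_late_by(records, 1)
print(overall)
print(by_route)
print(by_hour)
```

Grouping by route or hour is only the simplest way to look for a pattern; the same idea would extend to stop or location if the data carries those fields.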

I’m wondering whether anyone has already collated data that could help answer these questions; any pointers would be much appreciated! Data up to 2022 is not a hard requirement. I’m mostly interested in historical patterns, though more recent data would be nice to have.

I’ve done some digging around and found the Historical GTFS Bundles and Timetables page, which has the full GTFS bundle up to July 11, 2018 and Bus GTFS-R vehicle positions up to May 20, 2021.

Since I’m new here and this is my first time working with GTFS/GTFS-R data, I’d prefer to avoid parsing the GTFS-R JSON files and matching them to timetables, routes and trips myself, if possible. Any tips or related forum topics would be greatly appreciated.

Thanks all!