GTFS for Real-Time Out of Date

I am validating the data before switching from the static non-realtime GTFS to the Real-Time GTFS, but I am seeing no current service throughout the /buses dataset.

For example, State Transit Newcastle Route 100 has times up to Feb 2017 in the non-Realtime GTFS but in the Real-Time GTFS there is no current service.

mysql> select trip_id, service_id from buses_trips where route_id="2451_100" group by service_id;
+---------+------------+
| trip_id | service_id |
+---------+------------+
| 157186  | 111        |
| 157688  | 165        |
| 265916  | 178        |
| 270629  | 233        |
| 157428  | 40         |
| 159505  | 55         |
| 265802  | 73         |
+---------+------------+

None of these service_ids are current.

I see this problem with State Transit Sydney as well and throughout most agencies I have compared.

This makes Real-Time services unusable.

What’s going on?

:angry:

Is this just me or are you working on this?

Here are the results I am seeing when comparing agencies between the real-time GTFS and the non-realtime GTFS, in case they help with troubleshooting (and assuming I’m not in error).

Routes with Current Service in Real-Time Data Compared To Non-Realtime:

[NightRide] => Less than 50%
[Sydney Olympic Park Major Event Buses] => Less than 50%
[Busways Western Sydney] => Less than 50%
[Interline Bus Services] => 3x as many routes as in non-realtime data
[Transit Systems] => Less than 50%
[Hillsbus] => None
[Punchbowl Bus Company] => Lines look completely different
[State Transit Sydney] => Only 6 lines
[Transdev NSW] => None
[Forest Coach Lines] => Only 5 lines
[Busabout] => Many more lines than in non-realtime data
[Rover Coaches] => Many more lines than in non-realtime data
[Hunter Valley Buses] => 4 Lines
[Port Stephens Coaches] => Many more lines than in non-realtime data
[State Transit Newcastle] => Only 1 line
[Busways Central Coast] => Only 1 line
[Red Bus Service] => Many more lines than in non-realtime data
[Blue Mountains Transit] => None
[Premier Charters] => Many more lines than in non-realtime data
[Premier Illawarra] => Many more lines than in non-realtime data
[Coastal Liner] => More lines than in non-realtime data
[Dions Bus Service] => More lines than in non-realtime data

Of course, it is possible these errors are on my side but I don’t think it is likely.

Please post back to confirm the data is valid or provide some status info so I know how to proceed.

Thank you.

:relaxed:

Hi @Webmaster, we’re investigating this together with other teams. I’ll let you know what we find but we haven’t had anyone else report anything wrong with real-time services. Obviously you’re aware that there are many apps out there that provide real-time information and there haven’t been any issues.

Anyway, we’ll look into it to make sure.

Thanks,
Alex

Hi @Webmaster, we haven’t found any problems with the files and the data team has confirmed that both static and real-time bundles are valid. They also pointed out that you shouldn’t be comparing trip_ids between the two bundles since the data might be coming from two different sources and won’t necessarily match.

For example, in the trips.txt file, the route 100 that you were looking for appears with route_id="60-100-sj2-1", and 576 trips are found for it in the current static GTFS bundle.

The "Reference | Static Transit | Google Developers" page is a good resource to see how all the files in the bundle fit together so you can work out how to find any given trip or route.

Thanks,
Alex

The issue is that the GTFS for Real-Time dataset does not contain current service.

For example, with the route 100 for State Transit Newcastle I have gone through your trips.txt file and confirmed only these service_ids are associated with route_id 2451_100:

111,165,178,233,40,55,73

Now look at calendar.txt and calendar_dates.txt for start and end dates and you find all of these service_ids are expired.

111 expired 20160715
165 expired 20160905
178 expired 20160903
233 expired 20160902
40 expired 20160715
55 expired 20160905
73 expired 20160904

This is based on raw data so there is no possibility of outside error.
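For anyone wanting to repeat this check against the raw files, here is a minimal Python sketch. It assumes the standard GTFS column names, and that service_id is unique within the files being read (that is, a single fileset). The function names are my own, not part of any feed tooling. It treats a service's last day as the later of its calendar.txt end_date and any added date (exception_type=1) in calendar_dates.txt, so it gives an upper bound and ignores removed dates.

```python
import csv
from datetime import date, datetime


def parse_gtfs_date(text: str) -> date:
    """Parse a GTFS YYYYMMDD date string."""
    return datetime.strptime(text, "%Y%m%d").date()


def last_service_dates(trips, calendar, calendar_dates, route_id):
    """Return {service_id: last possible service date} for one route.

    Each of the first three arguments is an iterable of CSV lines,
    such as an open trips.txt / calendar.txt / calendar_dates.txt file.
    """
    # Collect every service_id used by a trip on this route.
    service_ids = {
        row["service_id"]
        for row in csv.DictReader(trips)
        if row["route_id"] == route_id
    }
    last = {}
    # Regular service ends on calendar.txt end_date.
    for row in csv.DictReader(calendar):
        if row["service_id"] in service_ids:
            last[row["service_id"]] = parse_gtfs_date(row["end_date"])
    # Added dates (exception_type=1) can extend service past end_date.
    for row in csv.DictReader(calendar_dates):
        sid = row["service_id"]
        if sid in service_ids and row["exception_type"] == "1":
            day = parse_gtfs_date(row["date"])
            if sid not in last or day > last[sid]:
                last[sid] = day
    return last


def expired_services(last_dates: dict, today: date) -> list:
    """Return the sorted service_ids whose last service date is before today."""
    return sorted(sid for sid, day in last_dates.items() if day < today)
```

If every service_id for a route ends up in the expired list, the route has no current service in the files that were read.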

This problem exists throughout the /buses dataset.

The “Routes with Current Service in Real-Time Data Compared To Non-Realtime” list above might help you with troubleshooting.

Whether the other route list discrepancies are a problem, I think you would know better than I.

Have you made sure that you are storing the fileset (OSMBSC005) as part of the primary key? The IDs in trips.txt, calendar.txt and calendar_dates.txt are only unique within a fileset, not across all bus filesets (see page 24 of the documentation).

Given there are only 25 sequentially numbered service IDs in the OSMBSC005 fileset, I don’t expect to see 111,165,178,233,40,55,73 as valid service IDs for that fileset.
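To illustrate what I mean by making the fileset part of the key, here is a rough Python sketch. The function names and the way the filesets are passed in are my own assumptions for illustration, not anything from the documentation. The idea is just that a calendar row is looked up by (fileset, service_id) instead of by service_id alone.

```python
import csv
from collections import defaultdict


def load_calendar(filesets):
    """Index calendar.txt rows by (fileset, service_id).

    `filesets` maps a fileset name to an iterable of calendar.txt lines
    for that fileset (for example an open file).
    """
    calendar = {}
    for fileset, lines in filesets.items():
        for row in csv.DictReader(lines):
            # The same service_id in another fileset gets its own entry.
            calendar[(fileset, row["service_id"])] = row
    return calendar


def shared_service_ids(calendar):
    """Return {service_id: [filesets]} for IDs used by more than one fileset."""
    seen = defaultdict(set)
    for fileset, service_id in calendar:
        seen[service_id].add(fileset)
    return {
        sid: sorted(names) for sid, names in seen.items() if len(names) > 1
    }
```

Any service_id reported by shared_service_ids would be overwritten or mixed up if the table were keyed on service_id alone, which could make current trips appear to point at expired calendars.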

Just doing some further digging for you. I don’t see any problem with the raw fileset. Drawing from your example of route 2451_100 (this is data from 26th Nov):

routes.txt

route_id,agency_id,route_short_name,route_long_name,route_desc,route_type,route_color,route_text_color
"2451_100","2451","100","Charlestown to Newcastle","Unknown Description","700","00B5EF","FFFFFF"

trips.txt

route_id,service_id,trip_id,shape_id,trip_headsign,direction_id,block_id,wheelchair_accessible,trip_note,route_direction
"2451_100","3","157688","16005","University (Maths sto","0","","1","","Newcastle to Charlestown"
"2451_100","3","158597","16005","University (Maths sto","0","","1","","Newcastle to Charlestown"
"2451_100","3","159234","15942","City West","1","","2","","Charlestown to Newcastle"
"2451_100","3","159235","15942","City West","1","","2","","Charlestown to Newcastle"
"2451_100","3","159236","15942","City West","1","","2","","Charlestown to Newcastle"
"2451_100","5","159237","15942","City West","1","","2","","Charlestown to Newcastle"
etc...

calendar.txt

service_id,monday,tuesday,wednesday,thursday,friday,saturday,sunday,start_date,end_date
"3","1","1","1","1","1","1","0","20161031","20170207"
"5","0","0","0","0","0","1","1","20161105","20170205"
etc...

The dates look fine to me :slight_smile:
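If you want to check this programmatically, here is a small Python sketch of how I would test whether a calendar.txt row is in service on a given day. It assumes the standard GTFS columns shown above; the runs_on function and its exceptions argument are just illustrative names. Exceptions from calendar_dates.txt, if you pass any for that service_id, take precedence over the weekly pattern.

```python
from datetime import date

WEEKDAYS = [
    "monday", "tuesday", "wednesday", "thursday",
    "friday", "saturday", "sunday",
]


def runs_on(calendar_row: dict, day: date, exceptions=()) -> bool:
    """Return True if the service described by calendar_row runs on `day`.

    `calendar_row` is one calendar.txt row as a dict of strings.
    `exceptions` is an iterable of (date, exception_type) string pairs
    from calendar_dates.txt for the same service_id.
    """
    text = day.strftime("%Y%m%d")
    # An explicit exception overrides the weekly pattern:
    # "1" adds service on that date, "2" removes it.
    for exc_date, exc_type in exceptions:
        if exc_date == text:
            return exc_type == "1"
    # YYYYMMDD strings compare correctly as plain text.
    in_range = calendar_row["start_date"] <= text <= calendar_row["end_date"]
    # date.weekday() returns 0 for Monday, matching the WEEKDAYS order.
    return in_range and calendar_row[WEEKDAYS[day.weekday()]] == "1"
```

With the two calendar.txt rows above, service 3 runs on Saturday 26 November 2016 but not on Sunday 27 November, while service 5 runs on both days.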

I am referring to the full /buses download in GTFS for Real-Time.

I am just looking at the raw .txt files, so there are no DB storage or outside system issues.

Oh, I didn’t think the full /buses fileset was ready for production yet?

Any information about this, @alejandro.felman? Or do we still have to rely on manual uploads: http://opendata.transport.nsw.gov.au/forum/t/gtfs-bundle-for-type-schedule-is-not-available-for-agency-buses/65/47

The big buses zip file is still not available via the API. You need to grab it from the Drive link I have in the thread you referenced. The file was updated late last week.

We’ll advise on that thread (“Gtfs bundle for type schedule is not available for agency buses”, post #47) when it is available via the API.

Thanks, Yvonne.

I have to wait until your system behaves as documented. I’ll keep an eye on that thread.

:sunglasses: