Hello,
Thank you for providing transport data and hosting the forum!
We are attempting to use two TfNSW GTFS datasets, Sydney Trains and NSW Trains, together as one logical merged dataset.
The two datasets appear to be inconsistent with each other in at least one case.
As of 2017-02-22, data for Sydney Trains (https://api.transport.nsw.gov.au/v1/gtfs/schedule/sydneytrains) contains the following two stops:
stop_id: 2577146, stop_ll: -34.589305,150.597767, stop_name: “Robertson Bus”, location_type: 1, parent_station: null, wheelchair_boarding: 0
stop_id: 2577143, stop_ll: -34.589443,150.597539, stop_name: “Robertson Bus 1”, location_type: 0, parent_station: 2577146, wheelchair_boarding: 0
As of 2017-02-22, data for NSW Trains (https://api.transport.nsw.gov.au/v1/gtfs/schedule/nswtrains) contains the following stop:
stop_id: 2577146, stop_ll: -34.58930722,150.5978025, stop_name: “Illawarra Hwy Before Main St”, location_type: 0, parent_station: null, wheelchair_boarding: 1
and does not contain stop_id 2577143.
Stop 2577143 is only served by trips for Sydney Trains routes. Stop 2577146 is only served by trips for NSW Trains routes.
The information for stop 2577146 differs between the two datasets: the stop_name, location_type and wheelchair_boarding values all disagree. The differing location_type is a particular problem for our use, and the differing wheelchair_boarding could be an issue for accessibility-oriented apps.
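To make the conflict concrete, here is a minimal sketch of the kind of check a merge step could run to flag it. It is only an illustration, not our actual pipeline: the function names (read_stops, find_stop_conflicts) and the choice of compared fields are assumptions, and the inline sample rows simply restate the stops quoted above, with stop_ll split into the standard stop_lat and stop_lon columns.

```python
import csv
import io

# Fields compared between feeds; this selection is an assumption for illustration.
COMPARED_FIELDS = ("location_type", "wheelchair_boarding")


def read_stops(stops_txt):
    """Parse the text of a GTFS stops.txt file into {stop_id: row}."""
    reader = csv.DictReader(io.StringIO(stops_txt))
    return {row["stop_id"]: row for row in reader}


def find_stop_conflicts(stops_a, stops_b, fields=COMPARED_FIELDS):
    """Return {stop_id: {field: (value_a, value_b)}} for stop_ids present
    in both feeds whose compared fields disagree."""
    conflicts = {}
    for stop_id in sorted(stops_a.keys() & stops_b.keys()):
        diffs = {}
        for field in fields:
            value_a = stops_a[stop_id].get(field, "")
            value_b = stops_b[stop_id].get(field, "")
            if value_a != value_b:
                diffs[field] = (value_a, value_b)
        if diffs:
            conflicts[stop_id] = diffs
    return conflicts


# Sample rows restating the stops quoted in this post (as of 2017-02-22).
SYDNEY_TRAINS = """stop_id,stop_name,stop_lat,stop_lon,location_type,parent_station,wheelchair_boarding
2577146,Robertson Bus,-34.589305,150.597767,1,,0
2577143,Robertson Bus 1,-34.589443,150.597539,0,2577146,0
"""

NSW_TRAINS = """stop_id,stop_name,stop_lat,stop_lon,location_type,parent_station,wheelchair_boarding
2577146,Illawarra Hwy Before Main St,-34.58930722,150.5978025,0,,1
"""

conflicts = find_stop_conflicts(read_stops(SYDNEY_TRAINS), read_stops(NSW_TRAINS))
print(conflicts)
# {'2577146': {'location_type': ('1', '0'), 'wheelchair_boarding': ('0', '1')}}
```

In a real merge the two strings would be replaced by the contents of each feed's stops.txt; stop 2577143 is not reported because it appears in only one of the feeds.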
As noted in the GTFS spec and in https://opendata.transport.nsw.gov.au/sites/default/files/TfNSW_GTFS_release_notes.pdf section 9.3, location_type should only be 1 for stops that are Parent Stations. Per the GTFS spec, Parent Stations cannot have service stopping at them (service must stop at a child stop). Taken separately, both datasets fulfill these requirements, but using them together is ambiguous: stop 2577146 would be both a Parent Station and a stop served directly by NSW Trains trips.
As far as I can tell, the document https://opendata.transport.nsw.gov.au/sites/default/files/TfNSW_Realtime_Bus_Technical_Doc.pdf in section 4.8, field stop_id, says that stop_ids should be unique across TfNSW datasets. I would then expect a given stop_id to correspond to, at least, the same location_type (stop or station) in all TfNSW GTFS feeds.
Questions:
- Are stop_ids indeed intended to be unique across all GTFS and GTFSRT datasets provided by TfNSW at api.transport.nsw.gov.au?
- If the stop_ids are to be unique, are stops with the same stop_id intended to have logically consistent properties in different GTFS datasets?
- If the answer to both of the questions above is yes, can the datasets for Sydney Trains and/or NSW Trains be corrected? (Changing the ID of the parent_station in the Sydney Trains dataset would seem to be the easiest way, but I don’t know what the impact might be internally.)
- If the datasets will be corrected, can you provide a rough estimate of when this might happen?
- If the stop_ids are NOT intended to be unique, do you have a recommended way to handle different information for stops in different datasets, other than dealing with different datasets completely separately?
Thanks,
–Jarek