Incomplete NSW/Sydney Trains GTFS-R Data

Hi there - Melbourne dev here, so not super familiar with the NSW system.
I’ve been working on the Melbourne-Sydney XPT data and I noticed that the way it is returned in the GTFS-R feeds is a bit strange. Sometimes a trip appears only in the NSW Trains feed, sometimes only in the Sydney Trains feed, and sometimes in both. When it's in both, the NSW Trains feed appears to be more accurate.
I’m currently looking at ST24 Albury - Central and the trip data seems to be missing stops. (This is from the NSW Trains feed; the trip is not in the Sydney Trains feed at all.)

entity {
  id: "18"
  trip_update {
    trip {
      trip_id: "ST24.210920.31.1155"
      schedule_relationship: SCHEDULED
      route_id: "4T.T.ST24"
    }
    stop_time_update {
      stop_sequence: 2
      arrival {
        delay: -15
      }
      departure {
        delay: 67
      }
      stop_id: "26603"
    }
    stop_time_update {
      stop_sequence: 3
      arrival {
        delay: 75
      }
      departure {
        delay: 132
      }
      stop_id: "26583"
    }
    stop_time_update {
      stop_sequence: 4
      arrival {
        delay: 54
      }
      departure {
        delay: 61
      }
      stop_id: "26553"
    }
    stop_time_update {
      stop_sequence: 5
      arrival {
        delay: -84
      }
      departure {
        delay: 76
      }
      stop_id: "2650135"
    }
  }
}

The feed only returns updates for Culcairn, Henty, The Rock and Wagga Wagga. However, sites like AnyTrip do not seem to have this issue - they have timings for the whole trip just fine. Is there something I am missing with the data feeds here?

I’m only focusing on the XPT, so I’m not sure if this affects other trips too.

Hi @unikitty,
Make sure to check out our documentation for both NSW Trains and Sydney Trains data here: Documentation | TfNSW Open Data Hub and Developer Portal

The Sydney Trains GTFS-R feed will report vehicles inside the Sydney metropolitan area and South Coast/Central Coast & Newcastle Lines only. You will see the Melbourne service in the Sydney Trains feed when it is inside this boundary.

The NSW Trains feed will report on the full length of the journey, so we recommend using this feed for this particular service. The documentation gives a list of run numbers of other services for which the same applies.

Re: missing stops - the trip update feed does not need to list every stop in the schedule, only the stops where the delay changes. For missing stops you may propagate the delay from the previous stop, as per the GTFS-R spec. Keep in mind, though, that on long services such as Syd-Mel the vehicle often makes up a lot of time later in the journey if it is initially running late, so a full propagation may not end up being accurate.
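For anyone wanting to try this, here is a rough Python sketch of that propagation. It is only an illustration: the function name `propagate_delays`, the dict layout and the total of 7 stops are made up for the example, and the delay values are copied from the trip update quoted above. Stops before the first update get `None`, since the feed says nothing about them.

```python
from typing import Dict, Iterable, Optional, Tuple

Delay = Optional[int]  # delay in seconds, None when the feed gives no information


def propagate_delays(
    stop_sequences: Iterable[int],
    updates: Dict[int, Tuple[Delay, Delay]],
) -> Dict[int, Tuple[Delay, Delay]]:
    """Fill in (arrival, departure) delays for stops missing from a trip update.

    A stop without its own stop_time_update inherits the departure delay of the
    closest earlier stop that has one.
    """
    result: Dict[int, Tuple[Delay, Delay]] = {}
    carried: Delay = None  # last known delay, carried forward along the trip
    for seq in sorted(stop_sequences):
        if seq in updates:
            arrival, departure = updates[seq]
            if arrival is None:
                arrival = carried
            if departure is None:
                departure = arrival
            carried = departure
            result[seq] = (arrival, departure)
        else:
            result[seq] = (carried, carried)
    return result


# Delays from the ST24 trip update above, keyed by stop_sequence.
feed_updates = {2: (-15, 67), 3: (75, 132), 4: (54, 61), 5: (-84, 76)}

# Assumption: pretend the scheduled trip has stop_sequence 1..7.
estimates = propagate_delays(range(1, 8), feed_updates)
for seq, (arr, dep) in estimates.items():
    print(seq, arr, dep)
```

With this input, stops 6 and 7 simply inherit the 76 second departure delay from stop 5. Given the note about long services making up time, it may be worth treating propagated values far down the line as rough estimates only.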

Hi David,

Thanks for the reply, it is helpful. However, while doing some digging I noticed differences in the timetabled data - I think it was the Sydney Trains GTFS showing Platform 4 at Central while the NSW Trains data shows Platform 1. I recall the technical guides saying that the NSW Trains data is more accurate, so I have gone with that, but the trade-off is that there isn’t the train length data anymore (7 or 5 car XPT). The vehicle field in the GTFS-R feed also seems to be consistently blank / null. Is there a way to get around this without loading both datasets to make a comparison?

Thanks for the feedback about departing platforms at Central, I have passed it on for investigation.
And yes, at the moment set type is not available in NSW Trains data, so you will have to compare both.

@unikitty FYI, you can now get set type and number of cars from NSW Trains diesel and regional fleet trip_IDs, as per Sydney Trains trip_IDs.
The new trip ID format is:
[run number].[start date of trip validity].[days of operation for timetable].[consist type].[number of cars].[time for start of trip]
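As a rough sketch, a trip ID in this format could be split in Python as below. The class and function names are made up, and the sample ID `ST24.210920.31.X.7.1155` is invented purely to show the shape (in particular the consist type `X` is a placeholder, not a documented code). Older four-part IDs such as the one quoted earlier in the thread are rejected rather than guessed at.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TripIdParts:
    run_number: str
    validity_start: str      # start date of trip validity, kept as the raw string
    days_of_operation: str   # days of operation for the timetable, raw string
    consist_type: str
    number_of_cars: Optional[int]  # None if the field is not a plain number
    start_time: str          # time for start of trip, raw string


def parse_trip_id(trip_id: str) -> TripIdParts:
    """Split a six-part trip ID on '.' into named fields."""
    parts = trip_id.split(".")
    if len(parts) != 6:
        raise ValueError(f"expected 6 dot-separated fields, got {len(parts)}: {trip_id!r}")
    run, valid_from, days, consist, cars, start = parts
    return TripIdParts(
        run_number=run,
        validity_start=valid_from,
        days_of_operation=days,
        consist_type=consist,
        number_of_cars=int(cars) if cars.isdigit() else None,
        start_time=start,
    )


# Invented example ID, for illustration only.
parsed = parse_trip_id("ST24.210920.31.X.7.1155")
print(parsed.run_number, parsed.consist_type, parsed.number_of_cars)
```

The date, days and time fields are left as raw strings on purpose, since their exact encoding should be checked against the documentation before converting them.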