Are we hitting your API right?

We have built a simple front-end application where you can see the data on a map.

We also have a back-end AWS Lambda function that connects to the Real-Time Locations API and upserts into a Hosted Feature Layer in ArcGIS Online.

Our first observation is that we seem to have more vehicles than we would expect.

Is this something we have done wrong on our side, or at the API end?

  • Polling the API every 30 seconds
  • Using Vehicle_ID as the unique field
  • Checking for duplicates in the API responses
  • Upserting into the Esri layer
  • Every minute, checking for any records older than 1 minute and deleting them
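The last step in the list above could be sketched roughly as follows. This is a minimal illustration only: the `prune_stale` helper and the shape of `records` (a dict keyed by vehicle ID, each value carrying the feed `timestamp` in epoch seconds) are assumptions, not the actual Lambda code.

```python
def prune_stale(records, now, max_age_seconds=60):
    """Return only the records whose timestamp is within max_age_seconds of now.

    `records` is assumed to be a dict keyed by vehicle ID, where each value
    holds a `timestamp` in epoch seconds (hypothetical layout).
    """
    return {
        vehicle_id: record
        for vehicle_id, record in records.items()
        if now - record["timestamp"] <= max_age_seconds
    }


# Example: one fresh record and one that is 90 seconds old.
records = {
    "bus-1": {"timestamp": 1541548402},
    "bus-2": {"timestamp": 1541548312},
}
kept = prune_stale(records, now=1541548402)
```

In a real deployment `now` would come from `time.time()`, and the deletion itself would be a call against the feature layer rather than a dict filter.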
  1. We initially spotted that vehicle_id is not always unique. A single response often contains several entries with the same vehicle_id but different timestamps.

  2. We also spotted that sometimes the vehicle_id comes through as a really long string. Is this expected?

    entity {
      id: "79"
      vehicle {
        trip {
          trip_id: "3AS8.1419.118.16.R.29.54088516"
          schedule_relationship: SCHEDULED
          route_id: "CTY_W2b"
        }
        position {
          latitude: -33.767357
          longitude: 150.90132
        }
        timestamp: 1541548402
        congestion_level: UNKNOWN_CONGESTION_LEVEL
        stop_id: "BLACKTOWN.BN_94C"
        vehicle {
          id: "6362.6779.9652.9369.5901.4755.7407.9761.2538.6371.6756.1351.2701.1304.4061.1846.4578.6449.4681.3772.6266.7305.6874.3619.9763.9732.5144.3762.6233"
          label: "06:28 Perth Station to Central Station "
        }
      }
    }

Probably the first thing to look out for is that the bus feed reports vehicle locations against trips up to 30 minutes before those trips start. This is useful for users who want to see where their bus is prior to the trip starting, but less so when you’re building a network view. You can filter those out by dropping entities where tfnsw_vehicle_descriptor > performing_prior_trip is true.

Yeah, I’ve noticed some instances (for unscheduled bus trips) where there are duplicate vehicle_id values. Your solution of checking for a duplicate ID is probably sufficient since the number of affected trips is generally small, and is usually for trips which don’t correspond to a published timetable. You could also append a number at the end of the ID to ensure its uniqueness, but you’ll likely lose identifier consistency if you’re planning on building a time-series out of the data.
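One possible way to handle the duplicates is to keep only the most recent position per vehicle_id within each response. The sketch below assumes each position has already been flattened into a dict with `vehicle_id` and `timestamp` keys; the `dedupe_latest` helper and that layout are hypothetical, not part of the feed.

```python
def dedupe_latest(positions):
    """Keep one entry per vehicle_id, preferring the latest timestamp.

    `positions` is assumed to be a list of dicts with `vehicle_id` and
    `timestamp` (epoch seconds) keys.
    """
    latest = {}
    for position in positions:
        vehicle_id = position["vehicle_id"]
        current = latest.get(vehicle_id)
        if current is None or position["timestamp"] > current["timestamp"]:
            latest[vehicle_id] = position
    return list(latest.values())


# Example: two reports for the same vehicle, one for another.
positions = [
    {"vehicle_id": "1001", "timestamp": 1541548300},
    {"vehicle_id": "1001", "timestamp": 1541548402},
    {"vehicle_id": "2002", "timestamp": 1541548390},
]
unique_positions = dedupe_latest(positions)
```

This keeps the original ID intact, so a time series built on vehicle_id stays consistent, at the cost of dropping the older duplicate reports.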

Yes, that’s an Indian Pacific service which has 29 train carriages (note R.29 in the trip_id). For the Sydney Trains feed, the vehicle ID is formed by joining a series of numbers, each representing a masked carriage number. That is why it is so long for that particular train service.
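As a quick check of that relationship, the dot-separated groups in the vehicle ID from the sample above can be counted and compared with the number in the trip_id. This is only an illustration: treating the sixth dot-separated field of the trip_id as the car count is an assumption based on this one example, and both helper names are hypothetical.

```python
VEHICLE_ID = (
    "6362.6779.9652.9369.5901.4755.7407.9761.2538.6371."
    "6756.1351.2701.1304.4061.1846.4578.6449.4681.3772."
    "6266.7305.6874.3619.9763.9732.5144.3762.6233"
)
TRIP_ID = "3AS8.1419.118.16.R.29.54088516"


def count_id_groups(vehicle_id):
    """Count the dot-separated number groups in a train vehicle ID."""
    return len(vehicle_id.split("."))


def cars_from_trip_id(trip_id):
    """Read the car count from the trip_id.

    Assumes the count is the sixth dot-separated field (the 29 in "R.29"),
    which is inferred from this single sample only.
    """
    return int(trip_id.split(".")[5])


groups = count_id_groups(VEHICLE_ID)
cars = cars_from_trip_id(TRIP_ID)
```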


Huge thanks for taking the time to respond.

We missed that entirely and will update our code accordingly. We will also handle the longer vehicle ID; we didn’t realise that it’s a concatenation of carriage IDs.

Will report back once we make some changes.

Thanks again!

They aren’t strictly carriage IDs (see this conversation: Sydney Trains Realtime Carriage IDs), but the length of the vehicle ID and the number of concatenated parts are linked to the number of cars in the scheduled train consist.

False = actual position of buses
True = position of future buses? And we don’t want these.

Am I reading that right?

We made the changes to our GitHub repo.

I think you’ve got it. Both are actual positions, but associated with a current and future trip – i.e.

performing_prior_trip = false: actual position of bus, associated with current trip
performing_prior_trip = true: actual position of bus, associated with next trip – can be disregarded for network view purposes.
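For a network view, the filter described above might look something like this. It is a sketch under assumptions: each entity is taken to be already decoded into nested dicts, and the exact nesting used here (`vehicle` > `vehicle` > `tfnsw_vehicle_descriptor`) is a guess at where the extension sits, so adjust it to match the decoded message.

```python
def is_current_trip(entity):
    """Return True unless the entity is flagged as performing a prior trip.

    The dict nesting below is an assumption about the decoded message;
    a missing flag is treated as False (i.e. the position is kept).
    """
    descriptor = (
        entity.get("vehicle", {})
        .get("vehicle", {})
        .get("tfnsw_vehicle_descriptor", {})
    )
    return not descriptor.get("performing_prior_trip", False)


def current_trips_only(entities):
    """Drop positions that are associated with a vehicle's next trip."""
    return [entity for entity in entities if is_current_trip(entity)]


# Example: one current-trip bus, one reported against its next trip.
entities = [
    {"id": "1", "vehicle": {"vehicle": {
        "id": "1001",
        "tfnsw_vehicle_descriptor": {"performing_prior_trip": False}}}},
    {"id": "2", "vehicle": {"vehicle": {
        "id": "1001",
        "tfnsw_vehicle_descriptor": {"performing_prior_trip": True}}}},
]
network_view = current_trips_only(entities)
```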

A note on the vehicle ID for trains and your workaround there. If you have a 40 character limit for your primary key, it’s generally safe to simply truncate to the first 40 characters for the train vehicle ID. That way, you don’t have to hard code an alternative vehicle ID.
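A minimal sketch of that truncation, assuming a 40 character key limit as mentioned above; the `to_primary_key` helper name is hypothetical.

```python
MAX_KEY_LENGTH = 40  # assumed primary key limit from the discussion above


def to_primary_key(vehicle_id, max_length=MAX_KEY_LENGTH):
    """Truncate a vehicle ID to the first max_length characters.

    IDs that are already short enough are returned unchanged.
    """
    return vehicle_id[:max_length]


long_id = "6362.6779.9652.9369.5901.4755.7407.9761.2538.6371"
short_id = "1001"
long_key = to_primary_key(long_id)
short_key = to_primary_key(short_id)
```

Two IDs that share their first 40 characters would collide under this scheme, so it may be worth logging any such cases rather than assuming they never occur.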
