What data do you want?


#21

Static GTFS for non-realtime agencies, to complement the GTFS for real-time.

Together they would include all GTFS data and avoid duplicate agency sources.

A single authoritative source per agency avoids uncertainty and minimizes work, i.e. the best practice. :sunglasses:


#22

I’m confused as to what you are after. We already provide static GTFS. But when you say “complement the GTFS for Realtime”, do you mean schedule data for realtime? Because we have that, except for the bus data, which is scheduled to be deployed.


#23

I mean a separate GTFS file that includes only agencies not included in the static GTFS for real-time.

So that GTFS for real-time + non-realtime GTFS = full static GTFS file.

As of now, agencies have duplicate sources with different identifiers and we have to modify one file to publish all your data while supporting real-time.

The new file would also allow you to eventually phase out the old full static file in favor of single authoritative sources with consistent identifiers and support for real-time (which also means less work).
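As a rough illustration of the split being asked for, the complement could be worked out by comparing the `agency.txt` of the full static bundle with that of the real-time bundles. This is only a sketch: since identifiers reportedly differ between the sources, it assumes agencies can be matched on `agency_name`, and the `complement_agencies` helper and the sample rows are made up for illustration.

```python
import csv
import io


def complement_agencies(full_agency_csv, realtime_agency_csv, key="agency_name"):
    """Return rows of the full static agency.txt whose `key` value does not
    appear in the real-time agency.txt.

    Assumption: `key` (agency_name by default) is comparable across bundles.
    agency_id is not used because the sources use different identifiers.
    """
    realtime_keys = {
        row[key].strip().lower()
        for row in csv.DictReader(io.StringIO(realtime_agency_csv))
    }
    return [
        row
        for row in csv.DictReader(io.StringIO(full_agency_csv))
        if row[key].strip().lower() not in realtime_keys
    ]


# Made-up sample data, not real agency records.
FULL = "agency_id,agency_name\nA1,Example Rail\nA2,Example Coaches\n"
REALTIME = "agency_id,agency_name\nX9,Example Rail\n"

remaining = complement_agencies(FULL, REALTIME)
print([row["agency_id"] for row in remaining])
```

The remaining agency IDs could then be used to filter `routes.txt` and the files that hang off it, which is roughly the filtering consumers have to do by hand today.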


#24

We do have a long-term plan to make trip_ids consistent across all bundles, so the full GTFS file for all modes will match the individual real-time bundles. This first involves consolidating the source of data for each operator. Currently they submit data in different formats depending on whether it goes to transportnsw.info (and out through the ‘direct download’ GTFS and TXC exports) or out through the real-time feeds.

There is a project under way at the moment to create this consolidated source of data. Once it is finished we will be able to work on retaining consistent trip_ids across bundles, so that in theory you will only need the one GTFS bundle for everything…

For now we are stuck with different data sources, hence the filtering you need to do.


#25

I second @4WiFi’s request for the bus plate; that would be a helpful addition for users/passengers.


#26

This is unlikely to happen in the public interface. If misused, it could be used to target people, which would create a safety issue. Even trying to get special access to this data, after proving what is happening and being authorised, is proving impossible.

I would not hold much hope of seeing that in the interface anytime soon. I wish it were there, as I have a project that depends on it, so unless something drastic changes it seems a lot of things may be on hold for all of us.


#27

I second this one.

I’d like to be able to get a data set of Opal Card usage across all users, ideally showing trip/journey information such as trip start and end date/times, mode (bus/train/ferry), start point and end point, and cost of trip. Anonymised data would be preferable, and perhaps broken down by month and year (or week), depending on the size of the sets.

This information is available via the Opal card Android application, so it is mostly available, and probably just needs cleansing/anonymisation.


#28

FYI we are looking into what we can do about Opal data and of course it has to be de-personalised.


#29

I was directed to this gateway by Data NSW for spatial data on NSW roads (static) - am I in the right place? I am an ecologist and would like these data - preferably in a GIS-friendly format - to inform species distribution and threat models. Is there a product here to suit my purposes? Cheers.


#30

Hi @Mora - welcome to the forum.

We don’t currently have the data you are after but appreciate the feedback.

We’ll find out internally whether we own any data sets of this type and can make them available.


#31

Thanks Yvonne! Any help would be greatly appreciated. M


#32

Hi @yvonne.lee.
Thanks for your help on this forum.

I believe train occupancy can be determined by carriage weight, but has machine learning been considered, both for real-time and predictive modelling?

I’d be interested in sharing some of my ideas. In terms of predictive modelling it might get round the problem of OPAL data security whilst still supplying some insightful and useful data.


#33

Some trains (train sets) do have the ability to measure the weight of passengers to provide a guesstimate of the occupancy. Analysis and modelling have of course been considered. There are planning and analysis areas within TfNSW that specialise in this.

Feel free to share your ideas here or alternatively we have mechanisms for you to engage with us (https://future.transport.nsw.gov.au/get-involved/). Note we’ll also be at GovHack where this type of stuff is the bread and butter. Obviously in your case you will want passenger movement data to do analysis which we’re still working on re Opal. Check out the BTS link on our front page or go directly to http://www.bts.nsw.gov.au/ for some other stats that may be of interest.


#34

[quote]
we now publish data for unknown, FEW_SEATS_AVAILABLE and STANDING_ROOM_ONLY under occupancy_status
[/quote]

Looking at the GTFS-realtime docs https://developers.google.com/transit/gtfs-realtime/reference/OccupancyStatus-vp there are a few other values that are available, are all the values used, or just the ones mentioned?

As a more operational/general question, do you know if this information is relayed back to the driver in any way so they are aware of the occupancy status?
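For reference, a minimal sketch of how a consumer might handle the wider set of values. The enum below lists the `OccupancyStatus` values from the GTFS-realtime reference linked above (newer revisions of the spec may define more). The `PUBLISHED` set reflects only what this thread says is in the feed, and treating a missing or unrecognised value as "unknown" is an assumption about how the feed represents that case.

```python
from enum import IntEnum


class OccupancyStatus(IntEnum):
    """OccupancyStatus values from the GTFS-realtime reference."""
    EMPTY = 0
    MANY_SEATS_AVAILABLE = 1
    FEW_SEATS_AVAILABLE = 2
    STANDING_ROOM_ONLY = 3
    CRUSHED_STANDING_ROOM_ONLY = 4
    FULL = 5
    NOT_ACCEPTING_PASSENGERS = 6


# Values this thread says are published (besides "unknown").
PUBLISHED = {
    OccupancyStatus.FEW_SEATS_AVAILABLE,
    OccupancyStatus.STANDING_ROOM_ONLY,
}


def describe(raw):
    """Map a raw occupancy_status (int, or None if absent) to a label.

    Assumption: a missing or unrecognised value is reported as "unknown".
    """
    if raw is None:
        return "unknown"
    try:
        return OccupancyStatus(raw).name
    except ValueError:
        return "unknown"


# Spec values a client should tolerate even if the feed never sends them.
not_published = [s.name for s in OccupancyStatus if s not in PUBLISHED]
print(describe(3), describe(None), not_published)
```

Handling every spec value on the client side means nothing breaks if another level is introduced later.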


#35

[quote=“lt7, post:34, topic:30”]
Looking at the GTFS-realtime docs https://developers.google.com/transit/gtfs-realtime/reference/OccupancyStatus-vp there are a few other values that are available, are all the values used, or just the ones mentioned?[/quote]
At this stage it’s just the ones mentioned. We may introduce one more level, but it wasn’t intended to offer all of the ones suggested in the spec.

As for the driver, the reality is that they can look to find out. The occupancy status is often flawed in that it assumes everyone taps on and taps off correctly. It is also only semi-realtime, i.e. only up to date as of the last stop. So for prediction it is not as useful if you are standing at a bus stop that comes after a busy one: a very empty bus may become full, or vice versa.
Why do you ask?


#36

[quote]
The occupancy status is often flawed in that it assumes everyone taps on and taps off correctly. It is also only semi-realtime, i.e. only up to date as of the last stop. So for prediction it is not as useful if you are standing at a bus stop that comes after a busy one: a very empty bus may become full, or vice versa.
Why do you ask?
[/quote]

Fair point about the tap on/off, but a small margin of error of 5% or so would account for this. Being up to date as of the last stop is fine, in that once the bus is moving it’s pretty difficult to add/remove passengers until the next stop :slight_smile: unless I’ve misunderstood you.

In terms of prediction/use of this data there are potentially a few benefits. Firstly, for a passenger waiting at the next stop, an electronic smart display board could display the next buses coming by route number, occupancy, ETA, etc. Something similar (without occupancy) is already done in Brisbane and is also displayed on the Translink website, e.g. http://jp.translink.com.au/travel-information/network-information/stops-and-stations/stop/pa-hospital-station

Secondly, from an efficiency point of view: fully occupied buses on motorway routes (such as the M2 in Sydney) always pull into each stop, irrespective of whether they can take passengers or not. If the driver was notified that the bus had reached maximum occupancy (or was close to it), they could potentially avoid the stop entirely and keep moving, to the benefit of passengers onboard and passengers waiting (as that bus isn’t clogging up the queue), along with reduced pollution etc. You could probably say they could do that now, but with an electronic system you have the data to back the decision, which leads to…

Thirdly, once you are accurately tracking the occupancy, from a scheduling point of view the providers would have data to help understand which buses are becoming fully occupied, at what times, and at which bus stops, which could hopefully help model appropriate services. You could also measure customer satisfaction to some degree: if standing room only is the only option on a route at a particular stop at a certain time, that may indicate that additional services need scheduling.
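To make the scheduling idea concrete, here is a minimal sketch that counts how often buses are reported as crowded at each stop in each hour. The record layout (stop ID, ISO timestamp, occupancy status) and the sample rows are hypothetical; a real version would be fed from logged vehicle positions.

```python
from collections import Counter
from datetime import datetime

# Hypothetical logged observations: (stop_id, ISO timestamp, occupancy_status)
observations = [
    ("stop_A", "2017-07-03T08:05:00", "STANDING_ROOM_ONLY"),
    ("stop_A", "2017-07-03T08:35:00", "STANDING_ROOM_ONLY"),
    ("stop_A", "2017-07-03T09:10:00", "FEW_SEATS_AVAILABLE"),
    ("stop_B", "2017-07-03T17:20:00", "STANDING_ROOM_ONLY"),
]


def crowded_counts(obs, crowded=("STANDING_ROOM_ONLY",)):
    """Count crowded observations per (stop_id, hour of day)."""
    counts = Counter()
    for stop_id, timestamp, status in obs:
        if status in crowded:
            hour = datetime.fromisoformat(timestamp).hour
            counts[(stop_id, hour)] += 1
    return counts


counts = crowded_counts(observations)
# Busiest (stop, hour) combinations first.
for (stop_id, hour), n in counts.most_common():
    print(f"{stop_id} {hour:02d}:00 crowded observations: {n}")
```

Stop and hour combinations that come out on top repeatedly would be candidates for additional services.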


#37

With the occupancy data for buses, is there a threshold for hitting FEW_SEATS_AVAILABLE and STANDING_ROOM_ONLY that we can use for reference?

Also, I may have just missed this completely, but is there a way we can get the bus model and capacity? I notice it’s on NextThere, but can’t seem to find it on our feeds.
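To show what is being asked for, a sketch of how such thresholds could be applied once they are known. The ratio used here is a placeholder and not TfNSW's actual threshold, and `classify` and its parameters are hypothetical. It also shows why the vehicle capacity matters, since the same passenger count means different things on different bus models.

```python
def classify(load, seats, few_seats_ratio=0.75):
    """Map a passenger count to an occupancy label.

    Placeholder logic: `few_seats_ratio` is an illustrative guess,
    not a published threshold.
    """
    if seats <= 0:
        raise ValueError("seats must be positive")
    if load >= seats:
        return "STANDING_ROOM_ONLY"
    if load >= few_seats_ratio * seats:
        return "FEW_SEATS_AVAILABLE"
    # A spec value; per this thread the feed may report "unknown" instead.
    return "MANY_SEATS_AVAILABLE"


# The same load of 30 passengers on two hypothetical bus sizes.
print(classify(30, seats=40))
print(classify(30, seats=60))
```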


#38

With regards to the bus models, unfortunately we don’t currently have the fleet details in the feed. NextThere has access to a legacy feed which is being decommissioned. I’ll raise it internally to see if we can surface that data up some other way.

I’ll also ask about the thresholds for the limits.


#39

An API to query current, and ideally, past, road toll fees.


#40

I’m looking for this kind of data too. Specifically I want to know for one or more destinations, for a specific time frame (e.g., Sat 10pm-10:30pm) the number of people who arrived at that destination broken down by starting point.

This will be used to analyse venue usage, i.e., where do people going to Kings Cross on a Sunday night come from? This is part of a sociological study.
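The kind of query described could look something like the sketch below, assuming de-personalised trip records were available as (origin, destination, arrival time) tuples. No such data set is confirmed in this thread; the record layout, the `arrivals_by_origin` helper and the sample trips are all made up.

```python
from collections import Counter
from datetime import datetime, time

# Made-up, de-personalised trips: (origin, destination, ISO arrival time)
trips = [
    ("Central", "Kings Cross", "2017-07-01T22:10:00"),
    ("Central", "Kings Cross", "2017-07-01T22:25:00"),
    ("Bondi Junction", "Kings Cross", "2017-07-01T22:15:00"),
    ("Central", "Kings Cross", "2017-07-01T23:00:00"),  # outside window
    ("Central", "Town Hall", "2017-07-01T22:10:00"),    # other destination
    ("Central", "Kings Cross", "2017-07-02T22:10:00"),  # a Sunday
]


def arrivals_by_origin(records, destination, weekday, start, end):
    """Count arrivals at `destination` per origin for one weekday
    (Monday is 0) and a half-open time window [start, end)."""
    counts = Counter()
    for origin, dest, arrived_at in records:
        arrived = datetime.fromisoformat(arrived_at)
        if (
            dest == destination
            and arrived.weekday() == weekday
            and start <= arrived.time() < end
        ):
            counts[origin] += 1
    return counts


# Saturday (weekday 5), 10pm to 10:30pm
result = arrivals_by_origin(trips, "Kings Cross", 5, time(22, 0), time(22, 30))
print(dict(result))
```

Because the output is only a count per starting point, it would not need to expose individual journeys.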