I’ve implemented downloading and processing of the feed from this endpoint: https://api.transport.nsw.gov.au/v1/publictransport/timetables/complete/gtfs
I found that the endpoint does not respect the ‘If-Modified-Since’ header (see “If-Modified-Since - HTTP | MDN”).
I think it would save effort, time and bandwidth if all real-time endpoints and the above GTFS endpoint respected the header.
For example, without using ‘if-modified-since’, my current design is:
- Hit the endpoint with the HEAD method
- Get ‘last-modified’ header value, then convert it to a long value
- Check whether the local folder has a file named "gtfs-the-long-value.zip"
- If there is a file with the name, then exit - nothing to do
- Otherwise, download, extract some files, trim columns, sort rows, compare with previous files, generate DB operations, run DB operations.
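The first four steps above can be sketched roughly as follows. This is only an illustration, not my exact code: the function names are made up for this post, the "long value" is assumed to be epoch seconds, and the `Authorization: apikey <key>` header format is an assumption about how the key is sent.

```python
import urllib.request
from datetime import timezone
from email.utils import parsedate_to_datetime
from pathlib import Path

GTFS_URL = "https://api.transport.nsw.gov.au/v1/publictransport/timetables/complete/gtfs"


def last_modified_to_long(header_value: str) -> int:
    """Convert an HTTP date such as 'Wed, 21 Oct 2015 07:28:00 GMT' to epoch seconds."""
    dt = parsedate_to_datetime(header_value)
    if dt.tzinfo is None:  # treat a zone-less date as UTC
        dt = dt.replace(tzinfo=timezone.utc)
    return int(dt.timestamp())


def local_name(header_value: str) -> str:
    """Build the local file name "gtfs-<long value>.zip" (hypothetical helper)."""
    return f"gtfs-{last_modified_to_long(header_value)}.zip"


def head_last_modified(url: str, api_key: str) -> str:
    """Send a HEAD request and return the raw Last-Modified header value."""
    # Assumption: the key is sent as "Authorization: apikey <key>".
    req = urllib.request.Request(
        url, method="HEAD", headers={"Authorization": f"apikey {api_key}"}
    )
    with urllib.request.urlopen(req) as resp:
        return resp.headers["Last-Modified"]


def needs_download(header_value: str, folder: Path) -> bool:
    """True when no file for this Last-Modified value exists in the folder yet."""
    return not (folder / local_name(header_value)).exists()
```

A run would call `head_last_modified(GTFS_URL, key)` first and only go on to the download and processing step when `needs_download(...)` returns True.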
With ‘if-modified-since’, steps 2-4 can be simplified like this:
…
2. If the HTTP code is 304, then exit - nothing to do
…
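If the endpoint honoured the header, the simplified check might look something like the sketch below. It is hypothetical on several counts: the endpoint does not currently return 304, the function names and the `opener` parameter are invented for illustration, and the `Authorization: apikey <key>` format is an assumption. `since_epoch` stands for the timestamp of the last file I downloaded.

```python
import urllib.error
import urllib.request
from email.utils import formatdate


def build_conditional_request(url: str, since_epoch: int, api_key: str) -> urllib.request.Request:
    """Build a GET request carrying If-Modified-Since as an HTTP date in GMT."""
    return urllib.request.Request(
        url,
        headers={
            # Assumption: the key is sent as "Authorization: apikey <key>".
            "Authorization": f"apikey {api_key}",
            "If-Modified-Since": formatdate(since_epoch, usegmt=True),
        },
    )


def fetch_if_modified(url, since_epoch, api_key, opener=urllib.request.urlopen):
    """Return the response body, or None when the server answers 304 Not Modified."""
    req = build_conditional_request(url, since_epoch, api_key)
    try:
        with opener(req) as resp:
            return resp.read()
    except urllib.error.HTTPError as err:
        # urllib reports 304 as an HTTPError rather than a normal response.
        if err.code == 304:
            return None  # nothing to do
        raise
```

With this, a single request replaces the HEAD call, the date conversion and the local file check: a `None` result means exit, anything else is the new zip to process.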