Proposal: Support DUPLICATED trips in GTFS-RT #221

Merged
merged 13 commits into from
Jul 29, 2020

Conversation

barbeau
Collaborator

@barbeau barbeau commented May 4, 2020

Background

Producers and consumers have identified the need for a lightweight method to take an existing trip defined in static GTFS, and then in real-time say that its pattern (e.g., the stop_times.txt definition) is running at a new service date and/or time [1].

The GTFS-RT spec does not currently define an official way to duplicate trips. The schedule_relationship of ADDED is a potential candidate to convey this type of information. However, the GTFS-RT spec has not clearly defined what exactly ADDED is and how it should be used. As a result, multiple consumers and producers have different interpretations of what an ADDED trip is [2][3].

Proposal

This proposal adds a new trip schedule_relationship of DUPLICATED, which can be used to duplicate an existing trip from GTFS (CSV) but then run that trip pattern at a new time (TripProperties.start_time) and/or service date (TripProperties.start_date). The existing field TripDescriptor.trip_id is used to identify the trip to be duplicated in GTFS trips.txt (CSV), and TripProperties.trip_id defines the new trip_id for the new trip.
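As a rough illustration of the selector/new-trip split described above, here is a minimal sketch in plain Python. The dataclasses are hypothetical stand-ins for the protobuf messages (they are not the generated gtfs-realtime bindings), the enum uses string values rather than the real numeric tags, and `duplicate_trip` and the example IDs are made up for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ScheduleRelationship(Enum):
    # String values are placeholders, not the numeric tags from the .proto.
    SCHEDULED = "SCHEDULED"
    ADDED = "ADDED"
    DUPLICATED = "DUPLICATED"


@dataclass
class TripDescriptor:
    # Identifies the existing trip in static GTFS trips.txt to be duplicated.
    trip_id: str
    schedule_relationship: ScheduleRelationship


@dataclass
class TripProperties:
    # Describes the new trip: its new trip_id and new service date/time.
    trip_id: str
    start_date: Optional[str] = None  # YYYYMMDD
    start_time: Optional[str] = None  # HH:MM:SS


@dataclass
class TripUpdate:
    trip: TripDescriptor
    trip_properties: Optional[TripProperties] = None


def duplicate_trip(source_trip_id, new_trip_id, start_date=None, start_time=None):
    """Build a TripUpdate that duplicates `source_trip_id` as `new_trip_id`."""
    return TripUpdate(
        trip=TripDescriptor(source_trip_id, ScheduleRelationship.DUPLICATED),
        trip_properties=TripProperties(new_trip_id, start_date, start_time),
    )


# Hypothetical example: re-run static trip "trip-A" one minute later.
update = duplicate_trip("trip-A", "trip-A-dup-1", "20200504", "15:21:00")
```

Note that the static trip's ID stays in `trip.trip_id` while the new ID lives only in `trip_properties.trip_id`, so the two are never confused.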

This proposal also deprecates the existing trip schedule_relationship of ADDED, which has been used differently by various producers and consumers with varying meaning due to lack of clarity in the current spec [2][3]. ADDED is soft-deprecated, which means that if existing producers and consumers want to continue using it for their own purposes it will remain usable in the .proto file. EDIT: Per discussion below, the deprecation of ADDED is being removed from this proposal for future discussion separately.

This pull request is a subset of the GTFS-ServiceChanges v3.1 spec:
https://bit.ly/gtfs-service-changes-v3_1

A few producers and at least one consumer currently implement this use case via ADDED, so it should be relatively straightforward to change the implementation to using DUPLICATED instead. Therefore, this feature could be adopted prior to the remainder of the ServiceChanges proposal.

Future proposals may add other fields defined in GTFS trips.txt to the TripProperties message (e.g., route_id, trip_headsign, trip_short_name, block_id, shape_id) so they can be changed in real-time. See the ServiceChanges proposal for details.

EDIT July 9th, 2020 - Added a migration guide for existing consumers and producers publishing duplicated trips using the ADDED enumeration so there is a standardized transition to the DUPLICATED enumeration.

EDIT July 14th, 2020 - Updated the migration guide to allow linking trips via 2nd option to accommodate MBTA interpretation of ADDED.

Announced on the Google Group at https://groups.google.com/forum/#!topic/gtfs-realtime/XL96r9g3W-8.

[1] #216
[2] #106
[3] #113 (comment)

@barbeau barbeau added the GTFS Realtime Issues and Pull Requests that focus on GTFS Realtime label May 4, 2020
@skinkie
Contributor

skinkie commented May 4, 2020

I noticed that you explicitly mention the "soft-deprecation" again, which I don't know the purpose of. Hence I don't see the difference between an ADDED trip as in "a fully new defined trip is added" and "a reinforcement trip is started"; I fail to see why two different semantics would be required.

This proposal by MobilityData is again trying to achieve multiple things at once, but fails to deliver a complete, documented solution for the changes proposed. First of all, it introduces a new concept while, in the background, removing something else, without solving the use case that would justify the removal (or "soft removal", or whatever newspeak you want to call it). Splitting DUPLICATED from ADDED (as being more specific) wouldn't have to touch the definition of ADDED at all. It should be a separate pull request.

To conclude:
-1: I don't like how the trip duplication introduces TripProperties; it does not make sense. A reinforcement trip uses the same information as the trip it reinforces, with typically one extra number to identify the number of reinforcements. We don't have such a reinforcement number in GTFS-RT, but we can have a different start_time; this is in line with the frequency-based definition. I think it is reasonable to assume that there will always be at least a one-second difference between two departing trips.
-1: I don't like how this deprecates ADDED
-1: I think this could once and for all be solved with a good definition of a (generic) ADDED, of which duplication is a specific instance. I see you would rather introduce something new than force organisations to converge their implementations on a single functional description.

@barbeau
Collaborator Author

barbeau commented May 5, 2020

I noticed that you explicitly mention the "soft-deprecation" again, which I don't know the purpose of.

  • Soft-deprecated - Field is removed from docs and marked as [deprecated = true] in .proto. When you re-generate the new bindings from the gtfs-realtime.proto file, you can continue to use the deprecated field in your own systems if you'd like if you already have an implementation for your own purposes. But there is no community consensus on the definition of the field.
  • Hard-deprecated - Field is removed from docs and marked as reserved in .proto, or removed entirely from the .proto. When you re-generate the new bindings from the gtfs-realtime.proto file, you CANNOT use the deprecated field in your own systems. You would need to stop using the field in any existing implementations (e.g., change to using an extension instead).

@tleboulenge

Hence I don't see the difference between an ADDED trip as in "a fully new defined trip is added" and "a reinforcement trip is started"; I fail to see why two different semantics would be required.

I can see 2 distinct cases here where you have a trip-A defined in GTFS and you want to run a vehicle on a trip-B that isn't defined there:

  1. Trip-B looks a lot like trip-A (best case is a reinforcement, where it's identical), and you want to define trip-B as trip-A + some changes (required change: assign a new trip-id; optional changes: a new start time, skip some stops, other usual things you can do to alter a trip in realtime; perhaps change headsigns or other in the future)

  2. Trip-B really doesn't look like trip-A and it's not useful to use trip-A as a template to define trip-B. You then have to fully describe a new trip from scratch: list of stops, times, headsigns, etc.

This pull request is exclusively about (1). There is a whole section in the Service Changes proposal about (2), but it's another level of complexity.
Of course, once both proposals exist, you can always use the verbose syntax of (2) to define the simple case of (1), but I see that as needlessly complex and confusing.
The case of reinforcement buses (or slightly modified trips) is IMO common enough that it warrants a shortcut notation that makes it very explicit what it really is (roughly an extra service on that route), and very readable (e.g. the new 3:21 bus runs as the 3:20 except for the last 5 stops).

Irrespective of the suggested syntax and deprecation, do you agree with this need?

@skinkie
Contributor

skinkie commented May 6, 2020

@tleboulenge my point is, that I can model example 1 in your example 2. Doing so as producer and consumer I have to maintain fewer lines of code, that is less complex and in my humble opinion less confusing.

I do not agree with the shortcut, because it would allow the shortcut to be bypassed later on with the full specification, so both situations would have to be handled in both code paths. But you ask: do I agree with the need? I do, and therefore it is already implemented using ADDED.

@tleboulenge

I do, and therefore it is already implemented using ADDED.

ADDED in its current form is very far from what I called (2) above, i.e. a full-featured solution to define a new trip in Realtime (e.g. giving a whole new sequence of stops and times, plus other attached information such as headsigns).

It is the shortcut you disagree with, and moreover various producers understand its syntax in a different way:
a. is trip-id referring to an existing trip-id from GTFS, or
b. is it a new unique id that can be used to refer uniquely to this new trip?

and consumers handle or ignore different usages of it (e.g. Google reads this as a reinforcement trip, but won't read a detour in it - I'm sure others do something different), and so whether we like it or not, we're already in the world where we have an incomplete solution and we will need to create a new (most likely incompatible) way towards the full solution (define a whole new trip).

In the current situation, for ADDED to be able to represent a reinforcement trip, you need its trip-id to refer to an existing GTFS trip-id (solution (a) above, not (b)), and therefore it's impossible to assign this duplicate trip a unique trip-id and to refer to it uniquely in subsequent messages (or in the other sibling feeds). There are of course many hacks around this: have some kind of naming convention in the trip-id itself, assign it a slightly different start time, and so on. But I'd really wish for this spec to become more explicit and standard across all the actors, and to avoid these kinds of semi-documented, partial, ad-hoc stopgaps.

@barbeau
Collaborator Author

barbeau commented May 7, 2020

In addition to @tleboulenge's points above, another big problem with using the current spec fields for reinforcement trips is that it violates the design conventions of using TripDescriptor as a selector from static GTFS and not a modifier. The only way to specify a different starting time is to use TripDescriptor.start_time. There seems to be consensus in the discussion in #219 that mixing selectors and modifiers in the same field causes problems, so we don't want to introduce that problem here.

This is also the rationale behind the design decision to create a new message TripProperties to hold the trip attributes that can be modified within a TripUpdate (in this case, trip_id, start_time and start_date). TripDescriptor remains the selector, and TripProperties is the modifier.
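To make the selector/modifier distinction concrete, here is a small consumer-side sketch. It assumes a hypothetical in-memory `StaticTrip` record and `resolve_duplicated` helper (neither comes from the spec or any library): the selector only looks a trip up, and the modifier values are applied to a copy, so the static trip is never changed.

```python
from dataclasses import dataclass, replace
from typing import Dict, Tuple


@dataclass(frozen=True)
class StaticTrip:
    # Hypothetical, heavily simplified view of a trip loaded from static GTFS.
    trip_id: str
    route_id: str
    stop_ids: Tuple[str, ...]


def resolve_duplicated(static_trips: Dict[str, StaticTrip],
                       selector_trip_id: str,
                       new_trip_id: str) -> StaticTrip:
    """Select the static trip, then apply the modifier to a copy of it."""
    source = static_trips[selector_trip_id]      # TripDescriptor: selector
    return replace(source, trip_id=new_trip_id)  # TripProperties: modifier


static_trips = {"trip-A": StaticTrip("trip-A", "route-1", ("s1", "s2", "s3"))}
new_trip = resolve_duplicated(static_trips, "trip-A", "trip-A-dup-1")
```

Because the record is frozen and `replace` returns a new object, the original `trip-A` remains available for its own scheduled run.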

@sam-hickey-ibigroup
Contributor

The proposed addition of DUPLICATED looks good, but we are in favor of not soft-deprecating ADDED. We see the value in keeping ADDED so producers can provide real-time info for added/unscheduled trips. As an example, we saw yesterday that MBTA had 49 Green Line trips in their VP feed, and 9 of those were ADDED (see https://cdn.mbta.com/realtime/VehiclePositions.pb and https://cdn.mbta.com/realtime/TripUpdates.pb). It may be worth having a discussion about deprecating ADDED once the full Service Changes proposal is finalized, but it may also be worth keeping ADDED once the Service Changes proposal is in place for the same simplicity/shortcut reason it makes sense to keep DUPLICATED in the long term.

@botanize
Contributor

Since our CAD/AVL software produces messages with ADDED I would like to see consensus on how those should be consumed, and I would like relevant applications to actually consume them.

However, I sense that there is more momentum behind ServiceChanges, so it's probably not worth trying too hard to clarify ADDED unless it is part of the ServiceChanges implementation. I think that puts me in agreement with @sam-hickey-ibigroup.

@barbeau
Collaborator Author

barbeau commented May 20, 2020

Based on feedback in this thread, I've updated the PR so that the ADDED enumeration is no longer deprecated in this proposal - we can table that discussion to be part of the larger change set in ServiceChanges v3.1 (https://bit.ly/gtfs-service-changes-v3_1) that also handles detours, etc.

So, this proposal now focuses only on defining a very specific use case for DUPLICATED trips, using the TripDescriptor as the selector for the trip to duplicate from static GTFS and a new TripProperties container to specify the new start service date and time.

I'd like to move forward with this updated proposal - any specific thoughts before we call for a vote?

@jamespfennell

My main concern with adding extra unconstrained fields to the TripDescriptor like this is that inevitably some producer will misuse them. For example, I can see a producer publishing a SCHEDULED trip with a valid trip ID but also providing a new TripProperties to indicate that the scheduled start time has been updated. The current documentation on the TripProperties type ("Defines updated properties of the trip") seems to suggest that this is a valid use case.

I'm on the consumer side, and all these edge cases have to be accounted for.

To that end, at the very least I would personally like the TripProperties message to be renamed DuplicateTripProperties and for it to be made more clear in the comments that it will be ignored unless the schedule relationship is DUPLICATED.

@barbeau
Collaborator Author

barbeau commented May 20, 2020

@jamespfennell That's certainly a valid concern. I've updated the text description in 4b16a1c for the TripProperties fields in this proposal - previously we said:

Required if schedule_relationship=DUPLICATED, forbidden otherwise.

Now that's better explained as:

Required if schedule_relationship=DUPLICATED, otherwise this field must not be populated and will be ignored by consumers.

Hopefully that helps.
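The updated wording could be checked along these lines. This is only a sketch: the function names are hypothetical, the schedule relationship is passed as a plain string, and the properties are modelled as an optional dict rather than a protobuf message.

```python
from typing import Optional


def check_trip_properties(schedule_relationship: str,
                          trip_properties: Optional[dict]) -> Optional[str]:
    """Producer-side check: return an error message, or None if acceptable."""
    if schedule_relationship == "DUPLICATED":
        if trip_properties is None:
            return "trip_properties is required when schedule_relationship=DUPLICATED"
        return None
    if trip_properties is not None:
        return "trip_properties must not be populated unless schedule_relationship=DUPLICATED"
    return None


def effective_trip_properties(schedule_relationship: str,
                              trip_properties: Optional[dict]) -> Optional[dict]:
    """Consumer-side rule: ignore trip_properties unless DUPLICATED."""
    return trip_properties if schedule_relationship == "DUPLICATED" else None
```

A SCHEDULED trip that carries properties is therefore flagged on the producer side and silently ignored on the consumer side, which covers the misuse scenario raised earlier in the thread.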

As far as renaming TripProperties, if you look at the larger Service Changes v3.1 proposal draft (https://bit.ly/gtfs-service-changes-v3_1), we do plan to use TripProperties to allow modifying other trip attributes (e.g., headsign, short name) in real-time - see examples here:
https://docs.google.com/document/d/1oPfQ6Xvui0g3xiy1pNiyu5cWFLhvrTVyvYsMVKGyGWM/edit#bookmark=id.46iq4ddo589j

Those features are outside the scope of this specific PR, but we do want to use naming conventions here that can accommodate those future changes.

@barbeau
Collaborator Author

barbeau commented May 27, 2020

This pull request has been open for several weeks, so per the Official Process I'm calling for a vote.

Vote will be closed on Wednesday June 3rd at 23:59:59 UTC.

@skinkie
Contributor

skinkie commented May 27, 2020

While I firmly believe that ADDED would be sufficient, I don't mind having a more explicit DUPLICATED for reinforced trips. The future use of TripProperties sounds like a direction that the standard can build upon further, allowing a trip to retain a specific relationship with the schedule.

+1 (OpenGeo)

@sam-hickey-ibigroup
Contributor

We don't see an explicit mention that stop times on duplicated trips are adjusted based on the offset between the original trip start time and the duplicated trip's start time, but this is how we are interpreting this addition to the spec. Is this the correct interpretation?
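Under the interpretation described in this comment (which the spec text did not yet state explicitly at this point in the thread), the duplicated trip's stop times would be the original times shifted by the difference between the two start times. A minimal sketch, with hypothetical helper names and made-up times:

```python
from typing import List


def to_seconds(hhmmss: str) -> int:
    """Convert a GTFS HH:MM:SS time to seconds since the service day start."""
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def to_hhmmss(total_seconds: int) -> str:
    """Format seconds as HH:MM:SS; hours may exceed 23, as GTFS allows."""
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}"


def shift_stop_times(stop_times: List[str],
                     original_start: str,
                     new_start: str) -> List[str]:
    """Shift every stop time by (new_start - original_start)."""
    offset = to_seconds(new_start) - to_seconds(original_start)
    return [to_hhmmss(to_seconds(t) + offset) for t in stop_times]


# Original trip starts at 15:20:00; the duplicate starts one minute later.
shifted = shift_stop_times(["15:20:00", "15:27:30", "15:41:00"],
                           "15:20:00", "15:21:00")
```

The sketch does not guard against shifts that would produce negative times, and it ignores service-date changes, which would be handled separately via start_date.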

@prhod

prhod commented May 28, 2020

Sorry for asking a question this late, but I see in the PR that, for VehiclePosition:
When 'schedule_relationship' is 'DUPLICATED', the 'trip_id' identifies the trip from static GTFS to be duplicated.
If I understand correctly, you could have both cases:

  • if the trip A (used for duplication) is DELETED, the VehiclePosition will apply to trip B (the duplicated one)
  • if the trip A is not DELETED, the VehiclePosition will apply to trip A (or both of them ?)

Shouldn't we use the new trip_id to reference the position of the trip B ?

@barbeau
Collaborator Author

barbeau commented May 28, 2020

@sam-hickey-ibigroup and @prhod Great feedback, thanks! I agree that we should clarify both of these items in the spec language. I'm going to pause the vote and work on some updated text for both cases.

@barbeau
Collaborator Author

barbeau commented Jul 14, 2020

@paulswartz I'd rather keep this particular change in one TripUpdates feed (since in the end, it's also where it's going to stay), but we may amend the migration guide to allow both your current syntax and the new one (where your ADDED.trip-id would match the new DUPLICATED.trip-properties.trip-id). I think this should be sufficient to avoid duplicating the duplicate trips, if I may say so...

@tleboulenge Thanks, that's a good idea! I just updated the migration guide in 426610d to allow for this 2nd option for linking the two entities (trip.trip_id of the ADDED trip is the same as the DUPLICATED trip trip_properties.trip_id)

@paulswartz @gcamp @juanborre Does this work for you?
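For consumers, the second linking option described above might be handled roughly as follows. The thread only states how the two entities are linked; dropping the linked ADDED entity in favor of the DUPLICATED one is an assumption made here for illustration, as are the flat dict layout and the `drop_linked_added` name.

```python
from typing import Dict, List


def drop_linked_added(updates: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Remove ADDED entities whose trip_id matches a DUPLICATED trip's new ID.

    Each update is a flat dict with 'schedule_relationship' and 'trip_id'
    (from trip.trip_id); DUPLICATED updates also carry
    'trip_properties_trip_id' (from trip_properties.trip_id).
    """
    duplicated_new_ids = {
        u["trip_properties_trip_id"]
        for u in updates
        if u["schedule_relationship"] == "DUPLICATED"
    }
    return [
        u for u in updates
        if not (u["schedule_relationship"] == "ADDED"
                and u["trip_id"] in duplicated_new_ids)
    ]


feed = [
    {"schedule_relationship": "ADDED", "trip_id": "trip-B"},
    {"schedule_relationship": "DUPLICATED", "trip_id": "trip-A",
     "trip_properties_trip_id": "trip-B"},
    {"schedule_relationship": "ADDED", "trip_id": "trip-C"},
]
deduplicated = drop_linked_added(feed)
```

Here the ADDED entity for `trip-B` is treated as the same trip as the DUPLICATED entity and removed, while the unrelated ADDED `trip-C` is kept.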

@juanborre
Contributor

Yes, that works for Transit 👍

@paulswartz
Contributor

That looks like it'll work for @mbta as well.

@barbeau
Collaborator Author

barbeau commented Jul 21, 2020

Great! Looks like Google (@tleboulenge) and Transit App (@gcamp, @juanborre) and MBTA (@paulswartz) are now all in favor of the current proposal and migration guide.

I'd like to re-start the vote for this proposal to be added as experimental fields in GTFS-RT. Vote will be closed on Tuesday July 28th at 23:59:59 UTC.

@tleboulenge (Google), @sam-hickey-ibigroup or @ibi-group-team (IBI Group), @gcamp or @juanborre (Transit App), @paulswartz (MBTA), and @skinkie (OpenGeo) you've all expressed support for the proposal in this thread. Could you please vote with a "+1" again?

cc @timMillet

@juanborre
Contributor

+1 for Transit 🎉

@paulswartz
Contributor

+1 for @mbta

@tleboulenge

+1 for Google

@sam-hickey-ibigroup
Contributor

+1 for IBI Group

@timMillet
Contributor

timMillet commented Jul 29, 2020

The voting period ended and the proposal is now adopted as experimental!

4 votes in favor:
@juanborre for Transit
@paulswartz for MBTA
@tleboulenge for Google
@sam-hickey-ibigroup for IBI Group

No abstentions and no votes against.

I'll merge!

@timMillet timMillet merged commit 770ffee into google:master Jul 29, 2020
@timMillet timMillet deleted the duplicated-2 branch July 29, 2020 21:33
barbeau added a commit to MobilityData/transit that referenced this pull request Aug 7, 2020
This is an editorial follow-up to google#221 to fix two items:
* New messages should have the extensions fields to allow 3rd party and internal extensions - this commit adds these fields
* An erroneous comment was added that labeled schedule_relationship as experimental - this commit removes it
barbeau added a commit that referenced this pull request Aug 10, 2020
This is an editorial follow-up to #221 to fix two items:
* New messages should have the extensions fields to allow 3rd party and internal extensions - this commit adds these fields
* An erroneous comment was added that labeled schedule_relationship as experimental - this commit removes it
@barbeau
Collaborator Author

barbeau commented Jul 13, 2021

Hi all! I just wanted to check back in and see if there were any implementations of DUPLICATED trips by producers or consumers yet. This field is still experimental now, but once we have producers and consumers it can be officially adopted.

In particular, @juanborre, @paulswartz, @tleboulenge, @sam-hickey-ibigroup - any movement towards implementations on your end?

@juanborre
Contributor

👋 We haven't implemented it yet but we will when there is a producer. It won't be long.

@sam-hickey-ibigroup
Contributor

Sorry for the delayed response. We also haven't implemented it yet as none of the transit agencies we work with are interested in using this at this point.

@leonardehrenfried
Contributor

Just an update for all interested parties: it looks like OpenTripPlanner will start implementing this at some stage as an agency is interested in this feature.

@wassimbenaissa

Hello everyone,

First of all, thank you for all the content you produced on this topic :)

I'd like to follow up on Barbeau's question: Have you come across any implementations of the "DUPLICATED" use case?

@leonardehrenfried
Contributor

I have been working with GTFS-RT for > 4 years and have not encountered this once.

@skinkie
Contributor

skinkie commented Mar 11, 2024

I'd like to follow up on Barbeau's question: Have you come across any implementations of the "DUPLICATED" use case?

Fortifications (running an extra vehicle to combat full buses) are defined in the Dutch standards, and were implemented as well. But in recent years, such instances have been very scarce. So from an operational context, finding an example is not very easy.

@wassimbenaissa

Thank you for your response.

I've been looking for examples of "DUPLICATED" in the French open data, but I haven't found anything so far.

And what about the "ADDED" use case? How often do the agencies you're in contact with utilize it?

@jfabi
Contributor

jfabi commented Mar 18, 2024

At the @mbta, we’ve been interested in the possible use of DUPLICATED (as well as DELETED) and may be taking a look at it later this year; for our subway lines, we fully produce the VehiclePositions and TripUpdates ourselves.

We currently make heavy use of the ADDED relationship whenever we’re unable to match a train's trajectory to a GTFS-static trip, and/or when the departure doesn't closely match a scheduled time, such as when we add unscheduled trips. What we've found, however, is that while we make this work well enough for our in-house applications, the spec is unclear about whether new trip IDs can be used, which is our practice (see also the references in the top post of this PR), and so most third-party apps do not support predictions for our ADDED trips. See also @abyrd's great summary of the issue in the OpenTripPlanner PR linked directly above.

For us, DUPLICATED would be an opportunity to replace some but not all of the trips we currently mark as ADDED.
