Some public stops/stations are missing in GTFS as they're missing in physical_station table #24

Closed
mk-fg opened this issue Sep 10, 2017 · 2 comments


@mk-fg
Contributor

mk-fg commented Sep 10, 2017

For example, L35246 (and roughly 1.2k other entries, according to grep) appears to have the UPMNLT ("Upminster L.T.") station as its origin, with a public departure time.
This station is also present in the "TIPLOC Eastings and Northings.xlsx.gz" list, and seems to be a real, google-able thing.

It's missing from the physical_station mysql table, which is why it doesn't make it into GTFS, but it is greppable in the CIF MCA file (ttis653.zip data) as a TI (TIPLOC insert) entry, and is parsed into a corresponding entry in the tiploc table:

                 id: 9998
        tiploc_code: UPMNLT
           capitals: 08
              nalco: 073600
nlc_check_character: N
    tps_description: UPMINSTER L.T.
             stanox: 51353
        po_mcp_code: 0
           crs_code: ZUM
        description: UPMINSTER UND

Maybe it'd make sense to add another JOIN to that massive query for CIF schedules and check both tiploc.crs_code and physical_station.crs_code, in case one or the other is missing?
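To make the suggestion concrete, here is a minimal sketch of such a fallback, run against an in-memory SQLite database rather than the project's MySQL schema. The `stop` table, the `resolve_crs` helper and the column layout are assumptions made up for illustration (only the `tiploc` and `physical_station` table names and the `crs_code` / `tiploc_code` columns come from this issue), and it assumes a missing code is stored as NULL.

```python
import sqlite3


def resolve_crs(tiploc_rows, physical_rows, stop_tiplocs):
    """Map each stop's TIPLOC to a CRS code, or None if neither table has one.

    Hypothetical helper: prefers physical_station.crs_code and falls back
    to tiploc.crs_code when the station is absent from physical_station.
    """
    db = sqlite3.connect(":memory:")
    # Simplified, assumed schema; the real tables have many more columns.
    db.executescript("""
        CREATE TABLE tiploc (tiploc_code TEXT PRIMARY KEY, crs_code TEXT);
        CREATE TABLE physical_station (tiploc_code TEXT PRIMARY KEY, crs_code TEXT);
        CREATE TABLE stop (tiploc_code TEXT);
    """)
    db.executemany("INSERT INTO tiploc VALUES (?, ?)", tiploc_rows)
    db.executemany("INSERT INTO physical_station VALUES (?, ?)", physical_rows)
    db.executemany("INSERT INTO stop VALUES (?)", [(t,) for t in stop_tiplocs])
    # LEFT JOIN both tables so a stop survives even when one side is missing,
    # then take the first non-NULL CRS code.
    query = """
        SELECT s.tiploc_code, COALESCE(ps.crs_code, t.crs_code)
        FROM stop s
        LEFT JOIN physical_station ps ON ps.tiploc_code = s.tiploc_code
        LEFT JOIN tiploc t ON t.tiploc_code = s.tiploc_code
    """
    result = dict(db.execute(query).fetchall())
    db.close()
    return result


if __name__ == "__main__":
    # UPMNLT is only in tiploc (as in this issue); LRDDEAC has no CRS anywhere.
    print(resolve_crs(
        tiploc_rows=[("UPMNLT", "ZUM"), ("LRDDEAC", None)],
        physical_rows=[],
        stop_tiplocs=["UPMNLT", "LRDDEAC"],
    ))
```

With a fallback like this, UPMNLT would resolve to ZUM from the tiploc table, while a location with no CRS code in either table still comes back as NULL and can be dropped later.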

@mk-fg
Contributor Author

mk-fg commented Sep 10, 2017

Actually, this L35246 example illustrates another problem, which probably should be considered when fixing this one.

Joined CIF schedule / stop_times entries currently processed by the script for L35246 are:

  L35246   23167 P 2017-05-22 2017-12-08 12345.. A ZTU 00:02 00:02 T
  L35246   23167 P 2017-05-22 2017-12-08 12345.. A GUN 00:05 00:05 T
  L35246   23167 P 2017-05-22 2017-12-08 12345.. A KWG 00:08 00:08 T
  ...

While actual entries in these tables are:

  L35246   23167 P 2017-05-22 2017-12-08 12345.. A ZUM --:-- 22:43 TB
  L35246   23167 P 2017-05-22 2017-12-08 12345.. A ZTU 00:02 00:02 T
  L35246   23167 P 2017-05-22 2017-12-08 12345.. A GUN 00:05 00:05 T
  L35246   23167 P 2017-05-22 2017-12-08 12345.. A KWG 00:08 00:08 T
  ...

Note how including that first entry effectively changes the service dates: the trip departs ZUM at 22:43 and only reaches ZTU at 00:02, i.e. after midnight!

So the condition in the aforementioned "massive query" that gets schedule/stop times should not have crs_code IS NOT NULL in it, as that might prevent the script from ever processing that first stop and make it set the wrong date for the GTFS trip (like it does here), regardless of whether such a stop will end up in the GTFS trip or get discarded.
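The ordering problem can be sketched in a few lines of Python. This is not the script's actual code: `Stop`, `origin_departure`, `public_stops` and `crosses_midnight` are hypothetical names, and the stop list simply mirrors the L35246 rows above, with the origin's CRS set to None to model what the current join yields.

```python
from typing import List, NamedTuple, Optional


class Stop(NamedTuple):
    crs_code: Optional[str]  # None models a stop the current join cannot resolve
    departure: str           # "HH:MM", as in the listings above


def public_stops(stops: List[Stop]) -> List[Stop]:
    """Stops that can be emitted into GTFS (those with a CRS code)."""
    return [s for s in stops if s.crs_code is not None]


def origin_departure(stops: List[Stop]) -> str:
    """Departure time of the real first stop, taken BEFORE any CRS filtering."""
    return stops[0].departure


def crosses_midnight(stops: List[Stop]) -> bool:
    """True if any departure is earlier in the day than the one before it."""
    return any(b.departure < a.departure for a, b in zip(stops, stops[1:]))


if __name__ == "__main__":
    trip = [
        Stop(None, "22:43"),   # ZUM origin, unresolved by the current query
        Stop("ZTU", "00:02"),
        Stop("GUN", "00:05"),
        Stop("KWG", "00:08"),
    ]
    # Filtering first hides the 22:43 origin, so the trip seems to start at 00:02.
    print(origin_departure(public_stops(trip)), crosses_midnight(public_stops(trip)))
    # Using the unfiltered list keeps the real origin and the midnight rollover.
    print(origin_departure(trip), crosses_midnight(trip))
```

The point of the sketch is only the order of operations: work out the trip's start from the full stop list, and drop stops without a CRS code afterwards.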

@linusnorton
Collaborator

A quick query for departure stations with no CRS returns only 12 services, starting from LRDDEAC, which is a depot. Given that we understand TB to mean that passengers can get on or off, and LRDDEAC is a depot, I think we can skip it. I won't add any extra checks for days etc. at this point, but I will add that extra join... because you can never have enough joins.
