flight-cli: GF-fast / Matrix-deep — routing backend, progressive enrich, throttle hardening, date-grid calendar#29
Merged
…e legs (work-fjibi.1) _gflight_ids parsed fl[22] (the operating carrier) as the leg airline, so a codeshare like 'Lufthansa operated by Air Dolomiti' surfaced as Air Dolomiti EN8858 instead of Lufthansa LH9498 (what you book). A leg tuple carries the operating carrier at fl[22] and the marketing/selling carriers at fl[15]; fl[18] is truthy when the operating carrier markets under its own code. The booking carrier is the marketing carrier (fl[15][0]) on operated-for regional legs (fl[15] present, fl[18] falsy), else the operating carrier. Matrix surfaces this same marketing identity, so aligning also lets the cross-backend flight#+date join fire on codeshares. Ground-truthed against Google Flights' own headline labels: OS36 (fl18=[true] -> Austrian), Air Dolomiti EN8858 (fl18=null -> Lufthansa), SWISS LX39 (no fl15 -> SWISS). Operating carrier + marketing-carrier set are now captured on LegAmenities for the 'operated by' label and the upcoming O: routing filter.
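The booking-carrier rule above can be sketched as follows. This is a minimal illustration, not the project's `_gflight_ids`: it assumes `fl[22]` holds a plain carrier code and `fl[15]` a list of marketing carrier codes, which the commit message does not spell out. `booking_carrier` and `make_leg` are hypothetical names.

```python
from typing import Any, Optional, Sequence


def booking_carrier(fl: Sequence[Any]) -> Optional[str]:
    """Return the carrier you book for one leg tuple.

    Assumed shapes (not confirmed by the source): fl[22] is the operating
    carrier code, fl[15] a list of marketing carrier codes, fl[18] a flag
    that is truthy when the operating carrier markets under its own code.
    """
    operating = fl[22] if len(fl) > 22 else None
    marketing = fl[15] if len(fl) > 15 else None
    self_marketed = bool(fl[18]) if len(fl) > 18 else False
    # Operated-for regional leg: marketing carriers present, flag falsy.
    if marketing and not self_marketed:
        return marketing[0]
    return operating


def make_leg(operating: str, marketing: Any, flag: Any) -> list:
    """Build a sparse leg tuple for illustration (hypothetical helper)."""
    fl: list = [None] * 23
    fl[22], fl[15], fl[18] = operating, marketing, flag
    return fl


print(booking_carrier(make_leg("EN", ["LH"], None)))    # LH (Air Dolomiti leg sold by Lufthansa)
print(booking_carrier(make_leg("OS", ["OS"], [True])))  # OS (Austrian markets its own code)
print(booking_carrier(make_leg("LX", None, None)))      # LX (no marketing set)
```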
…es (work-fjibi.2) New routing_predicates module: parses Matrix routing language + extension codes into a flat predicate set, each classified by how Google Flights can honor it: Tier 1 (native GF filter), Tier 2 (post-filter on the base payload), or Tier 3 (fare-construction / unexpressible -> Matrix only). Routing language is positional, so it's parsed all-or-nothing per string: only unambiguous order-independent forms map (single carrier-with-quantifier LH+/~UA+/O:LH+, nonstop N/N:UA, single flight #, and the flanked F* X:LHR F* via-airport idiom). Ordered chains (BA AA, DFW DEN), bare single-segment carriers, country filters, count placeholders, and unknown tokens escalate the whole routing to Matrix and are never partially honored. Extension codes are order-independent and classified per directive. ClassifiedConstraints.requires_matrix is the gate: true iff any predicate is Tier 3, i.e. the query can't be served from Google Flights alone.
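A rough sketch of the all-or-nothing classification, covering only the single carrier-with-quantifier and bare nonstop forms. The `Predicate` fields, the `kind` strings and `classify_routing` are illustrative assumptions; only `ClassifiedConstraints.requires_matrix` is named in the commit. Tier assignments for `~UA+` and `O:LH+` follow the later commits (exclude and operating-carrier checks are post-filters).

```python
import re
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    NATIVE = 1       # GF filter can express it
    POST_FILTER = 2  # checked against returned itineraries
    MATRIX_ONLY = 3  # fare construction or unexpressible


@dataclass(frozen=True)
class Predicate:
    kind: str
    value: str
    tier: Tier


@dataclass(frozen=True)
class ClassifiedConstraints:
    predicates: tuple

    @property
    def requires_matrix(self) -> bool:
        # True iff any predicate is Tier 3.
        return any(p.tier is Tier.MATRIX_ONLY for p in self.predicates)


# Single carrier with the "+" quantifier, optionally negated (~) or operating (O:).
_CARRIER_PLUS = re.compile(r"^(?P<prefix>~|O:)?(?P<code>[A-Z0-9]{2})\+$")


def classify_routing(routing: str) -> ClassifiedConstraints:
    tokens = routing.split()
    if len(tokens) == 1:
        token = tokens[0]
        if token == "N":
            return ClassifiedConstraints((Predicate("nonstop", "", Tier.NATIVE),))
        m = _CARRIER_PLUS.match(token)
        if m:
            prefix, code = m.group("prefix"), m.group("code")
            if prefix is None:
                return ClassifiedConstraints((Predicate("marketing_include", code, Tier.NATIVE),))
            kind = "marketing_exclude" if prefix == "~" else "operating_include"
            return ClassifiedConstraints((Predicate(kind, code, Tier.POST_FILTER),))
    # Anything else (ordered chains, bare carriers, unknown tokens) escalates whole.
    return ClassifiedConstraints((Predicate("routing", routing, Tier.MATRIX_ONLY),))
```

Under these assumptions `classify_routing("LH+")` stays on Google Flights, while `classify_routing("BA AA")` and the bare `classify_routing("LH")` both set `requires_matrix`.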
…ork-fjibi.3) apply_gf_native_filters() applies Tier-1 (GF-native) predicates onto an fli FlightSearchFilters in place: marketing-carrier include -> airlines; alliance -> airlines (ONEWORLD/SKYTEAM/STAR_ALLIANCE enum tokens); connect-at airport -> layover_restrictions.airports; MAXCONNECT -> layover max_duration; MAXDUR -> max_duration; nonstop / MAXSTOPS -> stops. Returns False if a carrier/airport code fli can't map, so the caller escalates to Matrix rather than dropping it. Also reclassifies carrier/airport *exclude* as Tier-2 (post-filter): GF's airline and connecting-airport controls are allow-lists, so excluding would need the route's carrier/airport set to complement (a probe) — a result post-filter is simpler and equally correct. Include stays Tier-1 (native).
…i.4) _gf_postfilter evaluates the Tier-2 predicates GF can't request natively against the returned itineraries: operating-carrier include/exclude (O:/OPAIRLINES), marketing-carrier exclude (~UA/-AIRLINES), connection-airport exclude (~DFW/-CITIES), -CODESHARE, and specific flight #/range. Predicates apply per slice (outbound routing doesn't filter the return). Threads the operating carrier + marketing-carrier set captured in work-fjibi.1 through LegInfo and the gflight adapter so the filter can see them. gf_can_serve() is the gate's capability check: GF alone serves a query only when it has no Tier-3 predicate AND every Tier-2 predicate is post-filterable here. Time-based Tier-2 predicates (min layover, red-eyes, overnight stops) aren't yet evaluable (no per-segment times threaded) so they escalate to Matrix rather than being silently dropped.
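The per-slice behaviour can be illustrated with a small sketch. The `Leg` shape, the `(kind, code)` predicate tuples and the function names are assumptions made for this example; the "every leg must match" reading of `O:LH+` is likewise an assumption about the quantifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leg:
    operating: str        # operating carrier code, e.g. "UA"
    marketing: frozenset  # marketing carrier codes, e.g. {"UA", "LH"}


def slice_passes(legs, predicates) -> bool:
    """Check one slice's legs against that slice's (kind, code) predicates."""
    for kind, code in predicates:
        if kind == "operating_include" and any(l.operating != code for l in legs):
            return False
        if kind == "operating_exclude" and any(l.operating == code for l in legs):
            return False
        if kind == "marketing_exclude" and any(code in l.marketing for l in legs):
            return False
    return True


def postfilter(itineraries, predicates_by_slice):
    """Keep itineraries whose every slice passes its own predicates.

    An itinerary is a list of slices; a slice is a list of legs. Slices
    without predicates (for example the return) are not filtered.
    """
    return [
        itin
        for itin in itineraries
        if all(
            slice_passes(legs, predicates_by_slice.get(i, ()))
            for i, legs in enumerate(itin)
        )
    ]


ua58 = Leg("UA", frozenset({"UA", "LH"}))  # LH codeshare operated by UA
lh_leg = Leg("LH", frozenset({"LH"}))
itineraries = [[[ua58], [lh_leg]], [[lh_leg], [ua58]]]
# O:LH+ on the outbound only: the UA-operated outbound is dropped,
# the UA-operated return is left alone.
kept = postfilter(itineraries, {0: [("operating_include", "LH")]})
```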
…ot Matrix (work-fjibi.5) Reworks _pick_backend: --routing/--extension no longer force the ~45s Matrix path on their own. They're classified, and Google Flights serves them (~1s) whenever it can honor every constraint: native filters (Tier 1) plus the result post-filter (Tier 2). Only fare-construction (fare basis / booking class) or a constraint GF can't reconstruct, plus the hard-Matrix flags (--slice, --depart-times/--return-times, extra pax types), still go to Matrix. _run_gflight_path now applies the Tier-1 native filters to the fli query and the Tier-2 post-filter to the results (per slice). The native mapper became pure optimization: if an fli carrier/airport code doesn't map it skips that query dimension (no under-return) and the post-filter (a string-based backstop that also enforces marketing-include and connect-at) guarantees correctness. Multi-cabin + routing stays on Matrix for now (the multi-cabin gflight path doesn't apply the filters yet). Smoke-tested live: 'LH+' -> 1s, LH-marketed incl UA58 (LH codeshare, Matrix-consistent); 'O:LH+' -> 1s, drops UA58 (operated by UA); 'F bc=y' -> Matrix.
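The gate described here reduces to a short decision function. The sketch below is an assumption-laden paraphrase, not `_pick_backend` itself: `Query` and its fields are hypothetical, and the hard-Matrix flags are collapsed into one boolean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Query:
    predicate_tiers: tuple = ()            # tier (1, 2 or 3) of each classified predicate
    tier2_all_postfilterable: bool = True  # False if e.g. a time-based Tier-2 predicate is present
    hard_matrix_flag: bool = False         # --slice, --depart-times/--return-times, extra pax types
    multi_cabin: bool = False


def pick_backend(q: Query) -> str:
    if q.hard_matrix_flag:
        return "matrix"
    if 3 in q.predicate_tiers:
        return "matrix"  # fare construction or unexpressible constraint
    if 2 in q.predicate_tiers and not q.tier2_all_postfilterable:
        return "matrix"  # escalate rather than silently drop a predicate
    if q.multi_cabin and q.predicate_tiers:
        return "matrix"  # multi-cabin GF path does not apply the filters yet
    return "gflight"
```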
merge_results() matches a fast Google Flights result against the authoritative Matrix result by flight number + departure date per slice (aligned now that the gflight adapter emits marketing numbers, work-fjibi.1). Each merged row carries both prices attributed (they should agree; show both) and a source tag: matched (both), Matrix-only (added — broader fare coverage), or GF-only (kept + flagged, ULCC/codeshare). Matrix's itinerary structure is authoritative on a match. Rows are price-sorted. Pure function; backend orchestration wires it next.
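A minimal sketch of the join, assuming each result is reduced to a key of per-slice (flight number, departure date) pairs plus a price. `Row`, `Merged`, the source-tag strings and the choice to sort by the lowest available price are illustrative assumptions, not the project's actual types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Row:
    key: tuple    # one (flight_number, departure_date) pair per slice
    price: float


@dataclass(frozen=True)
class Merged:
    key: tuple
    gf_price: Optional[float]
    matrix_price: Optional[float]
    source: str   # "matched", "matrix-only" or "gf-only"


def merge_results(gf_rows, matrix_rows):
    gf_by_key = {r.key: r for r in gf_rows}
    seen = set()
    merged = []
    # Matrix rows come first: its itinerary is authoritative on a match.
    for m in matrix_rows:
        g = gf_by_key.get(m.key)
        seen.add(m.key)
        merged.append(
            Merged(m.key, g.price if g else None, m.price,
                   "matched" if g else "matrix-only")
        )
    # GF rows that Matrix never returned are kept and flagged.
    for g in gf_rows:
        if g.key not in seen:
            merged.append(Merged(g.key, g.price, None, "gf-only"))
    merged.sort(key=lambda r: min(p for p in (r.gf_price, r.matrix_price) if p is not None))
    return merged


gf = [Row((("LH455", "2026-07-01"),), 1120.0), Row((("F9100", "2026-07-01"),), 300.0)]
mx = [Row((("LH455", "2026-07-01"),), 622.0), Row((("LH9403", "2026-07-01"),), 700.0)]
rows = merge_results(gf, mx)
```

Because the function takes two lists and returns a new one, it stays pure and can be tested without either backend.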
…k-x8gl1b) For a GF-serveable query, dispatch Google Flights and Matrix concurrently under one anyio.run: GF runs in a worker thread (it's sync curl_cffi) while the Matrix request progresses on the event loop. GF paints immediately (~1s) with a 'refining with Matrix' note; once Matrix lands (~45s) the reconciled GF+Matrix table repaints with both prices attributed (they can differ a lot — Matrix surfaces cheaper published fares; showing both is the point). PP/awards + URLs run on the Matrix (authoritative) result. --fast / --no-enrich takes the GF-only path (~1s); JSON output also stays GF-only (single stable shape). Per-backend error isolation: if Matrix fails the GF table still stands; if GF fails Matrix still renders. Extracts the GF query into _gflight_results (shared by both paths) and adds _render_merged. Smoke-tested: SFO-FRA --routing LH+ -> GF table ~1s, merged GF+Matrix ~55s (Matrix $622 vs Google $1120 attributed); --fast -> GF only.
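The dispatch shape can be sketched with the standard library. The commit uses anyio; this example substitutes `asyncio` (with `asyncio.to_thread` for the synchronous GF call) purely so it runs without third-party packages. `fake_gf_search`, `fake_matrix_search`, `enrich` and the `paint` callback are hypothetical stand-ins.

```python
import asyncio
import time


def fake_gf_search(query):
    """Stand-in for the synchronous GF call (runs in a worker thread)."""
    time.sleep(0.01)
    return [("GF", query)]


async def fake_matrix_search(query):
    """Stand-in for the slower async Matrix request."""
    await asyncio.sleep(0.05)
    return [("Matrix", query)]


async def enrich(query, paint, gf_search=fake_gf_search, matrix_search=fake_matrix_search):
    async def run_gf():
        try:
            rows = await asyncio.to_thread(gf_search, query)
        except Exception:
            return None           # GF failed: Matrix still renders
        paint("gf", rows)         # fast first paint
        return rows

    async def run_matrix():
        try:
            return await matrix_search(query)
        except Exception:
            return None           # Matrix failed: the GF table still stands

    gf_rows, matrix_rows = await asyncio.gather(run_gf(), run_matrix())
    if matrix_rows is not None:
        paint("merged", (gf_rows or []) + matrix_rows)  # repaint once Matrix lands
    return gf_rows, matrix_rows


events = []
asyncio.run(enrich("SFO-FRA", lambda tag, rows: events.append(tag)))
```

Each backend catches its own exceptions inside its task, so one failure never cancels the other.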
When a flight matches a marketing-carrier filter via codeshare, show the matched identity instead of the primary booking number: under `--routing LH+`, United 58 (sold as LH9403) now renders 'LH9403 (op UA58)' rather than 'UA 58' — you asked for Lufthansa, you see the Lufthansa flight, with the operating metal noted. Captures the full marketing flight numbers (fl[15]) on LegAmenities/LegInfo (previously only the carrier codes), threads the filtered carrier set (_match_carriers, marketing-include only — operating/exclude don't relabel) into _render_gflight_table, and _leg_display picks the matched codeshare identity. Loose marketing matching is unchanged (Matrix-consistent); only the label is clearer. Smoke-tested: LH+ -> 'LH9403 (op UA58)'.
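The labelling choice could look roughly like this. The signature is an assumption for illustration (the real `_leg_display` works on LegInfo), and carrier codes are assumed to be the first two characters of a flight number.

```python
def leg_display(primary, operating_number, marketing_numbers, match_carriers):
    """Prefer the codeshare identity that matched a marketing-include filter.

    primary: default label, e.g. "UA 58"
    operating_number: operating flight number, e.g. "UA58"
    marketing_numbers: full marketing flight numbers, e.g. ["LH9403"]
    match_carriers: carriers from marketing-include predicates, e.g. {"LH"}
    """
    for number in marketing_numbers:
        if number == operating_number:
            continue  # the operating flight itself needs no relabel
        if number[:2] in match_carriers:
            return f"{number} (op {operating_number})"
    return primary


print(leg_display("UA 58", "UA58", ["LH9403"], {"LH"}))  # LH9403 (op UA58)
print(leg_display("UA 58", "UA58", ["LH9403"], set()))   # UA 58
```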
…ve enrich (work-x8gl1) Documents the fl[15]/fl[18]/fl[22] booking-carrier rule, the Tier-1/2/3 model, the GF-serve gate + post-filter backstop, the concurrent enrich flow, and codeshare-aware display — load-bearing knowledge that otherwise lives only in commit messages.
… degrade to Matrix The GF per-IP throttle is dynamic (ceiling drifts run-to-run, measured 2026-06-14) with fast recovery, so a fixed rate cap is the wrong tool. Instead, react to the signal:
- _is_throttle_block() distinguishes a GENUINE throttle (HTTP 200 + code-13 ErrorResponse body) from a cold-session empty and from a transport error. This is the sensor that makes a closed loop possible.
- _one_call raises GfThrottledError on a genuine block (was conflated with the cold-session empty -> shown as 'no results').
- _one_call_with_retry now runs two policies: cold empty -> quick linear retries (returns [] if it never warms); throttle -> exponential JITTERED backoff, re-raised when exhausted. Jitter decorrelates concurrent one-shot processes, which share the per-IP signal but can't share a budget.
- Callers: the enrich flow catches GfThrottledError and degrades to Matrix-only (the woven design makes a throttle non-fatal); --fast/GF-only surfaces a clear 'rate-limited, wait or use --backend matrix' message.
No proactive rate limiter / GF fan-out: the date-grid (1 call/window) keeps us far from the limit, and per-call backoff self-heals the lone multi-cabin fan-out. Also records (docs/memories) the throttle characterization + the fli SearchDates >61-day chunking bug that drops filters (bd work-bcdex). +7 tests; make check green.
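The throttle policy can be sketched as a small retry helper. This is not the project's `retry_throttled`: the attempt count, base delay, cap and the "full jitter" scheme (a uniform draw between zero and the exponential delay) are assumptions, since the commit only says the backoff is exponential and jittered.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class GfThrottledError(Exception):
    """Raised by the call when GF returns a genuine throttle block."""


def retry_on_throttle(
    call: Callable[[], T],
    *,
    attempts: int = 4,
    base: float = 1.0,
    cap: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
    rng: Callable[[], float] = random.random,
) -> T:
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(attempts):
        try:
            return call()
        except GfThrottledError:
            if attempt == attempts - 1:
                raise  # exhausted: let the caller degrade to Matrix
            delay = min(cap, base * 2 ** attempt)
            # Jitter so concurrent one-shot processes do not retry in lockstep.
            sleep(delay * rng())
    raise AssertionError("unreachable")
```

Injecting `sleep` and `rng` keeps the helper testable without real waiting; a caller in the enrich flow would catch the re-raised `GfThrottledError` and continue with Matrix only.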
…k-x8gl1 follow-up) _gf_dategrid wraps Google's GetCalendarGraph (via fli's DateSearchFilters) to return cheapest-price-per-date for a whole window in ONE call, which is far faster than Matrix's calendar and dodges Matrix's compute-budget under-reporting. Reuses apply_gf_native_filters so Tier-1 filters (airlines/stops/layover/duration/cabin/times/price) are honored server-side; verified live 2026-06-14 (airlines=LH and nonstop change the grid).
- grid_can_serve(): one-way, single-airport, Tier-1-only. The grid returns date+price with NO itineraries, so even Tier-2 can't be post-filtered -> those calendars (and round-trip duration ranges, multi-airport) go to Matrix.
- We chunk windows to <=61 days OURSELVES with the full filter set, dodging the fli SearchDates >61-day filter-drop bug (bd work-bcdex).
- Throttle-hardened: extracted retry_throttled() (shared with the search path, PEP-695 generic) so the grid call gets the same code-13 detection + backoff.
Foundation only: the calendar command wiring (gate + fast paint + weave with Matrix) is the next commit. +7 tests; make check green (501).
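The chunking step is plain date arithmetic. A minimal sketch, assuming the 61-day limit counts both endpoints of a chunk; `chunk_window` is a hypothetical name.

```python
from datetime import date, timedelta


def chunk_window(start: date, end: date, max_days: int = 61):
    """Split [start, end] into consecutive chunks of at most max_days dates."""
    chunks = []
    cur = start
    while cur <= end:
        chunk_end = min(end, cur + timedelta(days=max_days - 1))
        chunks.append((cur, chunk_end))
        cur = chunk_end + timedelta(days=1)
    return chunks


# A 92-day window becomes one 61-day chunk and one 31-day chunk.
print(chunk_window(date(2026, 7, 1), date(2026, 9, 30)))
```

Each chunk would then be queried with the full filter set and the per-date prices combined; the chunks never overlap, so no date is priced twice.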
… Matrix weave) The calendar command now paints the GF native date-grid first (~1s, cheapest fare per departure day) for one-way / single-airport / Tier-1-only windows, then enriches with the authoritative Matrix calendar (full per-duration grid) — the same quick-then-complete shape as search. --fast / --no-enrich stops after the grid; --enrich (default) runs both. A cheap pre-check (one-way + single-airport, no JSON) gates the fli-heavy import, so round-trip / multi-airport / Tier-2-3 calendars take the Matrix-only path with no added cost. A throttled or empty grid degrades gracefully to Matrix (the grid is the optional fast layer). Adds _render_date_grid. Smoke-tested: SFO->FRA one-way --fast -> 16 priced days (cheapest 470 USD, matching the date-grid experiment); default -> grid then Matrix. make check green (501).
Makes Google Flights the fast layer and Matrix the authoritative-deep layer across search and calendar: GF answers in ~1s, Matrix enriches when it lands. 12 commits; `make check` green (501 tests); live-smoke-validated throughout.

GF-first backend (work-fjibi)
- `--routing`/`--extension` no longer force the ~45s Matrix path. Parsed + tier-classified; GF serves them (~1s) when it can honor every constraint.
- Leg airline is the marketing carrier (`fl[15]`, what you book) not the operating metal (`fl[22]`) for codeshare legs; ground-truthed via `fl[18]`.
- `routing_predicates` parser → Tier 1 (native GF) / Tier 2 (post-filter) / Tier 3 (fare-construction → Matrix).
- Tier-1 native filters (`apply_gf_native_filters`), Tier-2 post-filter, `_pick_backend` gate.

Progressive enrich: search (work-x8gl1)
GF + Matrix dispatched concurrently under one `anyio.run`; GF paints ~1s, repaints a reconciled GF+Matrix table (prices attributed) when Matrix lands. `--fast` for GF-only. Codeshare-aware labels (`LH9403 (op UA58)`).

Throttle hardening
The GF per-IP throttle is dynamic with fast recovery (measured), so react, don't cap: detect genuine `code-13` (vs transport/cold-session), exponential jittered backoff, degrade to Matrix on persistent throttle.

Date-grid calendar
`_gf_dategrid` wraps Google's `GetCalendarGraph`: a whole window's cheapest-per-day in ONE call, honoring Tier-1 filters, dodging Matrix's compute-budget under-reporting. Our own ≤61d chunking sidesteps the fli filter-drop bug. The `calendar` command paints the grid fast, then enriches with Matrix; `--fast` stops at the grid.

Docs
`docs/memories/gf_routing_and_carriers.md`: carrier semantics, the tier model, the throttle characterization, and the fli bug.

Follow-ups (tracked in bd)
- Move the time-based Tier-2 predicates (`MINCONNECT`/`-REDEYES`/`-OVERNIGHTS`) off Matrix.