
flight-cli: GF-fast / Matrix-deep — routing backend, progressive enrich, throttle hardening, date-grid calendar#29

Merged
ak2k merged 12 commits into main from worktree-work-fjibi on Jun 15, 2026

@ak2k ak2k commented Jun 14, 2026

Makes Google Flights the fast layer and Matrix the authoritative-deep layer across search and calendar: GF answers in ~1s, Matrix enriches when it lands. 12 commits; make check green (501 tests); live-smoke-validated throughout.

GF-first backend (work-fjibi)

--routing/--extension no longer force the ~45s Matrix path. Parsed + tier-classified; GF serves them (~1s) when it can honor every constraint.

  • Carrier fix: surface the marketing carrier (fl[15], what you book) not the operating metal (fl[22]) for codeshare legs; ground-truthed via fl[18].
  • routing_predicates parser → Tier 1 (native GF) / Tier 2 (post-filter) / Tier 3 (fare-construction → Matrix).
  • Native fli mapping (apply_gf_native_filters), Tier-2 post-filter, _pick_backend gate.

Progressive enrich — search (work-x8gl1)

GF + Matrix dispatched concurrently under one anyio.run; GF paints ~1s, repaints a reconciled GF+Matrix table (prices attributed) when Matrix lands. --fast for GF-only. Codeshare-aware labels (LH9403 (op UA58)).

Throttle hardening

The GF per-IP throttle is dynamic with fast recovery (measured), so react, don't cap: detect genuine code-13 (vs transport/cold-session), exponential jittered backoff, degrade to Matrix on persistent throttle.

Date-grid calendar

_gf_dategrid wraps Google's GetCalendarGraph — a whole window's cheapest-per-day in ONE call, honoring Tier-1 filters, dodging Matrix's compute-budget under-reporting. Our own ≤61d chunking sidesteps the fli filter-drop bug. The calendar command paints the grid fast, then enriches with Matrix; --fast stops at the grid.

Docs

docs/memories/gf_routing_and_carriers.md — carrier semantics, the tier model, the throttle characterization, and the fli bug.

Follow-ups (tracked in bd)

  • Concurrent calendar weave (the calendar weave is sequential today).
  • Promote time-based Tier-2 (MINCONNECT/-REDEYES/-OVERNIGHTS) off Matrix.
  • GF for multi-cabin+routing; round-trip / multi-airport date-grid.

ak2k added 12 commits June 13, 2026 22:13
…e legs (work-fjibi.1)

_gflight_ids parsed fl[22] (the operating carrier) as the leg airline, so a
codeshare like 'Lufthansa operated by Air Dolomiti' surfaced as Air Dolomiti
EN8858 instead of Lufthansa LH9498 (what you book).

A leg tuple carries the operating carrier at fl[22] and the marketing/selling
carriers at fl[15]; fl[18] is truthy when the operating carrier markets under
its own code. The booking carrier is the marketing carrier (fl[15][0]) on
operated-for regional legs (fl[15] present, fl[18] falsy), else the operating
carrier. Matrix surfaces this same marketing identity, so aligning also lets
the cross-backend flight#+date join fire on codeshares.

Ground-truthed against Google Flights' own headline labels: OS36 (fl[18]=[true]
-> Austrian), Air Dolomiti EN8858 (fl[18]=null -> Lufthansa), SWISS LX39 (no
fl[15] -> SWISS). Operating carrier + marketing-carrier set are now captured on
LegAmenities for the 'operated by' label and the upcoming O: routing filter.
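
The booking-carrier rule above can be sketched as a small function. This is only an illustration: it assumes a leg is a flat sequence with plain carrier-code strings at fl[15] and fl[22], whereas the real payload entries may be nested differently. `booking_carrier` and `make_leg` are hypothetical names, and "ZZ" is a placeholder marketing code.

```python
from typing import Any, Optional, Sequence


def booking_carrier(fl: Sequence[Any]) -> Optional[str]:
    """Return the carrier code a traveller books for one leg tuple."""
    operating = fl[22] if len(fl) > 22 else None
    marketing = fl[15] if len(fl) > 15 else None
    self_marketed = fl[18] if len(fl) > 18 else None
    # Operated-for leg: marketing carriers are listed and the operator does
    # not sell the flight under its own code, so the first marketing carrier wins.
    if marketing and not self_marketed:
        return marketing[0]
    # Otherwise the operating carrier is also the booking carrier.
    return operating


def make_leg(operating, marketing=None, self_marketed=None):
    """Build a padded 23-slot leg for the examples (assumed, simplified shape)."""
    fl = [None] * 23
    fl[15], fl[18], fl[22] = marketing, self_marketed, operating
    return fl


# The three ground-truth cases from the commit message:
print(booking_carrier(make_leg("OS", ["ZZ"], [True])))  # operator self-markets
print(booking_carrier(make_leg("EN", ["LH"], None)))    # operated-for leg
print(booking_carrier(make_leg("LX")))                  # no fl[15] at all
```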
…es (work-fjibi.2)

New routing_predicates module: parses Matrix routing language + extension codes
into a flat predicate set, each classified by how Google Flights can honor it —
Tier 1 (native GF filter), Tier 2 (post-filter on the base payload), or Tier 3
(fare-construction / unexpressible -> Matrix only).

Routing language is positional, so it's parsed all-or-nothing per string: only
unambiguous order-independent forms map (single carrier-with-quantifier LH+/
~UA+/O:LH+, nonstop N/N:UA, single flight #, and the flanked F* X:LHR F* via-
airport idiom). Ordered chains (BA AA, DFW DEN), bare single-segment carriers,
country filters, count placeholders, and unknown tokens escalate the whole
routing to Matrix — never partially honored. Extension codes are order-
independent and classified per directive.

ClassifiedConstraints.requires_matrix is the gate: true iff any predicate is
Tier 3, i.e. the query can't be served from Google Flights alone.
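
A minimal sketch of the all-or-nothing classification and the requires_matrix gate, under stated assumptions: the `Tier`, `Predicate`, and `classify_routing` names and the regexes are illustrative rather than the module's actual API, only a few of the unambiguous forms are covered, and the tier assigned to exclude follows the later reclassification to Tier 2.

```python
import re
from dataclasses import dataclass
from enum import IntEnum


class Tier(IntEnum):
    NATIVE = 1       # Google Flights can request it natively
    POST_FILTER = 2  # filter the itineraries GF returns
    MATRIX_ONLY = 3  # fare construction or otherwise unexpressible


@dataclass(frozen=True)
class Predicate:
    kind: str
    value: str
    tier: Tier


@dataclass(frozen=True)
class ClassifiedConstraints:
    predicates: tuple

    @property
    def requires_matrix(self) -> bool:
        # True iff any predicate is Tier 3.
        return any(p.tier == Tier.MATRIX_ONLY for p in self.predicates)


# Each pattern must match the WHOLE routing string (all-or-nothing).
_FORMS = [
    (re.compile(r"^([A-Z0-9]{2})\+$"), "marketing_include", Tier.NATIVE),
    (re.compile(r"^~([A-Z0-9]{2})\+$"), "marketing_exclude", Tier.POST_FILTER),
    (re.compile(r"^O:([A-Z0-9]{2})\+$"), "operating_include", Tier.POST_FILTER),
    (re.compile(r"^N$"), "nonstop", Tier.NATIVE),
]


def classify_routing(routing: str) -> ClassifiedConstraints:
    text = routing.strip()
    for pattern, kind, tier in _FORMS:
        match = pattern.match(text)
        if match:
            value = match.group(1) if match.groups() else ""
            return ClassifiedConstraints((Predicate(kind, value, tier),))
    # Anything unrecognised escalates the whole routing, never part of it.
    return ClassifiedConstraints((Predicate("unparsed", text, Tier.MATRIX_ONLY),))
```

With this sketch, `classify_routing("LH+").requires_matrix` is False, while an ordered chain such as `"BA AA"` falls through to the Tier 3 branch.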
…ork-fjibi.3)

apply_gf_native_filters() applies Tier-1 (GF-native) predicates onto an fli
FlightSearchFilters in place: marketing-carrier include -> airlines; alliance ->
airlines (ONEWORLD/SKYTEAM/STAR_ALLIANCE enum tokens); connect-at airport ->
layover_restrictions.airports; MAXCONNECT -> layover max_duration; MAXDUR ->
max_duration; nonstop / MAXSTOPS -> stops. Returns False if a carrier/airport
code fli can't map, so the caller escalates to Matrix rather than dropping it.

Also reclassifies carrier/airport *exclude* as Tier-2 (post-filter): GF's
airline and connecting-airport controls are allow-lists, so excluding would need
the route's carrier/airport set to complement (a probe) — a result post-filter
is simpler and equally correct. Include stays Tier-1 (native).
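
The in-place mapping with a False return can be illustrated roughly as follows. `SketchFilters` is a stand-in dataclass, not fli's FlightSearchFilters, and the field names, the `KNOWN_*` sets, and the predicate kinds are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical stand-ins for the codes the underlying library can express.
KNOWN_AIRLINES = {"LH", "UA", "OS"}
KNOWN_AIRPORTS = {"FRA", "MUC", "DFW"}


@dataclass
class SketchFilters:
    airlines: list = field(default_factory=list)
    layover_airports: list = field(default_factory=list)
    max_layover_minutes: Optional[int] = None
    max_duration_minutes: Optional[int] = None
    max_stops: Optional[int] = None


def apply_native(filters: SketchFilters, predicates) -> bool:
    """Apply (kind, value) predicates in place; False means 'escalate'."""
    for kind, value in predicates:
        if kind == "marketing_include":
            if value not in KNOWN_AIRLINES:
                return False  # unmappable carrier: caller escalates
            filters.airlines.append(value)
        elif kind == "connect_at":
            if value not in KNOWN_AIRPORTS:
                return False  # unmappable airport: caller escalates
            filters.layover_airports.append(value)
        elif kind == "maxconnect":
            filters.max_layover_minutes = int(value)
        elif kind == "maxdur":
            filters.max_duration_minutes = int(value)
        elif kind == "nonstop":
            filters.max_stops = 0
        elif kind == "maxstops":
            filters.max_stops = int(value)
        else:
            return False  # not a native predicate in this sketch
    return True
```

Note that the sketch may leave the filters partially mutated when it returns False; that is harmless only because the caller discards them and escalates.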
…i.4)

_gf_postfilter evaluates the Tier-2 predicates GF can't request natively against
the returned itineraries: operating-carrier include/exclude (O:/OPAIRLINES),
marketing-carrier exclude (~UA/-AIRLINES), connection-airport exclude (~DFW/
-CITIES), -CODESHARE, and specific flight #/range. Predicates apply per slice
(outbound routing doesn't filter the return). Threads the operating carrier +
marketing-carrier set captured in work-fjibi.1 through LegInfo and the gflight
adapter so the filter can see them.

gf_can_serve() is the gate's capability check: GF alone serves a query only when
it has no Tier-3 predicate AND every Tier-2 predicate is post-filterable here.
Time-based Tier-2 predicates (min layover, red-eyes, overnight stops) aren't yet
evaluable (no per-segment times threaded) so they escalate to Matrix rather than
being silently dropped.
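
A rough sketch of the per-slice post-filter and the capability check. The `Leg` shape, the predicate kinds, and the exact matching semantics (for example, that an operating include requires every leg to be operated by that carrier) are assumptions for illustration, not the adapter's real types.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Leg:
    origin: str
    destination: str
    operating: str        # operating carrier code
    marketing: frozenset  # marketing carrier codes


# Tier-2 kinds this sketch knows how to evaluate on returned itineraries.
POST_FILTERABLE = {
    "operating_include", "operating_exclude",
    "marketing_exclude", "connect_exclude",
}


def slice_passes(legs, predicates) -> bool:
    """Check one slice (list of legs) against (kind, value, tier) predicates."""
    connections = {leg.destination for leg in legs[:-1]}
    for kind, value, _tier in predicates:
        if kind == "operating_include" and any(l.operating != value for l in legs):
            return False
        if kind == "operating_exclude" and any(l.operating == value for l in legs):
            return False
        if kind == "marketing_exclude" and any(value in l.marketing for l in legs):
            return False
        if kind == "connect_exclude" and value in connections:
            return False
    return True


def postfilter(slices, predicates):
    return [legs for legs in slices if slice_passes(legs, predicates)]


def gf_can_serve(predicates) -> bool:
    """GF alone serves the query only if nothing needs Matrix."""
    for kind, _value, tier in predicates:
        if tier == 3:
            return False
        if tier == 2 and kind not in POST_FILTERABLE:
            return False  # e.g. time-based predicates: escalate, don't drop
    return True
```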
…ot Matrix (work-fjibi.5)

Reworks _pick_backend: --routing/--extension no longer force the ~45s Matrix
path on their own. They're classified, and Google Flights serves them (~1s)
whenever it can honor every constraint — native filters (Tier 1) plus the result
post-filter (Tier 2). Only fare-construction (fare basis / booking class) or a
constraint GF can't reconstruct, plus the hard-Matrix flags (--slice,
--depart-times/--return-times, extra pax types), still go to Matrix.

_run_gflight_path now applies the Tier-1 native filters to the fli query and the
Tier-2 post-filter to the results (per slice). The native mapper became pure
optimization: if an fli carrier/airport code doesn't map it skips that query
dimension (no under-return) and the post-filter — a string-based backstop that
also enforces marketing-include and connect-at — guarantees correctness.
Multi-cabin + routing stays on Matrix for now (the multi-cabin gflight path
doesn't apply the filters yet).

Smoke-tested live: 'LH+' -> 1s, LH-marketed incl UA58 (LH codeshare, Matrix-
consistent); 'O:LH+' -> 1s, drops UA58 (operated by UA); 'F bc=y' -> Matrix.

merge_results() matches a fast Google Flights result against the authoritative
Matrix result by flight number + departure date per slice (aligned now that the
gflight adapter emits marketing numbers, work-fjibi.1). Each merged row carries
both prices attributed (they should agree; show both) and a source tag: matched
(both), Matrix-only (added — broader fare coverage), or GF-only (kept + flagged,
ULCC/codeshare). Matrix's itinerary structure is authoritative on a match. Rows
are price-sorted. Pure function; backend orchestration wires it next.
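
The join described above might look roughly like this. The `Row` and `Merged` types, the source-tag strings, and sorting by the lowest available price are assumptions; the source only says rows are price-sorted. The key is taken to be a tuple of flight numbers plus a departure date, and the date in the usage note is illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Row:
    key: tuple    # ((flight numbers...), departure date) for the slice
    price: float


@dataclass(frozen=True)
class Merged:
    key: tuple
    gf_price: Optional[float]
    matrix_price: Optional[float]
    source: str   # "both", "matrix-only", or "gf-only"


def merge_results(gf_rows, matrix_rows):
    """Pure function: join GF and Matrix rows on key, keep both prices."""
    gf_by_key = {row.key: row for row in gf_rows}
    seen = set()
    merged = []
    for m in matrix_rows:
        g = gf_by_key.get(m.key)
        seen.add(m.key)
        merged.append(
            Merged(m.key, g.price if g else None, m.price,
                   "both" if g else "matrix-only")
        )
    for g in gf_rows:
        if g.key not in seen:
            merged.append(Merged(g.key, g.price, None, "gf-only"))

    def lowest(row):
        return min(p for p in (row.gf_price, row.matrix_price) if p is not None)

    merged.sort(key=lowest)
    return merged
```

For example, a GF row and a Matrix row that share the key `(("LH9403",), "2026-07-01")` collapse into a single "both" row carrying the two prices side by side.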
…k-x8gl1b)

For a GF-serveable query, dispatch Google Flights and Matrix concurrently under
one anyio.run: GF runs in a worker thread (it's sync curl_cffi) while the Matrix
request progresses on the event loop. GF paints immediately (~1s) with a
'refining with Matrix' note; once Matrix lands (~45s) the reconciled GF+Matrix
table repaints with both prices attributed (they can differ a lot — Matrix
surfaces cheaper published fares; showing both is the point). PP/awards + URLs
run on the Matrix (authoritative) result.

--fast / --no-enrich takes the GF-only path (~1s); JSON output also stays GF-only
(single stable shape). Per-backend error isolation: if Matrix fails the GF table
still stands; if GF fails Matrix still renders. Extracts the GF query into
_gflight_results (shared by both paths) and adds _render_merged.

Smoke-tested: SFO-FRA --routing LH+ -> GF table ~1s, merged GF+Matrix ~55s
(Matrix $622 vs Google $1120 attributed); --fast -> GF only.
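
The paint-then-repaint flow with per-backend error isolation can be sketched as below. The PR runs both backends under anyio; this sketch substitutes stdlib asyncio (`asyncio.to_thread`) so it stays self-contained, and the fetchers, the `paint` callback, and the event names are hypothetical stand-ins.

```python
import asyncio
import time


def fetch_gf():
    """Stand-in for the synchronous GF call (runs in a worker thread)."""
    time.sleep(0.01)
    return ["gf-row"]


async def fetch_matrix():
    """Stand-in for the slower Matrix request on the event loop."""
    await asyncio.sleep(0.05)
    return ["matrix-row"]


async def progressive_search(paint, gf_fetch=fetch_gf, matrix_fetch=fetch_matrix):
    # Start both backends before awaiting either, so they overlap.
    gf_task = asyncio.create_task(asyncio.to_thread(gf_fetch))
    matrix_task = asyncio.create_task(matrix_fetch())

    gf_rows = None
    try:
        gf_rows = await gf_task
        paint("gf", gf_rows)            # fast first paint
    except Exception as exc:            # GF failure must not sink Matrix
        paint("gf-error", str(exc))

    try:
        matrix_rows = await matrix_task
    except Exception as exc:            # Matrix failure leaves the GF table
        paint("matrix-error", str(exc))
        return
    paint("merged", (gf_rows or []) + matrix_rows)  # repaint when Matrix lands


if __name__ == "__main__":
    asyncio.run(progressive_search(lambda kind, payload: print(kind, payload)))
```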

When a flight matches a marketing-carrier filter via codeshare, show the matched
identity instead of the primary booking number: under `--routing LH+`, United 58
(sold as LH9403) now renders 'LH9403 (op UA58)' rather than 'UA 58' — you asked
for Lufthansa, you see the Lufthansa flight, with the operating metal noted.

Captures the full marketing flight numbers (fl[15]) on LegAmenities/LegInfo
(previously only the carrier codes), threads the filtered carrier set
(_match_carriers, marketing-include only — operating/exclude don't relabel) into
_render_gflight_table, and _leg_display picks the matched codeshare identity.
Loose marketing matching is unchanged (Matrix-consistent); only the label is
clearer. Smoke-tested: LH+ -> 'LH9403 (op UA58)'.
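
A minimal sketch of the label choice, assuming flight numbers are strings with a two-character carrier prefix. `leg_display` and its parameters are illustrative and do not reproduce the real _leg_display signature.

```python
def leg_display(primary, operating_number, marketing_numbers, match_carriers):
    """Pick the label for one leg.

    primary           -- default label, e.g. 'UA 58'
    operating_number  -- operating flight number, e.g. 'UA58'
    marketing_numbers -- full marketing flight numbers, e.g. ['UA58', 'LH9403']
    match_carriers    -- carriers from a marketing-include filter, e.g. {'LH'}
    """
    for number in marketing_numbers:
        carrier = number[:2]  # assumed two-character carrier prefix
        if carrier in match_carriers and number != operating_number:
            # Show the identity the user asked for, noting the operating metal.
            return f"{number} (op {operating_number})"
    return primary


print(leg_display("UA 58", "UA58", ["UA58", "LH9403"], {"LH"}))
print(leg_display("UA 58", "UA58", ["UA58", "LH9403"], set()))
```

With an empty filter set the default label is kept, which matches the note that operating and exclude filters do not relabel.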
…ve enrich (work-x8gl1)

Documents the fl[15]/fl[18]/fl[22] booking-carrier rule, the Tier-1/2/3 model,
the GF-serve gate + post-filter backstop, the concurrent enrich flow, and
codeshare-aware display — load-bearing knowledge that otherwise lives only in
commit messages.
… degrade to Matrix

The GF per-IP throttle is dynamic (ceiling drifts run-to-run, measured 2026-06-14)
with fast recovery, so a fixed rate cap is the wrong tool. Instead, react to the
signal:

- _is_throttle_block() distinguishes a GENUINE throttle (HTTP 200 + code-13
  ErrorResponse body) from a cold-session empty and from a transport error —
  the sensor that makes a closed loop possible.
- _one_call raises GfThrottledError on a genuine block (was conflated with the
  cold-session empty -> shown as 'no results').
- _one_call_with_retry now runs two policies: cold empty -> quick linear retries
  (returns [] if it never warms); throttle -> exponential JITTERED backoff,
  re-raised when exhausted. Jitter decorrelates concurrent one-shot processes,
  which share the per-IP signal but can't share a budget.
- Callers: the enrich flow catches GfThrottledError and degrades to Matrix-only
  (the woven design makes a throttle non-fatal); --fast/GF-only surfaces a clear
  'rate-limited, wait or use --backend matrix' message.
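
The two retry policies can be sketched as follows. The retry counts, delays, cap, and the "full jitter" scheme are assumptions (the commit does not state them); `GfThrottledError` is redefined locally as a stand-in, and `sleep` and `rng` are injectable only so the sketch can be exercised without waiting.

```python
import random
import time


class GfThrottledError(Exception):
    """Stand-in for a genuine throttle block (HTTP 200 + code-13 body)."""


def call_with_retry(call, *, cold_retries=2, cold_delay=0.5,
                    throttle_attempts=4, base=1.0, cap=30.0,
                    sleep=time.sleep, rng=random.random):
    throttles = 0
    colds = 0
    while True:
        try:
            rows = call()
        except GfThrottledError:
            throttles += 1
            if throttles >= throttle_attempts:
                raise  # exhausted: let the caller degrade to Matrix
            # Exponential ceiling, capped, scaled by a random factor in [0, 1)
            # so concurrent one-shot processes do not retry in lockstep.
            ceiling = min(cap, base * 2 ** (throttles - 1))
            sleep(ceiling * rng())
            continue
        if rows or colds >= cold_retries:
            return rows  # results, or [] if the session never warmed
        colds += 1
        sleep(cold_delay * colds)  # quick linear retry for a cold empty
```

A caller in the enrich flow would catch `GfThrottledError` and continue with Matrix only, while a GF-only path would turn it into the rate-limited message.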

No proactive rate limiter / GF fan-out: the date-grid (1 call/window) keeps us
far from the limit, and per-call backoff self-heals the lone multi-cabin fan-out.

Also records (docs/memories) the throttle characterization + the fli SearchDates
>61-day chunking bug that drops filters (bd work-bcdex). +7 tests; make check green.
…k-x8gl1 follow-up)

_gf_dategrid wraps Google's GetCalendarGraph (via fli's DateSearchFilters) to
return cheapest-price-per-date for a whole window in ONE call — far faster than
Matrix's calendar, and it dodges Matrix's compute-budget under-reporting. Reuses
apply_gf_native_filters so Tier-1 filters (airlines/stops/layover/duration/cabin/
times/price) are honored server-side; verified live 2026-06-14 (airlines=LH and
nonstop change the grid).

- grid_can_serve(): one-way, single-airport, Tier-1-only. The grid returns
  date+price with NO itineraries, so even Tier-2 can't be post-filtered -> those
  calendars (and round-trip duration ranges, multi-airport) go to Matrix.
- We chunk windows to <=61 days OURSELVES with the full filter set, dodging the
  fli SearchDates >61-day filter-drop bug (bd work-bcdex).
- Throttle-hardened: extracted retry_throttled() (shared with the search path,
  PEP-695 generic) so the grid call gets the same code-13 detection + backoff.

Foundation only — the calendar command wiring (gate + fast paint + weave with
Matrix) is the next commit. +7 tests; make check green (501).
… Matrix weave)

The calendar command now paints the GF native date-grid first (~1s, cheapest fare
per departure day) for one-way / single-airport / Tier-1-only windows, then
enriches with the authoritative Matrix calendar (full per-duration grid) — the
same quick-then-complete shape as search. --fast / --no-enrich stops after the
grid; --enrich (default) runs both.

A cheap pre-check (one-way + single-airport, no JSON) gates the fli-heavy import,
so round-trip / multi-airport / Tier-2-3 calendars take the Matrix-only path with
no added cost. A throttled or empty grid degrades gracefully to Matrix (the grid
is the optional fast layer). Adds _render_date_grid.

Smoke-tested: SFO->FRA one-way --fast -> 16 priced days (cheapest 470 USD,
matching the date-grid experiment); default -> grid then Matrix. make check green
(501).
@ak2k ak2k changed the title flight-cli: GF-first routing backend + progressive GF↔Matrix enrich flight-cli: GF-fast / Matrix-deep — routing backend, progressive enrich, throttle hardening, date-grid calendar Jun 15, 2026
@ak2k ak2k merged commit eaecf71 into main Jun 15, 2026
2 checks passed
@ak2k ak2k deleted the worktree-work-fjibi branch June 15, 2026 00:13