Bump stanza from 1.10.1 to 1.12.2 #4887

Open: dependabot[bot] wants to merge 1 commit into main from dependabot/uv/stanza-1.12.2

dependabot[bot] commented on behalf of github on Jun 20, 2026

Bumps stanza from 1.10.1 to 1.12.2.

Release notes

Sourced from stanza's releases.

v1.12.2 - weights_only security fix

This is the same as v1.12.1, with the following important update: all models now enforce weights_only=True when loading. This may make some old models unloadable (all models distributed with Stanza are already patched). If that happens, please load and then resave your model with an older version of Stanza, such as v1.12.1.

Security / PyTorch compatibility

  • Enforce weights_only=True when loading the lemma classifier, addressing part of the security advisory GHSA-v5jw-96jm-7h2c. This should already be the default in later versions of PyTorch, but is now explicitly enforced. #1584

  • All models now have the code path which allows for weights_only=False removed. Instead, attempting to load a legacy model will throw an exception. If that happens, please load your model with an older version (such as 1.12.1) and resave it before proceeding. #1587
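For reviewers weighing the risk behind this change: PyTorch checkpoints are pickle-based, and unrestricted unpickling can execute arbitrary code. The sketch below is not Stanza's or PyTorch's implementation. It uses only the standard-library `pickle` module to illustrate the general allow-list idea that `weights_only=True` relies on. The names `SAFE_GLOBALS`, `RestrictedUnpickler`, and `restricted_loads` are hypothetical.

```python
import io
import pickle

# Hypothetical allow-list for illustration only; PyTorch maintains its own.
SAFE_GLOBALS = {("collections", "OrderedDict")}


class RestrictedUnpickler(pickle.Unpickler):
    """Unpickler that refuses any global not explicitly allow-listed."""

    def find_class(self, module, name):
        if (module, name) in SAFE_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Deserialize bytes, rejecting pickles that reference other globals."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# Plain containers of primitives need no globals, so they load normally.
weights = restricted_loads(pickle.dumps({"w": [0.1, 0.2]}))

# A hand-written pickle that tries to reach os.system is rejected
# before anything is executed.
malicious = b"cos\nsystem\n(S'echo hello world'\ntR."
try:
    restricted_loads(malicious)
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

In application code the corresponding guard is the `weights_only=True` argument to `torch.load`. A checkpoint that stores arbitrary Python objects fails to load under that setting, which is why the notes above suggest resaving legacy models with an older release.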

Tokenizer improvements

  • Add control characters to the set of characters treated as whitespace when tokenizing, fixing a bug where certain Unicode control characters (such as "region end" markers) were incorrectly attached to words. #1573 Addresses #1257

  • Add tokenizer augmentation that occasionally replaces commas with en-dashes or em-dashes, so that models trained on datasets that lack those characters learn to treat them similarly to commas. #1573

  • Add regression tests for Spanish tokenization errors reported in #1257 and tests for the whitespace/control-character handling and tokenizer augmentations. #1573
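The two tokenizer changes above can be pictured with a small standalone sketch. This is not Stanza's code: the choice of Unicode categories `Cc` and `Cf` as "control characters", the replacement probability, and the function names `is_space_like`, `split_on_space_like`, and `augment_commas` are all assumptions made for illustration.

```python
import random
import unicodedata


def is_space_like(ch: str) -> bool:
    """True for whitespace and (by assumption) Unicode categories Cc and Cf."""
    return ch.isspace() or unicodedata.category(ch) in ("Cc", "Cf")


def split_on_space_like(text: str) -> list:
    """Split text into words, treating control characters as separators."""
    words, current = [], []
    for ch in text:
        if is_space_like(ch):
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return words


def augment_commas(text: str, rng: random.Random, prob: float = 0.1) -> str:
    """Replace each comma with an en dash or em dash with probability prob."""
    dashes = ["\u2013", "\u2014"]  # en dash, em dash
    out = []
    for ch in text:
        if ch == "," and rng.random() < prob:
            out.append(rng.choice(dashes))
        else:
            out.append(ch)
    return "".join(out)


# U+0097 is a C1 control character; here it separates words
# instead of sticking to "hola".
words = split_on_space_like("hola\u0097 mundo")
```

A real augmentation would presumably also adjust the spacing around the dash and keep token boundaries aligned with the gold annotation; the sketch only shows the character substitution.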

Lemmatizer improvements

  • Enforce weights_only=True when loading the lemma classifier, avoiding a possible security risk. #1584

  • The lemma classifier for ja_gsd is now also attached to ja_combined. #1584

  • Train and attach two lemma classifiers to en_combined — both 's and her can be reliably classified from the available data. #1584

  • Add end-to-end unit tests for run_lemma.py, including training a lemmatizer and attaching multiple lemma classifiers. #1586

Spanish model improvements

  • Add a silver dataset covering como_VERB in Spanish to the combined Spanish training data, addressing #1440. Also adds a utility to print a confusion matrix of tagging results filtered by a word regex (e.g. --upos_word_regex "^(?i:como)$"), making it easier to isolate the effects of annotation changes. #1579 stanfordnlp/handparsed-treebank@d0c29a3

  • Add silver training sentences covering unknown Spanish VERB lemmas to the combined Spanish lemmatizer, addressing #1255. Also includes a script to check lemmatizer results for a batch of word/POS combinations. #1580 stanfordnlp/handparsed-treebank@11327ef
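As a rough picture of what filtering a confusion matrix by a word regex means, here is a minimal sketch. It is not the utility added in #1579: the function name `confusion_for_words` and the list-of-pairs input format are hypothetical. Only the regex `^(?i:como)$` comes from the notes above; Python's `re` module accepts that scoped case-insensitive syntax.

```python
import re
from collections import Counter


def confusion_for_words(gold, pred, word_regex):
    """Count (gold_tag, predicted_tag) pairs for words matching word_regex.

    gold and pred are parallel lists of (word, tag) pairs.
    """
    pattern = re.compile(word_regex)
    counts = Counter()
    for (word, gold_tag), (_, pred_tag) in zip(gold, pred):
        if pattern.search(word):
            counts[(gold_tag, pred_tag)] += 1
    return counts


# Toy data: only the two "como" tokens match, regardless of case.
gold = [("como", "VERB"), ("Como", "SCONJ"), ("comer", "VERB")]
pred = [("como", "SCONJ"), ("Como", "SCONJ"), ("comer", "VERB")]
matrix = confusion_for_words(gold, pred, "^(?i:como)$")
```

Restricting the counts to a single word form makes it easier to see whether a change in annotation moved errors for that word specifically, without the signal being drowned out by the rest of the vocabulary.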

Italian model improvements

  • Rebuild Italian models with additional training data to fix incorrect lemmatization of common words including "violino" (was incorrectly mapped to "violare") #1563 stanfordnlp/handparsed-treebank@9c46db1, and "diversi" (was incorrectly split and mapped to "dire") — resolved by retraining with the more accurate models #1564

English model improvements

  • The long-standing issue of "can" being tagged as a modal verb (MD) rather than a noun (NN) in noun phrases like "trash can" and "soda can" is now resolved with the combined English models. #408

New / updated models

  • Odia (Oriya) now uses the ODTB package as the default. Mixed POS and depparse training data is constructed from the Odia dataset combined with related Indic languages present in MuRIL-Large, following the approach used for Sindhi. The Odia NER model is now also connected to the default package. #1583

Demo and visualization improvements

  • Rewrite stanza-parseviewer.js to use a proper constituency parse visualizer instead of a repurposed dependency parse visualizer, fixing the broken vertical striping. Also adds a table of morphological features to the visualization. #1581 Addresses #1358

  • Various small improvements to the web demo: route all responses to /; templatize stanza-brat.html so the version number is sourced from _version.py; move the logo to the demo directory for easier serving; add favicon support to the pipeline demo; guard against empty POST requests. #1582

... (truncated)

Commits
  • 8e745c8 Turns out the correct lemma test was actually kind of useless, since the tag ...
  • afe1c4f this should also always be right with the lemma_classifier - there's is extre...
  • 73797ab Minor tweak to the lemma classifier pipeline test - verify that it is getting...
  • 966e591 Push the download of the lemma_classifier from the unit test into the setup s...
  • 1e01937 Fix a doc bug - the PretrainedWordVocab is the part that normalizes spaces
  • d345df1 Document the location of languages.html
  • 890095d Add a bunch more languages from https://universaldependencies.org/languages.h...
  • 5d8a514 Update the short_name_to_treebank module (using build...)
  • ced3db9 Nenets uses yrk, not nrk, in the UD datasets
  • 1685f02 Add several langcodes from UD 2.18 that aren't already known by Stanza
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
    You can disable automated security fix PRs for this repo from the Security Alerts page.

Bumps [stanza](https://github.com/stanfordnlp/stanza) from 1.10.1 to 1.12.2.
- [Release notes](https://github.com/stanfordnlp/stanza/releases)
- [Commits](stanfordnlp/stanza@v1.10.1...v1.12.2)

---
updated-dependencies:
- dependency-name: stanza
  dependency-version: 1.12.2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
dependabot[bot] added the dependencies and python:uv labels on Jun 20, 2026
Labels

  • dependencies: Pull requests that update a dependency file
  • python:uv: Pull requests that update python:uv code