bug: FillNA does not work on integer columns with duckdb #180

@vspinu

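Example: the recipe below applies FillNA with -1 to integer columns, yet the output still contains NaN in cat1 and cat2.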

import pandas as pd
import ibis
import ibis_ml as ml
con = ibis.duckdb.connect()
df = pd.DataFrame({
    'cat1': ['AA', 'BBB', 'AA', 'BBB', 'CCC'],
    'cat2': ['X', 'Y', 'Y', 'X', 'Z'],
    'value': [10, 20, 30, 40, 50]
})
tbl = con.create_table("tmp", df, overwrite=True)

tr_oe = ml.Recipe(
    ml.OrdinalEncode(ml.string(), min_frequency=2),
    ml.FillNA(ml.integer(), -1),
).fit(tbl)

tr_oe.to_ibis(tbl).to_pandas()
#    value  cat1  cat2
# 0     10   0.0   0.0
# 1     20   1.0   1.0
# 2     30   0.0   1.0
# 3     40   1.0   0.0
# 4     50   NaN   NaN
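
A possible stopgap, not part of the original report and not a fix for the recipe itself: fill the missing codes in pandas after materializing the result. The sketch below uses a hand-built frame (`out`, a hypothetical stand-in for the output shown above) and assumes the encoded columns are `cat1` and `cat2`.

```python
import pandas as pd

# Hypothetical stand-in for the frame returned by to_pandas() above.
out = pd.DataFrame({
    "value": [10, 20, 30, 40, 50],
    "cat1": [0.0, 1.0, 0.0, 1.0, float("nan")],
    "cat2": [0.0, 1.0, 1.0, 0.0, float("nan")],
})

# Assumption: these are the columns produced by OrdinalEncode.
encoded = ["cat1", "cat2"]

# Replace NaN with -1, then cast back to a plain integer dtype.
out = out.assign(**{c: out[c].fillna(-1).astype("int64") for c in encoded})
print(out)
```

This only patches the materialized pandas frame. The ibis expression returned by `to_ibis` is unchanged, so the underlying behavior still needs to be addressed in the recipe.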


Status: backlog
