Skip to content

Sanitize autoentity-generated names to prevent invalid REST paths and GraphQL singular/plural tables for SQL objects with spaces#3609

Open
Copilot wants to merge 10 commits into
mainfrom
copilot/fix-invalid-rest-paths
Open

Sanitize autoentity-generated names to prevent invalid REST paths and GraphQL singular/plural tables for SQL objects with spaces#3609
Copilot wants to merge 10 commits into
mainfrom
copilot/fix-invalid-rest-paths

Conversation

Copilot AI commented May 19, 2026

Copy link
Copy Markdown
Contributor

Why make this change?

Tables with whitespace in names (for example, Order Items) were being auto-generated into entity names like dbo_Order Items, which then produced invalid REST paths as well as invalid GraphQL names for singular/plural tables which failed config validation. This change ensures autoentity name generation does not pass whitespace through to REST path and GraphQL singular/plural tables defaults.

  • Autoentity naming currently interpolates {schema} / {object} verbatim; whitespace in object names breaks dab validate.

What is this change?

  • Autoentity name normalization (MsSql metadata path)
    • Added RemoveWhitespaceAndCamelCase in SqlMetadataProvider and applied it to generated entity_name before entity creation. The method name is intentionally specific to clarify it only removes whitespace (not other invalid characters).
    • Normalization removes whitespace and uppercases the character immediately following each removed whitespace (camelCase-join), e.g. dbo_Order Itemsdbo_OrderItems.
    • Added a guard after normalization: if the result is empty (e.g. the raw entity name was entirely whitespace), an error is logged with the original name and schema for context, and the entry is skipped.
    • Collision detection is improved: when TryAdd fails because two database objects normalize to the same entity name (e.g. "Order Item" and "OrderItem" both yield "OrderItem"), the exception message includes the pre-normalization name and schema to make the collision diagnosable.
protected static string RemoveWhitespaceAndCamelCase(string name)
{
    StringBuilder result = new(name.Length);
    bool capitalizeNext = false;

    foreach (char character in name)
    {
        if (char.IsWhiteSpace(character))
        {
            capitalizeNext = true;
            continue;
        }

        result.Append(capitalizeNext ? char.ToUpperInvariant(character) : character);
        capitalizeNext = false;
    }

    return result.ToString();
}
  • Focused test coverage
    • Added a dedicated TestAutoentitiesGeneratedWithSpacesInObjectName test method that explicitly asserts the sanitized entity name dbo_OrderItems (via const string EXPECTED_ENTITY_NAME) and validates end-to-end REST and GraphQL access with that name.
    • Whitespace-handling validation is now separated from the non-default schema cases in TestAutoentitiesGeneratedWithUnusualElements for better readability and maintainability.

How was this tested?

  • Integration Tests
  • Unit Tests

Sample Request(s)

  • Example CLI usage to reproduce/validate behavior:
    • dab auto-config add mydef --patterns.include "dbo.Order Items" --patterns.name "{schema}_{object}"
    • dab validate
  • Expected generated entity naming behavior:
    • SQL object: dbo.[Order Items]
    • Generated entity/REST path segment: dbo_OrderItems (no whitespace)

Copilot AI changed the title [WIP] Fix invalid REST paths generated for tables with spaces Sanitize autoentity-generated names to prevent invalid REST paths for SQL objects with spaces May 19, 2026
Copilot AI requested a review from RubenCerna2079 May 19, 2026 23:17
@RubenCerna2079 RubenCerna2079 marked this pull request as ready for review May 22, 2026 22:34
Copilot AI review requested due to automatic review settings May 22, 2026 22:34

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR improves MSSQL autoentity generation by sanitizing auto-generated entity names so SQL objects containing whitespace don’t produce invalid REST path segments or invalid GraphQL names during default singular/plural generation and config validation.

Changes:

  • Add a whitespace-removal (camel-join) sanitizer for generated entity names and apply it during MSSQL autoentity generation.
  • Extend MSSQL test schema with a table that contains spaces in its name and seed data for it.
  • Update configuration tests to validate REST + GraphQL access for an autoentity generated from an object with spaces.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
src/Service.Tests/DatabaseSchema-MsSql.sql Adds an MSSQL table with whitespace in its name to exercise autoentity naming.
src/Service.Tests/Configuration/ConfigurationTests.cs Extends autoentity end-to-end test to include an object with spaces and validates REST/GraphQL access.
src/Core/Services/MetadataProviders/SqlMetadataProvider.cs Introduces SanitizeGeneratedEntityName helper for generated entity names.
src/Core/Services/MetadataProviders/MsSqlMetadataProvider.cs Applies sanitization to autoentity-generated entityName before entity creation.

Comment thread src/Service.Tests/DatabaseSchema-MsSql.sql
Comment thread src/Service.Tests/Configuration/ConfigurationTests.cs
Comment thread src/Service.Tests/Configuration/ConfigurationTests.cs
Comment thread src/Core/Services/MetadataProviders/MsSqlMetadataProvider.cs Outdated
@RubenCerna2079 RubenCerna2079 changed the title Sanitize autoentity-generated names to prevent invalid REST paths for SQL objects with spaces Sanitize autoentity-generated names to prevent invalid REST paths and GraphQL singular/plural tables for SQL objects with spaces May 22, 2026
Comment on lines +736 to +748
/// <summary>
/// Sanitizes the generated entity name by removing whitespace and capitalizing the next character after whitespace.
/// </summary>
/// <param name="name">The entity name to be sanitized.</param>
/// <returns>The sanitized entity name.</returns>
protected static string SanitizeGeneratedEntityName(string name)
{
StringBuilder sanitizedName = new(name.Length);
bool capitalizeNext = false;

foreach (char character in name)
{
if (char.IsWhiteSpace(character))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we sanitize other valid characters like []? for e.g.- [Customer-Orders] still produces dbo_Customer-Orders, which can be invalid for REST/GraphQL. if this is about only removing whitespaces, may be the function can be renamed with that.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a more specific name (e.g., RemoveWhitespaceAndCamelCase) could improve clarity and avoid confusion for future contributors.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to RemoveWhitespaceAndCamelCase in 29f0e79 → addressed in the latest commit. The updated XML doc also describes the camelCase-join behavior explicitly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done — SanitizeGeneratedEntityName has been renamed to RemoveWhitespaceAndCamelCase with an updated doc comment that describes the camelCase-join behavior.

}

// Sanitize the entity name by ensuring all whitespace characters are removed.
entityName = SanitizeGeneratedEntityName(entityName);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am looking at this section lately. consider we have 2 schema in database- Order Item and OrderItem. can we include the schema name along with the entityName/objectName in logs and response? since, santitized object will always show up as OederItem


// Sanitize the entity name by ensuring all whitespace characters are removed.
entityName = SanitizeGeneratedEntityName(entityName);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sanitization via whitespace removal can introduce collisions. For example:

  • "Order Item" → "OrderItem"
  • "OrderItem" → "OrderItem"

Both would map to the same entity name after sanitization.

Do we currently detect or handle such collisions? It might be useful to either log or explicitly guard against these cases to avoid silent overrides or ambiguous schema behavior.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Collisions are now surfaced explicitly. The original (pre-sanitization) name is stored before RemoveWhitespaceAndCamelCase is called. When entities.TryAdd fails (covering the "Order Item" / "OrderItem""OrderItem" collision case), the exception message includes both the sanitized name and the original name along with the schema, e.g.:

Entity 'OrderItem' (normalized from 'Order Item' in schema 'dbo') conflicts with autoentity pattern '...'. Use --patterns.exclude to skip it.

This makes it clear to the user that the collision may be a result of whitespace removal rather than an exact name match.

Comment on lines +736 to +748
/// <summary>
/// Sanitizes the generated entity name by removing whitespace and capitalizing the next character after whitespace.
/// </summary>
/// <param name="name">The entity name to be sanitized.</param>
/// <returns>The sanitized entity name.</returns>
protected static string SanitizeGeneratedEntityName(string name)
{
StringBuilder sanitizedName = new(name.Length);
bool capitalizeNext = false;

foreach (char character in name)
{
if (char.IsWhiteSpace(character))

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use a more specific name (e.g., RemoveWhitespaceAndCamelCase) could improve clarity and avoid confusion for future contributors.

capitalizeNext = false;
}

return sanitizedName.ToString();

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The sanitizer does not currently handle edge cases like empty or whitespace-only strings. For example:

" " → ""

This would result in an invalid entity name.

It may be worth guarding against such cases by either throwing or providing a fallback to ensure the final entity name is always valid.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Edge case is now guarded. After calling RemoveWhitespaceAndCamelCase, if the result is empty (e.g. input was all whitespace), an error is logged with the original name and schema for context, and the entry is skipped via continue — consistent with how null/empty raw values are handled just above.

[DataRow("foo.%", 0, DisplayName = "Test Autoentities with foo schema")]
[DataRow("bar.%", 1, DisplayName = "Test Autoentities with bar schema")]
[DataRow("dbo.Order Items", 2, DisplayName = "Test Autoentities with object with spaces")]
public async Task TestAutoentitiesGeneratedWithUnusualElements(string includePattern, int patternType)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tests validate that REST and GraphQL access work end-to-end, which is great.

One improvement would be to explicitly assert the generated/sanitized entity name (e.g., "dbo_OrderItems") in the test. That would make the behavior more explicit and protect against future regressions in naming logic.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The new TestAutoentitiesGeneratedWithSpacesInObjectName test declares const string EXPECTED_ENTITY_NAME = "dbo_OrderItems" and uses it directly as the REST path and GraphQL field name. The Assert.AreEqual failure message explicitly states that the entity name must be sanitized from "dbo_Order Items" to "dbo_OrderItems", making the expected generated name unambiguous.

{
// Act
string path = isPatternFoo ? "foo_magazines" : "bar_magazines";
string path;

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The current test combines multiple scenarios (schemas + whitespace handling) using a patternType switch. This makes the test a bit harder to follow and extend.

It may be cleaner to split whitespace-specific validation into a separate test method for better readability and maintainability.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extracted into a dedicated TestAutoentitiesGeneratedWithSpacesInObjectName test method. TestAutoentitiesGeneratedWithUnusualElements now only covers the two non-default schema cases (foo/bar), while the whitespace normalization scenario has its own test with an explicit EXPECTED_ENTITY_NAME constant and targeted assertions.

@github-project-automation github-project-automation Bot moved this from Todo to Review In Progress in Data API builder Jun 9, 2026
Copilot AI requested a review from anushakolan June 9, 2026 18:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Status: Review In Progress

6 participants