Skip to content

Conversation

@alzarei
Copy link

@alzarei alzarei commented Sep 27, 2025

Modernize BingTextSearch connector with ITextSearch interface

Problem Statement

The BingTextSearch connector currently only implements the legacy ITextSearch interface, forcing users to use clause-based TextSearchFilter instead of modern type-safe LINQ expressions. This creates runtime errors from property name typos and lacks compile-time validation.

Technical Approach

This PR modernizes the BingTextSearch connector to implement the generic ITextSearch interface alongside the existing legacy interface. The implementation provides recursive expression tree processing to convert LINQ patterns into Bing Web Search API advanced operators.

Implementation Details

Core Changes

  • Implement ITextSearch interface with full generic method support
  • Add recursive LINQ expression tree processor with operator-specific handlers
  • Map supported LINQ operators to Bing API advanced search syntax
  • Maintain all existing functionality while adding modern type-safe alternatives

Expression Tree Processing

  • Equality (==) → language:en syntax
  • Inequality (!=) → -language:en negation syntax
  • Contains() → intitle:, inbody:, url: operators
  • Logical AND (&&) → Sequential filter application

Code Examples

Before (Legacy Interface)

var options = new TextSearchOptions
{
    Filter = new TextSearchFilter().Equality("site", "microsoft.com")
};
var results = await textSearch.SearchAsync("Semantic Kernel", options);

After (Generic Interface)

// Simple filtering
var options = new TextSearchOptions<BingWebPage>
{
    Filter = page => page.Language == "en"
};

// Complex filtering
var complexOptions = new TextSearchOptions<BingWebPage>
{
    Filter = page => page.Language == "en" &&
                     page.Name.Contains("Microsoft") &&
                     page.IsFamilyFriendly != false &&
                     page.Url.Contains("docs")
};

var results = await textSearch.SearchAsync("AI", options);

Implementation Benefits

Type Safety & Developer Experience

  • Compile-time validation of BingWebPage property access
  • IntelliSense support for all BingWebPage properties
  • Eliminates runtime errors from property name typos in filters

Enhanced Filtering Capabilities

  • Equality filtering: page => page.Language == "en"
  • Exclusion filtering: page => page.Language != "fr"
  • Substring matching: page => page.Name.Contains("AI")
  • Complex queries with multiple conditions combined

Validation Results

Build Verification

  • Command: dotnet build --configuration Release
  • Result: Build succeeded in 2366.9s (39.4 min), 0 errors, 2 warnings
  • Focused build: dotnet build src/Plugins/Plugins.Web/Plugins.Web.csproj --configuration Release
  • Result: Build succeeded in 92.4s, 0 errors, 0 warnings

Test Coverage

  • BingTextSearch Unit Tests: 38/38 tests passed (100%, 4.8s execution)
    • URI building with equality filters (31 parameter variations)
    • Inequality operator support (negation syntax)
    • Contains() method handling
    • Response parsing and result mapping
  • Core Semantic Kernel Tests: 1,574/1,574 tests passed (100%, 10.4s duration)
  • Full Solution Tests: 7,267/7,267 core unit tests passed
  • Integration Tests: 2,923 skipped (missing API keys - expected)

Code Quality

  • Static Analysis: 0 compiler errors, 2 warnings (solution-wide, unrelated)
  • Code Changes: +165 insertions, -17 deletions in BingTextSearch.cs
  • Formatting: dotnet format SK-dotnet.slnx --verify-no-changes - 0 files needed formatting
  • Backward Compatibility: All existing functionality preserved with zero regressions

Files Modified

dotnet/src/Plugins/Plugins.Web/Bing/BingTextSearch.cs

Breaking Changes

None. All existing BingTextSearch functionality preserved with zero regressions.

Multi-PR Context

This is PR 3 of 6 in the structured implementation approach for Issue #10456. This PR extends LINQ filtering support to the BingTextSearch connector while maintaining independence from other connector modernization efforts.

@moonbox3 moonbox3 added the .NET Issue or Pull requests regarding .NET code label Sep 27, 2025
@alzarei alzarei marked this pull request as ready for review September 27, 2025 06:34
@alzarei alzarei requested a review from a team as a code owner September 27, 2025 06:34
@alzarei alzarei marked this pull request as draft September 27, 2025 07:01
@alzarei alzarei force-pushed the feature-text-search-linq-pr3 branch from 5d56fac to 3d5cd1a Compare September 27, 2025 22:17
@alzarei alzarei changed the title .Net: feat: Modernize BingTextSearch connector (depends on #PR2 #13179) (microsoft#10456) Sep 27, 2025
@alzarei alzarei marked this pull request as ready for review September 28, 2025 06:09
@alzarei alzarei changed the title .Net: feat: Modernize BingTextSearch connector (microsoft#10456) Sep 28, 2025
alzarei added a commit to alzarei/semantic-kernel that referenced this pull request Oct 17, 2025
- Added 9 semantic verification tests for LINQ filter translation
- Tests verify correct Bing API query parameter generation
- Fixed inequality operator bug discovered during testing
- Addresses reviewer feedback on PR microsoft#13188
{
Top = 4,
Skip = 0,
Filter = page => page.Language == "en" && page.Name.Contains("AI")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably something that can be addressed separately, but also take a look at this PR for some upcoming changes regarding contains:
#13263

Copy link
Author

@alzarei alzarei Oct 27, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@westey-m Thanks for sharing this! I checked and PR #13263 merged 4 days ago with a clear pattern across the vector store connectors.

I agree we should keep this PR focused on LINQ filtering and defer the C# 14 compatibility. BingTextSearch is still on C# 12 (VectorData upgraded, but Plugins.Web didn't), so no immediate break. But when we eventually move to C# 14, we'll need the ~100-line fix following @roji's pattern.

I can create a tracking issue now with implementation details, or would you prefer I just handle it in a follow-up PR when the time comes? What do you think works best?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree we should keep this PR focused on LINQ filtering and defer the C# 14 compatibility. BingTextSearch is still on C# 12 (VectorData upgraded, but Plugins.Web didn't), so no immediate break. But when we eventually move to C# 14, we'll need the ~100-line fix following @roji's pattern.

The language version which BingTextSearch itself uses isn't relevant here - it's the language used in user code using these APIs; the moment the user uses C# 14, Contains over an array in their code will resolve to MemoryExtensions.Contains instead of Enumerable.Contains, triggering the problem. So if you're writing (mini-)LINQ providers for the different search providers, I'd integrate support for MemoryExtensions from the get-go.

Copy link
Author

@alzarei alzarei Oct 28, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@roji Thank you for clarifying that the user's language version is the determining factor. I agree this should be addressed in the current PR rather than deferred.

I've implemented C# 14 MemoryExtensions.Contains detection in commit 3ff0f4d, following the pattern established in #13263 with an architectural constraint:

Context: The Bing Search API does not support OR logic across multiple values (unlike database-backed vector stores that can use SQL IN clauses). Therefore, collection Contains patterns cannot be translated to valid Bing search operators.

Implementation: Both Enumerable.Contains (C# 13-) and MemoryExtensions.Contains (C# 14+) are now detected and throw NotSupportedException with clear error messages explaining the limitation and suggesting alternatives (e.g., multiple separate searches or client-side filtering).

What works:

  • String.Contains (instance method): page.Name.Contains("AI")intitle:AI
  • Clear, actionable errors for unsupported collection Contains patterns regardless of C# version

Test coverage: Added 2 comprehensive tests:

  1. CollectionContainsFilterThrowsNotSupportedExceptionAsync - validates exception behavior for both C# 13 (Enumerable.Contains) and C# 14 (MemoryExtensions.Contains) patterns using the same test code
  2. StringContainsStillWorksWithLINQFiltersAsync - regression test ensuring String.Contains continues to work

This ensures C# 14 users receive helpful error messages rather than cryptic failures or silent incorrect behavior. Let me know if you'd prefer a different approach!

alzarei added a commit to alzarei/semantic-kernel that referenced this pull request Oct 29, 2025
Support detection of MemoryExtensions.Contains patterns in LINQ filters
for C# 14 compatibility. In C# 14, Contains over arrays/collections now
resolves to MemoryExtensions.Contains instead of Enumerable.Contains due
to 'first-class spans' overload resolution changes.

Since Bing Search API does not support OR logic across multiple values,
collection Contains patterns (e.g., array.Contains(page.Property)) throw
clear NotSupportedException explaining the limitation and suggesting
alternatives.

Changes:
- Add System.Diagnostics.CodeAnalysis using for NotNullWhen attribute
- Enhance Contains case to distinguish instance vs static method calls
- Add IsMemoryExtensionsContains helper to detect C# 14 patterns
- Throw NotSupportedException for both Enumerable and MemoryExtensions
  collection Contains patterns with clear actionable error messages
- Add 2 tests: collection Contains exception + String.Contains regression

Implements pattern awareness from PR microsoft#13263 by @roji
Addresses feedback on PR microsoft#13188 from @roji about C# 14 compatibility

Contributes to microsoft#10456 (LINQ filtering migration initiative)
Fixes microsoft#12504 compatibility for text search implementations
@alzarei alzarei requested review from roji and westey-m October 29, 2025 07:31
@alzarei
Copy link
Author

alzarei commented Oct 31, 2025

@roji @markwallace-microsoft

All feedback has been addressed and @westey-m has approved the changes. CI/CD checks are passing successfully.

The PR is ready for final review and merge. Thank you! 🙏

@alzarei
Copy link
Author

alzarei commented Oct 31, 2025

Rebased on top of feature branch to resolve conflicts. Ready for review.

@alzarei alzarei force-pushed the feature-text-search-linq-pr3 branch from db5c9a2 to dfb589d Compare October 31, 2025 06:04
alzarei added a commit to alzarei/semantic-kernel that referenced this pull request Oct 31, 2025
- Added 9 semantic verification tests for LINQ filter translation
- Tests verify correct Bing API query parameter generation
- Fixed inequality operator bug discovered during testing
- Addresses reviewer feedback on PR microsoft#13188
alzarei added a commit to alzarei/semantic-kernel that referenced this pull request Oct 31, 2025
Support detection of MemoryExtensions.Contains patterns in LINQ filters
for C# 14 compatibility. In C# 14, Contains over arrays/collections now
resolves to MemoryExtensions.Contains instead of Enumerable.Contains due
to 'first-class spans' overload resolution changes.

Since Bing Search API does not support OR logic across multiple values,
collection Contains patterns (e.g., array.Contains(page.Property)) throw
clear NotSupportedException explaining the limitation and suggesting
alternatives.

Changes:
- Add System.Diagnostics.CodeAnalysis using for NotNullWhen attribute
- Enhance Contains case to distinguish instance vs static method calls
- Add IsMemoryExtensionsContains helper to detect C# 14 patterns
- Throw NotSupportedException for both Enumerable and MemoryExtensions
  collection Contains patterns with clear actionable error messages
- Add 2 tests: collection Contains exception + String.Contains regression

Implements pattern awareness from PR microsoft#13263 by @roji
Addresses feedback on PR microsoft#13188 from @roji about C# 14 compatibility

Contributes to microsoft#10456 (LINQ filtering migration initiative)
Fixes microsoft#12504 compatibility for text search implementations
…rchResults<TRecord>

- Change interface return type from KernelSearchResults<object> to KernelSearchResults<TRecord>
- Update VectorStoreTextSearch implementation with new GetResultsAsTRecordAsync helper
- Keep GetResultsAsRecordAsync for legacy ITextSearch interface backward compatibility
- Update 3 unit tests to use strongly-typed DataModel instead of object

Benefits:
- Improved type safety - no more casting required
- Better IntelliSense and developer experience
- Zero breaking changes to legacy ITextSearch interface
- All 19 unit tests pass

This is Part 2.1 of the Issue microsoft#10456 multi-PR chain, refining the API .
alzarei and others added 2 commits November 2, 2025 18:43
…extSearch

Implement ITextSearch<BingWebPage> alongside existing ITextSearch interface
Add LINQ expression conversion logic with property mapping to Bing API parameters
Support type-safe filtering with BingWebPage properties
Provide graceful degradation for unsupported LINQ expressions
Maintain 100% backward compatibility with existing legacy interface

Addresses microsoft#10456
Part of PR 3/6 in structured modernization of ITextSearch interfaces
- Implement equality (==), inequality (!=), Contains(), and AND (&&) operators
- Map LINQ expressions to Bing Web Search API advanced operators
- Support negation syntax for inequality (-operator:value)
- Maintain full backward compatibility

Addresses microsoft#10456
Aligns with PR microsoft#10273

Tests: 38/38 pass (100%)
Breaking changes: None
alzarei and others added 6 commits November 2, 2025 18:43
- Added 9 semantic verification tests for LINQ filter translation
- Tests verify correct Bing API query parameter generation
- Fixed inequality operator bug discovered during testing
- Addresses reviewer feedback on PR microsoft#13188
…y test

- ProcessEqualityExpression now handles Convert expressions for nullable types (bool?)
- Added explicit boolean-to-lowercase string conversion for Bing API compatibility
- Added test for IsFamilyFriendly == true filter (validates safeSearch query parameter)
- Achieves 100% coverage of all 10 LINQ filter operations
- Note: CA1308 warning acceptable - Bing API requires lowercase boolean strings
- Add pragma directive to suppress CA1308 warning with proper rationale
- Bing API specifically expects lowercase boolean values (true/false)
- ToLowerInvariant() is the correct choice for this API compatibility requirement
- All builds now clean with 0 warnings, tests continue to pass (48/48)
…ilter tests

Fix compiler warnings blocking CI/CD builds:
- Fix CS8602 nullable warnings: Add null-forgiving operators (!) to 5 test locations
- Fix CS0618 obsolete warning: Add pragma to suppress ITextSearch deprecation warning during Phase 2 transition

Add 18 new comprehensive tests covering edge cases and complex scenarios:
- Boolean value handling (true/false/inequality with safeSearch parameter)
- Multiple Contains operations (3+ conditions with distinct operators)
- Mixed operators (equality + Contains + negation in single filter)
- Result structure preservation across all 3 search methods
- URL encoding validation for special characters
- Null filter handling and contradictory conditions
- All 6 supported BingWebPage properties (Name, Url, DisplayUrl, Snippet, Language, IsFamilyFriendly)

Test quality improvements:
- Fix 2 test assertions for correct Bing URL encoding expectations
- Document limitation: Query parameters don't support negation prefix
- Verify semantic correctness of LINQ→Bing query translation

All 56 BingTextSearch tests passing (100% success rate)
Builds clean with TreatWarningsAsErrors=true

Contributes to microsoft#10456 (LINQ filtering migration initiative)
Support detection of MemoryExtensions.Contains patterns in LINQ filters
for C# 14 compatibility. In C# 14, Contains over arrays/collections now
resolves to MemoryExtensions.Contains instead of Enumerable.Contains due
to 'first-class spans' overload resolution changes.

Since Bing Search API does not support OR logic across multiple values,
collection Contains patterns (e.g., array.Contains(page.Property)) throw
clear NotSupportedException explaining the limitation and suggesting
alternatives.

Changes:
- Add System.Diagnostics.CodeAnalysis using for NotNullWhen attribute
- Enhance Contains case to distinguish instance vs static method calls
- Add IsMemoryExtensionsContains helper to detect C# 14 patterns
- Throw NotSupportedException for both Enumerable and MemoryExtensions
  collection Contains patterns with clear actionable error messages
- Add 2 tests: collection Contains exception + String.Contains regression

Implements pattern awareness from PR microsoft#13263 by @roji
Addresses feedback on PR microsoft#13188 from @roji about C# 14 compatibility

Contributes to microsoft#10456 (LINQ filtering migration initiative)
Fixes microsoft#12504 compatibility for text search implementations
…ts<BingWebPage> per PR microsoft#13318

- Change ITextSearch<BingWebPage>.GetSearchResultsAsync to return KernelSearchResults<BingWebPage> instead of <object>
- Add strongly-typed helper GetResultsAsBingWebPageAsync returning IAsyncEnumerable<BingWebPage>
- Update tests to expect strongly-typed results (no manual casting needed)
- Fix duplicate XML comment issue

This aligns with PR microsoft#13318 interface type safety improvements and eliminates wasteful casting.
@alzarei alzarei force-pushed the feature-text-search-linq-pr3 branch from dfb589d to 2864a47 Compare November 3, 2025 03:04
@markwallace-microsoft markwallace-microsoft added kernel Issues or pull requests impacting the core kernel kernel.core labels Nov 3, 2025
markwallace-microsoft pushed a commit that referenced this pull request Nov 3, 2025
…arch.GetSearchResultsAsync (#13318)

This PR enhances the type safety of the `ITextSearch<TRecord>` interface
by changing the `GetSearchResultsAsync` method to return
`KernelSearchResults<TRecord>` instead of `KernelSearchResults<object>`.
This improvement eliminates the need for manual casting and provides
better IntelliSense support for consumers.

## Motivation and Context

The current implementation of
`ITextSearch<TRecord>.GetSearchResultsAsync` returns
`KernelSearchResults<object>`, which requires consumers to manually cast
results to the expected type. This reduces type safety and degrades the
developer experience by losing compile-time type checking and
IntelliSense support.

This change aligns the return type with the generic type parameter
`TRecord`, providing the expected strongly-typed results that users of a
generic interface would anticipate.

## Changes Made

### Interface (ITextSearch.cs)
- Changed `ITextSearch<TRecord>.GetSearchResultsAsync` return type from
`KernelSearchResults<object>` to `KernelSearchResults<TRecord>`
- Updated XML documentation to reflect strongly-typed return value
- Legacy `ITextSearch` interface (non-generic) remains unchanged,
continuing to return `KernelSearchResults<object>` for backward
compatibility

### Implementation (VectorStoreTextSearch.cs)
- Added new `GetResultsAsTRecordAsync` helper method returning
`IAsyncEnumerable<TRecord>`
- Updated generic interface implementation to use the new strongly-typed
helper
- Retained `GetResultsAsRecordAsync` method for the legacy non-generic
interface

### Tests (VectorStoreTextSearchTests.cs)
- Updated 3 unit tests to use strongly-typed `DataModel` or
`DataModelWithRawEmbedding` instead of `object`
- Improved test assertions to leverage direct property access without
casting
- All 19 tests pass successfully

## Breaking Changes

**Interface Change (Experimental API):**
- `ITextSearch<TRecord>.GetSearchResultsAsync` now returns
`KernelSearchResults<TRecord>` instead of `KernelSearchResults<object>`
- This interface is marked with `[Experimental("SKEXP0001")]`,
indicating that breaking changes are expected during the preview period
- Legacy `ITextSearch` interface (non-generic) is unaffected and
maintains full backward compatibility

## Benefits

- **Improved Type Safety**: Eliminates runtime casting errors by
providing compile-time type checking
- **Enhanced Developer Experience**: Full IntelliSense support for
TRecord properties and methods
- **Cleaner Code**: Consumers no longer need to cast results from object
to the expected type
- **Consistent API Design**: Generic interface now behaves as expected,
returning strongly-typed results
- **Zero Impact on Legacy Code**: Non-generic ITextSearch interface
remains unchanged

## Testing

- All 19 existing unit tests pass
- Updated tests demonstrate improved type safety with direct property
access
- Verified both generic and legacy interfaces work correctly
- Confirmed zero breaking changes to non-generic ITextSearch consumers

## Related Work

This PR is part of the Issue #10456 multi-PR chain for modernizing
ITextSearch with LINQ-based filtering:
- PR #13175: Foundation (ITextSearch<TRecord> interface) - Merged
- PR #13179: VectorStoreTextSearch + deprecation pattern - In Review
- **This PR (2.1)**: API refinement for improved type safety
- PR #13188-13191: Connector migrations (Bing, Google, Tavily, Brave) -
Pending
- PR #13194: Samples and documentation - Pending

All PRs target the `feature-text-search-linq` branch for coordinated
release.

## Migration Guide for Consumers

### Before (Previous API)
```csharp
ITextSearch<DataModel> search = ...;
KernelSearchResults<object> results = await search.GetSearchResultsAsync("query", options);

foreach (var obj in results.Results)
{
    var record = (DataModel)obj;  // Manual cast required
    Console.WriteLine(record.Name);
}
```

### After (Improved API)
```csharp
ITextSearch<DataModel> search = ...;
KernelSearchResults<DataModel> results = await search.GetSearchResultsAsync("query", options);

foreach (var record in results.Results)  // Strongly typed!
{
    Console.WriteLine(record.Name);  // Direct property access with IntelliSense
}
```

## Checklist

- [x] Changes build successfully
- [x] All unit tests pass (19/19)
- [x] XML documentation updated
- [x] Breaking change documented (experimental API only)
- [x] Legacy interface backward compatibility maintained
- [x] Code follows project coding standards

Co-authored-by: Alexander Zarei <alzarei@users.noreply.github.com>
@markwallace-microsoft markwallace-microsoft merged commit 639e7cb into microsoft:feature-text-search-linq Nov 3, 2025
14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

kernel.core kernel Issues or pull requests impacting the core kernel .NET Issue or Pull requests regarding .NET code

5 participants