RFC 0027 - The Future of Search #43
Thanks @kjac + @bergmania for putting this together. I have some suggestions and questions:
Document structure
As this document is basically a design/spec, ideally it would have a Requirements section before the detailed design that captures both the functional and non-functional requirements this spec needs to fulfill. My suggestion would be to organize the headings as:
This way the reader knows exactly which expectations the detailed design is meant to meet. Further, the terminology for requirements is: “Must” means mandatory/required. “May” means permitted. “Should” means recommended. So far the only listed requirements are:
From what I can gather from the doc, the requirements include (but are not limited to):
There might be others that I've missed, but IMO it would be good to call all of this out at the start of the document to make it clear what outcomes are expected.
Yet to be defined
There are several items in the document explaining what hasn't been defined yet. I think these should also be called out within a single section (maybe at the top of Detailed Design?)
Concerns
"Replacing the current Examine-based implementation"
In the Summary, this sentence tells readers that you are removing Examine, which is not actually the case. I would suggest removing this part of the sentence, since it is not the intention of this document.
Separating media and documents into separate indexes
Generally, fewer indexes is better, both for performance and for manageability and cost when using search-as-a-service providers, since many providers charge for more indexes or limit the index count. Further, many search providers do not support cross-index searching in a single search operation. A single query could therefore not find both media and content, and scoring between content and media would not be possible, because you'd have to execute two disparate queries. I'm unsure what the real reason would be to separate these indexes. If this fulfills a specific requirement, it should be called out in the requirements section.
Field prefixes
Some search providers have strict limitations on how fields are named, and the design should strive to ensure that field names comply with all search provider limitations. This should be trivial: simply avoid special characters in the naming. For example, here's the limitation for Azure Search: the first character must be a letter or number, and there must be no consecutive dashes or underscores. I would suggest that one of the requirements should be: Index field names must be named with a convention that supports all search providers.
Questions
Thanks for releasing this RFC.
Support for date/times
In the Out of scope section,
Support for providers
I read it as "[Umbraco] will not implement other search providers for the initial release", but site maintainers will be able to implement their own alternative providers with the initial release. Is that correct, or would this implementation work exclusively with Examine at the initial release?
I was one of the people who made a few attempts at introducing abstractions, so let me start my comments with the most important points:
Thank you all for the valuable feedback so far!
#H5RY
Suggestions for document structure
Your concerns
You’re absolutely right—this statement needs clarification. While it’s correct that the current Examine-based implementation will be replaced, it doesn’t imply that Examine will be entirely removed. At present, removing Examine is not part of the plan.
I understand your concern, particularly with Azure Search. From an abstraction perspective, however, we believe separating these types is appropriate. That said, custom implementations of the abstraction can consolidate this data into a single index, using a type field for filtering during searches.
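As a rough illustration of the consolidation mentioned above, the sketch below keeps documents and media in one in-memory index and uses a type field to filter at query time. It is written in Python only to stay short; `IndexEntry`, `search`, and the field name `type` are hypothetical and are not taken from the RFC or from Examine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """One record in a hypothetical consolidated index."""
    id: str
    type: str  # assumed discriminator field: "document" or "media"
    text: str


def search(index, term, types=None):
    """Return ids of entries whose text contains term, optionally filtered by type.

    Ranking is a naive occurrence count, used here only so that documents
    and media can be ordered against each other within a single query.
    """
    needle = term.lower()
    hits = []
    for entry in index:
        if types is not None and entry.type not in types:
            continue
        score = entry.text.lower().count(needle)
        if score > 0:
            hits.append((score, entry.id))
    # Highest score first; ties broken by id for a stable order.
    hits.sort(key=lambda hit: (-hit[0], hit[1]))
    return [entry_id for _, entry_id in hits]


INDEX = [
    IndexEntry("doc-1", "document", "Search in Umbraco: search abstraction overview"),
    IndexEntry("media-1", "media", "Diagram of the search pipeline"),
    IndexEntry("doc-2", "document", "Release notes"),
]
```

Calling `search(INDEX, "search")` ranks documents and media together in one query, while `search(INDEX, "search", types={"media"})` restricts the same index to media only.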
Thanks for highlighting Azure Search’s limitations in this area. Using universally safe prefixes is indeed wise. However, implementations can ultimately adapt field names to accommodate specific providers where necessary. Regarding your questions:
The objective is to deliver a robust abstraction that serves both the back office and websites. Features like faceting and basic filtering reflect this dual focus, as outlined in the RFC: rfcs/cms/0027-the-future-of-search.md Line 22 in 8e06e7f
Your interpretation is correct. We will clarify this section to specify that the “TextsH*” fields represent a hierarchy akin to HTML headings versus regular text. The abstraction provides seven “buckets” for implementation-specific boosting with a defined order of importance. Thank you for your input—it’s greatly appreciated.
Initially, we considered including date/time support but scoped it out for the first version to simplify implementation. Most scenarios can still be addressed using DateTimeOffset, including your example. However, we recognize there may be cases requiring expanded type support and are open to revisiting this.
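To illustrate the DateTimeOffset point, here is a minimal sketch of one way a date-only value could be handled with offset-aware timestamps: store the date as midnight UTC and match it with a half-open range. It uses Python's timezone-aware `datetime` as a stand-in for .NET's `DateTimeOffset`; the helper names are hypothetical, and choosing UTC midnight is an assumption rather than something the RFC specifies.

```python
from datetime import date, datetime, timedelta, timezone


def date_to_timestamp(day):
    """Represent a date-only value as midnight UTC (an assumed convention)."""
    return datetime(day.year, day.month, day.day, tzinfo=timezone.utc)


def day_range(day):
    """Half-open range [start, end) covering the whole UTC day."""
    start = date_to_timestamp(day)
    return start, start + timedelta(days=1)


def falls_on(value, day):
    """True if an offset-aware timestamp falls on the given UTC day."""
    start, end = day_range(day)
    return start <= value < end
```

Because the comparison is done on offset-aware values, a timestamp recorded with a non-UTC offset is matched against the UTC day it actually falls on, which may differ from its local calendar date.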
You understood correctly, though we will refine the RFC to make this clearer. Initially, HQ will provide and support a single implementation. However, we aim to foster community-driven contributions and hope packages for additional search providers will emerge. HQ plans to prototype a single alternative implementation to validate the abstraction’s viability. @bielu
Request for Comments: The Future of Search
Read the full RFC document here.
This RFC discusses adding a new search abstraction to Umbraco. We would love your feedback on the described feature.
How do I contribute?
Most importantly, we don’t want to miss anything, so anything goes in terms of clarifications, questions, suggestions, etc.
Please do the following things if you want to contribute: