Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature/faceting #7

Merged
merged 30 commits into from
Apr 28, 2022
Merged

Feature/faceting #7

merged 30 commits into from
Apr 28, 2022

Conversation

jakewrfoster
Copy link
Member

@jakewrfoster jakewrfoster commented Apr 25, 2022

Overview

  • Adds DSL class for creating query ES DSL.
  • Adds aggregations to VIP Adapter for the following types:
    • WP Category
    • WP Tag
    • Custom taxonomies
    • WP Post date*
    • WP Post Type
  • Allows for enabling faceting on empty search queries for VIP Search

*Post date DSL is in this PR, but it is unfinished.

Copy link
Contributor

@renatonascalves renatonascalves left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

Left a few suggestions!

/**
* Enables faceting on empty search query strings.
*/
public function enable_empty_search_faceting(): void {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need a disable option?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since it is likely that we would want to have faceting on in most cases, perhaps this should be changed to a disable function and the option defaults to true.

Copy link
Member Author

@jakewrfoster jakewrfoster Apr 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to address your question more directly, I don't think a function to revert this one is needed since this is something that would either be on or off for the full duration of a request lifecycle - it only needs to be set once.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The defaults are also different by platform. SearchPress, for example, has this on by default, whereas ElasticPress / VIP Enterprise Search has it off by default.

case 'taxonomy':
$query['query']['function_score']['query']['bool']['must'][] = [
'terms' => [
'terms.' . $configured_facets[ $facet_label ]['taxonomy'] . '.slug' => [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible or easy to also map via term_id as well as slug? I think the current VIP Search widget has a filter that allows for that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I follow - what would we gain by adding an additional clause here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's my understanding it is another option. You could match by the term_id instead of the term slug.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be useful in combination with the suggestion above about how the data makes its way to this plugin for applying the facets. On a headless site, it might use the term ID instead of the term slug. However, the idea behind this plugin is that it handles faceting for you, so being opinionated about how it does that is fine. If we discover that we need additional flexibility around accepting facet configurations via REST or GraphQL we could revisit the idea of allowing this to be a term ID instead of a term slug.

*
* @param array $request_args Request arguments.
* @param string $path Request path.
* @param string $index Index name.
* @param string $type Index type.
*
* @return array New request arguments.
* @see \ElasticPress\Elasticsearch
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @see tag SHOULD have a description to provide additional information regarding the relationship between the element and its target.

Source: https://docs.phpdoc.org/guide/references/phpdoc/tags/see.html

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The location of the tag is also not the recommended placement.

*
* @return Controller The instance of the class to allow for chaining.
*/
public function enable_empty_search_faceting() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or Controller?

Suggested change
public function enable_empty_search_faceting() {
public function enable_empty_search_faceting(): self {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

: Controller

// Parse face data.
/*
* Parse face data. Added on 11 to ensure that the results and aggs have been set.
* @see VIP_Enterprise_Search::set_results()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same phpDoc suggestion.

*
* @param array $formatted_args Formatted ES args.
* @param array $args WP args.
* @see \ElasticPress\Indexable\Post\Post
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same phpDoc suggestion.

*
* @param array $query
* @return array
* @see \ElasticPress\Indexable\Post\Post.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same phpDoc suggestion.

* @return array
* @see \ElasticPress\Indexable\Post\Post.
*/
public function add_facets_to_ep_query( $query ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
public function add_facets_to_ep_query( $query ) {
public function add_facets_to_ep_query( $query ): array {

* Add facets to EP query.
* Filters `ep_post_formatted_args`.
*
* @param array $query
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this a WP_Query? An EP query? A custom query? It is an array so unlikely to be a WP query....

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comes from an EP filter. It is an array containing the formatted query.

*/
public function add_facets_to_ep_query( $query ) {
// Do we have any facets specified?
$searched_facets = get_query_var( 'fs' );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes the assumption that only searches using the global query are relevant. Headless projects, like Irving, don't always use the global wp-query in its searching functionality.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good point. do you have a suggestion on how we might account for headless projects? cc @kevinfodness

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we could make a generic function that accepts searched facets as an argument, which could be populated here using get_query_var or could be populated from a request to GraphQL or the REST API's search endpoint, and could also be filtered. Adding support for a standard structure in REST and GraphQL makes sense, and then the filter would allow for further customization on a per-site basis. However, this feels like something we can add as an enhancement later vs. being a blocker for merging this PR.

Copy link
Member

@kevinfodness kevinfodness left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great. Left some suggestions. 🍣

}

foreach ( $items as $item ) {
$datum = apply_filters( 'elasticsearch_extensions_facet_datum', false, $item, $this->facets );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need docblocks for any custom action hooks or filters.

/**
* Enables faceting on empty search query strings.
*/
public function enable_empty_search_faceting(): void {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The defaults are also different by platform. SearchPress, for example, has this on by default, whereas ElasticPress / VIP Enterprise Search has it off by default.

/**
* Enables an aggregation based on WP Tags.
*/
public function enable_tag_aggregation(): void {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this because WordPress handles categories and tags differently in terms of what it calls them in query construction? Since we need to be able to enable aggregations on taxonomies generally (to support custom taxonomies) I'm wondering whether it makes sense to have categories and tags be treated in a "special" way here.

*/
public function add_facets_to_ep_query( $query ) {
// Do we have any facets specified?
$searched_facets = get_query_var( 'fs' );
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, we could make a generic function that accepts searched facets as an argument, which could be populated here using get_query_var or could be populated from a request to GraphQL or the REST API's search endpoint, and could also be filtered. Adding support for a standard structure in REST and GraphQL makes sense, and then the filter would allow for further customization on a per-site basis. However, this feels like something we can add as an enhancement later vs. being a blocker for merging this PR.

case 'taxonomy':
$query['query']['function_score']['query']['bool']['must'][] = [
'terms' => [
'terms.' . $configured_facets[ $facet_label ]['taxonomy'] . '.slug' => [
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This might be useful in combination with the suggestion above about how the data makes its way to this plugin for applying the facets. On a headless site, it might use the term ID instead of the term slug. However, the idea behind this plugin is that it handles faceting for you, so being opinionated about how it does that is fine. If we discover that we need additional flexibility around accepting facet configurations via REST or GraphQL we could revisit the idea of allowing this to be a term ID instead of a term slug.

$this->adapter->enable_post_date_aggregation();
$defaults = [
'count' => 1000,
'calendar_interval' => 'month',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend defaulting to year here. I think bucketing by year is likely the most common scenario, and would result in the smallest number of buckets, and folks could zoom in from there if they need/want to.

if ( isset( $this->adapter ) ) {
$this->adapter->enable_post_type_aggregation();
$defaults = [
'count' => 1000,
'name' => 'Post Type',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Strings like these should be translated if they aren't being pulled from already-translated strings (like taxonomy labels).

if ( isset( $this->adapter ) ) {
$this->adapter->enable_taxonomy_aggregation( $taxonomy );
$defaults = [
'count' => 1000,
'name' => $taxonomy,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we could extract the name from the taxonomy object based on its slug.

/**
* Elasticsearch facet/aggregation.
*/
class Facet {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment here - recommend calling this Aggregation instead of Facet

*
* @var string
*/
public string $subtype;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpicky, but I like to alphabetize my class properties and method names to keep everything organized.

@kevinfodness
Copy link
Member

I'm going to merge this so I can build on it.

@kevinfodness kevinfodness merged commit 7d58a38 into main Apr 28, 2022
@kevinfodness kevinfodness deleted the feature/faceting branch April 28, 2022 18:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants