Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consolidate Ivarator Config #2655

Open
wants to merge 17 commits into
base: integration
Choose a base branch
from

Conversation

SethSmucker
Copy link
Collaborator

@SethSmucker SethSmucker commented Nov 25, 2024

Merges IvaratorConfigs into one central object to be used in ShardQueryLogic ShardQueryConfiguration. Setters/Getters remain the same.
#2644

@SethSmucker SethSmucker force-pushed the task/consolidate-ivaratorconfig branch from ac4ed57 to dd97c1e Compare November 25, 2024 20:33
@SethSmucker SethSmucker force-pushed the task/consolidate-ivaratorconfig branch from 172aba6 to 94ce5ee Compare November 26, 2024 20:49
@lbschanno lbschanno linked an issue Dec 4, 2024 that may be closed by this pull request
@Test
public void testSerializeDeserializeEquivalence() throws JsonProcessingException {
IvaratorConfig conf1 = new IvaratorConfig();
conf1.setIvaratorFstHdfsBaseURIs("hdfs://example.com/ivarator");
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The URI won't have a dot com in it, a reasonable value for tests would be something like "file://tmp/ivarator" or "hdfs://tmp/ivarator/"

@@ -1507,17 +1495,6 @@ public boolean validateOptions(Map<String,String> options) {
this.termCounts = getMapSerDe().deserializeFromString(serializedMap);
}

// parse out cardinality threshold
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like cardinality threshold stuff was reverted as part of this work, please correct this

}

public long getMaxIvaratorResults() {
return maxIvaratorResults;
return ivaratorConfig.getMaxIvaratorResults();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's follow a "get or create" pattern. This allows lazy initialization of the ivarator config. This means a query that does not require an ivarator can avoid some extra object initialization (and later, some potentially expensive I/O calls to the configured filesystem).

This would become

getIvaratorConfig().setMaxIvaratorResults(maxIvaratorResults)
..
return getIvaratorConfig().getMaxIvaratorResults()
..
public IvaratorConfig getIvaratorConfig(){
  if (ivaratorConfig == null) {
    // create a fresh config with default values
    ivaratorConfig = new IvaratorConfig();
  }
  return ivaratorConfig;
}

import datawave.query.iterator.ivarator.IvaratorCacheDirConfig;
import datawave.query.util.sortedset.FileSortedSet;

public class IvaratorConfig implements Serializable {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Add a class level comment that this class encapsulates all ivarator configs in a single object to simplify serialization of configs between the webservice and tablet servers

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Consolidate Ivarator Configs
2 participants