-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor projectIdSingleton -> hostUrlSingleton #148
Refactor projectIdSingleton -> hostUrlSingleton #148
Conversation
@@ -14,48 +16,38 @@ jest.mock('../utils', () => { | |||
}); | |||
jest.mock('../data/fetch'); | |||
jest.mock('../data/upsert'); | |||
jest.mock('../data/hostUrlSingleton'); | |||
|
|||
jest.mock('../control', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent a lot of time trying to get the mocking for things working and it still feels like a mess. 😭
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry this sucked. If you look at my comment on pinecone.ts
, I think you could do this in a way that doesn't require any modifications to describeIndex.
@@ -6,7 +6,6 @@ describe('Error handling', () => { | |||
test('calling control plane', async () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This file will need to be fixed up.
@@ -4,7 +4,6 @@ describe('Client initialization', () => { | |||
test('can accept a config object', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above, just fixing up lint / typescript for now.
src/control/describeIndex.ts
Outdated
export const describeIndex = (api: IndexOperationsApi) => { | ||
export const describeIndex = ( | ||
api: IndexOperationsApi, | ||
callback?: ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went back and forth on a few different means of allowing the describeIndex
call to interact with HostUrlSingleton
. Initially I wanted to use it within here, but then we didn't have access to the PineconeConfiguration
for the cacheKey
in set, so went with a callback.
src/data/hostUrlSingleton.ts
Outdated
const describeResponse = await describeIndex(indexOperationsApi)(indexName); | ||
const host = describeResponse.status?.host; | ||
|
||
if (!host) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's fine to cover this case here because the type indicates it could occur (mostly because of loose typing in the openapi spec, I think), but in practice I have never seen the host value come back missing. I think the url is known immediately, whether or not the index is actually ready to receive data plane calls.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, primarily we're satisfying flow. Maybe we could opt to update the OpenAPI spec to make this a bit easier for us.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking over this diff really made me appreciate what a trooper you've been doing so many huge reviews for me lately.
To summarize my feedback:
- Add an env var to override control plane url
- Consider extracting a new class of some kind to handle config/initialization for IndexOperationsApi
- You can simplify a lot of stuff by leaving describeIndex as is and making it the responsibility of the Pinecone class to inform the singleton of the host whenever pinecone.describeIndex() gets called
- There's an edge case with pinecone.deleteIndex also. Probably things should be evicted from the singleton cache when an index is deleted. Otherwise repeated create/delete/create operations will leave you in a state where the cached host url is invalid.
src/data/types.ts
Outdated
apiKey: Type.String({ minLength: 1 }), | ||
projectId: Type.Optional(Type.String({ minLength: 1 })), | ||
hostUrl: Type.Optional(Type.String({ minLength: 1 })), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure it's a good idea to have this here, or maybe we need to create a distinction between configuration for the Pinecone
class and the Index
class. In the past, environment/apiKey/projectId were all global so it wasn't a significant difference and we could use the same config object everywhere, but now we have a need to have index-specific configuration since hostUrl will only apply to a single index. I need to go throught the rest of this PR still before I make up my mind, but it feels to me like maybe the time to setup a dedicated IndexConfiguration
that extends/subclasses PineconeConfiguration
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead and split things apart. We now have:
PineconeConfiguration
IndexConfiguration
(which extendsPineconeConfiguration
)
Let me know if this doesn't make sense.
src/pinecone.ts
Outdated
const { apiKey, environment } = options; | ||
const controllerPath = `https://controller.${environment}.pinecone.io`; | ||
const { apiKey } = options; | ||
const controllerPath = `https://api.pinecone.io`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need for backticks anymore since we're no longer interpolating vars.
const controllerPath = `https://api.pinecone.io`; | |
const controllerPath = 'https://api.pinecone.io'; |
Also, this is where I was thinking an environment variable like PINECONE_CONTROLLER_HOST
could be useful to unblock our integration testing.
src/pinecone.ts
Outdated
/** | ||
* @internal | ||
* Used as a callback by {@link Pinecone.describeIndex} to cache any index host URLs that are returned. | ||
* This bypasses the need to fetch the host on subsequent dataplane calls. | ||
*/ | ||
_handleDescribeIndex = (response: IndexDescription, indexName: IndexName) => { | ||
if (response.status?.host) { | ||
HostUrlSingleton._set(this.config, indexName, response.status.host); | ||
} | ||
}; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think an alternative approach would be to invoke HostUrlSingleton from inside the describeIndex method defined further down this file, since that's the only place this._describeIndex
gets used.
describeIndex(indexName: IndexName) {
const description = this._describeIndex(indexName);
HostUrlSingleton._set(this.config, indexName, response.status.host);
return description;
}
If you do it this way, you shouldn't need any changes callback changes in src/data/describeIndex.ts
and can simplify some of the testing gymnastics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, not really sure why I decided to go with a callback at this point. This would have been much cleaner. 😂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do remember the reason: we're returning promises and I went with passing an explicit callback rather than using .then()
on the return. Using the promise itself and keeping things isolated to pinecone.ts
.
@@ -14,48 +16,38 @@ jest.mock('../utils', () => { | |||
}); | |||
jest.mock('../data/fetch'); | |||
jest.mock('../data/upsert'); | |||
jest.mock('../data/hostUrlSingleton'); | |||
|
|||
jest.mock('../control', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry this sucked. If you look at my comment on pinecone.ts
, I think you could do this in a way that doesn't require any modifications to describeIndex.
src/data/hostUrlSingleton.ts
Outdated
const _buildIndexOperationsApi = (config: PineconeConfiguration) => { | ||
const { apiKey } = config; | ||
const controllerPath = `https://api.pinecone.io`; | ||
const apiConfig: IndexOperationsApiConfigurationParameters = { | ||
basePath: controllerPath, | ||
apiKey, | ||
queryParamsStringify, | ||
headers: { | ||
'User-Agent': buildUserAgent(false), | ||
}, | ||
fetchApi: getFetch(config), | ||
middleware, | ||
}; | ||
|
||
return new IndexOperationsApi(new ApiConfiguration(apiConfig)); | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be worthwhile to extract this configuration stuff into somewhere because we probably want it to be identical to the configs that are setup inside pinecone.ts
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It didn't seem like it necessitated a class as we're not doing the same sort of logic as in vectorOperationsProvider
. We now have a centralized place for the index API configurations: indexOperationsBuilder
.
src/data/hostUrlSingleton.ts
Outdated
|
||
const _buildIndexOperationsApi = (config: PineconeConfiguration) => { | ||
const { apiKey } = config; | ||
const controllerPath = `https://api.pinecone.io`; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some way of overriding this value using an env var like PINECONE_CONTROLLER_HOST
would be useful in getting the integration tests working over the short-term while api.pinecone.io is still being worked on.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added. I almost felt like we could leave PINECONE_ENVIRONMENT
as an optional env variable for now and then either use PINECONE_CONTROLLER_HOST
if provided, or PINECONE_ENVIRONMENT
to build it if provided, or default to https://api.pinecone.io
, but that seemed like it would overly complicate things when we can just provide the URL anyways.
@@ -0,0 +1,27 @@ | |||
import { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seemed a bit silly to unit test, and the only thing we'd really care about checking is the proper controllerPath
being set. Since the internals of the OpenAPI classes are protected, that's not necessarily easy unless we want to use a ts-ignore
.
@@ -219,7 +200,17 @@ export class Pinecone { | |||
* @returns A promise that resolves to {@link IndexMeta} | |||
*/ | |||
describeIndex(indexName: IndexName) { | |||
return this._describeIndex(indexName); | |||
const describeIndexPromise = this._describeIndex(indexName); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Swapped from passing a callback to appending a .then()
on the promise before returning.
const deleteIndexPromise = this._deleteIndex(indexName); | ||
|
||
// When an index is deleted, we need to evict the host from the IndexHostSingleton cache. | ||
deleteIndexPromise.then(() => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this should work for evicting things from the cache if the index has been deleted.
…rationsProvider to leverage the new singleton, updates to pinecone and describeIndex to assist in caching
…ssociated unit tests and comments
…lass to cover validating the callback
…NECONE_CONTROLLER_HOST environment value and use over the default if provided, update unit tests
2b21bb4
to
5eaffe3
Compare
## Problem In the future, we'll be using `describeIndex` to retrieve the specific `host` value for a given Index. Previously, this was handled internally by the client via configuration values and pinging `whoami`, and caching the results to prevent unneeded requests and latency. We want to maintain this caching behavior for `apiKey` -> `indexName` while simplifying the client configuration and calling `describeIndex` to retrieve the `host`. ## Solution The new `IndexHostSingleton` replaces the existing `ProjectIdSingleton`. The functionality is very similar between the two singletons at a high level, but the code paths for making network calls are different. I've also gone ahead and pulled out `projectId` and `environment` as configuration values as they'll no longer be needed to generate the data plane address. - Delete `ProjectIdSingleton` and add new `IndexHostSingleton`. - Add unit tests for `IndexHostSingleton`. - Remove `projectId` and `environment` from `PineconeConfiguration` and update related areas, tests, documentation bits, etc. Add new `PINECONE_CONTROLLER_HOST` environment variable. - Add `.then()` callbacks to `describeIndex` and `deleteIndex` to both set and delete the host from the cache when needed. ## Type of Change - [X] Breaking change (fix or feature that would cause existing functionality to not work as expected) - [X] This change requires a documentation update ## Test Plan Testing is a bit tricky at the moment, but you should be able to pull the branch down and test using the repl: ``` npm run repl await init() await client.listIndexes() await client.index('my-index').describeIndexStats() ``` Make sure unit tests are passing in CI. Integration tests are a bit borked right now, but there is a follow-up ticket to address those specifically.
## Problem In the future, we'll be using `describeIndex` to retrieve the specific `host` value for a given Index. Previously, this was handled internally by the client via configuration values and pinging `whoami`, and caching the results to prevent unneeded requests and latency. We want to maintain this caching behavior for `apiKey` -> `indexName` while simplifying the client configuration and calling `describeIndex` to retrieve the `host`. ## Solution The new `IndexHostSingleton` replaces the existing `ProjectIdSingleton`. The functionality is very similar between the two singletons at a high level, but the code paths for making network calls are different. I've also gone ahead and pulled out `projectId` and `environment` as configuration values as they'll no longer be needed to generate the data plane address. - Delete `ProjectIdSingleton` and add new `IndexHostSingleton`. - Add unit tests for `IndexHostSingleton`. - Remove `projectId` and `environment` from `PineconeConfiguration` and update related areas, tests, documentation bits, etc. Add new `PINECONE_CONTROLLER_HOST` environment variable. - Add `.then()` callbacks to `describeIndex` and `deleteIndex` to both set and delete the host from the cache when needed. ## Type of Change - [X] Breaking change (fix or feature that would cause existing functionality to not work as expected) - [X] This change requires a documentation update ## Test Plan Testing is a bit tricky at the moment, but you should be able to pull the branch down and test using the repl: ``` npm run repl await init() await client.listIndexes() await client.index('my-index').describeIndexStats() ``` Make sure unit tests are passing in CI. Integration tests are a bit borked right now, but there is a follow-up ticket to address those specifically.
Problem
In the future, we'll be using
describeIndex
to retrieve the specifichost
value for a given Index. Previously, this was handled internally by the client via configuration values and pingingwhoami
, and caching the results to prevent unneeded requests and latency.We want to maintain this caching behavior for
apiKey
->indexName
while simplifying the client configuration and callingdescribeIndex
to retrieve thehost
.Solution
The new
IndexHostSingleton
replaces the existingProjectIdSingleton
. The functionality is very similar between the two singletons at a high level, but the code paths for making network calls are different. I've also gone ahead and pulled outprojectId
andenvironment
as configuration values as they'll no longer be needed to generate the data plane address.ProjectIdSingleton
and add newIndexHostSingleton
.IndexHostSingleton
.projectId
andenvironment
fromPineconeConfiguration
and update related areas, tests, documentation bits, etc. Add newPINECONE_CONTROLLER_HOST
environment variable..then()
callbacks todescribeIndex
anddeleteIndex
to both set and delete the host from the cache when needed.Type of Change
Test Plan
Testing is a bit tricky at the moment, but you should be able to pull the branch down and test using the repl:
Make sure unit tests are passing in CI.
Integration tests are a bit borked right now, but there is a follow-up ticket to address those specifically.