discussions: Remote API Extension #3505

Closed
6 of 15 tasks
dan-menlo opened this issue Aug 30, 2024 · 6 comments
Assignees
Labels
category: providers Local & remote inference providers P1: important Important feature / fix type: epic A major feature or initiative

Comments


dan-menlo commented Aug 30, 2024

Goal

  • Remote API extensions are modular (e.g. in a separate GitHub repo)
  • Remote API extensions can refresh their model list on demand (preferably by calling their list API)
  • OR: users can specify the model list in Remote Extensions (e.g. as a comma-separated list)
  • Remote API Extensions can fetch an updated model list (e.g. via params that can be passed in)
  • Users can select a model from the list once
  • Users should only see major models (i.e. not nightly builds)
  • We will not need a model.yaml for each remote model
  • [Stretch] Users can add additional model names in the Remote API's Settings page (e.g. nightly)

Out-of-scope

  • Getting people to add Remote API Extensions
  • Refactor Remote API Extensions into separate repo/npm package (e.g. groq-extension)

Tasklist

Remote API Extensions

Existing Issues

@dan-menlo dan-menlo converted this from a draft issue Aug 30, 2024
@imtuyethan imtuyethan added the type: epic A major feature or initiative label Aug 30, 2024
@freelerobot freelerobot added the P1: important Important feature / fix label Sep 5, 2024
freelerobot commented:

Dupe of #3374

@freelerobot freelerobot moved this from Planning to Need Investigation in Jan & Cortex Sep 5, 2024
@dan-menlo dan-menlo changed the title epic: Remote API Extension that is modular and with easily updateable model lists epic: Remote API Extension Revamp Sep 9, 2024
@dan-menlo dan-menlo self-assigned this Sep 10, 2024
@dan-menlo dan-menlo moved this from Need Investigation to Planning in Jan & Cortex Sep 10, 2024
@dan-menlo dan-menlo removed the status in Jan & Cortex Sep 10, 2024
@dan-menlo dan-menlo moved this to Planning in Jan & Cortex Sep 10, 2024
@imtuyethan imtuyethan added the category: providers Local & remote inference providers label Sep 18, 2024
@dan-menlo dan-menlo changed the title epic: Remote API Extension Revamp architecture: Remote API Extension Revamp Sep 27, 2024

louis-jan commented Sep 27, 2024

Separation of Concerns

  1. How does the model list work?
    • Remote extensions should auto-populate their models, i.e. via the provider's /models list endpoint.
    • We cannot build hundreds of model.json files manually.
    • The current extension framework is actually designed to handle this; it's just an implementation issue in the extensions, which can be improved.
    • There was a hacky UI implementation where we pre-populated models, then disabled all of them until the API key was set. That behavior should be part of the extension, not the Jan app.
    • Extension builders still ship default available models. We don't close the door; we improve the example.
    // Before
    override async onLoad(): Promise<void> {
      super.onLoad()
      // Register Settings (API Key, Endpoints)
      this.registerSettings(SETTINGS)

      // Pre-populate models - persist model.json files
      // MODELS are model.json files that come with the extension.
      this.registerModels(MODELS)
    }

    // After
    override async onLoad(): Promise<void> {
      super.onLoad()
      // Register Settings (API Key, Endpoints)
      this.registerSettings(SETTINGS)

      // Fetch models from the provider's models endpoint - just a simple fetch
      // Defaults to `/models`
      get('/models').then((models) => {
        // Model builder will construct the model template (aka preset)
        // This operation builds Model DTOs that work with the app.
        this.registerModels(this.modelBuilder.build(models))
      })
    }
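For illustration, the model-builder step could look something like the sketch below. This is a hypothetical reading of the design, not the actual Jan framework API: ProviderModel, ModelDTO, and buildModels are illustrative names.

```typescript
// Hypothetical sketch of the model-builder step: map a raw /models
// response onto the DTO shape the app consumes, applying one shared
// parameter preset. All names here are illustrative.
interface ProviderModel {
  id: string
}

interface ModelDTO {
  id: string
  name: string
  engine: string
  parameters: Record<string, unknown>
}

function buildModels(
  raw: ProviderModel[],
  engine: string,
  preset: Record<string, unknown> = { max_tokens: 4096, temperature: 0.7 }
): ModelDTO[] {
  return raw.map((m) => ({
    id: m.id,
    name: m.id, // display name defaults to the provider's model id
    engine,
    parameters: { ...preset }, // copy so models don't share one object
  }))
}
```

The key point is that no per-model model.json ships with the extension; the preset supplies shared defaults and the provider's list supplies the ids.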
Remote Provider Extension diagram (Draw.io): https://drive.google.com/file/d/1pl9WjCzKl519keva85aHqUhx2u0onVf4/view?usp=sharing
  1. Supported parameters?
    • Each provider works with different parameters, but they all share the same basic function as the ones currently defined.
    • We already support transformPayload and transformResponse to adapt to these cases.
    • So users still see consistent parameters from model to model, but the magic happens behind the scenes, where the transformations are handled under the hood.
    /**
     * transformPayload Example
     * Transform the payload before sending it to the inference endpoint.
     * The new preview models such as o1-mini and o1-preview replaced the
     * max_tokens parameter with max_completion_tokens. Others did not.
     */
    transformPayload = (payload: OpenAIPayloadType): OpenAIPayloadType => {
      // Transform the payload for preview models
      if (this.previewModels.includes(payload.model)) {
        const { max_tokens, ...params } = payload
        return { ...params, max_completion_tokens: max_tokens }
      }
      // Pass through for official models
      return payload
    }
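The transformResponse counterpart is mentioned above but not shown; here is a hedged sketch. The incoming `output_text` field is a hypothetical non-OpenAI provider shape chosen for illustration, not a real provider's API.

```typescript
// Sketch of the transformResponse hook: normalize a provider-specific
// response shape into the OpenAI-style one the app reads.
// The `output_text` field is a hypothetical provider shape.
type ChatChoice = { message: { role: string; content: string } }
type NormalizedResponse = { choices: ChatChoice[] }
type RawResponse = Partial<NormalizedResponse> & { output_text?: string }

const transformResponse = (response: RawResponse): NormalizedResponse => {
  // Normalize providers that return a bare output_text field
  if (response.output_text !== undefined && !response.choices) {
    return {
      choices: [
        { message: { role: 'assistant', content: response.output_text } },
      ],
    }
  }
  // OpenAI-compatible responses pass through untouched
  return { choices: response.choices ?? [] }
}
```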
  2. Decoration?
    {
      "name": "openai-extension",
      "displayName": "OpenAI Extension Provider",
      "icon": "https://openai.com/logo.png"
    }
  3. Just remove the hacky parts from Jan.
  • Model Dropdown: it currently checks whether the engine is nitro or not to split the local and cloud sections, so a new local engine (e.g. cortex.cpp) would be treated as a remote engine. -> Filter by extension type instead (class name or type, e.g. LocalOAIEngine vs RemoteOAIEngine).
  • All models from a cloud provider are disabled by default if no API key is set. But what if I use a self-hosted endpoint without API key restrictions? Model availability should be determined by the extension: when there are no credentials meeting the requirements, the result is an empty section, indicating no available models. When users input the API key on the extension's settings page, the extension fetches the model list automatically and caches it. Users can also refresh the model list from there (we should not fetch too often - we are building a local-first application).
  • Application settings can be a bit confusing, with Model Providers and Core Extensions listed separately. Where do other extensions fit in?
Extension settings do not have a community or "others" section
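The type-based dropdown filter suggested above could be sketched as follows. Only the LocalOAIEngine/RemoteOAIEngine names come from this thread; the class bodies and partitionEngines helper are illustrative stubs.

```typescript
// Sketch: partition engines by base class instead of matching the
// engine name string "nitro". Class bodies are illustrative stubs.
abstract class OAIEngine {}
class LocalOAIEngine extends OAIEngine {}
class RemoteOAIEngine extends OAIEngine {}

function partitionEngines(engines: OAIEngine[]) {
  return {
    // Local section: any engine extending LocalOAIEngine (e.g. cortex.cpp)
    local: engines.filter((e) => e instanceof LocalOAIEngine),
    // Cloud section: remote provider extensions
    remote: engines.filter((e) => e instanceof RemoteOAIEngine),
  }
}
```

With this, a newly added local engine lands in the local section automatically, with no string matching in the app.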
  1. Extension installation should be a straightforward process that requires minimal effort.
  • There is no official way to install an extension from a GitHub repository URL, and users typically don't know how to package and install software from source.
  • There should be a shortcut on the settings page that lets users input the URL, pops up the extension repository details, and then installs from there.
  2. It would be helpful to provide a list of community extensions, allowing users to easily find the right extension for their specific use case without having to search.

dan-menlo commented:

Idea from @norrybul: janhq/models#23 (comment)

louis-jan commented:

> Idea from @norrybul: janhq/models#23 (comment)

Hi @dan-homebrew, that's what we initially thought we should do, but there are a couple of problems, so we've pushed back the Custom OAI Extension:

  1. Limitations of UI support in extensions.
  2. Model pre-population would establish a 1-1 mapping between model.json and extension settings. Once the Model Cache work is complete, the extension will no longer rely on model.json.
  3. Why not use the existing extensions instead, e.g. OpenAI, OpenRouter...?
  4. Could it serve as a good example of a community extension?

@dan-menlo dan-menlo moved this from Scheduled to Investigating in Jan & Cortex Sep 29, 2024
@dan-menlo dan-menlo changed the title architecture: Remote API Extension Revamp architecture: Remote API Extension Oct 13, 2024

freelerobot commented Oct 14, 2024

Using /models to auto-populate available remote models.

✅ This is a great idea

Let's account for:

  • A remote inference provider may have a differently named /models endpoint, e.g. /v1/models or /v2/models (silly example, but the path should be flexible and easy to configure)
  • A remote inference provider may not have a models-list endpoint at all. Dumb idea: can we ask the extension builder to just provide a hardcoded JSON file with what the endpoint would otherwise return? E.g. we expect to read a local models.json file.
  • I actually don't think the edge case where a user wants to add additional models when there is already a /models endpoint is that common - do we need to handle this?
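The first two points above could be covered by making the models source a configurable path with a bundled fallback. A sketch under assumed names (listModels and the injected fetchJson are hypothetical, not an existing API):

```typescript
// Sketch: the models source is a configurable endpoint path with a
// bundled JSON fallback for providers without a list endpoint.
type ModelEntry = { id: string }

async function listModels(
  fetchJson: (path: string) => Promise<ModelEntry[]>,
  opts: { modelsPath?: string; fallback?: ModelEntry[] } = {}
): Promise<ModelEntry[]> {
  // Path is configurable: /models, /v1/models, /v2/models, ...
  const path = opts.modelsPath ?? '/models'
  try {
    return await fetchJson(path)
  } catch {
    // No list endpoint: fall back to the extension's bundled models.json
    return opts.fallback ?? []
  }
}
```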

Right Panel: Inference Parameters UI Extensions

Inference parameters will vary across APIs.

Extensions DevEx

@louis-jan what's the extensions DevEx? Can you provide a full example for adding OpenAI?
Flows:

  • User registers a new remote provider endpoint
  • User configures models list (or just uses default)
  • User configures settings parameters (or just uses default)
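The flows above would be driven by what the extension registers as settings. A hypothetical SETTINGS fragment for illustration (keys, controllerType values, and overall shape are assumptions, not the real framework schema):

```typescript
// Hypothetical SETTINGS fragment for the flows above: credentials, an
// endpoint the user can point at any provider, and an editable models
// list. Keys and shape are illustrative, not the real framework schema.
const SETTINGS = [
  {
    key: 'api-key',
    title: 'API Key',
    controllerType: 'input',
    controllerProps: { value: '', type: 'password' },
  },
  {
    key: 'chat-completions-endpoint',
    title: 'Chat Completions Endpoint',
    controllerType: 'input',
    controllerProps: { value: 'https://api.openai.com/v1/chat/completions' },
  },
  {
    key: 'models-list',
    title: 'Models',
    controllerType: 'input',
    controllerProps: { value: 'gpt-4o,gpt-4o-mini' }, // comma-separated override
  },
]
```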

Extensions Hub

I like that we're thinking about how to showcase and list available community extensions.
I think we should take it out of the scope of this particular epic.
I've created a separate epic for us to think through this here: #3788
For the scope of this epic, let's assume we will package extensions into a monorepo


louis-jan commented Oct 14, 2024

Hey @0xSage, an extension-builder can offer a JSON, then call registerModels from the extension. This JSON can either be a local file or the result of a remote fetch. We support both. So it's not restricted to the /models path, but rather the fetch action.

We don't offer an API for users to register a new remote provider endpoint via extension code, as it's merely a utility or a small example they can copy over from our provided examples. The extension framework is designed to supply APIs that facilitate the interaction of extensions with the application.

I already added examples of:

  1. Extension builders can register models from their model sources, which can be either a JSON file or the result of a fetch operation.
  2. Extension builders can transform setting parameters - the setting interfaces stay the same, but the payload is transformed under the hood.

#3505 (comment)

// Register models with a JSON file
override async onLoad(): Promise<void> {
  super.onLoad()
  // Register Settings (API Key, Endpoints)
  this.registerSettings(SETTINGS)

  // Pre-populate models - persist model.json files
  // MODELS are model.json files that come with the extension.
  this.registerModels(MODELS)
}

// Register models with the provider's models list endpoint
override async onLoad(): Promise<void> {
  super.onLoad()
  // Register Settings (API Key, Endpoints)
  this.registerSettings(SETTINGS)

  // Fetch models from the provider's models endpoint - just a simple fetch
  // Defaults to `/models`
  get('/models').then((models) => {
    // Model builder will construct the model template (aka preset)
    // This operation builds Model DTOs that work with the app.
    // They can transform model parameters right here for supported settings
    this.registerModels(this.modelBuilder.build(models))
  })
}

/**
 * transformPayload Example
 * Transform the payload before sending it to the inference endpoint.
 * The new preview models such as o1-mini and o1-preview replaced the
 * max_tokens parameter with max_completion_tokens. Others did not.
 */
transformPayload = (payload: OpenAIPayloadType): OpenAIPayloadType => {
  // Transform the payload for preview models
  if (this.previewModels.includes(payload.model)) {
    const { max_tokens, ...params } = payload
    return { ...params, max_completion_tokens: max_tokens }
  }
  // Pass through for official models
  return payload
}
