feat: Jan Integrates Cortex.cpp as Provider #3821

louis-jan · 2024-10-17T02:50:04Z

Implementation Specs

Migration Path:

App 0.5.8 opens
Return the model list from cache (for users already on 0.5.7) -> functions normally.
Scan JSON models (legacy logic, for fresh installs or older versions) -> functions normally.
In the background, the app attempts to import models into cortex.cpp and merges the result with legacy downloaded models (covers models that failed to import).
The app combines models returned by cortex.cpp with legacy JSON models. Cortex.cpp models take priority when two models share the same ID (covers models that were imported successfully).
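
The merge rule in the last step can be sketched as follows. This is a minimal illustration, not the code from this PR: `ModelSummary` and `mergeModels` are hypothetical names, and the real model type carries many more fields.

```typescript
// Hypothetical, trimmed-down model shape used only for this sketch.
interface ModelSummary {
  id: string
  name: string
  source: 'cortex' | 'legacy'
}

// Combine both lists; when IDs collide, the cortex.cpp entry wins.
function mergeModels(
  cortexModels: ModelSummary[],
  legacyModels: ModelSummary[]
): ModelSummary[] {
  const byId: Record<string, ModelSummary> = {}
  // Insert legacy models first so cortex.cpp models can overwrite them.
  for (const model of legacyModels) byId[model.id] = model
  for (const model of cortexModels) byId[model.id] = model
  return Object.keys(byId).map((id) => byId[id])
}
```

A legacy model that failed to import keeps its legacy entry, so it stays visible in the list either way.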

Changes

Naming convention

inference-nitro-extension is renamed to inference-cortex-extension.
cortex.cpp binaries have the same names as the engine releases.
Pre-package everything, including CUDA dependencies (.dll, .so), so users don't have to install them separately.
Support noavx-cuda binaries as a fallback.

Simplified

ModelFile is deprecated; it is no longer relevant. Providers now define models, so each provider should manage how its own models run.
Remove the CUDA toolkit installation UX; the app should be ready to use once installed.

Downloader

The app proxies downloads to cortex.cpp or to the app's own downloader, depending on whether cortex.cpp supports the model.
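
A rough sketch of that routing decision, under assumptions: `DownloadRequest`, its `format` field, and `chooseDownloader` are hypothetical names, and the list of supported formats stands in for whatever capability check the extension actually performs.

```typescript
type DownloadRoute = 'cortex' | 'app'

// Hypothetical request shape; `format` is an assumed field for illustration.
interface DownloadRequest {
  modelId: string
  format: string
}

// Route to cortex.cpp when it supports the model, otherwise fall back
// to the app's own downloader.
function chooseDownloader(
  request: DownloadRequest,
  cortexSupportedFormats: string[]
): DownloadRoute {
  return cortexSupportedFormats.indexOf(request.format) >= 0 ? 'cortex' : 'app'
}
```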

Model Hub

The app allows extensions to register, in memory, the models available for download. Once a model is downloaded, its YAML or JSON is persisted alongside the model files.
The app prioritizes model hub decoration (the previous JSON metadata) over cortex.cpp metadata (such as name, size, tags).
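
The metadata priority can be sketched like this. `ModelMetadata`, `preferHub`, and `decorate` are hypothetical names; only the three fields named above are modelled.

```typescript
// Hypothetical metadata shape covering only the fields named in the spec.
interface ModelMetadata {
  name?: string
  size?: number
  tags?: string[]
}

// Take the hub value when it is defined, otherwise the cortex.cpp value.
function preferHub<T>(hub: T | undefined, cortex: T | undefined): T | undefined {
  return hub !== undefined ? hub : cortex
}

// Model hub decoration overrides cortex.cpp metadata field by field.
function decorate(cortexMeta: ModelMetadata, hubMeta: ModelMetadata): ModelMetadata {
  return {
    name: preferHub(hubMeta.name, cortexMeta.name),
    size: preferHub(hubMeta.size, cortexMeta.size),
    tags: preferHub(hubMeta.tags, cortexMeta.tags),
  }
}
```

Fields missing from the hub decoration fall through to the cortex.cpp values, so a partially decorated model still shows complete metadata.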

Observability

cortex-extension should watch the cortex.cpp server upon launch, ensuring that the cortex process runs alongside the application.
All requests are queued and run once the server comes online, so the UX remains the same. No race between asynchronous requests and server startup is introduced; e.g. a model import or start should not fail because the server was not online in time.
There is also no attempt to kill the cortex process on every model start. It is just a model stop and start, so it does not block other API requests.
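
One way to picture the queueing behaviour is the sketch below. It is an assumption-laden illustration, not the extension's implementation: `PendingRequestQueue`, `enqueue`, and `markOnline` are hypothetical names, and server health detection is left out.

```typescript
// Holds requests until the server is reported online, then runs them in order.
class PendingRequestQueue {
  private online = false
  private pending: Array<() => void> = []

  // Returns a promise that settles with the request's result once it has run.
  enqueue<T>(request: () => Promise<T>): Promise<T> {
    return new Promise<T>((resolve, reject) => {
      const run = () => {
        request().then(resolve, reject)
      }
      if (this.online) run()
      else this.pending.push(run)
    })
  }

  // Called once the watcher sees the server respond; flushes queued requests.
  markOnline(): void {
    this.online = true
    const toRun = this.pending
    this.pending = []
    for (const run of toRun) run()
  }
}
```

With this shape, a model import issued before the server is ready simply waits and is started when `markOnline()` is called, instead of failing.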

Goals

When updating from an older version to this version, models are imported and run normally. Models that were not imported can still run, since a preflight is attempted before running.
Users can download models; the app proxies to cortex.cpp or uses the app downloader, depending on the cortex.cpp model support capability.

Subtasks

@janhq/jan @janhq/cortex

#3825

github-actions · 2024-10-17T02:52:53Z

Preview URL: https://5239fa1d.docs-9ba.pages.dev

louis-jan · 2024-10-21T05:40:14Z

Update scenario

I have an old Jan version with downloaded models.
After updating to the newer Jan version, my models are maintained without delay or concern about cortex server corruption.
I can load models from older versions.
I can download new models using the new Jan version.
When I switch back to an old Jan version, I still see the old downloaded models (but not models downloaded on the newer version).
When I switch back to the new Jan version, all downloaded models are visible.

core/src/types/model/modelEntity.ts

louis-jan · 2024-10-21T09:18:36Z

Rebased dev

github-actions · 2024-10-21T10:33:29Z

Barecheck - Code coverage report

Total: 69.62%

Your code coverage diff: 0.18% ▴

Uncovered files and lines

File	Lines
core/src/browser/extension.ts	105, 113-114, 116, 126-127, 145-146, 148-150, 156, 200, 202, 209-213, 215-216
core/src/browser/extensions/model.ts	12
core/src/browser/extensions/engines/EngineManager.ts	38
core/src/browser/extensions/engines/LocalOAIEngine.ts	70-71
core/src/browser/extensions/engines/OAIEngine.ts	59, 71, 108, 139-141, 169
core/src/browser/extensions/engines/helpers/sse.ts	37-38, 72-73, 80, 86, 93
core/src/browser/models/manager.ts	8, 11-12, 21-22, 27, 29, 38, 45
core/src/browser/models/utils.ts	90, 111, 130, 132-135, 137-138, 144, 152
core/src/node/api/processors/download.ts	18-20, 68, 73-74, 80-82, 97-98, 104-105, 110-111, 153-154
core/src/node/api/restful/helper/builder.ts	45, 52-53, 61, 100, 116-117, 131-132, 175, 205, 250, 271, 284, 287, 290, 326-327, 335-336, 341, 346
core/src/node/api/restful/helper/startStopModel.ts	21

louis-jan · 2024-10-22T09:19:22Z

Jan's API Server works with Cortex.cpp. Later, proxy everything.

dan-menlo · 2024-10-22T12:59:28Z

Nice 👀

Barecheck - Code coverage report

Total: 69.88%
Your code coverage diff: 0.37% ▴

Uncovered files and lines

louis-jan · 2024-10-24T08:16:03Z

Jan can handle multimodal model downloads even when cortex.cpp does not support them.

Jan can manage its own downloaded models alongside those under cortex.cpp /models.

LLaVA 7B is downloaded by Jan and works with the legacy model.json; other models are handled by cortex.cpp.

louis-jan · 2024-11-05T02:20:24Z

Feature test build
https://github.com/janhq/jan/actions/runs/11675880590

dan-menlo

🙏 lgtm

github-actions bot assigned louis-jan Oct 17, 2024

github-actions bot deployed to docs (Preview) October 17, 2024 02:52 View deployment

github-actions bot deployed to docs (Preview) October 17, 2024 03:10 View deployment

louis-jan changed the title ~~[WIP] feat: model and cortex extensions update - path to new cortex.cpp~~ [WIP] feat: Jan Integrates Cortex.cpp as Provider Oct 17, 2024

louis-jan marked this pull request as draft October 17, 2024 07:02

github-actions bot deployed to docs (Preview) October 21, 2024 05:21 View deployment

louis-jan commented Oct 21, 2024

core/src/types/model/modelEntity.ts (outdated, resolved)

louis-jan force-pushed the feat/path-to-cortexcpp branch from 96e3919 to 3156e8a Compare October 21, 2024 09:18

github-actions bot deployed to docs (Preview) October 21, 2024 09:20 View deployment

github-actions bot deployed to docs (Preview) October 21, 2024 10:14 View deployment

github-actions bot deployed to docs (Preview) October 21, 2024 10:24 View deployment

github-actions bot deployed to docs (Preview) October 21, 2024 11:34 View deployment

github-actions bot deployed to docs (Preview) October 21, 2024 11:57 View deployment

github-actions bot deployed to docs (Preview) October 21, 2024 14:18 View deployment

github-actions bot deployed to docs (Preview) October 21, 2024 14:45 View deployment

louis-jan changed the title ~~[WIP] feat: Jan Integrates Cortex.cpp as Provider~~ feat: Jan Integrates Cortex.cpp as Provider Oct 22, 2024

louis-jan force-pushed the feat/path-to-cortexcpp branch from eebcf07 to 26a3405 Compare October 22, 2024 10:09

louis-jan force-pushed the feat/path-to-cortexcpp branch from 97f87e8 to d235e88 Compare October 24, 2024 08:23

louis-jan marked this pull request as ready for review October 29, 2024 07:21

louis-jan requested a review from a team October 29, 2024 07:21

louis-jan temporarily deployed to production October 29, 2024 07:22 — with GitHub Actions Inactive

louis-jan added 5 commits November 4, 2024 15:37

fix: inconsistent models from dropdown and hub

2a0d87a

fix: unlink the entire model folder on delete

5ddbf5f

chore: update electron notarize version

d0ffe6c

chore: decide model name on pull and import

a986c6d

chore: model id is optional on import

b913af9

louis-jan force-pushed the feat/path-to-cortexcpp branch from 2cb006c to b913af9 Compare November 4, 2024 08:37

louis-jan added 2 commits November 4, 2024 20:36

chore: new cortex-cpp binary - model import option and model size

46d5faf

test: correct tests

d2fa38f

louis-jan temporarily deployed to production November 5, 2024 01:34 — with GitHub Actions Inactive

This was referenced Nov 5, 2024

bug: TensorRT Extension Setup Fails #3894

Closed

bug: No model can start because of macOS 12 Incompatibility issue #3898

Closed

chore: fix model ID display in my models

2c8c76a

louis-jan temporarily deployed to production November 5, 2024 07:37 — with GitHub Actions Inactive

louis-jan temporarily deployed to production November 5, 2024 07:38 — with GitHub Actions Inactive

dan-menlo approved these changes Nov 5, 2024

louis-jan added 6 commits November 6, 2024 09:20

fix: 3911 - inconsistent between download progress and model hub

964269d

Merge pull request #3956 from janhq/fix/3911-inconsistent-model-hub-a…

6efc327

…nd-download-bar fix: Inconsistent model hub and download bar

Merge branch 'dev' into feat/path-to-cortexcpp

24b7d64

chore: clean dangling process on exit and relaunch

56e35df

Merge pull request #3960 from janhq/fix/dangling-process-on-reset

d0eb91f

chore: clean dangling process on exit and relaunch

Merge branch 'dev' into feat/path-to-cortexcpp

c92b809

louis-jan merged commit a82c701 into dev Nov 6, 2024
9 checks passed

louis-jan deleted the feat/path-to-cortexcpp branch November 6, 2024 08:45

github-actions bot added this to the v0.5.8 milestone Nov 6, 2024
