Use in-memory database with shared cache #719

Merged
merged 16 commits into from
Nov 5, 2023
Changes from 7 commits
1 change: 1 addition & 0 deletions CHANGELOG.md
@@ -2,6 +2,7 @@

 - #710, #561 Implement `except*` syntax (@lieryan)
 - #711 allow building documentation without having rope module installed (@kloczek)
+- #719 Allow in-memory databases being shared across threads (@tkrabel)
lieryan marked this conversation as resolved.

# Release 1.10.0

9 changes: 5 additions & 4 deletions rope/contrib/autoimport/sqlite.py
@@ -151,12 +151,13 @@ def create_database_connection(
         """
         if not memory and project is None:
             raise Exception("if memory=False, project must be provided")
-        db_path: str
         if memory or project is None or project.ropefolder is None:
-            db_path = ":memory:"
+            # Allows the in-memory db to be shared across threads
+            # See https://www.sqlite.org/inmemorydb.html
+            project_hash = hash(project and project.ropefolder and project.ropefolder.real_path)
Member

Can we avoid using the built-in hash() function here and use one of the standard hashes in hashlib instead? We don't really need the strong security properties of a cryptographic hash, but hash() is tied to the implementation of dict/set rather than being a general-purpose hash function.
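For context on this request: the built-in `hash()` of a `str` is randomized per interpreter process unless `PYTHONHASHSEED` is fixed, whereas a `hashlib` digest is identical everywhere. A minimal sketch of what the `hashlib` variant could look like follows; `stable_name` and the example path are hypothetical, not part of this pull request.

```python
import hashlib


def stable_name(path: str) -> str:
    """Return a short hex digest of ``path`` that is the same in every process.

    Hypothetical helper: sha256 is chosen for stability, not for security,
    and the digest is truncated only to keep the database name short.
    """
    return hashlib.sha256(path.encode("utf-8")).hexdigest()[:16]


# The path below is an illustrative placeholder.
name = stable_name("/home/user/project/.ropeproject")
print(f"file:memdb{name}?mode=memory&cache=shared")
```

Because an in-memory database only lives inside one process, per-process stability is enough for correctness in either variant; the digest mainly makes the name predictable and independent of interpreter internals.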

Contributor Author

Please note that I am essentially hashing the string returned by project.ropefolder.real_path. The extra checks are a fail-safe: it should not happen that project is not None while project.ropefolder is None (source).

If you want, I can make the distinction clearer with an explicit if condition:

Suggested change
- project_hash = hash(project and project.ropefolder and project.ropefolder.real_path)
+ project_hash: int
+ if project is None or project.ropefolder is None:
+     project_hash = hash(None)
+ else:
+     project_hash = hash(project.ropefolder.real_path)

I added a unit test that covers what I think is the most common use case: two different projects get two different in-memory databases, while connections for the same project share one database.

I find hashlib overkill here. It doesn't add much value, since we only ever pass None or a str to the hash.

Member

@lieryan lieryan Oct 31, 2023

IIRC, if project.ropefolder is None, that means .ropeproject is disabled for that project. In that case, I think we should hash the project's path instead of the ropefolder's path, so that connections for the same project still share an in-memory database. That is a change from the previous behavior, but I think it should be fine.

Also, this

project_hash = hash(None)

does not look right. It means that whenever no project is provided, every caller connects to the same in-memory database.

The expectation is that when no project is provided, a new, empty database is always created, so the right way to do this might be to generate a random hash, or to leave that case on an unnamed in-memory database. That said, IIRC, project = None is really only intended to be used in unit tests anyway.
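The two suggestions in this comment can be sketched together as follows. This is an illustration under stated assumptions, not the code that was merged: `memory_db_uri` is a hypothetical helper, the project path is passed in as a plain string, and `hashlib` is used for the named case.

```python
import hashlib
import secrets
import sqlite3
from typing import Optional


def memory_db_uri(project_path: Optional[str]) -> str:
    """Build a shared-cache in-memory URI (hypothetical helper).

    With a project path, the name is derived from the path, so every
    connection for that project reaches the same database. Without one,
    a random token is used, so each call gets a fresh, empty database.
    """
    if project_path is None:
        token = secrets.token_hex(8)  # random, same length as the digest below
    else:
        token = hashlib.sha256(project_path.encode("utf-8")).hexdigest()[:16]
    return f"file:memdb-{token}?mode=memory&cache=shared"


# Two connections opened without a project must not see each other's tables.
a = sqlite3.connect(memory_db_uri(None), uri=True)
b = sqlite3.connect(memory_db_uri(None), uri=True)
a.execute("CREATE TABLE t(x)")
tables_in_b = b.execute(
    "SELECT count(*) FROM sqlite_master WHERE name = 't'"
).fetchone()
print(tables_in_b)
```

The alternative mentioned above, connecting to plain `":memory:"` when no project is given, has the same isolation property, because every unnamed in-memory connection gets its own private database.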

Contributor Author

I also checked the code: project = None is not allowed when creating an AutoImport instance. It is allowed by create_database_connection, but we don't even use that in our tests. My feeling is to let everything catch fire if somebody calls create_database_connection without specifying a project, but I don't want to be a bad person, so I just create a random hash that has the same format as the regular hash :)

+            return sqlite3.connect(f"file:memdb{project_hash}:?mode=memory&cache=shared", uri=True)
         else:
-            db_path = str(Path(project.ropefolder.real_path) / "autoimport.db")
-        return sqlite3.connect(db_path)
+            return sqlite3.connect(str(Path(project.ropefolder.real_path) / "autoimport.db"))

     def _setup_db(self):
         models.Metadata.create_table(self.connection)
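The mechanism this hunk relies on can be tried in isolation with the standard library alone. The sketch below is independent of rope; the database name `memdb-example` is an arbitrary placeholder. It shows a second connection, opened in another thread with the same URI, reading data written by the first.

```python
import sqlite3
import threading

# Any two connections that use this exact URI share one in-memory database.
URI = "file:memdb-example?mode=memory&cache=shared"

# Keep this connection open while the data is needed: SQLite deletes a
# shared in-memory database when its last connection closes.
main = sqlite3.connect(URI, uri=True)
with main:  # commits on successful exit
    main.execute("CREATE TABLE shared(data)")
    main.execute("INSERT INTO shared VALUES (28)")

results = []


def reader() -> None:
    # Each thread opens its own connection to the same named database.
    conn = sqlite3.connect(URI, uri=True)
    try:
        results.append(conn.execute("SELECT data FROM shared").fetchone())
    finally:
        conn.close()


thread = threading.Thread(target=reader)
thread.start()
thread.join()
print(results)
```

By contrast, two connections opened with plain `":memory:"` each get a private database, which is why the previous code could not share data between threads.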
7 changes: 7 additions & 0 deletions ropetest/conftest.py
@@ -13,6 +13,13 @@ def project():
     testutils.remove_project(project)


+@pytest.fixture
+def project2():
+    project = testutils.sample_project("another_project")
+    yield project
+    testutils.remove_project(project)
+
+
 @pytest.fixture
 def project_path(project):
     yield pathlib.Path(project.address)
16 changes: 16 additions & 0 deletions ropetest/contrib/autoimport/autoimporttest.py
@@ -1,3 +1,5 @@
+import sqlite3
+
 from contextlib import closing, contextmanager
 from textwrap import dedent
 from unittest.mock import ANY, patch
@@ -26,6 +28,20 @@ def database_list(connection):
     return list(connection.execute("PRAGMA database_list"))


+def test_in_memory_database_share_cache(project, project2):
+    ai_1 = AutoImport(project, memory=True)
+    ai_2 = AutoImport(project, memory=True)
+
+    ai_3 = AutoImport(project2, memory=True)
+
+    with ai_1.connection:
+        ai_1.connection.execute("CREATE TABLE shared(data)")
+        ai_1.connection.execute("INSERT INTO shared VALUES(28)")
+    assert ai_2.connection.execute("SELECT data FROM shared").fetchone() == (28,)
+    with pytest.raises(sqlite3.OperationalError, match="no such table: shared"):
+        ai_3.connection.execute("SELECT data FROM shared").fetchone()
+
+
 def test_autoimport_connection_parameter_with_in_memory(
     project: Project,
     autoimport: AutoImport,