This tool is designed to extract spatial entities and scales from unstructured text input, using a
- 🧠 LLM (Large Language Model) and a
- 🌐 Geocoder: the Open Source Photon geocoding service provided by Komoot that is based on OpenStreetMap (OSM).
The SSP tool outputs location names, types, and extents based on the user's query.
- We use an LLM to extract spatial entities from a text input, which may include one or multiple entities. Additionally, if feasible, we determine the spatial scale of these entities, categorized as 'Local', 'City', 'Regional', 'National', 'Continental', or 'Global'.
- Then, we use the Photon service to retrieve a list of candidates for the extracted entities. The candidate list includes a bounding box for each candidate. The tool uses aiohttp for asynchronous API calls to speed up these requests.
- Finally, we pass the initial query along with the candidate list as context to an LLM and let it determine which candidate is most relevant to the initial query.
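The concurrent lookup in the second step can be sketched as follows. This is a minimal illustration, not the tool's actual code: `fetch_all_candidates`, `Fetcher`, and `fake_fetch` are hypothetical names, and the fetcher is a stub so the example runs without network access. In practice the stub would be replaced by a coroutine that sends an aiohttp request to Photon.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

# Hypothetical type: a coroutine that takes an entity name and returns
# a list of candidate dicts (for example, by wrapping an aiohttp request).
Fetcher = Callable[[str], Awaitable[List[dict]]]


async def fetch_all_candidates(entities: List[str], fetch: Fetcher) -> Dict[str, List[dict]]:
    """Look up all entities concurrently instead of one after another."""
    results = await asyncio.gather(*(fetch(name) for name in entities))
    # asyncio.gather returns results in the order of its arguments.
    return dict(zip(entities, results))


async def fake_fetch(name: str) -> List[dict]:
    """Stand-in for a real geocoder request; returns canned data."""
    await asyncio.sleep(0.01)  # simulate network latency
    return [{"name": name, "extent": None}]


candidates = asyncio.run(fetch_all_candidates(["Berlin", "Hamburg"], fake_fetch))
print(candidates)
```

Because the requests overlap, the total wait is roughly that of the slowest single request rather than the sum of all of them.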
- Clone the repository:
git clone https://github.com/simeonwetzel/SSP.git
- Install the required dependencies: pip install -r requirements.txt
To use this tool, you need to connect any Large Language Model (LLM) compatible with the LangChain library (see a list of available LLMs here). Below is an example of how to connect an LLM using either the OpenAI or the Groq API.
☝️ Please note that you have to install the respective package when using another model (e.g. pip install langchain-cohere for a Cohere model).
import os
from langchain.llms import OpenAI
os.environ["OPENAI_API_KEY"] = "<YOUR_OPENAI_API_KEY>"
# Initialize the LLM
llm = OpenAI(temperature=0)
import os
from langchain_groq import ChatGroq
os.environ["GROQ_API_KEY"] = "<YOUR_GROQ_API_KEY>"
# Initialize the LLM with Groq's Llama3 model
llm = ChatGroq(
    model="llama3-70b-8192",
    temperature=0,
    max_tokens=None,
    timeout=None,
    max_retries=2,
)
Once you have initialized your LLM, you can use the Spatial Scope Parser (SSP) tool to parse spatial queries from user input.
from ssp import SSP
# Initialize the SSP tool with your chosen LLM
parser = SSP(llm)
To extract spatial entities and scale from a user query, use the parse_spatial_scope method:
user_query = "I need data for Berlin"
response = parser.parse_spatial_scope(user_query)
print(response)
{
"name": "Berlin",
"country": "Deutschland",
"type": "city",
"extent": [13.088345, 52.6755087, 13.7611609, 52.3382448]
}
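Many GIS libraries expect bounding boxes as (min_lon, min_lat, max_lon, max_lat), whereas the extent above appears to be ordered west, north, east, south. That ordering is inferred from the sample values (the second number is the larger latitude) and should be treated as an assumption. A small, hypothetical helper `extent_to_bbox` could reorder it:

```python
def extent_to_bbox(extent):
    """Reorder [west, north, east, south] to (min_lon, min_lat, max_lon, max_lat).

    Assumes the input ordering seen in the sample output above.
    """
    west, north, east, south = extent
    return (west, south, east, north)


berlin_extent = [13.088345, 52.6755087, 13.7611609, 52.3382448]
bbox = extent_to_bbox(berlin_extent)
print(bbox)  # (13.088345, 52.3382448, 13.7611609, 52.6755087)
```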
The tool performs two key functions:
- Extract Spatial Entity & Scale: Extract spatial entities such as city, country, or region, and determine the spatial scale (e.g., "Local", "City", "National").
- Query OSM for Coordinates: Once a spatial entity is identified, the tool queries OpenStreetMap (OSM) data through Photon to find the corresponding geographic details (name, country, type, extent).
You can also query OSM directly using the search_with_osm_query method if you already know the spatial entity:
query_dict = {'spatial': 'Berlin', 'scale': 'City'}
results = parser.search_with_osm_query(query_dict)
print(results)
{
"original_query": "Berlin",
"scale": "City",
"results": {
"name": "Berlin",
"country": "Deutschland",
"type": "city",
"extent": [13.088345, 52.6755087, 13.7611609, 52.3382448]
}
}
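When building the query dictionary by hand, it may help to check the scale against the six categories listed earlier before calling the parser. The sketch below is an illustration only: `make_query_dict` and `ALLOWED_SCALES` are hypothetical names, and whether SSP itself validates the scale is not stated here.

```python
# The six scale categories named in this README.
ALLOWED_SCALES = ("Local", "City", "Regional", "National", "Continental", "Global")


def make_query_dict(spatial: str, scale: str) -> dict:
    """Build a query dict in the shape used above, rejecting unknown scales."""
    if scale not in ALLOWED_SCALES:
        raise ValueError(f"Unknown scale {scale!r}; expected one of {ALLOWED_SCALES}")
    return {"spatial": spatial, "scale": scale}


query_dict = make_query_dict("Berlin", "City")
print(query_dict)  # {'spatial': 'Berlin', 'scale': 'City'}
```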
One key limitation of this tool is its dependency on the Photon geocoding service provided by Komoot. In our tests, we have observed that most of the latency comes from querying this API.
While Photon is an open-source service, the performance of the tool may vary based on the availability and response times of the public API.
To have more control over the geocoding service and potentially improve latency, you can host the Photon API on your own resources. Since Photon is open source, you can deploy it locally or on a dedicated server. You can find detailed instructions and the code base in the Photon GitHub repository.
By hosting Photon yourself, you can:
- Optimize the server to meet your performance needs.
- Ensure availability and control over the service.
- Adjust the settings to fit your specific geocoding requirements.
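Switching between the public service and a self-hosted instance mostly comes down to changing the base URL of the search request. The sketch below assumes Photon's `/api` search endpoint with `q` and `limit` parameters and assumes port 2322 for a local instance; `photon_search_url` is a hypothetical helper, and how SSP itself is pointed at a different host is not covered here.

```python
from urllib.parse import urlencode

PUBLIC_PHOTON = "https://photon.komoot.io"
LOCAL_PHOTON = "http://localhost:2322"  # assumed port for a self-hosted instance


def photon_search_url(query: str, base_url: str = PUBLIC_PHOTON, limit: int = 5) -> str:
    """Build a Photon search URL for the given base URL."""
    params = urlencode({"q": query, "limit": limit})
    return f"{base_url.rstrip('/')}/api?{params}"


local_url = photon_search_url("Berlin", base_url=LOCAL_PHOTON)
print(local_url)  # http://localhost:2322/api?q=Berlin&limit=5
```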
- Geospatial Data Applications: Extract location-based information for mapping and data visualization.
- Spatial Search: Use the output of the parser as a search filter when searching for (meta)datasets.
- Natural Language Interfaces: Use the tool to build chatbots or assistant applications that understand spatial queries.
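As an illustration of the spatial search use case, the sketch below filters a small catalogue by bounding-box intersection. The dataset records are invented for the example, `bboxes_intersect` is a hypothetical helper, and the query box is the Berlin extent from the example output rewritten as (min_lon, min_lat, max_lon, max_lat). Boxes that cross the antimeridian are not handled.

```python
def bboxes_intersect(a, b):
    """Return True if two (min_lon, min_lat, max_lon, max_lat) boxes overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


# Invented catalogue entries, for illustration only.
datasets = [
    {"title": "Berlin noise map", "bbox": (13.1, 52.4, 13.7, 52.6)},
    {"title": "Munich tree cadastre", "bbox": (11.4, 48.0, 11.7, 48.3)},
]

query_bbox = (13.088345, 52.3382448, 13.7611609, 52.6755087)
matches = [d["title"] for d in datasets if bboxes_intersect(d["bbox"], query_bbox)]
print(matches)  # ['Berlin noise map']
```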
Feel free to customize the LLM settings and query parsing logic to suit your specific use case.