This repository contains work being developed within the scope of the RESCUE (RESilient Cloud for EUropE) project. The objective is to develop reusable, modular components that strengthen the reliability and recovery capabilities of (critical) digital services, and to pilot Cyber Resilient Digital Twins for data centers and edges that use open cloud infrastructure and are capable of hosting mission-critical applications at large scale.
This project implements an advanced automated system for fraud investigation and reporting using Large Language Models (LLMs) and machine learning techniques. The system is designed to process incidents, analyze logs, detect anomalies, generate comprehensive reports, and continuously improve its performance.
Below is a high-level overview of the Fraud Investigation System architecture:
```
+-------------------+      +------------------------+
|  Incident Input   |----->| Incident Understanding |
+-------------------+      +------------------------+
                                       |
                                       v
+-------------------+      +------------------------+
|  Knowledge Base   |<---->|     RAG Processing     |
+-------------------+      +------------------------+
                                       |
                                       v
+-------------------+      +------------------------+
|   Log Retrieval   |<---->|   Anomaly Detection    |
+-------------------+      +------------------------+
                                       |
                                       v
+-------------------+      +------------------------+
| Report Generation |<-----|    Output Interface    |
+-------------------+      +------------------------+
          |
          v
+-------------------+
|   Feedback Loop   |
+-------------------+
```
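The flow above can be sketched as a small asynchronous pipeline. This is an illustration only: the stage names follow the diagram, but the function bodies below are hypothetical placeholders, not the project's actual implementation in `src/`.

```python
import asyncio

# Hypothetical stage stubs mirroring the architecture diagram.
async def understand_incident(incident: dict) -> dict:
    # In the real system, LLM-based incident understanding happens here.
    return {**incident, "category": "suspected_fraud"}

async def retrieve_logs(context: dict) -> list[dict]:
    # In the real system, logs are fetched asynchronously (e.g. from Elasticsearch).
    return [{"event": "login", "user": context.get("user", "unknown")}]

async def detect_anomalies(logs: list[dict]) -> list[dict]:
    # In the real system, LLM-powered scoring plus statistical analysis.
    return [log for log in logs if log["event"] == "login"]

async def generate_report(anomalies: list[dict]) -> str:
    return f"{len(anomalies)} anomalous event(s) found"

async def investigate(incident: dict) -> str:
    # Stages run in sequence, each feeding the next, as in the diagram.
    context = await understand_incident(incident)
    logs = await retrieve_logs(context)
    anomalies = await detect_anomalies(logs)
    return await generate_report(anomalies)

print(asyncio.run(investigate({"id": "INC-1", "user": "alice"})))  # → 1 anomalous event(s) found
```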
- Automated incident understanding using LLMs ✅
- Dynamic API call generation for log retrieval 🔲
- Asynchronous log retrieval from multiple sources (for now: Elasticsearch) 🔲
- LLM-powered anomaly detection with statistical analysis ✅
- Automated report generation ✅
- Flexible output interface (email, file, API) 🔲
- Rate limiting and input validation ✅
- Caching and performance optimization 🔲
- Robust error handling and retrying mechanisms ✅
- Feedback loop for continuous improvement 🔲
- Plugin system for easy extension of functionality ✅
- Export of investigation results in various formats (JSON, CSV, XML, Excel) ✅
- External API for submitting incidents ✅
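As an illustration of the retrying mechanism listed above, a minimal exponential-backoff decorator might look like the following. This is a sketch only; the project's actual error-handling utilities may be structured differently.

```python
import functools
import time

def retry(attempts: int = 3, base_delay: float = 0.01):
    """Retry a callable with exponential backoff (illustrative sketch)."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(attempts):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == attempts - 1:
                        raise  # out of attempts: propagate the error
                    time.sleep(base_delay * 2 ** attempt)
        return wrapper
    return decorator

calls = []

@retry(attempts=3)
def flaky_fetch():
    # Fails twice, then succeeds — simulates a transient outage.
    calls.append(1)
    if len(calls) < 3:
        raise ConnectionError("transient failure")
    return "ok"

print(flaky_fetch())  # succeeds on the third attempt, prints "ok"
```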
The system consists of the following main components:
- Incident Input Interface
- Incident Understanding Module
- API Call Generator
- Log Retrieval Engine
- Anomaly Detection Module
- Report Generation Module
- Output Interface
- Plugin System
- Feedback Loop
- Performance Dashboard
- Result Exporter
- External API
These components are supported by utility modules for LLM integration, error handling, caching, performance optimization, input validation, and rate limiting.
- Clone the repository:

  ```bash
  git clone https://github.com/AmadeusITGroup/afir.git
  cd afir
  ```

- Create a virtual environment and activate it:

  ```bash
  python -m venv venv
  source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```
- Configure the system by editing the YAML files in the `config/` directory (templates are available in `config/templates/`):
  - `main_config.yaml`: Main system configuration
  - `llm_config.yaml`: LLM provider configuration
  - `logging_config.yaml`: Logging configuration
  - `plugin_config.yaml`: Plugin configuration
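For orientation, a `main_config.yaml` might look roughly like the fragment below. The keys shown are illustrative guesses only, not the actual schema — consult the templates in `config/templates/` for the authoritative structure.

```yaml
# Illustrative only; see config/templates/ for the real schema.
elasticsearch:
  host: localhost
  port: 9200
output:
  channels: [file, email]
rate_limiting:
  requests_per_minute: 60
```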
- Start the main system:

  ```bash
  python src/main.py
  ```

- Interact with the system using API calls as described in `docs/user_guide.md`.
- Create a new Python file in the `plugins/` directory.
- Implement your plugin logic and a `register_plugin()` function.
- Configure the plugin in the plugin configuration file.
- The plugin will be automatically loaded by the PluginManager.
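A minimal plugin might look like this. The `register_plugin()` contract assumed here (returning a name and a handler callable) is a guess — check the existing plugins and the plugin configuration file for the real convention expected by the PluginManager.

```python
# plugins/example_plugin.py
# Hypothetical sketch; the register_plugin() return shape assumed here
# may differ from the actual PluginManager contract.

def enrich_incident(incident: dict) -> dict:
    """Example plugin logic: tag incidents whose description mentions refunds."""
    text = incident.get("description", "").lower()
    incident["refund_related"] = "refund" in text
    return incident

def register_plugin():
    """Entry point discovered by the PluginManager (assumed convention)."""
    return {"name": "example_plugin", "handler": enrich_incident}
```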
Run the unit tests using:

```bash
python -m unittest discover tests
```
The system uses Python's built-in logging module. Log files are stored in the `logs/` directory. You can adjust the logging level and output format in the `logging_config.yaml` file.
Contributions are welcome! Please feel free to submit a Pull Request.