Merge pull request #1 from antjes88/feature/first_iteration

Feature/first iteration
antjes88 · Nov 10, 2023 · c1df518
2 parents 8b2b615 + d275dec
Showing 15 changed files with 892 additions and 1 deletion.
diff --git a/.coveragerc b/.coveragerc
@@ -0,0 +1,2 @@
+[run]
+omit = tests/*
diff --git a/.github/workflows/pytest.yaml b/.github/workflows/pytest.yaml
@@ -0,0 +1,35 @@
+name: Pytest
+
+on:
+  push:
+  pull_request:
+    branches:
+      - main
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ['3.10']
+      max-parallel: 1
+    env: # Repository secrets exposed to the job as environment variables
+      PROJECT: ${{ secrets.PROJECT }}
+      SOURCE_TABLE: ${{ secrets.SOURCE_TABLE }}
+      DESTINATION_TABLE: ${{ secrets.DESTINATION_TABLE }}
+      DATASET: ${{ secrets.DATASET }}
+      SA_JSON: ${{ secrets.SA_JSON }}
+    steps:
+      - uses: actions/checkout@v3
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v3
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          if [ -f cloud_function/requirements.txt ]; then pip install -r cloud_function/requirements.txt; fi
+          python -m pip install pytest==7.4.3
+          python -m pip install python-dotenv==0.14.0
+      - name: Test with pytest
+        run: |
+          python -m pytest -vv
diff --git a/README.md b/README.md
@@ -1,2 +1,74 @@
 # irr_calculator
-Python solution that calculates irr for a set of accounts
+Python solution that calculates Internal Rate of Return (IRR) for a set of entities.
+
+## Internal Rate of Return (IRR)
+
+Internal Rate of Return (IRR) is a financial metric that measures the profitability of an investment as the 
+discount rate at which the net present value (NPV) of the investment becomes zero. 
+In other words, IRR is the interest rate at which the present value of cash inflows equals the present 
+value of cash outflows. Its primary purpose is to assess the attractiveness of an investment opportunity.
+
+The IRR is derived from the NPV formula:
+
+$`NPV = \sum_{t=0}^T \frac{CF_t}{(1 + r)^t}`$
+
+Where:
+- $`NPV`$ is the net present value of cash inflows and outflows.
+- $`T`$ is the total number of periods.
+- $`CF_t`$ is the cash flow in period $`t`$.
+- $`r`$ is the discount rate (IRR).
+
+The IRR is found by solving the NPV equation for $`r`$ when $`NPV = 0`$. 
+
+## Terraform code
+The provided Terraform code automates the deployment of the Python IRR calculator as a 
+Cloud Function on Google Cloud Platform (GCP). It begins by configuring the necessary GCP provider settings, 
+such as the project ID and region. The code then creates a service account for the Cloud Function, 
+assigning it specific roles for interacting with BigQuery. 
+It also establishes a Cloud Storage bucket to store the Cloud Function's source code, archives the source code, 
+and uploads it to the designated bucket. Additionally, the configuration sets up a Pub/Sub topic and a 
+Cloud Scheduler job, allowing the Cloud Function to be triggered periodically. 
+Finally, it defines the Cloud Function itself. This Terraform setup streamlines the deployment process and 
+ensures a consistent environment for the IRR calculator on GCP.
+
+## Testing
+
+To run the Python tests, use the following command:
+
+```commandline
+python -m pytest -vv
+```
+
+A _.env_ file with the following settings is required:
+
+```
+PROJECT=
+SOURCE_TABLE=
+DESTINATION_TABLE=
+DATASET=
+```
+
+You will also need to provide Service Account credentials, or use a user account with the right permissions to 
+interact with BigQuery.
+
+### Python environment
+To run these tests on your machine you will need an environment with Python 3.10.0 and the libraries listed in 
+cloud_function/requirements.txt. If you do not have such an environment, you can create one with conda as follows:
+
+```
+conda create -n [] python=3.10.0 pip
+pip install -r cloud_function/requirements.txt
+```
+
+You also need to install pytest==7.4.3 and python-dotenv==0.14.0.
+
+## GitHub Workflow
+The GitHub workflow automates Python testing for the project. It is triggered on every push and on pull requests to the main branch. 
+It runs on the latest Ubuntu environment and uses a matrix strategy that currently tests against Python 3.10 only. 
+The workflow sets up Python, installs the project dependencies together with pytest and python-dotenv, 
+and then runs pytest. 
+Key environment variables, such as project details and the service account JSON, are stored as GitHub Secrets. 
+
+
+
+
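Review note on the README's IRR section: the definition (the rate `r` at which NPV is zero) can be made concrete with a small sketch. This is an illustration only and not the project's implementation, which delegates to `numpy_financial.irr` in `model.py`. The names `npv` and `irr_bisection`, the search interval, and the sample cash flows are assumptions chosen for the example.

```python
def npv(rate: float, cashflows: list[float]) -> float:
    """Net present value, where cashflows[t] is the cash flow in period t."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cashflows))


def irr_bisection(
    cashflows: list[float],
    low: float = -0.99,
    high: float = 1.0,
    tol: float = 1e-9,
    max_iter: int = 200,
) -> float:
    """Solve npv(r) == 0 by bisection.

    Assumes NPV changes sign exactly once on [low, high]; both bounds are
    illustrative defaults, not values taken from the project.
    """
    f_low, f_high = npv(low, cashflows), npv(high, cashflows)
    if f_low == 0:
        return low
    if f_high == 0:
        return high
    if f_low * f_high > 0:
        raise ValueError("NPV does not change sign on the search interval")
    for _ in range(max_iter):
        mid = (low + high) / 2
        f_mid = npv(mid, cashflows)
        if abs(f_mid) < tol or (high - low) / 2 < tol:
            return mid
        if f_low * f_mid < 0:
            # Root lies in the lower half of the interval.
            high = mid
        else:
            # Root lies in the upper half of the interval.
            low, f_low = mid, f_mid
    return (low + high) / 2


# Invest 1000 now, receive nothing after one period and 1210 after two.
flows = [-1000.0, 0.0, 1210.0]
rate = irr_bisection(flows)
print(f"IRR per period: {rate:.4f}")
```

Bisection is used here only because it is easy to read; it needs a bracketing interval, and cash flow series with several sign changes can have more than one root.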
diff --git a/cloud_function/main.py b/cloud_function/main.py
@@ -0,0 +1,22 @@
+import repository
+import services
+
+
+def func_entry_point(event, context):
+    """
+    Entry point for the application. This function initializes a BigQuery repository connector and invokes the
+    IRR pipeline.
+
+    Args:
+         event: The dictionary with data specific to this type of event. The `@type` field maps to
+                `type.googleapis.com/google.pubsub.v1.PubsubMessage`. The `data` field maps to the PubsubMessage data
+                in a base64-encoded string. The `attributes` field maps to the PubsubMessage attributes
+                if any is present.
+         context: Metadata of triggering event including `event_id` which maps to the PubsubMessage
+                  messageId, `timestamp` which maps to the PubsubMessage publishTime, `event_type` which maps to
+                  `google.pubsub.topic.publish`, and `resource` which is a dictionary that describes the service
+                  API endpoint pubsub.googleapis.com, the triggering topic's name, and the triggering event type
+                  `type.googleapis.com/google.pubsub.v1.PubsubMessage`.
+    """
+    bq_repository = repository.BiqQueryRepository()
+    services.irr_pipeline(bq_repository)
diff --git a/cloud_function/model.py b/cloud_function/model.py
@@ -0,0 +1,183 @@
+from dataclasses import dataclass
+import datetime as dt
+import numpy_financial as npf
+
+
+@dataclass(frozen=True)
+class Cashflow:
+    """
+    A data class representing a cashflow entry. This class is used to store information about a cashflow,
+    including the date, inflow, outflow, value, and entity name. The class is frozen to ensure immutability.
+
+    Attributes:
+        date (datetime.datetime): The date of the cashflow.
+        inflow (float): The inflow amount.
+        outflow (float): The outflow amount.
+        value (float): The net value.
+        entity_name (str): The name of the associated entity.
+    """
+
+    date: dt.datetime
+    inflow: float
+    outflow: float
+    value: float
+    entity_name: str
+
+    def __gt__(self, other):
+        if self.date is None:
+            return False
+        elif other.date is None:
+            return True
+        else:
+            return self.date > other.date
+
+
+@dataclass(frozen=True)
+class Irr:
+    """
+    A data class representing Internal Rate of Return (IRR) data for a specific entity. This class is used to store
+    information about IRR, including the date, monthly IRR value, and entity name. The class is frozen to ensure
+    immutability.
+
+    Attributes:
+        date (datetime.datetime): The date of the IRR calculation.
+        value (float): The monthly IRR value.
+        entity_name (str): The name of the associated entity.
+    Properties:
+        value_annual (float): Calculate the annualized IRR value based on the monthly value.
+    Methods:
+        to_dict(): Convert the IRR data to a dictionary for serialization.
+    """
+
+    date: dt.datetime
+    value: float
+    entity_name: str
+
+    @property
+    def value_annual(self):
+        """
+        Calculate the annualized IRR value based on the monthly value.
+
+        Returns:
+            float: The annualized IRR value (rounded to 4 decimal places).
+        """
+        return round(((1 + self.value) ** 12) - 1, 4)
+
+    def to_dict(self) -> dict:
+        """
+        Convert the IRR data to a dictionary for serialization.
+
+        Returns:
+            dict: A dictionary representation of the IRR data.
+        """
+        return {
+            "date": self.date.strftime("%Y-%m-%d"),
+            "irr_monthly": self.value,
+            "irr_annual": self.value_annual,
+            "entity_name": self.entity_name,
+        }
+
+
+class Entity:
+    """
+    A class representing an entity with associated cashflows and calculated Internal Rate of Return (IRR) data.
+
+    Args:
+        entity_name (str): The name of the entity.
+    Attributes:
+        entity_name (str): The name of the entity.
+        sorted_cashflows (list[Cashflow]): A list of sorted Cashflow objects for the entity.
+        irrs (list[Irr]): A list of calculated IRR data.
+    Methods:
+        add_cashflow(cashflow: Cashflow): Add a Cashflow to the entity's list of cashflows.
+        calculate_irr(): Calculate IRR data based on the entity's cashflows.
+    """
+
+    def __init__(self, entity_name: str):
+        self.entity_name: str = entity_name
+        self.sorted_cashflows: list[Cashflow] = []
+        self.irrs: list[Irr] = []
+
+    def add_cashflow(self, cashflow: Cashflow):
+        """
+        Add a Cashflow to the entity's list of cashflows and ensure the list remains sorted by date.
+
+        Args:
+            cashflow (Cashflow): The Cashflow to add.
+        """
+        self.sorted_cashflows.append(cashflow)
+        self.sorted_cashflows = sorted(self.sorted_cashflows)
+
+    def calculate_irr(self):
+        """
+        Calculate IRR data based on the entity's sorted cashflows. This method calculates IRR based on cashflows and
+        stores the results in the 'irrs' attribute.
+
+        If there are not enough cashflows for calculation, a message is printed.
+        """
+        self.irrs = []
+        if len(self.sorted_cashflows) < 2:
+            print(f"Not enough values for {self.entity_name}")
+            # raise Exception("Not enough values")
+            # todo: make this to be logged
+        else:
+            periodic_cashflow = [
+                self.sorted_cashflows[0].outflow - self.sorted_cashflows[0].inflow
+            ]
+            for cashflow in self.sorted_cashflows[1:]:
+                periodic_cashflow.append(
+                    cashflow.value + cashflow.outflow - cashflow.inflow
+                )
+                self.irrs.append(
+                    Irr(
+                        cashflow.date,
+                        round(npf.irr(periodic_cashflow), 4),
+                        self.entity_name,
+                    )
+                )
+                periodic_cashflow[-1] = cashflow.outflow - cashflow.inflow
+
+    def __eq__(self, other):
+        if not isinstance(other, Entity):
+            return False
+        return self.entity_name == other.entity_name
+
+    def __hash__(self):
+        return hash(self.entity_name)
+
+
+def allocate_cashflows_to_entities(
+    cashflows: list[Cashflow], entities: dict[str, Entity]
+):
+    """
+    Allocate cashflows to entities based on the entity names.
+
+    Args:
+        cashflows (list[Cashflow]): A list of Cashflow objects to be allocated to entities.
+        entities (dict[str, Entity]): A dictionary of entities where keys are entity names, and values are Entity
+                                      objects.
+    Returns:
+        dict[str, Entity]: A dictionary of entities with updated cashflow data.
+    """
+    for cashflow in cashflows:
+        entities[cashflow.entity_name].add_cashflow(cashflow)
+    # todo: get an except catcher for KeyError!?
+
+    return entities
+
+
+def entities_collection_creation(cashflows: list[Cashflow]) -> dict[str, Entity]:
+    """
+    Create a collection of entities based on the provided list of cashflows.
+
+    Args:
+        cashflows (list[Cashflow]): A list of Cashflow objects from which entities will be created.
+    Returns:
+        dict[str, Entity]: A dictionary of entities with entity names as keys and corresponding Entity objects.
+    """
+    entities = {}
+    entity_names = tuple([cashflow.entity_name for cashflow in cashflows])
+    for entity_name in entity_names:
+        entities[entity_name] = Entity(entity_name)
+
+    return entities
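Review note on `Entity.calculate_irr` and `Irr.value_annual`: the rolling series is easier to follow with numbers. The sketch below mirrors the loop in `calculate_irr` but records the series that would be passed to the IRR solver at each step instead of calling `numpy_financial`. `Flow`, `rolling_series`, `annualize`, and the sample figures are hypothetical stand-ins introduced for illustration; they are not part of this PR.

```python
from typing import NamedTuple


class Flow(NamedTuple):
    """Hypothetical stand-in for model.Cashflow (date and entity_name omitted)."""

    inflow: float
    outflow: float
    value: float


def rolling_series(flows: list[Flow]) -> list[list[float]]:
    """Return the cash flow series built at each step of the calculate_irr loop."""
    if len(flows) < 2:
        return []
    periodic = [flows[0].outflow - flows[0].inflow]
    snapshots = []
    for flow in flows[1:]:
        # Treat the current value as if it were paid out in this period.
        periodic.append(flow.value + flow.outflow - flow.inflow)
        snapshots.append(list(periodic))
        # For later periods keep only the real net flow of this period.
        periodic[-1] = flow.outflow - flow.inflow
    return snapshots


def annualize(monthly_rate: float) -> float:
    """Same compounding formula as Irr.value_annual."""
    return round((1 + monthly_rate) ** 12 - 1, 4)


flows = [
    Flow(inflow=1000.0, outflow=0.0, value=1000.0),
    Flow(inflow=100.0, outflow=0.0, value=1150.0),
    Flow(inflow=0.0, outflow=0.0, value=1300.0),
]
series = rolling_series(flows)
for step in series:
    print(step)
print(annualize(0.01))
```

Each snapshot ends with the entity's value at that date, so the resulting rate answers "what would the return be if the position were liquidated now", while earlier periods keep only their actual net flows.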