-
Notifications
You must be signed in to change notification settings - Fork 33
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Deficient intel filtering #57
base: main
Are you sure you want to change the base?
Deficient intel filtering #57
Conversation
…m prompt or the path to a text file containing the prompt. This can be done for the agent system prompt, the checklist creation prompt, the summary prompt, and the justification prompt.
Co-authored-by: Ashley Song <[email protected]>
Co-authored-by: Ashley Song <[email protected]>
Fixed Llama 3.1 bug with prompt parameter being passed in get_client.
Co-authored-by: Ashley Song <[email protected]>
Co-authored-by: Ashley Song <[email protected]>
…ompt_config Added prompt parameter for LLM models to allow for passing in a custo…
…or client creation to ensure compatibility with OpenAI.
…Adjusted the output to account for this filtering.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the overall approach of filtering out CVEs with missing intel. Just have some questions on whether we can simplify the approach.
checklist=[ | ||
ChecklistItemOutput( | ||
input="Gather intel for the CVE.", | ||
response= | ||
"There is insufficient intel available to determine vulnerability. This is either due to the CVE not existing or there is not enough gathered intel for the agent to make an informed decision.", | ||
intermediate_steps=None) | ||
], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently these placeholder checklist items are only conditionally displayed when the agent is skipped, which has caused confusion for some users. We recently got a question about whether the SBOM check only happens when the "Check SBOM and dependencies for vulnerability" checklist item is shown.
Do you recall why we decided to include the placeholder checklist item? I'm wondering if we could safely omit it for consistency? This would also make it clearer that no checklist was generated by the model.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I put those in to keep the output consistent. Makes parsing easier and ensures that we don't have a bunch of empty fields.
@@ -97,6 +97,24 @@ def _get_placeholder_output(vuln_id: str) -> AgentMorpheusEngineOutput: | |||
justification=JUSTIFICATION) | |||
|
|||
|
|||
def _get_deficient_intel_output(vuln_id: str) -> AgentMorpheusEngineOutput: | |||
SUMMARY = "There is insufficient intel available to determine vulnerability. This is either due to the CVE not existing or there is not enough gathered intel for the agent to make an informed decision." | |||
JUSTIFICATION = JustificationOutput(label="insufficient_intel", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we use uncertain
as the justification label? Using a standardized value in the CVEJustifyNode
would prevent any downstream impacts of unexpected labels.
JUSTIFICATION = JustificationOutput(label="insufficient_intel", | |
JUSTIFICATION = JustificationOutput(label="uncertain", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could label it whatever we want. I thought it was helpful to distinguish between the LLM couldn't come up with a definitive label vs. the LLM didn't even look at this CVE (likely because it doesn't actually exist).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I do see the value of the label as well. I thought about backing into this info from the summary or the intel object, but this will take more work later on.
My main concern is about breaking any output contracts. Let's double check if the team has any concerns with adding a new justification label. If not, I'm good with it. We did also make another breaking change of adding a newline to the output, so we'll likely need to bump our version anyway.
""" | ||
Logic to determine if the CVE has sufficient intel and can be passed to the agent. | ||
Returns | ||
------- | ||
bool | ||
True if enough intel has been found for the CVE | ||
""" | ||
sufficiency = False | ||
for field_name, field in self.model_fields.items(): | ||
if isinstance(getattr(self, field_name), IntelSource): | ||
if not getattr(self, field_name) is None: | ||
sufficiency = getattr(self, field_name).intel_sufficient or sufficiency | ||
return sufficiency |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel that adding an intel_sufficient
property to each intel class, each with different definitions adds a lot of complexity, maybe more than we need to address the invalid CVE case.
What do you think of the simpler approach to just check if each intel object is None? Unfortunately this will require some changes to the intel retrieval to write a None value when the CVE doesn't exist. It seems like we do this for GHSA and NVD, but not the other sources.
"intel": [
{
"vuln_id": "CVE-0000-00000",
"ghsa": null,
"nvd": null,
"rhsa": {
"bugzilla": {
"description": null,
"id": null,
"url": null
},
"details": null,
"statement": null,
"package_state": null,
"upstream_fix": null,
"cvss3": null
},
"ubuntu": {
"description": null,
"notes": null,
"notices": null,
"priority": null,
"ubuntu_description": null,
"impact": null
},
"epss": {
"epss": null,
"percentile": null,
"date": null
},
"sufficient_intel": false
}
],
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought it was important to have direct control over the exact definition of sufficient detail. If the only source was EPSS (granted, an extremely unlikely scenario) that would not be sufficient for the agent to make an informed call on the validity. It also allows for explicit adjustment if the intel sources change.
Co-authored-by: Ashley Song <[email protected]>
Co-authored-by: Ashley Song <[email protected]>
Filters out CVE's that don't have enough intel from being passed into the agent to avoid issues involving labelling containers vulnerable to non-existent CVE's.
Closes #56