Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ECR returns HTTP/400 errors when using aggregators #1938

Closed
captainfalcon23 opened this issue Oct 13, 2023 · 15 comments · Fixed by #2133
Closed

ECR returns HTTP/400 errors when using aggregators #1938

captainfalcon23 opened this issue Oct 13, 2023 · 15 comments · Fixed by #2133
Assignees
Labels
bug Something isn't working

Comments

@captainfalcon23
Copy link

Describe the bug
Pretty similar to #1373.

Trying to join aws_ecr_image onto aws_ecr_image_scan_finding like:

join aws_all.aws_ecr_image_scan_finding f on ( f.repository_name, f.image_tag) = ( i.repository_name, i.image_tag)

The problem is, to query aws_ecr_image_scan_finding you need to pass repo name + image tag, and when this is done via a join, steampipe is checking each and every aggregated connection for the existence of the repository. When it can't be found, thousands of errors are dumped to the logs:

2023-10-13 06:22:46.624 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1697178165543: aws_ecr_image_scan_finding.listAwsEcrImageScanFindings: api_error="operation error ECR: DescribeImageScanFindings, https response error StatusCode: 400, RequestID: 138c93fb-5726-42f1-8436-67c388ea4841, RepositoryNotFoundException: The repository with name 'XXX' does not exist in the registry with id 'XXXX'"

Steampipe version (steampipe -v)
Steampipe v0.21.1

Plugin version (steampipe plugin list)

+---------------------------------------------------+---------+-
| Installed                                         | Version |
+---------------------------------------------------+---------+-
| hub.steampipe.io/plugins/turbot/aws@latest        | 0.120.2 |
|                                                   |         |
| hub.steampipe.io/plugins/turbot/csv@latest        | 0.11.1  |
| hub.steampipe.io/plugins/turbot/jira@latest       | 0.12.1  |
| hub.steampipe.io/plugins/turbot/kubernetes@latest | 0.25.1  |
| hub.steampipe.io/plugins/turbot/steampipe@latest  | 0.9.1   |
+---------------------------------------------------+---------+-

To reproduce
Query:

(
    with latest_image_ts as (
      select
        repository_name,
        max(image_pushed_at) as image_pushed_at
      from
        aws_all.aws_ecr_image
      group by
        repository_name
    ),
    images_with_tags as (
      select
        _ctx,
        region,
        repository_name,
        image_pushed_at,
        jsonb_array_elements_text(image_tags) :: text as image_tag
      from
        aws_all.aws_ecr_image
    )
    select
      i._ctx ->> 'connection_name' as "AWS_ACCOUNT_NAME",
      i.region,
      i.repository_name,
      i.image_pushed_at,
      i.image_tag,
      f.name,
      f.uri,
      f.severity,
      f.description,
      (
        jsonb_path_query(f.attributes, '$[*] ? (@.Key == "package_name")') -> 'Value' #>>'{}')::text as package_name,
        (
          jsonb_path_query(
            f.attributes,
            '$[*] ? (@.Key == "package_version")'
          ) -> 'Value' #>>'{}')::text as package_version
          from
            images_with_tags i
            join latest_image_ts l on (l.repository_name, l.image_pushed_at) = (i.repository_name, i.image_pushed_at)
            join aws_all.aws_ecr_image_scan_finding f on ( f.repository_name, f.image_tag) = ( i.repository_name, i.image_tag)
          order by
            repository_name,
            image_tag,
            severity,
            name,
            package_name
        ) innerq;

Expected behavior
Only the account where the image exists is queried.

@captainfalcon23 captainfalcon23 added the bug Something isn't working label Oct 13, 2023
@captainfalcon23 captainfalcon23 changed the title ECR returns 400 errors when using aggregators ECR returns HTTP/400 errors when using aggregators Oct 13, 2023
@ParthaI
Copy link
Contributor

ParthaI commented Nov 20, 2023

Hi @captainfalcon23, My apologies for the delayed response. I have executed the query as outlined in the issue description, ensuring the correct credential configuration, Steampipe CLI version, and plugin version. However, I encountered a syntax error initially. After rectifying the syntax error, I attempted the query multiple times using an aggregator connection, and in all instances, I received results.

Credential configuration file(~/.steampipe/config/aws.spc):

connection "aws" {
  plugin = "aws"
  regions = ["*"]
  profile = "aws_nagraj"
  ignore_error_codes = ["AccessDenied", "AccessDeniedException"]
}

connection "aws_aab" {
  plugin  = "aws"
  regions = ["*"]
  profile = "aws-aab"
  ignore_error_codes = ["AccessDenied", "AccessDeniedException"]
}

connection "aws_all" {
  plugin = "aws"
  type   = "aggregator"
  connections = ["aws", "aws_aab"]
}

Query:

with latest_image_ts as (
  select
    repository_name,
    max(image_pushed_at) as image_pushed_at
  from
    aws_all.aws_ecr_image
  group by
    repository_name
),
images_with_tags as (
  select
    _ctx,
    region,
    repository_name,
    image_pushed_at,
    jsonb_array_elements_text(image_tags)::text as image_tag
  from
    aws_all.aws_ecr_image
)
select
  i._ctx ->> 'connection_name' as "AWS_ACCOUNT_NAME",
  i.region,
  i.repository_name,
  i.image_pushed_at,
  i.image_tag,
  f.name,
  f.uri,
  f.severity,
  f.description,
  (
    jsonb_path_query(f.attributes, '$[*] ? (@.Key == "package_name")') -> 'Value' #>> '{}'
  )::text as package_name,
  (
    jsonb_path_query(
      f.attributes,
      '$[*] ? (@.Key == "package_version")'
    ) -> 'Value' #>> '{}'
  )::text as package_version
from
  images_with_tags i
  join latest_image_ts l on (l.repository_name, l.image_pushed_at) = (i.repository_name, i.image_pushed_at)
  join aws_all.aws_ecr_image_scan_finding f on (f.repository_name, f.image_tag) = (i.repository_name, i.image_tag)
order by
  i.repository_name,
  i.image_tag,
  f.severity,
  f.name,
  package_name;

Result:

+------------------+-----------+-----------------+---------------------------+-----------+---------------+--------------------------------------------------------------+----------+-------------+--------------+-----------------+
| AWS_ACCOUNT_NAME | region    | repository_name | image_pushed_at           | image_tag | name          | uri                                                          | severity | description | package_name | package_version |
+------------------+-----------+-----------------+---------------------------+-----------+---------------+--------------------------------------------------------------+----------+-------------+--------------+-----------------+
| aws_aab          | us-east-2 | test1           | 2023-11-20T13:43:47+05:30 | latest    | CVE-2023-5363 | https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-5363 | HIGH     | <null>      | openssl      | 3.1.3-r0        |
| aws_aab          | us-east-2 | test1           | 2023-11-20T13:43:47+05:30 | latest    | CVE-2023-5678 | https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-5678 | HIGH     | <null>      | openssl      | 3.1.3-r0        |
| aws_aab          | us-east-1 | tets            | 2023-11-20T12:50:13+05:30 | latest    | CVE-2023-5363 | https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-5363 | HIGH     | <null>      | openssl      | 3.1.3-r0        |
| aws_aab          | us-east-1 | tets            | 2023-11-20T12:50:13+05:30 | latest    | CVE-2023-5678 | https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2023-5678 | HIGH     | <null>      | openssl      | 3.1.3-r0        |
+------------------+-----------+-----------------+---------------------------+-----------+---------------+--------------------------------------------------------------+----------+-------------+--------------+-----------------+

I also tried the queries by splitting them up into multiple parts with an aggregator connection. Still, I have yet to receive any error.

I have a few follow-up questions. Please review the following:

  • First, try executing the same query using the latest version of the AWS plugin, v0.123.0, and check if the error persists.
  • If you still encounter issues, kindly provide us with the plugin log details. You can find these logs in ~/.steampipe/logs/plugin-*.log. Please share the log details for the specific query causing the error.
  • Please share the query you ran exactly with us after rectifying the syntax error. This will help us further investigate and help you.

Thank You!

@captainfalcon23
Copy link
Author

Hi @ParthaI

I had a look at this and realised I actually shared the wrong query! My apologies there. The query with the issue is:

select
    distinct REPLACE(
        REPLACE(innerq."AWS_ACCOUNT_NAME", 'aws_', ''),
        '_',
        '-'
    ) as "AWS_ACCOUNT_NAME",
    innerq.region as "AWS_REGION",
    innerq.repository_name as "AWS_REPO",
    innerq.image_tag as "ECR_IMAGE_TAG",
    innerq.name as "COMMON_VULN_ID",
    innerq.uri as "VULN_INFO_URL",
    innerq.severity as "VULN_SEVERITY",
    innerq.package_version as "VULN_PACKAGE_VERSION",
    innerq.package_name as "VULN_PACKAGE_NAME"
from
    (
        with latest_image_ts as (
            select
                repository_name,
                max(image_pushed_at) as image_pushed_at
            from
                aws_all.aws_ecr_image
            group by
                repository_name
        ),
        images_with_tags as (
            select
                _ctx,
                region,
                repository_name,
                image_pushed_at,
                jsonb_array_elements_text(image_tags) :: text as image_tag
            from
                aws_all.aws_ecr_image
        )
        select
            i._ctx ->> 'connection_name' as "AWS_ACCOUNT_NAME",
            i.region,
            i.repository_name,
            i.image_pushed_at,
            i.image_tag,
            f.name,
            f.uri,
            f.severity,
            f.description,
            (
                jsonb_path_query(f.attributes, '$[*] ? (@.Key == "package_name")') -> 'Value' #>>'{}')::text as package_name,
                (
                    jsonb_path_query(
                        f.attributes,
                        '$[*] ? (@.Key == "package_version")'
                    ) -> 'Value' #>>'{}')::text as package_version
                    from
                        images_with_tags i
                        join latest_image_ts l on (l.repository_name, l.image_pushed_at) = (i.repository_name, i.image_pushed_at)
                        join aws_all.aws_ecr_image_scan_finding f on (f.repository_name, f.image_tag) = (l.repository_name, i.image_tag)
                    order by
                        repository_name,
                        image_tag,
                        severity,
                        name,
                        package_name
                ) innerq;

I have retried this today:

List of plugins:

+---------------------------------------------------+---------+
| Installed                                         | Version |
+---------------------------------------------------+---------+
| hub.steampipe.io/plugins/turbot/aws@latest        | 0.123.0 |
|                                                   |         |
| hub.steampipe.io/plugins/turbot/csv@latest        | 0.11.1  |
| hub.steampipe.io/plugins/turbot/jira@latest       | 0.13.0  |
| hub.steampipe.io/plugins/turbot/kubernetes@latest | 0.25.1  |
| hub.steampipe.io/plugins/turbot/steampipe@latest  | 0.9.1   |
+---------------------------------------------------+---------+

Steampipe ver:

Steampipe v0.21.1

In my plugin.log, logs are filled with errors like this:

2023-11-20 22:50:37.243 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1700520630288: aws_ecr_image_scan_finding.listAwsEcrImageScanFindings: api_error="operation error ECR: DescribeImageScanFindings, https response error StatusCode: 400, RequestID: 8819c533-c606-434f-ae72-1dfaf4dea2a2, RepositoryNotFoundException: The repository with name 'abc-app' does not exist in the registry with id '123466789'"
2023-11-20 22:50:37.245 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1700520630288: aws_ecr_image_scan_finding.listAwsEcrImageScanFindings: api_error="operation error ECR: DescribeImageScanFindings, https response error StatusCode: 400, RequestID: 66c6b81c-dd02-4aae-b7f5-2aee7eb6f495, RepositoryNotFoundException: The repository with name 'abc-app' does not exist in the registry with id '123466789'"
2023-11-20 22:50:37.245 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1700520630288: aws_ecr_image_scan_finding.listAwsEcrImageScanFindings: api_error="operation error ECR: DescribeImageScanFindings, https response error StatusCode: 400, RequestID: 73e2260f-e6da-4cc5-ae9c-80503619b68c, RepositoryNotFoundException: The repository with name 'abc-app' does not exist in the registry with id '123466789'"
2023-11-20 22:50:37.246 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1700520630288: aws_ecr_image_scan_finding.listAwsEcrImageScanFindings: api_error="operation error ECR: DescribeImageScanFindings, https response error StatusCode: 400, RequestID: 4da0204b-60c6-4f30-886f-acbf398b9974, RepositoryNotFoundException: The repository with name 'abc-app' does not exist in the registry with id '123466789'"
2023-11-20 22:50:41.150 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1700520630288: aws_ecr_image_scan_finding.listAwsEcrImageScanFindings: api_error="operation error ECR: DescribeImageScanFindings, https response error StatusCode: 400, RequestID: c3170f8b-822b-4bb0-b2f0-f1e95c671252, RepositoryNotFoundException: The repository with name 'abc-app' does not exist in the registry with id '123466789'"
2023-11-20 22:50:41.174 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1700520630288: aws_ecr_image_scan_finding.listAwsEcrImageScanFindings: api_error="operation error ECR: DescribeImageScanFindings, https response error StatusCode: 400, RequestID: 1f212e9c-eb7d-4990-91f4-6de42dd77f99, RepositoryNotFoundException: The repository with name 'abc-app' does not exist in the registry with id '123466789'"
2023-11-20 22:50:41.180 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1700520630288: aws_ecr_image_scan_finding.listAwsEcrImageScanFindings: api_error="operation error ECR: DescribeImageScanFindings, https response error StatusCode: 400, RequestID: 0aaa08a6-3de3-42fe-bb80-08e3f5538766, RepositoryNotFoundException: The repository with name 'abc-app' does not exist in the registry with id '123466789'"
2023-11-20 22:50:41.181 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1700520630288: aws_ecr_image_scan_finding.listAwsEcrImageScanFindings: api_error="operation error ECR: DescribeImageScanFindings, https response error StatusCode: 400, RequestID: ce9ed6ad-99d8-43e3-9e4c-8fe05730b275, RepositoryNotFoundException: The repository with name 'abc-app' does not exist in the registry with id '123466789'"
2023-11-20 22:50:41.183 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1700520630288: aws_ecr_image_scan_finding.listAwsEcrImageScanFindings: api_error="operation error ECR: DescribeImageScanFindings, https response error StatusCode: 400, RequestID: 5ac4b54d-1539-42ac-8925-c9996229cdc2, RepositoryNotFoundException: The repository with name 'abc-app' does not exist in the registry with id '123466789'"
2023-11-20 22:50:41.183 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1700520630288: aws_ecr_image_scan_finding.listAwsEcrImageScanFindings: api_error="operation error ECR: DescribeImageScanFindings, https response error StatusCode: 400, RequestID: 6f9a93e7-e0ac-4bc4-bf28-3ec38480eb5c, RepositoryNotFoundException: The repository with name 'abc-app' does not exist in the registry with id '123466789'"
2023-11-20 22:50:41.185 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1700520630288: aws_ecr_image_scan_finding.listAwsEcrImageScanFindings: api_error="operation error ECR: DescribeImageScanFindings, https response error StatusCode: 400, RequestID: 97a7d9ea-81a6-4ecb-aa01-cbc329092ccb, RepositoryNotFoundException: The repository with name 'abc-app' does not exist in the registry with id '123466789'"
2023-11-20 22:50:41.198 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1700520630288: aws_ecr_image_scan_finding.listAwsEcrImageScanFindings: api_error="operation error ECR: DescribeImageScanFindings, https response error StatusCode: 400, RequestID: ef4c09aa-3846-41ec-98b1-89305c9f0e2b, RepositoryNotFoundException: The repository with name 'abc-app' does not exist in the registry with id '123466789'"
2023-11-20 22:50:41.199 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1700520630288: aws_ecr_image_scan_finding.listAwsEcrImageScanFindings: api_error="operation error ECR: DescribeImageScanFindings, https response error StatusCode: 400, RequestID: d3509d9a-8472-4e77-b346-196e76e7ef7a, RepositoryNotFoundException: The repository with name 'abc-app' does not exist in the registry with id '123466789'"
2023-11-20 22:50:41.200 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1700520630288: aws_ecr_image_scan_finding.listAwsEcrImageScanFindings: api_error="operation error ECR: DescribeImageScanFindings, https response error StatusCode: 400, RequestID: 293addc8-a8d2-4c5c-a194-8b7f42ac32e4, RepositoryNotFoundException: The repository with name 'abc-app' does not exist in the registry with id '123466789'"
2023-11-20 22:50:41.206 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1700520630288: aws_ecr_image_scan_finding.listAwsEcrImageScanFindings: api_error="operation error ECR: DescribeImageScanFindings, https response error StatusCode: 400, RequestID: 8071b372-0779-443f-a8e1-c776ac3b7383, RepositoryNotFoundException: The repository with name 'abc-app' does not exist in the registry with id '123466789'"

I am happy to share the full log over slack, but it only contains above errors, nothing much else. I believe if you have two AWS accounts, each containing repos with different names, you should be able to reproduce this. I am sure it would be fixed by using a ParentHydrate similar to #1373 (comment)

@ParthaI
Copy link
Contributor

ParthaI commented Nov 23, 2023

Hello @captainfalcon23, I appreciate you providing the information. I'll review the issue once more and let you know if I need anything else. Thank you!

@ParthaI
Copy link
Contributor

ParthaI commented Nov 27, 2023

Hi @captainfalcon23, I attempted to reproduce the error and discovered that it's being logged in my LOG file but not appearing in my Terminal. This behavior is expected in aggregator mode because a repository named demo, which exists in account-1, may not be present in account-2. By design, steampipe iterates the API calls per connection configured in an aggregator connection. Also, the API calls are iterated over the regions per connection specified in the steampipe config file(~/.steampipe/config/aws.spc).

Could you please confirm you are getting the error in your terminal or in your LOG file only?

  • I believe we shouldn't encounter the RepositoryNotFoundException, ImageNotFoundException, and ScanNotFoundException exceptions when querying the aws_ecr_image_scan_finding table in the terminal. These exceptions are already handled in the ignore configuration of the table, as specified in the table's code here.

  • Regarding the comment: "I believe this issue can be resolved by implementing a ParentHydrate, similar to the one discussed in this comment." It's important to note that Steampipe's design allows for the use of a ParentHydrate only one level down, not more than that.

    • Specifically, the aws_ecr_image table can be utilized as the ParentHydrate for the aws_ecr_image_scan_finding table, and the aws_ecr_image table, in turn, uses aws_ecr_repository as its parent.
    • Steampipe does not currently support parent chaining with more than one level. Like (aws_ecr_image_scan_finding -> aws_ecr_image -> aws_ecr_repository).

Thanks!

@captainfalcon23
Copy link
Author

Hi @ParthaI
Could you please confirm you are getting the error in your terminal or in your LOG file only
Yes, this is correct.

believe we shouldn't encounter the RepositoryNotFoundException, ImageNotFoundException, and ScanNotFoundException exceptions when querying the aws_ecr_image_scan_finding table in the terminal.
That is right, we don't see those errors.

It's important to note that Steampipe's design allows for the use of a ParentHydrate only one level down
That's disappointing. In my case, I have hundreds of ECR repos in which I want to get the vulnerabilities across many accounts. It just seems wrong to hit every single account and region with queries when I feel this should be handled gracefully, not to mention, it increases the query time exponentially, and logs so many failed API attempts in each accounts' Cloudtrail. I feel like not enough context is being passed - for example, in the parent tables, there will be the account and region, if those could be passed to the child tables, that would overall improve query performance for any query. Although, I don't know how Steampipes internals work, and if this is possible.

Is there any better solution or way of writing this (and other similar) queries to avoid this situation? I fell back onto doing a query for each and every account individually, and aggregating them at the end, but it's not flexible. I think also this is not a common scenario (I may be wrong) where there are 3 tables in the chain.

Happy to discuss and get some suggestions :)

Copy link

This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

@github-actions github-actions bot added the stale No recent activity has been detected on this issue/PR and it will be closed label Jan 26, 2024
@ParthaI ParthaI removed the stale No recent activity has been detected on this issue/PR and it will be closed label Jan 27, 2024
@bigdatasourav
Copy link
Contributor

@captainfalcon23, after careful consideration and review of the issue you've raised, we have decided to close this issue at the current time. We understand that the unnecessary logs have been a concern for you, and to address this specifically, I have raised an issue with our SDK team for further investigation and resolution. You can track the progress here.

We are committed to ensuring our service meets your needs and expectations, and we value your feedback and contributions to our community. If you have any additional insights or feedback, or if there's anything specific you would like to discuss further or contribute, please do not hesitate to reopen this issue or create a new one. Additionally, if you feel inclined to contribute directly through a Pull Request (PR), we welcome your contributions with open arms.

@bigdatasourav bigdatasourav closed this as not planned Won't fix, can't repro, duplicate, stale Mar 7, 2024
@captainfalcon23
Copy link
Author

That’s fine, but the issue isn’t about logs in a log file. It’s about steampipe inefficiency of querying every account and region in an aggregation instead of smartly hydrating the tables from parents. If the context is there that an image belongs to account X in region Y, I don’t understand why steampipe can’t use that context to query the appropriate connection for further info.

@bigdatasourav
Copy link
Contributor

bigdatasourav commented Mar 8, 2024

@captainfalcon23, the aws_ecr_image_scan_finding table is an independent table which requires repository_name and image_tag, we can not use the aws_ecr_image table as parentHydrate here due to the current limitation of Steampipe SDK. I have provided a fix here, could you please try this branch and let me know if this helps?

@bigdatasourav
Copy link
Contributor

@captainfalcon23
Copy link
Author

FYI I haven't forgotten about this, will take a look today

@captainfalcon23
Copy link
Author

Okay, I have compiled the plugin and ran the same query as before and still I can see errors where steampipe is trying to find images in the wrong accounts. I have simplified the query, and you'll see on the 4th and 5th last lines, there is an additional join condition. But the logs still show Steampipe is checking each and every account for every repository. However, the region filter DOES appear to be working, as there is only a max of 1 call to each account to find the repo, so I think we are half way there! I tested adding/removing the region in the join, and it is clearly using the region and even the query returns much quicker when region is used in the join.

left join aws_all.aws_ecr_image_scan_finding f on f.account_id = i.account_id
    and f.region = i.region
2024-03-18 23:30:30.956 UTC [ERROR] steampipe-plugin-aws.plugin: [ERROR] 1710804630785: aws_ecr_image_scan_finding.listAwsEcrImageScanFindings: api_error="operation error ECR: DescribeImageScanFindings, https response error StatusCode: 400, RequestID: 997753c6-6058-4e65-8c0e-86ee73e37ddd, RepositoryNotFoundException: The repository with name 'XXX/imply/manager' does not exist in the registry with id 'XXXXXXXXXXX'"
with latest_image_ts as (
    select
        repository_name,
        max(image_pushed_at) as image_pushed_at
    from
        aws_all.aws_ecr_image
    group by
        repository_name
),
images_with_tags as (
    select
        _ctx,
        account_id,
        region,
        repository_name,
        image_pushed_at,
        jsonb_array_elements_text(image_tags) :: text as image_tag
    from
        aws_all.aws_ecr_image
)
select
    i._ctx ->> 'connection_name' as "AWS_ACCOUNT_NAME",
    i.region,
    i.repository_name,
    i.image_pushed_at,
    i.image_tag,
    l.image_pushed_at
from
    images_with_tags i
    left join latest_image_ts l on (l.repository_name, l.image_pushed_at) = (i.repository_name, i.image_pushed_at)
    left join aws_all.aws_ecr_image_scan_finding f on f.account_id = i.account_id
    and f.region = i.region
    and (f.repository_name, f.image_tag) = (i.repository_name, i.image_tag)
where
    l.image_pushed_at is not null

Using region + account_id in join:

+-------+
| count |
+-------+
| 4381  |
+-------+

Time: 42.4s. Rows fetched: 13,359. Hydrate calls: 0.
>

WITHOUT Using region + account_id in join:

+-------+
| count |
+-------+
| 4473  |
+-------+

Time: 114.6s. Rows fetched: 5. Hydrate calls: 5.
>

@bigdatasourav
Copy link
Contributor

@captainfalcon23, Could you please try out this PR, I have tested the above queries and it is working fine for me without any error logs.

@bigdatasourav bigdatasourav linked a pull request Apr 2, 2024 that will close this issue
@captainfalcon23
Copy link
Author

@bigdatasourav Doesn't seem to be working as expected for me. This is what I've done:

steampipe plugin uninstall aws
git clone https://github.com/turbot/steampipe-plugin-aws.git
cd steampipe-plugin-aws
git checkout connection_key_columns
make
rm -rf ~/.steampipe/logs
# open steampipe in interactive mode
steampipe query
# run the query

with latest_image_ts as (
    select
        repository_name,
        max(image_pushed_at) as image_pushed_at
    from
        aws_all.aws_ecr_image
    group by
        repository_name
),
images_with_tags as (
    select
        _ctx,
        account_id,
        region,
        repository_name,
        image_pushed_at,
        jsonb_array_elements_text(image_tags) :: text as image_tag
    from
        aws_all.aws_ecr_image
)
select
    i._ctx ->> 'connection_name' as "AWS_ACCOUNT_NAME",
    i.region,
    i.repository_name,
    i.image_pushed_at,
    i.image_tag,
    l.image_pushed_at
from
    images_with_tags i
    left join latest_image_ts l on (l.repository_name, l.image_pushed_at) = (i.repository_name, i.image_pushed_at)
    left join aws_all.aws_ecr_image_scan_finding f on f.account_id = i.account_id
    and f.region = i.region
    and (f.repository_name, f.image_tag) = (i.repository_name, i.image_tag)
where
    l.image_pushed_at is not null

I can still see errors in the logs, indicating the accountid isn't being used in the join. Am I missing a step somewhere?

grep "RepositoryNotFoundException" plugin-2024-04-03.log  | wc -l
807

@captainfalcon23
Copy link
Author

Hi @misraved Just checking in why this issue got closed off when the issue is not fixed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
3 participants