Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #12189 Added mechanism to propagate tripped QC test flags to al… #60

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

tlfish3638
Copy link
Contributor

…l affected parameters

Copy link
Contributor

@danmergens danmergens left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where are the unit tests to make sure the qc tests are propagating as they are supposed to be?

Also - I'd like to see some timing results. Will this affect normal stream queries, or is this only affecting the netcdf download?

Copy link
Contributor

@mfegan mfegan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to look into why the unit test is failing.

@@ -10,10 +10,30 @@


@log_timing(_log)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this annotation should still be applied to _get_stream_metadata_list() since this function is called multiple times in _get_stream_metadata_list().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please restate this. You used the same function name as the calling and the called function. I refactored the 1st line of _get_stream_metadata to create _get_stream_metadata_list. I don't see what you mean when you say the function is called multiple times by the other when I look at these two functions.



def get_refdes_streams_for_subsite_node(subsite, node):
return [(Dict['referenceDesignator']['subsite']+'-'+Dict['referenceDesignator']['node'] +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to me a bit odd that you have to reconstruct the reference designator when it's in the dictionary as a key.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I've seen the data that's returned from the call to _get_stream_metadata_list has a reference_designator map that contains only the subsite, node and sensor elements.

@tlfish3638
Copy link
Contributor Author

Dan - I believe this will affect all stream_engine queries. I'll work on generating timing results. Keep in mind most of what happens is triggered when a QC test failure is recognized. I'll work on adding a test to test_stream_request.py.

Mark - I'm trying to make sense of the Travis CI test failure. I'll get back to you on that.

Copy link
Contributor

@sgfoote sgfoote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general, the _run_qc and the _initialize methods have become very long. It would be a major help to readability and understanding if some of the logic could be abstracted away into new functions.

@@ -10,10 +10,30 @@


@log_timing(_log)
def _get_stream_metadata_list():
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The _get_stream_metadata_list() function is getting called in 3 functions: _get_stream_metadata(), get_refdes_streams_for_subsite_node(), and get_streams_for_subsite_node_sensor(). It looks like stream_request.py calls at least two of these functions in _run_qc(). I'm not sure how expensive the metadata_service_api.get_stream_metadata_records() and underlying query are, but it may be a good idea to try to limit how often the metadata lookup occurs.

QC_RESULTS = '_qc_results'
RELATED = 'R'
# these tests were deemed not ready for propagation by the Data Team
QC_TEST_FAIL_SKIP_PROPAGATE = ('dataqc_gradienttest', 'dataqc_localrangetest', 'dataqc_polytrendtest')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this constant be better off defined in the config file, config/default.py? Then code changes would not be necessary if the data team saw reason to add or remove entries from this group.

for dataset in stream_dataset.datasets.itervalues():
self.qc_executor.qc_check(param, dataset)
# check for QC test failure on 1+ data particles for the parameter
if (param_qc_executed in dataset.keys()) and (param_qc_results in dataset.keys()) and \
(dataset[param_qc_executed].values.min() > dataset[param_qc_results].values.min()):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how the comparison of min values is supposed to indicate whether or not there were any QC failures. I'm thinking dataset[param_qc_executed].values looks something like [21, 21, 21, 21, ..., 21] and dataset[param_qc_results].values is the same structure, but with potentially different numbers. Not sure if these values are possible, but for example, if the qc_executed values were [21, 21, 16, 4] and the qc_results values were [21, 21, 8, 4], then the minimum values would be equal and I think the check would fail to see there was a QC failure (executed value of 16 vs results value of 8).

if i & variance and self.qc_test_flag_map[i][1]]
if failed_tests_to_propagate:
qc_executed_values[position] = \
qc_executed_values[position] | self.qc_propagate_flag
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that comments 12 and 13 of the 12189 ticket mention this, but I don't understand why QC is propagated by changing bits in the qc_executed parameter instead of the qc_results parameter. My understanding was that _qc_executed indicated what QC tests were run for param and _qc_results indicated which QC tests passed for param. It seems strange to me, therefore, to indicate a propagated failure in qc_executed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, this should be OK. I wasn't realizing there is a special qc propagation flag and thought other test flags were getting overwritten.

@sgfoote
Copy link
Contributor

sgfoote commented Sep 12, 2018

I am concerned that the logic for determining which parameters QC failures should propagate to does not match the way Stream Engine is resolving parameters. When there are multiple potential sources for an input parameter, it is important that the propagation of QC failures reflects only the input parameter that is actually used to calculate the derived parameter.

  1. While I believe the long term intent is to use preload-database.resolve_stream.resolve_stream() as the algorithm for determining which of the potential input parameters to use, currently Stream Engine does not use that algorithm. Instead, its parameter resolving logic is found in the following functions: stream_engine.util.stream_dataset.StreamDataset.calculate_all/calculate_virtual, stream_engine.util.stream_dataset.StreamDataset._try_create_derived_product, stream_engine.util.stream_dataset.StreamDataset._build_function_arguments, and ooi-data.ooi_data.postgres.model.preload.Stream.create_function_map (particularly that last one). An algorithm mismatch between QC propagation and input parameter resolution can lead to QC failures propagating to derived parameters that they were not actually used to calculate.

  2. In the code under review, "related" and "nearby" streams are computed. Then, both of these are cross referenced against "affected" streams and the intersections of those sets ("related" with "affected" and "nearby" with "affected") are used to select the derived parameters to which QC failures are propagated. The algorithm for calculating "affected" streams, however, returns all potentially affected parameters (see the comment on preload-database.affected.parameter_affects()). The end result of this computation is that QC failures propagate to all derived parameters potentially derived from the input parameter instead of just those derived parameters that are actually derived from the input parameter.

@danmergens
Copy link
Contributor

@sgfoote - I suspect that we should not continue work on this until Stream Engine has been updated to use Preload's parameter resolution. This was also causing difficulty for me in resolving other Stream Engine issues.

@sgfoote
Copy link
Contributor

sgfoote commented Sep 12, 2018

Had a brief chat with Dan, and the solution of recording the derivation map as it is determined can proceed. We just need to keep in mind that the current logic for choosing input parameters to algorithms will eventually be replaced by Preload's parameter resolution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants