Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance: avoid contention on scheduler execution #103

Merged
merged 7 commits into from
Feb 16, 2022
Merged
Show file tree
Hide file tree
Changes from 6 commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
3 changes: 3 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
@@ -1,3 +1,6 @@
## 5.2.3
- Performance: avoid contention on scheduler execution [#103](https://github.com/logstash-plugins/logstash-integration-jdbc/pull/103)

## 5.2.2
- Feat: name scheduler threads + redirect error logging [#102](https://github.com/logstash-plugins/logstash-integration-jdbc/pull/102)

Expand Down
8 changes: 7 additions & 1 deletion lib/logstash/inputs/jdbc.rb
Original file line number Diff line number Diff line change
Expand Up @@ -296,7 +296,13 @@ def run(queue)
if @schedule
# input thread (Java) name example "[my-oracle]<jdbc"
@scheduler = LogStash::PluginMixins::Jdbc::Scheduler.new(
:max_work_threads => 1, :thread_name => "[#{id}]<jdbc__scheduler"
:max_work_threads => 1,
:thread_name => "[#{id}]<jdbc__scheduler",
# amount the scheduler thread sleeps between checking whether jobs
# should trigger, default is 0.3 which is a bit too often ...
# in theory the cron expression '* * * * * *' supports running jobs
# every second but this is very rare, we could potentially go higher
:frequency => 1.0,
)
@scheduler.schedule_cron @schedule do
execute_query(queue)
Expand Down
42 changes: 39 additions & 3 deletions lib/logstash/plugin_mixins/jdbc/scheduler.rb
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,42 @@ class Scheduler < Rufus::Scheduler
TimeImpl = defined?(Rufus::Scheduler::EoTime) ? Rufus::Scheduler::EoTime :
(defined?(Rufus::Scheduler::ZoTime) ? Rufus::Scheduler::ZoTime : ::Time)

# @overload
def timeout_jobs
# Rufus relies on `Thread.list` which is a blocking operation and with many schedulers
# (and threads) within LS will have a negative impact on performance as scheduler
# threads will end up waiting to obtain the `Thread.list` lock.
#
# However, this isn't necessary we can easily detect whether there are any jobs
# that might need to timeout: only when `@opts[:timeout]` is set causes worker thread(s)
# to have a `Thread.current[:rufus_scheduler_timeout]` that is not nil
return unless @opts[:timeout]
super
end

# @overload
def work_threads(query = :all)
if query.nil? # our special case from JobDecorator#start_work_thread
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we are using the nil value as a special value, we are assigning to it a meaning. work_threads(nil) is invoked only by the patched methods in JobDecorator, could we use something really specific and explicit for this?
Maybe adding a new label :os_list_threads or something similar instead of nil ?

Copy link
Contributor Author

@kares kares Feb 10, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right, only used from JobDecorator - otherwise it's usually called with no args or a selector.
I wanted to have smt of a different type than the :symbol it supports, will try smt better than nil.

@_work_threads = nil # when a new worker thread is being added reset
return super(:all)
end

# Gets executed every time a job is triggered, we're going to cache the
# worker threads for this scheduler (to avoid `Thread.list`) - they only
# change when a new thread is being started from #start_work_thread ...
work_threads = @_work_threads
if work_threads.nil?
work_threads = threads.select { |t| t[:rufus_scheduler_work_thread] }
@_work_threads = work_threads
end

case query
when :active then work_threads.select { |t| t[:rufus_scheduler_job] }
when :vacant then work_threads.reject { |t| t[:rufus_scheduler_job] }
else work_threads
end
end

# @overload
def on_error(job, err)
details = { exception: err.class, message: err.message, backtrace: err.backtrace }
Expand Down Expand Up @@ -72,14 +108,14 @@ def do_schedule(job_type, t, callable, opts, return_job_instance, block)
module JobDecorator

def start_work_thread
prev_thread_count = @scheduler.work_threads.size
prev_thread_count = @scheduler.work_threads(nil).size

ret = super() # does not return Thread instance in 3.0

work_threads = @scheduler.work_threads
work_threads = @scheduler.work_threads(nil)
while prev_thread_count == work_threads.size # very unlikely
Thread.pass
work_threads = @scheduler.work_threads
work_threads = @scheduler.work_threads(nil)
end

work_thread_name_prefix = @scheduler.work_thread_name_prefix
Expand Down
2 changes: 1 addition & 1 deletion logstash-integration-jdbc.gemspec
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
Gem::Specification.new do |s|
s.name = 'logstash-integration-jdbc'
s.version = '5.2.2'
s.version = '5.2.3'
s.licenses = ['Apache License (2.0)']
s.summary = "Integration with JDBC - input and filter plugins"
s.description = "This gem is a Logstash plugin required to be installed on top of the Logstash core pipeline using $LS_HOME/bin/logstash-plugin install gemname. This gem is not a stand-alone program"
Expand Down
26 changes: 26 additions & 0 deletions spec/plugin_mixins/jdbc/scheduler_spec.rb
Original file line number Diff line number Diff line change
Expand Up @@ -49,4 +49,30 @@

end

context 'work threads' do

let(:opts) { super().merge :max_work_threads => 3 }

let(:counter) { java.util.concurrent.atomic.AtomicLong.new(0) }

before do
scheduler.schedule_cron('* * * * * *') { counter.increment_and_get; sleep 3.25 } # every second
end

it "are working" do
sleep(0.05) while counter.get == 0
expect( scheduler.work_threads.size ).to eql 1
sleep(0.05) while counter.get == 1
expect( scheduler.work_threads.size ).to eql 2
sleep(0.05) while counter.get == 2
expect( scheduler.work_threads.size ).to eql 3

sleep 1.25
expect( scheduler.work_threads.size ).to eql 3
sleep 1.25
expect( scheduler.work_threads.size ).to eql 3
end

end

end