Sunil O
2018-09-27 12:16:09 UTC
We have an implementation scenario in which a File consumer polls a large
number of folders (3000+) and about 200K files per day, with a 1-minute SLA
for processing each file.
We also need to use readLock to avoid picking up files that are still being
written.
For this scenario we chose the readLock=changed option. However, this option
makes the thread sleep for at least the period specified by the
readLockCheckInterval option. While looking for workarounds, we found the
readLockMinAge option, which allows files that are old enough to be picked
up without going into sleep mode. This has reduced the time for picking up
files and thereby the overall processing time.
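For reference, the option combination described above can be sketched as a
plain endpoint URI. This is only an illustration: the ReadLockEndpointUri
helper, the directory and the millisecond values are assumptions made up for
the example, and only the three option names come from the file component.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper (not part of Camel): assembles a file endpoint URI
// that combines readLock=changed with readLockCheckInterval and readLockMinAge.
final class ReadLockEndpointUri {

    private ReadLockEndpointUri() {
    }

    static String build(String directory, long checkIntervalMillis, long minAgeMillis) {
        // LinkedHashMap keeps the options in insertion order in the final URI.
        Map<String, String> options = new LinkedHashMap<>();
        options.put("readLock", "changed");
        options.put("readLockCheckInterval", Long.toString(checkIntervalMillis));
        options.put("readLockMinAge", Long.toString(minAgeMillis));

        String query = options.entrySet().stream()
                .map(e -> e.getKey() + "=" + e.getValue())
                .collect(Collectors.joining("&"));
        return "file:" + directory + "?" + query;
    }
}
```

The resulting string would be passed to from(...) in a route; the values
themselves need tuning per installation.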
However, whenever a file younger than the minimum age is encountered, the
sleep still occurs as per the logic in FileChangedExclusiveReadLockStrategy.
If this sleep step could be avoided when readLockMinAge is specified, then
instead of sleeping the consumer could go on to pick up other files. This
modified behavior would be useful in scenarios where overall throughput and
processing performance are more important than sequential processing.
While browsing similar issues, we found JIRA issue 9324, which also discusses
the issue regarding the Sleep step -
So it would be good if one of the following were available:
a) a separate ExclusiveReadLockStrategy, similar to
FileChangedExclusiveReadLockStrategy, which deals only with readLockMinAge
and skips a file if its age is not met instead of sleeping.
Or
b) a skip/sleep option added to FileChangedExclusiveReadLockStrategy for
use when readLockMinAge is set.
Please give your suggestions.
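
To make option (a) more concrete, here is a minimal sketch of the skip logic
in plain Java. It is not Camel code and does not implement
ExclusiveReadLockStrategy; the class and method names (MinAgeSkipSketch,
isOldEnough, selectReady) are hypothetical. It only shows the decision we
have in mind: a file that is too young is left for the next poll instead of
putting the thread to sleep.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch, not Camel API: an age-only check that skips instead of sleeping.
final class MinAgeSkipSketch {

    private MinAgeSkipSketch() {
    }

    // True when the file was last modified at least minAge before 'now'.
    static boolean isOldEnough(Instant lastModified, Duration minAge, Instant now) {
        return !lastModified.plus(minAge).isAfter(now);
    }

    // Reads the last-modified timestamp, wrapping the checked exception.
    static Instant lastModified(Path file) {
        try {
            return Files.getLastModifiedTime(file).toInstant();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // One poll pass: collect files that meet the minimum age and skip the
    // others immediately, so a young file never delays the rest.
    static List<Path> selectReady(List<Path> candidates, Duration minAge, Instant now) {
        List<Path> ready = new ArrayList<>();
        for (Path file : candidates) {
            if (isOldEnough(lastModified(file), minAge, now)) {
                ready.add(file);
            }
            // Files below the minimum age are left for the next poll.
        }
        return ready;
    }
}
```

A skipped file would simply be re-evaluated on the next poll, which trades
strict ordering for throughput, as described above.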