Can a File Reader read more than one input file at a time? #4548
-
There is the Max Processing Threads setting, but would that mean a File Reader could read more than one file at a time? I don't think so, but I want to make sure. I'm looking for a way to improve the performance of a File Reader when there are many input files sitting in its input directory. Any and all suggestions welcome.
Replies: 2 comments 5 replies
-
That's right, polling connectors like the File Reader would still only read one file at a time. Theoretically you could have other channels sending messages to that File Reader channel, and those messages could process at the same time, though that probably wouldn't be a common scenario. You could also turn on the source queue, meaning the File Reader would read one file at a time and place each message into the queue, and then you could have multiple queue threads pulling from that queue to process messages in parallel. But you'd probably only see a performance increase there if processing a message takes longer than reading it from the filesystem. Turning on the source queue does carry a performance hit, but if the time it takes to process a message through the channel is significant, it could be a net gain overall. Adding the ability for the File Reader to read multiple files in parallel would be a good enhancement request.
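To make the source queue idea concrete, here is a minimal sketch in plain Java. It is not Mirth Connect's actual implementation or API; the class name `SourceQueueSketch`, the `run` method, and the `processor` callback are all hypothetical. It only shows the general shape: one reader puts messages on a queue sequentially, and several worker threads drain it in parallel.

```java
import java.util.List;
import java.util.Optional;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical illustration only; not Mirth Connect code.
public class SourceQueueSketch {

    public static void run(List<String> messages, int workers, Consumer<String> processor)
            throws InterruptedException {
        // An empty Optional marks "no more messages" for one worker.
        BlockingQueue<Optional<String>> queue = new LinkedBlockingQueue<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);

        // Queue threads: each pulls messages and processes them in parallel.
        for (int i = 0; i < workers; i++) {
            pool.submit(() -> {
                try {
                    while (true) {
                        Optional<String> msg = queue.take();
                        if (!msg.isPresent()) {
                            break; // end marker reached
                        }
                        processor.accept(msg.get());
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }

        // Single reader: stands in for a File Reader that reads one file at a time.
        for (String message : messages) {
            queue.put(Optional.of(message));
        }
        // One end marker per worker so every thread stops.
        for (int i = 0; i < workers; i++) {
            queue.put(Optional.empty());
        }

        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.MINUTES);
    }
}
```

With this shape, adding workers only helps when `processor` is slower than the reader loop, which is the same trade-off described above.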
-
Usually, when a directory holds a vast number of files, the bottleneck in the connector is reading the directory contents rather than the files themselves, because scanning the directory structure takes extremely long. Using a DirectoryStream circumvents this issue, as it allows iterating over the directory contents without doing a complete scan first. This has the potential to improve File Reader performance.
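As a rough illustration of the DirectoryStream approach, the sketch below uses the standard `java.nio.file.Files.newDirectoryStream` call to pick up a limited batch of files without first building a full listing. The class name `LazyDirectoryScan`, the `nextBatch` method, and the glob and limit parameters are assumptions made for the example, not part of the File Reader connector.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Mirth Connect.
public class LazyDirectoryScan {

    /**
     * Returns up to {@code limit} regular files in {@code dir} matching {@code glob}.
     * Entries are read lazily, so iteration can stop early in a huge directory.
     * The iteration order is unspecified.
     */
    public static List<Path> nextBatch(Path dir, String glob, int limit) throws IOException {
        List<Path> batch = new ArrayList<>();
        if (limit <= 0) {
            return batch;
        }
        // try-with-resources closes the underlying directory handle.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path entry : stream) {
                if (Files.isRegularFile(entry)) {
                    batch.add(entry);
                    if (batch.size() >= limit) {
                        break; // stop without visiting the remaining entries
                    }
                }
            }
        }
        return batch;
    }
}
```

Because the order is unspecified, a connector that must process files by date or name would still need to read and sort the full listing, so this only helps when ordering does not matter.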