The problem I ran into was that users were unable to view reports in SyteLine. At the same time, various timeout error messages started appearing when users attempted seemingly unrelated actions in the system, such as creating a new item or creating an order.

An example of this behavior is the Item Detail Report. When users ran the report, the Report Viewer tab would come up with the message “Waiting for the report to complete”. Normally this process takes between 5 and 10 seconds; this time the report would either take a very long time to be displayed (upwards of 20 minutes) or hang indefinitely.

image

Interestingly enough, when I checked the Background Task History form, I saw that the task had completed successfully within an acceptable time (~10 seconds).

image

In addition, if I navigated to the location on the utility server where reports are stored (the default path is C:\Program Files (x86)\Infor\SyteLine\Report\OutputFiles\%username%\), I would find the PDF for the report there. Double-clicking it opened the file and displayed the report correctly.

Looking at the underlying hardware showed that system performance had degraded significantly. There was a marked increase in CPU and memory utilization on the back-end SQL server, as well as a tremendous spike in IOPS on the storage subsystem (SAN).

image

It appeared that the iSCSI volume storing SQL transaction logs was the culprit. Note that Writes are approaching 100%, and that the volume is consuming more than 70% of all IOPS (two arrays combined).

The conclusion was that something was writing to the SQL Server transaction logs at an incredible rate. This matched what we had observed: the transaction logs had grown from about 2 GB to more than 70 GB over the course of 24 hours. This prompted us to look into the SyteLine configuration and any recent changes made to it.

Ultimately we looked at the Intranets form, specifically the Reports/TaskMan tab. One thing stood out: the Polling Interval was set to 5.
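To put that growth in perspective, here is a rough back-of-the-envelope calculation in Python. The function name is made up for illustration, and since the 2 GB and 70 GB figures are approximate, the result is only an order-of-magnitude estimate.

```python
def log_growth_rate(start_gb: float, end_gb: float, hours: float):
    """Return (GB per hour, MB per second) for a log that grew over `hours`."""
    if hours <= 0:
        raise ValueError("hours must be positive")
    delta_gb = end_gb - start_gb
    gb_per_hour = delta_gb / hours
    # 1 GB is treated as 1024 MB here; 3600 seconds per hour.
    mb_per_second = delta_gb * 1024 / (hours * 3600)
    return gb_per_hour, mb_per_second


gb_per_hour, mb_per_second = log_growth_rate(2, 70, 24)
print(f"{gb_per_hour:.1f} GB/hour, {mb_per_second:.2f} MB/s sustained")
```

With the numbers above this works out to roughly 2.8 GB per hour, or about 0.8 MB of log writes every second around the clock.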

image

The polling interval controls how frequently Task Manager (the Infor TaskMan service) looks in the Background Task History table (on SQL) for new reports to run. The value is in milliseconds. The default for this field is blank, which is treated as 5,000 (that is, 5 seconds). The problem here was that the value entered was 5, which meant that TaskMan running on the utility server was polling the SQL server every 5 milliseconds. This translates into 200 queries per second (1,000 ms / 5 ms) for each SyteLine Application database that has this polling interval value (mis)configured.

In this particular case, the polling interval was set to 5 milliseconds in a multisite environment consisting of 12 App databases, which resulted in about 2,400 queries per second (12 × 200) on top of normal system usage.

What amplified the problem was a Replication Rule in place for the Site Admin category, which is typically configured to replicate from every site and entity to every site and entity.
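The arithmetic behind these figures can be sketched in a few lines of Python. The function name is hypothetical, and the sketch assumes each poll issues exactly one query per database; the real number of statements TaskMan runs per poll may differ.

```python
def queries_per_second(polling_interval_ms: float, databases: int = 1) -> float:
    """Approximate polls per second, assuming one query per poll per database."""
    if polling_interval_ms <= 0:
        raise ValueError("polling interval must be positive")
    return databases * 1000.0 / polling_interval_ms


print(queries_per_second(5))         # misconfigured value, one database
print(queries_per_second(5, 12))     # misconfigured value, 12 App databases
print(queries_per_second(5000, 12))  # default of 5,000 ms, 12 App databases
```

Under that assumption, the same 12 databases at the default interval would generate only about 2.4 polls per second, a thousand times fewer than with the value set to 5.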

image

This rule replicates everything from the Intranets form as well as the Sites/Entities form (among other things) to all other sites and entities in a multisite environment. As a result, entering the wrong polling interval value in just one site inadvertently propagates it throughout the whole environment.

Correcting this problem is relatively straightforward. All that needed to be done was to remove the value (5) from the Polling Interval box on the Reports/TaskMan tab in the Intranets form. This had to be done in all App databases. After making these changes, the Infor Framework Task Man service on the utility server had to be restarted.

Within moments we saw a reduction in resource utilization across the board.

image
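Because a single bad value can replicate to every site, it may be worth sanity-checking the polling intervals after any change to the Intranets form. Below is a minimal Python sketch of such a check. The site names, the 1,000 ms threshold, and the idea of collecting the values into a dictionary are all assumptions for illustration; how you read the values out of each App database is left out.

```python
from typing import Dict, Optional


def find_suspicious_intervals(
    intervals_by_site: Dict[str, Optional[int]],
    minimum_ms: int = 1000,  # assumed threshold, not an Infor recommendation
) -> Dict[str, int]:
    """Return the sites whose polling interval is set but below minimum_ms."""
    suspicious = {}
    for site, value in intervals_by_site.items():
        if value is None:
            # Blank field: the default interval applies, nothing to flag.
            continue
        if value < minimum_ms:
            suspicious[site] = value
    return suspicious


# Hypothetical values collected from three sites (None means blank).
sites = {"SITE_A": None, "SITE_B": 5, "SITE_C": 5000}
print(find_suspicious_intervals(sites))
```

For the sample data this flags only SITE_B, the site with the interval of 5.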
