Monitoring Enterprise File Transfer Solutions

Overview

Increasingly, corporations depend on file transfer solutions to handle critical business processes. Given the importance of these systems, operations personnel are faced with the daily challenge of ensuring they are always up and running. The consequences of failure are dire: without reliable file transfer systems, the result for businesses can be downtime and associated costs, including lost revenue.

Customers using FTP and other low-end digital file transfer mechanisms experience major pain and associated costs because of the difficulty of achieving the required uptime for critical file transfer workflows. This pain includes IT personnel time spent babysitting and monitoring systems through non-standard methods (i.e. methods not integrated with enterprise standards for monitoring networks and related systems) to ensure timely identification and response to issues.

In addition, they incur pain due to the difficulty and time-consuming nature of troubleshooting and resolving issues, particularly in relatively complex networking environments in which digital file transfer workflows exist.

Finally, they struggle to make good decisions given the lack of detailed information regarding file transfer processes. In general, due to the weaknesses of default file transfer processes, such as those based on FTP, customers incur substantial costs and pain because of their inability to effectively monitor and troubleshoot the systems and because of the resulting downtime and associated costs to the business.

One of the best ways to ensure that a critical business application stays up and running is to monitor the system using an industry-standard solution such as Microsoft System Center. By closely watching the system, IT personnel can detect and resolve problems quickly and efficiently. System uptime can be further enhanced by performing trend analysis on operational data collected over time and making adjustments to improve reliability and performance.

Unfortunately, many file transfer systems do not allow for monitoring, data collection and trend analysis. This is especially true of FTP-based solutions that lack the sophistication of Managed File Transfer systems.

Managed File Transfer to the Rescue

Unlike limited FTP products, many Managed File Transfer solutions provide complete support for monitoring using industry-standard tools and methodologies. Using the standard Windows Management Instrumentation (WMI) interfaces, IT and systems personnel can connect their MFT processes to standard enterprise monitoring tools (such as Microsoft System Center), apply standard processes to monitor digital file transfer, and quickly identify and resolve problems. This allows them to leverage their investment in enterprise network monitoring and the associated training and processes.

In addition, using tools like Perfmon, they can collect detailed information that aids troubleshooting and speeds the resolution of issues.

Finally, using data collected via Perfmon or equivalent tools, IT personnel can provide valuable information to IT and business management for longer-term decision-making.
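As a minimal sketch of how Perfmon data might be prepared for this kind of reporting, the following Python reads a counter log that Perfmon has saved in its CSV format. It assumes the usual layout of such logs: a header row whose first column describes the timestamps, followed by one column per counter path. The function name, the server name SERVER01, the counter path and the sample values are all made up for illustration and are not taken from MassTransit documentation.

```python
import csv
import io
from datetime import datetime


def read_perfmon_csv(text):
    """Parse Perfmon CSV text into {counter_path: [(timestamp, value), ...]}."""
    rows = csv.reader(io.StringIO(text))
    header = next(rows)
    counters = header[1:]  # column 0 holds the sample timestamps
    series = {name: [] for name in counters}
    for row in rows:
        if not row:
            continue
        stamp = datetime.strptime(row[0], "%m/%d/%Y %H:%M:%S.%f")
        for name, cell in zip(counters, row[1:]):
            try:
                series[name].append((stamp, float(cell)))
            except ValueError:
                continue  # blank cell: no sample was recorded
    return series


# Made-up sample log; the counter path is hypothetical.
SAMPLE = "\n".join([
    r'"(PDH-CSV 4.0) (Eastern Daylight Time)(240)","\\SERVER01\MassTransit\Failed User Logins"',
    r'"10/08/2008 11:00:00.000","2"',
    r'"10/08/2008 11:15:00.000","57"',
    r'"10/08/2008 11:30:00.000"," "',
])

for path, samples in read_perfmon_csv(SAMPLE).items():
    print(path, len(samples), "samples")
```

Once the samples are in this form, they can be summarized or charted with whatever reporting tools the organization already uses.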

The bottom line: WMI interfaces, and the monitoring and data collection they permit, give IT personnel powerful tools to reduce downtime and associated costs, inform critical decisions, and ensure that their MFT workflows meet the reliability and availability expectations of users and the business.

Current monitoring and data collection tools provided by a variety of MFT solutions offer valuable business benefits including:

  • Reduced Downtime and Associated Costs – By giving IT personnel tools to monitor digital file transfer workflows effectively, so that problems can be identified and resolved quickly, MFT solutions cut the costs of downtime, both IT personnel time and lost business productivity. In addition, by providing tools to collect and analyze detailed information with standard troubleshooting tools such as Perfmon, MFT solutions help IT personnel understand and troubleshoot issues quickly so that fixes can be identified and implemented as fast as possible.
  • Increased Leverage of Enterprise Network Monitoring Solutions – By allowing digital file transfer workflows to be monitored using standard enterprise monitoring tools (such as Microsoft System Center), MFT increases the value of these solutions and reduces training and management costs for digital file transfer business processes.
  • Improved Quality, Speed and Cost of Decision Making – By providing business users with a detailed understanding of the capabilities of their MFT deployment, monitoring and analytics tools allow business managers to make better business decisions, particularly high-stakes decisions with major cost and revenue implications. For service providers, this can mean increased sales effectiveness and revenue, better customer service, an increased ability to price services properly and, in general, better decisions that result in more efficient operations, happier customers and more revenue. For operations, this improves the organization's ability to make good decisions about bringing products to market and promoting them where MFT is an important element of the operation.

MassTransit from Group Logic

MassTransit's monitoring and trend analysis capabilities provide IT personnel with a powerful set of tools to ensure that their mission critical, MassTransit-based file transfer processes deliver on user expectations for high reliability and avoid costly downtime, missed deadlines and lost revenue.

The balance of this document describes how to perform monitoring and trend analysis using MassTransit's applicable feature set. Its goal is to provide a basic understanding of how to monitor a MassTransit-based managed file transfer system to ensure maximum uptime. For more detailed information or a demonstration, contact Group Logic at www.grouplogic.com or +1.703.528.1555/1.800.476.8781.

NOTE: This document applies only to MassTransit 6.0 Professional and Enterprise servers installed on Windows Server 2003.

Monitoring with MassTransit

MassTransit offers features that allow IT personnel to monitor it so they can ensure that file transfer business processes experience minimal downtime. Specifically, the product offers a Windows Management Instrumentation (WMI) interface that allows the application to be monitored with external tools designed for that purpose. Through this WMI interface, customers can configure tools such as Microsoft Perfmon and Microsoft System Center to monitor up to twenty-six (26) different parameters that together provide detailed, comprehensive information regarding the operation of a MassTransit server.

Using the capabilities of these monitoring solutions, IT personnel can establish processes that ensure reliability and uptime objectives are met. For example, customers can monitor the processes that make up the MassTransit system to determine if they are up or down. If the monitoring system determines that one or more of these elements are down, IT personnel can be alerted so they can take corrective action. The following section illustrates how to achieve this using Microsoft System Center, one common tool used by enterprises to monitor systems.

MassTransit includes several system elements which are critical to its operation. If any one of these system elements is not active, MassTransit will be unable to transfer files.

  1. MassTransit Service: This Windows Service is the file transfer engine of the MassTransit system. It handles all connections and file exchange based on the system configuration.
  2. MySQL Service: This Windows Service is the database engine of the MassTransit System. It stores configuration, log and other information.
  3. Listens: Listens are elements of the MassTransit configuration that cause the product to "listen" for inbound connections. Without one or more active Listens configured, the MassTransit system cannot connect to clients and servers that contact it and cannot exchange files with them.

Microsoft System Center can be used to monitor each of these three key system elements through its standard capabilities, as well as through MassTransit's WMI interfaces.

  1. Total Listens – MassTransit's WMI interfaces provide access to a Total Listens performance counter. Create a "rule" in System Center to monitor this performance counter and to raise alarms and send e-mail notifications (see below).
  2. MassTransit Service – this Windows Service, like any other, can be monitored using standard Windows and System Center capabilities. Create and configure an Operations Manager "monitor" to track and report on the state of the MassTransit Service.
  3. MySQL Service – as with the MassTransit Service, the MySQL Windows Service can be monitored using standard tools. Create and configure a System Center "monitor" to track and report on the state of the MySQL Service.
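The logic behind these three checks can be sketched in a few lines of Python. This is only an illustration of the decision the monitoring tool makes, not how System Center or MassTransit implements it; the ElementStatus class, the find_alarms function and the alarm wording are hypothetical, and the status values are assumed to have been gathered already by the monitoring tool.

```python
from dataclasses import dataclass


@dataclass
class ElementStatus:
    """Observed state of the three critical MassTransit elements (hypothetical type)."""
    masstransit_service_running: bool
    mysql_service_running: bool
    total_listens: int  # value read from the Total Listens counter


def find_alarms(status: ElementStatus) -> list[str]:
    """Return one alarm message per critical element that is down."""
    alarms = []
    if not status.masstransit_service_running:
        alarms.append("MassTransit Service is not running")
    if not status.mysql_service_running:
        alarms.append("MySQL Service is not running")
    if status.total_listens < 1:
        alarms.append("No active Listens are configured")
    return alarms


# Made-up sample: the database is down, everything else is healthy.
sample = ElementStatus(True, False, 2)
for message in find_alarms(sample):
    print("ALARM:", message)
```

Because any single failed element prevents file transfer, each condition raises its own alarm rather than being combined into one overall status.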

Once Microsoft System Center is configured to monitor these MassTransit processes, IT personnel can use its interfaces to keep track of the MassTransit system state. In particular, with proper configuration, System Center can raise alarms if one or more elements are in a down state.

The following screen shot shows the System Center interface indicating two alarms – one indicating a problem with the MySQL Service and one indicating a problem with the MassTransit Engine.

In addition to displaying the alarm in its user interface, System Center should be configured to proactively alert IT personnel that a failure has occurred. A recommended option is to configure it to send an e-mail notification to designated IT staff indicating a MassTransit issue and providing details. The following is a screen shot of an example e-mail notification generated by System Center in response to an alarm:

Performance Counter Metrics, Tracking and Analysis

As described above, MassTransit offers twenty-six (26) performance counters, each of which provides useful information about the operation of the MassTransit system. It is good practice for IT personnel to establish baseline metrics for key performance counters for their MassTransit systems. Then, by tracking and analyzing long-term trends in the counters relative to these baselines, they can identify potential system problems to resolve as well as opportunities for system optimization.
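One simple way to express a baseline, shown here only as a sketch, is the mean and standard deviation of historical samples, with new readings flagged when they sit well above that range. The function names, the three-standard-deviation tolerance and the sample values are assumptions chosen for illustration; each site would choose its own thresholds.

```python
import statistics


def build_baseline(samples):
    """Summarize historical counter samples as (mean, standard deviation)."""
    return statistics.mean(samples), statistics.pstdev(samples)


def exceeds_baseline(value, baseline, tolerance=3.0):
    """True when value is more than `tolerance` standard deviations above the mean."""
    mean, stdev = baseline
    return value > mean + tolerance * stdev


# Made-up hourly samples for a single counter.
history = [2, 3, 1, 4, 2, 3, 2, 3]
baseline = build_baseline(history)
print(exceeds_baseline(3, baseline))
print(exceeds_baseline(40, baseline))
```

With these sample values, a reading of 3 stays within the baseline range and a reading of 40 is flagged. A perfectly flat history has a standard deviation of zero, so any reading above the mean would be flagged in that case.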

The following are some recommended MassTransit performance counters for which customers should collect and analyze trend data:

  • Free Disk Space: This counter should be monitored to determine the rate at which disk space is being consumed. Analyzing this information will assist in establishing the proper frequency and scope of file cleanup on the system (e.g. periodic file purge). Properly configured file cleanup is critical to ensure that the MassTransit system does not run out of disk space and incur costly downtime.
  • Failed User Logins: Failed logins should be monitored to help establish whether the system is under attack. A sharp rise relative to the baseline may indicate repeated unauthorized access attempts.
  • Active Connections: Connections should be monitored to help establish whether the system is under attack. An unusually high number of connections relative to the baseline warrants investigation.
  • SQL Commands: SQL commands are issued whenever the MassTransit system is transferring files and/or updating its log, reporting and administrative information. A dramatic increase in the number of SQL commands compared to established baselines may indicate a potential problem with the system.
  • SOAP Calls: SOAP requests are issued whenever MassTransit is processing web-based information. A dramatic increase in the number of SOAP requests relative to established baselines may indicate a potential problem.
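For the Free Disk Space counter in the list above, the rate of consumption can be estimated by fitting a straight line through the collected samples. The following is a minimal sketch under the assumption that samples are available as (hours elapsed, free gigabytes) pairs; the function names and units are hypothetical, and a straight-line projection is only a rough guide when usage is bursty.

```python
def consumption_rate(samples):
    """Least-squares slope of free space over time.

    samples: list of (hours_elapsed, free_gb) pairs.
    Returns GB per hour; a negative value means free space is shrinking.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_f = sum(f for _, f in samples) / n
    num = sum((t - mean_t) * (f - mean_f) for t, f in samples)
    den = sum((t - mean_t) ** 2 for t, _ in samples)
    if den == 0:
        raise ValueError("samples must span more than one point in time")
    return num / den


def hours_until_full(samples):
    """Projected hours until free space reaches zero, or None if it is not shrinking."""
    rate = consumption_rate(samples)
    if rate >= 0:
        return None
    latest_free = samples[-1][1]
    return latest_free / -rate


# Made-up samples: 2 GB consumed per day.
daily = [(0, 100.0), (24, 98.0), (48, 96.0)]
print(hours_until_full(daily))
```

Comparing the projected time until the disk is full with the interval between file purges indicates whether cleanup runs often enough.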

Among other tools, Microsoft System Center can be used to collect data and analyze trends in MassTransit file transfer system performance. As with the monitoring of critical processes described in the previous section, System Center can be configured with "rules" that monitor any one or more of MassTransit's twenty-six performance counters, including those listed above, through MassTransit's WMI interfaces. Each configured rule collects data that can then be analyzed using System Center's user interfaces.

For example, the screen shot below shows data collected over time for two MassTransit performance counters – "Failed User Logins" and "Active Connections." Specifically, the screen shot is a chart produced by System Center showing data for these two counters collected over an approximately one-hour period.

Reviewing the data shows an obvious spike in the number of Failed User Logins starting at 11:15AM on 10/8/2008. This spike could be the result of a malicious attack on the MassTransit system and could lead to system failure – or indicate a security breach. By monitoring trends in the Failed User Logins data, IT personnel can identify this spike and work to confirm its source and resolve the problem.

The Windows Operating System also offers performance counters that provide useful information regarding MassTransit system operation. Specifically, Windows has performance counters that can be used to collect information on CPU usage by the MassTransit File Transfer Engine and the MySQL Database processes. It is good practice to track CPU usage of these processes and review trends to spot potential issues. The following section summarizes how to configure System Center to perform this tracking.

  1. MassTransit Service: As described above, the MassTransit Service (process) is the "engine" of the product's managed file transfer capabilities. It must perform well to ensure optimized file transfer processes, so monitoring its CPU load for trends and issues is a valuable activity. Create and configure a System Center "rule" to collect data over time on the MassTransit process CPU load.
  2. MySQL Service: As described above, the MySQL Service (process) is key to the product's managed file transfer capabilities. It must perform well to guarantee a smoothly running file transfer system, so monitoring its impact on the system CPU is important. Create and configure a System Center "rule" to collect data over time on the MySQL process CPU load.
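CPU readings for a single process tend to be noisy, so a trend is often easier to judge from a moving average than from individual samples. The sketch below illustrates that idea only; the function names, the three-sample window and the 80 percent limit are assumptions for the example and are not recommended settings.

```python
from collections import deque


def moving_average(values, window):
    """Yield the mean of the most recent `window` values once the window is full."""
    if window < 1:
        raise ValueError("window must be at least 1")
    recent = deque(maxlen=window)
    for value in values:
        recent.append(value)
        if len(recent) == window:
            yield sum(recent) / window


def sustained_high_cpu(cpu_percent, window=3, limit=80.0):
    """True if any `window`-sample average of CPU usage exceeds `limit` percent."""
    return any(avg > limit for avg in moving_average(cpu_percent, window))


# Made-up CPU samples (percent) for one process.
print(sustained_high_cpu([10, 95, 12, 11]))
print(sustained_high_cpu([50, 85, 90, 95]))
```

In the first made-up series a single brief spike is averaged away, while in the second the load stays high across the whole window and is reported.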

As with the above examples, once these "rules" are configured in Microsoft System Center and data is collected, the data can be analyzed using System Center's interfaces.

The following screen shot shows data collected over time for CPU usage by the MassTransit Engine. Reviewing the data shows a large spike in CPU usage. By monitoring trends such as this in the MassTransit Engine data, IT personnel can identify the increased CPU usage, confirm its source and resolve the problem.

Conclusion

Businesses increasingly depend on file transfer solutions to handle critical processes. With the growing importance of these systems, customers must recognize that the consequences of system failure are dire: downtime and associated costs including lost revenue. Unfortunately, many file transfer systems do not allow for monitoring, data collection and trend analysis. This is especially true of FTP-based solutions that lack the sophistication of Managed File Transfer systems.