This topic discusses options for performing database queries when migrating from a Mainframe to .NET and SQL Server. Because of the architectural differences between the Mainframe and SQL Server, what works well in the Mainframe world might not be a good practice when translated directly into the .NET + Microsoft SQL Server world.
In Mainframe batch processes, workloads typically use one central loop over a database cursor to fetch the objects to be processed, and then issue tens of separate queries for each object. In Mainframe architectures, the application code usually executes in an environment that is very close to the database, so the latency of database interactions is extremely low. Because of this, Mainframe applications often make liberal use of the database, and their interactions with it often end up being very "chatty".
In the .NET + SQL Server world, however, the client-server architecture means that each query incurs not only execution cost but also network latency, communication, serialization, and deserialization costs. The best practice is to reduce database roundtrips through batch processing.
This document describes some general performance recommendations, several query batching strategies with their pros and cons, and a list of helpful performance tuning resources.
For more information, see the SQL Server Index Design Guide.
During migration, changing application logic or code comes with a cost. Decisions must balance the cost of code changes against the performance gains of adopting the best practices and patterns of .NET and SQL Server. Sometimes it might cost less to translate Mainframe applications directly into the .NET + SQL Server world. In that case, it can be helpful to use automated solutions that translate and host Mainframe applications in .NET + SQL Server environments.
Direct translation of Mainframe applications into the .NET + SQL Server world. Third-party solutions are available that can help automate the translation and hosting of, for example, COBOL + DB2 applications without requiring any manual code changes or rewrites.
Database queries are performed by invoking stored procedures that contain the original queries translated into T-SQL, using the SqlCommand.ExecuteReader() and DataReader APIs from ADO.NET.
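As a minimal sketch of that call pattern, the following example invokes a translated stored procedure through SqlCommand and reads the results with a DataReader. The procedure name dbo.GetCustomerOrders, its parameter, the OrderId column, and the connection string are hypothetical placeholders, not names from any migrated application.

```csharp
using System;
using System.Data;
using System.Data.SqlClient; // Microsoft.Data.SqlClient in newer projects

static class TranslatedQuery
{
    // Builds a command configured to invoke the (hypothetical) translated
    // stored procedure with one typed parameter.
    public static SqlCommand BuildCommand(SqlConnection connection, int customerId)
    {
        var command = new SqlCommand("dbo.GetCustomerOrders", connection)
        {
            CommandType = CommandType.StoredProcedure
        };
        command.Parameters.Add("@CustomerId", SqlDbType.Int).Value = customerId;
        return command;
    }

    // Opens the connection, executes the procedure, and streams the rows.
    public static void Run(string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = BuildCommand(connection, 42))
        {
            connection.Open();
            using (var reader = command.ExecuteReader())
            {
                while (reader.Read())
                {
                    Console.WriteLine(reader["OrderId"]); // placeholder column name
                }
            }
        }
    }
}
```

Note that each such call is one roundtrip; the batching strategies below aim to reduce how often this pattern is invoked.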
When the performance gain outweighs the cost of the code changes, several batching strategies can improve performance.
When the same T-SQL query or stored procedure needs to be executed multiple times, each time with a different argument, it is better to send these queries to the server together and retrieve the results in a batch, instead of making one roundtrip per execution. There are several approaches to batching queries; all of them will likely require logic or code changes to the original Mainframe applications.
When the table being queried is not updated for extended periods and its size is moderate, the .NET client application can keep a local cache of the table and query against the cache instead.
Perform one database query to retrieve all the data of the table into the .NET client, then run subsequent queries against the local cache.
The following ADO.NET example uses a DataSet and a DataView sorted on the queried column, which gives the view an internal index for lookups. Queries are performed by calling DataView.FindRows().
Filling dataset and creating data view:
var connection = new SqlConnection(connectionString);
var queryString = "SELECT * FROM tablename";
var adapter = new SqlDataAdapter(queryString, connection);
var dataSet = new DataSet();
adapter.Fill(dataSet, "tablename");
var dataView = new DataView(dataSet.Tables["tablename"], string.Empty, "id ASC", DataViewRowState.OriginalRows);
Querying using data view:
var rows = dataView.FindRows(queryParameter);
Queries against the local cache run very fast, and unnecessary roundtrips to SQL Server are avoided.
There are cases where caching database tables doesn't make sense: for example, when the tables are updated frequently, when they are too big to fit in the client machine's memory, or when the original Mainframe code mainly performs DB2 server-side batching. In these cases, it might be better to migrate the workloads into SQL Server T-SQL and do most of the work on the SQL Server side.
The query that fetches an object to process, as well as the subsequent queries for that object, are all performed in T-SQL stored procedure(s) on the SQL Server side.
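As a hypothetical sketch of this idea (the procedure, tables, and columns below are invented for illustration), the Mainframe pattern of looping over a cursor and issuing per-object queries from the client can often be collapsed into one set-based statement inside a stored procedure:

```sql
-- Hypothetical sketch: procedure, table, and column names are placeholders.
-- The client-side loop plus per-row queries becomes a single set-based
-- statement that SQL Server can optimize as one operation.
CREATE PROCEDURE dbo.ProcessPendingOrders
AS
BEGIN
    SET NOCOUNT ON;

    UPDATE o
    SET o.Status = 'Processed',
        o.Total  = o.Total * c.DiscountRate
    FROM dbo.Orders AS o
    JOIN dbo.Customers AS c
        ON c.Id = o.CustomerId
    WHERE o.Status = 'Pending';
END
```

The client then makes a single stored-procedure call per batch instead of one roundtrip per object.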
Caching database tables can consume a large amount of memory on the client side. When memory usage becomes a concern or a blocker, a hybrid client/server batching approach can reduce the memory footprint of the client application while still achieving a very low cost per query.
This approach groups keys into a batch and then sends them to SQL Server in a single query, either by passing a table-valued parameter (TVP) to a stored procedure, or by executing SELECT statements with the batched ids in an IN clause, for example:
SELECT * FROM table WHERE Id in (71, 52, 33, 412, …, 25930)
The table-valued parameter approach generally performs better, but the SELECT … IN (…) statement is easier to code.
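The client-side half of both options can be sketched as follows. The names are illustrative assumptions: "tablename" matches the SELECT above, and the TVP variant assumes a user-defined table type with a single Id column exists on the server.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq;

static class KeyBatching
{
    // Option 1: a parameterized IN (...) query for one batch of ids.
    // Parameter placeholders (rather than concatenated literal ids) allow
    // SQL Server to reuse the query plan for batches of the same size.
    public static string BuildInClauseQuery(IReadOnlyList<int> ids)
    {
        var placeholders = string.Join(", ", ids.Select((_, i) => "@p" + i));
        return $"SELECT * FROM tablename WHERE Id IN ({placeholders})";
    }

    // Option 2: pack the ids into a DataTable whose single column matches
    // the (assumed) user-defined table type, to be passed as a TVP.
    public static DataTable BuildIdTable(IEnumerable<int> ids)
    {
        var table = new DataTable();
        table.Columns.Add("Id", typeof(int));
        foreach (var id in ids)
        {
            table.Rows.Add(id);
        }
        return table;
    }
}
```

With ADO.NET, the DataTable is attached to a SqlParameter whose SqlDbType is Structured and whose TypeName is the user-defined table type, and the query or stored procedure then joins against the parameter instead of an IN list.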
Fewer roundtrips to SQL Server than singleton queries, with a tunable memory footprint that is smaller than caching everything in the client.
The .NET Framework Performance documentation contains an introduction to .NET performance design and analysis, with links to various tools and techniques.
Improving .NET Application Performance and Scalability is still a valuable resource even though it is marked as retired content. The principles, rules, and guidelines for measuring, testing, and tuning performance largely remain the same.
The Performance Center for SQL Server provides a wealth of information on improving SQL Server performance.
Measuring performance and collecting metrics helps determine whether an application meets its performance goals and identifies bottlenecks. Microsoft provides general-purpose profilers and analysis tools such as the Windows Performance Toolkit and PerfView.