SUMMARY
This step-by-step article describes how to implement and
how to use a
DataSetHelper class that includes sample code to create a
DataTable object that contains aggregate data (including optional rollup
totals similar to the Microsoft SQL Server WITH ROLLUP keyword) from another
DataTable object.
To do this, use the following public methods:
- CreateGroupByTable
- InsertGroupByInto
- SelectGroupByInto
The
SelectGroupByInto method calls the
CreateGroupByTable method and the
InsertGroupByInto method. You can also use private helper methods and data members
to store the parsed field list and to handle NULL values correctly when the
aggregate values are calculated.
The
DataSetHelper class includes a
DataSet member variable. Optionally, you can assign an existing
DataSet to the
DataSet member variable. If the member variable points to a valid
DataSet, any
DataTable objects that the
CreateGroupByTable or the
SelectGroupByInto method creates are added to the
DataSet. In either case, the method call returns a reference to the
DataTable object. The
InsertGroupByInto method requires an existing target
DataTable and does not return anything.
For additional
information about DataSet objects, click the following article number to view
the article in the Microsoft Knowledge Base:
313485
INFO: Roadmap for ADO.NET DataSet, DataView, and DataViewManager Objects
back to the
topRequirements
The following list outlines the recommended hardware, software,
network infrastructure, and service packs that are required:
- Microsoft Windows XP, Windows 2000, or Windows NT 4.0
Service Pack 6a
- Microsoft Visual Studio .NET
This article assumes that you are familiar with the following
topics:
- Visual Basic .NET syntax
- ADO.NET fundamentals and syntax
back to the topDataSetHelper Shell Class
The code in this section declares the shell class that all
DataSetHelper articles add methods and member variables to.
- Start Visual Studio .NET.
- On the File menu, point to New, and then click Project.
- In the New Project dialog box, click Visual Basic Projects under Project Types, and then click Class Library under Templates.
- In the Name box, type DataSetHelper.
- Replace the class code with the following code:
Public Class DataSetHelper
Public ds As DataSet
Public Sub New(ByVal DataSet As DataSet)
ds = DataSet
End Sub
Public Sub New()
ds = Nothing
End Sub
End Class
You can use the two overloads for the constructor to create an instance
of the class with or without a reference to a valid DataSet. For a class that contains a reference to a valid DataSet, the DataTable objects that the methods return are also added automatically to
the DataSet.
back to the topField List Parsers
This section contains the code for two field list parsers: one
for the field list, and one for the fields to group by. The parsed structure is
used so that the
CreateGroupByTable and the
InsertGroupByInto methods do not have to reparse the field list. These methods must
reparse the field list if they are called from the
SelectGroupByInto method or from your own code. The parsed field list and the
unparsed field list are stored in
Private member variables of the
DataSetHelper class.
- Add the following Private class definition in the DataSetHelper class that you created in the "DataSetHelper Shell Class" section:
Private Class FieldInfo
Public RelationName As String
Public FieldName As String ' source table field name
Public FieldAlias As String ' destination table field name
Public Aggregate As String
End Class
Note This class is common to other DataSetHelper articles, and contains some fields that this article does not
use. - Add the following Private member variables to the class definition immediately after the DataSet declaration:
Private m_FieldInfo As ArrayList, m_FieldList As String
Private GroupByFieldInfo As ArrayList, GroupByFieldList As String
- Add the following Private method to the class definition. This method is the same as the
method that is used in other DataSetHelper articles and is used to parse the list of fields to group by.
Private Sub ParseFieldList(ByVal FieldList As String, Optional ByVal AllowRelation As Boolean = False)
'
' Parses FieldList into FieldInfo objects and then adds them to the m_FieldInfo private member
'
' FieldList syntax: [relationname.]fieldname[ alias],...
'
If m_FieldList = FieldList Then Exit Sub
m_FieldInfo = New ArrayList()
m_FieldList = FieldList
Dim Field As FieldInfo, FieldParts() As String, Fields() As String = FieldList.Split(",")
Dim I As Integer
For I = 0 To Fields.Length - 1
Field = New FieldInfo()
'
' Parse FieldAlias
'
FieldParts = Fields(I).Trim().Split(" ")
Select Case FieldParts.Length
Case 1
' To be set at the end of the loop
Case 2
Field.FieldAlias = FieldParts(1)
Case Else
Throw New ArgumentException("Too many spaces in field definition: '" & Fields(I) & "'.")
End Select
'
' Parse FieldName and RelationName
'
FieldParts = FieldParts(0).Split(".")
Select Case FieldParts.Length
Case 1
Field.FieldName = FieldParts(0)
Case 2
If Not AllowRelation Then _
Throw New ArgumentException("Relation specifiers not allowed in field list: '" & Fields(I) & "'.")
Field.RelationName = FieldParts(0).Trim()
Field.FieldName = FieldParts(1).Trim()
Case Else
Throw New ArgumentException("Invalid field definition: '" & Fields(I) & "'.")
End Select
If Field.FieldAlias = "" Then Field.FieldAlias = Field.FieldName
m_FieldInfo.Add(Field)
Next
End Sub
- Add the following Private method to the class definition. This method is used to parse the
field list, including aggregate functions.
Private Sub ParseGroupByFieldList(ByVal FieldList As String)
'
' Parses FieldList into FieldInfo objects and then adds them to the GroupByFieldInfo private member
'
' FieldList syntax: fieldname[ alias]|operatorname(fieldname)[ alias],...
'
' Supported Operators: count,sum,max,min,first,last
'
If GroupByFieldList = FieldList Then Exit Sub
Const OperatorList As String = ",count,sum,max,min,first,last,"
GroupByFieldInfo = New ArrayList()
Dim Field As FieldInfo, FieldParts() As String, Fields() As String = FieldList.Split(",")
Dim I As Integer
For I = 0 To Fields.Length - 1
Field = New FieldInfo()
'
' Parse FieldAlias
'
FieldParts = Fields(I).Trim().Split(" ")
Select Case FieldParts.Length
Case 1
' To be set at the end of the loop
Case 2
Field.FieldAlias = FieldParts(1)
Case Else
Throw New ArgumentException("Too many spaces in field definition: '" & Fields(I) & "'.")
End Select
'
' Parse FieldName and Aggregate
'
FieldParts = FieldParts(0).Split("(")
Select Case FieldParts.Length
Case 1
Field.FieldName = FieldParts(0)
Case 2
Field.Aggregate = FieldParts(0).Trim().ToLower() ' You will do a case-sensitive comparison later.
Field.FieldName = FieldParts(1).Trim(" "c, ")"c)
Case Else
Throw New ArgumentException("Invalid field definition: '" & Fields(I) & "'.")
End Select
If Field.FieldAlias = "" Then
If Field.Aggregate = "" Then
Field.FieldAlias = Field.FieldName
Else
Field.FieldAlias = Field.Aggregate & "Of" & Field.FieldName
End If
End If
GroupByFieldInfo.Add(Field)
Next
GroupByFieldList = FieldList
End Sub
back to the topCreateGroupByTable Method
This section contains the code for the
CreateGroupByTable method
The following is the calling convention for the
CreateGroupByTable method:
dt = dsHelper.CreateGroupByTable("OrderSummary", ds.Tables!Orders, "EmployeeID,sum(Amount) Total,min(Amount) Min,max(Amount) Max")
This call sample creates a new
DataTable with a
TableName of OrderSummary and four fields (EmployeeID, Total, Min, and
Max). The four fields have the same data type as the EmployeeID and the Amount
fields in the Orders table.
Use the following syntax to specify
fields in the field list:
fieldname[ alias]|aggregatefunction(fieldname)[ alias], ...
Note the following for this syntax:
- The ColumnName and the DataType properties are the only properties that are copied to the
destination DataTable.
- You can rename a field in the destination DataTable by specifying an alias name.
- The field list can contain a subset of field names that are
listed in a different order than in the source DataTable. If the field list is blank, an exception is thrown.
- Relation specifiers are not supported as part of the field
name. All fields must come from the same DataTable.
To call the
CreateGroupByTable method, add the following method to the
DataSetHelper class that you created in the "
DataSetHelper Shell Class" section:
Public Function CreateGroupByTable(ByVal TableName As String, _
ByVal SourceTable As DataTable, _
ByVal FieldList As String) As DataTable
'
' Creates a table based on aggregates of fields of another table
'
' RowFilter affects rows before the GroupBy operation. No HAVING-type support
' although this can be emulated by later filtering of the resultant table.
'
' FieldList syntax: fieldname[ alias]|aggregatefunction(fieldname)[ alias], ...
'
If FieldList = "" Then
Throw New ArgumentException("You must specify at least one field in the field list.")
' Return CreateTable(TableName, SourceTable)
Else
Dim dt As New DataTable(TableName)
ParseGroupByFieldList(FieldList)
Dim Field As FieldInfo, dc As DataColumn
For Each Field In GroupByFieldInfo
dc = SourceTable.Columns(Field.FieldName)
If Field.Aggregate = "" Then
dt.Columns.Add(Field.FieldAlias, dc.DataType, dc.Expression)
Else
dt.Columns.Add(Field.FieldAlias, dc.DataType)
End If
Next
If Not ds Is Nothing Then ds.Tables.Add(dt)
Return dt
End If
End Function
back to the topInsertGroupByInto Method
This section contains code for the
InsertGroupByInto method. This code copies records that are sorted and filtered
from the source table and related tables to the destination table. The filter
is applied before any aggregate values are calculated. If the Rollup argument
is True, group totals and grand totals are added for each field in the
GroupBy argument. When you perform a Rollup, if you also want detail
records to be included, add the primary key field to the end of the list of
fields in the
GroupBy argument.
The results are sorted on the fields that are
listed in the
GroupBy argument. The
GroupBy argument must comply with a valid Sort field list (minus ASC and
DESC modifiers). If the
GroupBy argument is blank, the target
DataTable contains only a single record that aggregates all the input. When
you call the
ParseGroupByFieldList and the
ParseFieldList properties, you can parse lists that were previously parsed, if
these lists are available. If the field list is blank, an exception is
thrown.
This is the calling convention for the
InsertGroupByInto method:
dsHelper.InsertGroupByInto(ds.Tables!OrderSummary, ds.Tables!Orders, "EmployeeID,sum(Amount) Total,min(Amount) Min,max(Amount) Max", "EmployeeID<5", "EmployeeID")
This call sample reads records from the
DataTable that is named Orders and writes records to the
DataTable that is named OrderSummary. The OrderSummary
DataTable contains the EmployeeID field and three different aggregates of
the Amount field that are filtered on "EmployeeID<5" and that are grouped on
(and sorted by) EmployeeID.
Note The filter expression is applied before any aggregate
functionality. To implement HAVING-type functionality, filter the
DataTable that results.
To call the
InsertGroupByInto method, add the following method to the
DataSetHelper class that you created in the "
DataSetHelper Shell Class" section:
Public Sub InsertGroupByInto(ByVal DestTable As DataTable, _
ByVal SourceTable As DataTable, _
ByVal FieldList As String, _
Optional ByVal RowFilter As String = "", _
Optional ByVal GroupBy As String = "", _
Optional ByVal Rollup As Boolean = False)
'
' Copies the selected rows and columns from SourceTable and inserts them into DestTable
' FieldList has same format as CreateGroupByTable
'
ParseGroupByFieldList(FieldList) ' parse field list
ParseFieldList(GroupBy) ' parse field names to Group By into an arraylist
Dim Field As FieldInfo
Dim Rows() As DataRow = SourceTable.Select(RowFilter, GroupBy)
Dim SourceRow, LastSourceRow As DataRow, SameRow As Boolean, I As Integer, J As Integer, K As Integer
Dim DestRows(m_FieldInfo.Count) As DataRow, RowCount(m_FieldInfo.Count) As Integer
'
' Initialize Grand total row
'
DestRows(0) = DestTable.NewRow()
'
' Process source table rows
'
For Each SourceRow In Rows
'
' Determine whether we've hit a control break
'
SameRow = False
If Not (LastSourceRow Is Nothing) Then
SameRow = True
For I = 0 To m_FieldInfo.Count - 1 ' fields to Group By
Field = m_FieldInfo(I)
If ColumnEqual(LastSourceRow(Field.FieldName), SourceRow(Field.FieldName)) = False Then
SameRow = False
Exit For
End If
Next I
'
' Add previous totals to the destination table
'
If Not SameRow Then
For J = m_FieldInfo.Count To I + 1 Step -1
'
' Make NULL the key values for levels that have been rolled up
'
For K = m_FieldInfo.Count - 1 To J Step -1
Field = LocateFieldInfoByName(GroupByFieldInfo, m_FieldInfo(K).FieldName)
If Not (Field Is Nothing) Then ' Group By field does not have to be in field list
DestRows(J)(Field.FieldAlias) = DBNull.Value
End If
Next K
'
' Make NULL any non-aggregate, non-group-by fields in anything other than
' the lowest level.
'
If J <> m_FieldInfo.Count Then
For Each Field In GroupByFieldInfo
If Field.Aggregate <> "" Then Exit For
If LocateFieldInfoByName(m_FieldInfo, Field.FieldName) Is Nothing Then
DestRows(J)(Field.FieldAlias) = DBNull.Value
End If
Next
End If
'
' Add row
'
DestTable.Rows.Add(DestRows(J))
If Rollup = False Then Exit For ' only add most child row if not doing a roll-up
Next J
End If
End If
'
' create new destination rows
' Value of I comes from previous If block
'
If Not SameRow Then
For J = m_FieldInfo.Count To I + 1 Step -1
DestRows(J) = DestTable.NewRow()
RowCount(J) = 0
If Rollup = False Then Exit For
Next J
End If
For J = 0 To m_FieldInfo.Count
RowCount(J) += 1
For Each Field In GroupByFieldInfo
Select Case Field.Aggregate ' this test is case-sensitive - made lower-case by Build_GroupByFiledInfo
Case "" ' implicit Last
DestRows(J)(Field.FieldAlias) = SourceRow(Field.FieldName)
Case "last"
DestRows(J)(Field.FieldAlias) = SourceRow(Field.FieldName)
Case "first"
If RowCount(J) = 1 Then DestRows(J)(Field.FieldAlias) = SourceRow(Field.FieldName)
Case "count"
DestRows(J)(Field.FieldAlias) = RowCount(J)
Case "sum"
DestRows(J)(Field.FieldAlias) = Add(DestRows(J)(Field.FieldAlias), SourceRow(Field.FieldName))
Case "max"
DestRows(J)(Field.FieldAlias) = Max(DestRows(J)(Field.FieldAlias), SourceRow(Field.FieldName))
Case "min"
If RowCount(J) = 1 Then
DestRows(J)(Field.FieldAlias) = SourceRow(Field.FieldName) ' so we get by initial NULL
Else
DestRows(J)(Field.FieldAlias) = Min(DestRows(J)(Field.FieldAlias), SourceRow(Field.FieldName))
End If
End Select
Next
Next J
LastSourceRow = SourceRow
Next
If Rows.Length > 0 Then
'
' Make NULL the key values for levels that have been rolled up
'
For J = m_FieldInfo.Count To 0 Step -1
For K = m_FieldInfo.Count - 1 To J Step -1
Field = LocateFieldInfoByName(GroupByFieldInfo, m_FieldInfo(K).FieldName)
If Not (Field Is Nothing) Then ' Group By field does not have to be in field list
DestRows(J)(Field.FieldAlias) = DBNull.Value
End If
Next K
'
' Make NULL any non-aggregate, non-group-by fields in anything other than
' the lowest level.
'
If J <> m_FieldInfo.Count Then
For Each Field In GroupByFieldInfo
If Field.Aggregate <> "" Then Exit For
If LocateFieldInfoByName(m_FieldInfo, Field.FieldName) Is Nothing Then
DestRows(J)(Field.FieldAlias) = DBNull.Value
End If
Next
End If
'
' Add row
'
DestTable.Rows.Add(DestRows(J))
If Rollup = False Then Exit For
Next J
End If
End Sub
Private Function LocateFieldInfoByName(ByVal FieldList As ArrayList, ByVal Name As String) As FieldInfo
'
' Looks up a FieldInfo record based on FieldName
'
Dim Field As FieldInfo
For Each Field In FieldList
If Field.FieldName = Name Then Return Field
Next
End Function
Private Function ColumnEqual(ByVal A As Object, ByVal B As Object) As Boolean
'
' Compares two values to determine if they are equal. Also compares DBNULL.Value.
'
' NOTE: If your DataTable contains object fields, you must extend this
' function to handle them in a meaningful way if you intend to group on them.
'
If A Is DBNull.Value And B Is DBNull.Value Then Return True ' Both are DBNull.Value.
If A Is DBNull.Value Or B Is DBNull.Value Then Return False ' Only one is DbNull.Value.
Return A = B ' Value type standard comparison
End Function
Private Function Min(ByVal A As Object, ByVal B As Object) As Object
'
' Returns MIN of two values. DBNull is less than all others.
'
If A Is DBNull.Value Or B Is DBNull.Value Then Return DBNull.Value
If A < B Then Return A Else Return B
End Function
Private Function Max(ByVal A As Object, ByVal B As Object) As Object
'
' Returns Max of two values. DBNull is less than all others.
'
If A Is DBNull.Value Then Return B
If B Is DBNull.Value Then Return A
If A > B Then Return A Else Return B
End Function
Private Function Add(ByVal A As Object, ByVal B As Object) As Object
'
' Adds two values. If one is DBNull, returns the other.
'
If A Is DBNull.Value Then Return B
If B Is DBNull.Value Then Return A
Return A + B
End Function
back to the topSelectGroupByInto Method
This section contains the code for the
SelectGroupByInto method. This method is a combination of the
CreateGroupByTable and the
InsertGroupByInto methods. The
SelectGroupByInto method creates a new
DataTable based on existing
DataTable objects, and copies the records that are sorted and filtered to
the new
DataTable.
The following is the calling convention for the
SelectGroupByInto method:
dt = dsHelper.SelectInto("OrderSummary", ds.Tables!Employees, "EmployeeID,sum(Amount) Total,min(Amount) Min,max(Amount) Max", "EmployeeID<5", "EmployeeID")
This call sample creates a new
DataTable with a
TableName of OrderSummary and four fields (EmployeeID, Total, Min, and
Max). These four fields have the same data type as the EmployeeID and the
Amount fields in the Orders table. Then this sample reads records from the
Orders
DataTable and writes records to the OrderSummary
DataTable. The OrderSummary
DataTable contains the EmployeeID field and three different aggregates of
the Amount field that are filtered on "EmployeeID<5" and that are grouped on
(and sorted by) EmployeeID. If the
GroupBy argument is blank, the target
DataTable contains only a single record that aggregates all the
input.
Note The filter expression is applied before any aggregate
functionality. To implement HAVING-type functionality, filter the
DataTable that results.
To call the
SelectGroupByInto method, add the following method to the
DataSetHelper class that you created in the "
DataSetHelper Shell Class" section:
Public Function SelectGroupByInto(ByVal TableName As String, _
ByVal SourceTable As DataTable, _
ByVal FieldList As String, _
Optional ByVal RowFilter As String = "", _
Optional ByVal GroupBy As String = "", _
Optional ByVal Rollup As Boolean = False) As DataTable
'
' Selects data from one DataTable to another and performs various aggregate functions
' along the way. See InsertGroupByInto and ParseGroupByFieldList for supported aggregate functions.
'
Dim dt As DataTable = CreateGroupByTable(TableName, SourceTable, FieldList)
InsertGroupByInto(dt, SourceTable, FieldList, RowFilter, GroupBy, Rollup)
Return dt
End Function
back to the topTest the Application
- Save and then compile the DataSetHelper class that you created in the previous sections.
- Follow these steps to create a new Visual Basic Windows
Application:
- Start Visual Studio .NET.
- On the File menu, point to New, and then click Project.
- In the New Project dialog box, click Visual Basic Projects under Project Types, and then click Windows Application under Templates.
- In Solution Explorer, right-click the solution, and then
click Add Existing Project. Add the DataSetHelper project.
- On the Project menu, click Add Reference.
- In the Add Reference dialog box, click the Projects tab, and then add a reference to the DataSetHelper project to the
Windows Form application.
- In the form designer, drag three Button controls and a DataGrid control from the toolbox to the form. Name the buttons
btnCreateGroupBy,
btnInsertGroupByInto, and
btnSelectGroupByInto. Keep the default name for the DataGrid control (DataGrid1).
- In the form code, add the following Imports statement to the top of the code window:
Imports System.Data
- Add the following variable declarations to the form
definition:
Dim ds As DataSet, dsHelper As DataSetHelper.DataSetHelper
- Add the following code to the Form.Load event:
ds = New DataSet()
dsHelper = New DataSetHelper.DataSetHelper(ds)
'
' Create source table
'
Dim dt As New DataTable("Orders")
dt.Columns.Add("EmployeeID", GetType(String))
dt.Columns.Add("OrderID", GetType(Integer))
dt.Columns.Add("Amount", GetType(Decimal))
dt.Rows.Add(New Object() {"Sam", 5, 25.00})
dt.Rows.Add(New Object() {"Tom", 7, 50.00})
dt.Rows.Add(New Object() {"Sue", 9, 11.00})
dt.Rows.Add(New Object() {"Tom", 12, 7.00})
dt.Rows.Add(New Object() {"Sam", 14, 512.00})
dt.Rows.Add(New Object() {"Sue", 15, 17.00})
dt.Rows.Add(New Object() {"Sue", 22, 2.50})
dt.Rows.Add(New Object() {"Tom", 24, 3.00})
dt.Rows.Add(New Object() {"Tom", 33, 78.75})
ds.Tables.Add(dt)
- Add the following code to the btnCreateGroupBy.Click event:
Try
dsHelper.CreateGroupByTable("OrderSummary", ds.Tables!Orders, "EmployeeID,count(EmployeeID) Orders,sum(Amount) OrderTotal,max(Amount) BestOrder,min(Amount) WorstOrder")
DataGrid1.SetDataBinding(ds, "OrderSummary")
Catch ex As Exception
MsgBox("An Error Occured: " & vbCrLf & ex.Message())
End Try
- Add the following code to the btnInsertGroupByInto.Click event:
Try
dsHelper.InsertGroupByInto(ds.Tables!OrderSummary, ds.Tables!Orders, "EmployeeID,count(EmployeeID) Orders,sum(Amount) OrderTotal,max(Amount) BestOrder,min(Amount) WorstOrder", "", "EmployeeID")
DataGrid1.SetDataBinding(ds, "OrderSummary")
Catch ex As Exception
MsgBox("An Error Occured: " & vbCrLf & ex.Message())
End Try
- Add the following code to the btnSelectGroupByInto.Click event. This sample sets the Rollup argument to TRUE to produce a
Grand Total line and also totals for each employee:
Try
dsHelper.SelectGroupByInto("OrderSummary2", ds.Tables!Orders, "EmployeeID,count(EmployeeID) Orders,sum(Amount) OrderTotal,max(Amount) BestOrder,min(Amount) WorstOrder", "OrderID>10", "EmployeeID", True)
DataGrid1.SetDataBinding(ds, "OrderSummary2")
Catch ex As Exception
MsgBox("An Error Occured: " & vbCrLf & ex.Message())
End Try
- Run the application, and then click each of the buttons.
Notice that the DataGrid is populated with the tables and the data from the
code.
Note You can only click the btnCreateGroupBy and the btnSelectGroupByInto buttons one time. If you click either of these buttons more than
one time, you receive an error message that you are trying to add the same
table two times. Additionally, you must click btnCreateGroupBy before you click btnInsertGroupByInto; otherwise, the destination DataTable is not created. If you click the btnInsertGroupByInto button multiple times, you populate the DataGrid with duplicate records.
back to the topEnhancement Ideas
- The ColumnName and the DataType properties are the only properties that are copied to the
destination DataTable. You can extend the CreateTable method to copy additional properties, such as the MaxLength property, or you can create new key columns.
- You can expand the functionality so that you can specify
fields from related DataTable objects.
- You can expand the functionality by adding support for
additional aggregate functions, such as the AVG or the STDEV function. If you do this, you must have dynamically created
variables to track intermediate values. To track intermediate values, you can
add member variables to the FieldInfo class, and then use those member variables to store intermediate
values.
back to the topTroubleshooting
- The fieldname and the alias parts of the field list must
comply with DataColumn naming conventions. The parser also restricts the names, in that
the name must not contain a period (.), a comma (,), or a space ( ).
- If you click a button more than one time, the same table is
added two times to the DataSet. This results in an exception. To work around this problem, you
can add code to the test application to check whether a DataTable of the same name already exists. Alternatively, you can create
the DataSetHelper class without a reference to a DataSet and then bind the DataGrid.DataSource property directly to the dt variable instead of by using the SetDataBinding method call.
- If the source table uses custom data types (for example, a
class), you must add code to the InsertGroupByInto method to perform a deep copy of the data. Otherwise, only a
reference is copied.
back to the
top