An Overview of .NET Collections and Related Techniques Through Implementing Collection Classes in C#

2024-07-10 13:04:33

Source: reprinted

Contributed by: a reader

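Overview: In a genuinely object-oriented project, we usually abstract common business entities into dedicated classes such as Employee, Customer and Contact, and most of these classes are related to or dependent on one another. For example, Employee and Customer are related through Contact, and a Contact exists only in relation to an Employee and a Customer. A real application module may well need to do the following: obtain a group of customer objects (an instance of the Customers collection class, say customers), point to one Customer object in it (customers[i]), and read that object's Name (customers[i].Name) and Contacts (customers[i].Contacts) to look up the customer's name and the contact records for that customer, or even walk the Contacts object to find the summary of a particular contact (customers[i].Contacts[x].Summary).
Looking at the .NET Framework with these requirements in mind, it is easy to see that .NET provides a family of collection classes under the System.Collections namespace (with the specialized ones in System.Collections.Specialized) and offers developers a flexible choice depending on the scenario: ArrayList and StringCollection, which are widely used and accessed by index; the first-in-first-out Queue and the last-in-first-out Stack, whose elements are usually released once retrieved; Hashtable, SortedList, ListDictionary and StringDictionary, whose elements are accessed by key; NameObjectCollectionBase and NameValueCollection, whose elements are accessed by index or by key; and the Array class (System.Array), which has collection characteristics although it is implemented outside the collections namespace. This article implements a representative "collection class" in two typical ways, compares the two implementations and the situations each one suits, and introduces some related techniques along the way, in the hope of giving readers a starting point for their own study and development work. (Note: the author's debugging and runtime environment is the .NET Framework SDK 1.1.)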
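1. Implementing the Customers collection class by inheriting from the abstract base class CollectionBase:
First we need a simple class, Customer, to supply the elements of the collection:

/// <summary>
/// A class describing a customer's basic information
/// </summary>
public class Customer
{
    /// <summary>
    /// Customer name
    /// </summary>
    public string Name;

    /// <summary>
    /// Collection class describing all contact records of the customer
    /// </summary>
    //public Contacts Contacts=new Contacts();

    /// <summary>
    /// Parameterless constructor of the Customer class
    /// </summary>
    public Customer()
    {
        System.Console.WriteLine("initialize instance without parameter");
    }

    /// <summary>
    /// Constructor of the Customer class that takes a name
    /// </summary>
    public Customer(string name)
    {
        Name=name; // assign the parameter to the public field
        System.Console.WriteLine("initialize instance with parameter");
    }
}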

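That is the basic skeleton of the Customer class; a practical Customer class would probably have more fields, properties, methods and events. Note that the class also embeds the Contacts collection class as a public field, which would eventually give the interface form customer.Contacts[i]. This is not the best way to associate a collection class, however, so it is commented out for now and analyzed in detail later. The point of this code is to show the skeleton of a simple class (as opposed to a collection class). The class also overloads its constructor, so that an instance can be created with or without a name argument.
Next comes our first collection class implementation, the Customers class derived from CollectionBase:
/// <summary>
/// Customers is the collection class for Customer; it inherits from CollectionBase
/// </summary>
public class Customers: System.Collections.CollectionBase
{
    public Customers()
    {

    }
    /// <summary>
    /// Our own Add method
    /// </summary>
    /// <param name="customer"></param>
    public void Add(Customer customer)
    {
        List.Add(customer);
    }
    /// <summary>
    /// Our own Remove method
    /// </summary>
    /// <param name="index"></param>
    public void Remove(int index)
    {
        if (index > Count - 1 || index < 0)
        {
            System.Console.WriteLine("index not valid!");
        }
        else
        {
            List.RemoveAt(index);
        }
    }
}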

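Taking the Customers collection class as an example, together with some supporting techniques, we hope readers will come to understand the following:
Implementing a collection class by inheriting from CollectionBase
Because Customers inherits from CollectionBase, it does not need to declare its own list object as a container for Customer objects: CollectionBase already has a built-in List object and already implements Count, Clear, RemoveAt and other important IList members (see the CollectionBase members in MSDN for details). The implementer only has to provide Add, Remove, IndexOf, Insert and so on explicitly; the code here implements only Add and an int-index version of Remove as examples. This approach is simple and efficient: CollectionBase already provides fairly complete functionality, and the implementer only extends it with whatever is needed.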
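The members the article leaves out (IndexOf, Insert and so on) can be added in the same thin-wrapper style. The following is only a sketch, not the author's code: the IndexOf, Insert and Contains wrappers and the OnValidate override are assumptions added for illustration, and the Customer class is reduced to the minimum needed to run.

```csharp
using System;
using System.Collections;

public class Customer
{
    public string Name;
    public Customer(string name) { Name = name; }
}

// Hypothetical extension of the article's Customers class.
public class Customers : CollectionBase
{
    public void Add(Customer customer) { List.Add(customer); }

    public int IndexOf(Customer customer) { return List.IndexOf(customer); }

    public void Insert(int index, Customer customer) { List.Insert(index, customer); }

    public bool Contains(Customer customer) { return List.Contains(customer); }

    // CollectionBase calls OnValidate when an element is added, inserted
    // or set, including calls that arrive through the IList interface.
    protected override void OnValidate(object value)
    {
        if (!(value is Customer))
            throw new ArgumentException("Only Customer objects are allowed.");
    }
}

public class Demo
{
    public static void Main()
    {
        Customers custs = new Customers();
        Customer peter = new Customer("peter");
        custs.Add(peter);
        custs.Insert(0, new Customer("linnet"));
        Console.WriteLine(custs.IndexOf(peter));   // prints 1

        try
        {
            ((IList)custs).Add("not a customer");  // bypasses the typed Add
        }
        catch (ArgumentException e)
        {
            Console.WriteLine(e.Message);
        }
    }
}
```

The OnValidate override matters because CollectionBase still exposes IList to callers who cast the collection, so the typed Add method alone does not stop other object types from getting in.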

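A simple indexer
We are used to working with arrays in the form array[i], and a collection class can be regarded as an "array of objects". In C#, the feature that gives a collection class array-style indexing is the indexer:
public Customer this[int index]
{
    get
    {
        return (Customer) List[index];
    }
}
Once this code is added to the Customers class, the class has a read-only indexer that takes an int index and returns List[index] cast to Customer. Callers can then write customers[i].Name to access the Name field of the i-th Customer object in the collection. Neat, isn't it? The indexer code in this article does not deal with out-of-range indexes; that should be handled along the lines of the similar Remove method. Only the get accessor is implemented here; the reason for leaving out set is discussed later.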
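One possible way to add the missing range check is sketched below. It is a variation rather than the author's code: the article's Remove method prints a message, but an indexer has to return something, so this sketch throws ArgumentOutOfRangeException instead (an assumption about the desired behavior). The classes are trimmed to what the example needs.

```csharp
using System;
using System.Collections;

public class Customer
{
    public string Name;
    public Customer(string name) { Name = name; }
}

public class Customers : CollectionBase
{
    public void Add(Customer customer) { List.Add(customer); }

    // Read-only indexer with an explicit range check.
    public Customer this[int index]
    {
        get
        {
            if (index < 0 || index >= Count)
                throw new ArgumentOutOfRangeException("index");
            return (Customer) List[index];
        }
    }
}

public class Demo
{
    public static void Main()
    {
        Customers custs = new Customers();
        custs.Add(new Customer("peter"));
        Console.WriteLine(custs[0].Name);          // prints peter

        try
        {
            Console.WriteLine(custs[5].Name);      // index past the end
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("index not valid!");
        }
    }
}
```

Throwing leaves the decision about how to report the problem to the caller, which is usually easier to reuse than writing to the console inside the collection class.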

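Two ways to implement Item
Anyone who has used VB will be familiar with the form customers.Item(i).Name, which does the same job as an indexer: it accesses a particular object in the collection through an index value. How should Item be implemented in C#? The first idea is a property, but you soon discover that C# properties do not take parameters, so the index cannot be passed in. The compromise is to implement it as a method:
public Customer Item (int index)
{
    return (Customer) List[index];
}
This Item method already works, so why call it a compromise? Because Item is accessed with the syntax customers.Item(i).Name, which is inconsistent with C#'s '[]' subscript style and looks somewhat out of place. If you want consistent syntax, even at some cost in performance, is there a solution? Consider the following code:
public Customers Item
{
    get
    {
        return this;
    }
}
This is Item implemented as a property. Because C# properties do not take parameters, we return the Customers object itself, so using the Item property of a Customers object leads to a call to the Customers indexer. Performance drops somewhat, but the syntax customers.Item[i].Name is now consistent with the rest of C#. Comparing the two implementations, the conclusion is clear. Item implemented as a parameterless property depends on the class's indexer: if the class has no indexer the property is unusable, and because access to Item is redirected to the indexer, performance suffers; its only justification is the uniform C# subscript style. The method implementation is exactly the opposite: apart from the somewhat awkward syntax, it neither depends on an indexer nor loses performance. You cannot have it both ways, and the choice should follow the actual needs of the project.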
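The compiler's default name in intermediate language, and using an attribute
If you implement a standard indexer and also want to provide an interface named "Item", compilation fails with the error "The class 'WindowsApplication1.Customers' already contains a definition for 'Item'". Yet apart from creating the indexer you have done nothing, so where does the problem come from? For the answer we have to look at .NET intermediate language (IL). At a .NET command line or the Visual Studio .NET command prompt, type ildasm to run the .NET Framework MSIL Disassembler, and use 'Open' in the main menu to load a .NET PE executable that compiles successfully because it has only the indexer and no Item implementation. Find the Customers class in the tree view, and you will be surprised to see that the C# indexer has been compiled into a property named Item. Here is the disassembled IL of the indexer, defined as the Item property:
.property instance class WindowsApplication1.Customer
Item(int32)
{
.get instance class WindowsApplication1.Customer WindowsApplication1.Customers::get_Item(int32)
} // end of property Customers::Item
The mystery is solved: the C# compiler, being a little too clever, compiles the indexer into a property named Item, which has exactly the same name as the Item interface we want to add, so the compile error above is inevitable. Is there a way to tell the compiler not to give the indexer the default name Item? There is.
The solution is to declare an attribute just before the indexer:
[System.Runtime.CompilerServices.IndexerName("item")]
Defining the IndexerName attribute tells the C# compiler to compile the indexer as item rather than the default Item. After the change, the disassembled IL of the indexer is:
.property instance class WindowsApplication1.Customer
item(int32)
{
.get instance class WindowsApplication1.Customer WindowsApplication1.Customers::get_item(int32)
} // end of property Customers::item
Of course, you can give the indexer's generated property any other name, not just item, as long as it is not a reserved keyword of the IL language. Once the indexer has been named, you are free to add an interface named "Item".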
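The article never lists the Customers class with all three access paths in one place. The sketch below is a reconstruction under stated assumptions: the member names (an indexer renamed to item, a method called Items, and a property called Item) are taken from the article's description of its test code, while the exact bodies are guesses and the constructors omit the console output of the original Customer class.

```csharp
using System;
using System.Collections;
using System.Runtime.CompilerServices;

public class Customer
{
    public string Name;
    public Customer() { }
    public Customer(string name) { Name = name; }
}

public class Customers : CollectionBase
{
    public void Add(Customer customer) { List.Add(customer); }

    // Emitted in IL as a property named "item" instead of the default "Item",
    // which frees the name Item for the property below.
    [IndexerName("item")]
    public Customer this[int index]
    {
        get { return (Customer) List[index]; }
    }

    // Method-style access: custs.Items(i)
    public Customer Items(int index)
    {
        return (Customer) List[index];
    }

    // Property-style access: custs.Item[i] forwards to the indexer.
    public Customers Item
    {
        get { return this; }
    }
}

public class Demo
{
    public static void Main()
    {
        Customers custs = new Customers();
        custs.Add(new Customer("peter"));
        Console.WriteLine(custs[0].Name);        // indexer
        Console.WriteLine(custs.Items(0).Name);  // Items method
        Console.WriteLine(custs.Item[0].Name);   // Item property, then indexer
    }
}
```

All three lines print peter. Note that item and Item differ only by case, which C# accepts but which would not be CLS-compliant for a library consumed from case-insensitive languages.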

Below is the test code for the Customer and Customers classes. To illustrate the point, the author's Customers class contains an indexer with the attribute name item, an Items method and an Item property, giving three different ways to access the elements of the collection. In a real project there is no need to implement a class's indexing several times; an indexer alone, or an indexer plus one form of Item, is probably enough:
public class CallTest
{
    public static void Main()
    {
        Customers custs=new Customers();
        System.Console.WriteLine(custs.Count.ToString());// test the Count property

        Customer aCust=new Customer();// calls the parameterless constructor
        aCust.Name ="peter";
        custs.Add(aCust);// test the Add method

        System.Console.WriteLine(custs.Count.ToString());
        System.Console.WriteLine(custs.Item[0].Name);// obtained through the Item property
        custs.Items(0).Name+="hu";// obtained through the Items method
        System.Console.WriteLine(custs[0].Name);// obtained through the indexer

        custs.Add(new Customer("linnet"));// calls the constructor that takes a name
        System.Console.WriteLine(custs.Count.ToString());
        System.Console.WriteLine(custs.Items(1).Name);// obtained through the Items method
        custs.Item[1].Name+="li";// obtained through the Item property
        System.Console.WriteLine(custs[1].Name);// obtained through the indexer

        custs.Remove(0);// test the Remove method
        System.Console.WriteLine(custs.Count.ToString());
        System.Console.WriteLine(custs[0].Name);// verify that Remove worked
        custs[0].Name="test passed" ;// obtained through the indexer
        System.Console.WriteLine(custs.Item[0].Name);
        custs.Clear();
        System.Console.WriteLine(custs.Count.ToString());// verify that Clear worked

    }
}
The output is:
0
initialize instance without parameter
1
peter
peterhu
initialize instance with parameter
2
linnet
linnetli
1
linnetli
test passed
0

2. Implementing a collection class with a built-in ArrayList object:
Experienced programmers may already have thought of this: you can build an array-like object into a class and wrap access to that object inside the class, and that also gives you a collection class. Here is a skeleton of the Contact element class and the Contacts collection class built on this idea:

public class Contact
{
    protected string summary;

    /// <summary>
    /// Description of a contact with the customer
    /// </summary>
    public string Summary
    {
        get
        {
            System.Console.WriteLine("getter access");
            return summary;//do something, as get data from data source
        }
        set
        {
            System.Console.WriteLine("setter access");
            summary=value;// do something , as check validity or storage
        }
    }

    public Contact()
    {

    }
}

public class Contacts
{
    protected System.Collections.ArrayList list;

    public void Add(Contact contact)
    {
        list.Add(contact);
    }

    public void Remove(int index)
    {
        if (index > list.Count - 1 || index < 0)
        {
            System.Console.WriteLine("index not valid!");
        }
        else
        {
            list.RemoveAt(index);
        }
    }

    public int Count
    {
        get
        {
            return list.Count;
        }
    }

    public Contact this[int index]
    {
        get
        {
            System.Console.WriteLine("indexer getter access");
            return (Contact) list[index];
        }
        set
        {
            list[index]=value;
            System.Console.WriteLine("indexer setter access ");
        }

    }

    public Contacts()
    {
        list=new System.Collections.ArrayList();
    }
}
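From the implementation of these two classes we can draw the following points:
Why ArrayList
The Contacts class uses ArrayList for its built-in collection object rather than the more familiar Array class. The main reasons are as follows. In the current .NET v1.1 environment, Array does expose typical collection members such as IList.Add, IList.Insert, IList.Remove and IList.RemoveAt, but calling them always throws NotSupportedException. Whether Microsoft will implement them in a future version is unknown, but the current version of .NET clearly does not support dynamically sized arrays. The way Microsoft recommends to resize an Array is to copy the old array into a new array of the desired size and then discard the old one, which is obviously a slow and laborious detour and cannot meet a collection class's need to add and remove elements at any time. ArrayList, by contrast, already implements the key collection members Add, Clear, Count, IndexOf, Insert, Remove and RemoveAt, and can also provide a read-only collection; in the Contacts class above, a collection class was obtained with very little wrapper code. Another question is why we do not implement the collection class by inheriting from System.Collections.ArrayList, in the way Customers inherits from CollectionBase. The main reason is that exposing the ArrayList directly to users of the class allows invalid assignments: if a user calls ArrayList.Add, the call succeeds whether or not the argument is a Contact. The class cannot check that the incoming object is of the expected type, which defeats the intent that it should accept only Contact objects and leaves a serious safety risk. In addition, when a Contact object is retrieved, the element cannot be used as a Contact without an explicit cast.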
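Both claims in this paragraph are easy to check with a few lines. The sketch below is an illustration added here, not part of the original article: it calls IList.Add on an array to show the NotSupportedException, then shows that a bare ArrayList accepts objects of any type and only fails later, at the cast.

```csharp
using System;
using System.Collections;

public class Demo
{
    public static void Main()
    {
        // An array can be viewed through IList, but it has a fixed size.
        IList fixedList = new string[] { "a", "b" };
        try
        {
            fixedList.Add("c");
        }
        catch (NotSupportedException)
        {
            Console.WriteLine("Array cannot grow through IList.Add");
        }

        // A bare ArrayList stores object, so nothing stops a wrong type.
        ArrayList list = new ArrayList();
        list.Add("a contact summary");
        list.Add(42);
        Console.WriteLine(list.Count);   // prints 2

        try
        {
            Console.WriteLine((string) list[1]);   // the mistake surfaces only here
        }
        catch (InvalidCastException)
        {
            Console.WriteLine("element 1 is not a string");
        }
    }
}
```

Wrapping the ArrayList inside Contacts and exposing only a typed Add(Contact) moves that failure from run time at the point of use to compile time at the point of insertion.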
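set in collection classes
When implementing a collection class, whether with an indexer or with an "Item" property that does the same job, you inevitably have to decide whether to implement only a getter, giving a read-only indexer, or both a getter and a setter, giving full indexer access. The sample class Customers above has no indexer setter, so its indexer is read-only. Yet the test code for Customer and Customers uses the potentially confusing form custs[0].Name="test passed". In fact this statement never reaches a setter on the Customers indexer. It first runs the indexer's getter to obtain a Customer object and then sets that Customer's Name field (if Name were a property, the setter of the Customer class's Name property would be called). So when is the indexer's setter used? It only becomes meaningful when a whole element needs to be replaced at run time, as in custs[i]=new Customer(), which assigns a brand-new Customer object to an existing element of the custs collection. That form of access calls the setter of Customers: the element object itself is reassigned, rather than some properties of the existing object being modified. In other words, because Customers has no indexer setter, it offers outside code no way to "overwrite" an existing customer in the collection. In sharp contrast, the indexer of Contacts provides both a getter and a setter for its elements, so Contacts lets callers replace Contact elements dynamically. Running the following test against Contacts and Contact makes this clear:
public class CallTest
{
    public static void Main()
    {
        Contacts cons=new Contacts();
        cons.Add(new Contact());
        cons[0]=new Contact();//trigger indexer setter
        cons[0].Summary="mail contact about ticket";
        System.Console.WriteLine(cons[0].Summary);
    }
}
As expected, the output is: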
indexer setter access
indexer getter access
setter access
indexer getter access
getter access
mail contact about ticket
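With the role of the indexer setter clearly understood, the decision whether to give an indexer a setter should weigh the actual business requirements, access control and safety.
Properties: powerful and flexible, field and method in one
When we first implemented the Customer class we used a public field, Name, to store the customer's name. It works, but we have no control over the Name field: the value is changed whether or not the user of the class assigns something legal and valid. There is also no good mechanism for synchronous processing when the value changes (storing the data, notifying related elements and so on). In addition, the field can only be initialized when the object is constructed, even if Name is never accessed during the whole lifetime of the object. Compare this with the Summary property implemented in the Contact class and the advantages of properties become apparent. A property can be initialized at the moment of get; if it involves resources such as network, database, memory or threads, postponing initialization provides some optimization. Wrapped in a property, the real contact description, summary, is well protected, and on set the value can be validated before it is assigned. Data access and other related work can also be done before and after the getter and setter, which is impossible with a field. We can therefore conclude that wherever a field cannot meet the requirements, a property is the more powerful and flexible alternative.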
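As an illustration of these points, here is a minimal sketch of how the Name field might be turned into a property with validation on set and deferred loading on get. It is not the author's code: the LoadName helper, the loaded flag and the "unknown" placeholder are hypothetical names introduced only for this example.

```csharp
using System;

public class Customer
{
    private string name;           // backing field, hidden from callers
    private bool loaded = false;   // hypothetical flag: has name been fetched yet?

    public string Name
    {
        get
        {
            if (!loaded)
            {
                name = LoadName(); // deferred until the first read
                loaded = true;
            }
            return name;
        }
        set
        {
            // Validate before assigning; a public field cannot do this.
            if (value == null || value.Trim().Length == 0)
                throw new ArgumentException("Name must not be empty.");
            name = value;
            loaded = true;
        }
    }

    // Hypothetical stand-in for an expensive lookup (database, network, ...).
    private string LoadName()
    {
        return "unknown";
    }
}

public class Demo
{
    public static void Main()
    {
        Customer c = new Customer();
        Console.WriteLine(c.Name);   // triggers the deferred load
        c.Name = "peter";
        try
        {
            c.Name = "";             // rejected by the setter
        }
        catch (ArgumentException)
        {
            Console.WriteLine("invalid name rejected");
        }
        Console.WriteLine(c.Name);   // still peter
    }
}
```

Callers keep writing c.Name exactly as they did with the field, so the switch from field to property does not change how the class reads in client source code.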
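Furthermore, a property combines the two "methods" get and set under a single, natural name, which is friendlier than the object.getAnything and object.setAnything style of Java. (In fact, a C# property is just another wrapper around methods: in .NET IL, an Anything property with a getter and a setter is still broken down into two methods, get_Anything and set_Anything, referenced by the Anything property.)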
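The accessor methods can be seen without opening ildasm, by asking reflection for them. The short sketch below is an added illustration; the Sample class and its Anything property are placeholder names echoing the paragraph above.

```csharp
using System;
using System.Reflection;

public class Sample
{
    private int anything;

    public int Anything
    {
        get { return anything; }
        set { anything = value; }
    }
}

public class Demo
{
    public static void Main()
    {
        PropertyInfo p = typeof(Sample).GetProperty("Anything");
        Console.WriteLine(p.GetGetMethod().Name);   // prints get_Anything
        Console.WriteLine(p.GetSetMethod().Name);   // prints set_Anything
    }
}
```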
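How to embed a collection class
The original Customer class used the public field public Contacts Contacts=new Contacts() to provide an embedded collection interface of the form customer.Contacts[]. This is the simplest way to integrate a collection class, but it offers no protection. Given the advantages of properties described above, a more sound and complete solution is to expose the public collection interface as a property and operate on the encapsulated, protected collection object only when it is actually accessed. For example, the declaration of the Contacts collection embedded in the Customer class could be changed to:
protected Contacts cons; // the real Contacts object, encapsulated inside the class
public Contacts Contacts // the Contacts property exposed outside the class
{
    get
    {
        if (cons == null) cons=new Contacts();
        return cons;
    }
    set
    {
        cons=value;
    }
}
In the end, the form customers[i].Contacts[x].Summary is successfully achieved.
The best time to instantiate
The .NET type system is fully object-based: all types derive from System.Object, and according to their characteristics they fall into two camps, value types and reference types. Value types include structs (simple numeric types and the Boolean type among them) and enums; reference types include classes, arrays, delegates, interfaces, pointers and so on. One feature of objects is that system resources are not allocated to an object until it is instantiated, so instantiating objects flexibly and at the right time has a positive effect on the allocation of system resources. The "lazy initialization" recommended in some articles advocates instantiating objects only when necessary. Following this principle, seen from outside, a class can be initialized just before it is used; inside the class, members such as properties need not be initialized in the constructor and can wait until the property's getter is actually accessed. If the property is never read, there is no need to tie up network, database, memory or thread resources for nothing. But later is not always better, because initialization takes time, and initializing only just before use may make the class respond too slowly for the user's real-time needs. The best time to instantiate is therefore the balance point between resource usage and initialization time.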

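Summary
This article has centered on two ways of implementing a collection class, inheriting from CollectionBase and wrapping a built-in ArrayList object, and has shown some uses of collections, indexers, properties and attributes, along with other related .NET topics such as class constructors, object optimization and class association. We hope the simple examples and explanations here will spark readers' ideas and lead to more incisive and well-founded theory and application models.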
