
Why is Dictionary preferred over Hashtable in C#?

In most programming languages, dictionaries are preferred over hashtables. What are the reasons behind that?

> This is not necessarily true. A hash table is an implementation of a dictionary. A typical one at that, and it may be the default one in .NET, but it's not by definition the only one. I'm not sure that this is required by the ECMA standard, but the MSDN documentation very clearly calls it out as being implemented as a hashtable. They even provide the SortedList class for times when an alternative is more reasonable.
@Promit I always thought the Dictionary was an implementation of the Hashtable.
I think the reason is that in a Dictionary you can define the types of the key and the value yourself. The Hashtable can only take objects and saves the pairs based on the hash (from object.GetHashCode()).
The original title of the question was c# specific. I have restored "in c#" to the title.
Not to be confused with HashSet<T> which unlike HashTable, is generic.

adjan

For what it's worth, a Dictionary is (conceptually) a hash table.

If you meant "why do we use the Dictionary<TKey, TValue> class instead of the Hashtable class?", then it's an easy answer: Dictionary<TKey, TValue> is a generic type, Hashtable is not. That means you get type safety with Dictionary<TKey, TValue>, because you can't insert any random object into it, and you don't have to cast the values you take out.
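A minimal sketch of the type-safety point, for illustration only. The class name TypeSafetyDemo, the method names and the sample key are hypothetical and not from the original answer. It shows that a Hashtable accepts a value of the wrong type at compile time and only fails when the value is cast on the way out, whereas the generic Dictionary rejects the same mistake at compile time.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class TypeSafetyDemo
{
    // Stores an int in a Hashtable, then tries to read it back as a string.
    // Returns the value, or the name of the exception raised by the cast.
    public static string ReadFromHashtable()
    {
        var table = new Hashtable();
        table["answer"] = 42;                  // compiles: any object is accepted (the int is boxed)
        try
        {
            return (string)table["answer"];    // an explicit cast is required
        }
        catch (InvalidCastException ex)
        {
            return ex.GetType().Name;          // the mistake only surfaces at run time
        }
    }

    // With the generic type, the compiler enforces the key and value types.
    public static string ReadFromDictionary()
    {
        var dict = new Dictionary<string, string>();
        dict["answer"] = "forty-two";
        // dict["answer"] = 42;                // would not compile: int is not a string
        return dict["answer"];                 // no cast needed
    }

    public static void Main()
    {
        Console.WriteLine(ReadFromHashtable());   // prints "InvalidCastException"
        Console.WriteLine(ReadFromDictionary());  // prints "forty-two"
    }
}
```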

Interestingly, the Dictionary<TKey, TValue> implementation in the .NET Framework is based on the Hashtable, as you can tell from this comment in its source code:

The generic Dictionary was copied from Hashtable's source

Source


And also generic collections are a lot faster as there's no boxing/unboxing
Not sure about a Hashtable with the above statement, but for ArrayList vs List it's true
Hashtable uses Object to hold things internally (Only non-generic way to do it) so it would also have to box/unbox.
@BrianJ: A "hash table" (two words) is the computer science term for this kind of structure; Dictionary is a specific implementation. A HashTable corresponds roughly to a Dictionary (though with slightly different interfaces), but both are implementations of the hash table concept. And of course, just to confuse matters further, some languages call their hash tables "dictionaries" (e.g. Python) - but the proper CS term is still hash table.
@BrianJ: Both HashTable (class) and Dictionary (class) are hash tables (concept), but a HashTable is not a Dictionary, nor is a Dictionary a HashTable. They are used in very similar fashions, and Dictionary<Object,Object> can act in the same untyped manner that a HashTable does, but they do not directly share any code (though parts are likely to be implemented in a very similar fashion).
KyleMit

Differences

| Dictionary | Hashtable |
|---|---|
| Generic | Non-generic |
| Needs own thread synchronization | Offers a thread-safe version through the Synchronized() method |
| Enumerated item: KeyValuePair | Enumerated item: DictionaryEntry |
| Newer (since .NET 2.0) | Older (since .NET 1.0) |
| Is in System.Collections.Generic | Is in System.Collections |
| Request for a non-existing key throws an exception | Request for a non-existing key returns null |
| Potentially a bit faster for value types | A bit slower for value types (needs boxing/unboxing) |
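The missing-key difference listed above can be sketched as follows. The class name MissingKeyDemo and its method names are hypothetical, chosen only for this illustration. The sketch also shows Dictionary.TryGetValue, which avoids the exception when a key may be absent.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class MissingKeyDemo
{
    // The Hashtable indexer returns null for a key that is not present.
    public static bool HashtableReturnsNull()
    {
        var table = new Hashtable { { "a", 1 } };
        return table["missing"] == null;
    }

    // The Dictionary indexer throws KeyNotFoundException instead.
    // Returns the value as text, or the name of the exception that was thrown.
    public static string DictionaryIndexerResult()
    {
        var dict = new Dictionary<string, int> { { "a", 1 } };
        try
        {
            return dict["missing"].ToString();
        }
        catch (KeyNotFoundException ex)
        {
            return ex.GetType().Name;
        }
    }

    // TryGetValue performs the lookup without throwing.
    public static int LookupOrDefault(Dictionary<string, int> dict, string key, int fallback)
    {
        int value;
        return dict.TryGetValue(key, out value) ? value : fallback;
    }

    public static void Main()
    {
        Console.WriteLine(HashtableReturnsNull());      // prints "True"
        Console.WriteLine(DictionaryIndexerResult());   // prints "KeyNotFoundException"
        var dict = new Dictionary<string, int> { { "a", 1 } };
        Console.WriteLine(LookupOrDefault(dict, "missing", -1)); // prints "-1"
    }
}
```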

Similarities:

Both are internally hashtables == fast access to many-item data according to key

Both need immutable and unique keys

Keys of both need own GetHashCode() method

Alternative .NET collections:

(candidates to use instead of Dictionary and Hashtable)

ConcurrentDictionary - thread safe (can be safely accessed from several threads concurrently)

HybridDictionary - optimized performance (for few items and also for many items)

OrderedDictionary - values can be accessed via int index (by order in which items were added)

SortedDictionary - items automatically sorted

StringDictionary - strongly typed and optimized for strings (now Deprecated in favor of Dictionary)


@Guillaume86, this is why you use TryGetValue instead msdn.microsoft.com/en-us/library/bb347013.aspx
+1 for StringDictionary...btw StringDictionary isn't the same as Dictionary<string, string> when you use the default constructor.
The ParallelExtensionsExtras @code.msdn.microsoft.com/windowsdesktop/… contains an ObservableConcurrentDictionary which is great for binding as well as concurrency.
Awesome explanation, it's really nice you also listed the similarities to lessen the questions that might come to one's mind.
StayOnTarget

Because Dictionary is a generic class ( Dictionary<TKey, TValue> ), so that accessing its content is type-safe (i.e. you do not need to cast from Object, as you do with a Hashtable).

Compare

var customers = new Dictionary<string, Customer>();
...
Customer customer = customers["Ali G"];

to

var customers = new Hashtable();
...
Customer customer = customers["Ali G"] as Customer;

However, Dictionary is implemented as a hash table internally, so technically it works the same way.


Peter Mortensen

FYI: In .NET, Hashtable is thread safe for use by multiple reader threads and a single writing thread, while in Dictionary public static members are thread safe, but any instance members are not guaranteed to be thread safe.

We had to change all our Dictionaries back to Hashtable because of this.


Fun. The Dictionary source code looks a lot cleaner and faster. It might be better to use Dictionary and implement your own synchronization. If the Dictionary reads absolutely need to be current, then you'd simply have to synchronize access to the read/write methods of the Dictionary. It would be a lot of locking, but it would be correct.
Alternatively, if your reads don't have to be absolutely current, you could treat the dictionary as immutable. You could then grab a reference to the Dictionary and gain performance by not synchronizing reads at all (since it's immutable and inherently thread-safe). To update it, you construct a complete updated copy of the Dictionary in the background, then just swap the reference with Interlocked.CompareExchange (assuming a single writing thread; multiple writing threads would require synchronizing the updates).
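A rough sketch of the copy-and-swap idea from the comment above, assuming a single writer thread. The class name SnapshotCache and its members are hypothetical. Because only one writer is assumed, the sketch publishes the new copy with Interlocked.Exchange rather than the Interlocked.CompareExchange the comment mentions; with several writers, the updates themselves would need synchronizing.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

public sealed class SnapshotCache
{
    // The current snapshot. A Dictionary is never modified after it has been published here.
    private Dictionary<string, int> _snapshot = new Dictionary<string, int>();

    // Readers take no lock: they read whichever snapshot is current.
    public bool TryGet(string key, out int value)
    {
        return Volatile.Read(ref _snapshot).TryGetValue(key, out value);
    }

    // Single writer assumed: copy the current snapshot, change the copy, publish it.
    public void Set(string key, int value)
    {
        var updated = new Dictionary<string, int>(Volatile.Read(ref _snapshot));
        updated[key] = value;
        Interlocked.Exchange(ref _snapshot, updated);
    }
}

public static class SnapshotCacheDemo
{
    public static void Main()
    {
        var cache = new SnapshotCache();
        cache.Set("a", 1);
        int value;
        Console.WriteLine(cache.TryGet("a", out value) ? value : -1); // prints "1"
    }
}
```

The trade-off is that every write copies the whole dictionary, so this only suits data that is read far more often than it is written.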
.Net 4.0 added the ConcurrentDictionary class which has all public/protected methods implemented to be thread-safe. If you don't need to support legacy platforms this would let you replace the Hashtable in multithreaded code: msdn.microsoft.com/en-us/library/dd287191.aspx
I recall reading that HashTable is only reader-writer thread-safe in the scenario where information is never deleted from the table. If a reader is asking for an item which is in the table while a different item is being deleted, and the reader would have to look in more than one place for the item, it's possible that while the reader is searching the writer might move the item from a place which hasn't been examined to one which has, thus resulting in a false report that the item does not exist.
Marc Gravell

In .NET, the difference between Dictionary<,> and HashTable is primarily that the former is a generic type, so you get all the benefits of generics in terms of static type checking (and reduced boxing, but this isn't as big as people tend to think in terms of performance - there is a definite memory cost to boxing, though).


StayOnTarget

People are saying that a Dictionary is the same as a hash table.

This is not necessarily true. A hash table is one way to implement a dictionary. A typical one at that, and it may be the default one in .NET in the Dictionary class, but it's not by definition the only one.

You could equally well implement a dictionary using a linked list or a search tree, it just wouldn't be as efficient (for some metric of efficient).
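To illustrate the distinction between the dictionary concept and its implementation, here is a small sketch using two .NET classes that both implement IDictionary<TKey, TValue>: Dictionary<TKey, TValue> (documented as a hash table) and SortedDictionary<TKey, TValue> (documented as a binary search tree). The class name DictionaryConceptDemo and the CountWords helper are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class DictionaryConceptDemo
{
    // Counts words using only the IDictionary abstraction,
    // so it works with any implementation of the dictionary concept.
    public static void CountWords(IDictionary<string, int> counts, IEnumerable<string> words)
    {
        foreach (var word in words)
        {
            int n;
            counts.TryGetValue(word, out n);   // n stays 0 when the word is not yet present
            counts[word] = n + 1;
        }
    }

    public static void Main()
    {
        var words = new[] { "pear", "apple", "pear" };

        var hashed = new Dictionary<string, int>();        // backed by a hash table
        var tree = new SortedDictionary<string, int>();    // backed by a search tree

        CountWords(hashed, words);
        CountWords(tree, words);

        Console.WriteLine(hashed["pear"]);   // prints "2"
        Console.WriteLine(tree["pear"]);     // prints "2"

        // The tree-based implementation enumerates keys in sorted order.
        foreach (var pair in tree)
            Console.WriteLine(pair.Key);     // prints "apple", then "pear"
    }
}
```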


MS docs say: "Retrieving a value by using its key is very fast, close to O(1), because the Dictionary<TKey, TValue> class is implemented as a hash table." - so you should be guaranteed a hashtable when dealing with Dictionary<K,V>. IDictionary<K,V> could be anything, though :)
@rix0rrr - I think you've got that backwards, a Dictionary uses a HashTable not a HashTable uses a Dictionary.
@JosephHamilton - rix0rrr got it right: "A hash table is an implementation of a dictionary." He means the concept "dictionary", not the class (note the lower case). Conceptually, a hash table implements a dictionary interface. In .NET, Dictionary uses a hash table to implement IDictionary. It's messy ;)
I was talking about in .NET, since that's what he referenced in his response.
@JosephHamilton: implements (or implementation of) does not even remotely mean the same thing as uses. Quite the opposite. Perhaps it would have been clearer if he said it slightly differently (but with the same meaning): "a hash table is one way to implement a dictionary". That is, if you want the functionality of a dictionary, one way to do that (to implement the dictionary), is to use a hashtable.
Peter Mortensen

Collections and generics are useful for handling groups of objects. In .NET, all collection objects implement the IEnumerable interface; the original non-generic collections include ArrayList (index-value) and Hashtable (key-value). Since .NET Framework 2.0, ArrayList and Hashtable have been superseded by List<T> and Dictionary<TKey, TValue>, and they are rarely used in new projects.

Coming to the difference between Hashtable and Dictionary: Dictionary is generic, whereas Hashtable is not. We can add any type of object to a Hashtable, but when retrieving it we need to cast it to the required type, so it is not type safe. With a Dictionary, we specify the types of the key and the value in the declaration, so there is no need to cast when retrieving.

Let's look at an example:

Hashtable

using System;
using System.Collections;

class HashTableProgram
{
    static void Main(string[] args)
    {
        Hashtable ht = new Hashtable();
        ht.Add(1, "One");   // the int key is boxed to object
        ht.Add(2, "Two");
        ht.Add(3, "Three");
        foreach (DictionaryEntry de in ht)
        {
            int key = (int)de.Key;            // unboxing cast from object
            string value = (string)de.Value;  // cast from object
            Console.WriteLine(key + " " + value);
        }
    }
}

Dictionary

using System;
using System.Collections.Generic;

class DictionaryProgram
{
    static void Main(string[] args)
    {
        Dictionary<int, string> dt = new Dictionary<int, string>();
        dt.Add(1, "One");
        dt.Add(2, "Two");
        dt.Add(3, "Three");
        foreach (KeyValuePair<int, string> kv in dt)
        {
            // Key and Value are already typed as int and string: no casts needed
            Console.WriteLine(kv.Key + " " + kv.Value);
        }
    }
}

instead of explicitly assigning the datatype for KeyValuePair, we could use var. So, this would reduce typing - foreach (var kv in dt)...just a suggestion.
MarmiK

Dictionary:

It returns/throws Exception if we try to find a key which does not exist.

It is faster than a Hashtable because there is no boxing and unboxing.

Only public static members are thread safe.

Dictionary is a generic type, which means we can use it with any data type (when creating it, we must specify the data types for both keys and values). Example: Dictionary<TKey, TValue> dictionary = new Dictionary<TKey, TValue>();

Dictionary is a type-safe implementation of Hashtable; keys and values are strongly typed.

Hashtable:

It returns null if we try to find a key which does not exist.

It is slower than dictionary because it requires boxing and unboxing.

All the members in a Hashtable are thread safe.

Hashtable is not a generic type.

Hashtable is loosely-typed data structure, we can add keys and values of any type.


"It returns/throws Exception if we try to find a key which does not exist." Not if you use Dictionary.TryGetValue
g t

The Extensive Examination of Data Structures Using C# article on MSDN states that there is also a difference in the collision resolution strategy:

The Hashtable class uses a technique referred to as rehashing.

Rehashing works as follows: there is a set of different hash functions, H1 ... Hn, and when inserting or retrieving an item from the hash table, initially the H1 hash function is used. If this leads to a collision, H2 is tried instead, and onwards up to Hn if needed.

The Dictionary uses a technique referred to as chaining.

With rehashing, in the event of a collision the hash is recomputed, and the new slot corresponding to a hash is tried. With chaining, however, a secondary data structure is utilized to hold any collisions. Specifically, each slot in the Dictionary has an array of elements that map to that bucket. In the event of a collision, the colliding element is prepended to the bucket's list.
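The chaining strategy described above can be sketched with a toy hash map. This is an illustration of the general technique only, not the actual Dictionary implementation: the ChainedMap class is hypothetical, uses a fixed number of buckets, never resizes, does not accept null keys, and appends colliding entries to the end of the chain rather than prepending them.

```csharp
using System;
using System.Collections.Generic;

// Toy hash map that resolves collisions by chaining.
public sealed class ChainedMap<TKey, TValue>
{
    private readonly List<KeyValuePair<TKey, TValue>>[] _buckets;

    public ChainedMap(int bucketCount)
    {
        _buckets = new List<KeyValuePair<TKey, TValue>>[bucketCount];
    }

    private int BucketIndex(TKey key)
    {
        // Mask off the sign bit so the index is never negative.
        return (key.GetHashCode() & 0x7FFFFFFF) % _buckets.Length;
    }

    public void Set(TKey key, TValue value)
    {
        int i = BucketIndex(key);
        if (_buckets[i] == null)
            _buckets[i] = new List<KeyValuePair<TKey, TValue>>();

        var chain = _buckets[i];
        for (int j = 0; j < chain.Count; j++)
        {
            if (EqualityComparer<TKey>.Default.Equals(chain[j].Key, key))
            {
                chain[j] = new KeyValuePair<TKey, TValue>(key, value); // same key: overwrite
                return;
            }
        }
        // New key: add it to the chain. A collision simply makes the chain longer.
        chain.Add(new KeyValuePair<TKey, TValue>(key, value));
    }

    public bool TryGet(TKey key, out TValue value)
    {
        var chain = _buckets[BucketIndex(key)];
        if (chain != null)
        {
            foreach (var pair in chain)
            {
                if (EqualityComparer<TKey>.Default.Equals(pair.Key, key))
                {
                    value = pair.Value;
                    return true;
                }
            }
        }
        value = default(TValue);
        return false;
    }
}

public static class ChainedMapDemo
{
    public static void Main()
    {
        var map = new ChainedMap<int, string>(1);   // one bucket forces every key to collide
        map.Set(1, "one");
        map.Set(2, "two");
        string value;
        Console.WriteLine(map.TryGet(2, out value) ? value : "(missing)"); // prints "two"
    }
}
```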


Peter Mortensen

Since .NET Framework 3.5 there is also a HashSet<T> which provides all the pros of the Dictionary<TKey, TValue> if you need only the keys and no values.

So if you use a Dictionary<MyType, object> and always set the value to null to simulate the type safe hash table you should maybe consider switching to the HashSet<T>.
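As a brief sketch of that suggestion, a HashSet<T> can track membership without dummy values. The class name HashSetDemo and the CountDistinct helper are hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class HashSetDemo
{
    // Counts distinct items by storing each one as a key only, with no value.
    public static int CountDistinct(IEnumerable<string> items)
    {
        var seen = new HashSet<string>();
        foreach (var item in items)
            seen.Add(item);   // Add returns false and changes nothing for a duplicate
        return seen.Count;
    }

    public static void Main()
    {
        Console.WriteLine(CountDistinct(new[] { "a", "b", "a" })); // prints "2"
    }
}
```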


Nate Barbettini

The Hashtable is a loosely-typed data structure, so you can add keys and values of any type to the Hashtable. The Dictionary class is a type-safe Hashtable implementation, and the keys and values are strongly typed. When creating a Dictionary instance, you must specify the data types for both the key and value.


Andrew Morton

Notice that the documentation says: "the Dictionary<TKey, TValue> class is implemented as a hash table", not "the Dictionary<TKey, TValue> class is implemented as a HashTable"

Dictionary is NOT implemented as a HashTable, but it is implemented following the concept of a hash table. The implementation is unrelated to the HashTable class because of the use of Generics, although internally Microsoft could have used the same code and replaced the symbols of type Object with TKey and TValue.

In .NET 1.0 Generics did not exist; this is where the HashTable and ArrayList originally began.


Peter Mortensen

HashTable:

Key/value will be converted into an object (boxing) type while storing into the heap.

Key/value needs to be converted into the desired type while reading from the heap.

These operations are very costly. We need to avoid boxing/unboxing as much as possible.

Dictionary : Generic variant of HashTable.

No boxing/unboxing. No conversions required.


NullReference

A Hashtable object consists of buckets that contain the elements of the collection. A bucket is a virtual subgroup of elements within the Hashtable, which makes searching and retrieving easier and faster than in most collections.

The Dictionary class has the same functionality as the Hashtable class. A Dictionary of a specific type (other than Object) has better performance than a Hashtable for value types because the elements of Hashtable are of type Object and, therefore, boxing and unboxing typically occur if storing or retrieving a value type.

For further reading: Hashtable and Dictionary Collection Types


Peter Mortensen

Another important difference is that Hashtable is thread safe. Hashtable has built-in multiple reader/single writer (MR/SW) thread safety which means Hashtable allows ONE writer together with multiple readers without locking.

In the case of Dictionary there is no thread safety; if you need thread safety you must implement your own synchronization.

To elaborate further:

Hashtable provides some thread safety through the static Synchronized method, which returns a thread-safe wrapper around the collection. The wrapper works by locking the entire collection on every add or remove operation. Therefore, each thread that is attempting to access the collection must wait for its turn to take the one lock. This is not scalable and can cause significant performance degradation for large collections. Also, the design is not completely protected from race conditions. The .NET Framework 2.0 collection classes like List, Dictionary, etc. do not provide any thread synchronization; user code must provide all synchronization when items are added or removed on multiple threads concurrently.

If you need type safety as well as thread safety, use the concurrent collection classes in the .NET Framework. Further reading here.
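A short sketch of one such concurrent collection, ConcurrentDictionary<TKey, TValue>, which is both generic and safe for concurrent writers. The class name ConcurrentCountDemo, the method CountInParallel and the "hits" key are hypothetical.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class ConcurrentCountDemo
{
    // Several workers increment the same counter without any explicit lock.
    public static int CountInParallel(int workers, int incrementsPerWorker)
    {
        var counts = new ConcurrentDictionary<string, int>();
        Parallel.For(0, workers, _ =>
        {
            for (int i = 0; i < incrementsPerWorker; i++)
            {
                // Adds 1 if the key is absent, otherwise applies the update function.
                // The stored result is updated atomically; the function may be retried.
                counts.AddOrUpdate("hits", 1, (key, old) => old + 1);
            }
        });
        return counts["hits"];
    }

    public static void Main()
    {
        Console.WriteLine(CountInParallel(4, 1000)); // prints "4000"
    }
}
```

Doing the same with a plain Dictionary<string, int> from several threads without a lock is not safe and could lose updates or corrupt the collection.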

An additional difference concerns ordering. When entries are only added to a Dictionary and never removed, enumeration in practice often returns them in the order in which they were inserted. However, the documentation states that the enumeration order is undefined, so this behaviour should not be relied upon. Hashtable doesn't preserve the insertion order.


From what I understand, the Hashtable guarantees MR/SW thread safety in usage scenarios that do not involve deletions. I think it may have been intended to be fully MR/SW safe, but handling deletions safely greatly increases the expense of MR/SW safety. While the design of Dictionary could have offered MR/SW safety at minimal cost in no-delete scenarios, I think MS wanted to avoid treating no-delete scenarios as "special".
Peter Mortensen

One more difference that I can figure out is:

We can not use Dictionary (generics) with web services. The reason is no web service standard supports the generics standard.


We can use generic lists (List<T>) in a SOAP-based web service. But we cannot use a dictionary (or hashtable) in a web service. I think the reason for this is that the .NET XmlSerializer cannot handle dictionary objects.
Peter Mortensen

Dictionary<> is a generic type and so it's type safe.

You can insert a value of any type into a Hashtable, and this may sometimes throw an exception when the value is cast back on retrieval. But a Dictionary<int, int> will only accept integer keys and values, and similarly a Dictionary<string, string> will only accept strings.

So, it is better to use Dictionary<> instead of HashTable.


kristianp

In most programming languages, dictionaries are preferred over hashtables

I don't think this is necessarily true, most languages have one or the other, depending on the terminology they prefer.

In C#, however, the clear reason (for me) is that Hashtable and the other members of the System.Collections namespace are largely obsolete. They were present in C# 1.1 and have been replaced, since C# 2.0, by the generic classes in the System.Collections.Generic namespace.


One of the advantages of a hashtable over a dictionary is that if a key does not exist in a dictionary, it will throw an error. If a key does not exist in a hashtable, it just returns null.
In C# I would still avoid using System.Collections.Hashtable as they don't have the advantage of generics. You can use Dictionary's TryGetValue or HasKey if you don't know if the key will exist.
Whoops, not HasKey, it should be ContainsKey.
Peter Mortensen

According to what I see by using .NET Reflector:

[Serializable, ComVisible(true)]
public abstract class DictionaryBase : IDictionary, ICollection, IEnumerable
{
    // Fields
    private Hashtable hashtable;

    // Methods
    protected DictionaryBase();
    public void Clear();
.
.
.
}
Take note of these lines
// Fields
private Hashtable hashtable;

So we can be sure that DictionaryBase uses a HashTable internally.


System.Collections.Generic.Dictionary doesn't derive from DictionaryBase.
"So we can be sure that DictionaryBase uses a HashTable internally." -- That's nice, but it has nothing to do with the question.