Wednesday, March 28, 2012

DataTable vs List


  Recently just make in depth research about DataTable. DataTable is easy to use variable which allows user to pool data into it with consistence format, embed with other properties that allow us to do sorting, search etc. Currently I am researching on high performance application, which need to store large amount of data into memory. Soon I hit a bottleneck with DataTable. When inserting more than 5 million of records into DataTable, my desktop starts to struggle, even with 5G Ram So I need to find an alternative.

  The main purpose of my tool  is analyzing large amount of data. It uses to compare ID and do heavy calculation on selected field. After do some research, I found list can on par with DataTable’s needed function, at the same time increase efficiency.
So lets prove it!

DataTable button click event
private void button3_Click(object sender, EventArgs e)
        {
            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();
            DataTable dt = new DataTable();
            dt.Columns.Add("a");
            dt.Columns.Add("b");
            dt.Columns.Add("c");
            dt.Columns.Add("d");
            dt.Columns.Add("e");
            long i = 0;
            while (i < 1000000)
            {
                dt.Rows.Add("a" + i.ToString(),"b" + i.ToString(), "c" + i.ToString(), "d" + i.ToString(), "e" + i.ToString());
                i++;
            }
            stopwatch.Stop();
            MessageBox.Show(stopwatch.Elapsed.TotalMilliseconds.ToString());
        }

List button click event
private void button4_Click(object sender, EventArgs e)
        {
            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();
            List<List<lst>> lists = new List<List<lst>>();
            long i = 0;
            while (i < 1000000)
            {
                List<lst> list = new List<lst>();
                lst superlist = new lst
                {
                    field1 = "a" + i.ToString(),
                    field2 = "b" + i.ToString(),
                    field3 = "c" + i.ToString(),
                    field4 = "d" + i.ToString(),
                    field5 = "e" + i.ToString()
                };
                lists.Add(list);
                i++;
            }
            stopwatch.Stop();
            MessageBox.Show(stopwatch.Elapsed.TotalMilliseconds.ToString());
        }
        class lst
        {
            public string field1 { get; set; }
            public string field2 { get; set; }
            public string field3 { get; set; }
            public string field4 { get; set; }
            public string field5 { get; set; }         
        }

Benchmark result:
datatable memory usage


list memory usage


time taken in milisecond, for DataTable

time taken in milisecond, for List

List is 4X faster then datatable
DataTable uses 6X more memory then list

Conclusion:
there are some performance tweak can be done on source code. 



12 comments:

Arun Raj said...

thanks,
i like it.
and i am try it.

Anonymous said...

Thanks alot

Anonymous said...

Thanks alot

Sri said...

Nice article. Very useful for me.

David said...

I like your example, so I decided to make a sample project. Unless I am missing something in the code, you are never adding "lst superlist" to a collection.

Shouldn't you have a line as follows "list.Add(superlist);"

Anonymous said...

nice
4

Anonymous said...

Your sample is wrong, check your lists. Nice catch David.

Anonymous said...

Copied from my Linqpad:

DataTable
void Main()
{
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
DataTable dt = new DataTable();
dt.Columns.Add("a");
dt.Columns.Add("b");
dt.Columns.Add("c");
dt.Columns.Add("d");
dt.Columns.Add("e");

for(int i = 0; i < 1000000; i++)
{
dt.Rows.Add("a" + i.ToString(),"b" + i.ToString(), "c" + i.ToString(), "d" + i.ToString(), "e" + i.ToString());
}
stopwatch.Stop();
stopwatch.Elapsed.TotalMilliseconds.ToString().Dump();
dt.Rows[0].Dump(); // Just to show that there is content here
}

Lists
void Main()
{
Stopwatch stopwatch = new Stopwatch();
stopwatch.Start();
var lists = new List();

for(var i = 0; i < 1000000; i++)
{
TestList l = new TestList
{
field1 = "a" + i.ToString(),
field2 = "b" + i.ToString(),
field3 = "c" + i.ToString(),
field4 = "d" + i.ToString(),
field5 = "e" + i.ToString()
};
lists.Add(l);
}
stopwatch.Stop();
stopwatch.Elapsed.TotalMilliseconds.ToString().Dump();
lists.Take(1).Dump(); // Just to show that there is content here unlike example above
}

class TestList
{
public string field1 { get; set; }
public string field2 { get; set; }
public string field3 { get; set; }
public string field4 { get; set; }
public string field5 { get; set; }
}


Your conclusion is however still mostly correct, lists do seem to be quicker and more memory efficient. Thanks for sharing.

Cindy Dy said...


I like the way on how you put up your blogs. Wonderful and awesome. Hope to read more post from you in the future. Goodluck. Happy blogging!

Bubble
www.gofastek.com

Selva.G said...

Its really help me to understand :) thanks a lot :)

Selva.G said...
This comment has been removed by the author.
Silvia Jacinto said...


The mind is a great philosopher. And life is not a philosophy, life is a reality. And philosophy is an escape from reality; philosophy means thinking. Life is, there is no question of thought. You can simply jump into it. You can simply experience it. Visit my site for more information.Thank you.

n8fan.net

www.n8fan.net