
R versus Pandas versus C#

Hi guys, this September I moved to Taiwan for my Master's degree in Computer Science. I will tell you about my journey in Taiwan later.

Today I will share my ugly comparison of the R language, pandas from Python, and C# when reading a CSV file.

Why do I need to compare these languages? Because my research major will be Deep Learning, and it all starts with reading the data.

The data is not small, it's big data. The smallest file is about 100 MB. For this comparison I will use a file of about 390 MB, which is still not big compared with another file of about 7 GB.

file_size

We will start with R.

Here is the simple code to read the CSV file and print how many rows the data has:

library(data.table)  # provides fread
proc_time <- proc.time()
setwd("D:/ROS/RESEARCH/Prof_Yeh/")
DIRECTORY_PATH <- getwd()
fread_csv <- fread(file.path(DIRECTORY_PATH, "new_BloodPressureData.csv"), header = TRUE, sep = ",", verbose = FALSE)
# Output the number of rows
print(nrow(fread_csv))
proc.time() - proc_time

And here is the result:


r_result

How about Python pandas?

The pandas code:

import pandas as pd
 
# Read the file
data = pd.read_csv("new_BloodPressureData.csv", low_memory=False)
# Output the number of rows
print("Total rows: {0}".format(len(data)))

pandas_result
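
The pandas snippet above does not time itself the way the R and C# versions do. Here is a minimal sketch of one way to measure it from inside Python. It assumes a small helper I am calling `timed`, which is a made-up name for this example and not part of pandas, built on the standard-library `time.perf_counter`:

```python
import time


def timed(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds).

    `timed` is a hypothetical helper for illustration, not a pandas API.
    """
    start = time.perf_counter()            # high-resolution monotonic clock
    result = func(*args, **kwargs)         # run the call being measured
    elapsed = time.perf_counter() - start  # wall-clock seconds spent in func
    return result, elapsed
```

With pandas imported as `pd`, a call such as `data, seconds = timed(pd.read_csv, "new_BloodPressureData.csv", low_memory=False)` would return both the DataFrame and the elapsed wall-clock time, which is roughly the same kind of number as the elapsed value from `proc.time()` in R and `Stopwatch` in C#.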

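Last, I will show you C#. Note that this program only counts the lines in the file; it does not parse the fields the way R and pandas do.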

using System;
using System.Diagnostics;
using System.IO;
 
namespace read_csv
{
    class Program
    {
        static void Main(string[] args)
        {
            Stopwatch stopwatch = new Stopwatch();
            stopwatch.Start();
            int counter = 0;
            string line;
 
            // Read the file line by line and count the lines.
            // The using block closes the reader even if an exception is thrown.
            using (var file =
               new StreamReader("D:\\ROS\\RESEARCH\\Prof_Yeh\\mine\\python\\pandas\\new_BloodPressureData.csv"))
            {
                while ((line = file.ReadLine()) != null)
                {
                    counter++;
                }
            }
            Console.WriteLine(counter.ToString());
            stopwatch.Stop();
            Console.WriteLine("Time elapsed: {0}", stopwatch.Elapsed);
        }
    }
}

And the result is:

csharp_result
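
One caveat when reading these numbers: the C# loop only counts lines, while `fread` and `read_csv` parse every field. Below is a rough Python sketch of that difference, using only the standard library. The function names `count_lines` and `count_records` are made up for this example, and the UTF-8 encoding is an assumption about the file:

```python
import csv


def count_lines(path):
    """Count physical lines, like the C# ReadLine loop (header included)."""
    # encoding="utf-8" is an assumption; adjust it to match the real file.
    with open(path, newline="", encoding="utf-8") as f:
        return sum(1 for _ in f)


def count_records(path):
    """Count parsed CSV data rows, excluding the header row."""
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row, if there is one
        return sum(1 for _ in reader)
```

On a file with no line breaks inside quoted fields, `count_lines` should come out one higher than `count_records` because of the header. Timing the two gives a rough feel for how much of the work is parsing, which is the part the C# number leaves out.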

So, which one will you use after seeing the comparison?

Have a nice day!
