Category: C#

Runtime loading versus compile-time loading

How it started:

We had a .NET meeting in Romania where one of the presenters loaded the MaxMind GeoIP database to see where a user comes from.
He loaded it from a csv file into a List.
He insisted on the speed of the algorithm that finds the IP range.
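I do not have the presenter's code, so the following is only a minimal sketch of one common way to find an IP range quickly: keep the ranges sorted by start address and do a binary search. The names IpRange, IpLookup, ToUInt32 and FindLocationId are mine, not from the talk or from MaxMind, and the sketch assumes IPv4 addresses and non-overlapping ranges.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical record: one block of consecutive IPv4 addresses mapped to a location id.
public struct IpRange
{
    public uint Start;
    public uint End;
    public int LocationId;

    public IpRange(uint start, uint end, int locationId)
    {
        Start = start;
        End = end;
        LocationId = locationId;
    }
}

public static class IpLookup
{
    // Converts dotted IPv4 text ("1.2.3.4") to a 32-bit number, most significant octet first.
    public static uint ToUInt32(string ip)
    {
        string[] parts = ip.Split('.');
        if (parts.Length != 4) throw new FormatException("Expected four octets: " + ip);
        uint value = 0;
        foreach (string part in parts)
        {
            value = (value << 8) | byte.Parse(part);
        }
        return value;
    }

    // Binary search over ranges that are sorted by Start and do not overlap.
    // Returns the location id, or -1 when no range contains the address.
    public static int FindLocationId(IReadOnlyList<IpRange> sortedRanges, uint ip)
    {
        int lo = 0;
        int hi = sortedRanges.Count - 1;
        while (lo <= hi)
        {
            int mid = lo + (hi - lo) / 2;
            IpRange r = sortedRanges[mid];
            if (ip < r.Start) hi = mid - 1;
            else if (ip > r.End) lo = mid + 1;
            else return r.LocationId;
        }
        return -1;
    }
}
```

With this approach a lookup needs about log2(n) comparisons, so the lookup itself is cheap once the data is in memory; the rest of this post is about the cost of getting it there.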

And I was thinking: OK, the algorithm speed is one argument.

What if, instead of loading the csv file at runtime and putting it into a List, I load the data at compile time?

So I made a .tt (T4 template) file that parses the MaxMind csv file and generates something like:

public class Location : List<GeoLocation>
{
    public Location() : base()
    {
        // One Add call is generated for each row of the csv file.
        this.Add(new GeoLocation(1, "O1", "", "", "", 0f, 0f));
        this.Add(new GeoLocation(2, "AP", "", "", "", 35f, 105f));
        this.Add(new GeoLocation(3, "EU", "", "", "", 47f, 8f));
        this.Add(new GeoLocation(4, "AD", "", "", "", 42.5f, 1.5f));
        this.Add(new GeoLocation(5, "AE", "", "", "", 24f, 54f));
        this.Add(new GeoLocation(6, "AF", "", "", "", 33f, 65f));
        this.Add(new GeoLocation(7, "AG", "", "", "", 17.05f, -61.8f));
        this.Add(new GeoLocation(8, "AI", "", "", "", 18.25f, -63.1667f));
        this.Add(new GeoLocation(9, "AL", "", "", "", 41f, 20f));
        this.Add(new GeoLocation(10, "AM", "", "", "", 40f, 45f));
        this.Add(new GeoLocation(11, "AN", "", "", "", 12.25f, -68.75f));
        this.Add(new GeoLocation(12, "AO", "", "", "", -12.5f, 18.5f));
        this.Add(new GeoLocation(13, "AQ", "", "", "", -90f, 0f));
        // and so on
    }
}
 

The generated .cs file is 30 MB. The compiled exe of a test Console application is about 19 MB.
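For readers who have not used T4, here is a rough sketch of the generation step written as plain C# instead of a .tt template (this is not my actual template). The class name LocationCodeGenerator, its methods and the column order (id, country, region, city, postal code, latitude, longitude) are assumptions for illustration, and the naive Split(',') does not handle commas inside quoted fields.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;

public static class LocationCodeGenerator
{
    // Turns one csv line into one "this.Add(...)" statement.
    // Assumed column order: id, country, region, city, postal code, latitude, longitude.
    public static string ToAddStatement(string csvLine)
    {
        string[] f = csvLine.Split(',');
        int id = int.Parse(f[0], CultureInfo.InvariantCulture);
        float lat = float.Parse(f[5], CultureInfo.InvariantCulture);
        float lon = float.Parse(f[6], CultureInfo.InvariantCulture);
        return string.Format(CultureInfo.InvariantCulture,
            "this.Add(new GeoLocation({0}, {1}, {2}, {3}, {4}, {5}f, {6}f));",
            id, Quote(f[1]), Quote(f[2]), Quote(f[3]), Quote(f[4]), lat, lon);
    }

    // Strips optional surrounding quotes and emits a C# string literal.
    private static string Quote(string field)
    {
        string raw = field.Trim().Trim('"');
        return "\"" + raw.Replace("\\", "\\\\").Replace("\"", "\\\"") + "\"";
    }

    // Writes the whole generated class; both paths are placeholders chosen by the caller.
    public static void Generate(string csvPath, string outputPath)
    {
        var sb = new StringBuilder();
        sb.AppendLine("using System.Collections.Generic;");
        sb.AppendLine();
        sb.AppendLine("public class Location : List<GeoLocation>");
        sb.AppendLine("{");
        sb.AppendLine("    public Location() : base()");
        sb.AppendLine("    {");
        foreach (string line in File.ReadLines(csvPath))
        {
            // Skip blank lines and header lines (anything not starting with a digit).
            if (line.Length == 0 || !char.IsDigit(line[0])) continue;
            sb.Append("        ").AppendLine(ToAddStatement(line));
        }
        sb.AppendLine("    }");
        sb.AppendLine("}");
        File.WriteAllText(outputPath, sb.ToString());
    }
}
```

Note that everything ends up in a single constructor with one statement per csv row, which is what makes the generated file so large.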

I put each version into a console application and the results were:

The Console application that has the csv parser loads in 1 second. This is the runtime loading of the file.
The Console application that has the class with everything predefined has still not loaded after 1 minute, and its memory keeps increasing. This is the compile-time loading of the contents.
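A minimal sketch of how the runtime side can be loaded and timed is below. It is not the code from the download; the GeoLocation stand-in, its field names, RuntimeLoader and the assumed column order are all hypothetical, and the parser is deliberately naive (no commas inside quoted fields).

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Globalization;

// Stand-in for the GeoLocation class used in the post; field names are guesses.
public class GeoLocation
{
    public int Id;
    public string Country, Region, City, PostalCode;
    public float Latitude, Longitude;

    public GeoLocation(int id, string country, string region, string city,
                       string postalCode, float latitude, float longitude)
    {
        Id = id; Country = country; Region = region; City = city;
        PostalCode = postalCode; Latitude = latitude; Longitude = longitude;
    }
}

public static class RuntimeLoader
{
    // Parses csv lines (assumed order: id, country, region, city, postal code,
    // latitude, longitude) into a list. Lines not starting with a digit are skipped.
    public static List<GeoLocation> Parse(IEnumerable<string> lines)
    {
        var result = new List<GeoLocation>();
        foreach (string line in lines)
        {
            if (line.Length == 0 || !char.IsDigit(line[0])) continue;
            string[] f = line.Split(',');
            result.Add(new GeoLocation(
                int.Parse(f[0], CultureInfo.InvariantCulture),
                Unquote(f[1]), Unquote(f[2]), Unquote(f[3]), Unquote(f[4]),
                float.Parse(f[5], CultureInfo.InvariantCulture),
                float.Parse(f[6], CultureInfo.InvariantCulture)));
        }
        return result;
    }

    private static string Unquote(string field)
    {
        return field.Trim().Trim('"');
    }

    // Runs the given loading action once and returns the elapsed wall-clock time.
    public static TimeSpan Measure(Action load)
    {
        Stopwatch sw = Stopwatch.StartNew();
        load();
        sw.Stop();
        return sw.Elapsed;
    }
}
```

The runtime version would then be timed with something like RuntimeLoader.Measure(() => RuntimeLoader.Parse(File.ReadLines("locations.csv"))), where the file name is a placeholder, and the compile-time version with RuntimeLoader.Measure(() => new Location()). Only the first call in a fresh process includes the JIT cost, so each variant should be measured in its own run.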

And this is the problem: after all, the csv parser loads all of that into memory just the same and, on top of that, it pays the hard drive access time. So the compiled one should be faster, right?
Not so fast. The JIT comes into action: at runtime it has to compile that huge generated constructor to native code before it can run. So it takes MORE time.
I submitted the question to a list and they came back with 2 (big) suggestions:

Suggestion 1: NGEN the assembly. NGEN ran for 1 hour on my PC (x64, 4 GB of RAM, 4 cores) and did not finish (4 GB used at maximum). Apparently not a good idea.
Suggestion 2: use a struct instead of a class. Same time…

To try it yourself, please download:

http://msprogrammer.serviciipeweb.ro/wp-content/uploads/runcomp.7z

However, the final question arises: from what amount of data should we load at runtime instead of at compile time?
(For 1 item, compile time is better. For 2, the same. … For all the data in the csv, runtime is required.)
What I expect is a function that takes a parameter (data that is x bytes long) and says:

loading < 1000 records is faster at compile time than from the hard disk (runtime)
loading > 2000 records is faster from the hard disk (runtime) than at compile time

From 1000 to 2000 it depends on RAM, RPM and others
(or some algorithm for that)

How do we calculate this number?
(Note that this is a purely mathematical question of minimizing the time by using both compile time and runtime. It does not matter in practice, since the runtime loading time is so small.)
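One way to start, offered only as a sketch: measure both variants at a couple of record counts and assume each cost is roughly linear, T(n) = fixed + perRecord * n. Setting T_runtime(n) = T_compiled(n) gives n* = (fixed_runtime - fixed_compiled) / (perRecord_compiled - perRecord_runtime). The linear model is my assumption; the behaviour I saw with the full file suggests the JIT cost may not stay linear for very large methods, so the estimate is only meaningful near the measured sizes. LinearCost and Crossover are hypothetical names.

```csharp
using System;

// Linear cost model: time(n) = Fixed + PerRecord * n (any consistent time unit).
public struct LinearCost
{
    public double Fixed;      // cost that does not depend on the record count
    public double PerRecord;  // additional cost for each record

    // Fits a line through two measurements (n1, t1) and (n2, t2).
    public static LinearCost FromTwoPoints(double n1, double t1, double n2, double t2)
    {
        if (n1 == n2) throw new ArgumentException("The two record counts must differ.");
        double slope = (t2 - t1) / (n2 - n1);
        return new LinearCost { Fixed = t1 - slope * n1, PerRecord = slope };
    }

    // Predicted time for n records under this model.
    public double At(double n)
    {
        return Fixed + PerRecord * n;
    }
}

public static class Crossover
{
    // Record count at which both models predict the same time.
    // Returns NaN when the lines are parallel (no single crossover point).
    public static double Find(LinearCost runtime, LinearCost compiled)
    {
        double slopeDiff = compiled.PerRecord - runtime.PerRecord;
        if (slopeDiff == 0) return double.NaN;
        return (runtime.Fixed - compiled.Fixed) / slopeDiff;
    }
}
```

If the result is negative, one variant is faster for every record count under this model. The factors I mentioned (RAM, disk RPM and so on) would show up as different fitted coefficients on different machines, so the measurements have to be taken on the target hardware.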