Runtime loading versus compile-time loading
How it started:
We had a .NET meeting in Romania where one of the presenters loaded the MaxMind GeoIP database to see where a user comes from.
He loaded the CSV file into a List.
He was insisting on the speed of the algorithm that finds the IP range.
And I was thinking: OK, the algorithm's speed is one argument.
But what if, instead of loading the CSV file at runtime and putting it into a List, we generated the whole list at compile time?
So I made a .tt (T4 template) file that parses the MaxMind CSV file and generates something like this:
public class Location : List<GeoLocation>
{
    public Location() : base()
    {
        this.Add(new GeoLocation(1, "O1", "", "", "", 0f, 0f));
        this.Add(new GeoLocation(2, "AP", "", "", "", 35f, 105f));
        this.Add(new GeoLocation(3, "EU", "", "", "", 47f, 8f));
        this.Add(new GeoLocation(4, "AD", "", "", "", 42.5f, 1.5f));
        this.Add(new GeoLocation(5, "AE", "", "", "", 24f, 54f));
        this.Add(new GeoLocation(6, "AF", "", "", "", 33f, 65f));
        this.Add(new GeoLocation(7, "AG", "", "", "", 17.05f, -61.8f));
        this.Add(new GeoLocation(8, "AI", "", "", "", 18.25f, -63.1667f));
        this.Add(new GeoLocation(9, "AL", "", "", "", 41f, 20f));
        this.Add(new GeoLocation(10, "AM", "", "", "", 40f, 45f));
        this.Add(new GeoLocation(11, "AN", "", "", "", 12.25f, -68.75f));
        this.Add(new GeoLocation(12, "AO", "", "", "", -12.5f, 18.5f));
        this.Add(new GeoLocation(13, "AQ", "", "", "", -90f, 0f));
        // and so on
    }
}
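Something like this T4 template could produce that output (a rough sketch; the file name, the column layout, and the quoting behavior are my assumptions based on the generated code above):

<#@ template language="C#" hostspecific="true" #>
<#@ output extension=".cs" #>
<#@ assembly name="System.Core" #>
<#@ import namespace="System.IO" #>
<#@ import namespace="System.Linq" #>
using System.Collections.Generic;

public class Location : List<GeoLocation>
{
    public Location() : base()
    {
<#
    // Assumed file name and layout: locId,country,region,city,postalCode,latitude,longitude.
    // The string fields in MaxMind's CSV are already quoted, so they can be emitted
    // verbatim as C# string literals. Split(',') is a simplification: it breaks on
    // city names that contain commas.
    var path = Host.ResolvePath("GeoLiteCity-Location.csv");
    foreach (var line in File.ReadLines(path).Skip(2)) // skip copyright + header lines
    {
        var f = line.Split(',');
#>
        this.Add(new GeoLocation(<#= f[0] #>, <#= f[1] #>, <#= f[2] #>, <#= f[3] #>, <#= f[4] #>, <#= f[5] #>f, <#= f[6] #>f));
<#
    }
#>
    }
}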
The generated .cs file is about 30 MB. The compiled .exe of a test console application is about 19 MB.
I put each version into a console application, and the results were:
The console application with the CSV parser loads in about 1 second. This is the version that loads the file at runtime.
The console application with the class containing all the predefined data still has not loaded after 1 minute, and its memory keeps increasing. This is the version that loads the contents at compile time.
And this is the puzzle: after all, the CSV parser ends up loading all the same data into memory, and on top of that it pays the hard-drive access time. So the compiled one should be faster, right?
Not so fast. The JIT comes into action: before that enormous generated constructor can run, the JIT has to compile it. So it takes MORE time.
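For comparison, the runtime version is essentially a loop like this (a minimal sketch; GeoLocation, the file name, and the column layout are the same assumptions as above):

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Globalization;
using System.IO;
using System.Linq;

class RuntimeLoader
{
    static void Main()
    {
        var sw = Stopwatch.StartNew();
        var locations = new List<GeoLocation>();
        // Skip the copyright and header lines, then parse each record.
        foreach (var line in File.ReadLines("GeoLiteCity-Location.csv").Skip(2))
        {
            var f = line.Split(','); // simplification: breaks on quoted commas
            locations.Add(new GeoLocation(
                int.Parse(f[0]),
                f[1].Trim('"'), f[2].Trim('"'), f[3].Trim('"'), f[4].Trim('"'),
                float.Parse(f[5], CultureInfo.InvariantCulture),
                float.Parse(f[6], CultureInfo.InvariantCulture)));
        }
        sw.Stop();
        Console.WriteLine("Loaded {0} records in {1} ms",
            locations.Count, sw.ElapsedMilliseconds);
    }
}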
I submitted the question to a mailing list, and two (big) suggestions came back:
Suggestion 1: NGEN the executable. NGEN-ing ran for 1 hour on my PC (x64, 4 GB of RAM, 4 cores) and did not finish (the 4 GB of RAM were used to the maximum). Not a good idea, apparently.
Suggestion 2: use a struct instead of a class. Same time…
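For suggestion 2, the idea was presumably something like this (a sketch; the field names are my guess from the constructor calls above):

// Sketch of suggestion 2: GeoLocation as a struct instead of a class.
// The records are then stored inline in the List<T>'s backing array instead
// of as separate heap objects, which cuts GC pressure but does not shrink
// the huge generated constructor that the JIT still has to compile.
public struct GeoLocation
{
    public readonly int Id;
    public readonly string Country;
    public readonly string Region;
    public readonly string City;
    public readonly string PostalCode;
    public readonly float Latitude;
    public readonly float Longitude;

    public GeoLocation(int id, string country, string region, string city,
                       string postalCode, float latitude, float longitude)
    {
        Id = id;
        Country = country;
        Region = region;
        City = city;
        PostalCode = postalCode;
        Latitude = latitude;
        Longitude = longitude;
    }
}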
If you want to try it for yourself, please download:
http://msprogrammer.serviciipeweb.ro/wp-content/uploads/runcomp.7z
However, the final question arises: above what number of records should we load at runtime instead of compile time?
(For 1 item, compile time is better. For 2, the same. … For all the data in the CSV, runtime is required.)
What I expect is a function that takes a parameter (data that is x bytes long) and says:
loading < 1000 records is faster at compile time than from the hard disk (runtime);
loading > 2000 records is faster from the hard disk (runtime) than at compile time;
from 1000 to 2000 records it depends on RAM, disk RPM and other factors
(or some algorithm for that).
How do we calculate this number?
(Take note: this is a pure mathematical question of minimizing the time by using both compile time and runtime. It does not matter much in practice, since the runtime loading time is so small.)
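Short of a closed formula, one pragmatic way to find the break-even point is empirical: build both variants for increasing record counts and time complete process runs, so the JIT cost of the generated constructor is included. A rough sketch (the per-size exe names are hypothetical):

using System;
using System.Diagnostics;

class BreakEven
{
    // Measures a complete process run, so the time includes JIT-compiling
    // the (possibly huge) generated constructor, not just the Add() calls.
    static long TimeProcessMs(string exePath)
    {
        var sw = Stopwatch.StartNew();
        using (var p = Process.Start(exePath))
        {
            p.WaitForExit();
        }
        sw.Stop();
        return sw.ElapsedMilliseconds;
    }

    static void Main()
    {
        // Hypothetical setup: for each n, two console exes were built in advance:
        // compiled_<n>.exe contains a generated class with n Add() calls,
        // runtime_<n>.exe parses the first n lines of the CSV.
        foreach (var n in new[] { 100, 1000, 2000, 5000, 10000, 100000 })
        {
            long compiled = TimeProcessMs("compiled_" + n + ".exe");
            long runtime = TimeProcessMs("runtime_" + n + ".exe");
            Console.WriteLine("{0,7} records: compile time {1} ms, runtime {2} ms",
                n, compiled, runtime);
        }
    }
}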