Performance Boost

Dec 15, 2011 at 4:44 PM
Edited Dec 15, 2011 at 4:45 PM

I know this project seems dead, but I just thought I would share a drastic performance booster for the spider/indexer. Changing the IsNumber method in Spider.cs to the code below dropped my per-document parsing time from an average of 5000ms to 350ms!

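        private bool IsNumber(ref string word)
        {
            long value;

            // TryParse reports failure through its return value instead of throwing.
            if (Int64.TryParse(word, out value))
            {
                // Store the normalized form of the number back into the word.
                word = value.ToString();
                return true;
            }

            return false;
        }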

This makes such a dramatic difference because the method is called on every word that's indexed, i.e., hundreds or thousands of times per document. The old method relies on catching an exception, which has a lot of overhead, especially when it happens thousands of times.
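
To see the difference for yourself, here is a minimal sketch that times both approaches side by side. The old IsNumber isn't shown in this post, so IsNumberWithException is only a guess at what an exception-based version looks like (hypothetical, not the project's actual code). The class name, word list and pass count are also made up for illustration. Timings depend on the machine and on whether a debugger is attached, so no numbers are claimed.

```csharp
using System;
using System.Diagnostics;

public static class IsNumberComparison
{
    // Hypothetical stand-in for the old approach: parse, and treat a thrown
    // exception as "not a number". Every non-numeric word pays for a throw.
    public static bool IsNumberWithException(ref string word)
    {
        try
        {
            long value = Int64.Parse(word);
            word = value.ToString();
            return true;
        }
        catch (FormatException)
        {
            return false;
        }
        catch (OverflowException)
        {
            return false;
        }
    }

    // The TryParse approach from the post: failure is just a false return.
    public static bool IsNumberWithTryParse(ref string word)
    {
        long value;
        if (Int64.TryParse(word, out value))
        {
            word = value.ToString();
            return true;
        }
        return false;
    }

    public static void Main()
    {
        // Made-up sample: mostly non-numeric words, as in typical documents.
        string[] words = { "spider", "2011", "indexer", "0042", "document" };
        const int passes = 2000;

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < passes; i++)
        {
            foreach (string w in words)
            {
                string copy = w; // a foreach variable cannot be passed by ref
                IsNumberWithException(ref copy);
            }
        }
        sw.Stop();
        Console.WriteLine("try/catch: " + sw.ElapsedMilliseconds + " ms");

        sw = Stopwatch.StartNew();
        for (int i = 0; i < passes; i++)
        {
            foreach (string w in words)
            {
                string copy = w;
                IsNumberWithTryParse(ref copy);
            }
        }
        sw.Stop();
        Console.WriteLine("TryParse:  " + sw.ElapsedMilliseconds + " ms");
    }
}
```

Both versions should return the same answers for the same input; only the cost of the failure path differs. Since most words in a document are not numbers, the failure path is the one that gets hit most often.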