Ruby Could Replace my Python Crawler Pretty Soon
One of my developers just sent me some truly incredible stats about Ruby 1.9 and its threading performance.
20 threads * 100,000 iterations
Ruby 1.9 = 1.54 s.
Ruby Enterprise = 3.01 s.
JRuby 1.1.2 = 5.82 s.
Jython 2.2.1 = 11.86 s.
Python 2.5.2 = 12.32 s.
Ruby 1.8.7 = 22.68 s.
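The post itself does not show the test code, so the following is only a minimal sketch of what a "20 threads * 100,000 iterations" micro-benchmark might look like in modern Python 3. The counting loop and the names `worker` and `run_benchmark` are assumptions made for illustration; they are not the code behind the numbers above.

```python
import threading
import time

THREADS = 20          # thread count quoted in the post
ITERATIONS = 100_000  # iterations per thread quoted in the post


def worker(counts, index, iterations):
    # Hypothetical CPU-only workload: count up in a tight loop.
    total = 0
    for _ in range(iterations):
        total += 1
    counts[index] = total


def run_benchmark(threads=THREADS, iterations=ITERATIONS):
    # Each thread writes its result into its own slot, so no lock is needed.
    counts = [0] * threads
    pool = [
        threading.Thread(target=worker, args=(counts, i, iterations))
        for i in range(threads)
    ]
    start = time.perf_counter()
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    elapsed = time.perf_counter() - start
    return elapsed, counts


if __name__ == "__main__":
    elapsed, counts = run_benchmark()
    print(f"{THREADS} threads * {ITERATIONS} iterations: {elapsed:.2f} s")
```

On a standard CPython build the global interpreter lock lets only one thread execute bytecode at a time, so a loop like this mostly measures interpreter and thread-switching overhead rather than parallel speedup.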
Our attempt at testing Ruby as a crawler really wasn’t all that much slower than Python, so it will be interesting to see what happens with Ruby 1.9.
The blog post about the test (it’s in Polish)
You’re presuming that your bottleneck was the threading.
It’s kind of hard to evaluate metrics without seeing the actual code. A lot of microbenchmarks aren’t indicative of real-world performance.
Tim Bray’s Wide Finder project might be a good reference (both yours and his are I/O-bound). In the end, programmer proficiency is probably the most important factor in speed.
I don’t get it. Crawlers are network-bound; the speed of your implementation language has virtually no importance.
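To put rough numbers on the network-bound argument, here is a small back-of-the-envelope sketch. The 200 ms and 10 ms figures are purely hypothetical, not measurements from the post; the 8x factor is simply the ratio between the Python 2.5.2 and Ruby 1.9 timings quoted above (12.32 s / 1.54 s). The helper `page_time_ms` is an illustrative name, and it assumes fetching and processing happen back to back for each page.

```python
def page_time_ms(network_ms, cpu_ms, cpu_speedup=1.0):
    """Time for one page when fetch and processing run back to back."""
    return network_ms + cpu_ms / cpu_speedup


# Hypothetical figures, not measurements from the post.
NETWORK_MS = 200.0  # assumed time spent waiting on the remote server
CPU_MS = 10.0       # assumed time spent in the interpreter per page

baseline = page_time_ms(NETWORK_MS, CPU_MS)
faster = page_time_ms(NETWORK_MS, CPU_MS, cpu_speedup=8.0)
saving = 1 - faster / baseline

print(f"baseline {baseline:.2f} ms, faster runtime {faster:.2f} ms, saving {saving:.1%}")
```

Under these assumed numbers an 8x faster runtime shortens each page by only about 4%, which is the sense in which the language matters little when the network dominates. If CPU time per page were much larger than assumed here, the conclusion would shift accordingly.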
@Phil If your crawler is network bound, then you need more pipe.
@Michael You are correct.
@Curtis: Here is the Polish guy’s test code. I haven’t run it yet: http://pastie.org/private/jtqfdbloc83wqqnk525mzw