So, I got distracted by a server launch and a Vegas trip, but the Wide Finder implementations keep rolling in.
I will try to run as many of these as I can manage on the T2 servers and report back, but it’ll take a while. Also, since the servers I have access to are pretty naked, I have to build all the infrastructure myself, and if it takes heroic effort to get something running, I probably just won’t.
Scala · Erik Engbrecht tries Scala (static but inferred typing, functional flavors, JVM) out on the problem in Dangerous Monads?
Meta · Aloof Schipperke’s Notice: An Update for your Perlang FPGA Cluster is Available doesn’t actually include any code (do IT Architects code?) but it does include intelligent commentary and perspective.
Erik Engbrecht, the guy working in Scala, also wrote Why test parallelism on a simple function? which is almost all discussion.
C and Family · Ilmari Heikkinen has already contributed code, but hasn’t stopped refining it: see Wide Finder C.
Alastair Rankine wrote Wide Finder in C++, perhaps cheating slightly by using the Boost library.
OCaml · I’ve never really looked seriously at OCaml, but over the last couple of years, I’ve seen it turning in some scary-fast times on a variety of computing problems. So the other day I took a quick glance and found out that it’s got no built-in parallelism, and thus isn’t much of a Wide Finder candidate. But Ilmari took a run at it anyhow and wrote “Wide” Finder Ocaml.
Python · Andrew Dalke’s Wide Finder is another piece with a whole lot of analysis, but it also has a bunch of Python code, building off work previously done (and reported here) by Fredrik Lundh.
He advances an analogy that tickles my fancy: Wide Finder is a kata.
C# · Two of my favorite people at Microsoft have weighed in: Don Box with Wide Finder in C# - the Naive implementation and Joe Cheng with Wide Finder with LINQ.
Uh, guys, I may have trouble getting those to run on the T2. Granted, Ubuntu runs on it and Mono runs on Ubuntu and C# runs on Mono, but the whole thing feels like a stretch. I’ll give it a try but I don’t promise heroics.
What I’m Thinking · What on earth have I got myself into?
From: Michel S. (Oct 13 2007, at 13:05)
And yet another C/C++ take on the problem. I used Boost as well, but only to borrow its regular expression matcher. Code is multithreaded (pthread) and interaction between threads is kept to a minimum (merging the hashed maps at the end).
From: Caoyuan Deng (Oct 15 2007, at 00:10)
Hi Tim,
I refined my work and got a more concise, accurate and faster result; it took about 4.596 sec on my 2-core MacBook.
It's at:
http://blogtrader.net/page/dcaoyuan?entry=the_erlang_way_was_tim
From: James Justin Harrell (Oct 15 2007, at 09:11)
Is the page at http://www.tbray.org/ongoing/When/200x/2007/09/20/Wide-Finder going to link to parts VIII and IX (and future parts)? It would be nice to bookmark a living list. I would just link to a "wide finder" tag page, but I don't see any kind of tagging.
From: Caoyuan Deng (Oct 15 2007, at 14:15)
Hi Tim,
This is my new version of Wide Finder. Although it does not perform better than my previous one, it should scale better thanks to a new parallelized file reader.
Using full binary matching instead of the lists in my previous version, I got about 10 sec on my MacBook.
http://blogtrader.net/page/dcaoyuan?entry=learning_coding_parallelization_was_tim