color cycle (slow)

Kistaro Windrider, Reptillian Situation Assessor

Unfortunately, I Really Am That Nerdy

Finishing, unstacking, unwinding
Last week was crazy.

Interesting, but crazy. It's the fastest-paced work I've ever done on final projects, and that includes the C memory allocator I wrote in three days flat. For one of the same professors, too! Dr. Chamberlain and parallelizing a sorting algorithm instead of implementing a variant of malloc (unfortunately, my experimental algorithm didn't work efficiently, so I went all-out on a good analysis of what went wrong, why it went wrong, what might have helped, and why an 8-node MPI-based parallel sort is inherently doomed relative to nonparallel quicksort anyway), but in its way it wasn't all that different. Heck, I even used similar linked list structures.

My Machine Learning final was a lot more interesting. (See previous post about the urge to stand up and yell "GROOVY" about one of its results.) It was pure fun. I took my favorite ML algorithm- decision trees- and thought of two ways to improve it. One of them worked a little, the other one absolutely freakin' rocked. It also made me glad I'd done all the assignments for the class in C#, designed to be generic, object-oriented, and maintainable, instead of throwaway one-shot code for the particular assignments; I found myself reusing parts from several of the other assignments, and being able to just plug the pieces in together and have it work was exactly what I needed.

It was definitely the right "last assignment" for me. All I have left are two final exams and college is over for me. But that final project felt more final than those exams will- I'm already thinking of them more as a formality than the real end of my four-year education.

It's not just the project itself, it's what it was and what it meant to me. I invented a new algorithm that, according to Dr. Smart, nobody had come up with before- or that at the very least isn't standard. I made it work. And then, in five hours, I wrote a twelve-page single-spaced paper explaining and analyzing it. And the entire time, as stressed as I was, it felt perfectly natural and easy to me- I knew what to say, I knew what my results meant, I knew what I was doing. I felt comfortable with the task. I would have appreciated more time- given that I was working about 14 hours a day on my assignments for the last week- but I didn't really need it, and I was fine with not having it. I was in my element. I knew then that the entire last four years were the right decision for me- I found something I was good at, found myself enjoying it, and developed my skills. I honestly feel comfortable describing myself as "a computer scientist", rather than "majoring in CS" or some similar construct- I've earned the title, because I know what I'm doing.

I don't know how to describe it. It's just that knowledge, a deep understanding that I am good at this and it's something I can and should be proud of, and it took that one assignment to make it really "click". And that's cool.

So I spent most of yesterday asleep, and I did nothing useful the entire day today, and after last week I figure that's exactly how it should be. I'll go back to campus for my exams (I'm at home now), and I'll worry about packing up my dorm and then packing up to move back to Redmond after that, but I think I'll just unwind until then. Tuesday's exam does not worry me- especially because it's basically irrelevant; whether I ace it or flunk it, I'm getting a C in the class, so it's kinda low-pressure. Wednesday's is for higher stakes, but I also predict it to be fairly easy, so I'm still not worried.

I've enjoyed college, but at the same time- I'm glad to have it done.

And now I get to unstack my e-mails some. I've hit a new record for sitting on an e-mail flagged "reply to this!!!" in my inbox, carefully keeping it on the front page but never actually finding a chance to reply to it: message received Jan. 28. I think I'll take the chance to clear that one out within the next few days! And there's another message that's been pending for two or three weeks...

After those two, which are the only ones I really had pending, I think I'm going to follow the most recent fad and declare e-mail bankruptcy. Everything in my inbox, no matter why I left it idling in there, is getting dumped into the archive. It's obviously never going to get responded to- some of these are LJ comments I planned on responding to months or years ago- so I might as well stop cluttering up the screen! Just into the archive, though, not deletion, so nothing relevant's going to get tossed forever...

I usually like to conclude my LiveJournal posts in some reasonable way, but this one's just sort of running out of steam, so I think I'll leave it here. The semester isn't done yet, but it's damn close, and I'm already in the period of recovery from it, given that I'm not worried about those exams! I'm not ready to go and be much more social yet, but I might start showing up occasionally on instant messengers again within a few weeks.

A parallelized sort... would that be a mergesort? It would seem to fit the problem easily: just recurse into each half of the whole using a different CPU, then merge the two with one CPU once they're done. For that matter, you don't need to care what sort is being done on each CPU as long as the merging works.

Though I suppose if it was that easy, you wouldn't have mentioned it, and thus it isn't :)
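The divide-and-conquer scheme described above is easy to sketch. This is my own minimal illustration (not code from the post), using Python threads; as the comment says, the per-leaf sort doesn't matter as long as the merge is correct:

```python
# Minimal sketch of the "recurse into each half on a different CPU,
# then merge" idea. All names here are illustrative.
from concurrent.futures import ThreadPoolExecutor

def merge(left, right):
    """Standard two-way merge of sorted lists."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out

def parallel_mergesort(data, pool, depth=2):
    # Below a cutoff depth, fall back to the serial built-in sort;
    # the leaf sort algorithm is interchangeable.
    if depth == 0 or len(data) < 2:
        return sorted(data)
    mid = len(data) // 2
    left = pool.submit(parallel_mergesort, data[:mid], pool, depth - 1)
    right = parallel_mergesort(data[mid:], pool, depth - 1)
    return merge(left.result(), right)

with ThreadPoolExecutor(max_workers=4) as pool:
    print(parallel_mergesort([5, 3, 8, 1, 9, 2], pool))  # → [1, 2, 3, 5, 8, 9]
```

The `depth` cutoff keeps the number of submitted tasks bounded so that a blocked parent can never starve its own children of worker threads.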

For inventing algorithms nobody has had before, woah. That's good; I know many times I've reached the conclusion "all our ideas have already occurred to others"...

And *chuckles* for the mail replies, some part of my mind just suggested "do like STV - set equal weight to all, reply to one, then lower his weight and reply to whoever has the highest, again and again". Heh!

That's the easy way to do a multithreaded shared-memory sort. A multithreaded Q Sort is more efficient. The name Q Sort is a pun: it's a variant of quicksort (or, in C, qsort) that uses some data structure (usually a double-ended queue, for threading efficiency: allowing operations at both ends, as long as the queue is at least length three, saves a lot of synchronization blocking) to represent the subranges still to be partitioned. Start out with the range [0, n-1] in the queue; thread 1 partitions it and pushes the resulting subranges (two, or just one) onto the queue, and then any thread can jump in and take a subrange off. Q Sort: it's a quicksort, and the threads synchronize themselves with a queue of tasks. Since quicksort's load balancing is unpredictable, it's not efficient to just statically give one processor in a two-processor system half the data.
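The task-queue scheme above can be sketched like this. This is my own illustration in Python rather than the C the poster would have used, and it simplifies the synchronization (one lock and a busy-wait instead of the double-ended-queue tricks mentioned):

```python
# Sketch of the Q Sort idea: quicksort where threads coordinate
# through a shared queue of index ranges. Illustrative, not the
# poster's implementation.
import threading
from collections import deque

def qsort_parallel(data, n_threads=4):
    work = deque([(0, len(data) - 1)])
    lock = threading.Lock()
    pending = [1]  # count of ranges queued or in flight

    def worker():
        while True:
            with lock:
                if pending[0] == 0:
                    return          # all ranges fully processed
                if not work:
                    continue        # spin; a real version would use a condition variable
                lo, hi = work.popleft()
            if lo < hi:
                # Standard Lomuto partition around the last element.
                pivot, i = data[hi], lo - 1
                for j in range(lo, hi):
                    if data[j] <= pivot:
                        i += 1
                        data[i], data[j] = data[j], data[i]
                data[i + 1], data[hi] = data[hi], data[i + 1]
                with lock:
                    work.append((lo, i))        # left of pivot
                    work.append((i + 2, hi))    # right of pivot
                    pending[0] += 2
            with lock:
                pending[0] -= 1

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads: t.start()
    for t in threads: t.join()
    return data
```

The subranges are disjoint, so the worker threads never touch the same elements; all contention is on the task queue itself, which is exactly what the double-ended-queue trick in the comment is meant to reduce.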

But that's in shared memory. Shared memory is too easy, so I used MPI. It turns out the message-passing overhead and buffer-copying times of MPI exceed the value of the parallelization no matter what- and my merge was so badly written it slowed down as I added more nodes. I wrote what I call the Querge Sort: one server distributes "work units" (maximum-message-size blocks) to all the other processors, which sort their work units with quicksort and send them back; the server then does a mergesort-style merge on the resulting blocks. If I'd gotten brave and implemented a priority queue in C, which I really didn't feel like doing at the time, I probably could have made it work better: instead of running down the entire linked list of buffers each time looking for the minimum, I could just take the top of a min-heap. This still bottlenecks dramatically at the server, but message size limits are message size limits. For more than a handful of processors, I'd need a hierarchical server design that sends ordered sets of work units to be reassembled into larger sorted blocks. Not that difficult, but more than I could do in the time allotted.
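The heap-based merge the poster wished for is worth making concrete: scanning every sorted block for its minimum costs O(k) per output element, while keeping each block's head in a min-heap costs O(log k). A sketch of mine (Python's `heapq` standing in for the priority queue that wasn't written in C):

```python
# K-way merge of sorted blocks using a min-heap of block heads,
# replacing the linear scan over a linked list of buffers.
import heapq

def merge_sorted_blocks(blocks):
    # Each heap entry is (value, block index, position within block);
    # the index breaks ties so unequal entries never compare positions.
    heap = [(b[0], i, 0) for i, b in enumerate(blocks) if b]
    heapq.heapify(heap)
    out = []
    while heap:
        val, b, j = heapq.heappop(heap)
        out.append(val)
        if j + 1 < len(blocks[b]):
            heapq.heappush(heap, (blocks[b][j + 1], b, j + 1))
    return out
```

With k worker blocks this drops the server's merge from O(nk) to O(n log k), though, as the post says, the single server is still the bottleneck.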

I'm not convinced my algorithms are as novel as Dr. William Smart says they are. But maybe they are. Are you familiar with ID3 decision trees? I present the Bushy Tree, which adds an "I Don't Know" branch to every decision. This branch represents all the data of the parent node, but with the category the parent used for its decision removed from consideration; since these trees would get extremely large if every category could be I Don't Know, a 'bushiness' parameter tells the tree when to give up and stop adding I Don't Know branches past a certain number of them. The result handles spotty input (or training) data with missing values, where a regular ID3 tree would have to just give up and guess based on observed frequencies at that point. It is remarkably effective. Much less effective, but possibly cooler, is the Noisy Tree, which refuses to trust its test inputs: it runs them through a Naive Bayes classifier to find out how probable each input attribute is, given the correlations observed with the other attributes in the training data. A query then goes down every branch, multiplying the result by the probability of each branch, with a probability bonus assigned to the branch actually represented in the input. (Otherwise, an attribute would only affect decisions on other attributes, not itself!)
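For readers who don't know ID3: the heart of the algorithm is choosing, at each node, the attribute whose split most reduces the entropy of the class labels. Here is a minimal sketch of that core step (my own illustration; the Bushy and Noisy variants described above build on this and are not reproduced here):

```python
# Entropy-based attribute selection, the core of standard ID3.
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(rows, labels, attr):
    """Entropy reduction from splitting on attribute index `attr`."""
    total = entropy(labels)
    by_value = {}
    for row, lab in zip(rows, labels):
        by_value.setdefault(row[attr], []).append(lab)
    remainder = sum(len(ls) / len(labels) * entropy(ls)
                    for ls in by_value.values())
    return total - remainder

def best_attribute(rows, labels, attrs):
    """ID3 picks the attribute with the highest information gain."""
    return max(attrs, key=lambda a: information_gain(rows, labels, a))
```

ID3 then recurses: it partitions the data by the chosen attribute's values and builds a subtree per value, which is exactly where the Bushy Tree's extra "I Don't Know" branch would be grafted in.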

Actually, I only had two messages flagged for response. They're still sitting there, of course, but I was planning on dealing with them tomorrow anyway...

Moving my workload of e-mails to send from two to three: Are you curious enough about my variants on ID3 trees to want to read my final paper for Machine Learning? I can e-mail it to you if you want. (Word .DOC or .PDF, take your pick.)

5:43 AM. I gobednow. *snore*

Ah, I thought you were talking about shared memory. I don't know, perhaps I didn't see you say "MPI" there. *thinks* I suppose you have to communicate at least the whole array from the server down to the clients (otherwise, there'll be ambiguities in the sorting). From the clients you could shave off some space by transmitting the permutation needed (in factorial base) instead of the entire subset of the array, but that's about it, short of using a hierarchy as you mention.
(Or you could cheat and use radix sort.)
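The "permutation in factorial base" trick the commenter mentions is what's usually called a Lehmer code: a permutation of n items packs into a single integer below n!. A sketch of mine of both directions:

```python
# Encode/decode a permutation as one factorial-base integer
# (its Lehmer code). Illustrative names, not from the thread.
def permutation_to_index(perm):
    """Encode a permutation of distinct items as an integer < n!."""
    index, items = 0, sorted(perm)
    for p in perm:
        i = items.index(p)            # rank among remaining items
        index = index * len(items) + i
        items.pop(i)
    return index

def index_to_permutation(index, n):
    """Decode the factorial-base integer back to a permutation of 0..n-1."""
    digits = []
    for radix in range(1, n + 1):     # peel digits, least significant first
        index, d = divmod(index, radix)
        digits.append(d)
    digits.reverse()
    items = list(range(n))
    return [items.pop(d) for d in digits]
```

Since n! entries fit in log2(n!) bits, this is indeed smaller than resending n sorted values whenever the values themselves are wider than their ranks, which is the commenter's point about shaving bandwidth.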

The general problem reminds me of one I was thinking about when making a program to find numbers that are palindromic in both binary and decimal. Because the problem can be easily broken down into smaller parts (like the individual sorting phase of the merge sort), it would work similarly, with a server handing parts of the problem to clients, and possibly with the clients subpartitioning further to others. The hard part would be accounting for parts that are not solved equally quickly, so that one CPU isn't overworked while another lies idle.
The problems aren't completely equivalent, though, because there's no real bandwidth limitation here (the recursion parameters consist of a few integers).
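The per-number check in that palindrome search is tiny, which is why it parallelizes so cheaply compared to the sort. A quick sketch of mine of the predicate:

```python
# Numbers palindromic in both decimal and binary: the work unit is
# just a range of integers, so the only messages needed are a few
# range endpoints.
def is_palindrome(s):
    return s == s[::-1]

def dual_palindrome(n):
    return is_palindrome(str(n)) and is_palindrome(format(n, 'b'))

print([n for n in range(1, 100) if dual_palindrome(n)])
# → [1, 3, 5, 7, 9, 33, 99]
```

Each client can test its assigned range independently and report back only the hits, so the load-balancing question the comment raises is the whole problem.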

Are you familiar with ID3 decision trees?
Er, no, I don't know what decision trees are, ID3 or not :) From what a search told me, apparently it's some sort of classification method, but I didn't get more than that.

Actually, I only had two messages flagged for response.
Another sentence I misread- I thought you were saying you had lots of messages, rather than a few that had carried the "reply now" mark for a long time- space rather than time, so to speak. It was probably the adjacent mention of all those mail messages you were going to put into the archive that did it.

5:43 AM. I gobednow. *snore*
Your clock works! Or mine does, they both showed something:43.

