I've been using MATLAB a lot in the past year. It has plenty of pros and cons, but it has definitely gotten me thinking from a batch-update perspective. Writing 'a + b' can be really fast if 'a' and 'b' are both huge matrices. Despite some JIT support, MATLAB is generally quite slow if you write your own loops. While I still haven't done any explicit GPGPU programming, I have to imagine that the MATLAB batch/parallel operation mentality would map nicely onto GPGPU thinking. I'm fairly excited about the whole OpenCL thing coming up. I like small computers rather than supercomputers, but I don't mind making more efficient use of them.
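To make the batch-vs-loop contrast concrete, here's a quick sketch in Python/NumPy rather than MATLAB (the array size is arbitrary, and exact timings will vary by machine, but the shape of the comparison is the same):

```python
import time
import numpy as np

n = 1_000_000
a = np.random.rand(n)
b = np.random.rand(n)

# Batch update: one vectorized call over the whole array.
t0 = time.perf_counter()
c_batch = a + b
t_batch = time.perf_counter() - t0

# Hand-written loop: the style that tends to be slow in
# interpreted environments like MATLAB (pre-JIT) or plain Python.
t0 = time.perf_counter()
c_loop = np.empty(n)
for i in range(n):
    c_loop[i] = a[i] + b[i]
t_loop = time.perf_counter() - t0

print(f"batch: {t_batch:.4f}s  loop: {t_loop:.4f}s")
```

The batch version hands the whole operation to optimized native code in one call, which is exactly the mentality that should carry over to GPU-style execution.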
Anyway, while I've sort of gotten into the groove, I'm actually interested in a much fancier form of parallel computing. See my second bullet point here (on bubbling up). This has some relationship to the whole MapReduce concept, too, but I imagine something fancier than that: the scheduler (or metathinker) becomes much more sophisticated, so it can handle the most important things first.
For example, I might want to add the elements of matrices 'a' and 'b', but maybe I care more about some of the results than others. And maybe I want to use those results in a later operation, say '(a + b) * c'. And I might have different tasks going on at the same time with dynamically adjustable priorities. And there might be different ways of getting to the same answer: a map might tell me where to go when I'm driving, but if I'm close enough, maybe I can just look out and see where I need to go. Whatever data is available should also drive the later steps.
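As a toy illustration of "caring more about some results than others", here's a minimal Python sketch. The per-element priority scores are entirely hypothetical, and a real engine would spread this work across threads or devices rather than a single generator; the point is just that results come out in importance order:

```python
import heapq

def prioritized_add(a, b, priority):
    """Add a and b elementwise, streaming results highest-priority first.

    `priority[i]` is a made-up importance score for element i; higher
    means "compute me sooner". Yields (index, value) pairs.
    """
    heap = [(-p, i) for i, p in enumerate(priority)]  # max-heap via negation
    heapq.heapify(heap)
    while heap:
        _, i = heapq.heappop(heap)
        yield i, a[i] + b[i]

a = [1.0, 2.0, 3.0]
b = [10.0, 20.0, 30.0]
priority = [0.2, 0.9, 0.5]  # we care most about element 1

order = [i for i, _ in prioritized_add(a, b, priority)]
print(order)  # → [1, 2, 0]
```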
So we need a way to assess the value of single components of the operation and a way to execute the most important parts at all levels in parallel, such that the most important operations get done first. I haven't worked out all the details, but I know what I want it to look like. For example, I should be able to say '(a + b) * c', or express much fancier programs, and have it all just work, prioritized and parallel, in batch form. The priorities would have to be registered somehow, too, but perhaps separately from the algorithm chaining.
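Here's one way that separation might look, sketched as a tiny deferred-expression class in Python. All the names (Node, leaf, run) are mine, and real scheduling would be parallel rather than a sequential drain of a heap; the point is that '(a + b) * c' is chained in one place while the priorities are registered in another:

```python
import heapq

class Node:
    """A deferred elementwise expression over equal-length lists."""
    def __init__(self, op, *args):
        self.op, self.args = op, args
    def __add__(self, other):
        return Node('add', self, other)
    def __mul__(self, other):
        return Node('mul', self, other)
    def eval_at(self, i):
        if self.op == 'leaf':
            return self.args[0][i]
        x, y = (arg.eval_at(i) for arg in self.args)
        return x + y if self.op == 'add' else x * y

def leaf(data):
    return Node('leaf', data)

def run(expr, n, priorities):
    """Evaluate expr at every index, most important indices first."""
    heap = [(-priorities.get(i, 0.0), i) for i in range(n)]
    heapq.heapify(heap)
    out = {}
    while heap:
        _, i = heapq.heappop(heap)
        out[i] = expr.eval_at(i)
    return out

a, b, c = leaf([1, 2, 3]), leaf([4, 5, 6]), leaf([2, 2, 2])
expr = (a + b) * c            # the algorithm chaining...
prio = {2: 1.0, 0: 0.5}       # ...and the priorities, registered separately
print(run(expr, 3, prio))     # → {2: 18, 0: 10, 1: 14}
```

Because the expression is deferred, the scheduler is free to decide order (and, in principle, placement) at the last moment, which is where the dynamically adjustable priorities would hook in.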
The ability to label intermediate computations for reuse at various steps could also be nice.
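For instance (an assumed interface; 'label' and 'recall' are names I just made up), something like:

```python
class Workspace:
    """A toy store for labeled intermediate results."""
    def __init__(self):
        self._results = {}
    def label(self, name, value):
        self._results[name] = value  # compute once, remember under a name
        return value
    def recall(self, name):
        return self._results[name]   # reuse in a later step, no recompute

ws = Workspace()
a, b, c = 2, 3, 4
s = ws.label('a_plus_b', a + b)      # the shared intermediate
first = s * c                        # (a + b) * c
later = ws.recall('a_plus_b') + 10   # the same sum, reused later
print(first, later)                  # → 20 15
```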
I think this concept isn't far off from the parallel and concurrency frameworks being built today. It's also not far off from the autoparallelization features in systems like MATLAB and Fortress, but none of these have the level of sophistication in dynamic scheduling that I want to see.
My current name for such a system is a "prioritized reduction engine". This is somewhere in between algorithms (iterative/online algorithms specifically) and parallel computing. Maybe I can find things in the literature about it. Sometimes it's hard to know the right words to use when searching.
Side thought I had while writing this: I wonder if there's ever any chance of getting GPGPU-powered math engines into web browsers. I guess I can dream.